Bibliography of: Genome

  1. Ellis, L.B., Hershberger, C.D., and Wackett, L.P. "The University of Minnesota Biocatalysis/Biodegradation database: microorganisms, genomics and prediction." Nucleic Acids Res. 28 (1). 2000. pp. 377-9.

    The University of Minnesota Biocatalysis/Biodegradation Database (http://www.labmed.umn.edu/umbbd/) begins its fifth year having met its initial goals. It contains approximately 100 pathways for microbial catabolic metabolism of primarily xenobiotic organic compounds, including information on approximately 650 reactions, 600 compounds and 400 enzymes, and containing approximately 250 microorganism entries. It includes information on most known microbial catabolic reaction types and the organic functional groups they transform. Having reached its first goals, it is ready to move beyond them. It is poised to grow in many different ways, including mirror sites; fold prediction for its sequenced enzymes; closer ties to genome and microbial strain databases; and the prediction of biodegradation pathways for compounds it does not contain.

    Keywords: Biodegradation ; Catalysis ; *Databases, Factual ; *Genome ; *Microbiology


  2. Kanehisa, M. and Goto, S. "KEGG: Kyoto Encyclopedia of Genes and Genomes." Nucleic Acids Res. 28 (1). 2000. pp. 27-30.

    KEGG (Kyoto Encyclopedia of Genes and Genomes) is a knowledge base for systematic analysis of gene functions, linking genomic information with higher order functional information. The genomic information is stored in the GENES database, which is a collection of gene catalogs for all the completely sequenced genomes and some partial genomes with up-to-date annotation of gene functions. The higher order functional information is stored in the PATHWAY database, which contains graphical representations of cellular processes, such as metabolism, membrane transport, signal transduction and cell cycle. The PATHWAY database is supplemented by a set of ortholog group tables for the information about conserved subpathways (pathway motifs), which are often encoded by positionally coupled genes on the chromosome and which are especially useful in predicting gene functions. A third database in KEGG is LIGAND for the information about chemical compounds, enzyme molecules and enzymatic reactions. KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation. The KEGG databases are daily updated and made freely available (http://www.genome.ad.jp/kegg/).

    Keywords: *Databases, Factual ; Gene Expression ; *Genome ; Human ; Information Storage and Retrieval ; Proteins_genetics ; Proteins_metabolism


  3. Kanehisa, M., Goto, S., Kawashima, S., and Nakaya, A. "The KEGG databases at GenomeNet." Nucleic Acids Res. 30 (1). 2002. pp. 42-6.

    The Kyoto Encyclopedia of Genes and Genomes (KEGG) is the primary database resource of the Japanese GenomeNet service (http://www.genome.ad.jp/) for understanding higher order functional meanings and utilities of the cell or the organism from its genome information. KEGG consists of the PATHWAY database for the computerized knowledge on molecular interaction networks such as pathways and complexes, the GENES database for the information about genes and proteins generated by genome sequencing projects, and the LIGAND database for the information about chemical compounds and chemical reactions that are relevant to cellular processes. In addition to these three main databases, limited amounts of experimental data for microarray gene expression profiles and yeast two-hybrid systems are stored in the EXPRESSION and BRITE databases, respectively. Furthermore, a new database, named SSDB, is available for exploring the universe of all protein coding genes in the complete genomes and for identifying functional links and ortholog groups. The data objects in the KEGG databases are all represented as graphs and various computational methods are developed to detect graph features that can be related to biological functions. For example, the correlated clusters are graph similarities which can be used to predict a set of genes coding for a pathway or a complex, as summarized in the ortholog group tables, and the cliques in the SSDB graph are used to annotate genes. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

    Keywords: Computational Biology ; Computer Graphics ; *Databases, Genetic ; *Databases, Protein ; Gene Expression Profiling ; *Genome ; Human ; Information Storage and Retrieval ; Internet ; Macromolecular Systems ; Metabolism_genetics ; Multigene Family ; Protein Conformation ; Proteins_chemistry ; Proteins_genetics ; Proteins_metabolism ; Sequence Homology


  4. Karp, P.D. and Paley, S.M. "Integrated access to metabolic and genomic data." J Comput Biol. 3 (1). 1996. pp. 191-212.

    The EcoCyc system consists of a knowledge base (KB) that describes the genes and intermediary metabolism of Escherichia coli, and a graphical user interface (GUI) for accessing that knowledge. This paper addresses two problems: How can we create a GUI that provides integrated access to metabolic and genomic data? We describe the design and implementation of visual presentations that closely mimic those found in the biology literature, and that offer hypertext navigation among related entities, and multiple views of the same entity. We employ a frame knowledge representation system (FRS) called HyperTHEO to manage the EcoCyc knowledge base. Among the advantages of FRSs are an expressive data model for capturing the complexities of biological information, and schema-evolution capabilities that facilitate the constant schema changes that biological databases tend to undergo. HyperTHEO also includes rule-based inference facilities that are the foundation of expert systems, a constraint language for maintaining data integrity, and a declarative query language. A graphic KB editor and browser allow the EcoCyc developers to interactively inspect and modify this evolving KB.

    Keywords: *Artificial Intelligence ; Computer Communication Networks ; Computer Graphics ; Computers ; *Database Management Systems ; Escherichia coli_*genetics ; Escherichia coli_*metabolism ; *Genome, Bacterial ; Programming Languages ; Systems Integration ; User-Computer Interface


  5. Karp, P.D., Riley, M., Paley, S.M., and Pellegrini-Toole, A. "The MetaCyc Database." Nucleic Acids Res. 30 (1). 2002. pp. 59-61.

    MetaCyc is a metabolic-pathway database that describes 445 pathways and 1115 enzymes occurring in 158 organisms. MetaCyc is a review-level database in that a given entry in MetaCyc often integrates information from multiple literature sources. The pathways in MetaCyc were determined experimentally and are labeled with the species in which they are known to occur based on literature references examined to date. MetaCyc contains extensive commentary and literature citations. Applications of MetaCyc include pathway analysis of genomes, metabolic engineering and biochemistry education. MetaCyc is queried using the Pathway Tools graphical user interface, which provides a wide variety of query operations and visualization tools. MetaCyc is available via the World Wide Web at http://ecocyc.org/ecocyc/metacyc.html, and is available for local installation as a binary program for the PC and the Sun workstation, and as a set of flatfiles. Contact metacyc-info

    Keywords: Comparative Study ; Database Management Systems ; *Databases, Protein ; Enzymes_chemistry ; Enzymes_*metabolism ; Genome ; Human ; Information Storage and Retrieval ; Internet ; *Metabolism


  6. Karp, P.D., Riley, M., Paley, S.M., and Pellegrini-Toole, A. "EcoCyc: an encyclopedia of Escherichia coli genes and metabolism." Nucleic Acids Res. 24 (1). 1996. pp. 32-9.

    The encyclopedia of Escherichia coli genes and metabolism (EcoCyc) is a database that combines information about the genome and the intermediary metabolism of E.coli. It describes 2034 genes, 306 enzymes encoded by these genes, 580 metabolic reactions that occur in E.coli and the organization of these reactions into 100 metabolic pathways. The EcoCyc graphical user interface allows query and exploration of the EcoCyc database using visualization tools such as genomic map browsers and automatic layouts of metabolic pathways. EcoCyc spans the space from sequence to function to allow investigation of an unusually broad range of questions. EcoCyc can be thought of as both an electronic review article, because of its copious references to the primary literature, and as an in silico model of E.coli that can be probed and analyzed through computational means.

    Keywords: Computer Communication Networks ; *Databases, Factual ; Enzymes_metabolism ; Escherichia coli_enzymology ; Escherichia coli_*genetics ; Escherichia coli_*metabolism ; *Genome, Bacterial ; Information Storage and Retrieval ; Software ; User-Computer Interface


  7. Karp, P.D., Riley, M., Saier, M., Paulsen, I.T., Paley, S.M., and Pellegrini-Toole, A. "The EcoCyc and MetaCyc databases." Nucleic Acids Res. 28 (1). 2000. pp. 56-9.

    EcoCyc is an organism-specific Pathway/Genome Database that describes the metabolic and signal-transduction pathways of Escherichia coli, its enzymes, and (a new addition) its transport proteins. MetaCyc is a new metabolic-pathway database that describes pathways and enzymes of many different organisms, with a microbial focus. Both databases are queried using the Pathway Tools graphical user interface, which provides a wide variety of query operations and visualization tools. EcoCyc and MetaCyc are available at http://ecocyc.PangeaSystems.com/ecocyc/

    Keywords: Database Management Systems ; *Databases, Factual ; Escherichia coli_genetics ; Genome, Bacterial


  8. Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H., and Kanehisa, M. "KEGG: Kyoto Encyclopedia of Genes and Genomes." Nucleic Acids Res. 27 (1). 1999. pp. 29-34.

    Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database that consists of graphical diagrams of biochemical pathways including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with the graphical genome map for chromosomal locations that is represented by Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from the gene expression profiles. The KEGG databases are daily updated and made freely available (http://www.genome.ad.jp/kegg/).

    Keywords: Computational Biology ; *Databases, Factual ; Gene Expression ; *Genes ; *Genome ; Ligands ; Metabolism ; Sequence Homology