Baker, P.G., Brass, A., Bechhofer, S., Goble, C.A., Paton, N.W., and Stevens, R. "TAMBIS--Transparent Access to Multiple Bioinformatics Information Sources." Proc Int Conf Intell Syst Mol Biol. vol. 6. 1998. pp. 25-34.
The TAMBIS project aims to provide transparent access to disparate biological databases and analysis tools, enabling users to utilize a wide range of resources with the minimum of effort. A prototype system has been developed that includes a knowledge base of biological terminology (the biological Concept Model), a model of the underlying data sources (the Source Model) and a 'knowledge-driven' user interface. Biological concepts are captured in the knowledge base using a description logic called GRAIL. The Concept Model provides the user with the concepts necessary to construct a wide range of multiple-source queries, and the user interface provides a flexible means of constructing and manipulating those queries. The Source Model provides a description of the underlying sources and mappings between terms used in the sources and terms in the biological Concept Model. The Concept Model and Source Model provide a level of indirection that shields the user from source details, providing a high level of source transparency. Source independent, declarative queries formed from terms in the Concept Model are transformed into a set of source dependent, executable procedures. Query formulation, translation and execution is demonstrated using a working example.
Keywords: Artificial Intelligence ; *Computational Biology ; Databases Factual ; User-Computer Interface
Baker, P.G., Goble, C.A., Bechhofer, S., Paton, N.W., Stevens, R., and Brass, A. "An ontology for bioinformatics applications." Bioinformatics. 15(6). 1999. pp. 510-20.
MOTIVATION: An ontology of biological terminology provides a model of biological concepts that can be used to form a semantic framework for many data storage, retrieval and analysis tasks. Such a semantic framework could be used to underpin a range of important bioinformatics tasks, such as the querying of heterogeneous bioinformatics sources or the systematic annotation of experimental results. RESULTS: This paper provides an overview of an ontology [the Transparent Access to Multiple Biological Information Sources (TAMBIS) ontology or TaO] that describes a wide range of bioinformatics concepts. The present paper describes the mechanisms used for delivering the ontology and discusses the ontology's design and organization, which are crucial for maintaining the coherence of a large collection of concepts and their relationships. AVAILABILITY: The TAMBIS system, which uses a subset of the TaO described here, is accessible over the Web via http://img.cs.man.ac.uk/tambis (although in the first instance, we will use a password mechanism to limit the load on our server). The complete model is also available on the Web at the above URL.
Keywords: Classification ; *Computational Biology ; Databases Factual ; Expert Systems ; Models Biological
Ellis, L.B., Hershberger, C.D., Bryan, E.M., and Wackett, L.P. "The University of Minnesota Biocatalysis/Biodegradation Database: emphasizing enzymes." Nucleic Acids Res. 29(1). 2001. pp. 340-3.
The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.ahc.umn.edu/) provides curated information on microbial catabolic enzymes and their organization into metabolic pathways. Currently, it contains information on over 400 enzymes. In the last year the enzyme page was enhanced to contain more internal and external links; it also displays the different metabolic pathways in which each enzyme participates. In collaboration with the Nomenclature Commission of the International Union of Biochemistry and Molecular Biology, 35 UM-BBD enzymes were assigned complete EC codes during 2000. Bacterial oxygenases are heavily represented in the UM-BBD; they are known to have broad substrate specificity. A compilation of known reactions of naphthalene and toluene dioxygenases were recently added to the UM-BBD; 73 and 108 were listed respectively. In 2000 the UM-BBD is mirrored by two prestigious groups: the European Bioinformatics Institute and KEGG (the Kyoto Encyclopedia of Genes and Genomes). Collaborations with other groups are being developed. The increased emphasis on UM-BBD enzymes is important for predicting novel metabolic pathways that might exist in nature or could be engineered. It also is important for current efforts in microbial genome annotation.
Keywords: Bacteria_genetics ; Bacteria_metabolism ; Biodegradation ; Catalysis ; *Databases Factual ; Enzymes_genetics ; Enzymes_*metabolism ; Fungi_genetics ; Fungi_metabolism ; Information Storage and Retrieval ; Internet
Ellis, L.B., Hershberger, C.D., and Wackett, L.P. "The University of Minnesota Biocatalysis/Biodegradation database: microorganisms, genomics and prediction." Nucleic Acids Res. 28(1). 2000. pp. 377-9.
The University of Minnesota Biocatalysis/Biodegradation Database (http://www.labmed.umn.edu/umbbd/) begins its fifth year having met its initial goals. It contains approximately 100 pathways for microbial catabolic metabolism of primarily xenobiotic organic compounds, including information on approximately 650 reactions, 600 compounds and 400 enzymes, and containing approximately 250 microorganism entries. It includes information on most known microbial catabolic reaction types and the organic functional groups they transform. Having reached its first goals, it is ready to move beyond them. It is poised to grow in many different ways, including mirror sites; fold prediction for its sequenced enzymes; closer ties to genome and microbial strain databases; and the prediction of biodegradation pathways for compounds it does not contain.
Keywords: Biodegradation ; Catalysis ; *Databases Factual ; *Genome ; *Microbiology
Hofestadt, R. and Meineke, F. "Interactive modelling and simulation of biochemical networks." Comput Biol Med. 25(3). 1995. pp. 321-34.
The analysis of biochemical processes can be supported using methods of modelling and simulation. New methods of computer science are discussed in this field of research. This paper presents a new method which allows the modelling and analysis of complex metabolic networks. Moreover, our simulation shell is based on this formalization and represents the first tool for the interactive simulation of metabolic processes.
Keywords: *Biochemistry ; Cell Communication_physiology ; Databases Factual ; Enzymes_physiology ; Gene Expression_physiology ; Genes Regulator_physiology ; Genetic Diseases Inborn_enzymology ; Genetic Diseases Inborn_genetics ; Genetic Diseases Inborn_metabolism ; *Metabolism ; *Models Chemical ; *Models Genetic ; Probability ; *Software
Hofestadt, R. and Thelen, S. "Quantitative modeling of biochemical networks." In Silico Biol. 1(1). 1998. pp. 39-53.
Today different database systems for molecular structures (genes and proteins) and metabolic pathways are available. All these systems are characterized by the static data representation. For progress in biotechnology the dynamic representation of this data is important. The metabolism can be characterized as a complex biochemical network. Different models for the quantitative simulation of biochemical networks are discussed, but no useful formalization is available. This paper shows that the theory of Petri nets is useful for the quantitative modeling of biochemical networks.
Keywords: *Biochemistry ; Biotechnology ; Catalysis ; Computational Biology ; *Computer Simulation ; Databases Factual ; Glycolysis ; Models Biological ; Protein Engineering
Kanehisa, M. and Goto, S. "KEGG: kyoto encyclopedia of genes and genomes." Nucleic Acids Res. 28(1). 2000. pp. 27-30.
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a knowledge base for systematic analysis of gene functions, linking genomic information with higher order functional information. The genomic information is stored in the GENES database, which is a collection of gene catalogs for all the completely sequenced genomes and some partial genomes with up-to-date annotation of gene functions. The higher order functional information is stored in the PATHWAY database, which contains graphical representations of cellular processes, such as metabolism, membrane transport, signal transduction and cell cycle. The PATHWAY database is supplemented by a set of ortholog group tables for the information about conserved subpathways (pathway motifs), which are often encoded by positionally coupled genes on the chromosome and which are especially useful in predicting gene functions. A third database in KEGG is LIGAND for the information about chemical compounds, enzyme molecules and enzymatic reactions. KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation. The KEGG databases are daily updated and made freely available (http://www.genome.ad.jp/kegg/).
Keywords: *Databases Factual ; Gene Expression ; *Genome ; Human ; Information Storage and Retrieval ; Proteins_genetics ; Proteins_metabolism
Kawashima, T., Kawashima, S., Kanehisa, M., Nishida, H., and Makabe, K.W. "MAGEST: MAboya gene expression patterns and sequence tags." Nucleic Acids Res. 28(1). 2000. pp. 133-5.
MAGEST is a database for newly identified maternal cDNAs of the ascidian, Halocynthia roretzi, which aims to examine the population of the mRNAs. We have collected 3' and 5' tag sequences of mRNAs and their expression data from whole-mount in situ hybridization in early embryos. To date, we have determined more than 2000 tag-sequences of H.roretzi cDNAs and input them into public databases. The tag sequences and the expression data as well as additional information can be obtained through MAGEST via the WWW at http://www.genome.ad.jp/magest/
Keywords: DNA Complementary ; *Databases Factual ; *Expressed Sequence Tags ; *Gene Expression ; Information Storage and Retrieval ; Urochordata_*genetics
Karp, P.D., Riley, M., Paley, S.M., and Pellegrini-Toole, A. "EcoCyc: an encyclopedia of Escherichia coli genes and metabolism." Nucleic Acids Res. 24(1). 1996. pp. 32-9.
The encyclopedia of Escherichia coli genes and metabolism (EcoCyc) is a database that combines information about the genome and the intermediary metabolism of E.coli. It describes 2034 genes, 306 enzymes encoded by these genes, 580 metabolic reactions that occur in E.coli and the organization of these reactions into 100 metabolic pathways. The EcoCyc graphical user interface allows query and exploration of the EcoCyc database using visualization tools such as genomic map browsers and automatic layouts of metabolic pathways. EcoCyc spans the space from sequence to function to allow investigation of an unusually broad range of questions. EcoCyc can be thought of as both an electronic review article, because of its copious references to the primary literature, and as an in silico model of E.coli that can be probed and analyzed through computational means.
Keywords: Computer Communication Networks ; *Databases Factual ; Enzymes_metabolism ; Escherichia coli_enzymology ; Escherichia coli_*genetics ; Escherichia coli_*metabolism ; *Genome Bacterial ; Information Storage and Retrieval ; Software ; User-Computer Interface
Karp, P.D., Riley, M., Paley, S.M., Pellegrini-Toole, A., and Krummenacker, M. "EcoCyc: Enyclopedia of Escherichia coli Genes and Metabolism." Nucleic Acids Res. 25(1). 1997. pp. 43-51.
The Encyclopedia of Genes and Metabolism (EcoCyc) is a database that combines information about the genome and the intermediary metabolism of Escherichia coli. It describes 2970 genes of E.coli, 547 enzymes encoded by these genes, 702 metabolic reactions that occur in E.coli and the organization of these reactions into 107 metabolic pathways. The EcoCyc graphical user interface allows scientists to query and explore the EcoCyc database using visualization tools such as genomic-map browsers and automatic layouts of metabolic pathways. EcoCyc spans the space from sequence to function to allow scientists to investigate an unusually broad range of questions. EcoCyc can be thought of as both an electronic review article because of its copious references to the primary literature, and as an in silico model of E.coli metabolism that can be probed and analyzed through computational means.
Keywords: Amino Acid Sequence ; Base Sequence ; *Databases Factual ; Escherichia coli_*genetics ; Escherichia coli_*metabolism ; *Genes Bacterial ; User-Computer Interface
Karp, P.D., Riley, M., Paley, S.M., Pellegrini-Toole, A., and Krummenacker, M. "Eco Cyc: encyclopedia of Escherichia coli genes and metabolism." Nucleic Acids Res. 27(1). 1999. pp. 55-8.
The EcoCyc database describes the genome and gene products of Escherichia coli, its metabolic and signal-transduction pathways, and its tRNAs. The database describes 4391 genes of E.coli, 695 enzymes encoded by a subset of these genes, 904 metabolic reactions that occur in E.coli, and the organization of these reactions into 129 metabolic pathways. The EcoCyc graphical user interface allows scientists to query and explore the EcoCyc database using visualization tools such as genomic-map browsers and automatic layouts of metabolic pathways. EcoCyc has many references to the primary literature, and is a (qualitative) computational model of E. coli metabolism. EcoCyc is available at URL http://ecocyc.PangeaSystems.com/ecocyc/
Keywords: Classification ; *Databases Factual ; Enzymes_genetics ; Enzymes_metabolism ; Escherichia coli_*genetics ; Escherichia coli_*metabolism ; *Genes Bacterial ; Genome Bacterial ; Information Storage and Retrieval ; Internet ; Signal Transduction ; User-Computer Interface
Karp, P.D., Riley, M., Saier, M., Paulsen, I.T., Paley, S.M., and Pellegrini-Toole, A. "The EcoCyc and MetaCyc databases." Nucleic Acids Res. 28(1). 2000. pp. 56-9.
EcoCyc is an organism-specific Pathway/Genome Database that describes the metabolic and signal-transduction pathways of Escherichia coli, its enzymes, and (a new addition) its transport proteins. MetaCyc is a new metabolic-pathway database that describes pathways and enzymes of many different organisms, with a microbial focus. Both databases are queried using the Pathway Tools graphical user interface, which provides a wide variety of query operations and visualization tools. EcoCyc and MetaCyc are available at http://ecocyc.PangeaSystems.com/ecocyc/
Keywords: Database Management Systems ; *Databases Factual ; Escherichia coli_genetics ; Genome Bacterial
Kuffner, R., Zimmer, R., and Lengauer, T. "Pathway analysis in metabolic databases via differential metabolic display (DMD)." Bioinformatics. 16(9). 2000. pp. 825-36.
MOTIVATION: A number of metabolic databases are available electronically, some with features for querying and visualizing metabolic pathways and regulatory networks. We present a unifying, systematic approach based on PETRI nets for storing, displaying, comparing, searching and simulating such nets from a number of different sources. RESULTS: Information from each data source is extracted and compiled into a PETRI net. Such PETRI nets then allow to investigate the (differential) content in metabolic databases, to map and integrate genomic information and functional annotations, to compare sequence and metabolic databases with respect to their functional annotations, and to define, generate and search paths and pathways in nets. We present an algorithm to systematically generate all pathways satisfying additional constraints in such PETRI nets. Finally, based on the set of valid pathways, so-called differential metabolic displays (DMDs) are introduced to exhibit specific differences between biological systems, i.e. different developmental states, disease states, or different organisms, on the level of paths and pathways. DMDs will be useful for target finding and function prediction, especially in the context of the interpretation of expression data.
Keywords: *Algorithms ; Catalysis ; Computational Biology_*methods ; Computer Simulation ; *Data Display ; *Databases Factual ; Enzymes_genetics ; Enzymes_metabolism ; Glycolysis ; Metabolism_*physiology ; Mycoplasma_metabolism ; Yeasts_metabolism
Karp, P.D. "Pathway databases: a case study in computational symbolic theories." Science. 293(5537). 2001. pp. 2040-4.
A pathway database (DB) is a DB that describes biochemical pathways, reactions, and enzymes. The EcoCyc pathway DB (see http://ecocyc.org) describes the metabolic, transport, and genetic-regulatory networks of Escherichia coli. EcoCyc is an example of a computational symbolic theory, which is a DB that structures a scientific theory within a formal ontology so that it is available for computational analysis. It is argued that by encoding scientific theories in symbolic form, we open new realms of analysis and understanding for theories that would otherwise be too large and complex for scientists to reason with effectively.
Keywords: Artificial Intelligence ; *Computational Biology ; Culture Media ; *Databases Factual ; Escherichia coli_enzymology ; Escherichia coli_*genetics ; Escherichia coli_growth and development ; Escherichia coli_*metabolism ; *Genome Bacterial ; Internet ; Software
Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H., and Kanehisa, M. "KEGG: Kyoto Encyclopedia of Genes and Genomes." Nucleic Acids Res. 27(1). 1999. pp. 29-34.
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database that consists of graphical diagrams of biochemical pathways including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with the graphical genome map for chromosomal locations that is represented by Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from the gene expression profiles. The KEGG databases are daily updated and made freely available (http://www.genome.ad.jp/kegg/).
Keywords: Computational Biology ; *Databases Factual ; Gene Expression ; *Genes ; *Genome ; Ligands ; Metabolism ; Sequence Homology