Bibliography of Year: 2003

  1. Bader, G.D. and Hogue, C.W. "An automated method for finding molecular complexes in large protein interaction networks." BMC Bioinformatics. 4 (1). 2003. pp. 2.

    Background: Recent advances in proteomics technologies such as two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of biomolecular interaction networks. Initial mapping efforts have already produced a wealth of data. As the size of the interaction set increases, databases and computational methods will be required to store, visualize and analyze the information in order to effectively aid in knowledge discovery. Results: This paper describes a novel graph theoretic clustering algorithm, "Molecular Complex Detection" (MCODE), that detects densely connected regions in large protein-protein interaction networks that may represent molecular complexes. The method is based on vertex weighting by local neighborhood density and outward traversal from a locally dense seed protein to isolate the dense regions according to given parameters. The algorithm has the advantage over other graph clustering methods of having a directed mode that allows fine-tuning of clusters of interest without considering the rest of the network and allows examination of cluster interconnectivity, which is relevant for protein networks. Protein interaction and complex information from the yeast Saccharomyces cerevisiae was used for evaluation. Conclusion: Dense regions of protein interaction networks can be found, based solely on connectivity data, many of which correspond to known protein complexes. The algorithm is not affected by a known high rate of false positives in data from high-throughput interaction techniques. The program is available from ftp://ftp.mshri.on.ca/pub/BIND/Tools/MCODE.
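
The two stages named in the abstract (weighting vertices by local neighborhood density, then traversing outward from a dense seed) can be illustrated with a short sketch. This is a simplified illustration, not the published MCODE scoring: the weight used here (density of a vertex's closed neighborhood multiplied by its degree), the `threshold` and `min_size` parameters, and all function names are assumptions made for the example.

```python
from itertools import combinations


def neighborhood_density(graph, v):
    """Density (0..1) of the subgraph induced by v and its direct neighbors."""
    nodes = {v} | graph[v]
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(1 for a, b in combinations(nodes, 2) if b in graph[a])
    return edges / (n * (n - 1) / 2)


def weight_vertices(graph):
    """Assumed weighting: local density scaled by degree (not the MCODE formula)."""
    return {v: neighborhood_density(graph, v) * len(graph[v]) for v in graph}


def grow_cluster(graph, weights, seed, threshold, seen):
    """Traverse outward from seed, keeping vertices close to the seed's weight."""
    cutoff = weights[seed] * (1 - threshold)
    cluster, stack = {seed}, [seed]
    while stack:
        v = stack.pop()
        for u in graph[v]:
            if u not in cluster and u not in seen and weights[u] >= cutoff:
                cluster.add(u)
                stack.append(u)
    return cluster


def find_clusters(graph, threshold=0.2, min_size=3):
    """Seed from the highest-weighted unvisited vertex until all are visited."""
    weights = weight_vertices(graph)
    seen, clusters = set(), []
    for seed in sorted(graph, key=lambda v: (-weights[v], v)):
        if seed in seen:
            continue
        cluster = grow_cluster(graph, weights, seed, threshold, seen)
        seen |= cluster
        if len(cluster) >= min_size:
            clusters.append(cluster)
    return clusters


# Toy undirected network: a four-protein clique (A-D) with a sparse tail (E, F).
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}
dense_regions = find_clusters(toy_network)
```

On the toy network the clique is reported as the only dense region, while the sparse tail falls below the size cutoff. A lower `threshold` keeps clusters tighter around the seed; a higher one lets them spread into less dense neighbors.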


  2. Ellis, L.B., Hou, B.K., Kang, W., and Wackett, L.P. "The University of Minnesota Biocatalysis/Biodegradation Database: post-genomic data mining." Nucleic Acids Res. 31 (1). 2003. pp. 262-5.

    The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.ahc.umn.edu/) provides curated information on microbial catabolism and related biotransformations, primarily for environmental pollutants. Currently, it contains information on over 130 metabolic pathways, 800 reactions, 750 compounds and 500 enzymes. In the past two years, it has increased its breadth to include more examples of microbial metabolism of metals and metalloids, and expanded the types of information it includes to contain microbial biotransformations of, and binding interactions with, many chemical elements. It has also increased the ways in which this data can be accessed (mined). Structure-based searching was added, for exact matches, similarity, or substructures. Analysis of UM-BBD reactions has led to a prototype, guided, pathway prediction system. Guided prediction means that the user is shown all possible biotransformations at each step and guides the process to its conclusion. Mining the UM-BBD's data provides a unique view into how the microbial world recycles organic functional groups. UM-BBD users are encouraged to comment on all aspects of the database, including the information it contains and the tools by which it can be mined. The database and prediction system develop under the direction of the scientific community.
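
The guided prediction loop described above (show every applicable biotransformation, let the user pick one, repeat) can be sketched as follows. This is a minimal illustration under stated assumptions, not the UM-BBD implementation: the rule table, the compound and rule names, and the `choose` callback that stands in for the user are all hypothetical placeholders.

```python
def applicable_steps(rules, compound):
    """All (rule_name, product) pairs that apply to the current compound."""
    return rules.get(compound, [])


def guided_prediction(rules, start, choose, max_steps=10):
    """Build a pathway one step at a time, deferring each choice to `choose`.

    `choose(compound, options)` stands in for the user: it receives every
    applicable biotransformation and returns the one to follow.
    """
    pathway = [start]
    current = start
    for _ in range(max_steps):
        options = applicable_steps(rules, current)
        if not options:
            break  # no further biotransformation applies: pathway is complete
        _rule, product = choose(current, options)
        pathway.append(product)
        current = product
    return pathway


# Hypothetical rule table: compound -> list of (rule name, product).
example_rules = {
    "compound_A": [("rule_1", "compound_B"), ("rule_2", "compound_C")],
    "compound_B": [("rule_3", "compound_D")],
}

# Two stand-in "users": one always takes the first option, one the last.
first_option_path = guided_prediction(example_rules, "compound_A", lambda c, opts: opts[0])
last_option_path = guided_prediction(example_rules, "compound_A", lambda c, opts: opts[-1])
```

In an interactive tool, `choose` would display the options and wait for a selection. Keeping it as a callback separates the rule lookup from the user interface.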


  3. Yeh, I., Karp, P.D., Noy, N.F., and Altman, R.B. "Knowledge acquisition, consistency checking and concurrency control for Gene Ontology (GO)." Bioinformatics. 19 (2). 2003. pp. 241-8.

    Motivation: A critical element of the computational infrastructure required for functional genomics is a shared language for communicating biological data and knowledge. The Gene Ontology (GO; http://www.geneontology.org) provides a taxonomy of concepts and their attributes for annotating gene products. As GO increases in size, its ongoing construction and maintenance become more challenging. In this paper, we assess the applicability of a Knowledge Base Management System (KBMS), Protege-2000, to the maintenance and development of GO. Results: We transferred GO to Protege-2000 in order to evaluate its suitability for GO. The graphical user interface supported browsing and editing of GO. Tools for consistency checking identified minor inconsistencies in GO and opportunities to reduce redundancy in its representation. The Protege Axiom Language proved useful for checking ontological consistency. The PROMPT tool allowed us to track changes to GO. Using Protege-2000, we tested our ability to make changes and extensions to GO to refine the semantics of attributes and classify more concepts. Availability: Gene Ontology in Protege-2000 and the associated code are located at http://smi.stanford.edu/projects/helix/gokbms/. Protege-2000 is available from http://protege.stanford.edu. Contact: russ.altman
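
The abstract does not say which checks were run, so the following is only a sketch of the kind of consistency and redundancy checks a term hierarchy allows. It assumes the ontology is stored as a mapping from each term to its direct parents. It does not use Protege-2000 or the Protege Axiom Language, and the term names and function names are hypothetical.

```python
def ancestors(parents, term):
    """All terms reachable from `term` by following parent links."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen


def redundant_links(parents):
    """Direct parent links already implied by another direct parent."""
    redundant = []
    for term, direct in parents.items():
        for p in direct:
            others = [q for q in direct if q != p]
            if any(p in ancestors(parents, q) for q in others):
                redundant.append((term, p))
    return sorted(redundant)


def dangling_parents(parents):
    """Parents that are referenced but never defined as terms."""
    return sorted({p for ps in parents.values() for p in ps if p not in parents})


# Hypothetical hierarchy: term -> list of direct parents.
example_terms = {
    "root": [],
    "term_a": ["root"],
    "term_b": ["term_a", "root"],      # the link to "root" is implied via term_a
    "term_c": ["term_b", "term_x"],    # "term_x" is never defined
}
redundancies = redundant_links(example_terms)
undefined = dangling_parents(example_terms)
```

Here the first check flags a link that could be dropped without losing information, and the second flags a reference to a term that does not exist.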