We have created PhenomicDB, a multi-species genotype/phenotype database, by merging public genotype/phenotype data from a wide range of model organisms and Homo sapiens. Until now, these data have been available in distinct organism-specific databases: OMIM, MGI, WormBase, FlyBase, CYGD, ZFIN, DictyBase, SGD, and MAtDB. We brought this wealth of data into a single integrated resource by coarse-grained semantic mapping of the phenotypic data fields, by including common gene indexes (NCBI Gene), and by using the associated orthology relationships (HomoloGene). PhenomicDB is intended as a first step towards comparative phenomics and aims to improve the understanding of gene function by combining knowledge about phenotypes from several organisms. PhenomicDB has to compromise between data depth, as available in the source databases, and data compatibility. It is not intended to compete with the far more specialized primary source databases, but tries to compensate for its partial loss of depth by linking back to the primary sources. The basic functional concept of PhenomicDB is an integrated meta-search engine for phenotypes. Users should be aware that comparing genotypes or even phenotypes between organisms as different as yeast and man faces serious scientific hurdles. Nevertheless, finding that the phenotype of a given mouse gene is described as “similar to psoriasis” and, at the same time, that the human ortholog has been described as a gene causing skin defects can lead to novel and interesting hypotheses. Similarly, a gene involved in cancer in mammals could show a proliferation phenotype in a lower organism such as yeast and thus give a researcher further insights.
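The integration idea described above can be illustrated with a minimal Python sketch. It is not PhenomicDB's actual implementation or schema: the record layout, the gene identifiers, the orthology group IDs, and the function names `group_by_orthology` and `search` are all hypothetical and chosen only for illustration. The sketch assumes that each phenotype record has already been reduced to a coarse common form (source database, gene, free-text description) and that an orthology table in the spirit of HomoloGene assigns genes to groups.

```python
from collections import defaultdict

# Hypothetical toy records in a coarse common form:
# (source database, gene identifier, phenotype description).
# Identifiers and descriptions are invented for illustration only.
PHENOTYPES = [
    ("MGI", "mouse_gene_1", "skin lesions similar to psoriasis"),
    ("OMIM", "human_gene_1", "skin defects"),
    ("SGD", "yeast_gene_2", "increased proliferation"),
]

# Hypothetical orthology table: gene identifier -> orthology group ID.
ORTHOLOGY_GROUP = {
    "mouse_gene_1": "group_A",
    "human_gene_1": "group_A",
    "yeast_gene_2": "group_B",
}


def group_by_orthology(phenotypes, orthology):
    """Collect phenotype records of orthologous genes under one group ID."""
    groups = defaultdict(list)
    for source, gene, text in phenotypes:
        group_id = orthology.get(gene)
        if group_id is None:
            continue  # genes without a known ortholog group are skipped here
        groups[group_id].append((source, gene, text))
    return dict(groups)


def search(groups, keyword):
    """Return every orthology group with at least one matching phenotype."""
    keyword = keyword.lower()
    return {
        group_id: records
        for group_id, records in groups.items()
        if any(keyword in text.lower() for _, _, text in records)
    }


groups = group_by_orthology(PHENOTYPES, ORTHOLOGY_GROUP)
hits = search(groups, "psoriasis")
```

In this toy setup, a keyword search that matches only the mouse record still returns the phenotype recorded for the human ortholog, because both records sit in the same orthology group. Each record keeps its source database name, which mirrors the idea of linking back to the primary source for full detail.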
References to the original data sources:
Hamosh, A., A. F. Scott, J. Amberger, C. Bocchini, D. Valle and V. A. McKusick (2002). "Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders." Nucleic Acids Res 30(1): 52-5.
Blake JA, Richardson JE, Bult CJ, Kadin JA, Eppig JT, and the members of the Mouse Genome Database Group. 2003. MGD: The Mouse Genome Database. Nucleic Acids Res 31: 193-195.
Todd W. Harris, Nansheng Chen, Fiona Cunningham, Marcela Tello-Ruiz, Igor Antoshechkin, Carol Bastiani, Tamberlyn Bieri, Darin Blasiar, Keith Bradnam, Juancarlos Chan, Chao-Kung Chen, Wen J. Chen, Paul Davis, Eimear Kenny, Ranjana Kishore, Daniel Lawson, Raymond Lee, Hans-Michael Muller, Cecilia Nakamura, Philip Ozersky, Andrei Petcherski, Anthony Rogers, Aniko Sabo, Erich M. Schwarz, Kimberly Van Auken, Qinghua Wang, Richard Durbin, John Spieth, Paul W. Sternberg and Lincoln D. Stein (2004). WormBase: a multi-species resource for nematode biology and genomics. Nucleic Acids Research 32:D411-D417.
The FlyBase Consortium (2003). The FlyBase database of the Drosophila genome projects and community literature. Nucleic Acids Research 31:172-175.
Schoof H, Zaccaria P, Gundlach H, Lemcke K, Rudd S, Kolesov G, Arnold R, Mewes HW, Mayer KF. MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource based on the first complete plant genome. Nucleic Acids Res. 2002 Jan 1;30(1):91-3.
Mewes HW, Amid C, Arnold R, Frishman D, Guldener U, Mannhaupt G, Munsterkotter M, Pagel P, Strack N, Stumpflen V, Warfsmann J, Ruepp A. MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res. 2004 Jan 1;32 Database issue:D41-4.
Wheeler, D. L., D. M. Church, R. Edgar, S. Federhen, W. Helmberg, T. L. Madden, J. U. Pontius, G. D. Schuler, L. M. Schriml, E. Sequeira, T. O. Suzek, T. A. Tatusova and L. Wagner (2004). "Database resources of the National Center for Biotechnology Information: update." Nucleic Acids Res 32 Database issue: D35-40.
Pruitt, K. D. and D. R. Maglott (2001). "RefSeq and LocusLink: NCBI gene-centered resources." Nucleic Acids Res 29(1): 137-40.
Sprague, J., Doerry, E., Douglas, S. and Westerfield, M. (2001). The Zebrafish Information Network (ZFIN): a resource for genetic, genomic and developmental research. Nucleic Acids Res. 29, 87-90.