Database Links

DNA Sequence Databases

Protein Sequence and Structure Databases

  • The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to the analysis of protein sequences and structures as well as 2-D PAGE.

  • UniProt/TrEMBL - Translated EMBL. UniProt/TrEMBL is a computer-annotated protein sequence database complementing the UniProt/Swiss-Prot Protein Knowledgebase.

  • Protein Information Resource (PIR), located at Georgetown University Medical Center (GUMC), is an integrated public bioinformatics resource that supports genomic and proteomic research and scientific studies.

  • UniProt - Universal Protein Resource. UniProt is the world's most comprehensive catalog of information on proteins. It is a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR.

  • UniRef - The UniProt NREF (Non-redundant REFerence) database. The two major objectives of UniRef are: (i) to facilitate sequence merging in UniProt, and (ii) to allow faster and more informative sequence similarity searches.

  • Protein Data Bank - The PDB is the single worldwide repository for the processing and distribution of 3-D structure data for large biological molecules, namely proteins and nucleic acids.

  • NCBI Structures Group - maintains MMDB, a database of macromolecular 3D structures, as well as tools for their visualization and comparative analysis. MMDB, the Molecular Modeling Database, contains experimentally determined biopolymer structures obtained from the Protein Data Bank (PDB).

Carbohydrate Structure

EST Databases

  • Expressed Sequence Tags Database - dbEST (Nature Genetics 4:332-3;1993) is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or Expressed Sequence Tags, from a number of organisms.

  • TIGR Gene Indices

  • BodyMap - BodyMap is a data bank of expression information for human and mouse genes, novel or known, in various tissues or cell types and at various time points. Human body mapping was started in 1991 (PMID: 1345164; UI: 94258199), and mouse mapping was started in 1993 to cover difficult materials, such as embryo and developing brain (PMID: 8863742; UI: 97017141).

  • Genome Survey Sequences Databases (dbGSS) - The GSS division of GenBank is similar to the EST division, with the exception that most of the sequences are genomic in origin, rather than cDNA (mRNA).

SNP and STS Databases

  • Single Nucleotide Polymorphism (dbSNP) - NCBI. In collaboration with the National Human Genome Research Institute, the National Center for Biotechnology Information has established the dbSNP database to serve as a central repository for both single-base nucleotide substitutions and short deletion and insertion polymorphisms.

  • Human Genome Variation Database (HGVbase) - The objective of HGVbase is to provide an accurate, high-utility, and ultimately fully comprehensive catalog of normal human gene and genome variation, useful as a research tool to help define the genetic component of human phenotypic variation. All records are highly curated and annotated, ensuring maximal utility and data accuracy.

  • Sequence Tagged Sites (dbSTS) - NCBI. dbSTS is an NCBI resource that contains sequence and mapping data on short genomic landmark sequences, or Sequence Tagged Sites (Olson et al., 1989).

Genome Databases

  • Genomes at NCBI - Genomics is a new and fascinating area of biology, enabled through the large-scale DNA sequencing efforts of many public and private organizations.

  • Genomes Online Database (GOLD) - GOLD is a World Wide Web resource for comprehensive access to information regarding complete and ongoing genome projects around the world.

  • TIGR's Genome Projects - TIGR's Genome Projects are a collection of curated databases containing DNA and protein sequence, gene expression, cellular role, protein family, and taxonomic data for microbes, plants and humans.

  • Ensembl Genome Browser - Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system that produces and maintains automatic annotation on metazoan genomes.

  • ARKdb - The ARKdb database system provides comprehensive public repositories for genome mapping data from farmed and other animal species, providing a resource similar in function to that offered by GDB or MGD for human or mouse mapping data respectively.

  • UCSC Genome Bioinformatics - This site contains the reference sequence for the human and C. elegans genomes and working drafts for the chimpanzee, mouse, rat, chicken, Fugu, Drosophila, C. briggsae, yeast, and SARS genomes. It also shows the CFTR (cystic fibrosis) region in 13 species.

  • Saccharomyces Genome Database (SGD) - SGD™ is a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast.

  • TAIR - The Arabidopsis Information Resource (TAIR) provides a comprehensive resource for the scientific community working with Arabidopsis thaliana, a widely used model plant.

  • Gramene - Gramene is a curated, open-source, Web-accessible data resource for comparative genome analysis in the grasses.

  • WormBase - The Biology and Genome of C. elegans.

  • FlyBase - A Database of the Drosophila Genome.

  • Mouse Genome Informatics (MGI) - Mouse Genome Informatics (MGI) provides integrated access to data on the genetics, genomics, and biology of the laboratory mouse.

  • The Genome Database (GDB) - An international collaboration in support of the Human Genome Project.

  • GeneLynx - GeneLynx is a portal to a collection of hyperlinks for each gene within three different genomes.