GENBANK, SWISSPROT AND OTHERS

Presentation Transcript


  1. GENBANK, SWISSPROT AND OTHERS • As Problem Sources for CSE 549 • Andriy Tovkach • Genetics

  2. GENBANK OVERVIEW • Maintained by NCBI; exchanges data with EMBL and DDBJ as part of an international collaboration • Established in 1982 (at NCBI since 1992) • Exponential growth (graph) • On Saturday, the 7th – 20.2 billion bases

  3. FILE FORMAT • Header • Features • Sequence • (see example files)
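
The three-part layout of a GenBank flat file can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: it assumes the features table begins at a line starting with FEATURES, the sequence begins after a line starting with ORIGIN, and the record ends at "//". The function name split_genbank_record and the RECORD text are made up for this example.

```python
def split_genbank_record(text):
    """Split one GenBank flat-file record into (header, features, sequence)."""
    header, features, sequence = [], [], []
    section = header  # everything before FEATURES counts as header
    for line in text.splitlines():
        if line.startswith("FEATURES"):
            section = features
        elif line.startswith("ORIGIN"):
            section = sequence
            continue  # the ORIGIN line itself carries no sequence data
        elif line.startswith("//"):
            break  # end-of-record marker
        section.append(line)
    # Sequence lines carry position numbers and spaces; keep letters only.
    seq = "".join(ch for line in sequence for ch in line if ch.isalpha())
    return "\n".join(header), "\n".join(features), seq


# Made-up record, for illustration only (not a real database entry).
RECORD = """LOCUS       EXAMPLE1      12 bp    DNA     linear   UNK 01-JAN-2000
DEFINITION  Made-up record for illustration.
ACCESSION   EXAMPLE1
FEATURES             Location/Qualifiers
     source          1..12
ORIGIN
        1 gatcctccat at
//
"""

header, features, seq = split_genbank_record(RECORD)
print(seq)
```

For real work an established parser (for example, the one in Biopython) is a safer choice; the sketch only shows where the three sections begin and end.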

  4. FASTA FORMAT • Single-line description beginning with > • Followed by lines of sequence data • Can be either protein or DNA
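
The FASTA rules above are simple enough to read by hand. The following Python sketch assumes that sequence data may wrap over several lines and that blank lines can be ignored; the function name parse_fasta and the EXAMPLE records are invented for illustration.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) pairs from FASTA text."""
    records = []
    description, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description, chunks = line[1:].strip(), []
        elif description is not None:
            chunks.append(line)  # sequence data, possibly wrapped
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Made-up sequences: one DNA record and one protein record.
EXAMPLE = """>seq1 made-up DNA fragment
GATCCTCCAT
ATACAACGGT
>seq2 made-up protein fragment
MKVLAAGIVG
"""

for description, sequence in parse_fasta(EXAMPLE):
    print(description, len(sequence))
```

The parser does not try to tell DNA from protein; as the slide notes, the format itself allows either.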

  5. ENTREZ as RETRIEVAL SYSTEM • PubMed – 12 million citations from life science journals • Nucleotide – collection of DNA sequences • Protein – protein sequences from SwissProt and other sources • Genome – genomes of over 800 organisms • Also Structure, PopSet, Taxonomy, OMIM
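
Entrez databases can also be queried from a program. The slides do not cover this, so the following is only a sketch under stated assumptions: it uses NCBI's E-utilities efetch endpoint, and it only builds the request URL rather than contacting the server. The helper name efetch_url and the identifier EXAMPLE_ID are placeholders, not real values.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db, record_id, rettype="fasta", retmode="text"):
    """Build an efetch URL for one record in the given Entrez database."""
    params = {"db": db, "id": record_id, "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# EXAMPLE_ID is a placeholder; substitute a real accession or identifier.
url = efetch_url("nucleotide", "EXAMPLE_ID")
print(url)
```

Passing the resulting URL to urllib.request.urlopen would return the record as FASTA text, subject to NCBI's usage guidelines.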

  6. PROTEIN DATABASES • SWISS-PROT • EBI – TREMBL • NCBI – GENPEPT (already history)

  7. GENOME DATABASES • SGD (homepage, example 1.1, example 1.2) • Wormbase • Ensembl Human Genome Browser

  8. CONCLUSIONS • Sequencing projects produce a lot of data • At a minimum, these data have to be organized in databases • Ideally, all sequences need high-quality human annotation • That’s why computer scientists are welcome in biology

  9. LITERATURE • GenBank presentation by Manpreet Katari (CSE 549, Fall 2000) • Thomas Lengauer (Ed.), Bioinformatics – From Genomes to Drugs • Entrez website • Google
