Class 3 2009 European Resources Protein Focused
Protein Databases EBI – European Bioinformatics Institute http://www.ebi.ac.uk/
What is the difference between dealing with nucleotide DBs and protein DBs?
Protein information
• Name & description
• Gene encoded from
• Organism
• Function (only one?)
• Enzyme? Ligands? PTMs? Interactions?
• Biological processes
• Structure
• Sequence
• Localization
• More...
Protein DB - short history (pre-UniProt)
• Swiss-Prot: created in July 1986; since 1987, a collaboration of the SIB and the EMBL/EBI.
• TrEMBL: created at the EBI in 1996 as a computer-annotated protein sequence database supplementing Swiss-Prot. It was introduced to deal with the increased data flow from genome projects.
UniProt consortium: PIR, EBI, SIB
The three-layered approach
• The UniProt Archive (UniParc)
  • UniProtKB + all other protein sequences publicly available
  • Completeness
• The UniProt Reference Clusters (UniRef)
  • Non-redundant views of UniProtKB + selected UniParc sets
  • Speed
• The UniProt Knowledgebase (UniProtKB)
  • Central database of annotated protein sequences and functional information
  • UniProtKB/Swiss-Prot + UniProtKB/TrEMBL
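The three layers can be pictured with a toy model. This is only a sketch under simplifying assumptions: the records, accessions and function names are hypothetical, the archive is reduced to "store each distinct sequence once", and the clusters group only identical knowledgebase sequences (the selected UniParc sets are ignored).

```python
from collections import defaultdict

# Hypothetical records: (source database, accession, sequence). Not real entries.
records = [
    ("UniProtKB/Swiss-Prot", "SP0001", "MKTAYIAKQR"),
    ("UniProtKB/TrEMBL", "TR0001", "MKTAYIAKQR"),
    ("OtherDB", "OT0001", "MKTAYIAKQR"),
    ("UniProtKB/TrEMBL", "TR0002", "MSLLTEVETY"),
    ("OtherDB", "OT0002", "MGSSHHHHHH"),
]


def build_archive(records):
    """Archive layer (completeness): each distinct sequence is stored once,
    together with every source record that carries it."""
    archive = defaultdict(list)
    for source, accession, sequence in records:
        archive[sequence].append((source, accession))
    return dict(archive)


def build_knowledgebase(records):
    """Knowledgebase layer: only the Swiss-Prot and TrEMBL records."""
    return {
        accession: sequence
        for source, accession, sequence in records
        if source.startswith("UniProtKB/")
    }


def build_reference_clusters(knowledgebase):
    """Cluster layer (speed): knowledgebase entries with identical sequences
    are merged into one cluster (a simplification of real clustering)."""
    clusters = defaultdict(list)
    for accession, sequence in knowledgebase.items():
        clusters[sequence].append(accession)
    return sorted(sorted(members) for members in clusters.values())


archive = build_archive(records)
knowledgebase = build_knowledgebase(records)
clusters = build_reference_clusters(knowledgebase)
print(len(archive), len(knowledgebase), clusters)
```

In this toy data the archive holds more distinct sequences than the knowledgebase (completeness), while the clusters give fewer items to search than the knowledgebase has entries (speed).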
Protein DBs
• Swiss-Prot – manually annotated.
• TrEMBL – translated EMBL, automatically annotated.
• UniProtKB – The UniProt Knowledgebase
• UniParc – The UniProt Archive
• PIR – Protein Information Resource
• UniRef – The UniProt Reference Clusters
• PDB – Protein Data Bank – structure
• PRIDE – Resource for experimental proteomics (not in this class)
Databases growth www.genome.jp/en/db_growth.html
Protein DBs
• Swiss-Prot – manually annotated. 2005: ~100,000; 2009: ~400,000
• TrEMBL – translated EMBL, automatically annotated.
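As a back-of-the-envelope reading of the two Swiss-Prot figures, the implied yearly growth can be estimated. The sketch assumes steady compound growth between 2005 and 2009, which the slide does not state; the function name is illustrative.

```python
def annual_growth_rate(start_count, end_count, years):
    """Compound annual growth rate, assuming the same growth every year."""
    return (end_count / start_count) ** (1 / years) - 1


# Approximate figures quoted on the slide.
rate = annual_growth_rate(100_000, 400_000, 2009 - 2005)
print(f"{rate:.1%}")  # prints 41.4%
```

A fourfold increase over four years corresponds to the database doubling roughly every two years.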
Protein Names Different DBs – different accessions
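Because each database uses its own accession style, the shape of an identifier often hints at where it comes from. The sketch below guesses the source from the string alone. The UniProtKB pattern follows the accession format UniProt currently documents; the other patterns are simplified assumptions, and the function name and example strings are for illustration only.

```python
import re

# Patterns are tried in order; each is matched against the whole identifier.
ACCESSION_PATTERNS = [
    ("UniProtKB", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}")),
    ("UniParc", re.compile(r"UPI[0-9A-F]{10}")),         # UPI + 10 hex characters
    ("UniRef", re.compile(r"UniRef(?:100|90|50)_\S+")),  # cluster level + member id
    ("PDB", re.compile(r"[0-9][A-Za-z0-9]{3}")),         # classic 4-character id
]


def guess_database(identifier):
    """Return the name of the first database whose pattern fits, else None."""
    for name, pattern in ACCESSION_PATTERNS:
        if pattern.fullmatch(identifier):
            return name
    return None


for example in ["P04637", "UPI0000000001", "UniRef90_P04637", "1TUP", "not-an-id"]:
    print(example, "->", guess_database(example))
```

A format match only says what an identifier looks like; whether the entry exists, and which entries in other databases describe the same protein, still has to be looked up through cross-references.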
More in UniProt: a complete annotated protein sequence database
What’s in UniProt? http://beta.uniprot.org/
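One way to see what an entry holds is its flat-file text form, where every line starts with a two-letter code (ID, AC, DE, OS, SQ and so on) and the entry ends with "//". The sketch below reads a few of those fields. The excerpt is a made-up entry with hypothetical accessions, not a real record, and the parser handles only the line types shown.

```python
# Hypothetical, heavily shortened entry in the flat-file layout (not a real record).
EXCERPT = """\
ID   EXMP_HUMAN              Reviewed;          10 AA.
AC   X00001; X00002;
DE   RecName: Full=Example protein;
OS   Homo sapiens (Human).
SQ   SEQUENCE   10 AA;
     MKTAYIAKQR
//
"""


def parse_entry(text):
    """Collect a handful of fields from one flat-file entry."""
    fields = {}
    sequence_parts = []
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-entry marker
            break
        code, data = line[:2], line[5:].strip()
        if code == "  ":  # sequence lines carry no line code
            sequence_parts.append(data.replace(" ", ""))
        else:
            fields.setdefault(code, []).append(data)
    accessions = [
        acc.strip()
        for ac_line in fields.get("AC", [])
        for acc in ac_line.split(";")
        if acc.strip()
    ]
    return {
        "entry_name": fields["ID"][0].split()[0],
        "accessions": accessions,
        "description": " ".join(fields.get("DE", [])),
        "organism": " ".join(fields.get("OS", [])).rstrip("."),
        "sequence": "".join(sequence_parts),
    }


entry = parse_entry(EXCERPT)
print(entry["entry_name"], entry["accessions"], entry["organism"], entry["sequence"])
```

Real entries carry many more line types (gene names, references, comments, cross-references, features), so a maintained parser is the better choice for actual work.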
PIR – Protein Information Resource
• Integrated Protein Literature, Information and Knowledge
• Protein Family Classification System
• Integrated Protein Knowledgebase