A Proteomics Toolkit: UniProt, InterPro and IntAct Databases at the EBI. European Bioinformatics Institute (http://www.ebi.ac.uk/), Hinxton, U.K. Created as part of the EMBL in 1992.
A Proteomics Toolkit: UniProt, InterPro and IntAct Databases at the EBI
European Bioinformatics Institute (http://www.ebi.ac.uk/)
• Created as part of the EMBL in 1992
• To house the EMBL Nucleotide Sequence Data Library, established in 1980
Today, 3 databases accept primary nucleotide data:
• EMBL, at the EBI (EMBL)
• GenBank, at the NCBI (NIH)
• DDBJ, at the CIB (NIG)
European Bioinformatics Institute (http://www.ebi.ac.uk/) EMBL-EBI maintains the world’s most comprehensive range of molecular databases
• Nucleotide Sequence Database
• Automatic Annotation of Genomes
• Alternative Transcript Diversity
• ArrayExpress
• Alternative Splicing Database
• Protein Sequence Database
• Molecular Structure Database
• Database of Protein Families and Domains
• Chemical Entities of Biological Interest
• Enzyme Database
• Protein Interaction Database
• Gene Ontology
• Database of Biological Processes
Roles of Public Domain Databases
• To provide stable, long-term sources of basic information
• To respond, over the long term, to the needs of the community
• To act as repositories for published information
• To bridge the gap between multiple data sources
Protein Databases
• UniProt: Database of Protein Sequences
• InterPro: Database of Protein Families and Domains
• IntAct: Database of Protein Interactions
UniProt
• A central repository of protein sequence and function
• World's most comprehensive catalogue of information on proteins
• Based on the original work of PIR, Swiss-Prot and TrEMBL
• Funded mainly by NIH
[Diagram] Two sources of sequence data:
• Protein sequencing (Met-Gln-Pro-Glu-Glu-Gly-Thr-Gly-Trp-Leu-Leu-Glu-Val-Gln-Gln-Met-Gly-Arg-Gly-Arg-Cys-Val-Gly-Pro-Ser-Leu-Gln-Glu-Trp-Arg-) feeds Swiss-Prot annotation
• Nucleotide sequencing (CGCTGTGATAGCGCTGATCGTGATGCGTATGCAGGTCGT CGCGCCTGTACGCTGAACGCTCGTGACGTGTAGTGCGCG) feeds EMBL
[Diagram] Sources combined in UniProt:
• PIR: PSD annotation
• Swiss-Prot: annotation
• TrEMBL: translated EMBL annotation
• EMBL (at the EBI): nucleotide sequencing (CGCTGTGATAGCGCTGATCGTGATGCGTATGCAGGTCGT CGCGCCTGTACGCTGAACGCTCGTGACGTGTAGTGCGCG)
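TrEMBL is described above as translated EMBL annotation, so a rough sketch of the translation step may help. This is a minimal illustration, not the EBI pipeline: it assumes the standard genetic code, reads only the forward strand in frame 1, and the function name `translate` is made up for this example.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in reading frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First nucleotide line from the slide; choosing frame 1 is an assumption.
print(translate("CGCTGTGATAGCGCTGATCGTGATGCGTATGCAGGTCGT"))
```

Real coding sequences are located from annotated features in the nucleotide entry rather than by translating raw sequence from its first base.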
UniProt: 3 Components
• UniProt Knowledgebase (UniProt)
  • Central repository for annotated protein sequences
  • Swiss-Prot: non-redundant, manually annotated
  • TrEMBL: redundant, automatically annotated
• UniProt Reference Clusters (UniRef)
  • Combines related sequences to speed up searching
  • UniRef100, UniRef90, UniRef50
• UniProt Archive (UniParc)
  • Comprehensive repository for the history of sequences
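The idea behind UniRef100, UniRef90 and UniRef50 (collapsing sequences at 100%, 90% and 50% identity) can be sketched with a toy greedy clustering. This is not UniRef's actual procedure: the similarity measure is a crude stand-in from Python's `difflib` rather than an alignment-based identity, and the sequence names and sequences are invented.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences: dict, threshold: float) -> dict:
    """Each sequence joins the first cluster whose representative is at least
    `threshold` similar to it; otherwise it founds a new cluster."""
    clusters = {}  # representative id -> list of member ids
    # Longest first, so representatives tend to be the longest members.
    ordered = sorted(sequences, key=lambda s: len(sequences[s]), reverse=True)
    for seq_id in ordered:
        for rep_id in clusters:
            if similarity(sequences[rep_id], sequences[seq_id]) >= threshold:
                clusters[rep_id].append(seq_id)
                break
        else:
            clusters[seq_id] = [seq_id]
    return clusters


# Hypothetical toy sequences, not real UniProt entries.
toy = {
    "SEQ_A": "MKTAYIAKQR",
    "SEQ_B": "MKTAYIAKQR",
    "SEQ_C": "MKTAYIAKQL",
    "SEQ_D": "MKTAYIWWWW",
    "SEQ_E": "GGGGGGGGGG",
}
for level in (1.0, 0.9, 0.5):
    print(level, greedy_cluster(toy, level))
```

Lowering the threshold merges more sequences into fewer clusters, which is why searching a lower-identity set is faster but less fine-grained.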
Databases cross-referenced in UniProt (UniProt Explicit Links)
• 2D-gel Electrophoresis: ANU-2DPAGE, Aarhus/Ghent-2DPAGE, COMPLUYEAST-2DPAGE, ECO2DPAGE, HSC-2DPAGE, MAIZE-2DPAGE, OGP, PHCI-2DPAGE, PMMA-2DPAGE, Rat-heart-2DPAGE, Siena-2DPAGE, SWISS-2DPAGE
• Sequence: EMBL/GenBank/DDBJ, PIR
• Organism-Specific: AGD, dbSNP, DictyBase, EcoGene, EchoBASE, FlyBase, GeneDB_Spombe, GeneFarm, Genew, Gramene, HIV, H-InvDB, LegioList, Leproma, ListiList, MaizeDB, MGD, MypuList, OMIM, PhotoList, Reactome, RGD, SagaList, SGD, StyGene, SubtiList, TAIR, TIGR, TubercuList, WormBase, WormPep, ZFIN
• Domains, Sites, Families: Gene3D, HAMAP, InterPro, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, TIGRFAM
• Miscellaneous: Ensembl, GermOnline, Gene Ontology, MEROPS
• PTM: GlycoSuiteDB, PhosSite
• Structure: HSSP, PDB, MSD
• Molecular Interaction: IntAct, TRANSFAC
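In a UniProt flat-file entry, cross-references of this kind appear on `DR` lines, with fields separated by semicolons. A minimal sketch of grouping them by database follows; the function name is made up, and the sample entry and every identifier in it are placeholders rather than real records.

```python
from collections import defaultdict


def parse_cross_references(flat_file_text: str) -> dict:
    """Group the DR (database cross-reference) lines of an entry by database."""
    xrefs = defaultdict(list)
    for line in flat_file_text.splitlines():
        if not line.startswith("DR   "):
            continue  # skip every other line type
        # Drop the line code and the trailing period, then split the fields.
        fields = [f.strip() for f in line[5:].rstrip(".").split(";")]
        database, identifiers = fields[0], fields[1:]
        xrefs[database].append(identifiers)
    return dict(xrefs)


# Placeholder entry: the identifiers below are invented for illustration.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN
DR   InterPro; IPR000000; Example_domain.
DR   Pfam; PF00000; Example; 1.
DR   IntAct; X00000; -.
"""

for database, refs in parse_cross_references(SAMPLE_ENTRY).items():
    print(database, refs)
```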
Searching UniProt
Search tools include:
• Text Search
• Power Search
• BLAST, Fasta and MPsrch
• Links to extra search services (including SRS)
http://www.ebi.uniprot.org/index.shtml
Text-based searching (http://www.ebi.uniprot.org/index.shtml)
• Logical operators ‘&’ (and), ‘|’ (or)
• Wildcards and numerical operators are not allowed
• Text Search: keyword queries
• Power Search: can search for specific entry lines
• Warehouse Search: link query to other databases
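How a keyword query with ‘&’ and ‘|’ might be evaluated can be sketched as below. This is only an illustration of the logic, not the UniProt search engine: it assumes ‘&’ binds more tightly than ‘|’, supports no parentheses, matches whole words only, and uses invented entry identifiers and descriptions.

```python
def matches(query: str, text: str) -> bool:
    """True if any '|' alternative has all of its '&' terms present in text."""
    words = set(text.lower().split())
    return any(
        all(term.strip().lower() in words for term in alternative.split("&"))
        for alternative in query.split("|")
    )


def text_search(query: str, entries: dict) -> list:
    """Return the ids of the entries whose description matches the query."""
    return [entry_id for entry_id, text in entries.items() if matches(query, text)]


# Hypothetical entries; the ids and descriptions are made up.
entries = {
    "ENTRY_1": "ubiquitin-protein ligase E3 Mdm2 human",
    "ENTRY_2": "cellular tumor antigen p53 human",
    "ENTRY_3": "ubiquitin-protein ligase mouse",
}

print(text_search("ligase & human", entries))
print(text_search("p53 | mouse", entries))
```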
Text Search Results: each result is linked to its UniProt entry
Sequence-based searching • BLAST, Fasta, MPsrch
Sequence Search Results
• View alignments
• Identity score
• Link to the UniProt entry
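The identity score reported with each alignment can be illustrated with a small calculation over two already-aligned sequences. This is a sketch under one stated convention (identical residues divided by the number of gap-free columns); search tools differ in the denominator they use, and the function name and toy alignment are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both residues agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap ('-').
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


# Toy alignment: one gap column and one mismatch.
print(round(percent_identity("MKT-AYI", "MKTQAYL"), 1))
```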
Build complex data sets: use Venn diagrams to combine, intersect, or subtract multiple data sets
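The three Venn-diagram operations correspond directly to set union, intersection and difference. A minimal sketch with Python sets, using invented result sets and placeholder accessions:

```python
# Two hypothetical search result sets (placeholder accessions).
ligase_hits = {"ACC_A", "ACC_B", "ACC_C"}
human_hits = {"ACC_B", "ACC_C", "ACC_D"}

combined = ligase_hits | human_hits      # combine: entries in either set
intersected = ligase_hits & human_hits   # intersect: entries in both sets
subtracted = ligase_hits - human_hits    # subtract: ligase hits that are not human

print(sorted(combined))
print(sorted(intersected))
print(sorted(subtracted))
```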
UniProt/Swiss-Prot entry for human ubiquitin-protein ligase E3 mdm2
Merged entries:
• Remove redundancy
• Can still be searched
Some literature search engines pull synonyms from UniProt for more complete searching
• Summary of the nucleotide data on which the entry was originally based
• Structural data associated with the entry protein
IntAct Database: all the interactions involving the entry protein
• Literature citation used for curation
• Taxonomic reference
• Experimental information
• Experimental name
• Experimental technique: co-immunoprecipitation
• Links to interacting protein
• Interaction information
IntAct Database: displays interactions graphically
• View all GO interactions involving MDM2
• View all 7 interactions involving MDM2
• Expand graph to see network surrounding one protein
• Expand graph to see entire network
• View all InterPro entries associated with MDM2
• View all proteins in a network associated with a specific GO term
• View interactions associated with both MDM2 and p53
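The graph views above (expanding the network around one protein, finding what two proteins have in common) can be sketched with a plain adjacency structure. This is an illustration only, not IntAct's data model or API: apart from the MDM2 and p53 pair named in the slides, the partners are invented placeholders, and reading "associated with both" as shared interaction partners is an assumption.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected interaction graph from binary interaction pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def expand(graph, start, hops):
    """Return all proteins within `hops` interaction steps of `start`."""
    seen = {start}
    frontier = {start}
    for _ in range(hops):
        # Next frontier: neighbours of the current frontier not yet visited.
        frontier = {n for p in frontier for n in graph.get(p, ())} - seen
        seen |= frontier
    return seen


def shared_partners(graph, a, b):
    """Proteins that interact with both `a` and `b`."""
    return graph.get(a, set()) & graph.get(b, set())


# Only the MDM2-P53 pair comes from the slides; PARTNER_* are hypothetical.
pairs = [
    ("MDM2", "P53"),
    ("MDM2", "PARTNER_A"),
    ("P53", "PARTNER_A"),
    ("P53", "PARTNER_B"),
    ("PARTNER_B", "PARTNER_C"),
]
graph = build_graph(pairs)
print(sorted(expand(graph, "MDM2", 1)))
print(sorted(shared_partners(graph, "MDM2", "P53")))
```

Raising `hops` step by step mirrors expanding the graph from one protein's neighbourhood towards the entire network.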