Today’s menu: UniProt - SwissProt/TrEMBL, PROSITE, Pfam, Gene Ontology

Tutorial 7. Protein and Function Databases.

ahorner

Presentation Transcript


  1. Tutorial 7: Protein and Function Databases. Today’s menu: UniProt - SwissProt/TrEMBL, PROSITE, Pfam, Gene Ontology

  2. Glossary
     Domain: a structural unit that can be found in multiple protein contexts.
     Motif: a short unit found outside globular domains.
     Repeat: a short unit that is unstable in isolation but forms a stable structure when multiple copies are present.
     Family: a collection of related proteins.

  3. UniProt http://www.uniprot.org/ The Universal Protein Resource (UniProt) is a central repository of protein sequences, functions, classifications, and cross-references. It was created by joining the information contained in Swiss-Prot and TrEMBL.
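
One place where the Swiss-Prot/TrEMBL split shows up in practice is the FASTA header of a UniProt entry, where the first field is `sp` for Swiss-Prot (manually reviewed) and `tr` for TrEMBL (unreviewed). The sketch below parses such a header. The function name `parse_uniprot_header` and the example header (accession `P00000`, entry name `EXMP_HUMAN`) are made up for illustration and do not refer to a real entry.

```python
def parse_uniprot_header(header):
    """Split a UniProt-style FASTA header into its first three fields."""
    # Header layout assumed here: >db|accession|entry_name description...
    db, accession, rest = header.lstrip(">").split("|", 2)
    entry_name = rest.split(" ", 1)[0]
    # 'sp' marks Swiss-Prot entries, 'tr' marks TrEMBL entries.
    section = {"sp": "Swiss-Prot", "tr": "TrEMBL"}.get(db, "unknown")
    return {"section": section, "accession": accession, "entry_name": entry_name}


# Hypothetical header, for illustration only.
example = ">sp|P00000|EXMP_HUMAN Example protein OS=Homo sapiens OX=9606 GN=EXMP PE=1 SV=1"
print(parse_uniprot_header(example))
```

A header starting with `>tr|` would be reported as TrEMBL by the same function.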

  4. Hypothetical proteins Characterized proteins

  5. Pfam • http://pfam.sanger.ac.uk/ • Pfam is a database of multiple alignments of protein domains or conserved protein regions.

  6. One more example

  7. Description Structure info Gene Ontology Links

  8. What kind of domains can we find in Pfam?
     Trusted domains
     Repeats and motifs
     Fragment domains
     Nested domains
     Disulfide bonds
     Important residues (e.g. active sites)
     Transmembrane domains

  9. What kind of domains can we find in Pfam?
     Context domains: domains that do not score above the family threshold but are expected to be real, based on the other domains found in the protein.
     Signal peptides: indicate a protein that will be secreted.
     Low complexity regions
     Coiled coils: two or three alpha helices that wind around each other.

  10. http://www.expasy.org/tools/scanprosite PROSITE is a database of protein domains and motifs that can be searched by either regular expression patterns or sequence profiles.
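
To make the "regular expression patterns" concrete, here is a minimal sketch that translates a PROSITE-style pattern into a Python regular expression. It assumes the usual pattern conventions (`-` separates positions, `x` is any residue, `[..]` lists allowed residues, `{..}` lists forbidden residues, `(n)` or `(n,m)` gives a repeat count, `<` and `>` anchor the sequence ends). The function name `prosite_to_regex` is hypothetical, and rarer syntax variants are not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex string."""
    pattern = pattern.strip().rstrip(".")  # patterns may end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (1,3).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # forbidden residues
            core = "[^" + core[1:-1] + "]"
        # Single letters and [..] sets are already valid regex.
        count = "{" + repeat + "}" if repeat else ""
        parts.append(prefix + core + count + suffix)
    return "".join(parts)


print(prosite_to_regex("A-x(1,3)-A"))       # A.{1,3}A
print(prosite_to_regex("N-{P}-[ST]-{P}."))  # N[^P][ST][^P]
```

The resulting string can be passed to `re.search` or `re.finditer` to scan a protein sequence.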

  11. Search Results Domains architecture

  12. PRATT: make a pattern from FASTA-format sequences in order to query PROSITE http://www.expasy.ch/tools/pratt/

  13. Greed, Overlap and Include Search A-x(1,3)-A on ABACADAEAFA
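
The effect of these options can be sketched with Python's `re` module, writing A-x(1,3)-A as the regex `A.{1,3}A`. This is only an approximation of how a pattern scanner behaves: "greedy" and "lazy" are regex quantifier modes, overlapping hits are found with a lookahead, and the helper `all_matches` (a name made up for this sketch) lists every hit, including those contained inside longer ones.

```python
import re

SEQ = "ABACADAEAFA"

# Greedy: each x(1,3) takes as many residues as possible; hits do not overlap.
greedy = re.findall(r"A.{1,3}A", SEQ)

# Lazy (non-greedy): each x(1,3) takes as few residues as possible.
lazy = re.findall(r"A.{1,3}?A", SEQ)

# Overlapping: a lookahead lets a new hit start at every position.
overlapping = re.findall(r"(?=(A.{1,3}A))", SEQ)


def all_matches(seq, min_gap=1, max_gap=3):
    """List every A-x(min_gap,max_gap)-A hit, including nested ones."""
    hits = []
    for start in range(len(seq)):
        if seq[start] != "A":
            continue
        for gap in range(min_gap, max_gap + 1):
            end = start + gap + 1
            if end < len(seq) and seq[end] == "A":
                hits.append(seq[start:end + 1])
    return hits


print(greedy)            # ['ABACA', 'AEAFA']
print(lazy)              # ['ABA', 'ADA', 'AFA']
print(overlapping)       # ['ABACA', 'ACADA', 'ADAEA', 'AEAFA', 'AFA']
print(all_matches(SEQ))  # nine hits in total
```

The same pattern therefore reports two, three, five, or nine hits on this sequence depending on how greed, overlap, and inclusion are handled.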

  14. Gene Ontology (GO) http://www.geneontology.org/ • It is a database of biological processes, molecular functions and cellular components. • GO does not contain sequence information or gene or protein descriptions. • GO is linked to gene and protein databases. • GO is structured as a hierarchy of terms (a directed acyclic graph rather than a strict tree).

  15. Three principal branches http://www.geneontology.org/amigo/

  16. GO structure is a Directed Acyclic Graph
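
What distinguishes a directed acyclic graph from a tree is that a term may have more than one parent. The toy example below stores each term's parents in a dictionary and collects all ancestors of a term. The graph is a hand-made illustration: the last term is hypothetical, and the names `PARENTS` and `ancestors` are assumptions of this sketch, not part of any GO tool.

```python
# Toy child -> parents mapping; the last term is hypothetical.
PARENTS = {
    "molecular_function": [],
    "binding": ["molecular_function"],
    "catalytic activity": ["molecular_function"],
    "hypothetical binding enzyme term": ["binding", "catalytic activity"],
}


def ancestors(term, parents):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:  # a node can be reached by several paths
            seen.add(node)
            stack.extend(parents[node])
    return seen


print(sorted(ancestors("hypothetical binding enzyme term", PARENTS)))
# ['binding', 'catalytic activity', 'molecular_function']
```

In a strict tree the last term could sit under only one of its two parents; in a DAG an annotation to it implies both.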

  17. Important: note the source (evidence code) of each GO entry

  18. GO sources
      ISS  Inferred from Sequence/Structural Similarity
      IDA  Inferred from Direct Assay
      IPI  Inferred from Physical Interaction
      TAS  Traceable Author Statement
      NAS  Non-traceable Author Statement
      IMP  Inferred from Mutant Phenotype
      IGI  Inferred from Genetic Interaction
      IEP  Inferred from Expression Pattern
      IC   Inferred by Curator
      ND   No Data available
      IEA  Inferred from Electronic Annotation
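
Because the evidence code says how an annotation was obtained, a common first step is to filter annotations by code, for example setting aside purely electronic (IEA) entries. The sketch below does this on a small list of made-up annotations. The gene names, term labels, and the helper `filter_annotations` are hypothetical.

```python
# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}

# Everything except electronic annotations and "no data" placeholders.
MANUAL_CODES = set(EVIDENCE_CODES) - {"IEA", "ND"}


def filter_annotations(annotations, allowed):
    """Keep (gene, term, evidence_code) tuples whose code is in `allowed`."""
    return [a for a in annotations if a[2] in allowed]


# Made-up annotations, for illustration only.
annotations = [
    ("geneA", "toy term 1", "IDA"),
    ("geneA", "toy term 2", "IEA"),
    ("geneB", "toy term 1", "ISS"),
    ("geneC", "toy term 3", "ND"),
]

for gene, term, code in filter_annotations(annotations, MANUAL_CODES):
    print(gene, term, code, "-", EVIDENCE_CODES[code])
```

Passing a narrower set, such as `{"IDA", "IPI", "IMP", "IGI", "IEP"}`, would keep only annotations backed by the experimental codes in the list.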

  19. InterPro http://www.ebi.ac.uk/interpro/
