Tutorial 7: Protein and Function Databases
Today’s menu: UniProt (Swiss-Prot/TrEMBL), PROSITE, Pfam, Gene Ontology
Hypothetical proteins Characterized proteins
UniProt http://www.uniprot.org/ The Universal Protein Resource (UniProt) is a central repository of protein sequence, function, classification, and cross-reference data. It was created by joining the information contained in Swiss-Prot and TrEMBL.
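Because UniProt merges Swiss-Prot and TrEMBL, each entry still records which of the two sections it belongs to. A minimal sketch of reading that from a UniProtKB FASTA header line follows. It assumes the usual `>db|Accession|EntryName ... OS= OX= [GN=] PE= SV=` layout, where `sp` marks Swiss-Prot and `tr` marks TrEMBL. The function name `parse_uniprot_header` is made up for this example, the first header was typed in by hand for illustration, and the second header is entirely hypothetical.

```python
import re

# Assumed UniProtKB FASTA header layout:
# >db|Accession|EntryName ProteinName OS=Organism OX=TaxID [GN=Gene] PE=n SV=n
HEADER_RE = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein_name>.+?)\s+OS=(?P<organism>.+?)\s+OX=(?P<taxon_id>\d+)"
    r"(?:\s+GN=(?P<gene>\S+))?\s+PE=(?P<existence>\d)\s+SV=(?P<version>\d+)\s*$"
)

# "sp" entries are manually reviewed, "tr" entries are automatically annotated.
SECTION = {"sp": "Swiss-Prot (reviewed)", "tr": "TrEMBL (unreviewed)"}


def parse_uniprot_header(line):
    """Split one FASTA header line into named fields (hypothetical helper)."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("not a recognised UniProtKB FASTA header: " + line)
    fields = match.groupdict()
    fields["section"] = SECTION[fields["db"]]
    return fields


# Illustrative header, typed in by hand.
reviewed = parse_uniprot_header(
    ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha "
    "OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2"
)

# Entirely hypothetical TrEMBL-style header without a gene name.
unreviewed = parse_uniprot_header(
    ">tr|X0X0X0|X0X0X0_9BACT Uncharacterized protein "
    "OS=Example bacterium OX=12345 PE=4 SV=1"
)

print(reviewed["accession"], reviewed["section"])
print(unreviewed["accession"], unreviewed["section"])
```

The accession field is the identifier used to look the same protein up in the other databases.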
Pfam • http://pfam.sanger.ac.uk/ • Pfam is a database of multiple alignments of protein domains or conserved protein regions.
Description Structure info Gene Ontology Links
What kind of domains can we find in Pfam?
• Trusted domains
• Repeats and motifs
• Fragment domains
• Nested domains
• Disulfide bonds
• Important residues (e.g. active sites)
• Transmembrane domains
What kind of domains can we find in Pfam?
• Context domains: domains that, despite not scoring above the family threshold, are expected to be real based on the other domains found in the protein.
• Signal peptides: indicate a protein that will be secreted.
• Low-complexity regions
• Coiled coils: two or three alpha helices that wind around each other.
http://www.expasy.org/tools/scanprosite PROSITE is a database of protein domains and motifs that can be searched by either regular-expression patterns or sequence profiles.
Search Results Domains architecture
PRATT Make a pattern from FASTA-format sequences in order to query PROSITE http://www.expasy.ch/tools/pratt/
Greedy, Overlap and Include options: search A-x(1,3)-A on ABACADAEAFA
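The effect of the greedy and overlap settings on this search can be tried out with ordinary regular expressions. The sketch below translates a PROSITE-style pattern into a Python regex and runs it in each mode. The helper names `prosite_to_regex` and `find_matches` are made up for this example, only a subset of the pattern syntax is handled, and the regex behaviour is an analogy for the options rather than the scanning tool's own implementation. The include option is not modelled.

```python
import re

# One pattern element: optional "<" anchor, a residue spec, optional (n) or (n,m), optional ">".
ELEMENT_RE = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+(?:,\d+)?)\))?(>?)"
)


def prosite_to_regex(pattern, greedy=True):
    """Translate a PROSITE-style pattern such as 'A-x(1,3)-A' into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = ELEMENT_RE.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: " + element)
        start, residue, repeat, end = match.groups()
        if residue == "x":
            token = "."                           # x = any residue
        elif residue.startswith("{"):
            token = "[^" + residue[1:-1] + "]"    # {AB} = anything except A or B
        else:
            token = residue                       # single residue or [AB] set
        if repeat:
            token += "{" + repeat + "}"
            if not greedy and "," in repeat:
                token += "?"                      # prefer the shortest gap
        parts.append(("^" if start else "") + token + ("$" if end else ""))
    return "".join(parts)


def find_matches(pattern, sequence, greedy=True, overlap=False):
    """Return matched substrings; with overlap=True every start position is tried."""
    regex = prosite_to_regex(pattern, greedy)
    if overlap:
        # A lookahead consumes no characters, so matches may share residues.
        return [m.group(1) for m in re.finditer("(?=(" + regex + "))", sequence)]
    return [m.group(0) for m in re.finditer(regex, sequence)]


sequence = "ABACADAEAFA"
for greedy in (True, False):
    for overlap in (False, True):
        hits = find_matches("A-x(1,3)-A", sequence, greedy=greedy, overlap=overlap)
        print("greedy=%s overlap=%s -> %s" % (greedy, overlap, hits))
```

With greedy matching and no overlap this finds ABACA and AEAFA; turning greedy off gives ABA, ADA and AFA; allowing overlaps adds the matches that start at the remaining A positions.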
Gene Ontology (GO) http://www.geneontology.org/
• It is a database of biological processes, molecular functions and cellular components.
• GO does not contain sequence information or gene or protein descriptions.
• GO is linked to gene and protein databases.
• The GO database is structured hierarchically; a term can have more than one parent.
Three principal branches http://www.geneontology.org/amigo/
GO structure is a Directed Acyclic Graph
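The practical difference between a tree and a directed acyclic graph is that a term may have several parents, so collecting a term's ancestors means following every parent link. A minimal sketch follows, using a small hand-made child-to-parents map. The term labels are simplified illustrations, not an extract of the real ontology, and `ancestors` and `is_tree` are hypothetical helper names.

```python
# Toy child -> parents map (illustrative labels only, not real GO data).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalytic activity": ["root"],
    "nucleotide binding": ["binding"],
    "kinase activity": ["catalytic activity"],
    # Two parents: this is what makes the structure a DAG rather than a tree.
    "ATP-dependent kinase": ["nucleotide binding", "kinase activity"],
}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def is_tree(parents):
    """A hierarchy is a tree only if no term has more than one parent."""
    return all(len(p) <= 1 for p in parents.values())


print(sorted(ancestors("ATP-dependent kinase", PARENTS)))
print("tree?", is_tree(PARENTS))
```

A protein annotated with the most specific term is implicitly annotated with all of that term's ancestors, along both parent paths.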
GO sources (evidence codes)
• ISS: Inferred from Sequence/Structural Similarity
• IDA: Inferred from Direct Assay
• IPI: Inferred from Physical Interaction
• TAS: Traceable Author Statement
• NAS: Non-traceable Author Statement
• IMP: Inferred from Mutant Phenotype
• IGI: Inferred from Genetic Interaction
• IEP: Inferred from Expression Pattern
• IC: Inferred by Curator
• ND: No Data available
• IEA: Inferred from Electronic Annotation
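Evidence codes are typically used to filter annotations by how well supported they are, for example keeping only those backed by an experiment. A minimal sketch follows. The annotation tuples, protein names and term names are hypothetical placeholders, and `filter_by_evidence` is a made-up helper, not part of any GO tool.

```python
# Evidence codes from the list above.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}

# Codes that rest on an experiment on the annotated gene product itself.
EXPERIMENTAL = {"IDA", "IPI", "IMP", "IGI", "IEP"}

# Hypothetical annotations: (protein, GO term, evidence code).
annotations = [
    ("protein_1", "term_a", "IDA"),
    ("protein_1", "term_b", "IEA"),
    ("protein_2", "term_a", "ISS"),
    ("protein_3", "term_c", "IMP"),
]


def filter_by_evidence(rows, allowed):
    """Keep only annotations whose evidence code is in the allowed set."""
    unknown = {code for _, _, code in rows} - set(EVIDENCE_CODES)
    if unknown:
        raise ValueError("unknown evidence codes: " + ", ".join(sorted(unknown)))
    return [row for row in rows if row[2] in allowed]


for protein, term, code in filter_by_evidence(annotations, EXPERIMENTAL):
    print(protein, term, code, "-", EVIDENCE_CODES[code])
```

IEA annotations are assigned automatically without curator review, so they are the ones most often excluded in this kind of filter.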
InterPro http://www.ebi.ac.uk/interpro/
Exercise 1. Using UniProt, find the accession number of the human gene PRP. What is this gene?
2. Use the accession number to search Pfam. What domains did you find?
3. a. Double-click on the Prion domain. b. Choose the “Alignments” option from the left toolbar. c. Press the “View” button to see the alignments. d. Copy the alignments using the following instructions:
4. Use the alignments as input for PRATT (http://www.expasy.ch/tools/pratt/) to find a motif with which to scan PROSITE. How many results did you find using PROSITE?