
Readings for this week


teryl

Presentation Transcript


  1. Readings for this week: Gogarten et al., Horizontal gene transfer…; Francke et al., Reconstructing metabolic networks… Sign up for a meeting next week for proposal feedback/progress checkup.

  2. Inferring protein function: by genomic context…

  3. Inferring protein function: by homology…

  4. COGs: Clusters of Orthologous Groups (eukaryotic versions are KOGs). Identified using all-against-all sequence comparisons on a collection of complete genomes. Each COG includes genes with orthologous and paralogous relationships. COGs are grouped into large-scale functional categories.
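The slide does not say how all-against-all hits are turned into clusters. A minimal sketch of one common simplification, reciprocal best hits between two genomes, is shown here (the published COG procedure builds on best hits across three or more genomes, which this sketch does not attempt). The `genome|gene` naming scheme, the function names, and the toy scores are all assumptions made for illustration.

```python
def best_hits(scores):
    """Return each gene's top-scoring hit in every other genome.

    `scores` maps (query, subject) -> similarity score. Gene names are
    written 'genome|gene', a convention assumed for this sketch.
    """
    best = {}
    for (query, subject), score in scores.items():
        q_genome = query.split("|")[0]
        s_genome = subject.split("|")[0]
        if q_genome == s_genome:
            continue  # ignore within-genome comparisons
        key = (query, s_genome)
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return {key: hit for key, (hit, _) in best.items()}


def reciprocal_best_hits(scores):
    """Pairs of genes that are each other's best hit across genomes."""
    best = best_hits(scores)
    pairs = set()
    for (query, _), subject in best.items():
        q_genome = query.split("|")[0]
        if best.get((subject, q_genome)) == query:
            pairs.add(tuple(sorted((query, subject))))
    return pairs


# Toy all-against-all scores for two hypothetical genomes A and B.
toy_scores = {
    ("A|a1", "B|b1"): 200, ("B|b1", "A|a1"): 200,
    ("A|a1", "B|b2"): 80,  ("B|b2", "A|a1"): 80,
    ("A|a2", "B|b2"): 150, ("B|b2", "A|a2"): 150,
    ("B|b1", "A|a2"): 30,
}
candidate_orthologs = reciprocal_best_hits(toy_scores)
```

Here `candidate_orthologs` pairs a1 with b1 and a2 with b2; the weaker a1/b2 hit is dropped because neither gene is the other's best match.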

  5. Looking at Parts of Proteins. Domains: conserved structural entities with distinctive secondary structure content and a hydrophobic core. Example: protein kinase domain. Motifs: a pattern of amino acids that is conserved across many proteins and confers a particular function on the protein. Example: zinc finger, CX2-4C....HX2-4H.

  6. How to identify domains? PFAM: Protein Families Database. Based on Hidden Markov Models (HMMs), which are statistical probability models of multiple sequence alignments. Uses manually curated seed alignments (PFAM-A). Based on these alignments, a Position Specific Scoring Matrix (PSSM) is created.

  7. Position Specific Scoring Matrix (PSSM)
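As a rough illustration of what a PSSM holds, the sketch below builds log-odds scores from a tiny ungapped alignment. The uniform background frequency, the pseudocount of 1, and the toy alignment are assumptions chosen for simplicity; real tools use observed background frequencies and more careful sequence weighting.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {amino acid: log-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    pssm = []
    for column in zip(*alignment):
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def score_sequence(pssm, seq):
    """Sum the per-position scores for a sequence of the same length."""
    return sum(col[aa] for col, aa in zip(pssm, seq))


# Hypothetical ungapped alignment of four short peptides.
toy_alignment = ["FGELA", "FGDLA", "YGELS", "FGELA"]
pssm = build_pssm(toy_alignment)
```

Residues that dominate a column (G in the second column here) receive positive scores, while residues never seen in that column receive negative ones, so the same substitution can score differently at different positions.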

  8. PFAM: Protein Families Database. Searching a protein against PFAM results in an E-value with a meaning similar to BLAST E-values (the number of sequences expected to score that well for that domain by chance).
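An E-value is an expected count of chance hits, so it can exceed 1. A small sketch of how it relates to a probability, assuming chance hits follow a Poisson distribution (the usual assumption behind BLAST statistics); the function name is illustrative.

```python
import math


def evalue_to_pvalue(evalue):
    """Probability of at least one chance hit, given the expected count.

    Assumes chance hits are Poisson distributed: P = 1 - exp(-E).
    """
    return -math.expm1(-evalue)
```

For small values the two are nearly identical (an E-value of 1e-5 gives a probability of about 1e-5), which is why they are often used interchangeably for strong hits; they diverge as the E-value approaches and passes 1.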

  9. Other Protein Databases. SMART: uses HMMs; the focus is signalling and regulatory proteins (which tend to be more divergent than enzymes). TIGRFAMs: TIGR-curated alignments used to generate HMMs; one advantage is that names should be functionally accurate for all proteins they represent. PRINTS: not HMM based; uses “fingerprints” of conserved motifs. Ecumenical solution: InterPro, a collection of multiple databases under one umbrella.

  10. Still more kinds of BLAST. PSI-BLAST: Position-Specific Iterated BLAST. Use it to find members of a protein family or to build a custom position-specific scoring matrix. It is the most sensitive BLAST program, which makes it useful for finding very distantly related proteins or new members of a protein family. 1st round: a standard BLASTP search, after which a PSSM is built from all hits with E-values better than the inclusion threshold. 2nd round: the PSSM is used to score alignments in a new search, and additional hits better than the inclusion threshold are incorporated into an updated PSSM. 3rd and later rounds: as the second round. The search reaches convergence when no new hits are found. The PSSM can be saved for use in later searches.
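The round-by-round logic can be sketched with a toy model. This is not how PSI-BLAST is implemented: the sketch assumes ungapped peptides of equal length, uses a raw score cutoff in place of an E-value inclusion threshold, and lets a PSSM built from the query alone stand in for the first BLASTP round. All names and data are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(seqs, pseudocount=0.5):
    """Log-odds PSSM from equal-length sequences (uniform background)."""
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for column in zip(*seqs):
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2((column.count(aa) + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, seq):
    return sum(col[aa] for col, aa in zip(pssm, seq))


def iterated_search(query, database, threshold, max_rounds=10):
    """Repeat search and PSSM rebuild until no new hits pass the threshold."""
    included = {query}
    for round_number in range(1, max_rounds + 1):
        pssm = build_pssm(sorted(included))
        hits = {seq for seq in database if score(pssm, seq) >= threshold}
        new_hits = hits - included
        if not new_hits:
            return included, round_number  # converged
        included |= new_hits  # updated PSSM is built next round
    return included, max_rounds


toy_database = ["FGELA", "FGELS", "YGELS", "YGDLS", "WWWWW"]
family, rounds = iterated_search("FGELA", toy_database, threshold=5.0)
```

In this toy run the most divergent family member, YGDLS, scores below the cutoff against the query alone and is only recovered after closer relatives have been folded into the PSSM, which is the effect the slide describes.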

  11. Still more kinds of BLAST. PHI-BLAST: Pattern-Hit Initiated BLAST. Finds proteins similar to the query around a given pattern. You must enter both a query sequence containing the pattern AND the pattern to search on. Example patterns: (easy) FGELA; (harder) [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV]. Matching peptide: FGELALMYNTPRAATIVA.
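The example pattern is written in PROSITE-style syntax, which translates almost directly into a regular expression. The sketch below covers only the pattern-matching step (not the BLAST alignment that PHI-BLAST performs around each match), and the helper name `prosite_to_regex` is an illustrative choice.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        # Separate the residue spec from an optional repeat such as (5,11).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        residue, low, high = match.groups()
        if residue == "x":
            residue = "."  # x means any amino acid
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"  # {..} means none of these
        if low is not None:
            residue += "{" + low + ("," + high if high else "") + "}"
        parts.append(residue)
    return "".join(parts)


pattern = "[LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV]"
regex = prosite_to_regex(pattern)
hit = re.search(regex, "FGELALMYNTPRAATIVA")
```

The peptide from the slide matches along its full length: FGELAL satisfies the first six positions, MYNTP fills the x(5,11) spacer, and RAATIVA satisfies the rest.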

  12. Enzyme Nomenclature. EC numbers: a hierarchical classification scheme for enzymes. Enzymes are named and classified according to the reactions they catalyze: • Oxidoreductases • Transferases • Hydrolases • Lyases • Isomerases • Ligases
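Because the scheme is hierarchical, each EC number can be read left to right from the broadest class to the specific enzyme. A minimal sketch, with the six classes numbered 1 to 6 in the order listed on the slide; the function name and the returned dictionary layout are assumptions.

```python
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec(ec_number):
    """Split an EC number into its top-level class and nested levels."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    # Build the nested prefixes: '2', '2.7', '2.7.1', '2.7.1.1'.
    hierarchy = [".".join(fields[: i + 1]) for i in range(len(fields))]
    return {"class": EC_CLASSES[int(fields[0])], "hierarchy": hierarchy}


hexokinase = parse_ec("EC 2.7.1.1")
```

For hexokinase (EC 2.7.1.1) this yields the class Transferases and the prefixes 2, 2.7, 2.7.1 and 2.7.1.1, each naming a progressively narrower group of reactions.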

  13. Putting it all together… KEGG: Kyoto Encyclopedia of Genes and Genomes. A collection of manually drawn metabolic/cellular pathway maps, based on the most up-to-date biochemical information. Metabolic maps are its strongest feature: they use EC-numbered enzymes as key players, allowing pathways of different genomes to be easily mapped based on their predetermined EC content. It also has a growing collection of signalling/cellular process maps.
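Mapping a genome onto a pathway by EC content amounts to a set comparison. The sketch below does not call KEGG; the pathway is a hand-typed list of four early glycolysis enzymes and the genome annotation is hypothetical, both chosen only to illustrate the idea.

```python
def pathway_coverage(pathway_ecs, genome_ecs):
    """Fraction of a pathway's EC numbers annotated in a genome,
    plus the sorted list of EC numbers that are missing."""
    present = pathway_ecs & genome_ecs
    missing = pathway_ecs - genome_ecs
    return len(present) / len(pathway_ecs), sorted(missing)


# First steps of glycolysis, keyed by EC number.
upper_glycolysis = {
    "2.7.1.1",   # hexokinase
    "5.3.1.9",   # glucose-6-phosphate isomerase
    "2.7.1.11",  # 6-phosphofructokinase
    "4.1.2.13",  # fructose-bisphosphate aldolase
}

# Hypothetical set of EC numbers assigned to one genome's proteins.
toy_genome = {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.1.1.1"}

coverage, missing = pathway_coverage(upper_glycolysis, toy_genome)
```

Here three of the four steps are covered and 2.7.1.11 is reported missing, the kind of gap that would prompt a closer look for an unannotated or divergent enzyme.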
