Outline
• Homology
• Function annotation transfer
• Domains
• Protein-family Databases
• Profile-HMMs
• Pfam database
• How to build a new (Pfam) protein family
EMBO Workshop, Cape Town, 2014
Homology
Definition: Two proteins are homologous if they share a common ancestor, i.e. they are evolutionarily related
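[Figure: proteins A, B and C joined pairwise by "homologous" links] Homology is symmetric and transitive: if A is homologous to B, then B is homologous to A; if A is homologous to B and B is homologous to C, then A is homologous to C.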
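Because the relation is symmetric and transitive, pairwise detections can be merged into groups even when some pairs were never detected directly. A minimal sketch of that idea, assuming hypothetical protein names and an illustrative list of detected pairs; the function name `homology_groups` is made up for this example:

```python
def homology_groups(proteins, homologous_pairs):
    """Group proteins into the classes implied by symmetry and transitivity."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the group representative, shortening the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in homologous_pairs:
        # Symmetry: (a, b) and (b, a) merge the same two groups.
        parent[find(a)] = find(b)

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical input: A-B and B-C were detected, A-C was not.
print(homology_groups(["A", "B", "C", "D"], [("A", "B"), ("B", "C")]))
```

Here A and C end up in the same group through B, while D stays on its own.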
Detecting homology
Sequence similarity
Human:   1 MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE 60
           MGLSDGEWQLVLNVWGKVEAD  GHGQEVLI LFK HPETL KFDKFK LKSE  MK SE
Mouse:   1 MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE 60
Human:  61 DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH 120
           DLKKHG TVLTALG ILKKKG H AEI PLAQSHATKHKIPVKYLEFISE II VL   H
Mouse:  61 DLKKHGCTVLTALGTILKKKGQHAAEIQPLAQSHATKHKIPVKYLEFISEIIIEVLKKRH 120
Human: 121 PGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154
            GDFGADAQGAM KALELFR D A  YKELGFQG
Mouse: 121 SGDFGADAQGAMSKALELFRNDIAAKYKELGFQG 154
By excess similarity (see Pearson, Curr. Protoc. Bioinformatics, 2013)
Statistical significance (e.g. E-values)
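Two quantities usually read off such an alignment are the percent identity and the E-value. The sketch below is illustrative only: it assumes identity is counted over columns where neither sequence has a gap (other conventions exist), and it uses the standard bit-score form E = m · n · 2^(-S') without the length corrections that real search tools apply. The function names and the example numbers are assumptions, not part of the slides.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y)
    return 100.0 * identical / len(columns)


def evalue_from_bitscore(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least bit_score: m * n * 2^-S'."""
    return query_len * db_len * 2.0 ** (-bit_score)


# First 60 residues of the human and mouse sequences from the slide.
human = "MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE"
mouse = "MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE"
print(round(percent_identity(human, mouse), 1))

# Illustrative numbers only: a 50-bit hit, a 154-residue query, a 1e8-residue database.
print(evalue_from_bitscore(50, 154, 1e8))
```

A small E-value means a score this high is unlikely to arise by chance in a database of that size, which is the sense of "excess similarity" used here.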
Structural similarity
2G2X:  1 MAYWLMKSEPDELSIEALARLGEARWDGVRNYQARNFLRAMSVGDEFFFYH-----SSCP 55
         MAYWL     D              W     Y   N      VGD    Y
2P5D:  4 MAYWLCITNEDNWKVIKEKKI----WGVAERY--KNTINKVKVGDKLIIYEIQRSGKDYK 57
2G2X: 56 QPGIAGIARITRAAYPD------PTALDPESHY 82
          P I G        Y D      PT   P
2P5D: 58 PPYIRGVYEVVSEVYKDSSKIFKPTPRNPNEKF 90
Excess sequence similarity?
Structural similarity 2P5D 2G2X: Z-score = 12.2, RMSD = 2.9 Å, Lali = 122, %id = 20. DALI: http://ekhidna.biocenter.helsinki.fi/dali_lite/start
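For orientation, the RMSD value is the root-mean-square distance between paired atoms after the two structures have been superposed. The sketch below shows only that final formula, assuming the coordinates are already superposed and paired; DALI performs its own alignment and superposition, and the coordinates here are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superposed 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha coordinates (in Å) for three aligned residue pairs.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(a, b))
```

In this toy case every pair is exactly 1 Å apart, so the RMSD is 1 Å.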
Genomic context http://www.microbesonline.org
Origins of homology in proteins
• Speciation (orthology)
• Gene duplication (paralogy)
• Horizontal gene transfer (xenology)
• Whole genome duplication (ohnology)
• Gametology
Orthology Myoglobin: Serves as a reserve supply of oxygen and facilitates the movement of oxygen within muscles.
Paralogy Myoglobin: Serves as a reserve supply of oxygen and facilitates the movement of oxygen within muscles. Hemoglobin: Oxygen-transport protein in red-blood cells of vertebrates
[Figure: tree leading from an ancestral globin to Myo and Hemo copies, with labels A, B and C]
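The distinction between orthologs and paralogs can be read off a gene tree: two genes are orthologs if their last common ancestor node is a speciation, and paralogs if it is a duplication. The sketch below is a toy illustration; the tree topology (a globin duplication followed by a split between two species B and C), the gene names and the event labels are all assumptions made for the example, not a reconstruction of the figure.

```python
# Toy gene tree: an internal node is (event, left_subtree, right_subtree),
# a leaf is a gene name. Event labels are assumed to be known in advance.
GENE_TREE = (
    "duplication",
    ("speciation", "Myo_B", "Myo_C"),
    ("speciation", "Hemo_B", "Hemo_C"),
)


def leaves(node):
    """Set of gene names below a node."""
    if isinstance(node, str):
        return {node}
    _event, left, right = node
    return leaves(left) | leaves(right)


def relationship(tree, gene_x, gene_y):
    """Classify two genes by the event at their last common ancestor."""
    if gene_x == gene_y:
        raise ValueError("need two different genes")
    if isinstance(tree, str) or not {gene_x, gene_y} <= leaves(tree):
        raise ValueError("both genes must be leaves of the tree")
    event, left, right = tree
    for child in (left, right):
        # Descend while both genes still sit in the same subtree.
        if {gene_x, gene_y} <= leaves(child):
            return relationship(child, gene_x, gene_y)
    return "orthologs" if event == "speciation" else "paralogs"


print(relationship(GENE_TREE, "Myo_B", "Myo_C"))
print(relationship(GENE_TREE, "Myo_B", "Hemo_C"))
```

Under these assumptions the two myoglobins are orthologs, while any myoglobin and any hemoglobin are paralogs, including pairs taken from different species.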
Origin of protein homology
• Speciation (orthology)
• Gene duplication (paralogy)
• Horizontal gene transfer (xenology)
• Whole genome duplication (ohnology)
• Gametology, Synology
Homology: why bother? Slide courtesy of Alex Mitchell (EMBL-EBI)
Homology: why bother? Structure (homology modeling) Homology Function?
Protein function(s) Schubert et al. Nat. Struct. Biol. 5 (1998)
The Gene Ontology (GO)
• A way to capture biological knowledge in a written and computable form
• A set of concepts and their relationships to each other
www.ebi.ac.uk/QuickGO
Slide courtesy of Alex Mitchell (EMBL-EBI)
GO: 3 ontologies in 1
1. Molecular Function: an elemental activity or task or job
• protein kinase activity
• insulin receptor activity
2. Biological Process: a commonly recognised series of events
• cell division
3. Cellular Component: where a gene product is located
• mitochondrion
• mitochondrial matrix
• mitochondrial inner membrane
Slide courtesy of Alex Mitchell (EMBL-EBI)
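"Computable" here means the terms form a graph that software can traverse. The sketch below uses a toy fragment built from the term names on this slide; the edges are illustrative, intermediate terms are omitted, and nothing here is an export of the real Gene Ontology. The names `TOY_GO` and `ancestors` are made up for the example.

```python
# Toy ontology fragment: child term -> list of (relationship, parent term).
# Edges are illustrative only; real GO has many intermediate terms.
TOY_GO = {
    "mitochondrial matrix": [("part_of", "mitochondrion")],
    "mitochondrial inner membrane": [("part_of", "mitochondrion")],
    "mitochondrion": [("is_a", "cellular component")],
    "insulin receptor activity": [("is_a", "protein kinase activity")],
    "protein kinase activity": [("is_a", "molecular function")],
}


def ancestors(term, graph):
    """All terms reachable from `term` by following parent edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(ancestors("mitochondrial matrix", TOY_GO)))
```

A traversal like this is what lets an annotation to a specific term also be found when querying a more general one.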
Protein Families
Globins in Human http://www.studyblue.com/notes/note/n/exam-3/deck/8955883
Definition: We call ‘family’ a group of evolutionarily related proteins or protein regions
Why protein families? [Figure: proteins A and P]
Why protein families?
Human:   1 MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE 60
           MGLSDGEWQLVLNVWGKVEAD  GHGQEVLI LFK HPETL KFDKFK LKSE  MK SE
Mouse:   1 MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE 60
Human:  61 DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH 120
           DLKKHG TVLTALG ILKKKG H AEI PLAQSHATKHKIPVKYLEFISE II VL   H
Mouse:  61 DLKKHGCTVLTALGTILKKKGQHAAEIQPLAQSHATKHKIPVKYLEFISEIIIEVLKKRH 120
Human: 121 PGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154
            GDFGADAQGAM KALELFR D A  YKELGFQG
Mouse: 121 SGDFGADAQGAMSKALELFRNDIAAKYKELGFQG 154
[Figure: proteins A, B, C, D, E, F, G, H and P]
We can detect functionally important residues
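One simple way to look for such residues is to score how conserved each column of a family alignment is. The sketch below is a hedged illustration rather than the method used by any particular database: it scores each column by the fraction of sequences carrying the most common residue, counting gaps against the column. The alignment is hypothetical and the function name is made up for the example.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Fraction of sequences sharing the most common residue in each column."""
    if not alignment or len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment must contain rows of equal length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != gap]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Divide by the number of sequences so that gaps lower the score.
        scores.append(top_count / len(column))
    return scores


# Hypothetical four-sequence alignment, not real family data.
toy = ["HGKA", "HGRA", "HAKS", "HG-T"]
print(column_conservation(toy))
```

Columns scoring close to 1.0 across a diverse family are candidates for functionally or structurally important positions.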
We have a window open on evolutionary diversity
Human:   1 MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE 60
           MGLSDGEWQLVLNVWGKVEAD  GHGQEVLI LFK HPETL KFDKFK LKSE  MK SE
Mouse:   1 MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE 60
Human:  61 DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH 120
           DLKKHG TVLTALG ILKKKG H AEI PLAQSHATKHKIPVKYLEFISE II VL   H
Mouse:  61 DLKKHGCTVLTALGTILKKKGQHAAEIQPLAQSHATKHKIPVKYLEFISEIIIEVLKKRH 120
Human: 121 PGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154
            GDFGADAQGAM KALELFR D A  YKELGFQG
Mouse: 121 SGDFGADAQGAMSKALELFRNDIAAKYKELGFQG 154
Example (using homology for protein annotation)
H. influenzae protein (3M71) 1.20 Å Chen et al. Nature 467 (2010) TUM, January 2013 New York Consortium on Membrane Protein Structure (NYCOMPS)
Thomine and Barbier-Brygoo, Nature 467:1058-59 (2010)
Chen et al. Nature 467 (2010)