Alex Bateman, Rob Finn, Jaina Mistry, John Tate, William Mifsud, Matthew Bashton, Nicola Kerrison, David Waterfield, Simon Moxon, Lachlan Coin, Dave Studholme, Corin Yeats, Ewan Birney, Kevin Howe, Nina Mian, Lorenzo Cerrutti, Sam Griffiths-Jones, Mhairi Marshall, Richard Durbin, Sean Eddy, Erik Sonnhammer.
Summary • Introduction to Pfam • Interacting Pfam • Pfam Clans
Pfam contains: • SEED alignment: representative members, manually curated • Profile-HMM (HMMer-2.0) • Search of UniProt • FULL alignment: automatically made
Profile-hidden Markov model
[Figure: profile-HMM state diagram with states B(Di), Mat1 to Mat4, Ins1 to Ins3, Del1 to Del4, and E(Di)]
“What makes HMMs so popular is that the name is so tantalising. Something is hidden and we’re finding it, and we have a Russian name to help us.” David Lipman, Science 273:590 (1996)
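The path from a SEED alignment to a searchable model can be illustrated with a deliberately reduced sketch. It covers only the match states (Mat1 to Mat4 on the slide): each seed column becomes a table of emission probabilities, and a sequence is scored by summed log-odds against a background. Insert and delete states and all transition probabilities are left out, so this is not what HMMer-2.0 computes. The seed alignment, the toy database standing in for UniProt, the uniform background, the pseudocount and the zero-bit threshold are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids
BACKGROUND = 1.0 / len(ALPHABET)   # assumption: uniform background frequencies


def match_emissions(seed, pseudocount=1.0):
    """Return one emission table per alignment column (one per match state)."""
    columns = []
    for column in zip(*seed):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        columns.append(
            {aa: (counts[aa] + pseudocount) / total for aa in ALPHABET}
        )
    return columns


def log_odds_score(emissions, sequence):
    """Bit score of an ungapped, full-length match against the match states."""
    if len(sequence) != len(emissions):
        raise ValueError("sketch handles ungapped, equal-length sequences only")
    return sum(
        math.log2(col[aa] / BACKGROUND) for col, aa in zip(emissions, sequence)
    )


# Hypothetical four-column seed alignment of representative members.
seed = ["ACDE", "ACDE", "ACDQ", "SCDE"]
model = match_emissions(seed)

# Hypothetical toy database standing in for UniProt.
database = {"seq1": "ACDE", "seq2": "ACDQ", "seq3": "WWWW"}
scores = {name: log_odds_score(model, seq) for name, seq in database.items()}

# Sequences above an arbitrary threshold form the automatically made "full" set.
full = sorted(name for name, score in scores.items() if score > 0)
print(full)
```

The split mirrors the slide: the seed is the curated input, and the full set falls out of the search without further manual work.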
Coverage • Retire sometime between Sept 2012 and May 2033!
Pfam classification • Protein • Family • Clan • Protein fold, etc.
Clans increase coverage
Clan             # domains   Increase in domains
EGF              6,834       15%
Cupin            2,059       33%
αβ-hydrolase     5,070       22%
Rossmann         26,000      23%
Ion channel      2,390       23%
Where do Clans come from? • SCOP • Literature • PRC • Overlaps • Almost overlap?
SCOOP Simple COmparison of Output Program

Hit list 1
Sequence Description Score E-value N
Q68FW8.1 BIN3_RAT BRIDGING INTEGRATOR 3. 48.8 6.5e-09 1
Q4SUL0.1 Q4SUL0_TETNG CHROMOSOME UNDETERMINED SCAF1 47.8 1.3e-08 1
Q9U8G7.1 P29_ECHGR HYDATID DISEASE DIAGNOSTIC ANTIG 45.9 4.7e-08 1
Q9NQY0.1 BIN3_HUMAN BRIDGING INTEGRATOR 3. 45.3 7.5e-08 1
Q53HW0.1 Q53HW0_HUMAN BRIDGING INTEGRATOR 3 VARIANT 43.8 2e-07 1
Q4X124.1 Q4X124_ASPFU BAR DOMAIN PROTEIN. 42.3 6e-07 1
Q5PPJ9.1 Q5PPJ9_RAT SH3-DOMAIN GRB2-LIKE ENDOPHILIN 33.6 0.00024 1
Q5B0R1.1 Q5B0R1_EMENI HYPOTHETICAL PROTEIN. 32.1 0.0007 1
Q54NT2.1 Q54NT2_DICDI HYPOTHETICAL PROTEIN. 28.9 0.0016 1
Q54IA6.1 Q54IA6_DICDI HYPOTHETICAL PROTEIN. 24.9 0.0026 1
Q4ILG8.1 Q4ILG8_GIBZE HYPOTHETICAL PROTEIN. 23.4 0.0031 1
Q2H0J0.1 Q2H0J0_CHAGB HYPOTHETICAL PROTEIN. 21.9 0.0038 1
Q7PQ28.2 Q7PQ28_ANOGA ENSANGP00000003419 (FRAGMENT) 18.6 0.0056 1
Q6CDU5.1 Q6CDU5_YARLI YARROWIA LIPOLYTICA CHROMOSOM 17.2 0.0066 1
Q6GLN1.1 Q6GLN1_XENLA LOC443708 PROTEIN (FRAGMENT). 9.0 0.018 1
Q5B6S6.1 Q5B6S6_EMENI HYPOTHETICAL PROTEIN. 7.5 0.021 1
Q6BUW9.1 Q6BUW9_DEBHA SIMILAR TO CA2976|IPF14676 CA 6.0 0.026 1
Q9CZV7.1 Q9CZV7_MOUSE 10 DAYS EMBRYO WHOLE BODY CDN 4.6 0.03 1
Q54MM9.1 Q54MM9_DICDI HYPOTHETICAL PROTEIN. 4.4 0.031 1
Q4WGC9.1 Q4WGC9_ASPFU RHO GUANYL NUCLEOTIDE EXCHANG 4.3 0.031 1
Q3ZTT7.1 Q3ZTT7_HUMAN SH3 DOMAIN BINDING PROTEIN 1. 3.8 0.033 1
Q6ZT62.1 Q6ZT62_HUMAN CDNA FLJ44925 FIS, CLONE BRAM 3.8 0.033 1
Q6CWD9.1 Q6CWD9_KLULA SIMILARITIES WITH CA|CA1531|C 3.4 0.035 1
Q50SW1.1 Q50SW1_ENTHI SH3 DOMAIN PROTEIN. 1.5 0.044 1
Q5KKC1.1 Q5KKC1_CRYNE CYTOPLASM PROTEIN, PUTATIVE. 0.8 0.048 1
Q55VW8.1 Q55VW8_CRYNE HYPOTHETICAL PROTEIN. 0.8 0.048 1
Q5A473.1 Q5A473_CANAL HYPOTHETICAL PROTEIN. 0.0 0.053 1
Q84Y94.1 Q84Y94_BRARP 14-3-3-LIKE PROTEIN (FRAGMENT -1.4 0.058 1
Q50TU0.1 Q50TU0_ENTHI SH3 DOMAIN PROTEIN. -1.9 0.066 1
Q5APM4.1 Q5APM4_CANAL HYPOTHETICAL PROTEIN. -3.1 0.076 1
Q5AQ66.1 Q5AQ66_CANAL HYPOTHETICAL PROTEIN. -3.1 0.076 1
Q8I1D4.1 Q8I1D4_DROVI RHOGAP92B-PA. -3.6 0.081 1

Hit list 2
Sequence Description Score E-value N
Q59EQ2.1 Q59EQ2_HUMAN TYROSINE 3-MONOOXYGENASE/TRYP 47.2 1.9e-08 1
Q8MWE2.1 Q8MWE2_PENMO 14-3-3 ZETA-LIKE TYPE II (FRA 46.0 4.6e-08 1
Q8MWE3.1 Q8MWE3_PENMO 14-3-3 ZETA-LIKE TYPE I (FRAG 45.2 8.1e-08 1
Q5D9G8.1 Q5D9G8_SCHJA HYPOTHETICAL PROTEIN. 38.6 7.5e-06 1
Q207N5.1 Q207N5_ICTPU TYROSINE 3-MONOOXYGENASE/TRYP 38.4 8.5e-06 1
Q2F831.1 Q2F831_HUMAN TYROSINE 3-MONOOXYGENASEA/TRY 35.7 5.5e-05 1
Q7RBZ5.1 Q7RBZ5_PLAYO 14-3-3 PROTEIN (FRAGMENT). 28.5 0.0036 1
Q5TX33.1 Q5TX33_ANOGA ENSANGP00000026241 (FRAGMENT) 22.7 0.0096 1
Q8BWN0.1 Q8BWN0_MOUSE ADULT PANCREAS ISLET CELLS CD 14.0 0.041 1
Q4ABV7.1 Q4ABV7_BRARP 4D11_30. 7.5 0.12 1
Q23GC1.1 Q23GC1_TETTH HYPOTHETICAL PROTEIN. 4.6 0.2 1
Q42058.2 Q42058_ARATH PROTEIN KINASE INHIBITOR (FRA 2.8 0.26 1
Q5NVL5.1 Q5NVL5_PONPY HYPOTHETICAL PROTEIN DKFZP459 -6.0 1.2 1
Q7RA01.1 Q7RA01_PLAYO 14-3-3 PROTEIN 3. -10.3 2.3 1
Q53RR5.1 Q53RR5_HUMAN HYPOTHETICAL PROTEIN YWHAQ (F -12.3 3.3 1
Q4VY19.1 Q4VY19_HUMAN TYROSINE 3-MONOOXYGENASE/TRYP -12.4 3.3 1
Q84Y94.1 Q84Y94_BRARP 14-3-3-LIKE PROTEIN (FRAGMENT -14.4 4.6 1
Q42053.1 Q42053_ARATH PROTEIN KINASE INHIBITOR (FRA -17.7 8.2 1
Q9XEW3.1 Q9XEW3_9ROSI 14-3-3 PROTEIN (FRAGMENT). -20.6 13 1
Q8ILM3.1 Q8ILM3_PLAF7 HYPOTHETICAL PROTEIN. -22.4 18 1
SCOOP Simple COmparison of Output Program
The same sequence appears in both hit lists:
Q84Y94.1 Q84Y94_BRARP 14-3-3-LIKE PROTEIN (FRAGMENT -14.4 4.6 1
Q84Y94.1 Q84Y94_BRARP 14-3-3-LIKE PROTEIN (FRAGMENT -1.4 0.058 1
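The comparison step that the two hit lists illustrate can be sketched as a plain set intersection on sequence accessions. This is not the SCOOP implementation: the slides do not say how SCOOP weighs an overlap, so the sketch stops at finding sequences reported by both searches. The function names `parse_hits` and `shared_hits` are hypothetical, and the input is a three-row excerpt from each list above.

```python
def parse_hits(text):
    """Parse hit lines into {accession: (score, evalue)}.

    Each line ends with three numeric fields: score, E-value and N.
    Everything between the accession and those fields is description.
    """
    hits = {}
    for line in text.strip().splitlines():
        fields = line.split()
        accession = fields[0]
        score, evalue = float(fields[-3]), float(fields[-2])
        hits[accession] = (score, evalue)
    return hits


def shared_hits(hits_a, hits_b):
    """Return sequences present in both hit lists with both (score, evalue) pairs."""
    return {
        acc: (hits_a[acc], hits_b[acc])
        for acc in hits_a.keys() & hits_b.keys()
    }


# Excerpt from the first hit list.
list_one = """
Q5A473.1 Q5A473_CANAL HYPOTHETICAL PROTEIN. 0.0 0.053 1
Q84Y94.1 Q84Y94_BRARP 14-3-3-LIKE PROTEIN (FRAGMENT -1.4 0.058 1
Q50TU0.1 Q50TU0_ENTHI SH3 DOMAIN PROTEIN. -1.9 0.066 1
"""

# Excerpt from the second hit list.
list_two = """
Q4VY19.1 Q4VY19_HUMAN TYROSINE 3-MONOOXYGENASE/TRYP -12.4 3.3 1
Q84Y94.1 Q84Y94_BRARP 14-3-3-LIKE PROTEIN (FRAGMENT -14.4 4.6 1
Q42053.1 Q42053_ARATH PROTEIN KINASE INHIBITOR (FRA -17.7 8.2 1
"""

common = shared_hits(parse_hits(list_one), parse_hits(list_two))
for accession, (first, second) in common.items():
    print(accession, first, second)
```

Neither hit is significant on its own, which is why comparing outputs can surface relationships that a single search would miss.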
Novel relationships • DMA related to CUE • DUF442 related to tyr phosphatase • DUF970 related to P-II • DUF316 related to Trypsin (but inactive) • DUF283 related to dsrm
Domains in RNAi
[Figure: domain architectures of AF1318 and Dicer; domains shown: PAZ, DEAD, RNAse 3 (twice), DUF283, dsrm]
Conclusions - 1 • Pfam helps to keep up with the deluge • iPfam contains detailed protein interactions • Pfam Clans provide a higher level of classification & coverage