Challenges and accomplishments in molecular prediction. Yanay Ofran. accumulation of data not knowledge. >70 million (as of 4-2007). Off chart since 1997. DNA. RNA. Protein. Structure. Function. Central dogma – it’s all in the sequence. - Structure - Function - Interaction. PDB.
Challenges and accomplishments in molecular prediction Yanay Ofran
Accumulation of data, not knowledge: >70 million (as of 4-2007); off chart since 1997
Central dogma – it’s all in the sequence: DNA → RNA → Protein → Structure → Function
Annotation transfer: structure. Figure labels: PDB, similar structure, dissimilar structure. Rost (1999) Protein Engineering 12: 85-94
Structure prediction by homology. Figure labels: 35%, 150aa; P1, P2

>P2
METLILTQEEVESLISMDEAMNAVEEAFRLYALGKAQMPPKV
YLEFEKGDLRAMPAHLMGYAGLKWVNSHPGNPDKGLPTVMAL
MILNSPETGFPLAVMDATYTTSLRTGAAGGIAAKYL
>P1
MEDLVSVGITHKEAEVEELEKARFESDEAVRDIVESFGLSGS
VLLQTSNRVEVYASGARDRAEELGDLIHDDAWVKRGSEAVRH
LFRVASGLESMMVGEQEILRQVKKAYDRAARLGTLDEALKIV
FRRAINLGKRAREETRISEGAVSI

Score = 83.2 bits (205), Expect = 9e-17
Identities = 18/101 (X%), Positives = 36/101 (35%), Gaps = 2/101 (1%)

Query: 111 AAGGIAAKYLARKNSSVFGFIGCGTQAYFQLEALRRVFDIGEVKAYDVREKAAKKF 170
           AA +A + L + +G G ++L + V + + A +
Sbjct: 153 AAVELAERELGSLHDKTVLVVGAGEMGKTVAKSLVD-RGVRAVLVANRTYERAVEL 211

Query: 171 EDRGISASVQPAEEASRCDVLVTTTPSRKPVVKAEWVEEGT 211
           + + +R DV+V+ T + PV+ + V E
Sbjct: 212 GGEAVRFDE-LVDHLARSDVVVSATAAPHPVIHVDDVREAL 251
Structure prediction by homology. Figure labels: 40%, 50aa; P2, P1

>P2
MLELLPTAVEGVSQAQITGRPEWIWLALGTALMGLGTLYFLV
KGMGVSDPDAKKFYAITTLVPAIAFTMYLSMLLGYGLTMVPF
GGEQNPIYWARYADWLFTTPLLLLDLALLVDADQGTILALVG
ADGIMIGTGLVGALTK
>P1
MEDLVSVGITHKEAEVEELEKARFESDEAVRDIVESFGLSGS
VLLQTSNRVEVYASGARDRAEELGDLIHDDAWVKRGSEAVRH
LFRVASGLESMMVGEQEILRQVKKAYDRAARLGTLDEALKIV
FRRAINLGKRAREETRISEGAVSI

Score = 33.9 bits (77), Expect = 0.068
Identities = 14/58 (y%), Positives = 28/58 (48%), Gaps = 2/58 (3%)

Query: 178 SVQPAEEASRCDVLVTTTPSRKPVVKAEWVEEGTHINAIGADGPGKQELD-VEILKKA 234
           + EE ++ D+LV T + +VK EW++ G + G + ++ E ++A
Sbjct: 198 TAHLDEEVNKGDILVVATGQPE-MVKGEWIKPGAIVIDCGINYKVVGDVAYDEAKERA 254
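The two alignment summaries above leave the identity percentages as placeholders (X% and y%). As a minimal sketch, the percentages can be recomputed from the raw counts. The helper name `blast_percent` is hypothetical, and the use of integer truncation is an assumption: it is chosen only because it reproduces the other percentages printed on the slides (36/101 as 35%, 2/101 as 1%, 28/58 as 48%, 2/58 as 3%).

```python
def blast_percent(count: int, aligned_length: int) -> int:
    """Return count/aligned_length as a whole-number percentage, truncated.

    Truncation (not rounding) is an assumption that matches the
    percentages shown on the slides, e.g. 36/101 -> 35%.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return count * 100 // aligned_length


# Counts copied from the two alignment summaries on the slides.
alignments = {
    "first alignment (101 aligned positions)": {
        "identities": 18, "positives": 36, "gaps": 2, "length": 101,
    },
    "second alignment (58 aligned positions)": {
        "identities": 14, "positives": 28, "gaps": 2, "length": 58,
    },
}

for name, a in alignments.items():
    summary = ", ".join(
        f"{key} = {a[key]}/{a['length']} ({blast_percent(a[key], a['length'])}%)"
        for key in ("identities", "positives", "gaps")
    )
    print(f"{name}: {summary}")
```

Under this assumption, both alignments fall well below the 35% and 40% figures shown in the slide labels, which is what makes the pair a useful test of homology-based transfer.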
Structure prediction from sequence Liu & Rost (2002) Bioinformatics 18: 922-933
Annotation Transfer
Annotation transfer: function. Figure axes: E-value, % sequence identity, HSSP value. Rost et al. (2003) CMLS 60:2637-2650
Annotation transfer: interaction. Protein A and protein B bind each other. Do A’ and B’, their respective homologues, interact as well? Mika et al. (2006) PLoS Comput Biol
Limit of annotation transfer. Figure labels: sequence identity scale from 100% (blind annotation transfer) to 0% (ab initio); structure, function, interaction
Template and model
Limit of annotation transfer
Some methods can do it
Local vs. non-local interaction
Levinthal “Paradox”:
• A protein with 100 amino acids has ~10^48 possible conformations => calculation unfeasible.
• Let’s assume (generously) that a protein can sample 10^14 structures per second. It would take this protein about 10^34 seconds ~ 10^26 years to try out all the possible conformations. (Time since the big bang ~10^10 years.)
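The arithmetic behind the Levinthal estimate can be checked with a short sketch. The conformation count and sampling rate are the slide's own figures; the function name and the 365.25-day year are assumptions made for illustration.

```python
# Figures taken from the slide.
CONFORMATIONS = 10**48          # possible conformations of a 100-residue protein
SAMPLING_RATE = 10**14          # structures sampled per second (generous)
AGE_OF_UNIVERSE_YEARS = 10**10  # approximate time since the big bang

# Assumption: a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_time(conformations: int, rate_per_second: int):
    """Return (seconds, years) needed to visit every conformation once."""
    seconds = conformations // rate_per_second
    years = seconds / SECONDS_PER_YEAR
    return seconds, years


seconds, years = exhaustive_search_time(CONFORMATIONS, SAMPLING_RATE)
print(f"seconds needed: {seconds:.1e}")
print(f"years needed:   {years:.1e}")
print(f"ages of the universe: {years / AGE_OF_UNIVERSE_YEARS:.1e}")
```

The result is exactly 10^34 seconds, which is a few times 10^26 years, so the slide's order-of-magnitude figures hold.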
Low RMSD. Wistow & Piatigorsky (1999) Science
High RMSD. Chymotrypsin (5cha), Subtilisin (5sic)
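The two slides above compare structures by RMSD (root-mean-square deviation), the square root of the mean squared distance between corresponding atoms. The following is a minimal sketch, assuming the two coordinate sets are already superposed and paired atom by atom. It does not perform the optimal superposition that structure-comparison tools normally apply first, and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superposed and that the i-th
    atom in coords_a corresponds to the i-th atom in coords_b.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates (in angstroms), not taken from any PDB entry.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]

print(rmsd(reference, reference))
print(rmsd(reference, shifted))
```

A low value means the paired atoms nearly coincide; a high value means the backbones diverge, as in the chymotrypsin and subtilisin pair.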
Challenges for next CASP
• Modeling the structure of single-residue mutants.
• Modeling structure changes associated with specificity changes within protein families.
• Devising scoring functions that will reliably pick the most accurate models from a set of candidate structures produced by current new fold methods.
Olympic games of predictions
• Structure – CASP
• Interaction – CAPRI
• Function – AFP, CASP
Combine interaction + seq. analysis to predict function Li et al (2005) Nature biotech
Predicting DNA binding sites Ofran et al (2007) in press
c-Myb + C/EBPβ bound to DNA