Find a Homolog in Protein Structure Database?
YES: Homology Modeling
NO: Secondary Structure Prediction
Homology Modeling from Swiss-Model
Malate dehydrogenase (14MDH) sequence:
SEPIRVLVTGAAGQIAYSLLYSIGNGSVFGKDQPIILVLLDITPMMGVLD
GVLMELQDCALPLLKDVIATDKEEIAFKDLDVAILVGSMPRRDGMERKDL
LKANVKIFKCQGAALDKYAKKSVKVIVVGNPANTNCLTASKSAPSIPKEN
FSCLTRLDHNRAKAQIALKLGVTSDDVKNVIIWGNHSSTQYPDVNHAKVK
LQAKEVGVYEAVKDDSWLKGEFITTVQQRGAAVIKARKLSSAMSAAKAIC
DHVRDIWFGTPEGEFVSMGIISDGNSYGVPDDLLYSFPVTIKDKTWKIVE
GLPINDFSREKMDLTAKELAEEKETAFEFLSSA
Finding Appropriate Template from Structure Database
Sequences Producing High-scoring Segment Pairs:

High Score   Smallest Poisson Probability P(N)   N
1692         2.9e-230                            1
610          1.9e-81                             1
604          1.2e-80                             1
295          1.2e-70                             4
79           0.0014                              1
68           0.0012                              3

11BMD: Muscular Dystrophin, Becker types
Using Magic Fit to Align Two Sequences

14MDH   1  SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD
11BMD   1  KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE

14MDH  51  GVLMELQDCA LPLLKDVIAT DKEEIAFKDL DVAILVGSMP RRDGMERKDL
11BMD  51  GVVMELEDCA FPLLAGLEAT DDPDVAFKDA DYALLVGAAP RKAGMERRDL

14MDH 101  LKANVKIFKC QGAALDKYAK KSVKVIVVGN PANTNCLTAS KSAPSIPKEN
11BMD 101  LQVNGKIFTE QGRALAEVAK KDVKVLVVGN PANTNALIAY KNAPGLNPRN

14MDH 151  FSCLTRLDHN RAKAQIALKL GVTSDDVKNV IIWGNHSSTQ YPDVNHAKVK
11BMD 151  FTAMTRLDHN RAKAQLAKKT GTGVDRIRRM TVWGNHSSTM FPDLFHAEVD

14MDH 201  LQAKEVGVYE AVKDDSWLKG EFITTVQQRG AAVIKARKLS SAMSAAKAIC
11BMD 201  GRP----ALE LVDME-WYEK VFIPTVAQRG AAIIQARGAS SAASAANAAI

14MDH 251  DHVRDIW-FG TPEGEFVSMG IISDGNSYGV PDDLLYSFPV TIKDKTWKIV
11BMD 246  EHIRD-WALG TPEGDWVSMA VPSQGE-YGI PEGIVYSFPV TAKDGAYRVV

14MDH 300  EGLPINDFSR EKMDLTAKEL AEEKETAFEF LSSA
11BMD 294  EGLEINEFAR KRMEITAQEL LDEMEQVKAL GLI

Length = 326, Score = 610 (278.7 bits), Expect = 1.9e-81, P = 1.9e-81, Identities = 178/326 (54.6%)
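The identity figure reported with the alignment is the number of identical columns divided by the aligned length. A minimal Python sketch of that calculation follows, assuming gaps are written as "-" and that a column counts as identical only when both rows carry the same residue. The function name percent_identity is hypothetical, and the input is just the first 50 columns of the alignment, not the full 326-column span.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Return (identical, aligned_length, percent) for two aligned rows.

    Spaces (used on the slide to group residues in tens) are ignored.
    Gap columns count toward the aligned length but never as identities.
    """
    a = row_a.replace(" ", "")
    b = row_b.replace(" ", "")
    if not a or len(a) != len(b):
        raise ValueError("rows must be non-empty and the same aligned length")
    identical = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return identical, len(a), 100.0 * identical / len(a)


# First block of the 14MDH / 11BMD alignment (columns 1-50 only).
query = "SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD"
template = "KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE"

identical, length, pct = percent_identity(query, template)
print(f"{identical}/{length} identical ({pct:.1f}%)")
```

Feeding the function the full concatenated rows would give the identity over the whole alignment; the value may differ slightly from the reported one if the program counts the aligned span differently.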
Modifying Sequence Alignment

14MDH   1  SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD
11BMD   1  KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE

14MDH  51  GVLMELQDCA LPLLKDVIAT DKEEIAFKDL DVAILVGSMP RRDGMERKDL
11BMD  51  GVVMELEDCA FPLLAGLEAT DDPDVAFKDA DYALLVGAAP RKAGMERRDL

14MDH 101  LKANVKIFKC QGAALDKYAK KSVKVIVVGN PANTNCLTAS KSAPSIPKEN
11BMD 101  LQVNGKIFTE QGRALAEVAK KDVKVLVVGN PANTNALIAY KNAPGLNPRN

14MDH 151  FSCLTRLDHN RAKAQIALKL GVTSDDVKNV IIWGNHSSTQ YPDVNHAKVK
11BMD 151  FTAMTRLDHN RAKAQLAKKT GTGVDRIRRM TVWGNHSSTM FPDLFHAEVD

14MDH 201  LQAKEVGVYE AVKDDSWLKG EFITTVQQRG AAVIKARKLS SAMSAAKAIC
11BMD 201  GRP----ALE LVDME-WYEK VFIPTVAQRG AAIIQARGAS SAASAANAAI

14MDH 251  DHVRDIWFGT PEGEFVSMGI ISDGNSYGVP DDLLYSFPVT IKDKTWKIVE
11BMD 246  EHIRDWALGT PEGDWVSMAV PSQGE-YGIP EGIVYSFPVT AKDGAYRVVE

14MDH 301  GLPINDFSRE KMDLTAKELA EEKETAFEFL SSA
11BMD 295  GLEINEFARK RMEITAQELL DEMEQVKALG LI
Obtaining Atomic Coordinates of The Model

ATOM   1  C    ACE A 0  11.590  2.938  35.017  1.00  45.90  14B   5
ATOM   2  O    ACE A 0  12.581  2.371  35.517  1.00  28.75  14B   6
ATOM   3  CH3  ACE A 0  10.179  2.477  35.417  1.00  36.75  14B   7
ATOM   4  N    SER A 1  11.648  3.946  34.081  1.00  49.10  14  341
ATOM   5  CA   SER A 1  12.901  4.557  33.573  1.00  52.42  14  342
ATOM   6  C    SER A 1  12.733  5.624  32.482  1.00  48.48  14  343
ATOM   7  O    SER A 1  13.238  5.432  31.363  1.00  57.03  14  344
ATOM   8  CB   SER A 1  13.990  3.553  33.162  1.00  41.45  14  345
ATOM   9  OG   SER A 1  15.105  3.679  34.039  1.00  42.59  14  346
ATOM  10  N    GLU A 2  12.073  6.774  32.772  1.00  37.72  14  347
ATOM  11  CA   GLU A 2  11.948  7.788  31.721  1.00  20.88  14  348
ATOM  12  C    GLU A 2  12.042  9.235  32.169  1.00  28.31  14  349
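To show how coordinate records like these can be used, here is a minimal Python sketch that reads two of the ATOM records and measures the distance between consecutive alpha carbons. It assumes the records are whitespace-separated as shown on the slide; real PDB files are fixed-column, so a production parser should slice by column instead. The names parse_atom and RECORDS are hypothetical, and the last two fields of each record are ignored.

```python
import math

# Two records copied from the model coordinates above (CA of SER 1 and GLU 2).
RECORDS = """\
ATOM 5 CA SER A 1 12.901 4.557 33.573 1.00 52.42 14 342
ATOM 11 CA GLU A 2 11.948 7.788 31.721 1.00 20.88 14 348
"""


def parse_atom(line):
    """Parse one whitespace-separated ATOM record into a dict, or return None."""
    fields = line.split()
    if len(fields) < 11 or fields[0] != "ATOM":
        return None
    return {
        "serial": int(fields[1]),
        "name": fields[2],
        "res_name": fields[3],
        "chain": fields[4],
        "res_seq": int(fields[5]),
        "xyz": (float(fields[6]), float(fields[7]), float(fields[8])),
        "occupancy": float(fields[9]),
        "b_factor": float(fields[10]),
    }


atoms = [a for a in map(parse_atom, RECORDS.splitlines()) if a is not None]
ca_atoms = [a for a in atoms if a["name"] == "CA"]

# Euclidean distance between the two alpha carbons, in the file's units (angstroms).
ca_distance = math.dist(ca_atoms[0]["xyz"], ca_atoms[1]["xyz"])
print(f"CA(SER 1) to CA(GLU 2): {ca_distance:.2f} A")
```

A distance near 3.8 A between neighbouring alpha carbons is what one expects for a normal trans peptide bond, so this is a quick sanity check on model coordinates.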
The First Model
Figure labels: 14MDH, 11BMD
Refining The Model
Figure labels: 14MDH, 11BMD
Models & Real Structure
Figure labels: First Model, Refined Model, Real 14MDH Structure
Comparison of Backbone Structures
Yellow: real 14MDH structure
Blue: refined model
Green: 11BMD (template)
3-D Structure Docked with Substrate
Figure labels: 11BMD (template), 14MDH
Figure captions: In presence of reduced NAD (NADH); In presence of oxidized NAD (NAD+)
Secondary Structure Prediction from PDB
Secondary Structure Prediction Attributes
• Deduces the most likely position of alpha-helices and beta-strands
• Confirms structural or functional relationships when sequence similarity is weak
• Determines guidelines for rational selection of specific mutants for further laboratory study
Alpha helices have a periodicity of 3.6 residues per turn. In a helix with one face buried in the protein core and the other exposed to solvent, the residues at positions i, i+3, i+4 and i+7 therefore lie on the same face of the helix.
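The i, i+3, i+4, i+7 pattern follows directly from the periodicity: 3.6 residues per turn means each residue is rotated 360/3.6 = 100 degrees around the helix axis relative to the previous one. A minimal Python sketch of that helical-wheel arithmetic is below; the function names are hypothetical.

```python
# 360 degrees per turn / 3.6 residues per turn = 100 degrees per residue.
DEGREES_PER_RESIDUE = 100


def wheel_angle(offset):
    """Angle (0-359 degrees) of residue i+offset around the helix axis, relative to residue i."""
    return (offset * DEGREES_PER_RESIDUE) % 360


def signed_angle(offset):
    """Same angle mapped to the range -180..179 so that 'close to residue i' is easy to read."""
    return (wheel_angle(offset) + 180) % 360 - 180


for k in range(8):
    print(f"i+{k}: {signed_angle(k):+4d} degrees")
```

Offsets 0, 3, 4 and 7 all fall within 60 degrees of residue i, while i+2 and i+5 point to the opposite side of the wheel.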
Beta strands that are half buried in the protein core tend to have hydrophobic residues at positions i, i+2, i+4, i+6, etc., and polar residues at positions i+1, i+3, i+5, etc.
Beta strands that are completely buried usually contain a run of hydrophobic residues, since both faces are buried in the protein core.
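The two strand patterns described above (alternating hydrophobic/polar for half-buried strands, an unbroken hydrophobic run for fully buried ones) can be checked with a few lines of Python. This is only a sketch: the set of residues treated as hydrophobic is an assumption, the function name classify_strand is hypothetical, and the test segments are made-up examples rather than sequences from the slides.

```python
# Assumed hydrophobic set (one-letter codes); other scales would draw the line differently.
HYDROPHOBIC = set("AVLIMFWC")


def classify_strand(segment):
    """Label a candidate strand by its hydrophobic pattern.

    Returns "buried" for an unbroken hydrophobic run, "half-buried" for a
    strict hydrophobic/polar alternation, and "unclear" otherwise.
    """
    if len(segment) < 3:
        return "unclear"  # too short to show a pattern
    flags = [residue in HYDROPHOBIC for residue in segment.upper()]
    if all(flags):
        return "buried"
    if all(flags[i] != flags[i + 1] for i in range(len(flags) - 1)):
        return "half-buried"
    return "unclear"


for seg in ("VSVTIKV", "VIVLFV", "KSVVTE"):
    print(seg, "->", classify_strand(seg))
```

Real strands rarely match either pattern perfectly, so a practical predictor would score the patterns over a sliding window instead of demanding an exact match.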
Other Important Secondary Structures
• Loop regions
  - Often join combinations of α-helices and β-sheets
  - May participate in forming active sites/binding sites
  - Usually found on exterior of proteins (H-bond with solvent, H2O)
  - Rich in charged and polar hydrophilic residues
  - Usually have irregular structure
  - Insertions and deletions are most likely to occur in these regions
• Hairpin
  - Generally 2 to 5 residues long
  - 70% are shorter than 7 residues
  - Type I: residue 2 is always G (glycine)
  - Type II: residue 1 is always G (glycine)
Example for Secondary Structure Prediction
Flavodoxin Chain A (FCA) Sequence:
KIGLFYGTQTGVTQTIAESIQQEFGGESIVDLNDIANADASDLNA
YDYLIIGCPTWNVGELQSDWEGIYDDLDSVNFQGKKVAYFGAG
DQVGYSDNFQDAMGILEEKISSLGSQTVGYWPIEGYDFNESKAV
RNNQFVGLAIDEDNQPDLTKNRIKTWVSQLKSEFGL
FCA Secondary Structure

  1  AKIGLFYGTQ TGVTQTIAES IQQEFGGESI VDLNDIANAD ASDLNAYDYL
     EEEEEE S SSHHHHHHHH HHHHHTTTTT EEEEEGGGTT GGGGGGSEE
 51  IIGCPTWNVG ELQSDWEGIY DDLDSVNFQG KKVAYFGAGD QVGYSDNFQD
     EEEE EETTT EE HHHHHHH GGGGGS TT EEEEEEE TTTTTTTTTH
101  AMGILEEKIS SLGSQTVGYW PIEGYDFNES KAVRNNQFVG LAIDEDNQPD
     HHHHHHHHHH HTT EE E ESTT S TTEETTEESS EEE TTTTHH
151  LTKNRIKTWV SQLKSEFGL
     HHTHHHHHHH HHHH HHTTT

The assignments are:
• Helix
  - H = helix
  - G = 3₁₀ helix
  - I = pi helix
• Beta
  - B = residue in isolated beta bridge
  - E = extended beta strand
• Turns and Bends
  - T = hydrogen bonded turn
  - S = bend
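The one-letter assignments can be collapsed into the three groups named in the legend (helix, beta, turns and bends) and then read off as consecutive segments. A minimal Python sketch follows; the function name segments and the label "other" for unassigned positions are hypothetical, and the example string is made up for illustration, not copied from the FCA assignment.

```python
from itertools import groupby

# Grouping taken from the legend: H, G, I are helices; B, E are beta; T, S are turns/bends.
ASSIGNMENT_CLASS = {
    "H": "helix", "G": "helix", "I": "helix",
    "B": "beta", "E": "beta",
    "T": "turn/bend", "S": "turn/bend",
}


def segments(assignment):
    """Collapse a per-residue assignment string into (class, length) runs.

    Any character not in the legend (for example a blank) is labelled "other".
    """
    classes = [ASSIGNMENT_CLASS.get(code, "other") for code in assignment]
    return [(label, len(list(run))) for label, run in groupby(classes)]


# Hypothetical assignment string: a strand, a turn, then an alpha helix running into a 3-10 helix.
example = "EEEETTHHHGG S"
for label, length in segments(example):
    print(f"{label:9s} {length}")
```

Note that adjacent H and G residues merge into a single helix segment under this grouping, so segment counts can come out lower than counts that treat each helix type separately.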
Diagram of FCA Secondary Structure
Summary: 3 sheets, 11 strands, 8 helices, 20 beta turns, 2 beta hairpins