COMPUTATIONAL ENGINEERING OF BIONANOSTRUCTURES RAM SAMUDRALA ASSOCIATE PROFESSOR UNIVERSITY OF WASHINGTON How can we design peptides and proteins capable of interacting with inorganic substrates with specific selectivity and affinity?
MOTIVATION The functions necessary for life are carried out by proteins. Protein function is mediated by three-dimensional protein structure. A number of semi-accurate computational methodologies have been developed for analysing and modelling the sequences and structures of naturally occurring proteins. We can harness these knowledge- and biophysics-based methodologies to design peptides and proteins capable of interacting with inorganic substrates with specific affinity and selectivity. The goal is to develop generalised computational techniques for constructing peptide- and protein-based molecular building blocks that can be easily assembled into higher-order structures. Applications lie in medicine, nanotechnology, and biological computing.
BACKGROUND: STRUCTURE • T0290 – peptidyl-prolyl isomerase from H. sapiens • T0288 – PRKCA-binding from H. sapiens • 2.2 Å Cα RMSD for 93 residues (25% identity) • 0.5 Å Cα RMSD for 173 residues (60% identity) • T0332 – methyltransferase from H. sapiens • T0364 – hypothetical from P. putida • 2.0 Å Cα RMSD for 159 residues (23% identity) • 5.3 Å Cα RMSD for 153 residues (11% identity) Liu/Hong-Hung/Ngan
BACKGROUND: FUNCTION • Ion binding energy prediction with a correlation of 0.7 • Calcium ions predicted to < 0.05 Å RMSD in 130 cases • Meta-functional signature accuracy • Meta-functional signature for DXS model from M. tuberculosis Wang/Cheng
BACKGROUND: INTERACTION • Transcription factor bound to DNA promoter: regulog model from S. cerevisiae • Prediction of binding energies of HIV protease mutants and inhibitors using docking with dynamics • BtubA/BtubB interolog model from P. dejongeii (35% identity to eukaryotic tubulins) McDermott/Wichadakul/Staley/Horst/Manocheewa/Jenwitheesuk/Bernard
APPLICATION: DRUG DISCOVERY • A computationally predicted broad-spectrum human herpesvirus protease inhibitor is effective in vitro against members of all three classes and is comparable to or better than existing anti-herpes drugs (panels: CMV, HSV, KSHV). • Our protease inhibitor acts synergistically with acyclovir (a nucleoside analogue that inhibits replication) and is less likely than acyclovir to lead to resistant strains (panels: HSV, HSV). Lagunoff/Jenwitheesuk
BACK TO THE RELEVANT QUESTION How can we use this knowledge to design peptides and proteins capable of interacting with inorganic substrates with specific selectivity and affinity?
KNOWLEDGE-BASED DESIGN Proteins that are evolutionarily related generally have similar sequences, structures, and functions. We hypothesised that the same holds for experimentally discovered peptides capable of binding to inorganic substrates. We then compared experimentally discovered peptides with random peptide sequences using standard sequence comparison tools. The random peptide sequences most similar to a particular group of experimentally discovered peptides were predicted to possess the same functional property. Some examples of experimentally discovered peptides (from Mehmet Sarikaya’s group): • Hydroxyapatite binders: MLPHHGA, TTTPNRA, PVAMPHW • Quartz binders: RLNPPSQMDPPF, QTWPPPLWFSTS, LTPHQTTMAHFL Oren/Tamerler/Sarikaya
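The similarity ranking described on this slide can be sketched roughly as follows. This is a minimal illustration, not the group's actual pipeline: the slide only says "standard sequence comparison tools", so the global (Needleman-Wunsch) alignment, the gap penalty, and the `toy_score` function (a stand-in for a real substitution matrix such as PAM 250) are all assumptions, as are the function names.

```python
from typing import Callable, List, Sequence


def toy_score(a: str, b: str) -> int:
    """Placeholder substitution score (assumption): +2 identity, -1 otherwise.
    A real run would look up a substitution matrix such as PAM 250 instead."""
    return 2 if a == b else -1


def global_alignment_score(
    s: str,
    t: str,
    score: Callable[[str, str], int] = toy_score,
    gap: int = -2,
) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            curr.append(
                max(
                    prev[j - 1] + score(s[i - 1], t[j - 1]),  # align two residues
                    prev[j] + gap,                            # gap in t
                    curr[j - 1] + gap,                        # gap in s
                )
            )
        prev = curr
    return prev[-1]


def mean_similarity(candidate: str, binders: Sequence[str]) -> float:
    """Average alignment score of one candidate against a group of known binders."""
    return sum(global_alignment_score(candidate, b) for b in binders) / len(binders)


def rank_candidates(candidates: Sequence[str], binders: Sequence[str]) -> List[str]:
    """Sort candidates from most to least similar to the binder group."""
    return sorted(candidates, key=lambda c: mean_similarity(c, binders), reverse=True)


# Quartz binders quoted on the slide.
QUARTZ_BINDERS = ["RLNPPSQMDPPF", "QTWPPPLWFSTS", "LTPHQTTMAHFL"]
```

Calling `rank_candidates(random_peptides, QUARTZ_BINDERS)` on a pool of random 12-mers would put the most quartz-like sequences first; which candidates count as predicted binders then depends on a score cutoff that the slide does not specify.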
OPTIMISATION OF SCORING MATRICES (QUARTZ) We perturbed the PAM 250 scoring matrix systematically to produce a higher strong-strong self-similarity and lower strong-weak cross-similarity score, and backtested the predictive power of the new QUARTZ I matrix. Oren/Tamerler/Sarikaya
EXPERIMENTAL VERIFICATION (QUARTZ) Three sets of experiments were performed by Mehmet Sarikaya’s group to validate the computationally designed sequences. Oren/Tamerler/Sarikaya
DESIGN OF SECOND GENERATION MATRICES Oren/Tamerler/Sarikaya
KNOWLEDGE-BASED DESIGN (HA)
HA12 (12 aa linear), 49 sequences (16S, 20M, 13W): • SPTKPTPPRSSQ • TSTNYWLYSSES • VPFQFKVTGDPL • AFSQLKGFYSRY • EFYTPTGLPPGR • HTVNRSMDVPGV • NTPAHANADFFD • ASGAKPWTSDLH • IPMTPSYDSHIL • HAPYKSHVWTEQ • AFAYRDNLSMHP • LLADTTHHRPWT • HWGEIPSRLSLP • LDTQFIKPPQKS • SVAALFRHVPGH • NGWWTASPGVPM • WKWLYDLVTPTI • NEYYIHQVHPPT • GEELGNRLARIT • SQPFWMLSRVLA • DLFSVHWPPLKA • ATSHLHVRLPSR • TLVPKNETPLSS • LSAASHLHTSSS • IIPSQQQSLMAP • QIPSYWPRGPGG • SSLHALHPFGAV • QSTTVLHASPTL • KLPYALELSGTV • KFLSLPPPTRSG • VASPERTSPAFP • ESAQLNRTLQLP • IDMSRLESYTLP • NHQGVLSVHGSL • HYLPKNVRTSLQ • TLPSPLALLTVH • LSPLHQLNSSVN • SPSMLTSMWPNT • NLPSPLIPASSP • SLSPTRSLYEAT • NISDTLNRSRWK • QSYSSMLYPSPF • AQSQMMSAQFRP • ELLAPRGSLNTG • TTNSHEFPPGQS • YDEILGAAPSLK • TPGEYLRLATGR • GAQQLNSMHPEH • RPLESRTPLYLP
HA7 (7 aa constrained), 56 sequences (12S, 27M, 17W): • MLPHHGA • TTTPNRA • PVAMPHW • NNNYSRH • PKDAVPA • PSFDNGF • QLIPVSN • LTQSDHP • HSPSNPS • RTNQPQK • DPQYGQH • NSGSRHH • TPPHHQP • HQHNMKI • MHPTHTT • HPATIED • SGQISLL • SGSPVPN • DNTSDMV • SSWQRLR • QNKDFQK • HQESHPP • PHHHHQP • SNYFAEM • QSSHSFL • AINDTNQ • PTTPNEQ • SMKVPSS • SVEERGS • NESFTGA • YPTQTTD • IYEVNTE • SPQTPSR • SDNTVRY • SMIPPYR • VLTPTQS • RPIVHHQ • MWRDSKP • HQTHHPQ • TGLQNSS • LSPKPQL • NPGFAQA • GIGQPQA • MIFLRVV • TAHAMLY • HLPIPSA • MGAGRAA • SIHSRDT • TFHKWPS • STWIPEF • PSSPLQS • HLHQQNT • QLQLLQS • RTTPSYH • TTHQEAP • YPPRSNT
Oren/Tamerler/Sarikaya
BACKTESTING (HA) [Figure: backtesting panels for HA12 (12 aa linear) and HA7 (7 aa constrained), each labelled HA7 I and HA12 I.] Oren/Tamerler/Sarikaya
CASE STUDY: AMELOGENIN • Principal protein involved in enamel formation. • Multifunctional protein: • Mineralization. • Signaling. • Adhesion to process matrix. • Physical protein-protein interactions. • Has never been crystallised (irregular / unstable?). • Most proteins with non-repeating sequences are active in globular form. • Many proteins fold into globular form upon interaction with a substrate / interactor. • We assume both linear and globular forms. • We start with protein structure prediction.
CASE STUDY: AMELOGENIN STRUCTURE Predicted five models (typical for CASP). Annotate structure with experimental and simulation evidence to find best predicted globular structure and infer function.
CASE STUDY: AMELOGENIN FUNCTION Region labels: Signal Region, Exon 4. Sequence, residues 1 to 110: MGTWILFACLLGAAFAMPLPPHPGSPGYINLSYEKSHSQAINTDRTALVLTPLKWYQSMIRQPYPSYGYEPMGGWLHHQIIPVLSQQHPPSHTLQPHHHLPVVPAQQPVA Residues 111 to 210: PQQPMMPVPGHHSMTPTQHHQPNIPPSAQQPFQQPFQPQAIPPQSHQPMQPQSPLHPMQPLAPQPPLPPLFSMQPLSPILPELPLEAWPATDKTKREEVD Horst/Oren/Cheng/Wang
CASE STUDY: AMELOGENIN – WHAT IT DOES Sequence, residues 1 to 110: MGTWILFACLLGAAFAMPLPPHPGSPGYINLSYEKSHSQAINTDRTALVLTPLKWYQSMIRQPYPSYGYEPMGGWLHHQIIPVLSQQHPPSHTLQPHHHLPVVPAQQPVA Residues 111 to 210: PQQPMMPVPGHHSMTPTQHHQPNIPPSAQQPFQQPFQPQAIPPQSHQPMQPQSPLHPMQPLAPQPPLPPLFSMQPLSPILPELPLEAWPATDKTKREEVD Segment list 1: 1. PV 2. HPPSHTLQPHHHLPVV 3. VPGHHSMTPTQH Segment list 2: 1. LFACLLGAAFAMPLP 2. PGYINLSYEKSHSQAINTDRTA 3. LPPLFSMQPLSPILPELPLEAWPAT [Figure: MOUSE AMELOGENIN STRUCTURAL ANALYSIS, Models 1 to 5] Horst/Oren/Cheng/Wang
CASE STUDY: AMELOGENIN – HA BINDING • Sequences derived from amelogenin: • HTLQPHHHLPVV (12) • VPGHHSMTPTQH (12) • LFACLLGAAFAMPLP (15) • HPPSHTLQPHHHLPVV (16) • PGYINLSYEKSHSQAINTDRTA (22) • LPPLFSMQPLSPILPELPLEAWPAT (25) • HPPSHTLQPHHHLPVVPAQQPVAPQQPMMPVPGHHSMTPTQH (42) Oren/Tamerler/Sarikaya
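As a quick consistency check, the derived peptides can be located within the full-length sequence shown on the earlier amelogenin slides. The sketch below only does exact substring matching with 1-based residue numbering; the names `AMELOGENIN`, `DERIVED_PEPTIDES`, and `locate` are illustrative, not from the slides.

```python
from typing import Optional, Tuple

# Full-length sequence as shown on the amelogenin slides (210 residues).
AMELOGENIN = (
    "MGTWILFACLLGAAFAMPLPPHPGSPGYINLSYEKSHSQAINTDRTALVLTPLKWYQSMIRQPYPSYGYE"
    "PMGGWLHHQIIPVLSQQHPPSHTLQPHHHLPVVPAQQPVA"
    "PQQPMMPVPGHHSMTPTQHHQPNIPPSAQQPFQQPFQPQAIPPQSHQPMQPQSPLHPMQPLAPQPPLPPL"
    "FSMQPLSPILPELPLEAWPATDKTKREEVD"
)

# Peptides listed on this slide.
DERIVED_PEPTIDES = [
    "HTLQPHHHLPVV",
    "VPGHHSMTPTQH",
    "LFACLLGAAFAMPLP",
    "HPPSHTLQPHHHLPVV",
    "PGYINLSYEKSHSQAINTDRTA",
    "LPPLFSMQPLSPILPELPLEAWPAT",
    "HPPSHTLQPHHHLPVVPAQQPVAPQQPMMPVPGHHSMTPTQH",
]


def locate(peptide: str, protein: str = AMELOGENIN) -> Optional[Tuple[int, int]]:
    """Return the 1-based (start, end) of the first exact match, or None."""
    index = protein.find(peptide)
    if index < 0:
        return None
    return index + 1, index + len(peptide)


if __name__ == "__main__":
    for pep in DERIVED_PEPTIDES:
        print(f"{pep} ({len(pep)} aa): residues {locate(pep)}")
```

Every peptide on the slide is an exact substring of the sequence, and the lengths in parentheses match the string lengths.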
BIOPHYSICS-BASED DESIGN Characterise the sequences and structures of naturally occurring proteins in terms of their total similarity scores under different scoring matrices. This will produce a database of sequences, with predicted and known structures, that have specific selectivity and affinity for different inorganics. This database can be analysed for atom-atom preferences, torsion angle preferences, and other characteristics to define energy functions and move sets for protein structure simulations. We will combine this with our all-atom energy function capable of handling inorganics and our protein structure simulation software. Design higher-order protein-like scaffolds with specific functionalities: a strong hydroxyapatite binding region, a strong quartz binding region, and an active site.
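To illustrate how atom-atom preferences in such a database could be turned into an energy-like term, here is a minimal log-odds (inverse-Boltzmann-style) sketch. It is an assumption about one common way to build knowledge-based potentials, not the group's all-atom energy function; the function name, the pseudocount, and the atom-type labels in the usage note are hypothetical.

```python
import math
from typing import Dict, Tuple

Pair = Tuple[str, str]


def contact_potential(counts: Dict[Pair, float],
                      pseudocount: float = 1.0) -> Dict[Pair, float]:
    """Log-odds contact energies (in units of kT) from unordered pair counts.

    E(a, b) = -ln( p_obs(a, b) / p_exp(a, b) ), where p_exp assumes the two
    atom types pair up independently according to their overall frequencies.
    """
    types = sorted({t for pair in counts for t in pair})
    pairs = [(a, b) for i, a in enumerate(types) for b in types[i:]]

    # Smooth so that unseen pairs do not produce log(0).
    smoothed = {p: pseudocount for p in pairs}
    for (a, b), n in counts.items():
        key = (a, b) if a <= b else (b, a)
        smoothed[key] += n
    total = sum(smoothed.values())

    # Each contact contributes two "ends" to the marginal type frequencies.
    ends = {t: 0.0 for t in types}
    for (a, b), n in smoothed.items():
        ends[a] += n
        ends[b] += n
    freq = {t: ends[t] / (2 * total) for t in types}

    energies = {}
    for (a, b), n in smoothed.items():
        expected = freq[a] * freq[b] * (1 if a == b else 2)
        energies[(a, b)] = -math.log((n / total) / expected)
    return energies
```

With invented counts such as `{("Ca_surface", "O_carboxyl"): 80, ("C_aliphatic", "Ca_surface"): 10}`, pairs seen more often than expected get negative (favourable) energies and depleted pairs get positive ones. A real potential would typically also bin contacts by distance and use a carefully chosen reference state.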
ACKNOWLEDGEMENTS People: Ersin Emre Oren; Jeremy Horst; Samudrala group; Mehmet Sarikaya and his group; Candan Tamerler-Behar and her group. Support from: National Institutes of Health; National Science Foundation; Kinship Foundation (Searle Scholars Program); Defense University Research Initiative on NanoTechnology; Genetically Engineered Materials Science and Engineering Center; Puget Sound Partners in Global Health (Gates Foundation); UW Technologies Initiative; UW Technology Gap Research Fund; Washington Research Fund.