
Computational Protein Design: A problem in combinatorial optimization

CSE 549 Guest Lecture, September 17, 2009. David Green, Applied Mathematics & Statistics.


Presentation Transcript


  1. Computational Protein Design: A problem in combinatorial optimization. CSE 549 Guest Lecture, September 17, 2009. David Green, Applied Mathematics & Statistics.

  2. What is a protein? • Polymers (chains) of amino acids. • There are 20 different amino acids that can be part of the chain. • Machines of the cell. • It’s proteins that do most of the work involved in life!

  3. Polymers of amino acids. • Amino acids link to form polypeptides. • There is a backbone of constant composition. • There are side chains that vary.

  4. The twenty amino acids. • AA side chains vary from: • Big to small. • Non-polar (all C and H) to polar. • Positive to negative. • Flexible to rigid.

  5. The machinery of life. • Protein sensors (receptors) are responsible for all the senses (sight, smell, taste, touch, hearing). • Enzymes are proteins that catalyze chemical reactions, like the ones that convert food to energy. • Specialized structural proteins make skin elastic, and make the lens of the eye work. • Muscles are primarily composed of proteins that combine structural and enzymatic parts to make a machine.

  6. Why design proteins? • New sensors based on biology. • Proteins have been engineered to detect TNT (explosive) and sarin (nerve gas). • Proteins are used as treatments for many diseases. • Protein engineering has helped improve proteins that are given to cancer patients on radiation or chemotherapy. • Work in the Green lab is ongoing to design proteins for use as anti-HIV prophylactics. • Many nanotechnology applications that haven’t even been considered yet!

  7. Where do proteins come from? • The genome contains instructions for every protein in a cell. • A few HUGE molecules of DNA. • Each gene is the code for one protein. • There are ~30,000 genes in humans. • Genes are expressed through an intermediate molecule, RNA. • Many copies of each protein can be made.

  8. The Central Dogma of Molecular Biology. • Then proteins do the work!

  9. How do proteins work? • Proteins fold into a unique 3-dimensional structure. • The amino acid sequence of a protein dictates its structure. • The function of a protein is controlled by its structure.

  10. Many polymers are long, unstructured chains. • Polyethylene • Is made of long chains of the same monomers. • Adopts a random mesh of inter-weaving strands. • This structure gives us PLASTIC!

  11. DNA has the same structure for every sequence. • The “double-helix” is a great structure for storing and replicating information.

  12. Protein structures are well-defined and diverse! • One chain or many. • Elongated or globular. • Many forms of symmetry (or none).

  13. What does a protein look like? • Cyanovirin – A protein that inhibits the entry of HIV into human cells.

  14. What does a protein look like? • The atoms of a protein form a compact, well-packed cluster.

  15. What does a protein look like? • A protein can be thought of as a nearly solid object.

  16. What does a protein look like? • Simplified cartoons make the structure easier to see.

  17. What does a protein look like? • The path of the backbone of a protein is called its “fold”.

  18. What does a protein look like? • Different types of amino acids are found all along the protein chain.

  19. What does a protein look like? • Each amino acid has a side chain that protrudes from the backbone.

  20. What does a protein look like? • Many proteins bind other molecules, like the sugar molecules here.

  21. What does a protein look like? • Binding interfaces are usually a close fit of two complementary surfaces.

  22. What does a protein look like? • The core of a protein is key in keeping a stable structure.

  23. Many side chains fill the core.

  24. The core is well packed …

  25. … with groups from all along the chain.

  26. Each side chain fits perfectly.

  27. What is a protein? • A protein is a complicated three-dimensional structure, made up by an amazing 3-D jigsaw puzzle of interlocking amino acids. • Amino acids pack together not just geometrically, but with complementary chemical groups as well. • Proteins move too, but we’ll ignore that for now.

  28. How can we design one?!? • Choose a fold (path of the backbone). • Pack the core with the right set of amino acids to achieve the desired fold. • Choose other amino acids to achieve the desired function (such as binding to a target molecule, or getting the right molecular motions).

  29. Structure prediction is a forward problem. • Given a protein sequence, what is the structure that it will adopt (fold to)? • This is a VERY hard problem, and is not yet fully solved. • Prediction is difficult because you are stuck with what nature gives you.

  30. Protein design is an inverse problem. • Given a desired 3-dimensional protein structure, what is a sequence that will fold to that structure? • We have the freedom to add constraints that simplify the problem. • As a result, methods for protein design have had many successes. • Pabo. Nature 301: 200 (1983). • Drexler. PNAS 78: 5275-5278 (1981).

  31. A designed sequence should fold according to design. • ANY sequence that folds to the correct target structure (and carries out the desired function) can be considered a successful design. • There is more than one right answer, unlike in prediction!

  32. Choosing a backbone fold. • The structure dictates the function, and a big part of structure is the fold. • We still don’t really know how to choose the “best” fold. • Instead, we just borrow from nature – redesign a natural protein to do something new.

  33. Zinc finger proteins bind DNA.

  34. A zinc ion holds them together. • The protein will not fold if zinc is not present. • The protein only binds DNA when it is folded. • A group at Caltech set out to design a zinc finger that doesn’t need zinc!

  35. 1997: The first fully automated protein design! • Dahiyat and Mayo. Science 278: 82-87 (1997).

  36. Designing function. • Making a molecule bind is like designing the core – we want to make the interface between the two pieces complementary. • Other functions are a lot trickier … and we don’t have good ways to solve them yet, but we’re on our way.

  37. 2003: A Duke group designs a set of protein sensors. • Looger, Dwyer, Smith and Hellinga. Nature 423: 185-190 (2003).

  38. Protein design is a BIG problem. • The zinc finger is one of the smallest protein domains … about 30 amino acids long. • How many different 30 amino acid polypeptides are there? • Choose from any of 20 amino acids at each position. • Total sequences = $20^{30} \approx 1 \times 10^{39}$ • Mass of earth = $6 \times 10^{27}$ g • Mass of a grain of sand ~ $1 \times 10^{-3}$ g • A billion earths’ worth of sand grains • Enumeration of possible states is beyond impossible; we must take advantage of the need to achieve complementary interactions between amino acids.
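The arithmetic on this slide can be checked in a few lines. This is only an illustrative sketch using the slide's own round numbers (20 amino acids, 30 positions, the quoted masses); the variable names are made up for the example.

```python
# Size of sequence space for a 30-residue protein, using the slide's numbers.
n_amino_acids = 20
length = 30
n_sequences = n_amino_acids ** length  # exact integer, roughly 1e39

# Rough masses quoted on the slide (grams).
earth_mass_g = 6e27
sand_grain_g = 1e-3

# How many Earth-masses of sand would hold one grain per sequence?
grains_per_earth = earth_mass_g / sand_grain_g
earths_of_sand = n_sequences / grains_per_earth

print(f"sequences:      {n_sequences:.2e}")
print(f"earths of sand: {earths_of_sand:.1e}")
```

With these inputs the result is a few hundred million Earths of sand, the same order-of-magnitude picture as the slide's "a billion".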

  39. Many different structures are possible. • An arginine and a glutamate interact.

  40. Many different structures are possible. • An arginine and a glutamate interact.

  41. Many different structures are possible. • An arginine and a glutamate interact.

  42. Many different structures are possible. • An arginine and a glutamate interact in several different conformations.

  43. Really Big!!! • Amino-acid side chains are flexible. • But not every shape (conformation) is equal. • Each amino acid has a set of preferred conformations (rotamers). • 1 to 80 per amino acid. • Instead of choosing from 20 amino acids … we need to choose from ~400 (at least) amino acid rotamers! • Total structures = $400^{30} \approx 1 \times 10^{78}$ • (approx. number of atoms in the universe!!!!!)
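A short sketch of the same count once rotamers are included. The helper `count_states` is a hypothetical name; it just multiplies the number of choices at each position, so it also handles the case where positions allow different numbers of rotamers.

```python
import math

def count_states(choices_per_position):
    """Number of distinct assignments when position i has choices_per_position[i] options."""
    return math.prod(choices_per_position)

# Slide's estimate: ~400 amino-acid rotamers at each of 30 positions.
n_structures = count_states([400] * 30)

# log10 is easier to read than the full integer.
print(f"about 10^{math.log10(n_structures):.1f} structures")
```

Restricting even a few positions to a handful of rotamers shrinks the product quickly, which is one reason added design constraints help.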

  44. Packing side chains – a puzzle. • How do you solve a jigsaw puzzle? • Impossible to try all combinations of piece placement. • Unique ways of placing N pieces on a grid is $4^{N} \cdot N!$ • For N=100, $(1.6 \times 10^{60})(9.3 \times 10^{157}) = 1.5 \times 10^{218}$ • Trying each piece one by one is better, but still infeasible. • Number of iterative tries for an N-piece puzzle is: [formula missing] • For N=100, $1.37 \times 10^{6}$
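The brute-force count for the jigsaw analogy can be reproduced directly: 4 orientations for each of N pieces, times N! orderings of the pieces on the grid. This is a sketch of that one formula only; `placements` is a hypothetical helper name.

```python
import math

def placements(n_pieces):
    """Brute-force placements: 4 orientations per piece times n! orderings."""
    return 4 ** n_pieces * math.factorial(n_pieces)

n = 100
total = placements(n)

# Report the order of magnitude; the exact integer has 219 digits.
print(f"4^{n} is about 10^{math.log10(4 ** n):.1f}")
print(f"{n}! is about 10^{math.log10(math.factorial(n)):.1f}")
print(f"product is about 10^{math.log10(total):.1f}")
```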

  45. Packing side chains – be smart. • How do you solve a jigsaw puzzle? • Group pieces by colors and patterns. • Iterate over matching of pieces that are complementary • Shape is important. • The pattern must also match.

  46. Pattern matching in proteins? • What does it mean for two amino acids in the core of a protein to “match”? • Must fit close together (but not too close) → Steric complementarity. • Neighboring atoms must have complementary charges (neutral likes neutral, positive likes negative) → Electrostatic complementarity.

  47. Steric fit: Lennard-Jones potential. • Van der Waals attraction between atoms at moderate distances. • Repulsion of atoms from one another at short distances. • If atoms are not nearby, the energy between them will be very close to zero. • The total score of the “goodness” of fit in a molecule is the sum of the energy for every pair of atoms.
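A minimal sketch of the standard 12-6 Lennard-Jones form, which has the three behaviors listed on this slide. The slide does not give a formula or parameters, so the functional form chosen here and the default `epsilon` (well depth) and `sigma` (zero-crossing distance) are assumptions in arbitrary units.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones energy for two atoms separated by distance r (r > 0).

    epsilon and sigma are illustrative defaults, not values from the lecture.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

r_min = 2 ** (1 / 6)  # distance of the energy minimum when sigma = 1

print(lennard_jones(0.9))    # short distance: positive (repulsive)
print(lennard_jones(r_min))  # moderate distance: most favorable, equals -epsilon
print(lennard_jones(10.0))   # far apart: very close to zero
```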

  48. Electrostatic fit: Coulomb’s Law • Atoms in molecules can be thought of as having tiny charges on them, even if the total charge on a molecule is zero. • Coulomb’s Law describes the energy of how two charges interact. • The overall electrostatic fit is calculated by adding up the energy of all pairs of atoms. • Like charges give a positive value. • Opposite charges give a negative. • Neutral (zero charge) groups don’t matter.
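Coulomb's law for a pair of partial charges can be sketched as below. The constant of about 332 kcal·Å/(mol·e²) is the conversion conventionally used in molecular mechanics when charges are in electron units and distances in Ångströms; the `dielectric` parameter and the example charges are assumptions for illustration.

```python
COULOMB_CONSTANT = 332.06  # approx. kcal*Angstrom/(mol*e^2); conventional value, assumed here

def coulomb(q1, q2, r, dielectric=1.0):
    """Coulomb energy (kcal/mol) of charges q1, q2 (in e) at distance r (in Angstroms)."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)

print(coulomb(+0.5, +0.5, 3.0))  # like charges: positive (unfavorable)
print(coulomb(+0.5, -0.5, 3.0))  # opposite charges: negative (favorable)
print(coulomb(0.0, -0.5, 3.0))   # a neutral atom contributes nothing
```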

  49. The total energy describes the fitness of a structure. • Van der Waals + Coulomb’s Law, for every pair of atoms, and all added up. • Negative energies are favorable, positive energies unfavorable. • Nature works to MINIMIZE energy.
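Summing both terms over every pair of atoms might look like the sketch below. It assumes one shared set of Lennard-Jones parameters for all atoms and a made-up two-atom example; real force fields assign parameters per atom type, which the slide does not cover.

```python
import itertools
import math

COULOMB_CONSTANT = 332.06  # approx. kcal*Angstrom/(mol*e^2); assumed conventional value

def lennard_jones(r, epsilon=0.1, sigma=3.4):
    # Illustrative parameters shared by all atoms (an assumption).
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(q1, q2, r):
    return COULOMB_CONSTANT * q1 * q2 / r

def total_energy(atoms):
    """atoms: list of ((x, y, z), charge). Sums LJ + Coulomb over every unique pair."""
    energy = 0.0
    for (p1, q1), (p2, q2) in itertools.combinations(atoms, 2):
        r = math.dist(p1, p2)
        energy += lennard_jones(r) + coulomb(q1, q2, r)
    return energy

# Hypothetical two-atom examples, 4 Angstroms apart.
opposite = [((0.0, 0.0, 0.0), +0.5), ((4.0, 0.0, 0.0), -0.5)]
like = [((0.0, 0.0, 0.0), +0.5), ((4.0, 0.0, 0.0), +0.5)]

print(total_energy(opposite))  # negative: favorable
print(total_energy(like))      # positive: unfavorable
```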

  50. Protein Design as a Discrete Conformational Search. • [Figure: conformational states of the system, with one choice at each of Position 1, Position 2, Position 3.]
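Treating design as a discrete search means picking one rotamer per position so that the total energy is lowest. The sketch below does this by exhaustive enumeration on a toy system of three positions with two rotamers each. The energy tables are invented for illustration, and brute force is used only because the example is tiny; it is not presented as the lecture's algorithm.

```python
import itertools

def best_assignment(self_energy, pair_energy):
    """Exhaustively search all rotamer assignments; return (energy, state).

    self_energy[i][a]         : energy of rotamer a at position i
    pair_energy[(i, j)][a][b] : interaction of rotamer a at i with rotamer b at j (i < j)
    """
    n = len(self_energy)
    choices = [range(len(options)) for options in self_energy]
    best = None
    for state in itertools.product(*choices):
        energy = sum(self_energy[i][state[i]] for i in range(n))
        for i, j in itertools.combinations(range(n), 2):
            energy += pair_energy[(i, j)][state[i]][state[j]]
        if best is None or energy < best[0]:
            best = (energy, state)
    return best

# Hypothetical toy energies: 3 positions, 2 rotamers each (8 states in total).
self_energy = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.2]]
pair_energy = {
    (0, 1): [[2.0, -1.0], [0.0, 0.0]],
    (0, 2): [[0.0, 0.0], [-0.5, 0.0]],
    (1, 2): [[0.0, 0.0], [0.0, -1.0]],
}

energy, state = best_assignment(self_energy, pair_energy)
print(state, round(energy, 3))
```

In this toy case, choosing the lowest self-energy rotamer at each position independently gives state (0, 1, 0), which is not the global minimum: the pairwise terms matter, so positions cannot be optimized one at a time.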
