
The Side-Chain Positioning Problem

Carl Kingsford, Princeton University. Joint work with Bernard Chazelle and Mona Singh.


Presentation Transcript


  1. The Side-Chain Positioning Problem
     Carl Kingsford, Princeton University
     Joint work with Bernard Chazelle and Mona Singh

  2. Proteins
     • Many functions: structural, messaging, catalytic, …
     • Sequence of amino acids strung together on a backbone
     • Each amino acid has a flexible side-chain
     • Proteins fold. Function depends highly on 3D shape
     (Figure labels: R, V, C, R)

  3. Protein Structure
     (Figure labels: Backbone, Side-chains)

  4. Side-chain Positioning Problem
     • Given:
       • fixed backbone
       • amino acid sequence (IILVPACW…)
     • Find the 3D positions for the side-chains that minimize the energy of the structure
     • Assume lowest energy is best

  5. Side-chain Positioning Applications
     • Homology modeling: use the known backbone of a similar protein to predict a new structure
       Unknown: KNVACKNGQTNCYQSYSTMSITDCRETGSSKYPNCAYKTTQANKHII
       Match (spacing as extracted): NV CKNG NCYS S + ITDCR G+SKYPNC YKT+ KHII
       Known:   ENVTCKNGKKNCYKSTSALHITDCRLKGNSKYPNCDYKTSDYQKHII

  6. Rotamers
     • Each amino acid has some number of statistically preferred side-chain positions
     • These are called rotamers
     • The continuum of positions is well approximated by rotamers
     (Figure: 3 rotamers of Arginine)

  7. An Equivalent Graph Problem
     • For a protein with p side-chains:
     • p-partite graph:
       • part $V_i$ for each side-chain i
       • node u for each rotamer
       • edge {u,v} if u interacts with v
     • Weights:
       • E(u) = self-energy
       • E(u,v) = interaction energy
     • n nodes
     (Figure labels: $V_1$, $V_2$, rotamer, interaction, position)

  8. Feasible Solution
     • Feasible solution: one node from each part
     • cost(feasible) = cost of the induced subgraph
     • Hard to approximate within a factor of cn, where n is the # of nodes
     (Figure labels: $V_1$, $V_2$, rotamer, interaction, position)
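The graph formulation and the cost of a feasible solution can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the node names, the energy values, and the helper names `cost` and `brute_force` are hypothetical and do not come from the talk. Exhaustive search is used only because the instance is tiny; it is not the method the slides propose.

```python
from itertools import product

# Hypothetical toy instance: 3 positions (parts), rotamers named by strings.
positions = [["a1", "a2"], ["b1", "b2"], ["c1"]]

# Self-energies E(u) and interaction energies E(u, v); all values are made up.
self_energy = {"a1": 1.0, "a2": 0.5, "b1": 0.2, "b2": 0.9, "c1": 0.3}
pair_energy = {
    frozenset(("a1", "b1")): 0.4,
    frozenset(("a2", "b1")): 2.0,
    frozenset(("a2", "b2")): -0.5,
    frozenset(("b1", "c1")): 0.1,
}

def cost(choice):
    """Cost of the subgraph induced by one chosen node per position."""
    total = sum(self_energy[u] for u in choice)
    for i in range(len(choice)):
        for j in range(i + 1, len(choice)):
            # Missing edges mean the two rotamers do not interact.
            total += pair_energy.get(frozenset((choice[i], choice[j])), 0.0)
    return total

def brute_force(positions):
    """Enumerate every feasible solution; exponential, for tiny instances only."""
    return min(product(*positions), key=cost)

best = brute_force(positions)
print(best, cost(best))
```

The number of feasible solutions is the product of the part sizes, which is why enumeration stops being practical almost immediately.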

  9. Determining the Energy
     • Energy of a protein conformation is the sum of several energy terms: electrostatics, van der Waals, bond lengths, bond angles, hydrogen bonds, dihedral angles
     • No triangle inequality
     (Figure labels: +, -, A, B, 0)

  10. Plan of Attack
      • Formulate as a quadratic integer program
      • Relax into a semidefinite program
      • Solve the SDP in polynomial time
      • Round solution vectors to a choice of rotamers

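  11. Quadratic Integer Program
      $$\min \; \sum_{u} E(u)\,x_u + \sum_{\{u,v\}} E(u,v)\,x_u x_v$$
      subject to
      $$\sum_{u \in V_j} x_u = 1 \quad \text{for each posn } j$$
      $$\sum_{u \in V_j} x_u x_v = x_v \quad \text{for each posn } j, \text{ node } v$$
      $$x_u \in \{0,1\}$$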

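  12. Relax Into Vector Program
      • Use $x_u = x_u^2$ for $x_u \in \{0,1\}$ to write as a pure quadratic program
      • Variables become n-dimensional vectors ($x_u x_v \to x_u^T x_v$)
      $$\text{minimize} \; \sum_{u} E(u)\,x_u^T x_u + \sum_{\{u,v\}} E(u,v)\,x_u^T x_v$$
      subject to
      $$\sum_{u \in V_j} x_u^T x_u = 1 \quad \text{for each posn } j$$
      $$\sum_{u \in V_j} x_u^T x_v = x_v^T x_v \quad \text{for each node } v, \text{ posn } j$$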

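  13. Rewrite As Semidefinite Program
      • $X = (x_{uv})$ is PSD $\iff$ $x_{uv} = x_u^T x_v$
      $$\text{minimize} \; \sum_{u} E(u)\,x_{uu} + \sum_{\{u,v\}} E(u,v)\,x_{uv}$$
      subject to
      $$\sum_{u \in V_j} x_{uu} = 1 \quad \text{for each posn } j$$
      $$\sum_{u \in V_j} x_{uv} = x_{vv} \quad \text{for each node } v, \text{ posn } j$$
      $$X \succeq 0$$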

  14. Constraints & Dummy Position
      • Dummy position: insert a new position $V_0$ with a single node. No edges, no node cost.
      • Position constraints: the sum of the node variables in each position is 1
      • Flow constraints: the sum of the edge variables adjacent to a node equals that node variable
      (Figure labels: $x_{u0}$, $V_0$, $V_i$, $V_j$, $x_{vv}$, $x_{uv}$)
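As a sanity check on the position and flow constraints, the sketch below builds the matrix of an integral solution and tests both constraint families. It is an illustration under stated assumptions: the node numbering, the choice of node 0 as the dummy position, and the helper names `integral_matrix` and `satisfies_constraints` are hypothetical, and the flow constraint is applied to every node and position pair, which is one reading of "for each node v, posn j".

```python
# Hypothetical layout: node 0 alone forms the dummy position V0.
positions = [[0], [1, 2], [3, 4, 5]]
n = 6

def integral_matrix(choice):
    """X = chi chi^T for the 0/1 characteristic vector chi of `choice`."""
    chi = [1 if u in choice else 0 for u in range(n)]
    return [[chi[u] * chi[v] for v in range(n)] for u in range(n)]

def satisfies_constraints(X, tol=1e-9):
    for part in positions:
        # Position constraint: node variables in each position sum to 1.
        if abs(sum(X[u][u] for u in part) - 1.0) > tol:
            return False
        # Flow constraint: edge variables between v and this position sum to x_vv.
        for v in range(n):
            if abs(sum(X[u][v] for u in part) - X[v][v]) > tol:
                return False
    return True

print(satisfies_constraints(integral_matrix({0, 2, 3})))     # one node per position
print(satisfies_constraints(integral_matrix({0, 1, 2, 3})))  # two nodes in one position
```

Any choice of exactly one node per position passes both checks, so every feasible rotamer assignment is also a feasible point of the relaxation.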

  15. Geometry of the Solution Vectors

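  16. Geometry of Solution Vectors
      Lemma. For each position $j$, $\sum_{u \in V_j} x_u = x_{u0}$.
      Proof. Let $y = \sum_{u \in V_j} x_u$. Simple algebra shows that:
      • Length of $y$ is 1
      • Length of $x_{u0}$ is 1
      • Length of the projection of $y$ onto $x_{u0}$ is 1
      Hence $y = x_{u0}$.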

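  17. Solution Vectors Lie on a Sphere
      • Each solution vector lies on a sphere of radius ½ centered at $x_{u0}/2$, because
        $$\lVert x_u - \tfrac{1}{2} x_{u0} \rVert^2 = \lVert x_u \rVert^2 - x_u^T x_{u0} + \tfrac{1}{4} \lVert x_{u0} \rVert^2 = \tfrac{1}{4}$$
      • Note. The length of the projection of $x_u$ onto $x_{u0}$ is the length of the vector $x_u$ squared.
      (Figure labels: $a$, $a^2$, $x_{u0}$, $x_u$, O)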

  18. How do we round the solution of the SDP relaxation?
      Convert fractional solutions into feasible 0/1 solutions:
      • Projection rounding
      • Perron-Frobenius rounding

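  19. Projection Rounding
      • Since $\sum_{u \in V_j} x_{uu} = 1$, the $x_{uu}$ give a probability distribution at each position.
      • At each position, pick node u with probability $x_{uu}$.
      • $x_{uu}$ = length of the projection of $x_u$ onto $x_{u0}$.
      (Figure labels: X, $x_{u0}$, $x_u$, $x_v$, O)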
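Projection rounding as described on this slide can be sketched in a few lines. The function name `projection_round`, the dictionary `x_diag` holding the diagonal entries $x_{uu}$, and the numeric values are hypothetical; the only thing taken from the slide is the rule "pick node u with probability $x_{uu}$ at each position".

```python
import random

def projection_round(positions, x_diag, rng=random):
    """Pick one node per position, choosing node u with probability x_uu."""
    choice = []
    for part in positions:
        # The x_uu within a position are assumed to sum to 1.
        weights = [x_diag[u] for u in part]
        choice.append(rng.choices(part, weights=weights, k=1)[0])
    return choice

# Made-up fractional diagonal for two positions.
positions = [[0, 1], [2, 3, 4]]
x_diag = {0: 0.7, 1: 0.3, 2: 0.2, 3: 0.5, 4: 0.3}
picked = projection_round(positions, x_diag, random.Random(0))
print(picked)
```

Because exactly one node is drawn per position, the result is always a feasible solution, whatever the fractional values are.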

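  20. Drift for Projection Rounding
      • Drift $\delta$ = expected difference between fractional & rounded solutions.
      • Comes entirely from pairwise interactions.
      • In fact, $\delta_{uv} = E(u,v)\,(x_{uv} - \Pr[uv])$.
      • Because the $x_u$ are on a sphere, and by Cauchy-Schwarz, the drift can be bounded (the bound itself is missing from this transcript).
      (Figure labels: $x_u$, $x_v$, $y_u$, $y_v$)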
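The per-edge drift formula can be evaluated directly once $\Pr[uv]$ is known. The sketch below assumes that nodes at different positions are picked independently, so that $\Pr[uv] = x_{uu} x_{vv}$; that independence, the function name `drift`, and the example matrix are assumptions made here for illustration and are not stated in the transcript.

```python
def drift(positions, pair_energy, X):
    """Sum of E(u,v) * (x_uv - Pr[uv]) over the edges in pair_energy."""
    part_of = {u: j for j, part in enumerate(positions) for u in part}
    total = 0.0
    for (u, v), energy in pair_energy.items():
        if part_of[u] == part_of[v]:
            raise ValueError("edges must join nodes in different positions")
        # Assumed: independent picks per position, so Pr[uv] = x_uu * x_vv.
        total += energy * (X[u][v] - X[u][u] * X[v][v])
    return total

positions = [[0, 1], [2, 3]]
# Made-up fractional solution: half weight on {0, 2} and half on {1, 3}.
X = [
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
]
print(drift(positions, {(0, 2): 1.0}, X))
```

For an integral solution every term vanishes, since $x_{uv} = x_{uu} x_{vv}$ when all entries are 0 or 1.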

  21. = 0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 q=  = 1  = 1  = 1  = 1 q needs to contain probability distributions for each position. How do we choose q? Perron-Frobenius Rounding •   0/1 characteristic n-vector of optimal solutionOptimal integral X*  T  rank(X*) = 1 • Idea: Approximate fractional X by a rank 1 matrix qqTWant to sample from , but settle forq

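  22. Possible Choices for q
      Lemma. Any nonnegative vector $q$ with $L_1$-norm $p$ in the image space of $X$ contains the required set of probability distributions.
      Proof. $X = W^T W$, where $W = [x_1\, x_2\, \dots\, x_n]$. Let $1_i$ be the characteristic vector for position $i$. Suppose $q = Xy$ for some $y$. Then, using $W 1_i = \sum_{u \in V_i} x_u = x_{u0}$,
      $$1_i^T q = 1_i^T W^T W y = (W 1_i)^T W y = x_{u0}^T W y.$$
      The final value is independent of $i$, so every position has the same sum. Since $q$ is nonnegative with $L_1$-norm $p$, each position sums to 1.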

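  23. A Choice for q
      Take $q = p\, z_1 / \lVert z_1 \rVert_1$, where $z_1$ is the eigenvector of $X$ for its largest eigenvalue.
      • By spectral decomposition, $z_1$ is in the image space of $X$.
      • By the Perron-Frobenius theorem for nonnegative matrices, $q \ge 0$.
      • By the Lemma, $q$ contains the needed probability distributions.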
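A minimal sketch of Perron-Frobenius rounding follows, assuming that $q$ is the principal eigenvector of $X$ rescaled to $L_1$-norm $p$ and that nodes are then sampled per position in proportion to $q$. Power iteration is swapped in here as a simple way to approximate that eigenvector; the talk does not say how it was computed. The function names and the example are hypothetical.

```python
import random

def principal_eigenvector(X, iters=500):
    """Power iteration; assumes X is symmetric, entrywise nonnegative, nonzero."""
    n = len(X)
    z = [1.0] * n
    for _ in range(iters):
        w = [sum(X[i][k] * z[k] for k in range(n)) for i in range(n)]
        norm = sum(abs(a) for a in w)
        z = [a / norm for a in w]  # keep the L1-norm equal to 1
    return z

def perron_frobenius_round(positions, X, rng=random):
    """Scale the principal eigenvector to L1-norm p, then sample per position."""
    z = principal_eigenvector(X)
    p = len(positions)
    q = [p * a for a in z]
    choice = [
        rng.choices(part, weights=[q[u] for u in part], k=1)[0]
        for part in positions
    ]
    return q, choice

# Integral example: chi = (0, 1, 1, 0), so X = chi chi^T has rank 1.
positions = [[0, 1], [2, 3]]
chi = [0, 1, 1, 0]
X = [[a * b for b in chi] for a in chi]
q, choice = perron_frobenius_round(positions, X, random.Random(0))
print(q, choice)
```

On a rank-1 integral matrix the scaled eigenvector is the characteristic vector itself, so the rounding returns the original solution.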

  24. Computational Results
      • 30 random graphs
        • 60 nodes, 15 positions
        • edge probability ½
        • weights uniformly from [0,1]
      • Compare solutions from: simple LP; SDP fractional; projection rounded; Perron-Frobenius rounded
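A random test instance with these parameters could be generated roughly as follows. The slide does not say how nodes were divided among positions or whether node weights were also random, so the equal-sized positions, the random self-energies, the seed, and the function name `random_instance` are all assumptions of this sketch.

```python
import random

def random_instance(n_nodes=60, n_positions=15, edge_prob=0.5, seed=0):
    """Random p-partite instance with weights drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    per = n_nodes // n_positions  # assumption: equal-sized positions
    positions = [list(range(j * per, (j + 1) * per)) for j in range(n_positions)]
    # Assumption: node weights are random too, not only edge weights.
    self_energy = {u: rng.random() for u in range(n_nodes)}
    pair_energy = {}
    for j in range(n_positions):
        for k in range(j + 1, n_positions):
            for u in positions[j]:
                for v in positions[k]:
                    # Edges only join nodes in different positions.
                    if rng.random() < edge_prob:
                        pair_energy[(u, v)] = rng.random()
    return positions, self_energy, pair_energy

positions, self_energy, pair_energy = random_instance()
print(len(positions), len(self_energy), len(pair_energy))
```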

  25. Future Work
      • Can the rounding schemes be applied to other problems?
      • Can the semidefinite program be sped up?
        • Can only routinely solve graphs with ≤ 120 nodes (reasonable protein problems contain 1000 to 5000 nodes)
        • $x_{uv} \ge 0$ constraints are the bottleneck
      • Can the requirement of a fixed backbone be relaxed?
      • We've worked quite a bit with real proteins using an LP approach. It seems an SDP formulation might be useful.

  26. More Information
      • B. Chazelle, C. Kingsford, M. Singh, "The Side-Chain Positioning Problem: A Semidefinite Programming Formulation with New Rounding Schemes," Proc. ACM FCRC'2003, Principles of Computing and Knowledge: Paris Kanellakis Memorial Workshop (2003).
      • http://www.cs.princeton.edu/~carlk/papers.html
