
A network-based representation of protein fold space

A network-based representation of protein fold space. Spencer Bliven. Qualifying Examination. 6/6/2011. Overview. Background & Motivation. Preliminary Research. Proposed Future Research. Fold Space. What protein folds are possible? Discrete or Continuous? Both? Neither?


Presentation Transcript


  1. A network-based representation of protein fold space Spencer Bliven Qualifying Examination 6/6/2011

  2. Overview • Background & Motivation • Preliminary Research • Proposed Future Research

  3. Fold Space • What protein folds are possible? • Discrete or Continuous? Both? Neither? • What portion of fold space is utilized by nature? • Long debated questions. Why? • Understanding of structure-function relationship • Protein design/engineering • Protein evolution • Classification

  4. Previous Work • Orengo, Flores, Taylor, Thornton. Protein Eng (1993) vol. 6 (5) pp. 485-500 • Holm and Sander. J Mol Biol (1993) vol. 233 (1) pp. 123-38 • Holm and Sander. Science (1996) vol. 273 (5275) pp. 595-603 • Shindyalov and Bourne. Proteins (2000) vol. 38 (3) pp. 247-60 • Hou, Sims, Zhang, Kim. PNAS (2003) vol. 100 (5) pp. 2386-90 • Taylor. Curr Opin Struct Biol (2007) vol. 17 (3) pp. 354-61 • Sadreyev et al. Curr Opin Struct Biol (2009) vol. 19 (3) pp. 321-8 [Figure labels: α, β, α/β, α+β]

  5. Why can we do better? • More structures • Sampling of globular folds “saturated” • Few novel folds being discovered • Geometric arguments for saturation of small protein folds • Recent all-vs-all computation • Cluster sequences to 40% identity • 17,852 representatives (updated weekly) • 189 million FATCAT rigid-body alignments [Figure: PDB content growth chart, labeled 73503. Source: http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100 Accessed 5/31/2011]

  6. Structural Similarity Graph • Nodes: PDB chains, non-redundant to 40% • Edges: FATCAT-rigid alignments • “Significant” edges: • p < 0.001 • Length > 25 • Coverage > 50 • Hierarchically cluster to reduce complexity in visualization [Figure legend: a, b, a/b, a+b, Multi, Membrane, Small]
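The edge filter on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the talk: the record layout, the function names, and the toy chain identifiers are hypothetical, and coverage is assumed to be a percentage because the slide gives no unit.

```python
# Thresholds as listed on the slide; treating coverage as a percentage is an assumption.
P_MAX = 0.001
MIN_LENGTH = 25
MIN_COVERAGE = 50.0


def is_significant(p_value, length, coverage):
    """Return True if an alignment passes all three slide thresholds."""
    return p_value < P_MAX and length > MIN_LENGTH and coverage > MIN_COVERAGE


def build_graph(alignments):
    """Build an undirected adjacency map from alignment records.

    Each record is a hypothetical tuple:
    (chain_a, chain_b, p_value, alignment_length, coverage).
    Every chain becomes a node; only significant alignments become edges.
    """
    graph = {}
    for chain_a, chain_b, p_value, length, coverage in alignments:
        graph.setdefault(chain_a, set())
        graph.setdefault(chain_b, set())
        if chain_a != chain_b and is_significant(p_value, length, coverage):
            graph[chain_a].add(chain_b)
            graph[chain_b].add(chain_a)
    return graph


# Toy records with made-up values, for illustration only.
toy_alignments = [
    ("A", "B", 1e-5, 120, 80.0),  # passes all thresholds
    ("B", "C", 0.01, 200, 90.0),  # fails the p-value cutoff
    ("A", "C", 1e-4, 20, 95.0),   # fails the length cutoff
]
toy_graph = build_graph(toy_alignments)
```

Keeping every chain as a node, even when it has no significant edge, makes it easy to count singletons later.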

  7. Agreement with SCOP

  8. Continuity • Skolnick claims ≤ 7 intermediates between any two proteins • We observe network diameter = 15 • Can find interesting paths • Reference: Grishin. J Struct Biol (2001) vol. 134 (2-3) pp. 167-85
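To make the two numbers comparable: a shortest path of d edges passes through d - 1 intermediate nodes, so a diameter of 15 corresponds to 14 intermediates, while "≤ 7 intermediates" corresponds to paths of at most 8 edges. The sketch below computes both quantities with breadth-first search on an adjacency map. It is an illustration under assumptions: the slides do not say how the diameter was computed, and this version takes the longest shortest path within any connected component.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(graph):
    """Longest shortest path over all reachable pairs (graph must be non-empty)."""
    return max(max(bfs_distances(graph, node).values()) for node in graph)


def intermediates(graph, a, b):
    """Number of nodes strictly between a and b on a shortest path, or None."""
    d = bfs_distances(graph, a).get(b)
    return None if d is None else max(d - 1, 0)


# Toy path graph a - b - c - d, plus an isolated node e.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}, "e": set()}
toy_diameter = diameter(toy)
toy_intermediates = intermediates(toy, "a", "d")
```

Running one BFS per node costs O(V·(V+E)), which is workable for a graph of tens of thousands of nodes but not cheap.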

  9. Beta Propellers • Symmetry: C4, C5, C6, C7

  10. Symmetry • Functionally important • Protein evolution (e.g. beta-trefoil) • DNA binding • Allosteric regulation • Cooperativity • Widespread (~20% of proteins) • Focus of algorithmic work [Figure labels: FGF-1 (Lee & Blaber. PNAS 2011); TATA Binding Protein (1TGH); Hemoglobin (4HHB)]

  11. Cross-class example • 3GP6.A • PagP, modifies lipid A • f.4.1 (transmembrane beta-barrel) • 1KT6.A • Retinol-binding protein • b.60.1 (Lipocalins)

  12. Summary of Preliminary Research • Calculated all-vs-all alignment • Prlić A, Bliven S, Rose PW, Bluhm WF, Bizon C, Godzik A, Bourne PE. Pre-calculated protein structure alignments at the RCSB PDB website. Bioinformatics (2010) vol. 26 (23) pp. 2983-2985 • Built network of significant alignments • Approximately matches SCOP classifications • Improved structural alignment algorithms • Identify symmetry, circular permutations, topology independent alignments • Discussed more in report

  13. Future Research • Improve the network • Improve all-vs-all comparison algorithm • Tune parameters during graph generation • Annotate the network & draw biological inferences • Annotate nodes with functional information • Compare with other networks • Create new networks • Enhance structural comparison algorithms

  14. 1. Improve all-vs-all comparison algorithm • Need domain decomposition • Use Combinatorial Extension (CE)

  15. 2. Tune parameters during graph generation • Don’t use p-values • Shouldn’t compare p-values, statistically* • Not normalized by secondary structure • Not accurate due to multiple testing problem • Use TM-score • A distance-based measure like RMSD, but normalized by protein length • Determine optimal thresholds for determining “significance” • For instance, train an SVM (* Technically ok here, since one-to-one with the FATCAT score)
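For reference, the TM-score of Zhang and Skolnick sums 1 / (1 + (d_i / d0)^2) over aligned residue pairs and divides by the target length L, with d0 = 1.24 (L - 15)^(1/3) - 1.8. The Python sketch below evaluates that formula for one fixed superposition. It is a simplified illustration: the published score takes the maximum over superpositions, the function names are hypothetical, and short targets are rejected here rather than handled with a clamped d0.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) from the TM-score definition."""
    if target_length < 19:
        # Below this length the formula gives a non-positive d0; this sketch refuses
        # such inputs instead of guessing a floor value.
        raise ValueError("target_length must be at least 19 for this sketch")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the target (normalizing) protein.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Toy inputs: every residue of a 100-residue target aligned with zero deviation,
# and the same target with only half of its residues aligned.
perfect = tm_score([0.0] * 100, 100)
half_covered = tm_score([0.0] * 50, 100)
```

Because the sum is divided by the target length rather than the alignment length, a short but precise alignment cannot reach a high score, which is the property that makes scores comparable across protein pairs.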

  16. FATCAT p-value by Class • Performs poorly on all-alpha proteins in the “twilight zone” • Terrible on membrane proteins • Probably reflects non-structural considerations in SCOP assignment

  17. 3. Annotate nodes with functional information • SCOP/CATH classifications • GO terms • Metal binding • Ligand binding • Symmetry [Figure legend: a, b, a/b, a+b, Multi, Membrane, Small]

  18. 4. Compare with other networks • Define other types of network over the set of protein representatives • Protein-protein interactions • Co-expression • Correlate to the structural similarities [Figure labels: Structural similarity; Protein-protein interaction]
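One simple way to compare two networks defined over the same set of proteins is the Jaccard index of their edge sets. The slide does not name a correlation measure, so this choice is an assumption, and the function names and toy protein labels below are hypothetical.

```python
def edge_set(graph):
    """Undirected edges of an adjacency map, as a set of frozensets (self-loops dropped)."""
    return {
        frozenset((node, neighbor))
        for node, neighbors in graph.items()
        for neighbor in neighbors
        if node != neighbor
    }


def edge_jaccard(graph_a, graph_b):
    """Jaccard index of the two edge sets: |shared edges| / |edges in either graph|."""
    edges_a, edges_b = edge_set(graph_a), edge_set(graph_b)
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Toy networks over the same three proteins, with made-up edges.
structural = {"p1": {"p2", "p3"}, "p2": {"p1"}, "p3": {"p1"}}
interaction = {"p1": {"p2"}, "p2": {"p1", "p3"}, "p3": {"p2"}}
overlap = edge_jaccard(structural, interaction)
```

A raw overlap like this only becomes interpretable against a baseline, for example the same statistic after shuffling node labels in one network.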

  19. 5. Enhance structural comparison algorithms • Improve automated pseudo-symmetry detection • Find topology-independent relationships [Figure label: C3]

  20. Summary • Fold space as network • Improve network creation • Annotate network with functional information • Improve structural similarity detection

  21. Acknowledgments • Bourne Lab: Philip Bourne, Andreas Prlić, Lab & PDB members • Qualifying Exam Committee: Ruben Abagyan, Patricia Jennings, Andy McCammon • Collaborators: Philippe Youkharibache, Jean-Pierre Changeux • Rotation Advisors: Pavel Pevzner, Philip Bourne, José Onuchic & Pat Jennings, Mike MacCoss, Virgil Woods
