
Scientific Benchmarks for Structure Prediction Codes

Jack Snoeyink & Matt O’Meara, Dept. of Computer Science, UNC Chapel Hill. With thanks to collaborators Brian Kuhlman (UNC Biochem), many other members of the RosettaCommons, and the Richardson lab (Duke Biochem). Funding: NIH and NSF.


Presentation Transcript


  1. Scientific Benchmarks for Structure Prediction Codes Jack Snoeyink & Matt O’Meara Dept. Computer Science UNC Chapel Hill

  2. With thanks to: Collaborators • Brian Kuhlman, UNC Biochem • Many other members of the RosettaCommons • Richardson lab, Duke Biochem Funding • NIH • NSF

  3. Key Points…
     • Scientific Models, esp. for Structural Molecular Biology
       • Models are the lens through which we view data
       • Models are predominantly geometric
       • Computational models are complex
       • Models evolve, so testing becomes crucial
     • Focus on statistical/computational models with
       • a sample source, observable local features, chosen functional form, fit parameters, & visualization/testing methods
     • Capture assumptions and data used to build models to:
       • Visualize for making design decisions while building
       • Fit parameters to ensure best performance
       • Record as scientific benchmarks
     Case Study: Rosetta protein structure prediction software [B]

  4. Science views nature thru models

  5. Scientists view nature thru models

  6. People view the world thru models

  7. Geometric molecular models

  8. Model complexity
     • Physical and Conceptual models
       • Kept simple to aid understanding
     • Statistical and Computational models
       • Evolve by combining simple models
       • Even when complex, can still be effective at validation (MolProbity) or prediction (Rosetta)

  9. Model complexity

  10. Model complexity

  11. Computational model life cycle

  12. Computational model life cycle
      Spiral development, much like software:
      • Discover problematic features in some data
      • Create an energy function to adjust them
      • Fit parameters to improve results
      • Check into the software as a new option
      • Make default option if everyone likes it
      • Occasionally refactor and rewrite, removing outdated or unused models
      But less support for testing…

  13. Computational model testing Our goal: Capture data and assumptions from model building for use in model visualization and testing.

  14. Our computational models
      Abstraction: A simple component of a complex computational model consists of the following (examples from [KMB’03] in parentheses):
      • One or more sample sources (PDB files from natives or decoys), giving
      • Observable local features (hydrogen bond distances and angles), having a
      • Chosen functional form (energy from distances and angles) that
      • Depends on fitting parameters (weights for combining terms)
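The abstraction on slide 14 can be written down as a small data structure. The following is a minimal sketch, not Rosetta code: the class name `ModelComponent`, the feature names `AH_dist` and `cos_theta`, the weighted-sum functional form, and every numeric value are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class ModelComponent:
    """One simple component of a larger computational model (hypothetical names)."""
    sample_sources: List[str]        # e.g. labels for native and decoy structure sets
    feature_names: Sequence[str]     # observable local features
    functional_form: Callable[[Dict[str, float], Dict[str, float]], float]
    parameters: Dict[str, float]     # fit parameters, e.g. term weights

    def energy(self, features: Dict[str, float]) -> float:
        # Refuse to score if any declared feature is missing.
        missing = [n for n in self.feature_names if n not in features]
        if missing:
            raise KeyError(f"missing features: {missing}")
        return self.functional_form(features, self.parameters)


def weighted_sum_form(features, params):
    # Illustrative functional form only: weighted squared deviations
    # from placeholder "ideal" values (not taken from any force field).
    d_term = (features["AH_dist"] - params["ideal_dist"]) ** 2
    a_term = (features["cos_theta"] - params["ideal_cos"]) ** 2
    return params["w_dist"] * d_term + params["w_angle"] * a_term


hbond = ModelComponent(
    sample_sources=["native", "decoy"],
    feature_names=["AH_dist", "cos_theta"],
    functional_form=weighted_sum_form,
    parameters={"ideal_dist": 1.9, "ideal_cos": -1.0, "w_dist": 1.0, "w_angle": 0.5},
)
```

Keeping the four parts as separate fields means the same sample sources and features can be reused while the functional form or the fit parameters are swapped out and retested.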

  15. Tool schematic (diagram labels: data sets A, B, …, Z; gather features; SQL query; filter; transform; statistics; ggplot2 spec; plots)
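The stages named in the schematic can be sketched end to end with Python's standard `sqlite3` module. This is an illustration under assumptions: the table `hbond_features`, its columns, the 2.5 cutoff, and the sample values are all made up, and the plotting stage (ggplot2 in the slides) is replaced here by a plain summary statistic.

```python
import sqlite3
import statistics


def gather_features(conn, sample_source, hbonds):
    """Store (distance, cos_theta) pairs for one data set (hypothetical schema)."""
    conn.executemany(
        "INSERT INTO hbond_features (sample_source, ah_dist, cos_theta) VALUES (?, ?, ?)",
        [(sample_source, d, c) for d, c in hbonds],
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hbond_features (sample_source TEXT, ah_dist REAL, cos_theta REAL)"
)

# Gather features from two data sets (made-up values).
gather_features(conn, "native", [(1.8, -0.95), (2.0, -0.90), (2.9, -0.40)])
gather_features(conn, "decoy", [(2.2, -0.80), (2.4, -0.70)])

# SQL query with a filter: keep only short hydrogen bonds.
rows = conn.execute(
    "SELECT sample_source, ah_dist FROM hbond_features "
    "WHERE ah_dist < ? ORDER BY sample_source, ah_dist",
    (2.5,),
).fetchall()

# Transform: group distances by sample source.
by_source = {}
for source, dist in rows:
    by_source.setdefault(source, []).append(dist)

# Statistics: count and mean distance per sample source.
summary = {s: (len(v), statistics.mean(v)) for s, v in by_source.items()}
```

Storing features in a database separates the expensive step (extracting features from structures) from the cheap, repeatable step (querying and summarizing them).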

  16. Visualization
      Implemented tools
      • Compare distributions from sample sources
      • Tufte’s small multiples via ggplot
      • Kernel density estimation
      • Normalization
      Opportunities for
      • Statistical analysis
      • Dimension reduction
      • …
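Kernel density estimation, one of the tools listed on slide 16, can be sketched in a few lines. The version below assumes a Gaussian kernel with a fixed, hand-picked bandwidth; the slides do not say which kernel or bandwidth rule the actual tools use, and the sample values are invented.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Compare two sample sources on a shared grid (made-up distances).
native_density = gaussian_kde([1.8, 1.9, 2.0, 2.1], bandwidth=0.1)
decoy_density = gaussian_kde([2.1, 2.3, 2.4, 2.6], bandwidth=0.1)

grid = [i * 0.01 for i in range(401)]  # 0.00 to 4.00
native_curve = [native_density(x) for x in grid]
decoy_curve = [decoy_density(x) for x in grid]
```

Evaluating both estimates on the same grid is what makes the curves directly comparable, whether they are overlaid in one panel or laid out as small multiples.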

  17. Normalization [KMB’03]: Histogram of H-bond A-H distances in natives
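Slide 17 does not spell out which normalization is applied. A common choice for radial distance histograms, assumed here, is to divide each bin count by r² so that the growth of shell volume with distance does not inflate the counts at larger separations. The function name, bin range, and sample distances below are hypothetical.

```python
def normalized_histogram(distances, r_min, r_max, n_bins):
    """Histogram of distances with each bin divided by r^2, rescaled to unit area."""
    width = (r_max - r_min) / n_bins
    counts = [0] * n_bins
    for r in distances:
        if r_min <= r < r_max:
            counts[min(int((r - r_min) / width), n_bins - 1)] += 1

    centers = [r_min + (i + 0.5) * width for i in range(n_bins)]
    # Assumed normalization: weight each bin by 1 / r^2 at the bin center.
    weighted = [c / (r * r) for c, r in zip(counts, centers)]
    total = sum(weighted) * width
    density = [w / total if total > 0 else 0.0 for w in weighted]
    return centers, density


centers, density = normalized_histogram(
    [1.6, 2.0, 2.0, 2.4], r_min=1.5, r_max=2.5, n_bins=5
)
```

With this weighting, two bins holding the same raw count no longer look equally populated: the bin at the shorter distance gets the higher normalized density.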

  18. Tool uses…
      • Scientific unit tests (sample sources: native, HEAD, ^HEAD): run on a continuous testing server
      • Knowledge-based score term creation (sample sources: native, release, experimental): turn exploration into living benchmarks
      • Test design hypotheses (sample sources: native, protocol, designs): how strange is this geometry?
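A scientific unit test in the sense of slide 18 can be thought of as a check that a feature distribution from the current code (HEAD) has not drifted from a reference (native structures, or the previous revision). The sketch below uses the two-sample Kolmogorov-Smirnov statistic as the drift measure. That choice, the function names, the tolerance, and the sample values are assumptions made for illustration, not the method described in the slides.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare CDFs.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def features_unchanged(reference, candidate, tolerance):
    """Pass when the candidate distribution stays within tolerance of the reference."""
    return ks_statistic(reference, candidate) <= tolerance


# Made-up hydrogen bond distances from two revisions of the code.
previous_revision = [1.8, 1.9, 2.0, 2.1, 2.2]
head = [1.8, 1.9, 2.0, 2.1, 2.3]
passed = features_unchanged(previous_revision, head, tolerance=0.25)
```

Run on every commit, a check like this flags a change in the generated geometry even when the code still compiles and ordinary unit tests still pass.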

  19. Rotamer recovery

  20. Key Points…
      • Scientific Models, esp. for Structural Molecular Biology
        • Models are the lens through which we view data
        • Models are predominantly geometric
        • Computational models are complex
        • Models evolve, so testing becomes crucial
      • Focus on statistical/computational models with
        • a sample source, observable local features, chosen functional form, fit parameters, & visualization/testing methods
      • Capture assumptions and data used to build models to:
        • Visualize for making design decisions while building
        • Fit parameters to ensure best performance
        • Record as scientific benchmarks
      Case Study: Rosetta protein structure prediction software [B]
