
Bioinformatics how to …

Learn how to use free tools to predict protein structures by comparative modeling, covering the underlying assumptions, template recognition, alignment, and model quality evaluation. Explore the tools, expectations, challenges, and a practical hands-on activity.

taylornancy

Presentation Transcript


  1. Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling

  2. Proteins are 3D objects with complex shapes • Over 60,000 protein structures have been determined, mostly by X-ray crystallography (PDB) • 3D structure of ~70% of bacterial and 50% of human proteins can be predicted (comparative modeling)

  3. A predicted model simply illustrates our assumptions • No assumptions, this is nature telling us how it is • Sequence: GNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPA QNTAHLDQFERIKTLGTGSFGRVMLVKHKETGNH FAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPF LVKLEYSFKDNSNLYMVMEYVPGGEMFSHLRRIG RFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPE NLLIDQQGYIQVTDFGFAKRVKGRTWTLCGTPEY LAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPF FADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNL LQVDLTKRFGNLKDGVNDIKNHKWFATTDWIAIY QRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSIN EKCGKEFSEF • Assumption (protein A is similar to protein B) • Result (protein A is similar to protein B)

  4. How do we know that these proteins are similar? • Unknown protein: GLLTTKFVSLLQEAKDGVLDLKLAADTLAVRQKRRIYDITNVLEGIGLIEKKSKNSIQW • Well studied protein: SRRSASHPTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAA • Similarity prediction

  5. How can we make such assumptions? • Statistical reliability of the prediction • E-value: the number of hits one can "expect" to see just by chance when searching a database of a particular size (the closer to zero, the better) • Z-score: the score expressed as its distance from the mean, measured in standard deviations (the bigger, the better)
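
The two statistics above can be illustrated with a minimal Python sketch. It assumes a hypothetical list of background scores (for example, scores against shuffled sequences) and a per-sequence chance probability; the names `z_score` and `expected_hits` and all numbers are made up for illustration and are not how any particular search tool computes its values.

```python
import statistics


def z_score(score, background_scores):
    """Distance of `score` from the background mean, in standard deviations."""
    mean = statistics.mean(background_scores)
    stdev = statistics.stdev(background_scores)  # sample standard deviation
    return (score - mean) / stdev


def expected_hits(p_value_per_sequence, database_size):
    """Expected number of chance hits: per-sequence probability times database size."""
    return p_value_per_sequence * database_size


# Made-up background scores and a made-up hit score.
background = [10.0, 12.0, 14.0, 16.0, 18.0]
z = z_score(30.0, background)

# Made-up per-sequence probability and database size.
e = expected_hits(1e-6, 60000)

print(f"Z-score: {z:.2f}, expected chance hits: {e:.2f}")
```

The same per-sequence probability gives a larger expected count in a larger database, which is why the E-value is always tied to "a database of a particular size".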

  6. Similar, but not homologous • Phosphoribosyltransferase and viral coat protein, identity: 42%, different folds, different functions • . . . . . 99 IRLKSYCNDQSTGDIKVIGGDDLSTLTGKNVLIVEDIIDTGKTMQTLLSLVRQY.NPKMVKVASLLVKRTPRSVGY 173 : ||. ||| || |. || | : | | | | || | || |:| | ||.| | 214 VPLKTDANDQ.IGDSLY....SAMTVDDFGVLAVRVVNDHNPTKVT..SKVRIYMKPKHVRV...WCPRPPRAVPY 279

  7. Different, but homologous • Histone H5 and transcription factor E2F4, identity 7%, similar fold, similar function (DNA binding) • PTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAAGVLKQTKGVGASGSFRL | | | | | • GLLTTKFVSLLQEAKD-GVLDLKLAADTLA------VRQKRRIYDITNVLEGIGLIEKKS----KNSIQW
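
The identity percentages quoted on these slides can be computed from an aligned pair of sequences. The sketch below is one possible definition, assuming gaps are written as `-` and that identity is counted over aligned (non-gap) columns only; other tools use other denominators (for example, the length of the shorter sequence), so the value may not match a given program's report. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned_columns = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        aligned_columns += 1
        if a == b:
            identical += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * identical / aligned_columns


# Toy alignment: 5 aligned columns, 4 of them identical.
identity = percent_identity("ACDEFG", "AC-EYG")
print(identity)
```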

  8. Steps in comparative modeling • Recognition: Are there any well characterized proteins similar to my protein? • Alignment: What is the position-by-position target/template equivalence? • Modeling: What is the detailed 3D structure of my protein? • Model analysis: Is my model any good?

  9. Recognition • BLAST, PSI-BLAST or PFAM, FFAS, metaserver (bioinfo) • Name (PDB code) of the template • Statistical significance of the match (Z-score, E-value, p-value, points)

  10. Alignment • The same tools as in recognition (perhaps with different parameters), editing by hand • Position by position equivalence table

  11. Modeling • Commercial programs: Accelrys (Insight), Tripos (Sybyl) … • Freeware/shareware/servers: Modeller (Andrej Sali), Jackal (Barry Honig), SCWRL (Roland Dunbrack), SwissModel

  12. Model quality • Empirical energy based tools • PSQS (http://www1.jcsg.org/psqs/psqs.cgi) • SwissPDB viewer • Geometric quality • Procheck, SFCHECK, etc. (http://www.jcsg.org/scripts/prod/validation/sv3.cgi)

  13. Expectations of comparative modeling (by percent sequence identity, 100 down to 0) • Easy: 100-40% sequence identity; strong sequence similarity, strong structure similarity, obvious function analogy • Difficult: 40-25%; twilight zone sequence similarity, increasing structure divergence, function diversification • Fold prediction: below 25% sequence identity; no apparent sequence similarity, extreme function divergence
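
As a rough illustration, the three bands above can be written as a small lookup. This is a sketch only: the function name is hypothetical, and assigning the boundary values (exactly 40% and 25%) to the upper band is an assumption, since the slide does not say. Identity alone is a rule of thumb; slide 6 shows a 42% identity pair with different folds.

```python
def modeling_regime(percent_identity):
    """Map target/template sequence identity (0-100) to the slide's difficulty bands."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 40:  # assumption: 40% itself counts as "easy"
        return "easy"
    if percent_identity >= 25:  # assumption: 25% itself counts as "difficult"
        return "difficult (twilight zone)"
    return "fold prediction"


for identity in (85, 30, 7):
    print(identity, modeling_regime(identity))
```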

  14. Challenges of comparative modeling (sequence identity axis: 100, 80, 60, 40, 20) • Columns: Recognition | Alignment | Modeling | Challenges • Trivial | Trivial | Simple | Loop modeling • Trivial | Easy | Simple | Loop modeling • Simple | Challenging | Challenging | Alignment, backbone shifts • Difficult | Very difficult | Significant errors | Alignment, backbone shifts • Often impossible | Significant errors | Often impossible | Recognition

  15. Hands-on Activity • For a hands-on "bioinformatics how to" activity, go to http://bioinformatics.burnham.org/ and click the Structure Biology Course "Protein Modeling Tutorial" link on the homepage • Or go directly to http://bioinformatics.burnham.org/SSBC/modeling.html

  16. Models and Simulation • Computational Biology

  17. Chapter Goals • Models and Simulation • Complex Systems • Continuous and Discrete simulation • Object-oriented design and building models • Queuing systems • Weather and Seismic models

  18. Chapter Goals • Computational Biology • Bioinformatics • Computational Biomodeling • Protein Modeling • Molecular Modeling

  19. Chapter Goals • Computer Graphics • The CREATION of complex images • CAD • Fractal and Other Techniques • Light and Rendering • Movement

  20. What is a Model? • An Abstraction of a Real World System • A representation of the objects or quantities within the system (Noun, the Data) • and the rules that govern the interactions between them (Verb, the Code & Algorithms) • Systems that are best suited to being simulated are dynamic, interactive, and complicated

  21. What Is Simulation? • Simulation is RUNNING a model to PREDICT the results of experimental CHANGES in the system • Doing “What If” analysis: “What happens if I change this?” “What happens if I don’t?”

  22. Kinds of Models

  23. Kinds of Models • There are 2 Big Slices: • Discrete Models • Continuous Models

  24. Kinds of Models • Discrete event simulation • Made up of entities, attributes, and events • Entity The representation of some object in the real system • Attribute Some characteristic of a particular entity • Event An interaction between entities

  25. Air Traffic – A Discrete Model • Air traffic in a country • Planes are objects • Attributes include speed • Events are planes entering and leaving airspace
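
The air traffic example can be sketched as a small discrete event simulation. The code below is a minimal illustration, not a realistic air traffic model: it assumes each plane crosses a straight stretch of airspace of fixed length at constant speed, and the class names, call signs, and numbers are all hypothetical. Planes are the entities, speed is an attribute, and "enter" and "leave" are the events, processed in time order from a priority queue.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Plane:  # entity
    call_sign: str
    speed_kmh: float  # attribute


@dataclass(order=True)
class Event:  # ordered by time only
    time: float
    kind: str = field(compare=False)
    plane: Plane = field(compare=False)


def simulate(planes, entry_times, airspace_km):
    """Process enter/leave events in time order and log the airspace occupancy."""
    queue = []
    for plane, entry_time in zip(planes, entry_times):
        heapq.heappush(queue, Event(entry_time, "enter", plane))
    in_airspace = set()
    log = []
    while queue:
        event = heapq.heappop(queue)  # always the earliest pending event
        if event.kind == "enter":
            in_airspace.add(event.plane.call_sign)
            # Schedule the matching "leave" event from the plane's speed.
            exit_time = event.time + airspace_km / event.plane.speed_kmh
            heapq.heappush(queue, Event(exit_time, "leave", event.plane))
        else:
            in_airspace.discard(event.plane.call_sign)
        log.append((event.time, event.kind, event.plane.call_sign, len(in_airspace)))
    return log


planes = [Plane("A1", 800.0), Plane("B2", 400.0)]
log = simulate(planes, entry_times=[0.0, 0.25], airspace_km=400.0)
for entry in log:
    print(entry)
```

Time jumps from one event to the next; nothing is computed for the intervals in between, which is what distinguishes this from a continuous simulation.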

  26. Kinds of Models • Continuous simulation • Treats time as continuous • Expresses changes in terms of a set of differential equations that reflect the relationships among the quantities in the model • Meteorological models fall into this category
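
A continuous model can be illustrated with a single differential equation stepped forward in small time increments. The sketch below uses Newton's law of cooling, dT/dt = -k (T - T_ambient), integrated with the forward Euler method; the equation, the function name, and the parameter values are chosen purely for illustration and do not come from the slides.

```python
def euler_cooling(temp0, ambient, k, dt, steps):
    """Integrate dT/dt = -k * (T - ambient) with forward Euler steps of size dt."""
    temps = [temp0]
    temp = temp0
    for _ in range(steps):
        rate = -k * (temp - ambient)  # right-hand side of the differential equation
        temp += rate * dt             # advance the state by one small time step
        temps.append(temp)
    return temps


# Hypothetical values: a 90-degree object cooling toward a 20-degree room.
temps = euler_cooling(temp0=90.0, ambient=20.0, k=0.1, dt=0.1, steps=100)
print(f"start {temps[0]:.1f}, end {temps[-1]:.1f}")
```

Shrinking `dt` brings the numerical result closer to the exact solution, at the cost of more steps.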

  27. Hurricanes – A Continuous Model

  28. Continuous Models and FEA • Finite Element Analysis (FEA): Dividing a volume of space into small cubes, which contain our quantities of interest • Many Continuous Models Use FEA

  29. Meteorological Models

  30. Weather – A Continuous Model

  31. Meteorological Models • Models based on the time-dependent partial differential equations of fluid mechanics and thermodynamics • Initial values for the variables are entered from observation, and the equations are solved to define the values of the variables at some later time
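
The "initial values in, later values out" idea can be shown on a far simpler equation than those used in weather models. The sketch below solves the 1D heat (diffusion) equation on a row of cells with an explicit finite-difference scheme; this is a simpler relative of grid-based methods, not finite element analysis proper and not a meteorological model. The function name and all values are hypothetical.

```python
def diffuse(initial, alpha, dt, dx, steps):
    """Advance du/dt = alpha * d2u/dx2 on a 1D grid; the two end cells stay fixed."""
    r = alpha * dt / dx ** 2
    if r > 0.5:
        raise ValueError("time step too large: the explicit scheme would be unstable")
    current = list(initial)
    for _ in range(steps):
        nxt = current[:]
        for i in range(1, len(current) - 1):
            # Each interior cell moves toward the average of its two neighbours.
            nxt[i] = current[i] + r * (current[i - 1] - 2 * current[i] + current[i + 1])
        current = nxt
    return current


# Observed initial state: a single hot cell in the middle of a cold row.
initial = [0.0, 0.0, 100.0, 0.0, 0.0]
after_one_step = diffuse(initial, alpha=1.0, dt=0.25, dx=1.0, steps=1)
print(after_one_step)
```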

  32. Weather – A Continuous Model

  33. Computational Biology

  34. Computational Biology • The application of computer science to problems in biology • (or is it the other way around?) • Encompasses: • bioinformatics • computational biomodeling • molecular modeling • protein structure prediction

  35. Computational Biology • Bioinformatics • Discovering and Processing DNA sequences • Human Genome Project and Others

  36. Computational Biology • Computational Biomodeling • The simulation of biological systems: Knees, Cell Wall Protein, Cell Metabolism

  37. Computational Biology • Protein Structure Modeling • Simulating 3-Dimensional Structure and Function of Protein Molecules

  38. Computational Biology • Molecular Modeling • Simulating Structure and Function of Chemical Molecules (usually drug discovery)

  39. “Cell” Models

  40. Earth, Wind, Fire and Water • Cell-Based Models • Like continuous FEA models • Uses quantities and laws from physics • “How is a hurricane like a glass of water?” • Or a Cloud? • Or Fire? • Or Smoke?
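
A cell-based model updates every cell of a grid from the state of its neighbours, using the same local rule everywhere. The toy sketch below lets "water" cells fall one row per step when the cell underneath is empty. It is a deliberately crude stand-in for the physics-based rules the slides refer to; the symbols, the rule, and the function name are assumptions made for illustration.

```python
WATER = "~"
EMPTY = "."


def step(grid):
    """Apply one update: each water cell drops one row if the cell below is empty."""
    cells = [list(row) for row in grid]
    # Scan from the bottom row upward so each drop moves at most once per step.
    for r in range(len(cells) - 2, -1, -1):
        for c in range(len(cells[r])):
            if cells[r][c] == WATER and cells[r + 1][c] == EMPTY:
                cells[r][c], cells[r + 1][c] = EMPTY, WATER
    return ["".join(row) for row in cells]


glass = ["~~",
         "..",
         "~."]
after_one = step(glass)
after_two = step(after_one)
for row in after_two:
    print(row)
```

Clouds, smoke, and fire can be approached the same way by swapping in different per-cell quantities and update rules.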

  41. Cell Models Figure 14.7 Water pouring into a glass

  42. Cell Models Figure 14.8 Cellular automata-based clouds

  43. Modeling Complex Objects Cell Models What do clouds, smoke and fire have in common? Figure 14.9 A campfire
