Topic 3: Structural Genomics and Models. Contributors: S.K. Burley, A. Fiser, A. Godzik, A. Joachimiak, J. Markley, G. Montelione, C. Orengo, A. Sali, and M. Sauder. Discussion Leader: Stephen K. Burley. Workshop on Biological Macromolecular Structure Models, RCSB PDB • Piscataway, NJ • November 19-20, 2005
Role of Comparative Protein Structure Modeling in Structural Genomics
Protein Structure Initiative 2: Need for Large-Scale Homology Modeling • PSI-2 will yield 3,000-4,000 protein structures, most at coarse granularity • Each structure will represent a large number of sequence homologues • Homology modeling must provide “useful” models for distant (15-30% sequence identity) homologues, to support protein function assignment and evolutionary insights • Models should guide functional characterization • Models must be readily accessible • Models must be subject to rigorous peer review
Issues Addressed in Contributed Slides • Current limitations of homology modeling • Role of homology modeling in target selection/execution • Role of homology modeling in structure determination • Homology modeling pipelines
Current Limitations of Homology Modeling • Input from: Joachimiak (MCSG) and Sali (NYSGXRC)
Issues with Homology Modeling for Structural Genomics • Models for distant (15-30% sequence identity) homologues are of poor quality • For very large families, only a small fraction of sequences (<10%) can be reliably modeled • Modeling must guide target selection for fine coverage of protein families • Domain parsing needs improvement • We should be able to model multi-domain proteins from structures of individual domains • We should be able to model side chains and important structural and functional features that are currently difficult to assign and predict correctly • We need methods to predict unusual features and departures from the template structure used for modeling • Modeling of loops and high B-factor regions needs improvement
Models Based on NYSGXRC Target Structures • Good models: E-value ≤ 1.0e-4 and GAScore ≥ 0.7 • [Figure: models grouped as good models <30% seq. id., good models >30% seq. id., and scope for further improvement (significant E-value, bad model score)] • Only 363 bad models at ≥30% sequence identity
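The grouping on this slide can be sketched in a few lines of Python. This is only an illustration of the stated thresholds (E-value ≤ 1.0e-4, GAScore ≥ 0.7, split at 30% sequence identity), not the NYSGXRC pipeline itself. The `Model` record, its field names, and the "not significant" category are assumptions added for the example, and the slide does not say which side of the split exactly 30% falls on, so the sketch places it in the upper band.

```python
from dataclasses import dataclass

# Thresholds quoted on the slide.
EVALUE_MAX = 1.0e-4    # E-value at or below this counts as significant
GASCORE_MIN = 0.7      # model score at or above this counts as good
SEQ_ID_SPLIT = 30.0    # percent sequence identity separating the two bands


@dataclass
class Model:
    """Hypothetical record for one comparative model (field names are assumed)."""
    name: str
    evalue: float
    ga_score: float
    seq_id: float  # percent sequence identity to the template


def classify(m: Model) -> str:
    """Assign a model to one of the slide's groups."""
    significant = m.evalue <= EVALUE_MAX
    good_score = m.ga_score >= GASCORE_MIN
    # Assumption: exactly 30% identity is placed in the upper band.
    band = ">=30% seq. id." if m.seq_id >= SEQ_ID_SPLIT else "<30% seq. id."
    if significant and good_score:
        return f"good model, {band}"
    if significant:
        # Significant E-value but bad model score.
        return f"scope for improvement, {band}"
    # Category not shown on the slide; added so every input gets a label.
    return f"not significant, {band}"


if __name__ == "__main__":
    examples = [
        Model("model_a", 1e-10, 0.9, 45.0),
        Model("model_b", 1e-6, 0.4, 22.0),
    ]
    for m in examples:
        print(m.name, "->", classify(m))
```

Keeping the thresholds as named constants makes it easy to see how the counts in each group would shift if a stricter E-value or score cutoff were chosen.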
Questions for Homology Modeling Community • Should models be stored in archives or calculated “on the fly”? • Should models from pipeline approaches be centrally accessible? • Should the output of pipeline approaches be made interoperable with the PDB? • Should there be a publicly available model database for storage of modeling results to facilitate peer review? • Should models currently on deposit in the PDB be moved elsewhere? If so, where?