Structural Proteomics Automatic Target Selection


  1. Structural Proteomics Automatic Target Selection Gordon Whamond

  2. Project Overview
     • Aim:
        • Provide a resource that facilitates the automatic selection of potential targets for protein structure determination while minimising human interaction with the software (if required).
     • Input:
        • Raw amino acid sequence
        • UniProt accession number
        • UniProt accession number and a sequence range
     • Output:
        • Query sequence showing possible domains
        • All candidates for structure determination
        • Recommendation for which sequence to use
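
The three input types listed above could be told apart with a small parser. The following is a minimal sketch, not the project's implementation: the `Query` class, the `parse_query` function, and the `ACCESSION:START-END` range syntax are assumptions made for illustration. The accession pattern follows the format published by UniProt.

```python
import re
from dataclasses import dataclass
from typing import Optional

# UniProt accession format (6 or 10 characters), as published by UniProt.
ACCESSION = (
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
)
# Hypothetical range syntax for this sketch: ACCESSION:START-END
ACCESSION_RE = re.compile(rf"^({ACCESSION})(?::(\d+)-(\d+))?$")
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass
class Query:
    kind: str  # "sequence", "accession", or "accession_range"
    value: str  # the cleaned sequence or the accession
    start: Optional[int] = None  # 1-based, inclusive
    end: Optional[int] = None  # 1-based, inclusive


def parse_query(text: str) -> Query:
    """Classify user input as one of the three supported input types."""
    cleaned = "".join(text.split()).upper()
    match = ACCESSION_RE.match(cleaned)
    if match:
        accession, start, end = match.groups()
        if start is None:
            return Query("accession", accession)
        start_i, end_i = int(start), int(end)
        if start_i < 1 or end_i < start_i:
            raise ValueError(f"invalid sequence range: {start}-{end}")
        return Query("accession_range", accession, start_i, end_i)
    if cleaned and set(cleaned) <= AMINO_ACIDS:
        return Query("sequence", cleaned)
    raise ValueError("input is neither a sequence nor a UniProt accession")
```

For example, `parse_query("P12345:10-120")` yields an `accession_range` query, while a string containing only the 20 standard one-letter amino acid codes is treated as a raw sequence.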

  3. Considerations
     • Is there a known structure?
     • Are there Classified Structural (CATH, SCOP) Domains?
     • Are there Known Sequence (Pfam) Domains?
     • Are there Predicted Structural (Gene3D, Superfamily) Domains?
     • Do Domain Boundaries Conform to Secondary Structure Restrictions?
     • Which Species has a Representative Domain that is the Most Compactly Folded?
     • The core implementation needs to be extendible and easily maintainable.
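
The slides do not spell out the secondary structure restrictions. As a hedged illustration, the sketch below assumes the rule is that a domain boundary should not cut through a helix or strand. The function names, the `H`/`E` letter coding, and the `max_shift` parameter are hypothetical.

```python
from typing import Optional


def boundary_in_loop(ss: str, position: int) -> bool:
    """Return True if the 1-based residue position is not in a helix (H) or strand (E)."""
    return ss[position - 1] not in "HE"


def nearest_loop_position(ss: str, position: int, max_shift: int = 10) -> Optional[int]:
    """Find the closest position to `position` that lies outside helices and strands.

    Searches outward up to `max_shift` residues, trying the N-terminal side
    first at each distance. Returns None if no suitable position is found.
    """
    for shift in range(max_shift + 1):
        for candidate in (position - shift, position + shift):
            if 1 <= candidate <= len(ss) and boundary_in_loop(ss, candidate):
                return candidate
    return None
```

With a predicted secondary structure string such as `"CCHHHHHHCCCEEEEECC"`, a boundary proposed at residue 5 falls inside a helix, and the search moves it to the nearest non-helix, non-strand residue.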

  4. Taverna
     The software is to be implemented using the Taverna workbench. This is a tool that can be used to formulate the workflow and implement each of the processes as distributed web services.
     • Advantages:
        • Distributed computing reduces resource requirements
        • Easily extendible system
        • Maintenance issues shifted to external providers
     • Disadvantages:
        • Learning curve
        • Convincing service providers to adopt a standard format
        • Maintenance issues shifted to external providers
     Tom Oinn - http://taverna.sourceforge.net/

  5. Taverna
     The prototype workflow is quite complex when expanded to show all of the incorporated sub-workflows. Luckily, Taverna can provide a top-level view.

  6. Taverna

  7. Dealing With DAS

  8. Taverna

  9. Process Data
     Secondary Structure Elements: (Method not yet chosen)
     Sequence Domains: Pfam, Gene3D, Superfamily, etc.
     Protein Folding: RONN, FoldIndex, DisEMBL
     Rank Target Selection: Based on loop lengths, folding predictions, etc.
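
The ranking step could combine these predictions into a single score per candidate. The sketch below is purely illustrative: the slides do not give a scoring scheme, so the `Candidate` fields, the weights, and the choice to reward loop residues at the termini while penalising predicted disorder are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    start: int  # 1-based start of the candidate construct
    end: int  # 1-based end of the candidate construct
    disorder_fraction: float  # 0..1, e.g. derived from a disorder predictor
    min_flanking_loop: int  # loop residues at the shorter-flanked terminus


def score(c: Candidate, disorder_weight: float = 1.0,
          loop_weight: float = 0.1, loop_cap: int = 5) -> float:
    """Higher is better: reward flanking loop (capped), penalise predicted disorder."""
    loop_term = loop_weight * min(c.min_flanking_loop, loop_cap)
    return loop_term - disorder_weight * c.disorder_fraction


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from most to least promising."""
    return sorted(candidates, key=score, reverse=True)
```

A usage example with made-up values:

```python
candidates = [
    Candidate("C", 1, 90, disorder_fraction=0.10, min_flanking_loop=0),
    Candidate("A", 5, 120, disorder_fraction=0.05, min_flanking_loop=4),
    Candidate("B", 1, 150, disorder_fraction=0.40, min_flanking_loop=6),
]
best = rank(candidates)[0]  # the candidate to recommend under these assumed weights
```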

  10. Starting the Process

  11. Monitoring Progress

  12. Assess Data

  13. Review Results

  14. Extensibility
     • Java Services
        • Straightforward to provide as a web service using Tomcat and Axis
        • WSDL (describing the service) can be generated automatically
     • Legacy Software
        • Any command line based tools can be wrapped into a web service using Soaplab
        • For example the EMBOSS tools are already available

  15. Extensibility
     Output Format: To ensure generic service compatibility it helps to define a common results format. As a result we are using the e-Family service schema (http://www.efamily.org.uk/).
     Current collaborators include:
        • The Weizmann Institute - FoldIndex
        • University of Oxford - RONN

  16. Results Viewers http://www.efamily.org.uk/software/dasclients/spice/

  17. Conclusions
     • Taverna and Web Services:
        • Taverna facilitates the provision of complex distributed systems that utilise web services
        • This reduces maintenance overheads and keeps technology requirements at a reasonable level
        • It is also easily extensible to accommodate new services
     • Availability:
        • Hopefully the core system will be ready by the end of the year
        • This will provide the basic workflow for users to customise according to their needs

  18. Acknowledgments
     Thanks to:
        • Tom Oinn
        • Andreas Prlic
        • The RONN and FoldIndex teams
        • The MSD Group
