
BioSimGRID and BioSimGRID ’lite’ Towards a worldwide repository for biomolecular simulation

www.biosimgrid.org. Philip C Biggin, http://indigo1.biop.ox.ac.uk, phil@biop.ox.ac.uk. Overview: Introduction - Motivation - Consortium - Case studies – added value from comparisons


Presentation Transcript


  1. BioSimGRID and BioSimGRID ’lite’ • Towards a worldwide repository for biomolecular simulation • www.biosimgrid.org Philip C Biggin http://indigo1.biop.ox.ac.uk phil@biop.ox.ac.uk

  2. Overview • Introduction • - Motivation • - Consortium • - Case studies – added value from comparisons • Design • - Architecture • - Data schema • How to use • - Deposition • - Analysis • - Worldwide application • The Future • Towards computational systems biology

  3. Current Paradigm for MD Simulations • Target selection: literature based; interesting protein/problem • System preparation: highly interactive; slow; idiosyncratic • Simulation: diversity of protocols • Analysis: highly interactive; slow; idiosyncratic • Dissemination: traditional – papers, posters, talks • Archival: ‘archive’ data … and then mislay the tape! • No third party involvement

  4. Integrating Simulations and Structural Biology of Proteins Sequence alignment Biomedically relevant homologue(s) Novel structure (RCSB) bacterial K channel bioinformatics & structural biology Homology model(s) mammalian K channel Biomolecular simulation database MD simulations dynamics in membrane BioSimGRID Comparative analysis Interaction site dynamics Evaluation/refinement of model Biological and pharmacological simulation & modelling e.g. drug discovery drug docking calculations drug discovery

  5. Consortium • Sites: York, Nottingham, Oxford, RAL, London, Bristol, Southampton • Oxford: Mark Sansom, Paul Jeffreys, Bing Wu, Kaihsu Tai • Southampton: Jon Essex, Simon Cox, Stuart Murdock, Muan Hong Ng, Hans Fangohr, Steven Johnston • London: David Moss • Nottingham: Charlie Laughton • York: Leo Caves • Bristol: Adrian Mulholland

  6. Comparative Simulations: Drug Receptors • Why? – increase significance of results • Sampling – long simulations and multiple simulations • Sampling via biology – exploiting evolution • Biology emerges from comparisons… • e.g. mammalian receptor vs. bacterial binding protein • Rat GluR2 EC fragment • Major receptor in mammalian brains – drug target • MD simulations with/without bound ligands • Analyse inter-domain motions • [Figure: GluR2 fragment with bound glutamate; domains labelled D1 and D2]

  7. GluR2 – Flexibility & Gating… • Flexibility depends on ligand occupancy & species • Gating mechanism – decrease in flexibility on channel activation • But … incomplete sampling • Need: longer simulations & comparative simulations • [Figure: RMSD (Å) against time (0 to 2.0 ns) for the empty, +Glu and +Kai simulations; panel labels: Kainate, empty, Glutamate, “ON”, “OFF”]

  8. GlnBP – A Bacterial Binding Protein • [Figure: X-ray structures and MD simulation, each shown empty and Gln-bound] • GlnBP – bacterial 2-domain periplasmic binding protein • Similar fold to mammalian GluR2 • X-ray shows ligand binding induces domain closure • MD shows ligand binding reduces inter-domain motions - cf. GluR2 simulations

  9. Case Study 2… • AChE: Acetylcholinesterase • OMPLA: Outer-membrane phospholipase

  10. So how do we compare… • Similar active sites or similar motions • Different structures • Simulated with different MD packages (analysis difficult if not visualization) • On different hard drives/tapes/CDs/DVDs • Under different graduate students’ desks • Under different postdocs’ beds • In different rubbish bins!

  11. Answer… Create a worldwide repository of molecular simulations… → BioSimGrid = BioSimDB + Toolkits + Integration

  12. BioSimGrid Architecture… • GUI: Web Application; Python Application • Connections: HTTP(S); SSH; TCP/IP • Service: Apache / Tomcat / SSL / Python; Authentication, Authorisation, Accounting; Trajectory Query Tool; SQL Editor; Data Retrieval Tool; Data Deposition Tool; Video/Img Engine; HTML Generator; Analysis Tool • Middleware: BioSim Data Engine / Storage Resource Broker • DB/Data: Database; Flat Files

  13. Cross-software Analysis… • BioSimDB = PDB (or NDB) for MD • → enable discovery of new science (cf. genomics/proteomics initiatives) • [Figure: BioSimDB at the centre, fed by CHARMM, NAMD, LAMMPS, AMBER, GROMACS and TINKER]

  14. It’s a Distributed Database • Nobody has enough disk space in one place anyway • Distributed and duplicate • Any piece of information is stored in at least two sites • …for resilience
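The "at least two sites" rule can be illustrated with a small placement sketch. This is not the BioSimGrid implementation (which the slides describe as built on the Storage Resource Broker); the site names, trajectory IDs and the round-robin policy below are all assumptions made for illustration.

```python
# Hypothetical site names; the slides only say data lives at two or more sites.
SITES = ["oxford", "southampton", "london"]


def place_replicas(trajectory_ids, sites, copies=2):
    """Assign each trajectory to `copies` distinct sites, round-robin."""
    if copies < 2:
        raise ValueError("resilience needs at least two copies")
    if len(sites) < copies:
        raise ValueError("not enough sites for the requested number of copies")
    placement = {}
    for index, traj_id in enumerate(trajectory_ids):
        # Consecutive offsets modulo len(sites) are distinct while copies <= len(sites).
        placement[traj_id] = [sites[(index + k) % len(sites)] for k in range(copies)]
    return placement


def survives_loss_of(placement, lost_site):
    """True if every trajectory still has a copy after one site disappears."""
    return all(
        any(site != lost_site for site in holders) for holders in placement.values()
    )


placement = place_replicas(["traj-001", "traj-002", "traj-003"], SITES)
for traj_id, holders in placement.items():
    print(traj_id, "->", holders)
```

With two copies per trajectory, losing any single site still leaves one copy of everything, which is the resilience argument made on the slide.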

  15. Current Architecture • Two sites: soton.biosimgrid.org and oxford.biosimgrid.org • At each site: BioSim Data Engine Services; SRB Server with MCAT; SRB Agent; IDA; F/F Interface and F/F Engine over Flat Files; DB Interface and DB Engine over the Database; Cache

  16. Data Schema • The hierarchy is like that in the PDB: • Chain → residue → atom → coordinate • …but also extended in the time dimension: frames
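A minimal sketch of such a hierarchy, assuming one coordinate per atom per frame. The class and field names are illustrative only and are not taken from the actual BioSimGrid schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Coordinate = Tuple[float, float, float]


@dataclass
class Atom:
    name: str
    # Time dimension: index i holds this atom's position in frame i.
    coordinates: List[Coordinate] = field(default_factory=list)


@dataclass
class Residue:
    name: str
    number: int
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


def frame_coordinates(chain: Chain, frame: int) -> List[Coordinate]:
    """Collect the positions of every atom in the chain for a single frame."""
    return [
        atom.coordinates[frame]
        for residue in chain.residues
        for atom in residue.atoms
    ]


# Made-up two-atom, two-frame example.
gly = Residue("GLY", 1, [
    Atom("N", [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]),
    Atom("CA", [(1.5, 0.0, 0.0), (1.6, 0.1, 0.0)]),
])
chain_a = Chain("A", [gly])
print(frame_coordinates(chain_a, 1))
```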

  17. Metadata.. • …is the data about data • MD setup, parameters, instantaneous properties, etc. • People currently write this in papers • People forget something • The disciplined way:- • …structured schema
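One way to picture a structured schema is a record type whose required fields can be checked at deposition time, so that nothing is forgotten. The field names below are assumptions chosen for illustration, not the real BioSimGrid metadata schema.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class SimulationMetadata:
    # Hypothetical fields covering "MD setup, parameters" from the slide.
    package: str
    force_field: str
    temperature_kelvin: float
    timestep_fs: float
    n_frames: int
    notes: Optional[str] = None  # optional free text


def missing_fields(record: dict) -> list:
    """Return the required metadata fields that a deposited record lacks."""
    required = [
        f.name
        for f in fields(SimulationMetadata)
        if f.default is MISSING and f.default_factory is MISSING
    ]
    return [name for name in required if name not in record]


# "example-ff" is a placeholder value, not a real force field name.
incomplete = {"package": "GROMACS", "force_field": "example-ff"}
print(missing_fields(incomplete))
```

A deposition tool could refuse a record until `missing_fields` returns an empty list, which is the discipline a paper's methods section cannot enforce.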

  18. Deposition… Unified deposition for trajectories from any packages.

  19. Analysis

  20. BioSimDB Toolkit • Analysis tools • Radius of Gyration • Surface and Volume • RMSD/RMSF • Centre of Mass • Inter-atomic distances • Distance matrix • Internal angles • Principal Component Analysis • Average structure
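Three of the listed quantities can be written in a few lines each. The sketch below uses only the Python standard library and is not the BioSimGrid toolkit code; in particular, this RMSD compares coordinates as given and performs no superposition (fitting) step, which analysis packages usually apply first.

```python
import math


def centre_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[i] for c, m in zip(coords, masses)) / total for i in range(3)
    )


def radius_of_gyration(coords, masses):
    """Mass-weighted root-mean-square distance of atoms from the centre of mass."""
    com = centre_of_mass(coords, masses)
    total = sum(masses)
    weighted = sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3))
        for c, m in zip(coords, masses)
    )
    return math.sqrt(weighted / total)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two frames, without fitting."""
    if len(coords_a) != len(coords_b):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        sum((a[i] - b[i]) ** 2 for i in range(3)) for a, b in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up two-atom frames: the second is the first shifted by (0, 3, 4).
frame_0 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frame_1 = [(0.0, 3.0, 4.0), (2.0, 3.0, 4.0)]
masses = [1.0, 1.0]
print(centre_of_mass(frame_0, masses), radius_of_gyration(frame_0, masses))
print(rmsd(frame_0, frame_1))
```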

  21. Current Implementation

  22. New workflow with BioSimGrid • Target selection: literature based; interesting protein/problem • Perform simulation (or use someone else’s) • Protocols more systematically recorded/checked/confirmed • Archive data to BioSimGrid • Analyse shared data (either locally or distributed) • Dissemination: traditional – papers, posters, talks • Store results in BioSimGrid • Third parties can analyse data you deposit

  23. That’s dandy - but who is this aimed at? • Novice and Expert.. • Novice (web/GUI) • Makes selections • Guided through the options • Can only do specific things • Difficult to make mistakes • Expert (employ scripting) • Python interpreter • Much available • Reasonably unrestricted

  24. Example sessions

  25. Example sessions

  26. Example sessions

  27. Example sessions

  28. Example sessions

  29. Example sessions

  30. Example sessions

  31. Example sessions • Even in script mode the syntax is quite informative: • FC = FrameCollection('2, 100-200') • myRMSD = RMSD(FC) • myRMSD.createPNG() • → Provides biochemists with little computational experience a means of analysing computational data and obtaining meaningful results.
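The slides do not say how the selection string passed to FrameCollection is interpreted. One plausible reading, assumed here, is "trajectory 2, frames 100 to 200 inclusive". The helper below is a hypothetical sketch of that reading and is not part of the BioSimGrid API.

```python
def parse_frame_selection(selection):
    """Split 'TRAJECTORY, FIRST-LAST' into a trajectory ID and a frame range.

    Assumed format only: the real FrameCollection syntax may differ.
    """
    trajectory_part, frame_part = (piece.strip() for piece in selection.split(","))
    first, last = (int(number) for number in frame_part.split("-"))
    if last < first:
        raise ValueError("frame range must run from low to high")
    # range() excludes its end point, so add 1 to keep the last frame.
    return int(trajectory_part), range(first, last + 1)


trajectory_id, frames = parse_frame_selection("2, 100-200")
print(trajectory_id, frames[0], frames[-1], len(frames))
```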

  32. Example sessions Viewlet of a session; Demo4.html

  33. BioSimGrid ‘Lite’ • Light version before final rollout • Provides equilibrated lipid bilayer boxes • Also provides ontogeny: How the box came about… • …metadata • …equilibration process (all the frames)

  34. Deliverables to Date… • Database schema • Sample database (with test trajectories) • Prototype shared between 2 sites • Analysis tools – preliminary versions (about 14 tools) • Interface to database for data retrieval • Python hosting environment

  35. Roadmap • Dec 2002 – project started • July 2003 – (internal) prototype • September 2003 – working prototype (All Hands meeting) • November 2003 – test ‘real world’ applications • December 2003 – multi-site prototype • 2004 – multi-site deposition of data • 2005 – open up to additional groups for deposition/testing

  36. If you are interested… The team would like to hear from interested parties especially with new ideas etc • Benefits to you • New directions are implemented • Toolkit suits your needs • Shared development of code • Faster and more thorough development • BioSimGrid Benefits • Larger user community • More work gets done • Code is efficient. • BioSimGrid and community is successful

  37. Future Directions in the GRID context • HTMD – simulations coupled to structural genomics • Diamond light source • Computational system biology – virtual outer membrane • HPCx • Multiscale biomolecular simulations – from QM/MM to meso-scale modelling • GRID-enabled simulations BioSimGrid

  38. Structural Genomics & HTMD synchrotron compute GRID MD database novel biology… • Overall vision – simulation as an integral component of structural genomics • Needs capacity computation – GRID? • MD database (distributed) – BioSimGRID

  39. Towards a Virtual Outer Membrane (vOM) • [Figure: outer membrane components TolC, OMPLA, OpcA, OmpT, OmpX, FhuA, LamB, OmpA, OmpF, PhoE, TonB, MalE, FhuD, PiBP and Pi, with labels d- and d+] • First step towards computational systems biology – a suitable system • Bacterial OMs – 5 or 6 proteins = 90% of protein content • Structures or good homology models of proteins are available • Complex lipid – outer leaflet is lipopolysaccharide (LPS) • Minimum system size ca. 2.5 × 10^6 atoms; simulation times ca. 50 ns • cf. current FhuA – 80,000 atoms & 10 ns – need HPCx
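The gap between the current FhuA simulation and the proposed vOM can be made concrete with a rough ratio. The figures come from the slide; the assumption that cost grows linearly with atoms × simulated time is a deliberate simplification, since real MD cost also depends on the algorithms and hardware used.

```python
# Figures quoted on the slide.
VOM_ATOMS = 2.5e6
VOM_NS = 50
FHUA_ATOMS = 80_000
FHUA_NS = 10

size_ratio = VOM_ATOMS / FHUA_ATOMS  # how many times more atoms
time_ratio = VOM_NS / FHUA_NS        # how many times longer in simulated time

# Naive estimate: cost proportional to atoms x nanoseconds (an assumption).
cost_ratio = size_ratio * time_ratio

print(f"atoms: x{size_ratio:g}, time: x{time_ratio:g}, atom-ns: x{cost_ratio:g}")
```

Under this simple model the vOM needs roughly 156 times the atom-nanoseconds of the FhuA run, which is consistent with the slide's call for HPCx-class resources.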

  40. Multiscale Biomolecular Simulations • QM (Bristol) • Drug-binding (Southampton) • Protein Motions (Oxford) • Drug Diffusion (London) • Membrane bound enzymes – major drug targets (cf. ibuprofen, anti-depressants, endocannabinoids) • Complex multi-scale problem: QM/MM; ligand binding; membrane/protein fluctuations; diffusive motion of substrates/drugs in multiple phases • Need for GRID-based integrated simulations

  41. References… • 1. K. Tai, S. Murdock, B. Wu, M.H. Ng, S. Johnston, H. Fangohr, S. Cox, P. Jeffreys, J. Essex, M.S.P. Sansom. Org. Biomol. Chem. Under review. • 2. M.H. Ng, S. Johnston, S. Murdock, B. Wu, K. Tai, H. Fangohr, S. Cox, J. Essex, M.S.P. Sansom, P. Jeffreys. UK e-Science Programme All Hands Meeting 2004. Accepted. • 3. Python website – www.python.org • 4. BioSimGrid – www.biosimgrid.org

  42. Acknowledgements • Oxford e-Science Center: Professor Paul Jeffreys, Dr Bing Wu (database management), Matthew Dovey, Ivaylo Kostadinov • Oxford: Professor Mark Sansom, Dr Carmen Domene, Dr Alessandro Grottesi, Dr Andrew Hung, Dr Daniele Bemporad, Dr Shozeb Haider, Dr Kaihsu Tai (curation and integration), Dr George Patargias, Oliver Beckstein, Jennifer Johnston, Syma Khalid, Jorge Pikunic, Pete Bond, Zara Sands, Jonathan Cuthbertson, Sundeep Deol, Jeff Campbell, Yalini Pathy, Loredana Vaccaro, Shiva Amiri, Katherine Cox, Robert d’Rozario, John Holyoake, Samantha Kaye, Anthony Ivetac, Sylvanna Ho • Southampton: Dr Stuart Murdock (generic analysis tools), Dr Muan Hong Ng (data retrieval), Dr Hans Fangohr, Steven Johnston, Prof Simon Cox, Dr Jon Essex • Elsewhere: Leo Caves (York), Charles Laughton (Nottingham), David Moss (Birkbeck), Oliver Smart (Birmingham), Adrian Mulholland (Bristol), Marc Baaden (Paris) • Funders: BBSRC, DTI, The Wellcome Trust, GSK, EC (TMR), OeSC (EPSRC & DTI), EPSRC, OSC (JIF), MRC

  43. More information… team@biosimgrid.org www.biosimgrid.org
