
BioMOBY 2005: It's working! Now What?!

Benjamin Good, Wilkinson Laboratory, iCAPTURE Centre, University of British Columbia. http://bioinfo.icapture.ubc.ca/bgood


Presentation Transcript


  1. BioMOBY 2005: It's working! Now What?! Benjamin Good, Wilkinson Laboratory, iCAPTURE Centre, University of British Columbia http://bioinfo.icapture.ubc.ca/bgood

  2. Acknowledgements • Mark Wilkinson, Edward Kawas, Nina Opushneva – iCAPTURE @ UBC • Phillip Lord, Martin Senger – myGrid @ U Manchester • Heiko Schoof, Rebecca Ernst – MIPS • Paul Gordon – University of Calgary • Carole Goble – myGrid @ U Manchester • Lincoln Stein – CSHL • Damian Gessler, Andrew Farmer, Gary Schiltz – NCGR • Bill Crosby, Matthew Links, Luke McCarthy – U of S • Midori Harris – EBI & GO Consortium • Mike Niemi – IBM • Fiona Cunningham, Shuly Avraham – CSHL • Ken Stuebe – SDSC

  3. Outline • What BioMOBY is • Why it was needed • How it works • What is being done with it now • What might be next.

  4. What BioMOBY is A generic solution for sharing distributed computational resources

  5. Why it was needed • High throughput Biology [Figure: eight data sources, all labeled SGD]

  6. Why it was needed • High throughput Biology [Figure: data sources SGD (×7), TAIR]

  7. Why it was needed • High throughput Biology [Figure: data sources SGD (×5), NCGR, MIPS, TAIR]

  8. Why it was needed • High throughput Biology [Figure: data sources SGD (×4), NCGR, GO, TAIR, MIPS]

  9. Why it was needed • High throughput Biology [Figure: data sources SGD (×3), NCGR, GO, TAIR, MIPS, and one labeled "?!?!?"]

  10. Dis-Integration? [Figure: DB1, Program, DB2]

  11. Moby DIC Meeting Sept. 2001 • Model Organism Bring Your Own Database Interface Conference • All model organism databases invited • Some could not attend because it happened right after September 11th • BioMOBY project emerged from this meeting

  12. Note the Target Audience • Not NCBI • Small to medium sized resource providers • First priority to support their own users • Limited time and money • Makes certain options impossible • No massive data warehouse • No standardization of implementation (database, programming language)

  13. Outline • What BioMOBY is • Why it was needed • How it works • What is being done with it now • What might be next.

  14. The Moby plan • Design an ontological framework for data-type creation • Let independent service providers build data-types using this framework • Use these data-types to define web service interfaces. • Register these interfaces in a “yellow pages” • Machines can find an appropriate service • Machines can execute that service unattended

  15. Object Ontology • Data types defined in an open, shared GO-like ontology • Nodes define data Classes • Edges define the relationships between Classes • Edges define one of three relationships • ISA • Inheritance relationship • All properties of the parent are present in the child • HASA • Container relationship of ‘exactly 1’ • HAS • Container relationship with ‘1 or more’
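
The three edge types above can be modeled in a few lines. The following is a minimal sketch, not BioMOBY's own data model: the `MobyClass` class and its method names are hypothetical, and the example classes are taken from the data-type slides of this talk.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MobyClass:
    """Hypothetical node in a MOBY-style object ontology."""
    name: str
    isa: Optional["MobyClass"] = None                           # ISA: single parent
    hasa: Dict[str, "MobyClass"] = field(default_factory=dict)  # HASA: exactly 1
    has: Dict[str, "MobyClass"] = field(default_factory=dict)   # HAS: 1 or more

    def lineage(self):
        """Yield this class, then each ancestor up the ISA chain."""
        node = self
        while node is not None:
            yield node
            node = node.isa

    def is_a(self, other):
        """True if `other` is this class or one of its ISA ancestors."""
        return any(node is other for node in self.lineage())

    def members(self):
        """Own and inherited members, keyed by articleName."""
        merged = {}
        for cls in reversed(list(self.lineage())):
            for article, child in cls.hasa.items():
                merged[article] = (child.name, "HASA")
            for article, child in cls.has.items():
                merged[article] = (child.name, "HAS")
        return merged


obj = MobyClass("Object")
integer = MobyClass("Integer", isa=obj)
string = MobyClass("String", isa=obj)
virtual_seq = MobyClass("VirtualSequence", isa=obj, hasa={"length": integer})
generic_seq = MobyClass("GenericSequence", isa=virtual_seq,
                        hasa={"SequenceString": string})
dna_seq = MobyClass("DNASequence", isa=generic_seq)

print(dna_seq.is_a(obj))
print(dna_seq.members())
```

Because ISA means all properties of the parent are present in the child, `members()` walks the chain from the root down, so `DNASequence` reports both `length` and `SequenceString` without declaring either itself.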

  16. Data-typing is the key • Each Object in the ontology maps to a simple, concise XML Schema • This rigid yet easily extensible structure facilitates serialization and parsing in any language. • Sharing a framework for creating data-types turns out to be largely sufficient to achieve interoperability

  17. The Simplest Data-Type <Object namespace='NCBI_gi' id='111076'/> The combination of a namespace and an identifier within that namespace uniquely identifies a data ‘entity’ (not its representation). [Figure: Object]
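
To illustrate the entity-versus-representation point, here is a small sketch using Python's standard `xml.etree.ElementTree`. The `entity_key` helper is a hypothetical name, and the XML strings use straight quotes so that they parse.

```python
import xml.etree.ElementTree as ET


def entity_key(xml_text):
    """Return the (namespace, id) pair that identifies the entity."""
    elem = ET.fromstring(xml_text)
    return (elem.get("namespace"), elem.get("id"))


# A bare reference to the entity.
bare = "<Object namespace='NCBI_gi' id='111076'/>"

# A richer representation of the same entity.
rich = (
    "<DNASequence namespace='NCBI_gi' id='111076'>"
    "<String namespace='' id='' articleName='SequenceString'>ATGATG</String>"
    "</DNASequence>"
)

same_entity = entity_key(bare) == entity_key(rich)
print(entity_key(bare), same_entity)
```

Both documents yield the same key, even though one carries no data and the other carries a sequence string.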

  18. MOBY Primitives <Integer namespace='' id=''>38</Integer> [Figure: DateTime, Float, Integer, and String each ISA Object]

  19. A MOBY Data-Type <VirtualSequence namespace='NCBI_gi' id='111076'> <Integer namespace='' id='' articleName='length'>38</Integer> </VirtualSequence> [Figure: Integer, String, and VirtualSequence each ISA Object; VirtualSequence HASA Integer]

  20. A MOBY Data-Type <GenericSequence namespace='NCBI_gi' id='111076'> <Integer namespace='' id='' articleName='length'>38</Integer> <String namespace='' id='' articleName='SequenceString'> ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC </String> </GenericSequence> [Figure: as before, plus GenericSequence ISA VirtualSequence; GenericSequence HASA String]

  21. A MOBY Data-Type <DNASequence namespace='NCBI_gi' id='111076'> <Integer namespace='' id='' articleName='length'>38</Integer> <String namespace='' id='' articleName='SequenceString'> ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC </String> </DNASequence> [Figure: as before, plus DNASequence ISA GenericSequence]
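
The claim that this structure is easy to serialize and parse in any language can be tried out with a short sketch. It uses only Python's standard `xml.etree.ElementTree`; the helper names `build_sequence` and `read_members` are hypothetical, and the sketch computes `length` from the sequence string instead of copying the value shown on the slides.

```python
import xml.etree.ElementTree as ET


def build_sequence(class_name, namespace, identifier, sequence):
    """Serialize a sequence object in the style shown on the slides."""
    root = ET.Element(class_name, {"namespace": namespace, "id": identifier})
    length = ET.SubElement(
        root, "Integer", {"namespace": "", "id": "", "articleName": "length"})
    length.text = str(len(sequence))
    seq = ET.SubElement(
        root, "String",
        {"namespace": "", "id": "", "articleName": "SequenceString"})
    seq.text = sequence
    return root


def read_members(elem):
    """Map each child's articleName to its (type, value) pair."""
    return {
        child.get("articleName"): (child.tag, (child.text or "").strip())
        for child in elem
    }


xml_text = ET.tostring(
    build_sequence("DNASequence", "NCBI_gi", "111076", "ATGATG"),
    encoding="unicode")
parsed = ET.fromstring(xml_text)

print(xml_text)
print(parsed.tag, parsed.get("namespace"), parsed.get("id"))
print(read_members(parsed))
```

A client that only understands `GenericSequence` could read the same two members from a `DNASequence`, since the child inherits them unchanged.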

  22. A portion of the MOBY-S Object Ontology …community-built! 137 registered by 34 authorities

  23. How it works [Figure: Service Providers (Sequence, Express., Protein, Alleles, …, Gene names); MOBY Central registry; Client]
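
The "yellow pages" lookup from the Moby plan can be sketched as a toy in-memory registry. This is not the MOBY Central API: `ToyRegistry`, the service names, and the example.org endpoints are all hypothetical. The sketch also assumes that a service registered for a parent type accepts any ISA descendant of that type.

```python
# Child -> parent (ISA) links, taken from the data-type slides.
ISA = {
    "Object": None,
    "VirtualSequence": "Object",
    "GenericSequence": "VirtualSequence",
    "DNASequence": "GenericSequence",
}


def lineage(data_type):
    """Yield a type followed by each of its ISA ancestors."""
    while data_type is not None:
        yield data_type
        data_type = ISA.get(data_type)


class ToyRegistry:
    """Hypothetical stand-in for a service registry."""

    def __init__(self):
        self._services = []

    def register(self, name, input_type, output_type, endpoint):
        self._services.append({
            "name": name,
            "input": input_type,
            "output": output_type,
            "endpoint": endpoint,
        })

    def find_by_input(self, data_type):
        """Services whose input is the given type or one of its ancestors."""
        accepted = set(lineage(data_type))
        return [s for s in self._services if s["input"] in accepted]


registry = ToyRegistry()
registry.register("getSequenceLength", "VirtualSequence", "Integer",
                  "http://example.org/moby/getSequenceLength")
registry.register("reverseComplement", "DNASequence", "DNASequence",
                  "http://example.org/moby/reverseComplement")
registry.register("describeEntity", "Object", "String",
                  "http://example.org/moby/describeEntity")

for service in registry.find_by_input("GenericSequence"):
    print(service["name"], service["endpoint"])
```

A client holding a `GenericSequence` finds the services registered for `VirtualSequence` and `Object`, but not the one that requires a `DNASequence`.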

  24. Outline • What BioMOBY is • Why it was needed • How it works • What is being done with it now • What might be next.

  25. Moby Stats • Mailing list: 162 members • Google Scholar • ‘BioMOBY’: 103 • Citations of original BioMOBY paper: 52 • Google links to biomoby.org: 322

  26. Deployed Moby Services • Services registered: 272 total, 249 non-test • Service developers (by contact email): 69 • Budget: US$230,000, 3 years [Figure: http://castor.brc.mcw.edu/files/mobysphere/, legend: > 10, < 10; thanks to Simon Twigger]

  27. Major Implementations • PlaNet consortium • European consortium of plant databases • 121 Services • National Bioinformatics Institute of Spain • Nationwide initiative • 35 Services • CGIAR-GCP & ACPFG

  28. Registry use 2004-2005 [Figure: requests per month; annotation: PlaNet implements separate Moby registry]

  29. It seems to be working! Why? • It provides useful functionality for the target audience. • Functionality not currently available from any other WS/SWS project • It is not difficult to deploy services.

  30. Is it useful outside of these consortia? • Many public services now available (via passive altruism). • As a result, interesting clients are emerging.

  31. Client style 1,2,3 • Power User: when you want to do what you already know how to do • Taverna • Produced by the myGrid Consortium • Graphical workflow composer and invoker • Supports BioMOBY services (and many others)

  32. Taverna

  33. Client style 1,2,3 • Quick and Dirty: you know what you have and what you want, but you don’t know how to make it happen • MobyGraphs • Martin Senger of myGrid • Discovers service connectivity between two datatypes • PlaNet Service Aggregator • Precomputes all possible workflows starting from a single input
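
Finding service connectivity between two data types amounts to a path search over a graph whose nodes are data types and whose edges are services. The following breadth-first search is an illustrative sketch only, not the MobyGraphs implementation. The service names and the `AminoAcidSequence`, `BlastReport`, and `GOTerm` types are hypothetical, and ISA inheritance is ignored for brevity.

```python
from collections import deque

# (service name, input type, output type); all entries are made up.
SERVICES = [
    ("idToSequence", "Object", "DNASequence"),
    ("translate", "DNASequence", "AminoAcidSequence"),
    ("sequenceToGO", "DNASequence", "GOTerm"),
    ("blastProtein", "AminoAcidSequence", "BlastReport"),
]


def find_workflow(services, start, goal):
    """Return a shortest chain of service names from start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        data_type, path = queue.popleft()
        if data_type == goal:
            return path
        for name, in_type, out_type in services:
            if in_type == data_type and out_type not in seen:
                seen.add(out_type)
                queue.append((out_type, path + [name]))
    return None


print(find_workflow(SERVICES, "Object", "BlastReport"))
print(find_workflow(SERVICES, "BlastReport", "Object"))
```

Precomputing every workflow from a single input, as the PlaNet Service Aggregator is described as doing, would correspond to running the same traversal without a goal and recording every path reached.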

  34. Client style 1,2,3 • Exploration Mode • Gbrowse_moby • Ahab [Figure label: Starting Data]

  35. Ahab • Java Server Pages • Simultaneous service invocations • Session stored as RDF graph • Results displayed with clickable graph. • 0_1 Runs all possible services • 0_2 Gives user control http://bioinfo.icapture.ubc.ca/bgood/Ahab.html

  36. Outline • What BioMOBY is • Why it was needed • How it works • What is being done with it now • What might be next.

  37. Current Development • Make service development even easier • Expand myGrid collaboration • Migrate to their registry & service ontology • Enhance support for BioMOBY in Taverna • Validation of workflows • Workflow construction “wizards” • Continue Development of Ahab • Visualization

  38. Current Research • mySWeb • “Ishmael” MOBY exploration tool • Unattended construction of a personalized semantic web centered around user requests • Minimally curated community ontology construction • It can work • How can we use and improve the process?

  39. Summary • BioMOBY was designed to allow distributed communities to share their computational resources, and it seems to be working • Many new opportunities for real distributed data integration are starting to appear • New ways of thinking about the Semantic Web are arising!

  40. Conclusion • If the Service Web and the Semantic Web are to succeed as the WWW has, end-users and novice developers must be able to contribute easily. • BioMOBY is working because it makes this possible.

  41. Sponsors BC Bioinformatics Training Program BioMOBY • National Science Foundation (NSF), USA • Canadian Bioinformatics Resource, NRC, Halifax • Open-Bio Foundation • IBM
