BioMOBY: An architecture for interoperability

BioMOBY: An architecture for interoperability
Benjamin Good, Wilkinson Laboratory, iCAPTURE Centre, University of British Columbia

Acknowledgements
• Mark Wilkinson, Edward Kawas, Nina Opushneva – iCAPTURE @ UBC
• Phillip Lord, Martin Senger – myGrid @ U Manchester
• Heiko Schoof, Rebecca Ernst – MIPS
• Paul Gordon – University of Calgary
• Carole Goble – myGrid @ U Manchester
• Lincoln Stein – CSHL
• Damian Gessler, Andrew Farmer, Gary Schiltz – NCGR
• Bill Crosby, Matthew Links, Luke McCarthy – U of S
• Midori Harris – EBI & GO Consortium
• Mike Niemi – IBM
• Fiona Cunningham, Shuly Avraham – CSHL
• Ken Stuebe – SDSC
• Richard Bruskiewich – IRRI

Outline • What BioMOBY is • Why it was needed • How it works • Current Status • Works in Progress

What BioMOBY is
A generic solution for sharing distributed computational resources

Why it was/is needed: High throughput Biology
[Figure sequence over five slides: a grid of data sources that starts as SGD alone; successive slides replace SGD entries with TAIR, then IRRI, Gramene and MIPS, then GO, and finally IPGRI and “?!?!?”]

An Architecture for Dis-Integration?
[Diagram labels: DB1, Program, DB2]

Web Services: Another architecture for Dis-Integration?
[Diagram labels: API1, API2, API3; WuBlast, Genbank, NCI]

BioMOBY: An architecture for Integration
[Diagram labels: Program, DB2, DB1]

Note the Target Audience
• Not NCBI
• Small to medium-sized resource providers
  • First priority to support their own users
  • Limited time and money
• Makes certain options impossible
  • No massive data warehouse
  • No standardization of implementation (database, programming language)

Outline • What BioMOBY is • Why it was needed • How it works • Current Status

The Moby plan
• Design an ontological framework for data-type creation
• Let independent service providers build data-types using this framework
• Use these data-types to define web service interfaces
• Register these interfaces in a “yellow pages”
• Machines can find an appropriate service
• Machines can execute that service unattended

Object Ontology
• Data types defined in an open, shared GO-like ontology
• Nodes define data Classes
• Edges define the relationships between Classes
• Each edge is one of three relationship types
  • ISA: inheritance relationship; all properties of the parent are present in the child
  • HASA: container relationship of ‘exactly 1’
  • HAS: container relationship of ‘1 or more’
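The three edge types can be sketched as a small data structure. This is only an illustrative model, not BioMOBY's own implementation: the `MobyClass` and `Ontology` names are hypothetical, and the class names are borrowed from the data-type examples used elsewhere in this talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MobyClass:
    """One node in a toy object ontology (hypothetical model)."""
    name: str
    isa: Optional[str] = None                                  # ISA edge: single parent
    hasa: List[Tuple[str, str]] = field(default_factory=list)  # HASA: (articleName, class), exactly 1
    has: List[Tuple[str, str]] = field(default_factory=list)   # HAS: (articleName, class), 1 or more


class Ontology:
    def __init__(self) -> None:
        self.classes: Dict[str, MobyClass] = {}

    def add(self, cls: MobyClass) -> None:
        # A child may only be registered once its parent exists.
        if cls.isa is not None and cls.isa not in self.classes:
            raise ValueError(f"unknown parent class: {cls.isa}")
        self.classes[cls.name] = cls

    def lineage(self, name: str) -> List[str]:
        """Return the class and its ancestors, most specific first."""
        chain: List[str] = []
        current: Optional[str] = name
        while current is not None:
            chain.append(current)
            current = self.classes[current].isa
        return chain

    def is_a(self, child: str, ancestor: str) -> bool:
        return ancestor in self.lineage(child)

    def members(self, name: str) -> List[Tuple[str, str, str]]:
        """Contained members as (edge, articleName, class), inherited ones first."""
        out: List[Tuple[str, str, str]] = []
        for cls_name in reversed(self.lineage(name)):
            cls = self.classes[cls_name]
            out += [("HASA", article, target) for article, target in cls.hasa]
            out += [("HAS", article, target) for article, target in cls.has]
        return out


onto = Ontology()
onto.add(MobyClass("Object"))
onto.add(MobyClass("Integer", isa="Object"))
onto.add(MobyClass("String", isa="Object"))
onto.add(MobyClass("VirtualSequence", isa="Object", hasa=[("length", "Integer")]))
onto.add(MobyClass("GenericSequence", isa="VirtualSequence",
                   hasa=[("SequenceString", "String")]))
onto.add(MobyClass("DNASequence", isa="GenericSequence"))

print(onto.lineage("DNASequence"))
print(onto.members("DNASequence"))
```

Because ISA means every property of the parent is present in the child, `members` walks the lineage from the root down, so a DNASequence reports the `length` it inherits from VirtualSequence as well as the `SequenceString` from GenericSequence.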

Data-typing is the key
• Each Object in the ontology maps to a simple, concise XML Schema
• This rigid yet easily extensible structure facilitates serialization and parsing in any language
• Sharing a framework for creating data-types turns out to be largely sufficient to achieve interoperability

The Simplest Data-Type
<Object namespace="NCBI_gi" id="111076"/>
The combination of a namespace and an identifier within that namespace uniquely identifies a data ‘entity’ (not its representation).
[Diagram: Object]

MOBY Primitives
<Integer namespace="" id="">38</Integer>
[Diagram: DateTime, Float, Integer and String each ISA Object]

A MOBY Data-Type
<VirtualSequence namespace="NCBI_gi" id="111076">
  <Integer namespace="" id="" articleName="length">38</Integer>
</VirtualSequence>
[Diagram: Integer and String ISA Object; VirtualSequence ISA Object and HASA Integer]

A MOBY Data-Type
<GenericSequence namespace="NCBI_gi" id="111076">
  <Integer namespace="" id="" articleName="length">38</Integer>
  <String namespace="" id="" articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</GenericSequence>
[Diagram: as before, plus GenericSequence ISA VirtualSequence and HASA String]

A MOBY Data-Type
<DNASequence namespace="NCBI_gi" id="111076">
  <Integer namespace="" id="" articleName="length">38</Integer>
  <String namespace="" id="" articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</DNASequence>
[Diagram: as before, plus DNASequence ISA GenericSequence]
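As a rough illustration of how little code such a data-type needs, the sketch below builds and re-reads a DNASequence with Python's standard `xml.etree.ElementTree`. The helper names (`moby_element`, `dna_sequence`, `read_members`) are hypothetical, the short sequence is made up, and the sketch ignores any XML namespace or message envelope that a real MOBY message may carry.

```python
import xml.etree.ElementTree as ET


def moby_element(tag, namespace="", id_="", article_name=None, text=None):
    """Build one MOBY-style element; every object carries namespace and id."""
    attrs = {"namespace": namespace, "id": id_}
    if article_name is not None:
        attrs["articleName"] = article_name
    elem = ET.Element(tag, attrs)
    elem.text = text
    return elem


def dna_sequence(namespace, id_, sequence):
    """Assemble a DNASequence holding a length and a SequenceString member."""
    root = moby_element("DNASequence", namespace, id_)
    # Assumption of this sketch: length is derived from the sequence string.
    root.append(moby_element("Integer", article_name="length",
                             text=str(len(sequence))))
    root.append(moby_element("String", article_name="SequenceString",
                             text=sequence))
    return root


def read_members(elem):
    """Map articleName -> text for the direct children of a MOBY object."""
    return {child.get("articleName"): (child.text or "").strip()
            for child in elem}


xml_text = ET.tostring(dna_sequence("NCBI_gi", "111076", "ATGATGATAGATAG"),
                       encoding="unicode")
parsed = ET.fromstring(xml_text)

print(xml_text)
print(read_members(parsed))
```

A consumer that only understands VirtualSequence could call `read_members` on the same element and use just the `length` entry, which is the practical payoff of the ISA chain.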

A portion of the MOBY-S Object Ontology …community-built! 170 registered by 34 authorities

MOBY-S follows the typical Web Service Paradigm
[Diagram labels: MOBY hosts & services (Sequence, Express., Protein, Alleles, …); MOBY Central, the yellow pages (Align, Phylogeny, Primers, Sequence, Alignment, Gene names)]
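The yellow-pages idea can be sketched as a registry that is queried by data type. This is a toy model under stated assumptions, not the MOBY Central API: `Service`, `Registry`, `find_by_input`, the service names and the authorities are all hypothetical, and the rule that a service accepting a parent type also accepts its children is an assumption drawn from the ISA relationship.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Toy ISA table (child -> parent), using class names from the data-type slides.
PARENT: Dict[str, Optional[str]] = {
    "Object": None,
    "VirtualSequence": "Object",
    "GenericSequence": "VirtualSequence",
    "DNASequence": "GenericSequence",
}


def ancestors(data_type: str) -> List[str]:
    """Return the type itself followed by its ISA ancestors."""
    chain: List[str] = []
    current: Optional[str] = data_type
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


@dataclass(frozen=True)
class Service:
    name: str         # hypothetical service name
    authority: str    # who registered it
    input_type: str
    output_type: str


class Registry:
    """A minimal in-memory 'yellow pages'."""

    def __init__(self) -> None:
        self._services: List[Service] = []

    def register(self, service: Service) -> None:
        self._services.append(service)

    def find_by_input(self, data_type: str) -> List[Service]:
        # A service declared on a parent type is assumed to accept any child.
        accepted = set(ancestors(data_type))
        return [s for s in self._services if s.input_type in accepted]


registry = Registry()
registry.register(Service("getSequenceLength", "example.org",
                          "VirtualSequence", "Integer"))
registry.register(Service("translateDNA", "example.org",
                          "DNASequence", "AminoAcidSequence"))
registry.register(Service("describeAnything", "example.net",
                          "Object", "String"))

for service in registry.find_by_input("GenericSequence"):
    print(service.name, "->", service.output_type)
```

A client holding a GenericSequence finds the length and describe services but not the DNA-specific one, without knowing in advance which providers exist.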

Outline • What BioMOBY is • Why it was needed • How it works • Current status • Works in progress

Moby Stats
• Mailing list count: 162 members
• Google Scholar
  • ‘BioMOBY’: 103
  • Citations of original BioMOBY paper: 52
• Google links to biomoby.org: 322

Deployed Moby Services
• Services registered: 478
• Service developers (by contact email): 69
[Figure: http://castor.brc.mcw.edu/files/mobysphere/ ; legend: > 10, < 10. Thanks to Simon Twigger]

Major Implementations
• PlaNet consortium
  • European consortium of plant databases
  • 121 Services
• European Bioinformatics Institute SOAPLab, myGrid
• National Bioinformatics Institute of Spain
  • Nationwide initiative
  • 35 public services (plus many more on private registry)

MOBY Central Activity

It seems to be working! Why?
• It provides useful functionality for the target audience
  • Functionality not currently available from any other WS/SWS project
• It is not difficult to deploy services

Outline • What BioMOBY is • Why it was needed • How it works • Current Status • Works in Progress

Is it useful outside of these consortia? • Many public services now available (via passive altruism). • As a result, interesting clients are emerging.

Client style 1, 2, 3
• Power User: when you want to do what you already know how to do
  • Taverna
    • Produced by the myGrid Consortium
    • Graphical workflow composer and invoker
    • Supports BioMOBY services (and many others)

Taverna

Client style 1, 2, 3
• Quick and Dirty: you know what you have and what you want, but you don’t know how to make it happen
  • MobyGraphs
    • Martin Senger of myGrid
    • Discovers service connectivity between two datatypes
  • PlaNet Service Aggregator
    • Precomputes all possible workflows starting from a single input
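Discovering connectivity between two data types can be viewed as a path search over a graph whose nodes are data types and whose edges are services. The sketch below uses a plain breadth-first search; it is not how MobyGraphs or the PlaNet Service Aggregator are implemented, the service list is invented for illustration, and types are matched exactly, ignoring ISA inheritance.

```python
from collections import deque

# Hypothetical services as (name, input type, output type).
SERVICES = [
    ("idToSequence", "Object", "DNASequence"),
    ("translate", "DNASequence", "AminoAcidSequence"),
    ("blastProtein", "AminoAcidSequence", "BlastReport"),
    ("findPrimers", "DNASequence", "PrimerList"),
]


def find_workflow(services, start, goal):
    """Breadth-first search for the shortest chain of services from start to goal.

    Returns a list of service names, an empty list if start == goal,
    or None when no chain connects the two types.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        data_type, path = queue.popleft()
        if data_type == goal:
            return path
        for name, in_type, out_type in services:
            # Follow each service whose input matches the current type exactly.
            if in_type == data_type and out_type not in seen:
                seen.add(out_type)
                queue.append((out_type, path + [name]))
    return None


print(find_workflow(SERVICES, "Object", "BlastReport"))
print(find_workflow(SERVICES, "PrimerList", "Object"))
```

Dropping the `goal` test and recording every path reached from `start` would give the precompute-everything behaviour described for the aggregator, at the cost of a larger result set.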

Client style 1, 2, 3
• Exploration Mode
  • Gbrowse_moby
  • Ahab
[Figure label: Starting Data]

Ahab
• Java Server Pages
• Simultaneous service invocations
• Session stored as RDF graph
• Results displayed with clickable graph
• 0_1: runs all possible services
• 0_2: gives user control
http://bioinfo.icapture.ubc.ca/bgood/Ahab.html
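Simultaneous invocation can be sketched with a thread pool. Ahab itself is built on Java Server Pages, so this Python version only illustrates the pattern: the three local functions stand in for remote services, and `invoke_all` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def invoke_all(services, data, timeout=30):
    """Call every service on the same input concurrently.

    Returns {service name: ("ok", result)} or {service name: ("error", message)},
    so one failing service does not hide the results of the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        futures = {name: pool.submit(fn, data) for name, fn in services.items()}
        for name, future in futures.items():
            try:
                results[name] = ("ok", future.result(timeout=timeout))
            except Exception as exc:
                results[name] = ("error", str(exc))
    return results


# Local stand-ins for remote services (assumptions for illustration only).
def seq_length(seq):
    return len(seq)


def gc_content(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def always_fails(seq):
    raise RuntimeError("service unavailable")


results = invoke_all(
    {"length": seq_length, "gc": gc_content, "broken": always_fails},
    "ATGC",
)
print(results)
```

Collecting errors per service matters in an exploratory client that runs all possible services, since some registered services will inevitably be offline.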

Core Development
• Make service development even easier
• Expand myGrid collaboration
  • Migrate to their registry & service ontology
  • Enhance support for BioMOBY in Taverna
    • Validation of workflows
    • Workflow construction “wizards”
• Continue development of Ahab
  • Visualization

Conclusions
• BioMOBY was designed to allow distributed communities to share their computational resources, and it seems to be working
• Many new opportunities for real distributed data integration are starting to appear

Sponsors BC Bioinformatics Training Program BioMOBY • National Science Foundation (NSF), USA • Canadian Bioinformatics Resource, NRC, Halifax • Open-Bio Foundation • IBM
