myExperiment – A Web 2.0 Virtual Research Environment. David De Roure, Carole Goble

Presentation Transcript


  1. myExperiment – A Web 2.0 Virtual Research Environment • David De Roure • Carole Goble

  2. Overview • e-Science is about scientists doing science • A Tale of Two Projects • myExperiment • Design Patterns for a VRE

  3. CombeChem pilot project • Video • Simulation • Properties • Analysis • Structures Database • Diffractometer • X-Ray e-Lab • Properties e-Lab • Grid Middleware • www.combechem.org

  4. Entire e-Science Cycle: encompassing experimentation, analysis, publication, research, learning • Reducing time-to-experiment • E-Experimentation • Diagram labels: Virtual Learning Environment; Reprints; Peer-Reviewed Journal & Conference Papers; Technical Reports; Local Web; Preprints & Metadata; Institutional Archive; Publisher Holdings; Certified Experimental Results & Analyses; Data, Metadata & Ontologies; Digital Library; Undergraduate Students; Graduate Students; E-Scientists • http://www.ukoln.ac.uk/projects/ebank-uk/

  5. Provenance • The key observation! • “Publication at Source” describes the need to capture data and its context from the outset and maintain a complete end-to-end connection between the laboratory bench and the intellectual chemical knowledge that is published as a result of the investigation. • The details of the origins of data are just as important to understanding as their actual values.
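
The "Publication at Source" idea on this slide can be illustrated with a minimal Python sketch. It is not CombeChem code: the class names, field names, and sample values are all hypothetical, chosen only to show a value being stored together with its context and a derived result keeping a link back to its raw inputs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Context captured when a value is produced (all fields are hypothetical)."""
    instrument: str
    operator: str
    procedure: str
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass(frozen=True)
class Measurement:
    """A value that always travels with its provenance and its parent values."""
    name: str
    value: float
    unit: str
    provenance: Provenance
    derived_from: tuple = ()


def lineage(measurement):
    """Walk back from a result to the measurements it was derived from."""
    chain = [measurement.name]
    for parent in measurement.derived_from:
        chain.extend(lineage(parent))
    return chain


# Illustrative data only: a raw reading and a value computed from it.
raw = Measurement(
    "unit_cell_a", 5.43, "angstrom",
    Provenance("diffractometer-1", "student-a", "x-ray scan"),
)
derived = Measurement(
    "cell_volume", raw.value ** 3, "angstrom^3",
    Provenance("analysis-script", "student-a", "cube of unit_cell_a"),
    derived_from=(raw,),
)

print(lineage(derived))
```

Because each record is immutable and carries its parents, the published value can in principle be traced back to the bench without a separate lookup.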

  6. My Chemistry Experiment • Box of Chemists

  7. Data creation & capture in “Smart lab” • Presentation services: portals • Data discovery, linking, citation • Search, harvest • Data analysis, transformation, mining, modelling • Aggregator services • Harvest • Deposit • e-Research workflows • Institutional data repositories • Laboratory repository • e-Crystals Federation model • Deposit • Validation • Publication (Chemistry Central) • Data curation & preservation: databases & databanks • Linking, citation • Publishers: peer-review journals, conference proceedings • This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0

  8. Bioinformatics is not Chemistry • There are many pieces, from many boxes, but no box, and no lid with a complete picture of what the puzzle is supposed to be. • Planning? No. • Metadata an afterthought

  9. myGrid • Open Source middleware for Life Scientists that enables them to undertake in silico experiments and share those experiments and their results. • Machinery for linking together datasets and tools • Individual scientists, in under-resourced labs, who use other people’s datasets and applications. • Ad hoc & exploratory workflows (data flows) • To support sharing and collaboration between scientists to disseminate best practice and improve the quality of science • 33,000 downloads; 200+ user sites; 400+ workflows; • 3500 third party external services accessible. • Moved from prototype to production quality. • Open Middleware Infrastructure Institute UK • http://www.mygrid.org.uk

  10. Taverna Workflow Workbench

  11. Widespread Adoption • Users in US, Asia, UK, Europe, Australia • Systems biology • Proteomics • Gene/protein annotation • Microarray data analysis • Medical image analysis • Heart simulation orchestration • High throughput screening of chemical compounds • Phenotypical studies • Public Health studies • Clinical trial analysis • Plants, Mouse, Human • Astronomy • Cultural Heritage

  12. Recycling, Reuse, Repurposing • Identified a pathway for which its correlating gene (Daxx) is believed to play a role in trypanosomiasis resistance. • Manual analysis on the microarray and QTL data failed to identify this gene as a candidate. • Repetitive, unbiased analysis. • Trypanosomiasis cattle workflow reused without change to identify the biological pathways involved in sex dependence in the mouse model, previously believed to be involved in the ability of mice to expel the parasite. • Previously a manual two year study of candidate genes had failed to do this. • Paul Fisher et al., A Systematic Strategy for Large-Scale Unbiased Analysis of Genotype-Phenotype Correlations, Bioinformatics, in review

  13. Service and workflow annotation • Ontology: 710 classes • Full-time curator • Tagging by the masses • 3500 services, 350 curated • Provenance • Ontology: 35 classes • Enriched with domain ontologies and service ontologies. Possibly. • Export with data. Desirably.

  14. New Scientific Digital Artefacts Design • Workflow design history • Experiment purpose • Scientist LogBook • Workflow run log • Data lineage • Results interpretation log

  15. Kepler • Triana • New digital artefacts

  16. myExperiment.org Portal Party • 28th & 29th Sept 2006 • Hand picked Taverna users + Taverna development team • Facilitated by NCeSS. • AJAX based development • CombeChem xfer • A social networking environment for sharing any workflow • A Taverna workflow run environment • A multi-workflow launch environment

  17. openwetware.org

  18. What are we trying to do? • Enabling scientists to be (more) creative. • Enabling scientists to be scientists. And not programmers. • Enabling mediocre scientists to become better and thus have better science. • Enabling smart scientists to be smarter and propagate their smartness. • Accelerate dissemination, pooling, insight. • Encouraging sanctioned plagiarism.

  19. Principles • Focus on making it easy to publish information • Discovering and sharing experimental artefacts • Publishing results to standard community repositories • Publishing scholarly output • Familiar social networking / web paradigms • Keeping it free and fluid and creative. Me-Science. • Crossing system boundaries • Trans-workflow • Crossing discipline boundaries • Multi-disciplinary, Inter-disciplinary, Trans-disciplinary • Clustering expertise • Intellectual fusion outside discipline. We-Science. • Life Science, Social Science, Astronomy, Chemistry

  20. Scoping exercise • Workflow warehouse / federation of repositories: Open Archives Initiative. Federated myExperiments. Sharepoint. • Social space + organised rich site: Social discourse + organised service / workflow space using curated semantics. • Granularity and identifiers: Rolling-up provenance. Id resolution. • Open vs protected content: Quality, Reliability, Validation, Safety, Intellectual Property, Ownership, Secrecy, A duty of guardianship. Curation? Policing? Local data mixed with shared resources. • Desktop integration: Google gadgets for workflows. Interacting with workflows through Office products. • Workflow execution: (WHIP) Workflows Hosted in Portals project. • Evolving the myExperiment software: Community development. • Enabling Scientists: added value through applications and collaborative tagging.

  21. Hack Fest

  22. Q1. Workflow Warehouse or Federation of Repositories? • Everything on the myExperiment.org web site vs • Distributed stores • Multiple myExperiments
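
The two options in Q1 can be contrasted with a small Python sketch. It assumes nothing about how myExperiment is actually built: `Repository`, `federated_search`, and the sample workflow ids and titles are all hypothetical. A warehouse is one store that answers every query; a federation asks each store in turn and merges the answers.

```python
class Repository:
    """A hypothetical workflow store; a 'warehouse' is simply one of these."""

    def __init__(self, name, workflows):
        self.name = name
        self._workflows = dict(workflows)  # workflow id -> title

    def search(self, term):
        """Return the workflows whose title contains the term (case-insensitive)."""
        term = term.lower()
        return {
            wf_id: title
            for wf_id, title in self._workflows.items()
            if term in title.lower()
        }


def federated_search(repositories, term):
    """Query every repository and merge results, keeping the first copy of each id."""
    merged = {}
    for repo in repositories:
        for wf_id, title in repo.search(term).items():
            merged.setdefault(wf_id, (title, repo.name))
    return merged


# Illustrative data only: two distributed stores sharing one workflow.
site_a = Repository("site-a", {"wf-1": "Microarray pathway analysis",
                               "wf-2": "Protein annotation"})
site_b = Repository("site-b", {"wf-1": "Microarray pathway analysis",
                               "wf-3": "Microarray QTL mapping"})

results = federated_search([site_a, site_b], "microarray")
print(sorted(results))
```

The merge step is where a federation pays its cost: it needs shared identifiers to recognise that two stores hold the same workflow, which connects to the "Granularity and identifiers" item in the scoping exercise.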

  23. Q2. Social Space or Shoe Shop? • Shopping for Workflows and Services and Data should be as easy as shopping for shoes. • Organic growth is good and bad. • Social tagging might help discover workflows but we need good metadata for automated use. 26/2/2007 | myExperiment | Slide 33
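
The distinction drawn in Q2 between social tagging for discovery and good metadata for automated use can be sketched in Python as follows. The record layout, the `REQUIRED_FIELDS` list, and the sample workflows are assumptions made for illustration, not a myExperiment schema.

```python
from collections import defaultdict

# Hypothetical fields a tool would need before it could run a workflow unattended.
REQUIRED_FIELDS = ("inputs", "outputs", "engine")

# Illustrative records: free-form tags from users plus optional structured metadata.
workflows = {
    "wf-1": {
        "tags": {"microarray", "mouse"},
        "metadata": {"inputs": ["probe_ids"], "outputs": ["pathways"], "engine": "taverna"},
    },
    "wf-2": {
        "tags": {"Microarray", "qtl"},
        "metadata": {},
    },
}


def build_tag_index(records):
    """Map each lower-cased tag to the set of workflow ids carrying it."""
    index = defaultdict(set)
    for wf_id, record in records.items():
        for tag in record["tags"]:
            index[tag.lower()].add(wf_id)
    return index


def find_by_tag(index, tag):
    """Discovery: tags are enough for a person browsing for workflows."""
    return sorted(index.get(tag.lower(), set()))


def is_machine_usable(record):
    """Automation: every required metadata field must be present and non-empty."""
    return all(record["metadata"].get(name) for name in REQUIRED_FIELDS)


tag_index = build_tag_index(workflows)
discoverable = find_by_tag(tag_index, "microarray")
automatable = [wf_id for wf_id in discoverable if is_machine_usable(workflows[wf_id])]
print(discoverable, automatable)
```

In this toy data both workflows are found by tag, but only the one with complete metadata passes the automation check, which is the gap the slide points at.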

  24. Q3. How open is the content? • OpenWetware is open • Our users don’t want this • Provenance helps

  25. Q4. Integration • Bring user to Web Site vs • Bringing myExperimentness to existing interfaces

  26. Web 2.0 Design Patterns • http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html • The Long Tail • Data is the Next Intel Inside • Users Add Value • Network Effects by Default • Some Rights Reserved • The Perpetual Beta • Cooperate, Don't Control • Software Above the Level of a Single Device

  27. 1. The Long Tail • Our target users are not just the specialist e-Scientists using computing resources to tackle major scientific breakthroughs, but also the large number of scientists conducting the routine processes of science on a daily basis. • Through sharing we have the potential to enable smart scientists to be smarter and propagate their smartness, in turn enabling other scientists to become better and conduct better science.

  28. 2. Data is the Next “Intel Inside” • myExperiment understands that scientists are focused on data, not software or one particular workflow engine. • Workflows are components of customised applications, many of which are data-oriented rather than process-oriented. • Users manipulate, through their own applications, the product (data, model) yielded by the workflow. • Furthermore, workflows themselves are the data of myExperiment and provide its unique value.

  29. 3. Users Add Value • myExperiment makes it easy to find workflows and is designed to make it useful and straightforward to share workflows and add workflows to the pool. • To succeed we draw on the insights into the incentive models of scientists gained through experience with Taverna.

  30. 4. Network Effects by Default • myExperiment aggregates user data as a side-effect of using the VRE. • The ability to execute workflows from myExperiment, and the integration of tools such as Taverna with myExperiment, further enable us to achieve increased value through usage.

  31. 5. Some Rights Reserved • myExperiment users require protection as well as sharing, but the environment is designed for maximum ease of sharing to achieve collective benefits – workflows are "hackable" and "remixable". • Initiatives such as Science Commons provide a useful context for this.

  32. 6. The Perpetual Beta • myExperiment is an online service (a collection of online services) and is continually evolving in response to its users. • To support this, the project commenced with developers being embedded in the user community. • Through day-to-day contact between designers and researchers, design is both inspired and validated.

  33. 7. Cooperate, Don't Control • myExperiment is a network of cooperating data services with simple interfaces which make it easy to work with content. • It both provides services and reuses the services of others. • It aims to support lightweight programming models so that it can easily be part of loosely coupled systems.

  34. 8. Software Above the Level of a Single Device • The current model of Taverna running on the scientist’s desktop PC or laptop is evolving into myExperiment being available through a variety of interfaces and supporting workflow execution.

  35. Closing • e-Science is difficult – workflows and Web 2.0 make it easier. • Our design workshops and the review against Web 2.0 design patterns have revealed the relationship between myExperiment and Web 2.0. • The collective benefits of participation arise not only from the users but also from the developers – ease of use and ease of development. • It might be useful to review other VREs against the design patterns.

  36. Take homes • myExperiment is a Web 2.0 Environment for Scientists to share experiments • Join us! • David De Roure • dder@ecs.soton.ac.uk • Carole Goble • carole.goble@manchester.ac.uk

  37. Credits • myGrid and CombeChem • Matt Lee • David Withers • Don Cruickshank • Rob Procter • Alex Voss • June Finch • Ed Zaluska • All the users inc. embedders
