The SCOPE System is a tool designed to simplify the authoring and publishing of compound objects in scientific research. It provides an interactive GUI for linking and specifying components retrieved from institutional repositories, visualizes provenance/workflows, and allows for the assignment of URIs, attachment of metadata, and licensing.

The SCOPE System: Scientific Compound Object Publishing and Editing
Kwok Cheung & Jane Hunter
The University of Queensland, Australia
DCC, Dec 12-13, 2007

Overview
• Objectives of SCOPE
• Background and Related Work
• OAI-ORE
• Architecture
• User interface and functionality
• Demo
• Conclusions and Future Work

Scientific Publishing
Researchers are under increasing pressure to:
• publish raw and derivative data
• document precise provenance
• share data, methodology + analytical & modelling services
• enable review, re-use, repeatability and validation
• maintain competitiveness and protect IP

Barriers
Lack of:
• Simple tools for recording methodology plus derived results – time consuming, difficult
• Simple tools for publishing data and methods or linking them to publications
• Standards – data, metadata, workflows, provenance
• Incentive
• Issue of Granularity
• Concern for IP/ownership

Existing Systems
• Nature, Acta Crystallographica, ACS, ePIC -> link to databases PDB, GenBank, SwissPROT
• Datuments (Murray-Rust and Rzepa)
  • XML documents with data embedded in marked-up document
• No semantic relationships
• Little or no provenance data
• Lack of selectivity, interactivity or flexibility (assume fixed methodology)
• No multi-level access – open or restricted
• Hardwired presentation/display

eScience Workflows
[Figure: workflow spanning Organization A, Organization B and Organization C, with steps labelled t1, t2, t3, t5, t6 and t8. Stage labels: Conduct Experiments; Capture Experimental Results/Data; Data Processing; Semantic Indexing; Data Integration, Exploration; Model Formulation; Model Validation, Statistical Analysis; Publications; Initiate New Experiments.]
• Kepler, Taverna, CombeChem, eLab notebooks
• BPEL4WS – workflow based on web services

Modelling eScience Provenance -> RDF
[Figure: provenance model. Node labels: Experimental Design, Experimental Results, Samples, Experiment, Processing, Objectives, Model, Visualization, State1, State2, State3, Event E1, Event E2. Property labels: Type, Input, Output, Action, Context, Tool, Role, Date/Time, Place, Agent, Conditions. Example tools: Zeiss STEMI microscope, MatLab.]
• Agents/Actors can be people, instruments or software, e.g., web services
• Need to record events in both digital and physical world

Ideal - Scientific Publication Packages
[Figure: an RDF Package linked to an External Database. Package metadata: Title, Creator, Description, Type, Discipline, Date.Published, License. Relationship labels between components: derived_from, analysis_of, graph_of, refers_to. One component is the formula "Average LE = 1/T exp –(A –B/T)".]
Publication referred to: Slattery, O., Lu, R., Zheng, J., Byers, F., Tang, X. "Stability Comparison of Recordable Optical Discs - A study of error rates in harsh conditions," Journal of Research of the NIST, 109, 517-524, 2004
• Each component has software, OS, hardware dependencies + interdependencies

Compound Digital Objects
Digital content with multiple components of variable:
• Content/semantic type or genre
  • Article, web page, documentary, photo, dataset, music recording
• Media types
  • Text, image, video, audio, 3D, numerical, mixed
• Media formats (PDF, XML, MPEG-1, AVI, SMIL)
• Network locations
  • Institutional repositories, databases, web sites
• Typed Relationships between components
  • Lineage, versions, derivations, is_part_of

Objectives of OAI-ORE
• Develop standardized, interoperable, machine-readable mechanisms to express compound objects on the web
• Enable more effective and consistent ways:
  • to support the creation, management and dissemination of these objects;
  • to facilitate the discovery of these objects;
  • to reference (link to) these objects (and parts thereof);
  • to provide access to different representations of these objects;
  • to aggregate and disaggregate these objects;
  • to enable their re-use by repositories, agents, and services beyond the bounds of the holding repository
• Provide foundation for value-adding services
• Perfect for representing Scientific Publication Packages

Complex Compound Object
Identifier URI: http://arXiv.org/astro-ph/061175/
• OAI/ORE Named Graphs/Resource Maps:
  • Define set of components
  • Typed Relationships between components
  • Relationships to external components
  • Different views of the compound object
  • Metadata attached to compound object
    • creator, created, rights
[Figure: compound object with components PDF, PS, HTML and MP3; relationship labels cites, is_derived_from and hasRepresentation; views View1.html and View2.smil.]
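
A resource map of this kind can be pictured as a named set of triples about one compound object. The following is a minimal sketch in plain Python, not the SCOPE implementation or the OAI-ORE serialization. Only the arXiv identifier and the labels `cites`, `is_derived_from` and `hasRepresentation` come from the slide; the resource map URI, the component URIs, the `hasComponent` predicate and the creator value are assumptions made for illustration.

```python
# Identifier of the compound object, as shown on the slide.
AGGREGATION = "http://arXiv.org/astro-ph/061175/"
# Hypothetical URI for the resource map (named graph) and for the components.
RESOURCE_MAP = "http://example.org/rem/061175"
BASE = "http://example.org/061175/"

resource_map = {
    "uri": RESOURCE_MAP,
    "describes": AGGREGATION,
    # (subject, predicate, object) triples; "hasComponent" is an assumed name.
    "triples": [
        (AGGREGATION, "hasComponent", BASE + "paper.pdf"),
        (AGGREGATION, "hasComponent", BASE + "paper.ps"),
        (AGGREGATION, "hasComponent", BASE + "talk.mp3"),
        # Typed relationship between two components.
        (BASE + "paper.pdf", "is_derived_from", BASE + "paper.ps"),
        # Relationship to a resource outside the compound object.
        (BASE + "paper.pdf", "cites", "http://example.org/other-paper"),
        # Alternative views of the whole object.
        (AGGREGATION, "hasRepresentation", BASE + "View1.html"),
        (AGGREGATION, "hasRepresentation", BASE + "View2.smil"),
        # Metadata attached to the compound object (placeholder value).
        (AGGREGATION, "creator", "A. Researcher"),
    ],
}


def objects_of(rem, subject, predicate):
    """Return every object linked from `subject` by `predicate`, in order."""
    return [o for (s, p, o) in rem["triples"] if s == subject and p == predicate]


components = objects_of(resource_map, AGGREGATION, "hasComponent")
views = objects_of(resource_map, AGGREGATION, "hasRepresentation")
print(len(components), "components;", len(views), "views")
```

Keeping the graph URI separate from the object identifier is what lets metadata such as creator and rights be attached to the map as a whole.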

Components Distributed Across Repositories
[Figure: the compound object identified by http://arXiv.org/astro-ph/061175/ has components (PS, MP3, HTML, PDF) held in different repositories: DSpace, Fedora, SRB.]

SCOPE Objectives
• Simple easy-to-use streamlined tool for authoring compound objects (OAI-ORE compliant)
• Interactive GUI to specify and link components retrieved from:
  • Web – institutional repositories
  • Visualized Provenance/Workflows - LIMS
• Label/infer relationships – with type
• Assign URI, attach metadata, license and publish
• RSS notification
Microsoft eScience Workshop RENCI Oct 21-23, 2007

Automatic Inferencing
IF (processed_powder input_to X-ray_diffraction)
AND (X-ray_diffraction outputs XRD_pattern)
THEN (processed_powder characterized_by XRD_pattern)

Typed Relationships
[Figure: relationship ontology showing isRelatedTo, isPartOf, isMemberOfCollection, RefersTo, isDerivedFrom, Describes, isReferencedBy, isTranslationOf and isVersionOf, connected by subPropertyOf and inversePropertyOf links.]
• Spatial relationship ontology
• Temporal relationship ontology
• Discipline-specific relationship ontologies

Inferencing Rules
• IF (A isDerivedFrom B) AND (B isDerivedFrom C) THEN (A isDerivedFrom C) (transitive)
• IF (processed_powder isInputTo XrayDiffractometer) AND (XRD_Pattern isOutputFrom XrayDiffractometer) THEN (processed_powder isCharacterizedBy XRD_Pattern) (XrayDiffractometer isSubclassOf CharacterizationInstrument)
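
The two rules can be sketched over a set of (subject, predicate, object) triples. This is an illustrative sketch in plain Python, not the rule engine SCOPE uses. The function names and the `subclass_of` lookup are hypothetical, and reading the subclass statement as a guard that generalizes the second rule to any CharacterizationInstrument is an assumption.

```python
def transitive_closure(triples, predicate):
    """Add (a, predicate, c) whenever (a, predicate, b) and (b, predicate, c) hold."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        links = [(s, o) for (s, p, o) in inferred if p == predicate]
        for a, b in links:
            for b2, c in links:
                if b == b2 and (a, predicate, c) not in inferred:
                    inferred.add((a, predicate, c))
                    changed = True
    return inferred


def infer_characterized_by(triples, subclass_of):
    """If a sample is input to a characterization instrument and a result is
    output from the same instrument, the sample is characterized by the result."""
    inferred = set(triples)
    for sample, p1, instrument in triples:
        if p1 != "isInputTo":
            continue
        if subclass_of.get(instrument) != "CharacterizationInstrument":
            continue
        for result, p2, source in triples:
            if p2 == "isOutputFrom" and source == instrument:
                inferred.add((sample, "isCharacterizedBy", result))
    return inferred


derivations = {("A", "isDerivedFrom", "B"), ("B", "isDerivedFrom", "C")}
print(("A", "isDerivedFrom", "C") in transitive_closure(derivations, "isDerivedFrom"))

lab_facts = {
    ("processed_powder", "isInputTo", "XrayDiffractometer"),
    ("XRD_Pattern", "isOutputFrom", "XrayDiffractometer"),
}
classes = {"XrayDiffractometer": "CharacterizationInstrument"}
print(infer_characterized_by(lab_facts, classes) - lab_facts)
```

The closure loop recomputes all links on each pass, which is simple but quadratic; that is acceptable for the handful of components in one compound object.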

Publishing Process
• Assign URI, and update/enhance Metadata for Compound Object
• Attach Creative Commons License
• Publish as:
  • RDF/XML
  • TriX, TriG, N-Triples, N3
  • Atom
  • FOXML
• Ingest to Fedora
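
As an illustration of the publish step, the sketch below writes a few metadata statements in N-Triples, the simplest of the listed formats. It is a hand-rolled serializer under stated assumptions, not SCOPE's exporter: the compound object URI and title are hypothetical, and each triple carries an extra flag saying whether its object is a literal. The Dublin Core term URIs and the Creative Commons license URI are real, but their use here is only an example.

```python
def escape_literal(text):
    """Escape backslash, double quote and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples):
    """Serialize (subject, predicate, object, object_is_literal) tuples,
    one statement per line, each terminated by ' .'."""
    lines = []
    for subject, predicate, obj, is_literal in triples:
        if is_literal:
            term = '"' + escape_literal(obj) + '"'
        else:
            term = "<" + obj + ">"
        lines.append("<%s> <%s> %s .\n" % (subject, predicate, term))
    return "".join(lines)


# Hypothetical URI assigned to the compound object at publishing time.
CO = "http://example.org/scope/compound-object/1"

statements = [
    (CO, "http://purl.org/dc/terms/title", "Example compound object", True),
    (CO, "http://purl.org/dc/terms/license",
     "http://creativecommons.org/licenses/by/3.0/", False),
]

print(to_ntriples(statements), end="")
```

Formats that carry the graph name, such as TriX and TriG, would additionally need the resource map URI, which this sketch leaves out.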

Update Metadata
Export Named Graph in TriX
Output as Atom
Output as FOXML
Tabbed Web Browser
Compound Object Display

Future Work
• Top-level relationship ontology
• Relationship ontologies for particular disciplines
• Inferencing rules
• Search, retrieval and editing/re-use of scientific compound objects – ongoing enhancement
• Presentation/display rules
  • Map from typed relationships -> spatio-temporal relationships
• Annotation/social tagging services – peer review
• Detailed Evaluation trials

Preservation of Compound Objects
• Parse compound object
• For each node/component within ReM
  • Retrieve/extract preservation metadata
  • Check format registry (GDFR, PRONOM)
  • Check software version registry
  • Notify custodian/owner of potential obsolescence
  • Migrate to new format/version
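
The loop over components can be sketched as follows. This is a sketch under loud assumptions: the registry is a hypothetical in-memory dictionary standing in for a service such as GDFR or PRONOM, the format names, component URIs and owner addresses are made up, and the software version check is omitted because it would follow the same pattern.

```python
# Hypothetical stand-in for a format registry lookup; entries are invented.
FORMAT_REGISTRY = {
    "old-format": {"obsolete": True, "migrate_to": "new-format"},
    "new-format": {"obsolete": False, "migrate_to": None},
}


def check_components(components, registry):
    """Return one notice per component whose format is obsolete or unknown.

    Each component is a dict holding the preservation metadata extracted
    from the resource map: 'uri', 'format' and 'owner'.
    """
    notices = []
    for component in components:
        entry = registry.get(component["format"])
        if entry is None:
            action = "review: format not in registry"
        elif entry["obsolete"]:
            action = "migrate to " + entry["migrate_to"]
        else:
            continue  # format is current, nothing to report
        notices.append(
            {"uri": component["uri"], "owner": component["owner"], "action": action}
        )
    return notices


parsed_components = [
    {"uri": "http://example.org/co/1/data", "format": "old-format",
     "owner": "custodian@example.org"},
    {"uri": "http://example.org/co/1/paper", "format": "new-format",
     "owner": "custodian@example.org"},
    {"uri": "http://example.org/co/1/model", "format": "mystery-format",
     "owner": "owner@example.org"},
]

for notice in check_components(parsed_components, FORMAT_REGISTRY):
    print(notice["owner"], "->", notice["uri"], ":", notice["action"])
```

Unknown formats are reported rather than skipped, on the assumption that a custodian would want to hear about a component the registry cannot vouch for.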

Conclusions
• Simple, flexible tool for encapsulating and publishing (data + derived data + processing/analysis + publication)
  • As an OAI-ORE-alpha-compliant Scientific Compound Object
• Maximizes re-use of components
• Maximizes potential value-add services/knowledge mining
• Maximizes chances of long-term preservation
• Embeds and protects confidential information in expandable links/typed relationships
• Comparison with Semantic Datuments
  • XHTML+Microformats/RDFa+GRDDL

Acknowledgements
• Kwok Cheung, School of ITEE
• Anna Lashtaberg, John Drennan
  • Australian Institute of Bioengineering and Nanotechnology
• Carl Lagoze and Herbert Van de Sompel – OAI/ORE

For Further Information
http://www.openarchives.org/ore/
http://www.itee.uq.edu.au/~eresearch
Contacts:
j.hunter@uq.edu.au
kwokc@itee.uq.edu.au