Explore how LSIDs are used to identify data, workflows, and workflow runs in Taverna Workbench. Learn about LSID granularity, security, and the integration of LSIDs with Named Graphs and Provenance. Discover the benefits and challenges of using LSIDs in a scientific workflow environment.
LSIDs in Taverna
Daniele Turi, University of Manchester
RDF, Ontologies and Metadata, Edinburgh, 7-9/6/06
Outline
• Taverna Workbench:
  • workflows of biological services
• LSIDs used to identify:
  • data, workflows, workflow runs
• LSIDs and Named Graphs
• LSID Resolution
• Security (under development)
• LSID granularity
myGrid
• eScience project
• biological workflows
  • compose web services
  • execute
  • discover
  • audit/provenance
• components shown: Taverna, Provenance Service, Annotation/Discovery
Taverna Workbench
• Large user community in biology
  • about 1,000 downloads per month
  • one release every 6 weeks
• Collect and browse provenance
  • new feature (released 2 days ago!)
Provenance as RDF
• RDF generated automatically
  • audit trail
• RDF is typed (semantics!)
• 1 RDF graph for each workflow run
  • named graph
Workflow Run (diagram)
• nodes: urn:lsid:..:wfRun:HU77I8, urn:lsid:…:workflow:6, urn:lsid:…:person:4, urn:lsid:…:org:HY7, urn:lsid:…:dataItem:K84P, urn:lsid:…:dataItem:51HJ3
• edge labels: runs, launchedBy, belongsTo, hasInput (twice)
Typed Workflow Run (diagram)
• Provenance Ontology classes: WorkflowRun, Workflow, DataObject, Experimenter, Organization
• Provenance Ontology properties: runs, launchedBy, belongsTo, hasInput
• instance graph repeated from the previous slide (same LSID nodes and edge labels)
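The "one named graph per workflow run" idea can be sketched in a few lines of plain Python. This is only an illustration, not Taverna's provenance code: the `NamedGraphStore` class is hypothetical, the edge directions are an assumption read off the diagram, and the LSIDs are copied from the slide with their elisions intact, so they act as opaque labels only.

```python
from collections import defaultdict

# LSID of the workflow run, copied from the slide with its elision intact.
# It doubles as the NAME of the graph that holds the run's provenance.
RUN = "urn:lsid:..:wfRun:HU77I8"


class NamedGraphStore:
    """Hypothetical in-memory quad store: graph name -> set of triples."""

    def __init__(self):
        self._graphs = defaultdict(set)

    def add(self, graph, subject, predicate, obj):
        self._graphs[graph].add((subject, predicate, obj))

    def graph(self, name):
        # Return a copy of the whole graph, i.e. one complete workflow run.
        return set(self._graphs.get(name, ()))


store = NamedGraphStore()
# Edge directions below are an assumption; the slide only shows the labels.
store.add(RUN, RUN, "runs", "urn:lsid:…:workflow:6")
store.add(RUN, RUN, "launchedBy", "urn:lsid:…:person:4")
store.add(RUN, "urn:lsid:…:person:4", "belongsTo", "urn:lsid:…:org:HY7")
store.add(RUN, RUN, "hasInput", "urn:lsid:…:dataItem:K84P")
store.add(RUN, RUN, "hasInput", "urn:lsid:…:dataItem:51HJ3")

inputs = sorted(o for s, p, o in store.graph(RUN) if p == "hasInput")
```

Because the run's LSID names the graph, fetching `store.graph(RUN)` retrieves everything recorded about that run in one lookup.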
LSIDs
• LSIDs used to identify:
  • data, workflows, workflow runs
• internal
• external LSIDs not used (call by value)
• Taverna 2 (call by reference) in the near future
• data and workflows (and people and organizations!)
• Workflow run LSIDs are names of graphs
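For readers unfamiliar with the identifier format, an LSID is a URN of the form `urn:lsid:authority:namespace:object`, optionally followed by `:revision`. The parser below is a minimal sketch of that layout, not part of Taverna; the `example.org` authority in the usage line is a hypothetical stand-in, since the slides elide the real one.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID URN into its components, raising ValueError if malformed."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    if any(field == "" for field in parts[2:5]):
        raise ValueError(f"empty authority, namespace or object: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Hypothetical authority; the namespace and object id mirror the slide.
run = parse_lsid("urn:lsid:example.org:wfRun:HU77I8")
```

The namespace field (`wfRun`, `workflow`, `dataItem`, `person`, `org` in the slides) is what lets one identifier scheme cover data, workflows, runs, people and organizations.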
Storage
• Named RDF graphs
  • retrieve whole workflow runs
• implementation in
  • Sesame2 native store
    • scalable
    • alpha release (bugs)
  • NG4J (Jena + MySQL)
    • scalability issues
• Future implementations: Oracle and Boca
LSID Resolution
• Implemented but not deployed
  • obstacle: single user vs. enterprise
  • virtual organisation
• Resolution returns
  • only data for workflows and data
  • only metadata for workflow runs
• Data vs. Metadata
  • why data immutable and metadata mutable?
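The split described on this slide (data for workflows and data items, metadata for workflow runs) could be sketched as below. This is an assumption-laden illustration, not the resolver that was implemented: the `Resolver` class, its method names and the `example.org` authority are all hypothetical. It also models the LSID convention the slide questions, namely that data behind an identifier stays fixed while metadata may grow.

```python
# Namespaces follow the slides; which side each one resolves to is taken
# from the "Resolution returns" bullets.
DATA_NAMESPACES = {"workflow", "dataItem"}  # resolve to data only
METADATA_NAMESPACES = {"wfRun"}             # resolve to metadata only


class Resolver:
    """Hypothetical in-memory resolver."""

    def __init__(self):
        self._data = {}      # LSID -> bytes
        self._metadata = {}  # LSID -> list of (subject, predicate, object)

    def put_data(self, lsid, payload):
        # Data is treated as immutable: the same LSID must always
        # yield the same bytes, so a differing overwrite is refused.
        if lsid in self._data and self._data[lsid] != payload:
            raise ValueError(f"data for {lsid} is immutable")
        self._data[lsid] = bytes(payload)

    def add_metadata(self, lsid, triple):
        # Metadata is mutable: statements can be appended over time.
        self._metadata.setdefault(lsid, []).append(triple)

    def resolve(self, lsid):
        namespace = lsid.split(":")[3]
        if namespace in DATA_NAMESPACES:
            return "data", self._data[lsid]
        if namespace in METADATA_NAMESPACES:
            return "metadata", list(self._metadata.get(lsid, []))
        raise LookupError(f"no resolution rule for namespace {namespace!r}")


# Hypothetical identifiers (example.org is not from the slides).
ITEM = "urn:lsid:example.org:dataItem:K84P"
RUN_ID = "urn:lsid:example.org:wfRun:HU77I8"

resolver = Resolver()
resolver.put_data(ITEM, b"ACGT")
resolver.add_metadata(RUN_ID, (RUN_ID, "hasInput", ITEM))
```

Resolving `ITEM` yields its bytes, while resolving `RUN_ID` yields the statements recorded about the run so far.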
Security
• LSID granularity very good
• Policies (in XACML) easily expressed in terms of LSIDs
• LSID spec does not mention https and credentials
• IBM Java Toolkit supports credentials
Security Policy
• Scenario
  • supervisors can access all workflow runs in the organization
  • students can access only their own workflow runs
  • blacklisted users cannot access anything
• See policySet.xml on myGrid wiki
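The three rules of the scenario can be written down as a small access check. The sketch below is plain Python rather than XACML and does not reproduce policySet.xml; the `User` and `WorkflowRun` types, the role names, and the `example.org` LSIDs are hypothetical. Checking the blacklist first is an assumption that mirrors a deny-overrides style of combining rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    lsid: str
    org: str
    role: str  # "supervisor" or "student" (hypothetical role names)


@dataclass(frozen=True)
class WorkflowRun:
    lsid: str
    launched_by: str  # LSID of the person who launched the run
    org: str          # LSID of the organization the run belongs to


def can_access(user: User, run: WorkflowRun, blacklist: frozenset) -> bool:
    # Rule 3: blacklisted users cannot access anything (checked first).
    if user.lsid in blacklist:
        return False
    # Rule 1: supervisors see every run in their own organization.
    if user.role == "supervisor":
        return run.org == user.org
    # Rule 2: students see only the runs they launched themselves.
    if user.role == "student":
        return run.launched_by == user.lsid
    # Anything not explicitly permitted is denied.
    return False


ORG = "urn:lsid:example.org:org:HY7"
alice = User("urn:lsid:example.org:person:1", ORG, "supervisor")
bob = User("urn:lsid:example.org:person:4", ORG, "student")
carol = User("urn:lsid:example.org:person:7", ORG, "student")
run = WorkflowRun("urn:lsid:example.org:wfRun:HU77I8", bob.lsid, ORG)
```

Every subject and resource in the check is named by an LSID, which is the granularity point the Security slide makes: a policy can target a single run, person or organization.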
Conclusions
• LSIDs
• Named Graphs
  • persistence
• Ontologically typed RDF
• Mutable vs. immutable identified with metadata vs. data
• Credentials not part of LSID spec
• LSID granularity for security