LSIDs in Taverna. Daniele Turi, University of Manchester. RDF, Ontologies and Metadata, Edinburgh, 7-9/6/06
Outline • Taverna Workbench: • workflows of biological services • LSIDs used to identify: • data, workflows, workflow runs • LSIDs and Named Graphs • LSID Resolution • Security (under development) • LSID granularity
myGrid • eScience project • biological workflows • compose web services • execute • discover • audit/provenance • [diagram labels: Taverna, Provenance Service, Annotation/Discovery]
Taverna Workbench • Large user community in biology • about 1,000 downloads per month • one release every 6 weeks • Collect and browse provenance • new feature (released 2 days ago!)
Provenance as RDF • RDF generated automatically • audit trail • RDF is typed (semantics!) • 1 RDF graph for each workflow run • named graph
Workflow Run • [diagram: RDF graph of one workflow run] • nodes: urn:lsid:…:workflow:6, urn:lsid:…:org:HY7, urn:lsid:..:wfRun:HU77I8, urn:lsid:…:person:4, urn:lsid:…:dataItem:K84P, urn:lsid:…:dataItem:51HJ3 • edge labels: runs, belongsTo, launchedBy, hasInput (twice)
Typed Workflow Run • [diagram: workflow-run RDF graph typed against the Provenance Ontology] • ontology classes: WorkflowRun, Workflow, DataObject, Experimenter, Organization • ontology properties: runs, launchedBy, belongsTo, hasInput • nodes: urn:lsid:…:workflow:6, urn:lsid:…:org:HY7, urn:lsid:..:wfRun:HU77I8, urn:lsid:…:person:4, urn:lsid:…:dataItem:K84P, urn:lsid:…:dataItem:51HJ3 • edge labels: runs, belongsTo, launchedBy, hasInput (twice)
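What "typed" could mean in practice, as a minimal sketch in plain Python (no RDF library). Assumptions: the authority `example.org` is a placeholder, since the slides elide the real one; the edge directions and the domain/range table are read off the diagram labels and are not confirmed by the slides; `ill_typed` is a hypothetical helper.

```python
# Sketch only: "example.org" stands in for the authority elided on the slides.
AUTH = "urn:lsid:example.org"
RUN = f"{AUTH}:wfRun:HU77I8"
WORKFLOW = f"{AUTH}:workflow:6"
PERSON = f"{AUTH}:person:4"
ORG = f"{AUTH}:org:HY7"
ITEMS = [f"{AUTH}:dataItem:K84P", f"{AUTH}:dataItem:51HJ3"]

# Instance-level statements: the audit trail of one run.
# Edge directions are an assumption based on the diagram labels.
triples = [
    (RUN, "runs", WORKFLOW),
    (RUN, "launchedBy", PERSON),
    (PERSON, "belongsTo", ORG),
] + [(RUN, "hasInput", item) for item in ITEMS]

# Typing statements: each node is an instance of one ontology class.
types = {
    RUN: "WorkflowRun",
    WORKFLOW: "Workflow",
    PERSON: "Experimenter",
    ORG: "Organization",
    **{item: "DataObject" for item in ITEMS},
}

# Assumed (domain, range) of each ontology property.
SIGNATURES = {
    "runs": ("WorkflowRun", "Workflow"),
    "launchedBy": ("WorkflowRun", "Experimenter"),
    "belongsTo": ("Experimenter", "Organization"),
    "hasInput": ("WorkflowRun", "DataObject"),
}


def ill_typed(triples, types):
    """Return the triples whose subject or object has the wrong class."""
    bad = []
    for s, p, o in triples:
        domain, range_ = SIGNATURES[p]
        if types.get(s) != domain or types.get(o) != range_:
            bad.append((s, p, o))
    return bad
```

With the typing in place, a statement such as "a person runs a workflow" is flagged, which an untyped audit trail could not detect.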
LSIDs • LSIDs used to identify: • data, workflows, workflow runs • internal • external LSIDs not used (call by value) • Taverna 2 (call by reference) in the near future • data and workflows (and people and organizations!) • Workflow-run LSIDs are the names of graphs
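For reference, an LSID has the shape `urn:lsid:authority:namespace:object`, with an optional trailing revision. A minimal parsing sketch follows; the `LSID` tuple and `parse_lsid` are hypothetical names, not part of Taverna, and `example.org` is a placeholder authority.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty LSID component in {text!r}")
    # An absent or empty sixth component means "no revision".
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)
```

For example, `parse_lsid("urn:lsid:example.org:wfRun:HU77I8").namespace` gives `"wfRun"`, which is enough to tell a workflow run from a data item or a workflow.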
Storage • Named RDF graphs • retrieve whole workflow runs • implementations in • Sesame2 native store • scalable • alpha release (bugs) • NG4J (Jena + MySQL) • scalability issues • Future implementations: Oracle and Boca
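The idea of one named graph per workflow run can be sketched with an in-memory store. This is an illustration only, not the Sesame2 or NG4J API; `NamedGraphStore` and its methods are hypothetical, and graph names are assumed to be workflow-run LSIDs.

```python
from collections import defaultdict


class NamedGraphStore:
    """Toy quad store: triples grouped under a graph name (a run LSID)."""

    def __init__(self):
        self._graphs = defaultdict(set)

    def add(self, graph_name, subject, predicate, obj):
        """Record one triple inside the named graph."""
        self._graphs[graph_name].add((subject, predicate, obj))

    def graph(self, graph_name):
        """Retrieve a whole workflow run as a set of triples."""
        return set(self._graphs.get(graph_name, ()))

    def graphs_mentioning(self, node):
        """Names of all graphs in which the node is a subject or object."""
        return sorted(
            name
            for name, triples in self._graphs.items()
            if any(node in (s, o) for s, _, o in triples)
        )
```

Keying the store by run LSID is what makes "retrieve whole workflow runs" a single lookup, while a question like "which runs used this workflow?" becomes a scan across graph names.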
LSID Resolution • Implemented but not deployed • obstacle: single user vs enterprise • virtual organisation • Resolution returns • only data for workflows and data • only metadata for workflow runs • Data vs Metadata • why is data immutable and metadata mutable?
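The data/metadata split above can be sketched as a resolver that dispatches on the LSID namespace. LSID resolution distinguishes a data call from a metadata call; everything else here is an assumption: the `Resolver` class, the dictionary back ends, and the use of the `wfRun` namespace (taken from the example LSIDs) to recognise workflow runs.

```python
from typing import Dict, Optional

# Assumption: workflow runs live in the "wfRun" namespace, as in the examples.
METADATA_ONLY_NAMESPACES = {"wfRun"}


def namespace_of(lsid: str) -> str:
    """Return the namespace component of urn:lsid:authority:namespace:object."""
    parts = lsid.split(":")
    if len(parts) < 5 or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    return parts[3]


class Resolver:
    """Toy resolver: data for workflows and data items, metadata for runs."""

    def __init__(self, data: Dict[str, bytes], metadata: Dict[str, str]):
        self._data = data          # immutable bytes, keyed by LSID
        self._metadata = metadata  # serialised RDF graphs, keyed by run LSID

    def get_data(self, lsid: str) -> Optional[bytes]:
        if namespace_of(lsid) in METADATA_ONLY_NAMESPACES:
            return None  # workflow runs have no data, only metadata
        return self._data.get(lsid)

    def get_metadata(self, lsid: str) -> Optional[str]:
        if namespace_of(lsid) not in METADATA_ONLY_NAMESPACES:
            return None  # workflows and data items resolve to data only
        return self._metadata.get(lsid)
```

Under this reading, the bytes behind a data or workflow LSID never change, whereas the graph behind a workflow-run LSID can still grow.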
Security • LSID granularity very good • Policies (in XACML) easily expressed in terms of LSIDs • LSID spec does not mention https and credentials • IBM Java Toolkit supports credentials
Security Policy • Scenario • supervisors can access all workflow runs in the organization • students can access only their own workflow runs • blacklisted users cannot access anything • See policySet.xml on myGrid wiki
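The scenario can be sketched as a plain-Python decision function. This is a stand-in for the XACML policy, not a transcription of policySet.xml; the `User` and `WorkflowRun` records, the role names, and the rule that the blacklist wins over every other rule (similar in spirit to XACML's deny-overrides combining) are assumptions.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class User:
    lsid: str  # LSID of the person
    role: str  # "supervisor" or "student" (assumed role names)
    org: str   # LSID of the user's organization


@dataclass(frozen=True)
class WorkflowRun:
    lsid: str         # LSID of the run (also the graph name)
    launched_by: str  # LSID of the person who launched it
    org: str          # LSID of the organization the run belongs to


def can_access(user: User, run: WorkflowRun, blacklist: Set[str]) -> bool:
    """Decide whether the user may read the workflow run."""
    if user.lsid in blacklist:
        return False  # blacklisted users cannot access anything
    if user.role == "supervisor":
        return user.org == run.org  # all runs in the organization
    if user.role == "student":
        return run.launched_by == user.lsid  # only their own runs
    return False  # deny by default for unknown roles
```

Every comparison is between LSIDs, which is the sense in which the policies are easily expressed in terms of LSIDs.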
Conclusions • LSIDs • Named Graphs • persistence • Ontologically typed RDF • Mutable vs immutable identified with metadata vs data • Credentials not part of LSID spec • LSID granularity for security