Paul Groth, Simon Miles, Luc Moreau UK e-Science All Hands Meeting 2005
Outline
• Process Documentation for Provenance
• Power of the P-Structure
• P-assertion Recording Protocol
• PReServ’s Functionality
• Performance
• Pitch
Provenance
• The Provenance Question
• Lots of definitions…
• Boil it down to a question.
• What is the process that led to a particular result?
• How do we answer this question?
• Search through documentation.
Documentation
• Process Documentation
• encompasses all other documentation
• SOA-based model of process
• Actors communicate via message passing
• Actors make assertions to document the process; these are termed p-assertions.
• How to organise these p-assertions?
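The actor model above can be illustrated with a minimal sketch. It assumes that each actor documents a message exchange from its own side (sender or receiver). The names `PAssertion`, `Actor`, and the field names are hypothetical and are not taken from PReServ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PAssertion:
    """Hypothetical record of one assertion made by one actor."""
    asserter: str        # actor making the assertion
    interaction_id: str  # message exchange being documented
    view: str            # "sender" or "receiver" (assumed labels)
    content: str         # what the actor asserts about the message


class Actor:
    """An actor that documents every message it sends or receives."""

    def __init__(self, name, log):
        self.name = name
        self.log = log  # shared list standing in for process documentation

    def send(self, receiver, interaction_id, message):
        # The sender documents its side of the interaction, then passes the message.
        self.log.append(PAssertion(self.name, interaction_id, "sender", message))
        receiver.receive(interaction_id, message)

    def receive(self, interaction_id, message):
        # The receiver documents the same interaction from its own side.
        self.log.append(PAssertion(self.name, interaction_id, "receiver", message))


log = []
client = Actor("client", log)
service = Actor("service", log)
client.send(service, "interaction-1", "compress(sample)")
```

After the call, `log` holds two p-assertions about the same interaction, one from each actor.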
P-Structure
P-Structure View
Benefits
• Domain-independent queries
• That are provenance-specific
• P-structure is a shared logical organisation of p-assertions
• Does not prescribe exactly how p-assertions are stored in an implementation.
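To illustrate why a shared logical organisation permits domain-independent queries, here is a rough sketch. It assumes p-assertions are grouped by interaction and then by the asserting side's view; that grouping, the tuple layout, and the function names are assumptions for illustration, and the sample values are made up.

```python
from collections import defaultdict

# (interaction_id, view, asserter, content) tuples; all values are invented.
p_assertions = [
    ("i1", "sender", "client", "request: compress(sample)"),
    ("i1", "receiver", "service", "request: compress(sample)"),
    ("i2", "sender", "service", "response: size=42"),
]


def build_p_structure(assertions):
    """Group assertions as interaction -> view -> list of (asserter, content)."""
    structure = defaultdict(lambda: defaultdict(list))
    for interaction_id, view, asserter, content in assertions:
        structure[interaction_id][view].append((asserter, content))
    return structure


def asserters_for(structure, interaction_id):
    """Domain-independent query: which actors documented this interaction?"""
    views = structure.get(interaction_id, {})
    return sorted({asserter for entries in views.values() for asserter, _ in entries})


p_structure = build_p_structure(p_assertions)
```

The query never inspects the `content` strings, so it would work unchanged whatever the application domain is. How the grouped data is physically stored is left open, which matches the last bullet above.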
PReP
• Introduces the Provenance Store
• A separate entity for maintaining process documentation
• PReP specifies how an actor can communicate with the Provenance Store.
• PReP has a number of nice properties.
• Statelessness
• Idempotence
• Termination
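Two of the properties listed above, statelessness and idempotence, can be sketched as follows. This is not the PReP message format. The class `ProvenanceStore`, the method `record`, and the `local_id` field are hypothetical, chosen only to show that a self-describing record message can be resent safely.

```python
class ProvenanceStore:
    """Toy store illustrating stateless, idempotent recording (not PReServ's API)."""

    def __init__(self):
        self._records = {}

    def record(self, interaction_id, view, local_id, content):
        # Stateless: the message itself carries everything needed to place the
        # p-assertion, so the store keeps no per-client session between calls.
        key = (interaction_id, view, local_id)
        # Idempotent: recording the same keyed p-assertion again changes nothing.
        self._records.setdefault(key, content)
        return "ack"

    def count(self):
        return len(self._records)


store = ProvenanceStore()
store.record("i1", "sender", "pa-1", "request: compress(sample)")
# A retry after a lost acknowledgement leaves the store unchanged.
store.record("i1", "sender", "pa-1", "request: compress(sample)")
```

Under these assumptions an actor that does not receive an acknowledgement can simply send the message again without risking duplicate documentation.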
An Implementation
• What is PReServ?
• A Web Services implementation of a Provenance Store
• Implements
• PReP for recording
• XQuery for querying
• Provides libraries and wrappers for making applications provenance-aware.
PReServ Implementation Diagram
[Diagram labels: WS Client, Web Service, Axis Handler (×2), PS Client Side Library (×3), Backend Store Interface, In-Memory Store, Database Store, …, Query Actor, WS, Backend Stores, Provenance Store; legend: WS Calls, Java Calls]
Implementation cont.
• Caching mechanism to improve performance
• Berkeley Java Database 2.0
• No setup required
• Completely transactional
[Diagram labels: SOAP Msg, Dispatcher, Store Plug In, Query Plug In, …, Backend Store Interface, Java Object, Database, Memory, …]
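The plug-in arrangement named in the diagram labels above (a dispatcher, store and query plug-ins, and a backend store interface) might look roughly like the sketch below. It is an illustration in Python rather than PReServ's Java code, and every class name, method name, and message field here is an assumption.

```python
from abc import ABC, abstractmethod


class BackendStore(ABC):
    """Hypothetical backend store interface shared by all plug-ins."""

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def values(self): ...


class MemoryStore(BackendStore):
    """In-memory backend; a database backend would implement the same interface."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def values(self):
        return list(self._data.values())


class StorePlugIn:
    def __init__(self, backend):
        self.backend = backend

    def handle(self, body):
        self.backend.put(body["key"], body["value"])
        return "stored"


class QueryPlugIn:
    def __init__(self, backend):
        self.backend = backend

    def handle(self, body):
        # Simple substring match standing in for a real query language.
        return [v for v in self.backend.values() if body["contains"] in v]


class Dispatcher:
    """Routes each incoming message to the plug-in registered for its action."""

    def __init__(self):
        self._plug_ins = {}

    def register(self, action, plug_in):
        self._plug_ins[action] = plug_in

    def dispatch(self, message):
        return self._plug_ins[message["action"]].handle(message["body"])


backend = MemoryStore()
dispatcher = Dispatcher()
dispatcher.register("record", StorePlugIn(backend))
dispatcher.register("query", QueryPlugIn(backend))
```

New functionality is added by registering another plug-in, and a different backend can be swapped in without touching the plug-ins, since they depend only on the interface.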
Requirements
• Apache Tomcat 5.0
• Apache Ant 1.6.2
• Java 1.5 (1.4 supported with some help)
• Pure Java, tested on
• Windows
• Mac OS X
• Debian Linux
Evaluation Deployment
• Protein Compressibility Experiment
• HPDC’05
• Workflow runs under VMWare
• deployment consistency
• ease of development
• Workflow is executed on one machine
• PReServ runs on another machine
• Version 0.1.5 of PReServ
Record Performance
Query Performance
Applications
Conclusion
• The p-structure allows for domain-independent, provenance-specific queries using XQuery.
• Both recording and query times are linear
• PReServ has an extensible architecture, allowing further functionality to be added easily.
Download!
• Try it out!
• Download PReServ 0.2:
• The AHM release
• Released under Open Source MIT License
• www.pasoa.org
• Click software
• Contact us, we will try to help you make your application provenance-aware.
Configuration
• Red Hat Linux 9.1 on VMWare on Windows XP
• Pentium P4 2.8 GHz, 1.5 GB RAM
• PReServ on another machine
• Database backend: Berkeley JDB
• 100 Mb local Ethernet