Overview of Today’s Talks

Overview of Today’s Talks • Provenance Data Structures • Recording and Querying Provenance • Break (30 minutes) • Distribution and Scalability • Security • Methodology

PrIMe: Provenance Incorporating Methodology
Steve Munroe (sjm@ecs.soton.ac.uk)

Overview of Talk • Introducing PrIMe • Stepping through PrIMe • Step 1. Provenance use cases • Step 2. Information items • Step 3. Identifying actors • Step 4. Actor interactions • Step 5. Knowledgeable actors • Step 6. Adaptations • Summary • Conclusions

Introducing PrIMe • A methodology for making applications provenance-aware • Provenance use cases involve documentation of identified past processes • Therefore, processes and how they are enacted must be known before PrIMe is applicable • We suggest that, for new applications, use of PrIMe is interleaved with standard development methodologies [Figure: Application Development interleaved with Provenance Incorporation]

Introducing PrIMe: Key Aims • Provide guidelines for: • identifying and expressing provenance use cases • identifying the kinds of information items that are required to satisfy use cases • identifying actors and the interactions between them in order to effect the recording of process documentation • identifying the set of adaptations that integrate the provenance architecture with the application to expose documentation for querying • Aim to expose only factual information, i.e. no inferences are made at this point

Introducing PrIMe: Overview [Figure, built up over several slides: the Application Structure annotated with Step 1. Use Cases, Step 2. Information Items, Step 3. Actors, Step 4. Interactions, Step 5. Knowledgeable Actors, Step 6. Adaptations]

Step 1: Provenance Use Cases • We distinguish two types of provenance use case • A core provenance use case is a use case known when PrIMe is applied. • A future provenance use case is a use case that is not considered until after a process in the application is enacted but uses documentation of that process • Core provenance use cases help inform the designers of the granularity of the processes to be considered and the critical information to expose. • Future provenance use cases cannot be known by the designer, but can be anticipated by ensuring that the application is designed to capture potentially useful documentation.

Step 1: Provenance Use Cases: Gathering Core Use Cases • It is not always obvious to users what use cases they could expect the provenance architecture to support. • We provide a simple requirements elicitation process to help designers collect the core provenance use cases: • Give definitions of provenance • Give examples of the general questions that can be answered using the architecture • Show how to express provenance use cases

Step 1: Provenance Use Cases: Definition of Provenance • The provenance of a result is the process that produced that result. • The provenance of an item of data is the process that generated it.

Step 1: Provenance Use Cases: The OTM Application • In the OTM application, use case questions relate to specific objects within the application, namely: • Recipients of organs • Organs • Organ donors • Decisions

Step 1: Provenance Use Cases: OTM Use Case Questions • Below are questions taken from the OTM application. • Retrieve data linked to all actions / events associated with a patient (recipient or donor) • What decisions were made for a particular case? • What is the medical analysis tree for a given organ? • Determine whether any deviation from the standard workflow took place for a given organ

Step 1: Provenance Use Cases: Eliciting Use Cases • We are looking to elicit use cases of the form: • (1) Actor A does something. • (2) Actor B does something else, etc. • (3) Actor C determines the answer to a question about the provenance of data (such as a specific example of one of those above).

Step 1: Provenance Use Cases: Elicitation Steps • Four important steps: • Step (1) Describe something that already happens in the application. • Step (2) Describe a specific provenance-related use case question that cannot (easily) be answered at present, but that our functionality could help to answer. • Step (3) Identify the relevant services required for answering the use case. • Step (4) Identify the relevant information items.

Step 1: Provenance Use Cases: Example Use Case • Donor A’s organs are screened for potential donation. • What is the provenance of the donor’s organ diagnosis? [Figure: the services involved: User interface, Donor organ diagnoser, Donor data collector, Electronic health care records, Testing laboratory]

Step 1: Provenance Use Cases: Form-Based Capture • Donor A’s organs are screened for potential donation. • What is the provenance of the donor’s organ diagnosis?
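
One way to picture the form-based capture is as a small record with one field per elicitation step. The sketch below is only an illustration: the class name `ProvenanceUseCase`, its field names and the `missing_steps` helper are assumptions, not part of PrIMe. The service names are the ones shown for the example use case.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProvenanceUseCase:
    """One elicited core use case; the field names are illustrative assumptions."""

    scenario: str                                                # step (1)
    question: str                                                # step (2)
    services: list[str] = field(default_factory=list)            # step (3)
    information_items: list[str] = field(default_factory=list)   # step (4)

    def missing_steps(self) -> list[str]:
        """Return the names of the form fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]


donor_screening = ProvenanceUseCase(
    scenario="Donor A's organs are screened for potential donation.",
    question="What is the provenance of the donor's organ diagnosis?",
    services=[
        "User interface",
        "Donor organ diagnoser",
        "Donor data collector",
        "Electronic health care records",
        "Testing laboratory",
    ],
)

# The information items (step 4) have not been filled in yet.
print(donor_screening.missing_steps())  # prints ['information_items']
```

Keeping the four steps as explicit fields makes it easy to see which part of the elicitation is still outstanding for a given use case.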

Step 2: Information Items: Overview • The kinds of information that will answer your use case • May be one piece or many pieces of information • E.g. a given result, or a sequence of decisions • For each core provenance use case, identify the information items required to satisfy the use case. • For each process in the system, identify any additional items of information that could be exposed and may be useful in future provenance use cases.

Step 2: Information Items: Examples • Information items may be: • Data items, i.e. the result of some calculation or decision (contained in state). • Whole or part processes, e.g. the sequence of decisions that led to a donor’s organ being rejected for donation. • Relationships, e.g. what were the causal determinants of a given decision.

Step 2: Information Items: Capture • Information items are to be captured by process documentation, i.e. p-assertions • Data items: Interaction or actor state p-assertions • Processes: Interaction and relationship p-assertions • Relationships: Relationship and interaction p-assertions
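
The mapping on this slide can be written down as a lookup table. The following is a minimal sketch: the enum and function names are hypothetical and do not come from any provenance library. It only restates which kinds of p-assertion the slide associates with each kind of information item.

```python
from enum import Enum


class PAssertion(Enum):
    INTERACTION = "interaction"
    ACTOR_STATE = "actor state"
    RELATIONSHIP = "relationship"


class ItemKind(Enum):
    DATA_ITEM = "data item"
    PROCESS = "process"
    RELATIONSHIP = "relationship"


# Which kinds of p-assertion can capture each kind of information item.
CAPTURED_BY = {
    ItemKind.DATA_ITEM: {PAssertion.INTERACTION, PAssertion.ACTOR_STATE},
    ItemKind.PROCESS: {PAssertion.INTERACTION, PAssertion.RELATIONSHIP},
    ItemKind.RELATIONSHIP: {PAssertion.RELATIONSHIP, PAssertion.INTERACTION},
}


def p_assertions_for(kinds):
    """Union of the p-assertion kinds relevant to the given item kinds."""
    needed = set()
    for kind in kinds:
        needed |= CAPTURED_BY[kind]
    return needed


# A use case needing a decision (data item) and the steps that led to it (process).
needed = p_assertions_for([ItemKind.DATA_ITEM, ItemKind.PROCESS])
```

For a use case that needs both a data item and a process, `needed` contains all three kinds of p-assertion.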

Step 3: Actors: Description • An actor is an entity within the application that performs actions and interacts with other actors, e.g. Web Services, components, machines, people. • One actor may be seen as being composed of other actors. • A primitive actor is one for which the designers do not know the other actors of which it is composed (or do, but the decomposition is deemed to be too detailed to be relevant).

Step 3: Actors: Roles in a Provenance Architecture • Asserting Actors – assert p-assertions • Recording Actors – record p-assertions • Querying Actors – retrieve p-assertions • Managing Actors – maintain provenance stores

Step 3: Actors: Identification Heuristics • Identify the components that receive information, e.g. a component/service in a workflow, a script command, the GUI/desktop application into which a user enters information. • Identify the components that provide the information in each interaction. These could be, for example, a workflow engine, a script executor, a user. • Each may also be intensionally defined, i.e. “the component that is the receiver in this interaction”.
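
The two heuristics amount to collecting the receivers and the providers of information across the known interactions. The sketch below illustrates this under assumptions: the `Interaction` tuple and the function name are hypothetical, and the sender and receiver assigned to each message are assumed for the example rather than stated in the slides.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    sender: str
    receiver: str
    message: str


def identify_actors(interactions):
    """Apply both heuristics: collect information receivers and providers."""
    receivers = {i.receiver for i in interactions}  # components that receive
    senders = {i.sender for i in interactions}      # components that provide
    return receivers | senders


# Assumed routing for two messages of the OTM example.
interactions = [
    Interaction("Donor data collector", "Testing laboratory", "Request blood test"),
    Interaction("Testing laboratory", "Donor data collector", "Return result"),
]

actors = identify_actors(interactions)
```

An intensionally defined actor corresponds to reading `interaction.receiver` (or `interaction.sender`) for a given interaction, without naming the component in advance.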

Step 3: Actors: OTM Example [Figure: the actors User interface, Donor organ diagnoser, Donor data collector, Electronic health care records and Testing laboratory, exchanging messages M1 to M8: User request (M1), Get Donor info (M2), Request p ID (M3), Return p ID (M4), Request blood test (M5), Return result (M6), Return result (M7), Diagnosis result (M8)]

Step 3: Actors: Information [Figure: actor information form for the Donor data collector, with sending and receiving actor fields]

Step 4: Interactions: Information Exchange

Step 5: Knowledgeable Actors: Description • A knowledgeable actor is an actor that has access to an information item. • The primary knowledgeable actor for an information item is the primitive actor who first becomes aware of that information, for one of the following reasons: • The actor creates the item. • The actor receives or observes the item from outside the application.
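
The two definitions can be made concrete with a small sketch. Everything named here is an assumption made for illustration: the `Awareness` record, the reason labels, and the ordering of events in the example. The sketch also treats every actor as primitive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Awareness:
    """An actor becoming aware of an item (hypothetical record)."""

    order: int    # position in time; lower means earlier
    actor: str
    item: str
    reason: str   # "creates", "external" (from outside the application) or "internal"


PRIMARY_REASONS = {"creates", "external"}


def knowledgeable_actors(events, item):
    """Every actor that has access to the item."""
    return {e.actor for e in events if e.item == item}


def primary_knowledgeable_actor(events, item):
    """The first actor that created the item or obtained it from outside."""
    firsts = [e for e in events if e.item == item and e.reason in PRIMARY_REASONS]
    return min(firsts, key=lambda e: e.order).actor if firsts else None


# Assumed sequence for a blood test result in the OTM example.
events = [
    Awareness(1, "Testing laboratory", "blood test result", "creates"),
    Awareness(2, "Donor data collector", "blood test result", "internal"),
    Awareness(3, "Donor organ diagnoser", "blood test result", "internal"),
]

primary = primary_knowledgeable_actor(events, "blood test result")
```

Under these assumptions `primary` is the Testing laboratory, while all three actors are knowledgeable about the result.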

Step 5: Knowledgeable Actors: Who Knows What? [Figure: Hospital, EHCRS, Testing lab]

Step 5: Knowledgeable Actors: OTM Example [Figure: Hospital, User interface, Donor organ diagnoser, Donor data collector, Electronic health care records, Testing laboratory]

Steps 3, 4 and 5: Knowledgeable Actors: Repeat as Necessary • Step 3: Identify actors • Step 4: Identify interactions • Step 5: Identify knowledgeable actors

Step 5: Knowledgeable Actors: Recording Process Documentation

Step 6: Adaptations: Modifying Actors • A non-knowledgeable actor may be modified so that it exposes for documentation an information item that is part of its state. • A non-knowledgeable actor may be modified so that it gains access to information items not currently available to itself or other actors in the system.

Step 6: Adaptations: Actor Introduction • A new actor can be introduced to the application to help in the answering of use cases

Step 6: Adaptations: Interaction Extension • An interaction in the application can be extended to exchange more information between a knowledgeable actor and a non-knowledgeable actor, making the latter knowledgeable. [Figure: before, the interaction between the two actors carries item a; after, it carries items a and b]
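
The effect of an interaction extension can be sketched as an update to a "who knows what" map. The function and actor names below are hypothetical, and the model is deliberately simple: an actor can only pass on items it already knows.

```python
def exchange(knowledge, sender, receiver, items):
    """Return a new knowledge map after `sender` sends `items` to `receiver`.

    Only items the sender actually knows are transferred.
    """
    sent = set(items) & knowledge.get(sender, set())
    updated = {actor: set(known) for actor, known in knowledge.items()}
    updated.setdefault(receiver, set()).update(sent)
    return updated


# The sender is knowledgeable about items a and b; the receiver knows nothing.
knowledge = {"sender": {"a", "b"}, "receiver": set()}

before = exchange(knowledge, "sender", "receiver", ["a"])       # original interaction
after = exchange(knowledge, "sender", "receiver", ["a", "b"])   # extended interaction
```

With the original interaction the receiver learns only item a; once the interaction is extended to carry b as well, the receiver becomes knowledgeable about b.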

Step 6: Adaptations: Interaction Introduction • A new interaction between actors can be introduced into the application in which a knowledgeable actor sends the information item to another actor, which then becomes knowledgeable. [Figure: before, there is no interaction between the two actors; after, a new interaction carries item b]

Step 6: Adaptations: Tracking Processes • A common information item required for provenance use cases is the process to which documentation refers • Interaction p-assertions • Relationship p-assertions • Tracers

Step 6: Adaptations: Tracer Terminology • A computational activity: actors cooperating on some work • Superior: any actor sending requests to other actors • Inferior: any actor receiving requests from other actors • Task: an independent computation within an actor, delimited by a request to the actor and a subsequent response from the actor

Step 6: Adaptations: Session Tracer Semantics • Generation rule: an actor must generate a new session tracer at the start of each task and add the tracer to all requests within that task • Propagation rule (to inferior): an actor must add any session tracers received from a superior to all requests it makes to inferiors within the task started by the superior’s request • Propagation rule (to superior): an inferior must add the session tracers supplied by its superior to its response to its superior
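
The three rules can be sketched with a toy actor class. This is an illustration under assumptions, not the architecture's implementation: the `Actor` class, its `handle` method and the tracer naming scheme are hypothetical, and requests are modelled as direct method calls.

```python
import itertools


class Actor:
    _ids = itertools.count(1)  # shared counter so every tracer is unique

    def __init__(self, name, inferiors=()):
        self.name = name
        self.inferiors = list(inferiors)
        self.requests_seen = []  # tracer sets received on incoming requests

    def handle(self, superior_tracers=frozenset()):
        """Run one task and return the tracers attached to the response."""
        self.requests_seen.append(frozenset(superior_tracers))
        # Generation rule: a new session tracer at the start of each task.
        own = f"{self.name}#{next(Actor._ids)}"
        # Propagation to inferiors: requests carry the new tracer plus any
        # tracers received from the superior.
        outgoing = frozenset(superior_tracers) | {own}
        for inferior in self.inferiors:
            inferior.handle(outgoing)
        # Propagation to superior: the response carries the superior's tracers.
        return frozenset(superior_tracers)


lab = Actor("Testing Lab")
collector = Actor("Donor Data Collector", [lab])
diagnoser = Actor("Donor Organ Diagnoser", [collector])

diagnoser.handle()  # run 1
diagnoser.handle()  # run 2
```

After the two runs, the lab has seen two requests whose tracer sets do not overlap, so documentation from run 1 and run 2 can be told apart even though the same actors took part in both.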

Step 6: Adaptations: Session Tracer [Figure: tracers across two runs (Run 1 and Run 2), each involving the Donor Organ Diagnoser, Donor Data Collector and Testing Lab]

Step 6: Adaptations: Other Tracers • Other application-specific tracers are possible • E.g. in the medical domain, a tracer could be used to identify all interactions belonging to a particular case.
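
A case tracer of this kind could be used at query time roughly as follows. The sketch is hypothetical throughout: the `Interaction` tuple, the `case:` tracer format and the recorded messages are assumptions made for illustration.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    message: str
    tracers: frozenset


def interactions_for_case(case_tracer, interactions):
    """Select the recorded interactions tagged with the given case tracer."""
    return [i for i in interactions if case_tracer in i.tracers]


# Hypothetical documentation of interactions belonging to two cases.
recorded = [
    Interaction("Get Donor info", frozenset({"case:A"})),
    Interaction("Request blood test", frozenset({"case:A"})),
    Interaction("Get Donor info", frozenset({"case:B"})),
]

case_a = interactions_for_case("case:A", recorded)
```

Because a tracer is just a tag carried on each interaction, a case tracer can sit alongside session tracers in the same set.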
