
Planning and Watch

Planning and Watch. Review presentation. Christoph Becker Vienna University of Technology www.ifs.tuwien.ac.at/~becker. SCAPE First year project review, Luxembourg March 20-21, 2012. Outline. Objectives and overall progress Key results Watch design (D12.1)


Presentation Transcript


  1. Planning and Watch Review presentation Christoph Becker Vienna University of Technology www.ifs.tuwien.ac.at/~becker SCAPE First year project review, Luxembourg March 20-21, 2012

  2. Outline • Objectives and overall progress • Key results • Watch design (D12.1) • Decision factors analysis (D14.1) • Sneak preview: The knowledge browser • Integration and outlook • Time for questions

  3. Preservation Planning: Key concepts • Repeatable, standardized planning workflow • A weighted hierarchy of objectives • Measurable criteria on the leaf level of the tree • Utility functions make criteria comparable • Controlled experimentation on sample content • Evidence-based decision making • Standardized structure for plan specification • Transparency and documentation • Comparability across scenarios • Planning tool Plato guides, validates, documents
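
The weighted objective tree and utility functions on slide 3 can be illustrated with a small sketch. This is not Plato code: the `Node` class, the criterion names, the weights and the 0 to 5 utility scale are all assumptions chosen for illustration. Each leaf turns a raw measurement into a utility score, and each inner node aggregates its children by weighted sum, which is what makes different criteria comparable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One node of a (hypothetical) objective tree."""
    name: str
    weight: float  # weight relative to siblings; siblings sum to 1.0
    children: List["Node"] = field(default_factory=list)
    utility: Optional[Callable[[float], float]] = None  # leaves only


def evaluate(node: Node, measurements: Dict[str, float]) -> float:
    """Return the utility of a node: weighted sum for inner nodes,
    utility function applied to the measurement for leaves."""
    if node.children:
        return sum(c.weight * evaluate(c, measurements) for c in node.children)
    return node.utility(measurements[node.name])


# Illustrative tree: names, weights and utility functions are assumptions.
tree = Node("root", 1.0, children=[
    Node("object", 0.75, children=[
        Node("pixel_identity", 1.0,
             utility=lambda v: 5.0 if v == 1.0 else 0.0),
    ]),
    Node("process", 0.25, children=[
        Node("seconds_per_file", 1.0,
             utility=lambda s: 5.0 if s <= 1 else (3.0 if s <= 10 else 1.0)),
    ]),
])

# Measurements per alternative, as they might come from controlled experiments.
alternatives = {
    "leave_and_monitor": {"pixel_identity": 1.0, "seconds_per_file": 0.0},
    "migrate": {"pixel_identity": 1.0, "seconds_per_file": 4.0},
}

scores = {name: evaluate(tree, m) for name, m in alternatives.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these made-up numbers the slower migration scores lower than leaving the content untouched; real plans would of course use measured evidence and institution-specific weights.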

  4. Scalability Challenges • Creating a plan is effort-intensive • Sharing experience is difficult • Monitoring changes is manual • Integrating context, strategies and operations is difficult

  5. Scalability Challenges • Creating a plan is effort-intensive → Increase efficiency of planning • Sharing experience is difficult → Increase standardisation and reusability • Monitoring changes is manual → Introduce automation • Integrating context, strategies and operations is difficult → Manage policies; integrate systems

  6. Work packages and major goals • PW.WP.1 (WP12): Automated Watch • Watch component for monitoring aspects of interest • Simulation component for prediction • PW.WP.2 (WP13): Policies Representation • Catalogue of high-level policy statements • Machine-understandable model of low-level policy statements • Structural and procedural relations between these • PW.WP.3 (WP14): Automated Planning • Refinement of the planning method • Analysis of decision factors and criteria • Planning component (integrated with repositories)

  7. Overall progress in year 1 • Startup phase • Conceptual advances • Development started with a slight delay • No major impact on delivery schedule • Parallel interacting streams • Analysis of methods: planning, policies, monitoring • Prototype development: Plato4, analysis module, watch services • Integration experiments: Components and Taverna workflows • Milestones and deliverables • MS58 Policy elements (m6) • D14.1 Decision factors analysis (m10) • MS59 Policy catalogue (m12) • D12.1 Watch design (m12)

  8. Status WP12: Watch • Watch service definition completed • Clarification of goals, scope and key concepts • Watch component design finalised: D12.1 • Analysis of drivers and constraints • Analysis of events and triggers • Architecture design • Development started • First milestone release in autumn 2012 • Simulation environment: Preliminary work started

  9. D12.1: Key goals for Automated Watch • Enable the planning component to automatically monitor entities and properties of interest • Enable human users and software components to pose questions about entities and properties of interest • Act as a central place for collecting relevant knowledge that can be used to preserve an object or a collection • Collect information from different sources through adaptors • Enable human users to add specific knowledge • Notify interested agents when an important event occurs • Act as an extensible component

  10. Watch: Key concepts • Knowledge base • Entities and their properties • Measures of properties over time • Triggers define conditions and events • Flexibility and extensibility • A well-defined, flexible data model • Adaptors for different information sources • Monitoring Capabilities • Internal Monitoring • External Monitoring • Monitor compliance, risks and opportunities
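
The knowledge base idea on slide 10 (entities, their properties, and measures of properties over time) can be sketched as a tiny in-memory store. The class and method names below are hypothetical and are not taken from the D12.1 design; the sketch only shows how time-stamped measurements per entity and property might be recorded and queried.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]            # (entity, property)
Measurement = Tuple[date, float]  # (when, value)


@dataclass
class KnowledgeBase:
    """Hypothetical store of property measurements over time."""
    measurements: Dict[Key, List[Measurement]] = field(default_factory=dict)

    def record(self, entity: str, prop: str, when: date, value: float) -> None:
        """Append one time-stamped measurement for (entity, property)."""
        self.measurements.setdefault((entity, prop), []).append((when, value))

    def history(self, entity: str, prop: str) -> List[Measurement]:
        """Return all measurements for (entity, property), oldest first."""
        return sorted(self.measurements.get((entity, prop), []))

    def latest(self, entity: str, prop: str) -> Optional[float]:
        """Return the most recent value, or None if nothing was recorded."""
        past = self.history(entity, prop)
        return past[-1][1] if past else None


kb = KnowledgeBase()
# Illustrative values only; entity and property names are assumptions.
kb.record("format:TIFF6", "supporting_tools", date(2012, 3, 1), 9)
kb.record("format:TIFF6", "supporting_tools", date(2012, 1, 1), 12)

current = kb.latest("format:TIFF6", "supporting_tools")
unknown = kb.latest("format:GIF", "supporting_tools")
```

Keeping the full history rather than only the latest value is what allows triggers to react to changes over time, not just to the current state.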

  11. Information sources and clients [Diagram. Labels: Watch core; Source Adaptors; Knowledge base; Conditions; Notifications; Format registries; Content profiles; Component catalogue; Workflows; Planning; Experiments; Policies; Watch Frontend; Planning; Operations; Browser snapshots]

  12. Example conditions and events • Policies specify object properties, content profiles describe object properties • Policy violation (e.g. objects that are not well-formed) • Plan specification includes tolerance levels for operations • QA measures on migration results outside specified boundaries • Migration performance below specified threshold • Plan specification includes format properties • Number of tools supporting a certain format drops below threshold • Plans specify criteria to be measured • New components developed/tested on platform that support desired QA measures • Experiments show risks related to tools in use
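
Two of the conditions on slide 12 (tool support for a format dropping below a threshold, and migration performance below a specified threshold) can be expressed as simple predicates over known facts. The `Trigger` class, the fact names and the threshold values are assumptions made for this sketch, not part of the Watch design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]


@dataclass
class Trigger:
    """Hypothetical trigger: a condition plus agents to notify when it holds."""
    description: str
    condition: Callable[[Facts], bool]
    subscribers: List[Callable[[str], None]]

    def check(self, facts: Facts) -> bool:
        """Evaluate the condition and notify every subscriber if it fires."""
        fired = self.condition(facts)
        if fired:
            for notify in self.subscribers:
                notify(self.description)
        return fired


inbox: List[str] = []  # stands in for a planner's notification channel

triggers = [
    Trigger("Tool support for format below threshold",
            lambda f: f["tools_supporting_format"] < 3,
            [inbox.append]),
    Trigger("Migration performance below threshold",
            lambda f: f["migrated_files_per_minute"] < 30,
            [inbox.append]),
]

# Illustrative facts; in practice these would come from source adaptors.
facts = {"tools_supporting_format": 2, "migrated_files_per_minute": 40}
fired = [t.check(facts) for t in triggers]
```

Here only the first condition holds, so only one notification reaches the inbox.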

  13. Current status in Watch • Proof-of-concept (May/June) • Full-circle architecture validation • Mockup data sources • First iteration of Watch focuses on web content • Watch central service • Content profile adaptor • Focused vs. dispersed web crawls over time • Incremental addition of information sources • New adaptors may reveal new requirements

  14. Content profile • Global view of content • Distribution of file formats • Distribution of characteristics • Representative data sets • Stages • Collect metadata • Combine and filter • Reason on the result
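
The three stages on slide 14 (collect metadata, combine and filter, reason on the result) can be sketched with plain Python. The record layout and the sample values are invented for illustration; a real content profile would be built from the output of characterisation components.

```python
from collections import Counter
from typing import Dict, List, Optional

# Stage 1: collected metadata (hypothetical characterisation output).
records: List[Dict[str, Optional[object]]] = [
    {"path": "a.tif", "format": "image/tiff", "well_formed": True},
    {"path": "b.tif", "format": "image/tiff", "well_formed": False},
    {"path": "c.tif", "format": "image/tiff", "well_formed": True},
    {"path": "d.gif", "format": "image/gif", "well_formed": True},
    {"path": "e.bin", "format": None, "well_formed": None},  # unidentified
]


def build_profile(recs: List[Dict[str, Optional[object]]]) -> Dict[str, float]:
    """Stage 2: drop unidentified files and return the share of each format."""
    identified = [r for r in recs if r["format"]]
    counts = Counter(r["format"] for r in identified)
    return {fmt: n / len(identified) for fmt, n in counts.items()}


profile = build_profile(records)

# Stage 3: simple reasoning on the result.
dominant_format = max(profile, key=profile.get)
not_well_formed = [r["path"] for r in records if r["well_formed"] is False]
```

The list of files that are not well-formed is the kind of result a policy check could consume, for example to detect the policy violation mentioned on slide 12.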

  15. Status WP13: Policies • Policies are governance statements, not executable rules • 3 levels of policy statements • High-level guidance: A Policy catalogue • Mid-level procedures and structures • Low-level control policies: A machine-understandable Policy model • Milestone 59: Policy catalogue closed in March • 1st semantic model of control policies in m15 • Further refinement in second iteration

  16. Status WP14: Planning • Development baseline based on Plato 3 • Removed: PLANETS and other legacy dependencies • Refactored: modularisation, decoupling, testing, ... • Upgraded: JBoss 7, JSF 2, RichFaces 4, ... • Moving: Maven, GitHub, continuous builds, ... • First milestone release in July (policy model, repository integration, Taverna integration) • Define interfaces and integration • Taverna experimentation • Requirements for components catalogue • Repository and platform interface • Collect decision points to automate • Analyse decision factors and criteria

  17. D14.1: Analysis of decision criteria • PLATO, the Planning Tool • Evidence-based, well-documented plans • Hierarchy of objectives leading to quantified decision criteria • Traceability from decision factors to decisions • Case studies in and after Planets • Challenges: Effort, sharing, automation, scalability • Analysis of the measurability and automation of criteria • Standardisation and alignment of criteria • Systematic assessment of the impact of certain criteria

  18. A method and tool for decision criteria analysis

  19. Collect: Some case study data from Plato

  20. Collect: Decision Criteria • Objective Tree • Utility Function • Semantics • Taxonomy of criteria measurements

  21. Decision criteria: What to measure and how • 13 case studies with 617 criteria • Frequency distribution of criteria across taxonomy • Taxonomy is complete • Preservation of scanned images: distribution over four case studies • But: no analysis of impact

  22. Align models for decision factors • Format Properties • Library of Congress format evaluation • PRONOM format evaluation • Actual decision criteria • Software Quality • ISO SQuaRE: Standardised software quality model • Object properties • Formats • Representation Instances • Significant Properties

  23. A method and tool for decision criteria analysis

  24. Develop • Impact factors for criteria and sets • Frequency • Weighting • Utility function • Impact • Selectivity • Measures • Analysis tool • Criteria browser, set builder and analyser • Integrated in upcoming release of the SCAPE planning component

  25. Analyse a criterion (set) C • Goal: Understand key decision factors • Questions: How often does C occur in scenario S? How critical is C? How important is C? • Metrics: Coverage, Range, Criticality
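
The question "How often does C occur in scenario S?" can be made concrete with a small calculation. The slides do not give the formulas used in D14.1, so the definitions below are illustrative assumptions only: `coverage` is taken as the fraction of case studies in which a criterion appears, and `mean_weight` as its average weight where it appears. The case study data is invented.

```python
from typing import Dict, List

# Hypothetical scenario: each case study maps criterion name -> weight.
case_studies: List[Dict[str, float]] = [
    {"image width preserved": 0.20, "file size": 0.05},
    {"image width preserved": 0.10},
    {"file size": 0.10, "licence cost": 0.30},
    {"image width preserved": 0.30, "licence cost": 0.10},
]


def coverage(criterion: str, studies: List[Dict[str, float]]) -> float:
    """Assumed definition: share of case studies that use the criterion."""
    return sum(1 for s in studies if criterion in s) / len(studies)


def mean_weight(criterion: str, studies: List[Dict[str, float]]) -> float:
    """Assumed definition: average weight over the studies that use it."""
    weights = [s[criterion] for s in studies if criterion in s]
    return sum(weights) / len(weights) if weights else 0.0


width_coverage = coverage("image width preserved", case_studies)
width_weight = mean_weight("image width preserved", case_studies)
absent_weight = mean_weight("colour depth", case_studies)
```

A criterion that occurs often but carries little weight, or rarely but with high weight, would look very different under these two measures, which is one reason to report more than frequency alone.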

  26. Sneak preview: The knowledge browser • Analysis module for decision criteria • Part of the planning component • First milestone release: July 2012

  27. Conclusions • Systematic approach for analysis of decision criteria in preservation planning • Standardisation, cross-referencing, reusability • Method and tool for quantitative impact assessment • Enables SCAPE Planning and Watch to • Facilitate experience sharing and knowledge creation • Reduce complexity • Optimize decision making • Guide automation • Integrated in upcoming planning component • Enable sharing and alignment • Real-time analysis over time (Watch) • Guidance and QA of planning activities

  28. Year 2 work plan for Planning and Watch (1) • Watch • Proof of concept prototype • Content profile adaptor and monitor • Additional adaptors • Simulation environment prototype in September • Watch core services (version 1) in November • Policies • Control policy model • Catalogue elaboration • Model refinement and validation

  29. Year 2 work plan for Planning and Watch (2) • Planning • Automated planning component in July (Plato 4) • Scalability roadmap • Integration • Content profiling • Repositories • Workflow discovery and execution • Evaluation • Case studies in Testbeds • Key Performance Indicators

  30. Most critical technical dependencies • Preservation components • Planning evaluates action components • Watch uses (the output of) characterisation components to create content profiles • Quality Assurance measures quality of preservation actions for evaluation (including as part of planning) • Web browser watch service uses QA components • Platform • Planning and Watch query components and workflows • Planning runs experiments as Taverna workflows (directly in real-time) • Planning and Watch components interface with repositories • Plans specify workflows to be run on the platform • Watch monitors REF

  31. Other results and publications • Lessons learned in Preservation Planning (JCDL) • Automated planning experiments • Actions, characterisation, QA and results reporting (ICADL) • Workflow construction in Taverna, components discovery and invocation • Automation and crowdsourcing (CIKM) • Decision making and governance • Relationship of preservation planning and IT Governance (ASIST, IPRES) • Maturity model for preservation planning and operations (ASIST) • Repository simulation • Evolution of a repository over time, given starting point and rules (IPRES)

  32. It’s 2014. You have content, a mandate, no action plans defined. What do you do? • Deploy the content profiler (uses characterization components for identification and property extraction) • Sign up with SCAPE Planning and Watch • Connect your repository to SCAPE Planning and Watch • Specify your policy model • Watch component starts monitoring content and policies and detects policy violations • You quickly create preservation plans • by evaluating action components • using characterisation and QA components • in Taverna workflows, all integrated in planning • The finished plans contain workflow specifications including SLAs • Deploy plans to repository (running e.g. on SCAPE platform)

  33. In 2015... • Watch monitors compliance of operations to plans and risks and opportunities connected to plans and policies • Monitoring conditions are automatically generated • New content? Monitored • Changed policies? Monitored • Changed environment, format risks? Monitored • New, better tools? Monitored • New QA tools that measure critical features you had to check manually? Monitored • Need an outlook on the status in 2017? Run a simulation • Is there something else you want to have monitored? Write a watch adaptor and plug it in. • Upon changes, you can swiftly adapt plans and redeploy
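
"Write a watch adaptor and plug it in" suggests a plug-in interface between information sources and the Watch core. The sketch below shows one way such an interface could look. `WatchAdaptor`, `WatchCore`, `plug_in` and `poll` are hypothetical names, not the actual SCAPE API, and the registry data is invented.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Tuple

Fact = Tuple[str, str, float]  # (entity, property, value)


class WatchAdaptor(ABC):
    """Hypothetical contract every information source adaptor fulfils."""

    @abstractmethod
    def fetch(self) -> Iterable[Fact]:
        """Return the facts currently known to this source."""


class StaticFormatRegistryAdaptor(WatchAdaptor):
    """Toy adaptor that serves tool counts from an in-memory dictionary."""

    def __init__(self, tool_counts: Dict[str, int]) -> None:
        self.tool_counts = tool_counts

    def fetch(self) -> Iterable[Fact]:
        for fmt, count in self.tool_counts.items():
            yield (fmt, "supporting_tools", float(count))


class WatchCore:
    """Toy core that polls all plugged-in adaptors and keeps the latest facts."""

    def __init__(self) -> None:
        self.adaptors: List[WatchAdaptor] = []
        self.facts: Dict[Tuple[str, str], float] = {}

    def plug_in(self, adaptor: WatchAdaptor) -> None:
        self.adaptors.append(adaptor)

    def poll(self) -> None:
        for adaptor in self.adaptors:
            for entity, prop, value in adaptor.fetch():
                self.facts[(entity, prop)] = value


core = WatchCore()
core.plug_in(StaticFormatRegistryAdaptor({"TIFF6": 12, "GIF": 7}))
core.poll()
```

Because the core depends only on the abstract `fetch` contract, a new source can be added without changing the core, which matches the extensibility goal stated for the Watch component.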

  34. Thank you! • Questions? “SCAPE is set to move forward the control of digital preservation operations from ad-hoc decision making to proactive, continuous preservation management, through a context-aware planning and monitoring cycle integrated with operational systems.”

  35. What is a policy? Goals and constraints • Goals and constraints are often not defined explicitly • Policy definitions... • “an official expression of principles that direct an organization’s operations” • “Formal statement of direction or guidance as to how an organization will carry out its mandate, functions or activities” • But: “Policies” are encountered on a variety of levels in DP • From TRAC statements to enforceable processing rules • From the perspective of planning: • Preservation Policies are governance statements (about constraints, goals, preferences, directives) that constrain or drive operational planning, but may also have other effects outside of operational planning. • They are not directly enforceable (they are business policies) • Preservation planning translates them into concrete actions.

  36. Domain model for the Knowledge Base

  37. Compliance, risk and opportunities • Compliance of operations to deployed plan (SLAs) • Risks to operations (errors uncovered in QA tool) • Opportunities for operations (new QA tool) • Opportunities for operations (new action tool) • Planning will generate SLAs and monitoring conditions automatically

  38. Compliance, risk and opportunities • Compliance of operations to deployed plan • Risks to operations (errors uncovered in QA tool) • Opportunities for operations (new QA tool) • Opportunities for operations (new action tool) • Monitor criteria: change in objectives (caused by driver or constraint) • Add the policy context • Governance, Risk and Compliance

  39. Content-related triggers

  40. Environment-related triggers

  41. Community-related triggers

  42. Organisation-related triggers

  43. High-level design of the Watch component

  44. Four cases, three solutions: Scanned images • Bavarian State Library, 72TB TIFF6: Leave and monitor • British Library, 80TB TIFF5: Migrate to JP2 (ImageMagick) • Royal Library of Denmark, ~10,000 aerial photographs in TIFF6: Leave and monitor • State and University Library Denmark, scanned yearbooks in GIF: Migrate to TIFF6

  45. Scanned books requirements

  46. Scanned books results
