RepoMMan: using Web Services and BPEL to facilitate workflow interaction with a digital repository Richard Green. Overview. JISC Workflows activity and implementation meeting Aston Business School Tuesday 13 th February 2007. RepoMMan Project overview User and University needs
RepoMMan: using Web Services and BPEL to facilitate workflow interaction with a digital repository Richard Green
Overview (JISC Workflows activity and implementation meeting, Aston Business School, Tuesday 13th February 2007) • RepoMMan project overview • User and University needs • A solution: BPEL and Web Services • Example workflows: deposit and metadata • Problems? What problems? • Future work
RepoMMan project • Repository, Metadata and Management project • JISC-funded for two years to end May 2007... • ...to develop a BPEL-based, standards-compliant, Web Services based, workflow tool for Fedora • Closely aligned with the University’s commitment to deploy an institutional repository
RepoMMan • Two strands: • research – user needs, documentation etc (BPEL, surveys, Beginner’s guide etc) • technical – development • Surface tool in University portal and/or Sakai C&LE
Hull’s vision for a DR • Hull’s vision for an Institutional Repository is an extremely broad one • Conventional view was exposure of completed objects • Hull’s view encompasses storage, access, management and preservation of a wide range of file types from concept to completion
User needs • Survey and interviews revealed a very wide range of file types potentially to deal with • Users need: storage (backed up); access (from anywhere); management (sharing, locking, versioning etc) • Some want preservation • Tool must fit users’ expectations and environment • Assist/improve doing what they already do
University needs • Flexible: wide range of content, some public, some private, with appropriate UIs • Standards-based: Web Services, BPEL, JSR168 etc • Scalable: initially 2500 potential depositors • Open source software • Effective search & discovery: needs good metadata
Summary • The repository will be a working environment as well as a showcase • Aim is to provide storage, access, management and potential preservation painlessly • Development based on user needs • Standards compliant, Web Service approach
Solution • Toolset to manage repository workflows for users • BPEL - Business Process Execution Language • (Active Endpoints Open Source version) • Web Services • Fedora repository software • SOAP, REST • scales to millions of flexible objects (NSDL = 6m+)
BUT • Fedora is not an ‘out-of-the-box’ solution • Provides the repository functionality for YOU to use flexibly
Basic depositor workflows • Put object into repository (including some sort of structuring) • Get object from repository • Delete (including at version level) • Add metadata • Publish (= promote to public space)
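The five depositor workflows can be pictured with a small in-memory model. This is only an illustrative sketch: `PrivateSpace`, `RepoObject` and every method name are hypothetical, and nothing here calls Fedora or reflects the RepoMMan code. It assumes versions are numbered from 1 and that deleting a version renumbers the later ones.

```python
from dataclasses import dataclass, field


@dataclass
class RepoObject:
    """A repository object holding every version of a deposited file."""
    versions: list = field(default_factory=list)   # oldest first
    metadata: dict = field(default_factory=dict)
    public: bool = False


class PrivateSpace:
    """Toy stand-in for one user's private repository space (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def put(self, name, content):
        # Creates the object on first deposit, otherwise adds a new version.
        obj = self.objects.setdefault(name, RepoObject())
        obj.versions.append(content)
        return len(obj.versions)          # 1-based version number

    def get(self, name, version=None):
        # Latest version by default, or a specific 1-based version.
        versions = self.objects[name].versions
        return versions[-1] if version is None else versions[version - 1]

    def delete(self, name, version=None):
        if version is None:
            del self.objects[name]        # remove the whole object
        else:
            del self.objects[name].versions[version - 1]   # one version only

    def add_metadata(self, name, **fields):
        self.objects[name].metadata.update(fields)

    def publish(self, name):
        self.objects[name].public = True  # promote to public space
```

A depositor session would then be a sequence such as `put`, `put`, `add_metadata`, `publish` against the same object name.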
‘Putting’ and ‘getting’ a file • Left hand side of screen gives browse capability on user’s hard drive • Right hand side of screen shows the user’s private repository space like a directory structure • (‘Folders’ are in fact repository collections) • Folders can be expanded and new folders created • Objects can be further expanded to show versions
Three-tier stack • Model View Controller layer providing user interface • BPEL orchestrating Web Services (Fedora and other) to move files and objects around • Fedora drawing on ID Management System and University Storage Area Network
‘Putting’ a file • Part of the BPEL process diagram (Active Endpoints visualisation software) • Switch depending on whether the object already exists: the left hand side branch creates a new object; the right hand side modifies an existing one • Each of the globes with a ‘swirl’ round it is a Web Service call
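The switch in the diagram amounts to a create-or-modify branch. A minimal sketch, assuming the repository can be modelled as a dictionary keyed by object identifier and that each branch is a plain function standing in for a Web Service call; the names `put_file`, `create_object`, `modify_object` and the identifier `hull:1` are hypothetical.

```python
def create_object(repository, pid, content):
    """Stand-in for the Web Service call that creates a new object."""
    repository[pid] = [content]
    return "created"


def modify_object(repository, pid, content):
    """Stand-in for the Web Service call that adds a version to an object."""
    repository[pid].append(content)
    return "modified"


def put_file(repository, pid, content,
             create=create_object, modify=modify_object):
    """Branch on whether the object already exists, as the switch does."""
    if pid in repository:
        return modify(repository, pid, content)   # right hand side branch
    return create(repository, pid, content)       # left hand side branch


repo = {}
first = put_file(repo, "hull:1", b"first draft")
second = put_file(repo, "hull:1", b"second draft")
```

Passing the two branches in as parameters mirrors the way an orchestration layer only sequences calls and leaves the work to the services it invokes.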
BPEL and web services • BPEL can draw on any available Web Service • Access to an expanding range of data made available through Web Services at Hull (and, of course, elsewhere) • Which brings us to metadata!!!
Making public • Good metadata at the heart of effective search and discovery • When a user wants to ‘make an object public’ we will automatically add metadata • Potentially complex: BPEL to orchestrate • Has object already got metadata? • Derive metadata using tool(s) • Extract appropriate parts • Combine with contextual metadata • Allow user to edit • Create datastream(s) (DC, Rich metadata (UVa), preservation metadata, etc) • Add to digital object
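The ‘make public’ steps can be read as a pipeline. The sketch below is one possible reading, not the project's implementation: it assumes that derivation is skipped when the object already has metadata, that "extract appropriate parts" means keeping non-empty values, and that only a DC datastream is built. The function `make_public`, the dictionary layout and the tool callables are all hypothetical.

```python
def make_public(obj, derive_tools, contextual, user_edit=lambda md: md):
    """Run the 'make public' steps in order and return an updated copy."""
    metadata = dict(obj.get("metadata", {}))   # has the object got metadata?
    if not metadata:
        for tool in derive_tools:              # derive metadata using tool(s)
            derived = tool(obj["content"])
            # Extract appropriate parts: here, simply the non-empty values.
            metadata.update({k: v for k, v in derived.items() if v})
    for key, value in contextual.items():      # combine with contextual metadata
        metadata.setdefault(key, value)        # fills gaps, never overwrites
    metadata = user_edit(metadata)             # allow the user to edit
    datastreams = {"DC": metadata}             # create datastream(s)
    # Add to the digital object (a new dict; the input is left untouched).
    return {**obj, "metadata": metadata,
            "datastreams": datastreams, "public": True}
```

Each step is a candidate for a separate Web Service call, which is why the slide describes the process as something for BPEL to orchestrate.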
Metadata from web services • Much contextual metadata about the ‘author’ can be derived from University systems’ Web Services or context • Technical metadata can be derived by using tools such as JHOVE (as a Web Service) • Descriptive metadata about the content is the Holy Grail? • Data Fountains (UC Riverside & NSDL)
Metadata from context & Web Services • Contextual metadata from environment (LDAP, Portal, Sakai) • Technical metadata (JHOVE etc) • Additional descriptive metadata from content (Data Fountains etc) • Allow user to edit / tweak • Acknowledgements to the Arrow consortium for the design idea
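Combining the three metadata sources with user edits could look like the following sketch. It assumes a simple precedence rule (later sources override earlier ones, and the user's edits always win) and records where each value came from; the function name `merge_metadata`, the field names and the precedence order are illustrative assumptions, not taken from the project.

```python
def merge_metadata(contextual, technical, descriptive, user_edits=None):
    """Merge metadata records, remembering the source of each field."""
    merged = {}
    sources = (
        ("contextual", contextual),     # e.g. from LDAP, portal, Sakai
        ("technical", technical),       # e.g. from a tool such as JHOVE
        ("descriptive", descriptive),   # e.g. derived from the content
        ("user", user_edits or {}),     # the user's edits take precedence
    )
    for source, record in sources:
        for key, value in record.items():
            merged[key] = {"value": value, "source": source}
    return merged


record = merge_metadata(
    contextual={"creator": "A. Depositor", "title": "untitled"},
    technical={"format": "application/pdf"},
    descriptive={"title": "Draft report"},
    user_edits={"title": "Final report"},
)
```

Keeping the source alongside each value would let an editing screen show the user which fields were filled in automatically.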
Problems? What problems? • Not a smooth journey! • Lessons: Don’t be a pioneer? Puppies don’t come free? • Active BPEL was/is fine • Fedora Web Services (v2.0, v2.1, v2.1.1) an issue: Web Services were rpc/encoded; BPEL and AXIS couldn’t always successfully validate responses • Fedora Web Services (v2.2) OK: Web Services now document/literal
Next stages • Complete the RepoMMan deposit tool (!!!) • Internal testing • Trialling with users • Small-scale trials • Complete the RepoMMan metadata work • Post-project, develop further and deploy • Develop workflow tools for the admin side of the ‘make public’ process, records management, preservation... • In time switch to DROID (for MIME type)? Shibboleth?
Project website and contacts • RepoMMan: http://www.hull.ac.uk/esig/repomman • Contacts: • r.green@hull.ac.uk (Manager) • c.awre@hull.ac.uk (Integration architect) • i.dolphin@hull.ac.uk (Director) • s.lamb@hull.ac.uk (Software developer)