Research Workflow Process on the GRIDs Surface. Keith G Jeffery President euroCRIS. Agenda. Introduction The R&D Process: Recording The Key: Metadata and Data Exchange Standards Workflow on the GRIDs surface Conclusion. Nirvana.
Research Workflow Process on the GRIDs Surface Keith G Jeffery President euroCRIS
Agenda • Introduction • The R&D Process: Recording • The Key: Metadata and Data Exchange Standards • Workflow on the GRIDs surface • Conclusion
Nirvana Commonly used to indicate an optimal state of a person (professional) or system (suitable) • Buddhism. “The ineffable ultimate in which one has attained disinterested wisdom and compassion” • Hinduism. “Emancipation from ignorance and the extinction of all attachment” In a euroCRIS context, best possible CRIS system(s) for end-users backed by best advice
Nirvana - Retrieval • An environment where an end-user can: • Request information and, through an intelligent dialogue, generate a ‘job’ which provides it • Example (Medical R&D planning) • How many researchers • expert in GlycoProtein gp120 and CD4 molecule • are likely to be available in 2015; • Classify researchers by country, institution; • Order list of researchers by number of refereed publications to date
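A minimal sketch of the 'job' that such a dialogue might generate for the medical R&D example, assuming the CRIS exposes researcher records in memory. The `Researcher` type, its field names and the sample records are invented for illustration, and the forecast of availability in 2015 is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Researcher:
    # Hypothetical record layout; a real CRIS schema would differ.
    name: str
    country: str
    institution: str
    expertise: frozenset
    refereed_publications: int


def find_experts(researchers, required_topics):
    """Return researchers whose expertise covers every required topic,
    ordered by refereed publication count (highest first)."""
    required = frozenset(required_topics)
    matches = [r for r in researchers if required <= r.expertise]
    return sorted(matches, key=lambda r: r.refereed_publications, reverse=True)


def classify(researchers):
    """Group researcher names by (country, institution), keeping input order."""
    groups = {}
    for r in researchers:
        groups.setdefault((r.country, r.institution), []).append(r.name)
    return groups


# Invented sample records, for illustration only.
SAMPLE = [
    Researcher("A. Rossi", "IT", "Univ A", frozenset({"gp120", "CD4"}), 12),
    Researcher("B. Smith", "UK", "Inst B", frozenset({"gp120"}), 30),
    Researcher("C. Dupont", "FR", "Lab C", frozenset({"gp120", "CD4", "HIV"}), 25),
    Researcher("D. Bianchi", "IT", "Univ A", frozenset({"CD4", "gp120"}), 7),
]

experts = find_experts(SAMPLE, ["gp120", "CD4"])
by_location = classify(experts)
```

Here `experts` holds the ordered list and `by_location` the classification by country and institution; in a real system the dialogue would build both steps from the user's request.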
Nirvana – input / update • An environment where an end-user can: • Input / update information and through an intelligent dialogue obtain assistance where needed and validation of the input • Example: • if value input for ‘person’ then possible valid values for ‘organisational unit’ suggested
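The input example above can be sketched as follows, assuming the CRIS already holds person-to-organisational-unit links that it can offer as suggestions. The `AFFILIATIONS` table and both function names are hypothetical; the sample entries reuse names that appear elsewhere in these slides.

```python
# Hypothetical affiliation metadata held by the CRIS: person -> organisational units.
AFFILIATIONS = {
    "Keith Jeffery": ["euroCRIS", "CCLRC"],
    "Thierry Priol": ["INRIA"],
}


def suggest_org_units(person):
    """Suggest valid 'organisational unit' values once a 'person' has been entered."""
    return sorted(AFFILIATIONS.get(person, []))


def validate_entry(person, org_unit):
    """Return (is_valid, suggestions) so the dialogue can assist on failure."""
    suggestions = suggest_org_units(person)
    return org_unit in suggestions, suggestions
```

A rejected value comes back with the list of suggestions, which is the assistance the slide asks for: `validate_entry("Thierry Priol", "CNR")` reports the entry as invalid and offers `["INRIA"]`.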
The Solution is Required: • To overcome the ‘effort threshold’ to: • obtain the required answers from the CRIS • input and update the information in the CRIS • maintain data quality in the CRIS • Across • local stand-alone CRIS • heterogeneous distributed CRISs • Thus achieving ‘nirvana’
Agenda • Introduction • The R&D Process: Recording • The Key: Metadata and Data Exchange Standards • Workflow on the GRIDs surface • Conclusion
The R&D Process: Recording Workprogramme CRIS DATABASE Proposal Project Results Exploitation WealthCreation
The R&D Process: Feedbacks CRIS DATABASE Workprogramme Proposal Project Results Exploitation WealthCreation
The R&D Process: Review CRIS DATABASE Workprogramme Proposal Project Results Exploitation WealthCreation review review review review
The WorkProgramme Process Economic factors CRIS DATABASE Societal factors Technology Foresight -World / Country State -World / Country Models -Technology Prediction -Solicited Advice Workprogramme
The Proposal Process Idea CRIS DATABASE Review Previous Work Objectives CRIS DATABASE Method -Previous Results -Previous Projects Resources and dependencies Proposal -Human Resources -Finance
The Project Process Project CRIS DATABASE Project Management System CRIS DATABASE -Previous Results -Previous Projects -Human Resources -Finance
The Results Process Initial Results CRIS DATABASE Internal Review CRIS DATABASE Peer Review Publication or Registration Previous Results Results
The Exploitation Process Results CRIS DATABASE Business Plan Finance Marketing Production Selling Marketing Information Economic Information Exploitation
The Wealth Creation Process Exploitation CRIS DATABASE marketing production employment Marketing Information Economic Information WealthCreation
The R&D Process: Recording Workprogramme CRIS DATABASE Proposal Project Results Exploitation WealthCreation
The R&D Process: Recording WorkProgramme • Programme Name • Funding • OrgUnit • Person responsible • Workprogramme document • CRIS DATABASE
The R&D Process: Recording Proposal • Title • Abstract • Person(s) • OrgUnit(s) • Proposal Document • CRIS DATABASE
The R&D Process: Recording Project • Title • Abstract • Person(s) • OrgUnit(s) • Funding • Project Plan • CRIS DATABASE
The R&D Process: Recording Results-Product • Person(s) • OrgUnit(s) • Project(s) • Product(s) • Product Description • CRIS DATABASE
The R&D Process: Recording Results-Patent • Person(s) • OrgUnit(s) • Project(s) • Patent(s) • Patent File • CRIS DATABASE
The R&D Process: Recording Results-Publication • Person(s) • OrgUnit(s) • Project(s) • Bibliographic Information • Article • CRIS DATABASE
The R&D Process: Recording Exploitation • Person(s) • OrgUnit(s) • Business plan • Finance Data • Marketing Data • Production Data • Sales Data • CRIS DATABASE
The R&D Process: Recording Wealth Creation • Person(s) • OrgUnit(s) • Annual Reports/Accounts • Employment Records • Dividends Records • CRIS DATABASE
The R&D Process Workprogramme Note: some CRIS developers limit recording of outputs from the process to areas indicated Proposal Nirvana Project Results Exploitation WealthCreation
Complete Process ICT Support • Nirvana is • a complete, • integrated, • end-to-end ICT support • for the research process • across heterogeneous distributed CRISs
Agenda • Introduction • The R&D Process: Recording • The Key: Metadata and Data Exchange Standards • Workflow on the GRIDs surface • Conclusion
Metadata and Data Exchange Standards • Metadata • a succinct representation of the object of interest • Schema, navigational, associative [descriptive, restrictive, supportive] • Used for rapid retrieval of navigational data to objects of interest • Can also be used for statistical purposes (‘how many…’, ‘average number of…’) [Diagram labels: view to users; SCHEMA; NAVIGATIONAL; ASSOCIATIVE; constrain it; how to get it; data (document)]
Metadata • Many kinds and standards exist • Examples include: • Publications: MARC, DC (Dublin Core) • Geospatial: CSDGM (Content standard for digital geospatial metadata) • Engineering: STEP • Education: LOM (learning object metadata); EDNA (Education Network Australia metadata)
Metadata and CRISs • Commonly a CRIS stores the metadata rather than the object itself • e.g. result_publicationId which can be used to access the publication itself (person{author}, title, abstract etc usually stored in the CRIS) • e.g. projectId which can be used to access the detailed project documentation (title, abstract etc usually stored in the CRIS)
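A small sketch of the split described above, assuming the CRIS keeps title, abstract and authors while the publication itself lives elsewhere and is reached through its identifier. `PublicationMetadata`, `CrisCatalogue` and the sample entries are illustrative names and data, not part of any CRIS standard.

```python
from dataclasses import dataclass, field


@dataclass
class PublicationMetadata:
    # The metadata usually stored in the CRIS itself.
    result_publication_id: str
    title: str
    abstract: str
    authors: list = field(default_factory=list)


class CrisCatalogue:
    def __init__(self):
        self._metadata = {}   # held in the CRIS
        self._documents = {}  # stand-in for an external document repository

    def register(self, meta, full_text):
        self._metadata[meta.result_publication_id] = meta
        self._documents[meta.result_publication_id] = full_text

    def search_titles(self, keyword):
        """Search the metadata only; the documents are never scanned."""
        kw = keyword.lower()
        return [m.result_publication_id
                for m in self._metadata.values()
                if kw in m.title.lower()]

    def fetch_document(self, publication_id):
        """Use the identifier to reach the object itself."""
        return self._documents[publication_id]


# Invented sample entries, for illustration only.
catalogue = CrisCatalogue()
catalogue.register(
    PublicationMetadata("pub-1", "Workflow on the GRIDs surface", "…", ["K. Jeffery"]),
    "full text of publication 1",
)
catalogue.register(
    PublicationMetadata("pub-2", "Metadata standards overview", "…", ["K. Jeffery"]),
    "full text of publication 2",
)
hits = catalogue.search_titles("grid")
```

Retrieval works on the compact metadata first, and the full document is fetched only for the identifiers that match.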
Metadata: DCf: Publications Domain of CERIF [Diagram labels: Descriptive: Title, Subject, Keywords, Description, Resource Type, Coverage Temporal, Coverage Spatial; Project, Person, OrgUnit, Person, OrgUnit, UniqueId, UniqueId; Restrictive: Security, Privacy, Quality Assessment, AccessLevel, Charge; Annotation; Classification; ResourceIdentifier; Navigational]
Metadata in CRISs • Used for • Quality: validation on input / update • Summarising: overview results • Retrieval speed (find the list of objects of potential interest) • Controlling access • Rights management • And……..
Metadata in Interoperating CRISs • Metadata essential to allow interoperation of CRISs, especially heterogeneous distributed CRISs • Provides the information necessary to set up automatically retrieval (or update) over heterogeneous CRISs • Catalog technique • Universal schema technique(s) • Knowledge-based reconciliation technique(s)
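One possible reading of the universal schema idea, sketched under the assumption that each CRIS publishes a mapping from common field names to its own local field names. The CRIS names, field names and records below are all invented; the slides do not specify how the mapping metadata is represented.

```python
# Hypothetical mapping metadata: common field name -> local field name, per CRIS.
SCHEMA_MAP = {
    "cris_a": {"title": "proj_title", "person": "lead"},
    "cris_b": {"title": "titel", "person": "persoon"},
}

# Invented local records, each stored under its own CRIS's field names.
RECORDS = {
    "cris_a": [{"proj_title": "Grid workflow", "lead": "Jeffery"}],
    "cris_b": [
        {"titel": "Metadata study", "persoon": "Jeffery"},
        {"titel": "Other project", "persoon": "Priol"},
    ],
}


def to_common(cris, row):
    """Translate a local record into the common field names."""
    inverse = {local: common for common, local in SCHEMA_MAP[cris].items()}
    return {inverse[k]: v for k, v in row.items() if k in inverse}


def query_all(common_field, value):
    """Run one query, phrased in common terms, over every CRIS."""
    hits = []
    for cris, rows in RECORDS.items():
        local_field = SCHEMA_MAP[cris].get(common_field)
        if local_field is None:
            continue  # this CRIS has no equivalent field
        for row in rows:
            if row.get(local_field) == value:
                hits.append({"source": cris, **to_common(cris, row)})
    return hits
```

The mapping table is the metadata that lets retrieval be set up automatically: `query_all("person", "Jeffery")` reaches both systems without the user knowing either local schema.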
Metadata and Data Exchange Standards • Data Exchange Standards • Needed not just for data (file) exchange • Also for returning results of a retrieval from one CRIS to another in a form (syntax, semantics) that is processable • Metadata plus dataset • Note data exchange standards used extensively in e-business, banking, insurance, medical, engineering, research areas
The Key: Metadata and Data Exchange Standards • Nirvana is • Formal metadata (machine understandable) • Query: Metadata describing CRIS resources to improve queries • Answer: Metadata attached to Query result files (data exchange) so the receiving CRIS or user can understand the output
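The 'Answer' point can be illustrated with a self-describing result file, assuming JSON as the exchange syntax. The envelope layout (`metadata`, `fields`, `row_count`, `data`) is an assumption made for this sketch and is not taken from CERIF or any other exchange standard.

```python
import json


def package_result(rows, field_descriptions, source):
    """Attach metadata to a query result before sending it to another CRIS."""
    envelope = {
        "metadata": {
            "source": source,
            "fields": field_descriptions,
            "row_count": len(rows),
        },
        "data": rows,
    }
    return json.dumps(envelope, sort_keys=True)


def unpack_result(payload):
    """Check the data against its own metadata before using it."""
    envelope = json.loads(payload)
    meta, rows = envelope["metadata"], envelope["data"]
    if meta["row_count"] != len(rows):
        raise ValueError("row count does not match metadata")
    described = set(meta["fields"])
    for row in rows:
        undescribed = set(row) - described
        if undescribed:
            raise ValueError(f"fields without metadata: {sorted(undescribed)}")
    return meta, rows


# Invented sample result, for illustration only.
payload = package_result(
    [{"name": "A. Rossi", "refereed_publications": 12}],
    {"name": "researcher name",
     "refereed_publications": "count of refereed publications to date"},
    source="cris_a",
)
meta, rows = unpack_result(payload)
```

Because the descriptions travel with the data, the receiving CRIS can reject a file it cannot interpret instead of silently misreading it.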
Agenda • Introduction • The R&D Process: Recording • The Key: Metadata and Data Exchange Standards • Workflow on the GRIDs surface • Conclusion
Workflow on the GRIDs surface • GRIDs ‘surface’ provides • Computational capabilities of GRID • Information presentation capabilities of WWW • Information management capabilities • But not yet environment for workflow
The GRIDs Architecture • Knowledge Layer • Information Layer • Computation / Data Layer [Diagram arrows: Data to Knowledge; Control]
The GRIDs Architecture • Particle Physics Application • Genomics Application • Environmental Application • E-Business Application [Diagram arrows: Data to Knowledge; Control]
The GRIDs Environment: A Possible Architecture • U: User (Um: User Metadata, Ua: User Agent) • S: Source (Sm: Source Metadata, Sa: Source Agent) • R: Resource (Rm: Resource Metadata, Ra: Resource Agent) • brokers
A Brief History of GRIDs • 1G: custom-made architecture machines to user • Pioneering metacomputing • 2G: proprietary standards and interfaces • I-WAY, GLOBUS, UNICORE, CONDOR, LEGION, AVAKI • 2.5G: added in FTP, SRB, LDAP, AccessGRID • 3G: adopted W3C concepts for open interfaces – OGSA / OGSI: note especially OGSA/DAI • But built on 2.G foundations [Diagram labels: e-Science Apps; e-Science R&D]
But….. • This comes nowhere near the requirements as originally defined for GRIDs • Too low-level (programmer not end-user level) • Insufficient representativity • Insufficient expressivity • Insufficient resilience • Insufficient dynamic flexibility
So….. • The US GRID is metacomputing plus extensions • In 2002 improved with OGSA using W3C Web Services ideas • European position is that GRID architecture (GLOBUS or even UNICORE) is the wrong starting point for the European vision
And….. • EC persuaded of importance of GRIDs • Started in IST/Environment (early 2000) with IT architectural framework for FP6 projects • Set up GRID Unit under Wolfgang Boch (late 2002) • January 2003: large workshop (GRID Unit) • (~ 240 participants) • Keynotes: • Thierry Priol (INRIA, FR) • Domenico Laforenza (CNR, IT) • Keith Jeffery (CCLRC, UK)
NGG Requirements • Transparent and reliable • Open to wide user and provider communities • Pervasive and ubiquitous • Secure and provide trust across multiple administrative domains • Easy to use and to program • Persistent • Based on standards for software and protocols • Person-centric • Scalable • Easy to configure and manage [Annotations: WWW meets some of these; 2.5G or even 3G GRID basically meets none of these]
NGG • NGG1: 200301-200306 • Brought together visionary experts • Defined properties required and research agenda to achieve them • NGG2: 200401-200407 • Updated NGG1 vision in the light of funded projects and evolving requirements and technology • NGG3 200509- • http://www.cordis.lu/ist/grids/pub-report.htm
GRIDs Vision and Requirements (1) • a user interacts with the GRIDs environment intelligently • such that the GRIDs environment proposes a 'deal' to the end-user to satisfy her request • which the user can then decide to execute - involving multiple resources of computation, information, detectors (for new data collection), interactions with other users through various communication devices etc.
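A toy sketch of the 'deal' interaction, assuming a broker that knows each resource's kind and a cost taken from its resource metadata. The `Resource` type, the cost unit and the cheapest-first selection rule are all assumptions made for illustration; the vision itself does not prescribe how a deal is composed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str    # e.g. "computation", "information", "detector"
    cost: float  # hypothetical cost unit from resource metadata


def propose_deal(needed_kinds, resources):
    """Pick the cheapest available resource for each needed kind.
    Returns None when any needed kind cannot be satisfied."""
    deal = {}
    for kind in needed_kinds:
        candidates = [r for r in resources if r.kind == kind]
        if not candidates:
            return None
        deal[kind] = min(candidates, key=lambda r: r.cost)
    return deal


def execute_if_accepted(deal, accept):
    """Run the deal only if the user's decision function accepts it."""
    if deal is None or not accept(deal):
        return []
    return [f"run {r.name}" for r in deal.values()]


# Invented resources, for illustration only.
RESOURCES = [
    Resource("cluster-1", "computation", 5.0),
    Resource("cluster-2", "computation", 3.0),
    Resource("cris-a", "information", 1.0),
]

deal = propose_deal(["computation", "information"], RESOURCES)
steps = execute_if_accepted(
    deal, accept=lambda d: sum(r.cost for r in d.values()) <= 10
)
```

The environment proposes and the user decides: nothing runs unless the `accept` function, standing in for the end-user, agrees to the proposed deal.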
GRIDs Vision and Requirements (2) • interoperation as a seemingly homogeneous 'surface' over a range of devices from smart dust through detectors to embedded systems (including controllers), handhelds, laptops, desktops, departmental servers, corporate servers and supercomputers. • the 'surface' depends on self-* (self-managing, self-repairing, self-tuning...) capability across arbitrary and dynamic collections of (large numbers of) nodes to give scalability, performance, reliability, access, security, privacy and other features.
NGG1 • NGG1 Properties Required: • Transparent and reliable • Open to wide user and provider communities • Pervasive and ubiquitous • Secure and provide trust across multiple administrative domains • Easy to use and to program • Persistent • Based on standards for software and protocols • Person-centric • Scalable • Easy to configure and manage