SDSC Matrix Project: A Passionate Workflow towards Scientific Perfection. Arun Jagatheesan Architect and Team Lead, SDSC Matrix Project San Diego Supercomputer Center (SDSC). Super Computing Conference 2003, Exhibit at SDSC Booth November 18 Phoenix, Arizona. Credit / Acknowledgements.
SDSC Matrix Project: A Passionate Workflow towards Scientific Perfection Arun Jagatheesan Architect and Team Lead, SDSC Matrix Project San Diego Supercomputer Center (SDSC) Super Computing Conference 2003, Exhibit at SDSC Booth November 18 Phoenix, Arizona
Credit / Acknowledgements • Participants • Allen Ding • Jonathan Weinberg • Lucas Gilbert • Reena Mathew • Xi Cynthia Sheng • Well Wishers (They had the Matrix red pill) • Reagan Moore ( & SRB Team) • SDSC DAKS (Big Team, Big Support !) • Kim Baldridge • YOU • Sponsors • NSF GriPhyN, NSF SCEC, NPACI REU, NIH BIRN
Talk Outline • Workflow • Requirements for Grid Workflow • Data Grid Language • Matrix as a WfMS • Demonstrations • XQuery (CDL) • External Status Requests
Workflow • Automation of business process • Whole or Part • Documents/Information or tasks passed between participants • Based on a set of procedural rules • Scientific Computing Workflow • Computational research process as pathways or pipelines • Gather data, cleanse data, apply different combinations of transformations, simulations, visualization, publish in digital library, archive data, get Nobel prize (makes us also happy :-)
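The pipeline view of a scientific workflow (gather, cleanse, transform, and so on) can be sketched as a chain of steps where each step consumes the previous step's output. This is only an illustration: the step names `cleanse` and `transform` and the helper `run_pipeline` are hypothetical and are not part of Matrix or DGL.

```python
from functools import reduce

def cleanse(data):
    # Drop missing readings (represented here as None).
    return [x for x in data if x is not None]

def transform(data):
    # Stand-in for a real transformation or simulation step.
    return [x * 2.0 for x in data]

def run_pipeline(data, steps):
    # Feed the output of each step into the next one, in order.
    return reduce(lambda acc, step: step(acc), steps, data)

result = run_pipeline([1.0, None, 3.0], [cleanse, transform])
print(result)  # [2.0, 6.0]
```

Different combinations of transformations are then just different `steps` lists applied to the same gathered data.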
What is needed for Grid Workflow • Yet Another Standard XML language • Describe import and export of Workflow in Grid • Peer-to-Peer Collaboration for Workflows • Looping Structures • Scientific Workflows • Iterations over millions of data sets • Generic System • Multiple Domains: Bio, Physics, Digital Libraries… • Dynamic Status Queries • Dynamic and robust execution based on prior executions • Grid Service Handles to Query, Publish or Subscribe • XQuery subset - Uniform query for data and process You too Arun? (Becoming Anti-standards by issuing a new standard) – But, we need it
What is needed for Grid Workflow II • Granular Metadata • Context-based workflow, with control-based constructs • Context Based flows • Apart from being just Control based • Sequential, Parallel, Multiple Split, Conditional, … • Dynamic rules (ECA rules) to update milestones • Grid Data Types • Support to have Schema to describe data sets, collections • Inbuilt support to describe Grid Locations
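An ECA (event-condition-action) rule that updates a milestone can be sketched as below. This is a minimal illustration of the general ECA pattern, not DGL syntax: the class `EcaRule`, the function `fire`, the event name `step_finished`, and the context keys are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class EcaRule:
    event: str                                  # event name that triggers the rule
    condition: Callable[[Dict], bool]           # checked against the workflow context
    action: Callable[[Dict], None]              # run only when the condition holds

def fire(rules: List[EcaRule], event: str, context: Dict) -> int:
    """Apply every rule registered for `event` whose condition holds."""
    fired = 0
    for rule in rules:
        if rule.event == event and rule.condition(context):
            rule.action(context)
            fired += 1
    return fired

def mark_ingest_done(ctx: Dict) -> None:
    # Hypothetical milestone update.
    ctx["milestone"] = "ingest-complete"

rules = [EcaRule("step_finished", lambda ctx: ctx["files_left"] == 0, mark_ingest_done)]
context = {"files_left": 0, "milestone": "started"}
fire(rules, "step_finished", context)
print(context["milestone"])  # ingest-complete
```

Because the condition reads the context rather than the position in the flow, the same rule works whether the surrounding flow is sequential, parallel, or conditional.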
Grid Workflow Process I Workflow Description Data Grid Language End User
Grid Workflow Process II (diagram labels: Abstract Workflow, Data Grid Language, Planner, Concrete Workflow) Post-presentation comment (based on questions asked): We are not implementing this planner now. We are implementing the DGL parser and the DGL query interpreter in the Matrix server to manage the workflow state for grid workflows. We are also implementing the protocols for the P2P workflow on the grid.
Grid Workflow Process III Grid Workflow Processor Concrete Workflow Export Workflow to Matrix P2P
Matrix Server • Acts as a Peer in WfMS P2P System * • Processes Data Grid Requests • Can maintain state and manage process steps • Can invoke SRB data grid processes, OGSA-Services, WSDL Services (OGSA Threads to be implemented) • Implemented as an Open-source Project * Being Designed/developed as of the presentation date
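The idea of a server that keeps per-step state and hands each step to the right kind of agent can be sketched as follows. This is not the Matrix implementation: `MatrixServerSketch`, its methods, and the stub agents are hypothetical, and the stubs do not call any real SRB, OGSA, or WSDL API.

```python
from typing import Callable, Dict

class MatrixServerSketch:
    """Toy request processor: tracks step state and dispatches to agents."""

    def __init__(self) -> None:
        self._agents: Dict[str, Callable[[str], str]] = {}
        self._state: Dict[str, str] = {}   # step id -> status

    def register_agent(self, kind: str, agent: Callable[[str], str]) -> None:
        self._agents[kind] = agent

    def run_step(self, step_id: str, kind: str, payload: str) -> str:
        self._state[step_id] = "running"
        try:
            # An unknown agent kind raises KeyError and is recorded as a failure.
            result = self._agents[kind](payload)
        except Exception:
            self._state[step_id] = "failed"
            raise
        self._state[step_id] = "done"
        return result

    def status(self, step_id: str) -> str:
        return self._state.get(step_id, "unknown")

server = MatrixServerSketch()
server.register_agent("srb", lambda payload: f"srb:{payload}")    # stub agent
server.register_agent("wsdl", lambda payload: f"wsdl:{payload}")  # stub agent
print(server.run_step("s1", "srb", "ingest"))  # srb:ingest
print(server.status("s1"))                     # done
```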
Implementation Status • Data Grid Language Schema for basic workflow constructs, Data Grid Operations • Matrix agents for executing data grid requests • Basic process pipeline management • Data Grid Language: Rules, Embedded query, OGSA operations to be added • Matrix: P2P, export/sharing of workflow to be added
SDSC Matrix Architecture (diagram labels): SOAP Service Wrapper Abstraction Event Publish Subscribe, Notification JMS Messaging System JAXM Wrapper OGSA RPC-Style for SOAP Matrix Data Grid Request Processor Status Query Handler Pipeline Query Processor Transaction Handler Flow Handler and Execution Manager XQuery Processor Termination Handler Data flow pipeline Meta data Manager Matrix Agent Abstraction Persistence (Store) Abstraction Other Data Services OGSA Agent WSDL Agent SRB Agents JDBC In Memory Store
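The "Persistence (Store) Abstraction" with a database-backed store and an in-memory store behind it can be sketched like this. It is an illustration of the abstraction only: `StateStore`, `InMemoryStore`, and `SqliteStore` are hypothetical names, and Python's `sqlite3` stands in for the JDBC-backed store named in the diagram.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional

class StateStore(ABC):
    """Interface the rest of the server would code against."""

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...

class InMemoryStore(StateStore):
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

class SqliteStore(StateStore):
    # Stand-in for a relational store; the table layout is an assumption.
    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS state (k TEXT PRIMARY KEY, v TEXT)"
        )

    def put(self, key: str, value: str) -> None:
        self._db.execute(
            "INSERT OR REPLACE INTO state (k, v) VALUES (?, ?)", (key, value)
        )
        self._db.commit()

    def get(self, key: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT v FROM state WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None

for store in (InMemoryStore(), SqliteStore()):
    store.put("flow1", "running")
    store.put("flow1", "done")      # later writes overwrite earlier ones
    print(store.get("flow1"))       # done
```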
Data Grid Request (DReq) • Datagrid Request • Asynchronous requests for data/process-flow in datagrids • Requests are either a Transaction or a Status Query • Each Transaction consists of one or more Flows • Each Flow consists of one or more datagrid operations • Datagrid operation = data transformation or data query • A flow can be executed sequentially or in parallel
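The containment described above (a Transaction holds Flows, a Flow holds operations, and a Flow runs sequentially or in parallel) can be sketched with plain data classes. This is not the DGL schema: the names `Flow`, `Transaction`, `run_flow`, and `run_transaction` are assumptions, operations are modelled as zero-argument callables, and the sketch runs synchronously even though real requests are asynchronous.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Flow:
    operations: List[Callable[[], str]]   # each operation: a transformation or a query
    parallel: bool = False                # False = sequential execution

@dataclass
class Transaction:
    flows: List[Flow] = field(default_factory=list)

def run_flow(flow: Flow) -> List[str]:
    if flow.parallel:
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(op) for op in flow.operations]
            # Collect in submission order so results line up with operations.
            return [f.result() for f in futures]
    return [op() for op in flow.operations]

def run_transaction(tx: Transaction) -> List[List[str]]:
    # Flows are run one after another in this sketch.
    return [run_flow(flow) for flow in tx.flows]

tx = Transaction([
    Flow([lambda: "query", lambda: "transform"]),
    Flow([lambda: "a", lambda: "b"], parallel=True),
])
print(run_transaction(tx))  # [['query', 'transform'], ['a', 'b']]
```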
Data Grid Request Remind me to show the new Matrix 3.0 Schema
Data Grid Response • Datagrid Response • Either Transaction Acknowledgement or Status Response • Status Response contains the results of a Transaction • Response could be received at any granular level • Status response is used for coordination of flows and inter-process notifications
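Receiving a status response "at any granular level" can be pictured as querying a tree of transaction, flows, and operations by id. The sketch below is an assumption-laden illustration: `Node`, `rollup`, `status_query`, the state names, and the roll-up rules (any failure wins, all done means done, all pending means pending, otherwise running) are invented for the example and are not taken from the Matrix schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    node_id: str
    state: str = "pending"                 # only meaningful for leaves (operations)
    children: List["Node"] = field(default_factory=list)

def rollup(node: Node) -> str:
    """Status of a node: its own state for a leaf, else derived from children."""
    if not node.children:
        return node.state
    states = {rollup(child) for child in node.children}
    if "failed" in states:
        return "failed"
    if states == {"done"}:
        return "done"
    if states == {"pending"}:
        return "pending"
    return "running"

def status_query(root: Node, node_id: str) -> Optional[str]:
    """Depth-first search for `node_id`; returns None if it does not exist."""
    if root.node_id == node_id:
        return rollup(root)
    for child in root.children:
        found = status_query(child, node_id)
        if found is not None:
            return found
    return None

tx = Node("tx1", children=[
    Node("flow1", children=[Node("op1", "done"), Node("op2", "running")]),
    Node("flow2", children=[Node("op3", "pending")]),
])
print(status_query(tx, "op1"))    # done
print(status_query(tx, "flow1"))  # running
print(status_query(tx, "tx1"))    # running
```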
Data Grid Response (DRes) Remind me to show the new Matrix 3.0 Schema
Conclusion • Data Grid Language • Grid Workflow Description • Basic Stuff or foundation ready • Solid Design to handle more complex stuff • Workflow Modeling not investigated (like Ptolemy?) • Matrix Server Implementation • Create, Query, Manage Grid Workflows • OGSA, Rules, P2P to be implemented • More Support will expedite R&D
Demos ? He is trying to escape. Where are the Demos?