Schematic Description of Grid Exceptions Arun Jagatheesan San Diego Supercomputer Center (SDSC) & High Energy Physics Group, University of Florida
Outline • Why describe? • How to describe? • What to describe? • Who / MIB (Men Involved Behind the scene)? • RFS (Request for Suggestions) • NOTE: Still in learning phase…
Data Grid Management System (DGMS) • Before we start, why we are interested: data and knowledge management in grids • Data Grid Management System as a set of services • Credit (DGMS): • Reagan Moore (SDSC) – Knowledge Grids • Jim Gray (Microsoft) – VLDB Suggestions • Arcot Rajasekar (SDSC) – Data Grids • You (thanks in advance for your suggestions in the workshop) A question session will follow. Take notes.
What: Data Grid and DGMS • Data Grid: A logical view of a collection of heterogeneous data spread across virtual organization(s), providing transparent access irrespective of data location, storage media, storage format and data identifier (name) • DGMS: A system to manage the relationship between data and the events associated with the data grid workflow, to help in automated data and knowledge discovery
Descriptions Required • DGMS required Grid Data Flow – distributed operations on the grid specified using XML documents • Also required: • Description of exceptions in inter-grid and intra-grid environments • Description of handling cases, which might change dynamically
What's ahead for the Grid • “Grid” as it was meant to be. Grid in a Cell??
Exception Handling in Grid [Diagram: Service Requestor → Grid / Executor → Service Providers. Request: "Store file1.xyz in data grid"]
Exception Handling in Grid – User [Diagram: Service Requestor → Grid / Executor → Service Providers. User's handling: "Just send me an e-mail of failure"]
Exception Handling in Grid – Provider [Diagram: Service Requestor → Grid / Executor → Service Providers. Archival system's handling: "Maintenance required for tape robot, or try after 1 working day"]
Exception Handling in Grid – System [Diagram: Service Requestor → Grid / Executor → Service Providers. Mission-critical grid's handling: "Need to find another Service Provider"]
Exception Handling • Grid User Specified • Service Provider Specified • Grid/System Specified • Two questions: • How do we dynamically specify the different exception handling associated with the same exception? • Who does the real exception handling?
Customized Exception Handling Try (store file1.xyz in data grid) catch any exception and handle • Using Arun_Out_of_Office_Handler (User) • Using Vegas_HPSS_Handler (Service Provider) • Using GriPhyN_Level1_Handler (System)
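The try/catch idea on this slide could be sketched roughly as below. This is only an illustration under stated assumptions, not the DGMS implementation: `GridException`, `store_in_data_grid`, `run_with_handlers` and the registry layout are hypothetical names, and the behaviour of each handler is guessed from the earlier scenario slides. Only the three handler names come from the slide itself.

```python
from typing import Callable, Dict, List


class GridException(Exception):
    """Hypothetical exception type carrying a namespaced fault name."""

    def __init__(self, fault: str, message: str = ""):
        super().__init__(message or fault)
        self.fault = fault


Handler = Callable[[GridException], str]


# Handler names are from the slide; what they do here is an assumption.
def arun_out_of_office_handler(exc: GridException) -> str:
    return f"user: e-mail failure notice for {exc.fault}"


def vegas_hpss_handler(exc: GridException) -> str:
    return f"provider: retry {exc.fault} after 1 working day"


def griphyn_level1_handler(exc: GridException) -> str:
    return f"system: find another service provider for {exc.fault}"


# One handler per party; entries can be replaced at run time.
handlers: Dict[str, Handler] = {
    "user": arun_out_of_office_handler,
    "provider": vegas_hpss_handler,
    "system": griphyn_level1_handler,
}


def store_in_data_grid(filename: str) -> None:
    """Stand-in for the grid operation; it always fails in this sketch."""
    raise GridException("condor:FileNotFound", f"cannot store {filename}")


def run_with_handlers(operation: Callable[[str], None], arg: str,
                      registry: Dict[str, Handler]) -> List[str]:
    """Run the operation; on failure, let every registered party handle it."""
    try:
        operation(arg)
        return []
    except GridException as exc:
        return [handler(exc) for handler in registry.values()]


results = run_with_handlers(store_in_data_grid, "file1.xyz", handlers)
```

Because the registry is an ordinary dictionary, swapping `handlers["user"]` for another callable changes the user's handling of the same exception without touching the provider or system entries.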
Exception Name spaces
<gdfl:exception xmlns:condor="http://www.cs.wisc.edu/condor/exceptions">
  <faultHandler name="ArunFirstHandler">
    <fault value="condor:FileNotFound">
      <action>
        <mail name="faultHandlerMail">
          <To>arun@sdsc.edu</To>
          <body>Sum of all Fears - SOS Grid</body>
        </mail>
        <log>Fatal Error: Mercury Rising</log>
      </action>
      <flow>die</flow>
    </fault>
  </faultHandler>
</gdfl:exception>
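A fault-handler document like this one could be read into a lookup table along the following lines. This is a sketch using Python's standard `xml.etree.ElementTree`, not an existing GDFL parser. The slide does not give the URI bound to the `gdfl` prefix, so `urn:example:gdfl` is a placeholder assumption, and `load_fault_table` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Placeholder: the real URI for the gdfl prefix is not given on the slide.
GDFL_NS = "urn:example:gdfl"

# Well-formed copy of the handler document, with the gdfl prefix declared.
FAULT_DOC = f"""<gdfl:exception xmlns:gdfl="{GDFL_NS}"
    xmlns:condor="http://www.cs.wisc.edu/condor/exceptions">
  <faultHandler name="ArunFirstHandler">
    <fault value="condor:FileNotFound">
      <action>
        <mail name="faultHandlerMail">
          <To>arun@sdsc.edu</To>
          <body>Sum of all Fears - SOS Grid</body>
        </mail>
        <log>Fatal Error: Mercury Rising</log>
      </action>
      <flow>die</flow>
    </fault>
  </faultHandler>
</gdfl:exception>"""


def load_fault_table(xml_text: str) -> dict:
    """Map each fault name to the actions its handler declares."""
    root = ET.fromstring(xml_text)
    table = {}
    for handler in root.findall("faultHandler"):
        for fault in handler.findall("fault"):
            table[fault.get("value")] = {
                "handler": handler.get("name"),
                "mail_to": fault.findtext("action/mail/To"),
                "log": fault.findtext("action/log"),
                "flow": fault.findtext("flow"),
            }
    return table


table = load_fault_table(FAULT_DOC)
entry = table["condor:FileNotFound"]
```

Note that ElementTree treats the attribute value `condor:FileNotFound` as a plain string; resolving the `condor` prefix against its namespace URI would need an extra step that this sketch leaves out.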
The Grid Scenario Again… [Diagram: Service Requestor → Grid / Executor → Service Providers (gator.edu, go.org). Request: "Do this on the grid as mentioned in this VDL document"]
The Grid Scenario Again… [Diagram: Service Requestor → Grid / Executor → Service Providers (gator.edu, go.org). Requestor: "Now where the hell did the grid service crash? If I knew, I could handle it appropriately"]
So what we need… • Namespaces describing the taxonomies of error handling with respect to the service provider, the type of the error (generic/specific), and the policies in the grid • A description of exactly what happened. A bare message such as “Something bad happened during the service invocation” is not enough to recover from the failure.
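One way to picture such a description is a small record that names the taxonomy, the provider and the failed operation instead of a bare message. The sketch below is an assumption throughout: the `GridFault` class, its fields, and the `choose_action` policy are hypothetical, and the choice of gator.edu as the failing host is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridFault:
    """Hypothetical structured description of a grid exception."""
    namespace: str   # taxonomy the fault name belongs to
    name: str        # fault name within that taxonomy
    provider: str    # service provider where the failure occurred
    operation: str   # operation that was being executed
    generic: bool    # True for generic faults, False for provider-specific
    detail: str      # what exactly happened

    @property
    def qualified_name(self) -> str:
        # "{namespace}name" form, as commonly used for XML qualified names
        return f"{{{self.namespace}}}{self.name}"


def choose_action(fault: GridFault) -> str:
    """Toy policy: the more the description says, the better the recovery."""
    if fault.generic:
        return "log and notify user"
    return f"resubmit '{fault.operation}' to a provider other than {fault.provider}"


fault = GridFault(
    namespace="http://www.cs.wisc.edu/condor/exceptions",
    name="FileNotFound",
    provider="gator.edu",          # assumed failing host, for illustration
    operation="store file1.xyz",
    generic=False,
    detail="input file missing on the execute node",
)
action = choose_action(fault)
```

With the provider and operation recorded, the requestor can tell where the service failed and pick a recovery step, which a message-only exception does not allow.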
Sum up • VDL/DAG submission (for each operation on the grid) • The user application has its own customized error handler, based on the namespaces/categories of error types • The service provider has its own error handler, based on the namespaces/categories of its own domain, which are shared with the grid • The grid (system) has its own error handler, based on grid policies and efficiency concerns All of these error handlers can change dynamically.
Advantages • Structured Handling based on profiles • Handled by the respective providers • Profiles can be dynamically changed • Suitable for inter-grid and intra-grid scenarios
Disadvantages • Extra processing • Definition and categorization of grid errors is required • New mechanisms to parse and handle these exception documents (probably in XML) are required
Summary • Grid exceptions involve the service requestor, the service provider and the grid policy, all of which can change dynamically • A generic classification of grid errors, which could be extended later, is required • Error types and error-handling descriptions based on the service provider are required to handle exceptions more efficiently