Expressing Workflows Using Grid Enabled Computer Algebra Systems
Palaiseau, France, January 19-21, 2009
Alexandru CÂRSTEA, Georgiana MACARIU, Marc FRÎNCU
Introduction
Existing Computer Algebra Systems (CAS) and packages:
• General purpose
• Special purpose
• Packages
Types of possible interactions:
• CAS to CAS interaction
• CAS to Web/Grid service interaction
• External components to CAS interaction
The best solution to expose CAS functionality is through Web/Grid services.
Advantages of using Web services:
• Cross platform support
• Standard mechanism for advertising the interface
• Standard mechanism to describe data types
• Compatibility with firewall policies
Advantages introduced by WSRF-Grid services (in addition to those of Web services):
• Standard mechanisms to describe resources
• Standard technologies to access functionality
• Built-in security features
Usage Sketch
• Compound computation requiring functionality from several CASs
• The client may sometimes require a certain CAS to solve a certain part of the problem
• Memory/computational power details may be required
• Asynchronous calls, as the calls may be issued from a portable device (laptops, PDAs, etc.)
• Provenance information may be required later
Standard Interface - CAS Servers
The standard interface of CAS Servers offers:
• an operation that receives the computation call; calls are formulated as OpenMath objects
• configurable callback functionality, enabled by specifying the callback address
• informational services
• management services
Management capabilities:
• Decide where to advertise the exposed functionality
• Choose the functionality that must be available
• Enable provenance
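The asynchronous, callback-based style of call described above can be sketched as follows. This is only an illustration under stated assumptions: `submit_call` and `on_result` are hypothetical names, the "CAS" is a stand-in that echoes its input, and a Python function plays the role of the callback address.

```python
import threading
from typing import Callable


def submit_call(om_call: str, callback: Callable[[str, str], None]) -> threading.Thread:
    """Accept a computation call and return immediately (hypothetical interface).

    The result is delivered later to the caller-supplied callback, which
    stands in for the callback address mentioned on the slide.
    """
    def worker() -> None:
        # Stand-in for forwarding the OpenMath call to a CAS and waiting for it.
        result = f"result-of({om_call})"
        callback(om_call, result)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


received: dict[str, str] = {}


def on_result(call: str, result: str) -> None:
    # In a deployed system this would be a network endpoint, not a function.
    received[call] = result


handle = submit_call("<OMOBJ>...</OMOBJ>", on_result)
handle.join()  # only for the demo; a real client would not block here
print(received)
```

The client is free to disconnect between submitting the call and receiving the callback, which is the point of the design for portable devices.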
Models for CAS Integration
• By implementing SCSCP
  • Communication between the Web service wrapper and the CAS is achieved through TCP/IP calls
  • Semantic meaning of formulae is provided by OpenMath support through common OpenMath CDs
• By plain string messages exchanged using various technologies (where available)
  • Communication between the Web service wrapper and the CAS is achieved through files, data pipes or TCP/IP
  • No support for semantic meaning; the messages are meaningful only in the context of the targeted CAS
Note. Both types of messages are sent to the Web service wrapper as strings encapsulated in SOAP messages.
Example of Formulated Calls
Method Call
<OMOBJ>
  <OMATTR>
    <OMATP>
      <OMS cd="scscp1" name="call_ID" />
      <OMSTR>alexk_9055</OMSTR>
    </OMATP>
    <OMA>
      <OMS cd="scscp1" name="procedure_call" />
      <OMA>
        <OMS cd="SCSCP_transient_1" name="WS_factorial" />
        <OMI>3</OMI>
      </OMA>
    </OMA>
  </OMATTR>
</OMOBJ>
SCSCP Call
<OMOBJ>
  <OMATTR>
    <OMATP>
      <OMS cd="scscp1" name="call_ID" />
      <OMSTR>alexk_9055</OMSTR>
    </OMATP>
    <OMA>
      <OMS cd="scscp1" name="procedure_call" />
      <OMA>
        <OMS cd="SCSCP_transient_1" name="WS_factorial" />
        <OMI>3</OMI>
      </OMA>
    </OMA>
  </OMATTR>
</OMOBJ>
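A call with the shape shown above could be assembled programmatically rather than written by hand. The sketch below uses only Python's standard `xml.etree.ElementTree`; the helper name `build_procedure_call` is an assumption made for illustration and is not part of any SCSCP library.

```python
import xml.etree.ElementTree as ET


def build_procedure_call(call_id: str, procedure: str, argument: int) -> str:
    """Build an OpenMath object shaped like the procedure call on the slide."""
    omobj = ET.Element("OMOBJ")
    omattr = ET.SubElement(omobj, "OMATTR")

    # Attribution pair: the call identifier.
    omatp = ET.SubElement(omattr, "OMATP")
    ET.SubElement(omatp, "OMS", cd="scscp1", name="call_ID")
    ET.SubElement(omatp, "OMSTR").text = call_id

    # Outer application: procedure_call applied to the actual request.
    outer = ET.SubElement(omattr, "OMA")
    ET.SubElement(outer, "OMS", cd="scscp1", name="procedure_call")

    # Inner application: the exposed procedure applied to one integer argument.
    inner = ET.SubElement(outer, "OMA")
    ET.SubElement(inner, "OMS", cd="SCSCP_transient_1", name=procedure)
    ET.SubElement(inner, "OMI").text = str(argument)

    return ET.tostring(omobj, encoding="unicode")


xml_text = build_procedure_call("alexk_9055", "WS_factorial", 3)
print(xml_text)
```

Building the tree through an XML API avoids quoting and nesting mistakes that are easy to make when concatenating strings.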
Composition Functionality (I)
• Allows composing the functionality of CASs of heterogeneous types installed on different machines
• Offers support for compound computations
• Collaboration: scientific discovery demands tools that orchestrate the steps of a computation
• Support for reproducibility by storing meta information about the execution of the workflow
• Easy to use
• Start/Stop/Resume + steering
• Inspect status and values obtained on the fly
Composition Functionality (II)
Web service interaction/composition patterns:
• sequence pattern
• parallel split pattern
• multiple instances without synchronization
• conditional patterns:
  • exclusive choice pattern
  • multi-choice pattern
  • deferred choice pattern
• conversational patterns:
  • request/reply pattern
  • one way invocation
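Three of the patterns listed above (sequence, parallel split, exclusive choice) can be illustrated with small Python combinators. This is a sketch of the patterns themselves, not of the platform: the names `sequence`, `parallel_split` and `exclusive_choice` are hypothetical, local functions stand in for remote service calls, and the parallel split shown here also waits for all branches, so it includes the join.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

Task = Callable[[Any], Any]


def sequence(*tasks: Task) -> Task:
    """Run tasks one after another, feeding each result to the next task."""
    def run(value: Any) -> Any:
        for task in tasks:
            value = task(value)
        return value
    return run


def parallel_split(*tasks: Task) -> Task:
    """Run all tasks concurrently on the same input and collect the results."""
    def run(value: Any) -> list:
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(task, value) for task in tasks]
            return [future.result() for future in futures]
    return run


def exclusive_choice(condition: Callable[[Any], bool], if_true: Task, if_false: Task) -> Task:
    """Run exactly one of two branches, depending on the condition."""
    def run(value: Any) -> Any:
        return if_true(value) if condition(value) else if_false(value)
    return run


def double(x: int) -> int:
    return 2 * x


def increment(x: int) -> int:
    return x + 1


workflow = sequence(
    parallel_split(double, increment),   # 5 -> [10, 6]
    sum,                                 # [10, 6] -> 16
    exclusive_choice(lambda x: x > 10, double, increment),  # 16 -> 32
)
print(workflow(5))  # prints 32
```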
General Overview of the Architecture
• CAGS: Computer Algebra to Grid Services
• AGSSO: Architecture for Grid Symbolic Services Orchestration
Behind the Scenes
• SA Platform
  - manages the scheduling of tasks that are ready to be scheduled
  - notifies the Client Manager (Workflow Manager) that the call can be started
• A computational element may hide a single machine or a cluster hierarchy (SymGridPar)
Workflow Description at Client Side
• Subset of the BPEL language
  • Sequence: <sequence>…</sequence>
  • Parallel: <parallel>…</parallel>
  • Multi-choice: <multichoice>…</multichoice>
  • If-Else: <if>…<else>…</if>
  • Foreach: <foreach>…</foreach>
  • While: <while>…</while>
  • Variable declaration: <newvariable>
  • Invoke: <invoke invokeID="…">…</invoke>
• Higher level constructs implemented directly through libraries
Example - A Rhomb Workflow Based on a GAP Package
startWorkflow();
  startSequence();
    startParallel();
      v1:=invoke("KANT",Bernoulli(1000));
      v2:=invoke("KANT",Bernoulli(2000));
    endParallel();
    invoke("GAP",gcd(v1,v2));
  endSequence();
endWorkflow();
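One way to picture how start/end calls like those above could map onto the XML workflow description is a small builder that keeps a stack of open elements. The sketch below is an assumption about that mapping, not the package's implementation: `WorkflowBuilder` is a hypothetical class, the call is stored as a plain string rather than an OpenMath object, and `invoke` returns a `$invoke_N` placeholder for later data dependencies.

```python
import xml.etree.ElementTree as ET
from contextlib import contextmanager


class WorkflowBuilder:
    """Hypothetical builder that turns nested blocks into workflow XML."""

    def __init__(self) -> None:
        self.root = ET.Element("workflow")
        self._stack = [self.root]   # currently open elements, innermost last
        self._next_id = 0

    @contextmanager
    def block(self, tag: str):
        # Entering the block opens the element; leaving it closes the element.
        element = ET.SubElement(self._stack[-1], tag)
        self._stack.append(element)
        try:
            yield element
        finally:
            self._stack.pop()

    def invoke(self, cas: str, call: str) -> str:
        invoke_id = f"invoke_{self._next_id}"
        self._next_id += 1
        node = ET.SubElement(self._stack[-1], "invoke", invokeID=invoke_id)
        ET.SubElement(node, "casid").text = cas
        ET.SubElement(node, "call").text = call  # plain string, for brevity
        return "$" + invoke_id                   # placeholder for the result

    def to_xml(self) -> str:
        return ET.tostring(self.root, encoding="unicode")


builder = WorkflowBuilder()
with builder.block("sequence"):
    with builder.block("parallel"):
        v1 = builder.invoke("KANT", "Bernoulli(1000)")
        v2 = builder.invoke("KANT", "Bernoulli(2000)")
    builder.invoke("GAP", f"gcd({v1},{v2})")
print(builder.to_xml())
```

The two parallel branches followed by a single joining call give the workflow its rhomb shape.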
Symbolic Computation Problem – Ring Workflow
Workflow arising from the orbit enumeration algorithm:
• job server, sending procedure calls to the appropriate image service
• image service, computing the image of the point (there may be more than one, each sending procedure calls to the appropriate orbit service)
• orbit service, storing the orbit (may use hash tables; there may be more than one, each maintaining part of the table and sending a procedure call, if necessary, to the job server)
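The orbit enumeration algorithm behind this ring workflow can be sketched in a single process. In the sketch below the three services are collapsed into local data structures: a queue plays the job server, applying the generators plays the image service, and a hash set plays the orbit service. The function name `enumerate_orbit` and the toy action (addition modulo 12) are assumptions chosen for illustration.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Set


def enumerate_orbit(start: Hashable,
                    generators: Iterable[Callable[[Hashable], Hashable]]) -> Set[Hashable]:
    """Return every point reachable from `start` by applying the generators."""
    generators = list(generators)
    orbit = {start}           # orbit service: hash table of points seen so far
    jobs = deque([start])     # job server: points whose images are still needed
    while jobs:
        point = jobs.popleft()
        for generator in generators:
            image = generator(point)      # image service: compute the image
            if image not in orbit:        # new point: store it and schedule it
                orbit.add(image)
                jobs.append(image)
    return orbit


# Toy action: the map x -> x + 4 (mod 12) acting on the integers 0..11.
orbit = enumerate_orbit(1, [lambda x: (x + 4) % 12])
print(sorted(orbit))  # prints [1, 5, 9]
```

The cycle job server, image service, orbit service and back to the job server is what makes the distributed version a ring.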
Example – Arbitrary Number of Cycles, Based on a GAP Package
LoadPackage("SWIP");
SWIP_startWorkflow();
  SWIP_declareVariable(n,"0");
  SWIP_startWhile("$n<10");
    SWIP_startSequence();
      aVar1:=SWIP_invoke("GAP", "Int($n+1)", "$n");
      SWIP_startMultiChoice();
        SWIP_startChoiceBranch("$n<10");
          SWIP_invoke("GAP", "Int($n+1)", "$n");
        SWIP_endChoiceBranch();
      SWIP_endMultiChoice();
    SWIP_endSequence();
  SWIP_endWhile();
SWIP_endWorkflow();
Installation Requirements and Issues
CAS Server
• Globus 4.2.0
• PostgreSQL 8.0+
• run the script that creates and populates the database
• deploy the .gar file to Globus
• start the container
AGSSO Platform
• ActiveBPEL 4.1 (workflow engine)
• PostgreSQL 8.0+
• Tomcat
• run the script that creates and populates the database
• configure the service
• deploy the .war archive
Client
• Java package installation
• Setting a property file
Cancel/Pause/Resume
• Supported, with limitations:
  • The user cannot always make a successful call
  • No support for check-pointing; the task is simply restarted
  • Computation steering is only partially available
• Better support at the CAS level may/should provide:
  • A threaded server that is able to handle interrupts
  • Check-pointing and resume
An XML Description of a Sequence with Data Dependency
<workflow xmlns="http://ieat.ro">
  <sequence>
    <invoke invokeID="invoke_0">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9055</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMI>3</OMI>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
    <invoke invokeID="invoke_1">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9056</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMSTR>$invoke_0</OMSTR>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
  </sequence>
</workflow>
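The data dependency in this sequence is carried by the `$invoke_0` placeholder, which names the result of the earlier invoke. A minimal sketch of how an executor might resolve such placeholders is given below. It makes several simplifying assumptions: the calls are abbreviated to the inner `OMA` (the `OMATTR` wrapper and call IDs are omitted), `run_sequence` is a hypothetical name, and a local Python function stands in for the remote `WS_factorial` service.

```python
import math
import xml.etree.ElementTree as ET

NS = {"w": "http://ieat.ro"}

# Abbreviated form of the workflow above: only the inner application is kept.
WORKFLOW = """
<workflow xmlns="http://ieat.ro">
  <sequence>
    <invoke invokeID="invoke_0">
      <casid>GAP</casid>
      <call><OMA><OMS cd="SCSCP_transient_1" name="WS_factorial"/><OMI>3</OMI></OMA></call>
    </invoke>
    <invoke invokeID="invoke_1">
      <casid>GAP</casid>
      <call><OMA><OMS cd="SCSCP_transient_1" name="WS_factorial"/><OMSTR>$invoke_0</OMSTR></OMA></call>
    </invoke>
  </sequence>
</workflow>
"""


def run_sequence(xml_text: str, procedures: dict) -> dict:
    """Execute the invokes in document order, resolving $invoke_N arguments."""
    results = {}
    root = ET.fromstring(xml_text)
    for invoke in root.iterfind("w:sequence/w:invoke", NS):
        application = invoke.find("w:call/w:OMA", NS)
        symbol, argument = list(application)   # procedure symbol, one argument
        raw = argument.text.strip()
        # A "$invoke_N" argument refers to the result of an earlier invoke.
        value = results[raw[1:]] if raw.startswith("$") else int(raw)
        results[invoke.get("invokeID")] = procedures[symbol.get("name")](value)
    return results


results = run_sequence(WORKFLOW, {"WS_factorial": math.factorial})
print(results)  # prints {'invoke_0': 6, 'invoke_1': 720}
```

Because `invoke_1` needs the value produced by `invoke_0`, the two calls cannot be reordered or run in parallel, which is why they sit inside a `<sequence>`.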
An XML Description of a Parallel Execution
<workflow xmlns="http://ieat.ro">
  <parallel>
    <invoke invokeID="invoke_0">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9055</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMI>3</OMI>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
    <invoke invokeID="invoke_1">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9056</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMI>6</OMI>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
  </parallel>
</workflow>
CAS Server Setup
In order to expose and advertise the functionality of a CAS server, the administrator must:
• Describe the computational capabilities of the computational node
• Add to the Local Registry the names and details of any Methods/OM Symbols that are going to be exposed
• Specify which CAS (GAP, Maple, etc.) supports the functionality
• Add to the Local Registry details about the Main Registries that the current Local Registry will advertise in
• For every method/symbol to be exposed, choose the Main Registries to advertise in
Adding an OM Symbol
• Step 1: Add the OpenMath CD
• Step 2: Add the OM Symbol to the CD
Conclusions
• Composition of symbolic Grid services is within reach
• Some features may require extra support from the CAS
• A general solution is needed in order to make sure that interoperability is not just a word in the dictionary:
  • Web/Grid Services
  • OpenMath representation of semantic data
  • SCSCP representation of communication