360 likes | 531 Views
Introduction to gUSE and WS-PGRADE portal. Gergely Sipos sipos@sztaki.hu MTA SZTAKI www.guse.hu www.wspgrade.hu. Outline. History, family of P-GRADE products P-GRADE Portal, WS-PGRADE, gUSE WS-PGRADE features Scalable architecture Seamless access to various types of resources
E N D
Introduction to gUSE and WS-PGRADE portal Gergely Sipos sipos@sztaki.hu MTA SZTAKI www.guse.huwww.wspgrade.hu
Outline • History, family of P-GRADE products • P-GRADE Portal, WS-PGRADE, gUSE • WS-PGRADE features • Scalable architecture • Seamless access to various types of resources • Comfort features • Separated views for end users and developers • Advanced data-flows • Users and applications • Summary and Next steps
Family of P-GRADE Portal products • P-GRADE portal • Creating (basic) workflows and parameter sweeps for gLite and Globus middleware based utility grids • www.portal.p-grade.hu • P-GRADE/GEMLCA portal (University of Westminster) • To wrap legacy applications into Grid Services • To add legacy code services to P-GRADE workflows • http://www.cpc.wmin.ac.uk/cpcsite/gemlca • (No parameter sweep support!) • WS-PGRADE • Creating complex workflow and parameter sweeps for local clusters, utility grids and desktop grids • Creating complex applications using embedded workflows,legacy codes and community components from repository • www.wspgrade.hu • Apply for an account on Beta release • Browse the User manual
P-GRADE Grid Portal • Pros. • Easy-to-use workflow system withgraphical editor • Easy-to-useparameter sweep concept at workflow level • Multi-grid/multi-VO access mechanism: job submission to LCG (~old gLite), gLite and Globus Toolkit 2 • Intelligent handling of grid errors • Open source community on Sourceforge • Reliable, production installations for several Grid, EGEE VOs • Part of EGEE RESPECT programme • Cons. • Considered too simple for some IT end users, while too complecated forsome non-IT end users • Workflow features found limited for some applications • Internal structure is monolithic hard to build developer community around it
Motivations of creating gUSE • To overcome (most of)the limitations of P-GRADE Portal: • To provide better modularity • To improve scalability • To enable advanced dataflow patterns • To interface with wider range of resources • To separate Application Developer view from Application End User view • New products: WS-PGRADE (Web Services Parallel Grid Runtime and Developer Environment) and gUSE (Grid User Support Environment) architecture
WS-PGRADE – gUSE architecture Graphical User Interface: WS-PGRADE Gridsphere portlets gUSE Filestorage Workflowstorage Filestorage gUSEinformationsystem Autonomous Services: high level middleware service layer WorkflowEngine Applicationrepository Submitters Submitters Submitters Submitters Meta-broker Logging Resources: middleware service layer Local resources, Service grid resources, Desktop Grid resources, Web services, Databases
gUSE application: Acyclic dataflow • Job to run on dedicated machine • Job to run in a gLite VO • Job to run in a Globus 2 VO • Job to run in a Globus 4 VO • Task to run in a BOINC Grid • Web service invocation • Database operation (R / W) • File from the client host • File from a GridFTP site • File from an LFC catalog (content from gLite SE) • Input string for a task or service • Result of a Database query
Dataflow programming with gUSE • Separate application logic from data • Cross & dot product data-pairing • All-to-all vs. one-to-one pairing of data items • (Concept from Taverna) • Generator components: to producemany output files from 1 input file • Collector components: to produce1 output file from many input files • Any component can be generator or collector • Conditional execution based on equality of data • Nesting, cycle, recursion 40 20 50 1000 40 Collector Generator 5000 1 7042 tasks Collector 5000 1
Task execution process • User action, external event or time triggering WS-PGRADE File storage Workflow storage gUSEWebServices Workflow Engine Meta-broker LocalSubmitter Web ServiceClient EGEESubmitter Desktop GridSubmitter DatabaseClient … Desktop Grid server Dedicatedcluster WebService gLite WMS DBMS
gUSE Generic Grid-Grid bridge EDGeS EGEE VO 1 Desktop Grid 3G Bridge DGserver WMS Machine Other EGEE services DC-APIplugin UI machine 3G Bridge EGEE plugin WS-PGRADEgUSE Comp. Element UI machine WN WN WN EGEE VO 2 WN WN WN
Ergonomics • Users can be grid application developers or end-users. • Application developers design sophisticated dataflow graphs in gUSE • embedding into any depth, recursive invocations, conditional structures, generators and collectors at any position • Use personal grid certificate for test executions • Publish applications in the repository at certain stages of work • Graphs • Templates • Concrete • End-userssee gUSE as a science gateway • List ready to use applications from repository • Import and execute applications without knowledge of programming, dataflow, grid, internal structure of application • Use personal grid certificate for production execution
Import an application from repository To avoid overwriting any of you existing applications: Choose a new name!
Email notification Ask email notification to know when the execution of your application is finished!
Application list Provide input for the application then submit!
Set auto-submission Define when should WSPGRADE submit your application!
Submission triggered by an external event Your application will be submitted when a request arrives with the key that you just set. (WS invocation)
Write you own submission trigger! URL of the WS-PGRADE portal server
Set custom input files/ input Strings/ SQL queries for components
Monitoring application execution Step 1: The application is selected by button “Submit” Step 2: Define a freeform description to identify this particular executable instance
Submit Resume Origin Submitted Suspended Internal Error Suspend Abort First Job Starts Resume Internal Error Running Suspended Suspend Abort Last Job terminates Error Aborted Finished States of a Workflow Instance 1.If all state counters are 0 then there is no Instance of the given Workflow 2. In Column “Error” the number instances being in states “Error” and “Aborted” are summed 3. Instances in state “Suspended” are displayed according their preceding states
Downloading results Information about the quota of the user allotted storage capacity in the Portal server Keep large data in Grid files or DBMS Columns of individual instances, please note, that outputs can be downloaded separately Inner slider to encounter and access each Instances Columns of bulk download: All or proper parts of all instances of a given WF can be downloaded
Repository Item Concrete Workflow Graph Workflow Instance Template Algorithms/servicesResourcesInputs Running stateOutputs Component layoutI/O PortsEdges ConstraintsComments Development cycle of a gUSE dataflow application Which parts and parameters can be modified, which cannot gUSE Applicationrepository service Define content Prepare forre-usage Publish Import Test execution End user’saccount Concrete Workflow Submit Workflow instance Workflow instance Workflow instance
Graph editor(P-GRADE Portal look-and-feel) Components remain empty at this stage!
Note the inner slider:By moving it you can encounter –and make visible – any Port of the current job Workflow Configuration:Creating Concrete Workflow Select Configure Select a job by mouse click Fill the job property characteristics. Details have discussed previously. Select Port Property Configuration Fill port property characteristics. Details have discussed previously. Select JDL/RSL Configuration Return to main view Insert a definition Select one of the JDL/RSL Configuration Parameters of the list box Save & Upload the Workflow configuration. Remind eventual error messages! Close the configuration of this job Confirm the settings
0 1 2 0 1 2 0 *3 Explaining Generator, Normal and Collector job types Number of results generated by a single query can be unknown! Generator Collector 1 run 1 run Typically for generating input parameters of a simulation. (Often a database query) Typically for statistical analysis 1. run 2. run 3. run Typically for processing data in “parameter sweep” fashion
*K Legend: Cross Product Dot Product Configuring the Workflow:Data processing pattern Define input data elements for every open input Port Number of data elements m n h Determine Job to be Generator by defining Multiple output port.In this case the job may be able to produce more than 1 jobs associated to the multiple output port within one job submission step G Determine Dot or Cross product relation of Input ports to define the number of job submissions 1 C Determine Job to be Collector by defining a Gathering Input Port. The Job execution will be postponed until all input files to that Port have arrived and can be elaborated in a single job submission step
*K h m n m*n*h*K h S S S m*n Animating the number of generated output data elements In case of Generator job the number of job submissions may differ from the number of files on Output Ports G In case of dot product the Job is submitted with input files having a common index number in each input Ports m*n h*K m*n h*K m*n h*K m*n*h*K S S=max(m*n,h*k) 1 1 S C S In case of cross product individual Job submission is generated for each possible input file combination S S S
Data processing complexity: P-GRADE or WS-PGRADE WS-PGRADE Portal: • Generators at any level of graph • Collectors at any level of graph • Freedom to use CROSS or DOT product P-GRADE Portal: • Generators only the first level of graph • Collectors always the last level of graph • Always CROSS product between parameter input channels G G G G G NormalWF C C
Other advanced features for developers • Workflow embedding • Simplify workflows by hierarchical development • Special case – Iterative embedding: iterate computation towards the final result • Conditional execution of workflow branches • Execute a branch only if it receives the expected input
Example: Need for complexityCancerGrid workflow 1 N=20e-30e, M=100 ~2.7 billion tasks !!! x1 NxM 1 xN xN xN NxM NxM 1 N N N Generator job Generator job NxM
Current users of gUSE • CancerGrid project • Predicting various properties of molecules to find anti-cancer leads • Creating science gateway for chemists • EDGeS project (Enabling Desktop Grids for e-Science) • Integrating EGEE with BOINC and XtremWeb technologies • User interfaces and tools • ProSim project • In silico simulation of intermolecular recognition • JISC ENGAGE program • University of Westminster Desktop Grid • Using AutoDock on institutional PCs
Conclusions • P-GRADE Portal remains supported • Features can serve most grid scenarios • Open source project on Sourceforge • WS-PGRADE • Implemented on top of scalable, WS based gUSE architecture • More expressive dataflow patterns • Transparent access to • Local resources • Service Grids • Desktop Grids • Databases • Web services • Application repository • Service for collaboration of developers and end-users
Next steps with gUSE: www.guse.hu User manual Request an account
Thank you Gergely Sipos sipos@sztaki.hu www.wspgrade.hu www.guse.hu www.lpds.sztaki.hu