P-GRADE Portal and GEMLCA: A workflow-oriented portal and application hosting environment

Gergely Sipos (sipos@sztaki.hu), MTA SZTAKI (Hungarian Academy of Sciences). www.portal.p-grade.hu, www.cpc.wmin.ac.uk/gemlca


  1. P-GRADE Portal and GEMLCA: A workflow-oriented portal and application hosting environment. Gergely Sipos (sipos@sztaki.hu), MTA SZTAKI (Hungarian Academy of Sciences). www.portal.p-grade.hu, www.cpc.wmin.ac.uk/gemlca

  2. Contents
     • Motivation for creating the tools
     • P-GRADE Portal and GEMLCA in a nutshell
     • Lifecycle of GEMLCA / P-GRADE applications
     • Services provided for application developers
     • Introduction to the hands-on
     • Hands-on
     • How to use P-GRADE / GEMLCA Portal for training and dissemination

  3. Context
     [Layer diagram; labels recovered from the slide]
     • Generic Grid layers: Application; Application toolkits, standards; Higher-level Grid services (brokering, …); Basic Grid services: AA, job submission, info, …
     • P-GRADE/GEMLCA layers: Graphical interface; Middleware-independent services and interfaces of P-GRADE/GEMLCA; Middleware-specific clients; Grid middleware services

  4. Current situation and trends in Grid computing
     • Fast evolution of Grid systems and middleware: GT2, OGSA, GT3 (OGSI), GT4 (WSRF), LCG-2, gLite, …
     • Many production Grid systems are built with them: EGEE (LCG-2 → gLite), UK NGS (GT2), Open Science Grid (GT2 → GT4), NorduGrid (~GT2)
     • Although the same set of core services is available everywhere, the services are implemented in different ways:
        • Data services (file management)
        • Computation services (job submission)
        • Security services (proxy-based single sign-on)
        • Brokers (not in every middleware)

  5. P-GRADE Portal in a nutshell [callout on the slide: TODAY’S FOCUS]
     • General-purpose, workflow-oriented computational Grid portal. Supports the development and execution of workflow-based Grid applications: a Grid orchestration environment
     • Based on the GridSphere web portal framework
     • Functionalities are accessed through portlets
     • Easy to expand with new portlets (e.g. application-specific portlets)
     • Easy to tailor to end-user or community needs
     • Developed by SZTAKI (1.0 in 2003, now 2.5)
     • Grid services supported by P-GRADE Portal 2.5: [table not preserved in the transcript]
     • Solves the Grid interoperability problem at the workflow level

  6. GEMLCA extension of the P-GRADE Portal
     • P-GRADE Portal extended with a GEMLCA Grid service back-end
     • To share jobs and legacy codes as application components with others
     • A step towards collaborative e-Science
     • Developed by the University of Westminster (London)
     • Support for Globus 4 grids (besides GT2 and EGEE)
     • Available on the NGS and OGF GIN
     [Diagram labels: P-GRADE Portal, GEMLCA, jobs, LCG / gLite VOs, Globus 2 VOs, Globus 4 VOs]

  7. Related projects
     The development, operation and training of P-GRADE Portal and GEMLCA are supported by the following projects:
     • SEE-GRID (www.see-grid.eu): development, application support
     • CoreGRID (www.coregrid.net): research, development
     • EGEE (www.eu-egee.org): gLite training, application development
     • ICEAGE (www.iceage-eu.org): Grid training and education

  8. A Grid application in the GEMLCA / P-GRADE Portal
     • A directed acyclic graph where
        • Nodes represent jobs or services (a batch program executed on a computing resource)
        • Ports represent input/output files the components expect/produce
        • Arcs represent file transfer operations
     • Semantics of the workflow:
        • A job can be executed if all of its input files are available
        • Enforcing this is the responsibility of the built-in workflow manager
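
The execution rule on this slide can be illustrated with a minimal Python sketch. It is not the portal's workflow manager (slide 14 names DAGMan for that role); the `schedule` function, the job names, and the file names are hypothetical and chosen only for illustration. Jobs whose input files are all available start together as one wave, and their outputs then enable the next wave.

```python
def schedule(jobs, initial_files):
    """Group jobs into waves; a job runs once all of its input files exist.

    jobs: dict mapping a job name to {"inputs": set, "outputs": set}
    (a hypothetical representation, not the portal's own format).
    """
    available = set(initial_files)
    pending = dict(jobs)
    waves = []
    while pending:
        # A job is ready when every input file is already available.
        ready = sorted(
            name for name, job in pending.items()
            if job["inputs"] <= available
        )
        if not ready:
            raise ValueError(f"jobs can never start: {sorted(pending)}")
        # Jobs in the same wave are independent and could run in parallel.
        for name in ready:
            available |= pending.pop(name)["outputs"]
        waves.append(ready)
    return waves


# Hypothetical four-job workflow: split, two parallel branches, merge.
JOBS = {
    "split": {"inputs": {"input.dat"}, "outputs": {"a.dat", "b.dat"}},
    "left": {"inputs": {"a.dat"}, "outputs": {"a.out"}},
    "right": {"inputs": {"b.dat"}, "outputs": {"b.out"}},
    "merge": {"inputs": {"a.out", "b.out"}, "outputs": {"result.dat"}},
}

if __name__ == "__main__":
    for number, wave in enumerate(schedule(JOBS, {"input.dat"}), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

In this sketch "left" and "right" land in the same wave, which corresponds to the parallel execution among workflow nodes described on the next slide.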

  9. Three levels of parallelism within a P-GRADE Portal application
     • The workflow concept of the GEMLCA / P-GRADE Portal enables the efficient parallelization of complex problems
     • Semantics of the workflow enables two levels of parallelism:
        • Parallel execution inside a workflow node: the job/service can be a parallel code
        • Parallel execution among workflow nodes: multiple nodes can run in parallel
     • Parametric sweep execution of the workflow (SIMD): multiple instances of the same workflow process different data files

  10. Workflow-level Grid interoperability: The GIN Resource Testing portal
      An OGF effort to demonstrate workflow-level grid interoperability between major production Grids and to monitor OGF GIN VO resources

  11. The typical user scenario, Part 1 - development phase
      [Diagram: MyProxy servers, Grid services, Portal server]
      • START EDITOR
      • OPEN & EDIT or DEVELOP WORKFLOW or PS WF
      • REUSE WORKFLOW COMPONENTS
      • SAVE WF / PS

  12. The typical user scenario, Part 2 - execution phase
      [Diagram: MyProxy servers, Grid services, Portal server]
      • DOWNLOAD PROXY CERTIFICATES
      • TRANSFER FILES, SUBMIT JOBS
      • MONITOR JOBS
      • VISUALIZE JOBS and WORKFLOW PROGRESS
      • DOWNLOAD (SMALL) RESULTS
      • Keep large files on Grid storage resources

  13. The typical user scenario, Part 3 - collaborative phase
      [Diagram: MyProxy servers, Grid services, Portal server]
      • Share workflow components with other users of the same portal
      • Export and share workflows with users of the same, or another portal

  14. Inside the portal server
      [Architecture diagram; labels recovered from the slide]
      • Client: web browser, Java Web Start workflow editor
      • P-GRADE Portal server: Tomcat; P-GRADE Portal portlets (JSR-168, GridSphere 2): Workflow, Certificates, Information System, Settings, GEMLCA; DAGMan workflow manager; Mercury API; information system clients; CoG API & scripts; shell scripts; Grid middleware clients
      • Optional plug-in: GEMLCA service (WSRF)
      • Technology-specific gateways: file transfer, proxy management, load monitoring
      • Grid: Grid middleware services (WMS, LFC, gridFTP, GRAM, …); Mercury monitor service; information systems; MyProxy server & VOMS

  15. Workflow Editor: Defining the graph
      Define a Directed Acyclic Graph (DAG) of jobs and services (GEMLCA jobs):
      • Drag & drop components: nodes and ports
      • Define their properties
      • Connect ports by channels (no cycles, no loops, no conditions…)

  16. Workflow Editor: Properties of a job component
      • Properties of a job:
         • Type of executable
         • Client side location of the binary
         • Number of required processors
         • Command line parameters
      • The resource to be used for the execution:
         • Grid (VO)
         • Resource / broker

  17. Workflow Editor: Properties of a service component (GEMLCA job)
      • Properties of a service:
         • The location of the service:
            • Grid (VO)
            • Resource / broker
         • An application (binary) associated with that resource
         • Input parameter values for the service

  18. Workflow Editor: Defining job / service input-output data
      File properties:
      • Type:
         • input: the component reads
         • output: the component writes
      • File type:
         • local: originates from my desktop
         • remote: originates from a grid storage element
      • File: location of the file
      • File storage type (for outputs only):
         • Permanent: final result
         • Volatile: used only for inter-component data transfer
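
The file properties listed on this slide can be read as a small record with a few consistency rules. The following Python sketch is one possible way to model them; the `Port` class and its field names are hypothetical and are not taken from the Workflow Editor itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical model of one input/output file of a workflow component."""
    name: str
    direction: str                 # "input" or "output"
    file_type: str                 # "local" or "remote"
    location: str                  # where the file lives
    storage: Optional[str] = None  # "permanent" or "volatile", outputs only

    def __post_init__(self):
        if self.direction not in ("input", "output"):
            raise ValueError("direction must be 'input' or 'output'")
        if self.file_type not in ("local", "remote"):
            raise ValueError("file_type must be 'local' or 'remote'")
        # The slide states that the storage type applies to outputs only.
        if self.direction == "input" and self.storage is not None:
            raise ValueError("storage type is for outputs only")
        if self.direction == "output" and self.storage not in ("permanent", "volatile"):
            raise ValueError("outputs need storage 'permanent' or 'volatile'")


if __name__ == "__main__":
    final = Port("result", "output", "remote",
                 "lfn:/grid/gilda/sipos/11-04_-_result.dat", "permanent")
    print(final)
```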

  19. How to refer to an I/O file?
      • Input file:
         • Local file: client side location, e.g. c:\experiments\11-04.dat
         • Remote file: LFC logical file name (LFC file catalog is required; EGEE VOs), e.g. lfn:/grid/gilda/sipos/11-04.dat
         • Remote file: GridFTP address (in Globus Grids), e.g. gsiftp://somengshost.ac.uk/mydir/11-04.dat
      • Output file:
         • Local file: client side location, e.g. result.dat
         • Remote file: LFC logical file name (LFC file catalog is required; EGEE VOs), e.g. lfn:/grid/gilda/sipos/11-04_-_result.dat
         • Remote file: GridFTP address (in Globus Grids), e.g. gsiftp://somengshost.ac.uk/mydir/result.dat
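
The three reference styles on this slide differ only in their prefix, which a short Python sketch can make explicit. The `classify_reference` function and the returned dictionary keys are hypothetical; the sketch only inspects the string and does not contact an LFC catalog or a GridFTP server.

```python
from urllib.parse import urlsplit


def classify_reference(ref):
    """Classify a file reference as LFC, GridFTP, or local (illustrative only)."""
    if ref.startswith("lfn:"):
        # LFC logical file name: everything after the prefix is the catalog path.
        return {"kind": "lfc", "path": ref[len("lfn:"):]}
    if ref.startswith("gsiftp://"):
        # GridFTP address: split into host and path on that host.
        parts = urlsplit(ref)
        return {"kind": "gridftp", "host": parts.hostname, "path": parts.path}
    # Anything else is treated as a client side (local) location.
    return {"kind": "local", "path": ref}


if __name__ == "__main__":
    examples = [
        r"c:\experiments\11-04.dat",
        "lfn:/grid/gilda/sipos/11-04.dat",
        "gsiftp://somengshost.ac.uk/mydir/11-04.dat",
    ]
    for ref in examples:
        print(ref, "->", classify_reference(ref))
```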

  20. Workflow level file transfer by the workflow manager
      [Diagram labels: Portal server; Grid infrastructure (storage elements, computing elements); user level storage; GEMLCA repository; local input files & binaries; remote input files; remote output files; local output files; binaries of GEMLCA jobs]

  21. Job / service level file transfer by the workflow manager
      [Diagram labels: Portal server; Grid infrastructure (Storage Elements, Computing Element); pre script and post script (generated by the portal) around the binary; local input file, remote input file, local output file, remote output file, with transfers numbered 0 to 3; custom file transfer]

  22. Information system portlet to browse computing elements
      Graphical interface for BDII servers

  23. Workflow execution: Main steps
      • Download proxies
      • Submit the workflow
      • Observe workflow progress
      • If an error occurs, correct the graph
      • Download the result

  24. Certificate Manager: Certificates portlet
      • To start your session on the Grid you must create a proxy certificate on the portal server
      • “Certificates” portlet:
         • to upload a proxy into MyProxy servers
         • to download a proxy from MyProxy into the portal server

  25. Certificate Manager: Multi-grid portal, multi-proxy environment
      • Multiple proxies can be available on the portal server at the same time!
      • Certificate from Hungarian CA: HUNGRID CEs and SEs
      • Certificate from EGEE CA: SEE-GRID CEs and SEs

  26. Certificates, proxies with gLite VOs: Download
      [Diagram labels: MyProxy server, VOMS server, Portal server, Grid services, Proxy1, Proxy2, Proxy2 with VOMS extension]
      Note on the slide: “I have to do this every time I want to execute workflows”

  27. Workflow Management (workflow portlet)
      • The portlet presents the status, size and output of the available workflows in the “Workflow” list
      • It has a Quota manager to control the users’ storage space on the server
      • The portlet also contains the “Abort”, “Attach”, “Details”, “Delete” and “Delete all” buttons to handle the execution of workflows
      • The “Attach” button opens the workflow in the Workflow Editor
      • The “Details” button gives an overview of the jobs of the workflow

  28. Workflow Execution (observation by the workflow portlet)
      White, red and green mean that the job is in the initial, running or finished state, respectively

  33. On-line monitoring both at the workflow and job levels (workflow portlet)
      • The portal monitors and visualizes workflow progress
      • The portal monitors and visualizes parallel jobs (if they are prepared for the Mercury monitor)

  34. Downloading the results…

  35. Sharing a successfully finished job with other users: GEMLCA repository

  36. Collaborative grid applications
      Combine services and your code in the same workflow!
      [Diagram: a workflow that mixes service invocation and job submission nodes]

  37. Support for parametric study workflows (since P-GRADE Portal v2.5)

  38. General structure of PS applications
      [Diagram: eight parallel instances of “Algorithm 2”]

  39. Advanced PS applications
      • Algorithm 1: cut input into smaller units
      • Algorithm 2: eight parallel instances in the diagram
      • Algorithm 3: aggregate result; perform “global operation”

  40. PS applications in P-GRADE Portal 2.5
      • Complete workflow
      • Files stored in an LFC catalog (e.g. /grid/gilda/sipos/myinputs)
      • Results will be generated in the same catalog

  41. Advanced PS applications in P-GRADE Portal 2.5
      • Initial input data
      • Generator component(s): generate or cut input into smaller units
      • Complete workflow: files stored in an LFC catalog (e.g. /grid/gilda/sipos/myinputs); results will be generated in the same catalog
      • Collector component(s): aggregate result

  42. Third level of parallelism
      • Parallel execution inside a workflow node (SIMD/MIMD/MISD): each job can be a parallel program
      • Parallel execution among workflow nodes (SIMD/MIMD/MISD): multiple jobs run in parallel
      • Parameter study execution of the workflow (SIMD): multiple instances of the same workflow process different data files

  43. Turning a WF into a parameter study
      By turning at least one of the open input ports into a “PS Input port”, the WF is turned into a Parameter Study

  44. Turning a WF into a parameter study
      • /grid/gilda/sipos/InputImages: Image.0, Image.1
      • /grid/gilda/sipos/XCoordinates: XCoordinate.0, XCoordinate.1
      • /grid/gilda/sipos/YCoordinates: YCoordinate.0, YCoordinate.1
      • /grid/gilda/sipos/Output: ImagePart.0, ImagePart.1, …

  45. Generating multi-grid eWFs
      • 1 PS workflow execution
      • PS port: 4 instances of the input file
      • PS port: 3 instances of the input file
      • Assigns the 24 jobs to 24 Grid resources within 2 Grids: some jobs are assigned to the broker of Grid X, the others to the broker of Grid Y
      [Diagram: 24 job boxes labelled with resource names prefixed X or Y, e.g. XA, XB, YE, YF]
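
The instance counts on this slide can be sketched in Python under one assumption: that the files on different PS input ports are combined as a cross product, which the "4 instances" and "3 instances" figures suggest but the transcript does not state outright. The `expand_parameter_study` function and the port and file names below are hypothetical.

```python
from itertools import product


def expand_parameter_study(ps_ports):
    """Return one {port: file} binding per workflow instance.

    ps_ports maps a PS input port name to the list of files found for it.
    Assumption: ports are combined as a cross product.
    """
    names = sorted(ps_ports)
    combos = product(*(ps_ports[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


if __name__ == "__main__":
    # Hypothetical ports with 4 and 3 input file instances, as on the slide.
    ports = {
        "x": [f"XCoordinate.{i}" for i in range(4)],
        "y": [f"YCoordinate.{i}" for i in range(3)],
    }
    instances = expand_parameter_study(ports)
    print(len(instances), "workflow instances")
    print(instances[0])
```

Under this reading, 4 × 3 gives 12 workflow instances; a workflow with two jobs would then yield the 24 jobs mentioned on the slide, although the transcript does not confirm how that number was obtained.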

  46. Generators
      • Generate input files for parameter study workflows
      • Save these files into LFC catalogs
      • Two types:
         • Auto generator: predefined program logic; generates text files; the user controls file content by templates and parameters
         • Custom generator: the user provides the program logic; generates binary file content (e.g. image, audio, …)
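
The idea behind the auto generator (text files produced from a template and a set of parameters) can be sketched in Python as follows. This is not the portal's generator: the `generate_inputs` function, the template syntax (Python's `string.Template`), and the file naming are assumptions made for illustration, and the files are written to a local directory rather than to an LFC catalog.

```python
from pathlib import Path
from string import Template


def generate_inputs(template_text, parameter_sets, out_dir, stem="input"):
    """Write one text file per parameter set and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    template = Template(template_text)
    paths = []
    for index, params in enumerate(parameter_sets):
        # Files are named <stem>.<index>, mirroring names such as Image.0.
        path = out_dir / f"{stem}.{index}"
        path.write_text(template.substitute(params))
        paths.append(path)
    return paths


if __name__ == "__main__":
    created = generate_inputs(
        "x = $x\ny = $y\n",
        [{"x": 0, "y": 0}, {"x": 10, "y": 20}],
        "generated_inputs",  # hypothetical local output directory
        stem="Coordinate",
    )
    for path in created:
        print(path, "->", path.read_text().replace("\n", "; "))
```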

  47. Collectors
      • Collect output units and perform a collective operation on them, e.g.:
         • Standard deviation
         • Average
         • Statistics
         • Evaluation
         • Selecting the “best” point of the parameter space
         • …
      • The user provides the program logic
      • The portal provides data transfer
         • No need to use any Grid API in your code
         • Open and write I/O files as local files
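
Because the portal handles data transfer, a collector can be an ordinary program that reads local files. The Python sketch below shows what such user-provided logic might look like for the average and standard deviation cases; the `collect` function, the `result.*` file pattern, and the one-number-per-file format are assumptions, not part of the portal.

```python
from pathlib import Path
from statistics import mean, pstdev


def collect(in_dir, pattern="result.*"):
    """Read one number from each matching local file and summarize them."""
    values = [
        float(path.read_text().strip())
        for path in sorted(Path(in_dir).glob(pattern))
    ]
    if not values:
        raise ValueError(f"no files matching {pattern!r} in {in_dir}")
    return {
        "count": len(values),
        "average": mean(values),
        "std_dev": pstdev(values),  # population standard deviation
    }


if __name__ == "__main__":
    import tempfile

    # Stand-in for outputs that the portal would have copied next to the job.
    with tempfile.TemporaryDirectory() as tmp:
        for index, value in enumerate([2.0, 4.0, 6.0]):
            (Path(tmp) / f"result.{index}").write_text(f"{value}\n")
        print(collect(tmp))
```

No Grid API appears in the code; it only opens files in a directory, which is the point made on the slide.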

  48. References
      • P-GRADE Portal service for:
         • SEE-GRID infrastructure
         • Central European VO of EGEE
         • GILDA: Training VO of EGEE
         • Many national Grids (UK National Grid Service, HunGrid, Turkish Grid, etc.)
         • US Open Science Grid, TeraGrid
         • Economy-Grid, Swiss BioGrid, Bio and Biomed EGEE VOs, BalticGrid
         • OGF Grid Interoperability Now (GIN) VO
      • portal.p-grade.hu/index.php?m=5&s=0

  49. Summary and conclusion
      • P-GRADE Portal hides the complexity of Grid systems
         • Globus 2, Globus 4, LCG, gLite
      • Various components can be integrated into workflows
         • Sequential codes
         • MPI codes
         • Legacy code services (with the GEMLCA-specific version)
      • Workflows can be executed as parameter studies
         • Storage management
         • Generators
         • Collectors
      • Your code does not have to contain grid-specific calls
      • Graphical interfaces for
         • grid application development
         • certificate management
         • application execution and monitoring
      • Support for collaborative work
         • Share workflow components
         • Share workflows
      • Built with the standard portlet API → customizable to specific needs

  50. Questions, hands-on
      Learn once, use everywhere. Develop once, execute anywhere.
      www.portal.p-grade.hu, pgportal@lpds.sztaki.hu
