Explore gLite & P-GRADE Portal for application development on EGEE, with workflow concepts, parameter study, and hands-on exercises. Learn basic concepts and access mechanisms for EGEE.
Grid application development with gLite and P-GRADE Portal Miklos Kozlovszky MTA SZTAKI m.kozlovszky@sztaki.hu
Presenter • MTA SZTAKI (Hungarian Academy of Sciences), Laboratory of Parallel and Distributed Systems, www.lpds.sztaki.hu • Miklos Kozlovszky • EGEE-III (Enabling Grids for E-sciencE) • GASuC Team • Trainings and dissemination activities • SEE-GRID2 / SEE-GRID-SCI (South Eastern European GRID-enabled eInfrastructure Development) • Manager of “Dissemination and Training” (WP5/NA3)
Introduction of LPDS (Laboratory of Parallel and Distributed Systems) • Research division of MTA SZTAKI since 1998 • Head: Prof. Peter Kacsuk • 22 research fellows • Founding member of • Central European Grid Consortium (2003) • Hungarian Grid Competence Center (2003) • Participant or coordinator in many European and national Grid research, infrastructure, and educational projects (since 2000) • FP5: GridLab, DataGrid • FP6: EGEE I-II, SEE-GRID I-II, CoreGrid, ICEAGE, CancerGrid • FP7: EGEE III, SEE-GRID-SCI, EDGeS (coordinator), ETICS, S-CUBE • Central European Grid Training Center in EGEE (since 2004) www.lpds.sztaki.hu
Webpage • http://indico.in2p3.fr/conferenceDisplay.py?confId=1265 • Find it from the EGEE User Forum webpage OR the EGEE Training webpage (Google "EGEE NA3") • http://www.egee.nesc.ac.uk/ Events and registration (top menu) ..., Paris, December 10-13 • Save the direct link! • Long term storage of training material • Presentations in PPT • Tutorials in HTML/DOC/PDF
Feedback form • Your comments and feedback are highly valuable for EGEE training • Please fill in the feedback form and return it at the end of the course • Anonymous • Scores: 1 - 6 (very bad - very good) • Comments are highly appreciated
Goals of the day • Basic concepts of • Workflow • Parameter study on EGEE • Implementation in P-GRADE Portal • Further information • How to learn more • How to get access to EGEE • How to port your own application to EGEE
Agenda • Application development on gLite * • Workflow and parameter study concepts on EGEE • Workload management and data services in gLite • Workflow and parameter study support in P-GRADE Portal • Hands-on • Workflow exercises • Parameter study exercises • How to learn more * = (mostly skipped, please refer to previous presentations from yesterday)
Where computer science meets the application communities! • Layers, from top to bottom: Application • Application toolkits, command line & APIs (the tools and services used by the VO’s applications) • Higher-level gLite services (WMS, ...) • Basic gLite services: CE, SE, info, security • The production infrastructure (EGEE grid, gLite middleware) contains these services • High level services: help the users build their computing infrastructure, but should not be mandatory • Basic services: must be complete and robust; should not assume the use of higher-level grid services • NA4 RESPECT (Recommended External Software Packages for EGEE CommuniTies): current RESPECT tools are GridWay and P-GRADE Portal; see http://egeena4.lal.in2p3.fr/, “Grid software” menu
VO concept • gLite middleware runs on each EGEE site to provide • Computation services: Computing Element • Data services: Storage Element • Security service • Sites and users form Virtual Organisations: the basis for collaboration • Each VO can / must have central software services and support groups [Diagram labels: Internet, P-GRADE Portal]
Basic gLite use case: Job submission [Diagram: the components are the User Interface, Resource Broker, Information System, File and Replica Catalog, VO Management Service (DB of VO users), Logging and bookkeeping, and Site X with its Computing Element and Storage Element. Labelled interactions: create proxy; submit job (executable + small inputs); query; publish state; submit job; job status; logging; input file(s); process; output file(s); register file; retrieve output; retrieve status & (small) output files]
How can I get access to EGEE?
• Obtain a certificate from a recognized CA (obtaining a certificate: annually)
  • www.gridpma.org: find the official CA of your country. Certificates are valid for 1 year, renewable, and accepted in every EGEE VO
  • GILDA CA: certificates are valid for two weeks, renewable, and accepted only in the GILDA training VO (the VO to be used today)
• Find and register at a VO (joining a VO: once)
  • List of VOs with usage rules (by scientific discipline or geographical region): CIC Operations portal, http://cic.gridops.org/
• Use the VO services
  • Through the (low level) command line tools of gLite (not today)
  • Through high level tools, e.g. P-GRADE Portal, GENIUS, GANGA, ... The access mechanism varies from tool to tool
[Diagram labels: CA, VO manager, VO Membership Service, VOMS database, Grid sites]
Application developer’s questions • I have a computationally intensive problem. How does it relate to this scenario? • What is a grid job for me? • How many jobs do I have, and how do they relate to each other and to my data? • What is the input / output data for each job? • How do I write a job that accesses input / output data? • How do I submit and monitor the jobs? How do I access their results? • Do I need additional services to meet my application's demands? • Answers • Now (sometimes specifically on P-GRADE Portal) • Or any time later, for general purposes, from the Grid Application Support group (GASuC) www.lpds.sztaki.hu/gasuc
Functional vs. Data parallelism • Functional Decomposition (Functional Parallelism) • Decomposing the problem into different jobs which can be distributed to different CEs for simultaneous execution • Different executables run on different CEs (and may or may not process the same data) • Good to use when • the data cannot be partitioned • there is no static structure or fixed number of calculations to be performed
Functional decomposition [Diagram: the problem is split into four different jobs, Job 1 to Job 4, each running on its own Computing Element (#1 to #4). Phases along the time axis: job submission, job monitoring, result download]
Functional decomposition in practice: workflow [Diagram: a workflow manager (e.g. P-GRADE Portal server) takes the problem and runs its jobs in stages linked by data dependencies. Each stage goes through job submission and job monitoring, results are transferred between stages, and the final results are downloaded at the end of the time axis]
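The scheduling idea behind such a workflow can be sketched in a few lines of Python. This is only an illustration of ordering jobs by their data dependencies, not how P-GRADE Portal or gLite implement it; the job names and the `WORKFLOW` dictionary are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group jobs into waves; all jobs in one wave can run simultaneously."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()   # jobs whose inputs are all available
        waves.append(sorted(ready))
        sorter.done(*ready)          # pretend these jobs finished successfully
    return waves


# Hypothetical four-job workflow: job -> set of jobs whose output it needs
WORKFLOW = {
    "job1": set(),
    "job2": {"job1"},
    "job3": {"job1"},
    "job4": {"job2", "job3"},
}

waves = execution_waves(WORKFLOW)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: submit {', '.join(wave)}")
```

In this toy example job2 and job3 land in the same wave, so a workflow manager could submit them to different Computing Elements at the same time.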
Functional vs. Data parallelism • Data Decomposition (Data Parallelism) • Partitioning the problem's data domain and distributing portions to multiple instances of the same job for simultaneous execution • The same executable runs on different CEs and processes different data • Good to use for problems where: • data is static (e.g. factoring, solving large matrix or finite difference calculations, parameter studies) • the data structure is dynamic but tied to a single entity that can be subsetted (large multi-body problems) • the domain is fixed but computation within various regions of the domain is dynamic (fluid vortices models) • > 90% of grid applications employ data parallelism (parameter study, parametric study)
Data decomposition [Diagram: the problem consists of one algorithm and four data segments (1 to 4). The same job is submitted four times, Job 1 to Job 4 on Computing Elements #1 to #4, each processing one data segment. Phases along the time axis: job submission, job monitoring, result download]
Data decomposition in practice: Master-slave [Diagram: a master process (e.g. P-GRADE Portal server) runs the master job: generate inputs, spawn slaves (job submit), monitor the four slave jobs, collect results (get job output), generate the final result]
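A minimal local sketch of the master-slave pattern follows. It assumes threads stand in for grid jobs and a sum of squares stands in for the real computation; the function names are illustrative and do not come from gLite or P-GRADE Portal.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_inputs(data, n_segments):
    """Split data into n_segments contiguous segments of near-equal size."""
    size, extra = divmod(len(data), n_segments)
    segments, start = [], 0
    for index in range(n_segments):
        end = start + size + (1 if index < extra else 0)
        segments.append(data[start:end])
        start = end
    return segments


def slave_job(segment):
    """The same 'executable' runs on every segment (placeholder computation)."""
    return sum(value * value for value in segment)


def master(data, n_slaves=4):
    inputs = generate_inputs(data, n_slaves)           # generate inputs
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        results = list(pool.map(slave_job, inputs))    # spawn slaves, collect results
    return sum(results)                                # generate final result


print(master(list(range(10))))
```

On a grid, `pool.map` would be replaced by job submission, status polling and output retrieval, which is the part a portal or broker takes over.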
Multi-level master-slave [Diagram: a master job generates inputs, spawns slaves (job submit), monitors four slave jobs (check job status), collects results (get job output) and generates the final result. The same generate inputs / spawn slaves / monitor slaves / collect results cycle appears a second time at a lower level, with its own four slave jobs]
Complex master-slave [Diagram: one master job runs three rounds of generate inputs, spawn slaves, monitor slaves (four slave jobs per round) and collect results, then generates the final result]
Complex master-slave = Parameter study workflow [Diagram: the workflow manager takes over the master job's role. Three stages of generate local inputs, spawn slaves, monitor slaves (four slave jobs per stage) and collect local results are followed by generate result and the final result. Example annotation: 3 input files and 9 input files give 3 x 9 = 27 workflow instances (WF) and 27 outputs]
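The 3 x 9 = 27 count on this slide is a Cartesian product of the two input sets: every file of the first set is paired with every file of the second. A small Python sketch, with hypothetical file names, shows the enumeration:

```python
from itertools import product

# Hypothetical input sets: 3 files for one input port, 9 for the other
first_inputs = [f"a{i}.dat" for i in range(3)]
second_inputs = [f"b{j}.dat" for j in range(9)]

# One workflow instance per combination of input files
instances = [
    {"id": number, "inputs": pair}
    for number, pair in enumerate(product(first_inputs, second_inputs))
]

print(len(instances), "workflow instances")
print(instances[0], instances[-1])
```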
Defining a job
• Executable (EGEE runs Scientific Linux v3 or v4)
  • Script: no compilation is necessary; it can invoke a real executable that is statically installed on the CE (VOBox)
  • Binary: must be compiled on the User Interface, so that binary compatibility with EGEE is guaranteed; link it statically to avoid errors caused by library versions
• Input / output data
  • Input files smaller than 20 MByte: transfer them from the client side (“InputSandbox”); otherwise upload them to a Storage Element before job submission
  • Output files smaller than 20 MByte: transfer them back to the client side (“OutputSandbox”); otherwise upload them from the Computing Element to a Storage Element
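The size rule above can be written as a small decision helper. This is a sketch only: it assumes "20 MByte" means 20 x 1024 x 1024 bytes, and the function name and return strings are made up for illustration.

```python
# Assumption: the slide's "20 MByte" is read as 20 * 1024 * 1024 bytes
SANDBOX_LIMIT_BYTES = 20 * 1024 * 1024


def transfer_route(size_bytes, is_input):
    """Return how a file of the given size should travel, per the 20 MByte rule."""
    if size_bytes < SANDBOX_LIMIT_BYTES:
        return "InputSandbox" if is_input else "OutputSandbox"
    if is_input:
        return "upload to Storage Element before job submission"
    return "upload to Storage Element from the Computing Element"


print(transfer_route(5 * 1024 * 1024, is_input=True))
print(transfer_route(300 * 1024 * 1024, is_input=False))
```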
Distribution of large datasets • Put large files into Storage Elements and register them in the Logical File Catalog (LFC) (covered already during previous sessions) • Large files do not go through the broker [Diagram: the master job generates local inputs and stores them in LFC & SEs, spawns slaves through the broker (job submit) passing only Logical File Names, monitors the slave jobs (check job status), collects local results through the broker (get job output) as Logical File Names, and generates the result from the files held in LFC & SEs]
File services in gLite • Users’ files are stored on Storage Elements • A file on an SE is identified by a Storage URL (SURL) (e.g. sfn://grid005.iucc.ac.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f) • Users refer to files by Logical File Names (LFN) • LFC = directory structure of LFNs + pointers to SURLs (files can have replicas) [Diagram: the LFC directory lfn:/grid/gilda/kozlovszky/run2/ holds the entries input1, input2 and input3, which point to SURLs on four Storage Elements] • Storage Element 1: sfn://grid005.iucc.ac.il/storage/gilda/generated/2007-06-23/fileb233d43f-5bc6-4ede-a5fe-611d48be2ba5 • Storage Element 2: srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/2007-06-23/filea21ab3e2-8ff6-4a44-82a7-f2 • Storage Element 3: sfn://trigriden01.unime.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f • Storage Element 4: sfn://grid005.iucc.ac.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f
Name conventions • LFC has a directory tree structure: lfn:/grid/<VO_name>/<you create it> (lfn:/grid/<VO_name> is the LFC namespace; the rest is defined by the user) • Users primarily access and manage files through “logical filenames” • Today: lfn:/grid/gilda/parisXX/. . .
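To make the LFN-to-SURL relationship concrete, here is a toy in-memory catalog in Python. It is not the LFC API; the class, its methods and the example.org Storage URLs are all hypothetical and only model the idea that one logical name can point to several replicas.

```python
from collections import defaultdict


class ToyCatalog:
    """Toy model of a file catalog: logical file name -> list of replica SURLs."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, lfn, surl):
        if not lfn.startswith("lfn:/grid/"):
            raise ValueError("logical file names start with lfn:/grid/<VO_name>/")
        self._replicas[lfn].append(surl)

    def replicas(self, lfn):
        return list(self._replicas.get(lfn, []))

    def listdir(self, directory):
        prefix = directory.rstrip("/") + "/"
        names = {
            lfn[len(prefix):].split("/")[0]
            for lfn in self._replicas
            if lfn.startswith(prefix)
        }
        return sorted(names)


catalog = ToyCatalog()
# Hypothetical LFNs and SURLs, for illustration only
catalog.register("lfn:/grid/gilda/demo/run2/input1", "srm://se1.example.org/gilda/file-0001")
catalog.register("lfn:/grid/gilda/demo/run2/input1", "srm://se2.example.org/gilda/file-0001")
catalog.register("lfn:/grid/gilda/demo/run2/input2", "srm://se2.example.org/gilda/file-0002")

print(catalog.listdir("lfn:/grid/gilda/demo/run2"))
print(catalog.replicas("lfn:/grid/gilda/demo/run2/input1"))
```

A job that only knows the logical name can ask the catalog for the replicas and then pick one Storage Element to read from.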
Managing a workload with gLite command line tools
• Log in to the User Interface machine
• Write your jobs. Operations in a job:
  • Access LFC, resolve LFN
  • Access SE, get file content
  • Process file
  • Write result to SE
  • Register file in LFC
• (Compile your jobs to get the executables)
• Write a job description for each job using the Job Description Language (JDL)
  • Text file
  • Specifies the executable and the input and output LFNs
  • Specifies resource requirements and preferences (which CE)
• Write the description of your workload (myworkload.jdl)
  • Workflow JDL or parametric job JDL (no parametric workflow!)
• Use shell commands to
  • Submit the workload: glite-wms-job-submit myworkload.jdl (returns the workload ID, wlID)
  • Monitor the status: glite-wms-job-status wlID
  • Get the output sandbox: glite-wms-job-output wlID
• Write a program (e.g. a script) to
  • Register input files in LFC before the workload is started
  • Resubmit failed jobs
  • Download result files from the Storage Elements when the workload is finished
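As an illustration of the "parametric job JDL" step, the Python sketch below writes out a JDL text file. The attribute names (JobType, Parameters, ParameterStart, ParameterStep, the _PARAM_ placeholder and the sandbox attributes) follow my recollection of the gLite JDL documentation and should be checked against the gLite user guide before use; the file names run.sh, data-_PARAM_.txt and so on are hypothetical.

```python
def parametric_jdl(executable, n_values):
    """Return JDL text for a parametric job; _PARAM_ is replaced per sub-job."""
    attributes = [
        ("JobType", '"Parametric"'),
        ("Executable", f'"{executable}"'),
        ("Arguments", '"data-_PARAM_.txt"'),
        # Assumption: parameter values run from ParameterStart up to Parameters
        ("Parameters", str(n_values)),
        ("ParameterStart", "0"),
        ("ParameterStep", "1"),
        ("StdOutput", '"out-_PARAM_.txt"'),
        ("StdError", '"err-_PARAM_.txt"'),
        ("InputSandbox", f'{{"{executable}", "data-_PARAM_.txt"}}'),
        ("OutputSandbox", '{"out-_PARAM_.txt", "err-_PARAM_.txt"}'),
    ]
    return "\n".join(f"{name} = {value};" for name, value in attributes) + "\n"


jdl_text = parametric_jdl("run.sh", 27)
print(jdl_text)
```

The resulting text could be saved as myworkload.jdl and passed to the submit command listed above; resubmitting failed jobs and handling large files would still need a separate script.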
Or use P-GRADE Portal
Further information, references • EGEE: http://www.eu-egee.org/ • gLite middleware: http://www.glite.org • gLite manuals, documentation: http://glite.web.cern.ch/glite/documentation/ (gLite user guide) • Recommended External Software Packages for EGEE Communities (RESPECT): http://egeena4.lal.in2p3.fr/ • P-GRADE Grid Portal: http://portal.p-grade.hu/ • P-GRADE Grid Portal (login here): http://portal.p-grade.hu/multi-grid