RISICO on the GRID architecture

First implementation of RISICO on the GRID architecture
Mirko D'Andrea, Stefano Dal Pra

Outline of the presentation
• Porting features;
• Job management;
• Implementation tests and results;
• Conclusions and further development.

Porting features
• Implemented entirely in Python.
• Uses the same executable as the RISICO system (no changes needed).
• Easily configurable through a configuration file.

The RISICO system
• Italy: 310000 km^2
• Current system: 300k regular cells, 1 km side.
• Grid version: 30M regular cells, 0.1 km side.
(Diagram label: GRIDIFICATION, from the current system to the grid version.)
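The cell counts above follow from dividing the area by the area of one square cell. A minimal sketch of that arithmetic, using the area quoted on the slide; the function name `cell_count` is illustrative and not part of RISICO:

```python
# Rough cell-count check: total area divided by the area of one square cell.
ITALY_AREA_KM2 = 310_000  # figure quoted on the slide


def cell_count(area_km2: float, side_km: float) -> int:
    """Number of square cells of the given side needed to tile the area."""
    return round(area_km2 / (side_km ** 2))


current = cell_count(ITALY_AREA_KM2, 1.0)  # 1 km cells
grid = cell_count(ITALY_AREA_KM2, 0.1)     # 0.1 km cells
print(current, grid, grid // current)
```

Reducing the cell side by a factor of 10 multiplies the number of cells by 100, which matches the step from roughly 300k to roughly 30M cells.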

RISICO vs GRID-RISICO
RISICO:
• Get input from database
• Run RISICO
• Write output to database
GRID-RISICO (after gridification):
• Get input from database
• Upload input into catalog
• Create n jobs
• Each job i (1 to n): get input from catalog, run RISICO on dataset i, write output i to catalog
• Collect outputs from catalog
• Write outputs to database

Job submission
• A RISICO job is fully defined by a jdl (job description language) file and a parameter file.
• Each submitted job must terminate successfully within a defined time. The job activity is monitored by a software module called JobMonitor.
• The job submission procedure is handled by a JobSubmitter, which creates a set of jobs and associates a JobMonitor with each job.
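As an illustration of the jdl file mentioned above, here is a minimal sketch that builds the text of one job description. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are standard JDL; the function `make_jdl`, the wrapper script name and the file names are assumptions made for the example, since the slides do not show the actual files.

```python
def make_jdl(executable: str, param_file: str, job_id: int) -> str:
    """Return the text of a minimal JDL description for one job.

    File names here are hypothetical; only the attribute names are
    standard JDL.
    """
    tag = f"{job_id:02d}"
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{param_file}";',
        f'StdOutput = "risico_{tag}.out";',
        f'StdError = "risico_{tag}.err";',
        # Files shipped with the job and files returned when it ends.
        f'InputSandbox = {{"{executable}", "{param_file}"}};',
        f'OutputSandbox = {{"risico_{tag}.out", "risico_{tag}.err"}};',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical wrapper script and parameter file for job 1.
print(make_jdl("run_risico.sh", "params_01.cfg", 1))
```

A JobSubmitter along the lines described above could call such a function once per job and write each result to its own file before submission.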

Job Monitoring
• Every job is monitored by an instance of a module called JobMonitor.
• The JobMonitor:
  • checks the job status during execution;
  • retrieves the job output from the catalog;
  • tries to resubmit the job if it fails.
• The JobMonitor logs an error if the job fails to run correctly.
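The monitoring behaviour described above (poll the status, enforce a time limit, resubmit on failure, log errors) can be sketched as follows. This is not the authors' JobMonitor: the constructor arguments, the state strings `"RUNNING"`, `"DONE"`, `"FAILED"` and the retry limit are all assumptions, and the grid calls are passed in as plain callables so the sketch stays independent of any middleware. Output retrieval from the catalog is left out.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("JobMonitor")


class JobMonitor:
    """Polls one job until it finishes, resubmitting it on failure."""

    def __init__(self, submit: Callable[[], str], status: Callable[[str], str],
                 timeout_s: float, max_retries: int = 2, poll_s: float = 1.0):
        self.submit = submit            # hypothetical: returns a job id
        self.status = status            # hypothetical: returns a state string
        self.timeout_s = timeout_s      # the "defined time" for one attempt
        self.max_retries = max_retries  # assumed limit, not from the slides
        self.poll_s = poll_s

    def run(self) -> bool:
        """Return True if the job completed, False once all attempts failed."""
        for attempt in range(1 + self.max_retries):
            job_id = self.submit()
            deadline = time.monotonic() + self.timeout_s
            state = "RUNNING"
            while time.monotonic() < deadline:
                state = self.status(job_id)
                if state != "RUNNING":
                    break
                time.sleep(self.poll_s)
            if state == "DONE":
                return True
            if state == "RUNNING":
                state = "TIMEOUT"       # still running when the deadline passed
            log.error("job %s ended as %s (attempt %d)", job_id, state, attempt + 1)
        return False


# Usage with stand-in callables: the first attempt fails, the second succeeds.
ids = iter(["job-a", "job-b"])
outcomes = iter(["FAILED", "DONE"])
monitor = JobMonitor(submit=lambda: next(ids), status=lambda _id: next(outcomes),
                     timeout_s=5.0, poll_s=0.0)
print(monitor.run())
```

Passing `submit` and `status` as callables keeps the retry logic testable without a grid connection; a real implementation would wrap the middleware's own submission and status commands.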

Workflow: job creation, submission and data collection
• Download the input from the remote meteo-data database, create an archive and upload it to the catalog.
• Create a jdl file and a parameter file for each job.
• Submit the jobs.
• Wait for the job outputs.
• Get the job outputs from the catalog and aggregate them.
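The workflow steps above can be sketched end to end with stand-ins: a dictionary plays the role of the catalog and a thread pool plays the role of the grid. The names `run_workflow`, `fetch_input` and `run_job` are hypothetical and only illustrate the order of operations, not the real implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(fetch_input, run_job, n_jobs, catalog=None):
    """Upload the input, run n jobs, and collect their outputs in job order."""
    catalog = {} if catalog is None else catalog
    catalog["input"] = fetch_input()  # download input and upload it to the catalog

    def job(i):
        # Each job reads the shared input and writes to a key known in advance.
        catalog[f"output_{i:02d}"] = run_job(i, catalog["input"])

    # Submit all jobs, then wait for every one of them to finish.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        list(pool.map(job, range(1, n_jobs + 1)))

    # Collect the outputs from the catalog and aggregate them.
    return [catalog[f"output_{i:02d}"] for i in range(1, n_jobs + 1)]


# Toy usage: the "model" just scales the sum of the input by the job index.
result = run_workflow(lambda: [1, 2, 3], lambda i, data: sum(data) * i, n_jobs=3)
print(result)
```

Because every output key is fixed before the jobs start, the collection step needs no information from the jobs themselves beyond the fact that they have finished.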

Job definition (1)
• Each job works with a specific dataset defining a spatial domain (subset).
• These subsets are created off-line and stored in the catalog.
• A parameter file states the association between a job and a dataset.
• Each job produces an output whose path in the catalog is known a priori.

Job definition (2)
• Job 1:
  • Domain: celle/celle_01.tar.bz2
  • Status: celle/stato0_01.tar.bz2
  • Input: input/input_20070119.tar.bz2
  • Output: output/output_01_20071119.tar.bz2
• Each job has its own domain.
• The job domain, status information and output all refer to the same geographic domain.
• All jobs share the same input file.

Job definition (3)
Catalog contents:
• Job 1:
  • Domain: celle/celle_01.tar.bz2
  • Status: celle/stato0_01.tar.bz2
  • Input: input/input_20070119.tar.bz2
  • Output: output/output_01_20071119.tar.bz2
• Job 2:
  • Domain: celle/celle_02.tar.bz2
  • Status: celle/stato0_02.tar.bz2
  • Input: input/input_20070119.tar.bz2
  • Output: output/output_02_20071119.tar.bz2
• Job n:
  • Domain: celle/celle_nn.tar.bz2
  • Status: celle/stato0_nn.tar.bz2
  • Input: input/input_20070119.tar.bz2
  • Output: output/output_nn_20071119.tar.bz2
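Since every path follows the same pattern, the catalog entries of a job can be derived from its index. A minimal sketch, assuming two-digit zero-padded job numbers as in the examples above; `JobFiles` and `job_files` are illustrative names. The slides show different date stamps for the input and output files, so the sketch takes both as separate arguments rather than assuming how they are related.

```python
from typing import NamedTuple


class JobFiles(NamedTuple):
    """Catalog paths used by one job."""
    domain: str
    status: str
    input: str
    output: str


def job_files(index: int, input_date: str, output_date: str) -> JobFiles:
    """Build the catalog paths for job `index` from the naming pattern."""
    tag = f"{index:02d}"  # assumed zero-padded to two digits
    return JobFiles(
        domain=f"celle/celle_{tag}.tar.bz2",
        status=f"celle/stato0_{tag}.tar.bz2",
        input=f"input/input_{input_date}.tar.bz2",      # shared by all jobs
        output=f"output/output_{tag}_{output_date}.tar.bz2",
    )


for path in job_files(2, "20070119", "20071119"):
    print(path)
```

This is what makes the output path known a priori: the collector can compute it from the job index and date without asking the job.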

Final version
• Estimated performance on the complete data set (30M cells):
  • Total CPU time: about 2 hours and 30 minutes;
  • Optimal number of jobs: about 30 (5-10 minutes of CPU time for each job);
  • Storage: 30 GByte / day.

Test Results
• The porting was tested with a subset (1M cells) of the final working set of the RISICO system.
• 10 parallel jobs were used.
• Performance:
  • Job CPU time: 30 seconds;
  • Grid overhead: 2 minutes.
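The slides do not say how the estimate for the final version was obtained. One reading, assuming CPU time scales linearly with the number of cells, is a direct extrapolation of these test figures; the sketch below works that through. The function name is illustrative.

```python
def extrapolate_cpu_seconds(test_cells: int, test_jobs: int,
                            cpu_s_per_job: float, target_cells: int) -> float:
    """Scale the total test CPU time linearly to a larger cell count."""
    total_test_cpu = test_jobs * cpu_s_per_job  # 10 jobs x 30 s
    return total_test_cpu * target_cells / test_cells


total_s = extrapolate_cpu_seconds(1_000_000, 10, 30, 30_000_000)
print(total_s / 3600)     # total CPU time in hours
print(total_s / 30 / 60)  # CPU minutes per job when split over 30 jobs
```

Under that assumption the result is 2.5 hours of total CPU time and 5 minutes per job with 30 jobs, in line with the figures quoted for the final version. The 2-minute grid overhead is then small next to each job's CPU time, whereas in the test it was four times the 30-second job.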

Conclusions
• RISICO is a feasible and significant test case.
• The Grid architecture provides valuable benefits to operational activities.
