GRID technology by SZTAKI


  1. GRID technology by SZTAKI Peter Kacsuk, MTA SZTAKI Laboratory of Parallel and Distributed Systems, www.lpds.sztaki.hu

  2. Contents • SZTAKI participation in EU and Hungarian Grid projects • P-GRADE (Parallel GRid Application Development Environment) • Integration of P-GRADE and Condor • TotalGrid • Meteorology application by TotalGrid • Grid version of GRM/PROVE in the DataGrid project

  3. EU Grid projects of SZTAKI • DataGrid – performance monitoring and visualization • GridLab – grid monitoring and information system • APART-2 – leading the Grid performance analysis WP • SIMBEX – developing a European metacomputing system for chemists based on P-GRADE

  4. Hungarian Grid projects of SZTAKI • VISSZKI • explore and adopt Globus and Condor • DemoGrid • grid and performance monitoring and visualization • SuperGrid (Hungarian Supercomputing Grid) • integrating P-GRADE with Condor and Globus in order to provide a high-level program development environment for the Grid • Hungarian Cluster Grid Initiative • To provide a nation-wide cluster Grid for universities

  5. Hungarian and international GRID projects (diagram relating the Hungarian projects VISSZKI (Globus and Condor tests), DemoGrid (file system, monitoring, applications) and SuperGrid (P-GRADE, portal, security, accounting) to the international efforts EU DataGrid, CERN LHC Grid, Cactus, Condor, APART-2, EU GridLab and EU COST SIMBEX)

  6. Structure of the Hungarian Supercomputing Grid (diagram: NIIFI 2*64-processor Sun E10000, ELTE 16-processor Compaq AlphaServer, BME 16-processor Compaq AlphaServer, the 58-processor SZTAKI cluster and university (ELTE, BME) clusters, connected by 2.5 Gb/s Internet)

  7. The Hungarian Supercomputing GRID project (layered architecture: Grid applications; Web-based Grid access: Grid portal; high-level parallel development layer: P-GRADE; low-level parallel development: PVM, MW, MPI; Grid-level job management: Condor-G; Grid middleware: Globus; Grid fabric: Condor and SGE on the Compaq AlphaServers, the Sun HPC machine and the clusters)

  8. Distributed supercomputing: P-GRADE • P-GRADE (Parallel GRid Application Development Environment) • A highly integrated parallel Grid application development system • Provides: • Parallel, supercomputing programming for the Grid • Fast and efficient development of Grid programs • Observation and visualization of Grid programs • Fault and performance analysis of Grid programs • Further development in the Hungarian Supercomputing Grid project

  9. Three layers of GRAPNEL

  10. Communication Templates • Pre-defined regular process topologies • process farm • pipeline • 2D mesh • User defines: • representative processes • actual size • Automatic scaling
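
P-GRADE generates this communication code from the graphical template; purely for illustration, a hand-written MPI counterpart of the 2D mesh topology might look like the sketch below (not P-GRADE output; the boundary exchange is only indicated by a comment).

```c
/* Illustrative sketch of a 2D mesh topology in plain MPI -- not P-GRADE output. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Let MPI choose a balanced 2D process grid for the actual size. */
    int dims[2] = {0, 0}, periods[2] = {0, 0};
    MPI_Dims_create(size, 2, dims);

    MPI_Comm mesh;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &mesh);

    /* Each process finds its four neighbours in the mesh. */
    int up, down, left, right;
    MPI_Cart_shift(mesh, 0, 1, &up, &down);
    MPI_Cart_shift(mesh, 1, 1, &left, &right);

    /* ... exchange boundary data with the neighbours here ... */

    MPI_Comm_free(&mesh);
    MPI_Finalize();
    return 0;
}
```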

  11. Mesh Template

  12. Hierarchical Debugging by DIWIDE

  13. Macrostep Debugging • Support for systematic debugging to handle the non-deterministic behaviour of parallel applications • Automatic deadlock detection • Replay technique with collective breakpoints • Systematic and automatic generation of execution trees • Testing parallel programs for every timing condition

  14. GRM semi-on-line monitor • Monitoring and visualising parallel programs at the GRAPNEL level • Evaluation of long-running programs • Support for the debugger in P-GRADE with execution visualisation • Collection of both statistics and event traces • No loss of trace data on program abortion; the execution up to the point of abortion can be visualised • Execution (and monitoring) can run remotely from the user environment -> first step towards the Grid

  15. GRM semi-on-line monitor • Semi-on-line • stores trace events in local storage (off-line) • makes them available for analysis at any time during execution, on user or system request (on-line pull model) • Advantages • the state (performance) of the application can be analysed at any time • scalability: trace data can be analysed in smaller sections and deleted when no longer needed • less overhead/intrusion on the execution system than with on-line collection (see NetLogger) • less network traffic: pull model instead of push model; collection is initiated only from the top
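
A minimal sketch of the semi-on-line, pull-model idea is shown below. It is not the real GRM API: the names grm_log and grm_flush and the fixed-size buffer are assumptions for illustration. Each process buffers its events locally and ships a trace section only when the monitor asks for it.

```c
/* Minimal sketch of the semi-on-line idea (hypothetical names, not the GRM API):
 * events are buffered locally and only shipped when a pull request arrives. */
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BUF_EVENTS 1024

typedef struct {
    double timestamp;
    char   text[64];
} trace_event;

static trace_event buffer[BUF_EVENTS];
static int n_events = 0;

/* Instrumentation call: store the event locally, no network traffic. */
static void grm_log(const char *text)
{
    if (n_events < BUF_EVENTS) {
        buffer[n_events].timestamp = (double)time(NULL);
        strncpy(buffer[n_events].text, text, sizeof buffer[n_events].text - 1);
        buffer[n_events].text[sizeof buffer[n_events].text - 1] = '\0';
        n_events++;
    }
}

/* Pull request from the main monitor: ship the buffered section and drop it,
 * so trace data can be analysed in smaller pieces during the run. */
static void grm_flush(FILE *out)
{
    for (int i = 0; i < n_events; i++)
        fprintf(out, "%.0f %s\n", buffer[i].timestamp, buffer[i].text);
    n_events = 0;   /* this local section is no longer needed */
}

int main(void)
{
    grm_log("process started");
    grm_log("message sent");
    grm_flush(stdout);   /* in GRM this would be triggered by the main monitor */
    return 0;
}
```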

  16. PROVE Statistics Windows • Profiling based on counters • Enables analysis of very long-running programs

  17. PROVE: Visualization of Event Traces • User-controlled focus on processors, processes and messages • Visualization windows scroll forwards and backwards

  18. Integration of Macrostep Debugging and PROVE

  19. Features of P-GRADE • Designed for non-specialist programmers • Enables fast reengineering of sequential programs for parallel computers and Grid systems • Unified graphical support in program design, debugging and performance analysis • Portability on • supercomputers • heterogeneous clusters • components of the Grid • Two execution modes: • Interactive • Job level

  20. P-GRADE Interactive Mode on Clusters (diagram of the interactive development cycle: Design/Edit, Compile, Map, Debug, Monitor, Visualize)

  21. P-GRADE Job Mode for the Grid (diagram of the job cycle with P-GRADE and Condor: Design/Edit, Compile, Condor Map, Submit job, Detach, Attach)

  22. Condor/P-GRADE on the whole range of parallel and distributed systems (diagram, GFlops scale: P-GRADE with Condor flocking spans supercomputers, mainframes, clusters and the Grid)

  23. Berlin CCGrid Grid Demo workshop: flocking of P-GRADE programs by Condor (diagram: the P-GRADE program runs in turn at the Budapest, Madison and Westminster clusters)

  24. Next step: check-pointing and migration of P-GRADE programs (diagram, Wisconsin P-GRADE GUI with the London and Budapest clusters: 1. the P-GRADE program is downloaded to London as a Condor job and runs at the London cluster; 2. the London cluster becomes overloaded => check-pointing; 3. the P-GRADE program migrates to Budapest as a Condor job; 4. the program runs at the Budapest cluster)

  25. Further development: TotalGrid • TotalGrid is a total Grid solution that integrates the different software layers of a Grid (see next slide) and provides companies and universities with • exploitation of the free cycles of desktop machines in a Grid environment after working hours • supercomputer capacity from the institution's existing desktops without further investment • development and testing of Grid programs

  26. Layers of TotalGrid (software stack: P-GRADE, PERL-GRID, Condor or SGE, PVM or MPI, over Internet/Ethernet)

  27. PERL-GRID • A thin layer for • Grid-level job management between P-GRADE and various local job managers such as • Condor • SGE, etc. • file staging • job observation • Applied in the Hungarian Cluster Grid

  28. Hungarian Cluster Grid Initiative • Goal: to connect the new clusters of the Hungarian higher-education institutions into a Grid • By autumn 42 new clusters will be established at various universities of Hungary • Each cluster contains 20 PCs and a network server PC • Daytime: the components of the clusters are used for education • At night: all the clusters are connected into the Hungarian Grid by the Hungarian academic network (2.5 Gbit/s) • Total Grid capacity in 2002: 882 PCs • In 2003 a further 57 similar clusters will join the Hungarian Grid • Total Grid capacity in 2003: 2079 PCs • Open Grid: other clusters can join at any time

  29. Structure of the Hungarian Cluster Grid (diagram: TotalGrid sites connected by 2.5 Gb/s Internet; 2002: 42*21-PC Linux clusters, 882 PCs in total; 2003: 99*21-PC Linux clusters, 2079 PCs in total)

  30. Live demonstration of TotalGrid • MEANDER Nowcast Program Package: • Goal: ultra-short-range forecasting (30 minutes) of dangerous weather situations (storms, fog, etc.) • Method: analysis of all the available meteorological information to produce parameters on a regular mesh (10 km -> 1 km) • Collaborating partners: • OMSZ (Hungarian Meteorological Service) • MTA SZTAKI
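
Purely as an illustration of "producing parameters on a regular mesh", the sketch below regrids one field from a coarse to a fine mesh with simple bilinear interpolation; the actual MEANDER analysis (CANARI, delta method) is far more sophisticated, so treat this only as a toy example of the regridding step.

```c
/* Toy bilinear regridding of one field from a coarse to a fine regular mesh.
 * Purely illustrative -- not the MEANDER analysis code. */
#include <stdio.h>

#define NC 4            /* coarse grid points per side (e.g. 10 km spacing) */
#define REFINE 10       /* refinement factor (e.g. 10 km -> 1 km)           */
#define NF ((NC - 1) * REFINE + 1)

static double coarse[NC][NC];
static double fine[NF][NF];

static void regrid(void)
{
    for (int i = 0; i < NF; i++) {
        for (int j = 0; j < NF; j++) {
            double x = (double)i / REFINE, y = (double)j / REFINE;
            int i0 = (int)x, j0 = (int)y;
            int i1 = i0 + 1 < NC ? i0 + 1 : i0;
            int j1 = j0 + 1 < NC ? j0 + 1 : j0;
            double fx = x - i0, fy = y - j0;
            fine[i][j] = (1 - fx) * (1 - fy) * coarse[i0][j0]
                       + fx       * (1 - fy) * coarse[i1][j0]
                       + (1 - fx) * fy       * coarse[i0][j1]
                       + fx       * fy       * coarse[i1][j1];
        }
    }
}

int main(void)
{
    /* Fill the coarse grid with a simple gradient as a stand-in field. */
    for (int i = 0; i < NC; i++)
        for (int j = 0; j < NC; j++)
            coarse[i][j] = 10.0 + i + 0.5 * j;

    regrid();
    printf("fine[%d][%d] = %.2f\n", NF / 2, NF / 2, fine[NF / 2][NF / 2]);
    return 0;
}
```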

  31. Structure of MEANDER (dataflow diagram: inputs are ALADIN first-guess data, SYNOP data, satellite, radar and lightning observations; processing steps include decoding, CANARI, delta analysis, radar-to-grid and satellite-to-grid conversion; basic fields are pressure, temperature, humidity and wind; derived fields are cloud type, visibility, overcast, rainfall state, etc.; results for the current time are visualized as GIF for users and in HAWK for meteorologists)

  32. P-GRADE version of MEANDER

  33. Live demo of MEANDER based on TotalGrid (deployment diagram: P-GRADE and PERL-GRID submit the job for Condor-PVM parallel execution; netCDF input and output move between ftp.met.hu, the HAWK visualization and the cluster over a dedicated 11/5 Mbit link and shared 34 Mbit and 512 kbit links)

  34. Results of the delta method • Temperature fields at 850 hPa pressure • Wind speed and direction on the 3D mesh of the MEANDER system

  35. On-line Performance Visualization in TotalGrid (same deployment as the live demo; GRM is attached to the Condor-PVM parallel execution and the GRM traces are transferred back alongside the netCDF data over the dedicated 11/5 Mbit and shared 34 Mbit and 512 kbit links)

  36. PROVE visualization of the delta method

  37. GRM/PROVE in the DataGrid project • Basic tasks: • step 1: create a GRM/PROVE version that is independent of P-GRADE and runnable in the Grid • step 2: connect the GRM monitor to the R-GMA information system

  38. GRM in the Grid (architecture diagram: on each site, application processes write events into shared memory read by a Local Monitor (LM); the Main Monitor (MM) on the submit machine pulls the data into a trace file for PROVE; the pull model means smaller network traffic than in NetLogger, and the local monitors make it more scalable than NetLogger)

  39. Start-up of Local Monitors (diagram: the Main Monitor and GUI run on the server/submit host; the Grid broker and the local job manager start the application processes and their Local Monitors on the site hosts, and the Local Monitors connect back to the Main Monitor at its host:port over LAN/WAN; this mechanism is used in TotalGrid and in the live demo)

  40. 2nd step: integration with R-GMA (diagram: application processes on the site machines publish into R-GMA, from which the Main Monitor and PROVE on the client machine collect the monitoring data)

  41. Integration with R-GMA (diagram: the instrumented application code acts as a sensor and publishes events through the Producer API and Producer servlet using SQL CREATE TABLE and SQL INSERT; the Main Monitor retrieves them through the Consumer API and Consumer servlet using SQL SELECT; the Registry and Schema servlets, reached via the Registry and Schema APIs, hold the "database of event types"; data is exchanged as XML)

  42. Conclusions • SZTAKI participates in several EU and Hungarian Grid projects • Main results: • P-GRADE (SuperGrid project) • Integration of P-GRADE and Condor (SuperGrid, GridLab) • demo at the Berlin CCGrid workshop • TotalGrid (Hungarian Cluster Grid) • Grid version of GRM/PROVE (DataGrid) • Meteorology application in the Grid • continuous live demo in the registration hall

  43. Thanks for your attention! Further information: www.lpds.sztaki.hu
