
AstroGrid-D WP 5: Resource Management for Grid Jobs








  1. AstroGrid-D WP 5: Resource Management for Grid Jobs. Report by: Rainer Spurzem (ZAH-ARI) spurzem@ari.uni-heidelberg.de and T. Brüsemeister, J. Steinacker

  2. Meeting 13:10 – 14:30 WG5. Meeting of WG5 and friends: GridWay discussion (together with Ignacio Llorente). Expected list of topics:
  • The present GridWay installation in Heidelberg: solutions and problems. Which use cases work? How? Demos or screenshots if available.
  • What about more than one GridWay installation in AstroGrid-D, running simultaneously at different sites?
  • Cooperation of information system and job submission (in general, or in the specific case of our AstroGrid-D information system and GridWay)?
  • Miscellaneous (data staging postponed to the next session)

  3. Meeting 13:10 – 14:30 WG5. GridWay:
  • Lightweight metascheduler on top of GT2.4/GT4
  • Central server architecture
  • Support of the GGF DRMAA standard API for job submission and management
  • Simple round-robin/flooding scheduling algorithm, but extensible
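The round-robin dispatch mentioned above can be sketched in a few lines of Python. This is only an illustration of the policy, not GridWay code; the second host name is invented for the example (only hydra.ari.uni-heidelberg.de appears in this report):

```python
from itertools import cycle

def round_robin_dispatch(jobs, hosts):
    """Assign each job to the next host in turn, mimicking a simple
    round-robin metascheduler policy (sketch, not GridWay internals)."""
    assignment = {}
    host_cycle = cycle(hosts)
    for job in jobs:
        assignment[job] = next(host_cycle)
    return assignment

# hydra is the GT4 resource from this report; the second host is hypothetical
hosts = ["hydra.ari.uni-heidelberg.de", "grid.example.org"]
print(round_robin_dispatch(["job0", "job1", "job2"], hosts))
```

With three jobs and two hosts, job0 and job2 land on hydra and job1 on the second host, which is exactly the flooding behaviour the slide describes.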

  4. Meeting 13:10 – 14:30 WG5. A practical example with screenshots: Information System → Matchmaking → GT4 resources (hydra.ari.uni-heidelberg.de) → GridWay scheduler/broker. Job status: “gwps”

  5. Meeting 13:10 – 14:30 WG5 Our View (Thanks: Hans-Martin)

  6. Meeting 13:10 – 14:30 WG5
  • D5.1: central resource broker with queue
  • Present: use GridWay as a throughway, round-robin
  • More installations useful?
  • Questions:
  • Which parameters does GridWay need from the information system (queue status, module availability, data availability, hardware)?
  • When is it feasible to have real brokerage? How?

  7. Meeting 15:15 – 17:00 Use Cases
  • Porting use cases onto the grid. NBODY6++, the astrophysical case for direct N-body: star clusters, galactic nuclei, black holes, gravitational-wave generation
  • Special hardware: GRAPE, MPRACE (FPGA), future technologies (HT, Xtoll, GRAPE-DR)
  • GRAPE in the grid: AstroGrid-D, international
  • DEISA

  8. Meeting 15:15 – 17:00 Use Cases

  9. N-Body + Grav. Waves @ ARI: Peter Berczik, Ingo Berentzen, Jonathan Downing, Miguel Preto, Gabor Kupi, Christoph Eichhorn, David Merritt (RIT, USA)… In VESF/LSC collaboration on gravitational-wave modelling from dense star clusters: Pau Amaro-Seoane (AEI, Potsdam, D), G. Schäfer, A. Gopakumar (Univ. Jena, D), M. Benacquista (UT Brownsville, USA). Further collaborations: Sverre Aarseth (IoA Cambridge, UK), Seppo Mikkola (U Turku, FIN), Jun Makino and colleagues in Tokyo… support and cooperation over many years.

  10. Globular Cluster ω Centauri (Central Region) Ground Based View

  11. Detection of Gravitational Waves? Was Einstein right?

  12. Example: VIRGO Detector in Cascina near Pisa, Italy

  13. Basic idea of any GRAPE N-body code: the host handles the ~N part (prediction, correction, bookkeeping), while the GRAPE hardware evaluates the ~N² force summation.
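The ~N² part that GRAPE accelerates is the pairwise force sum. A minimal sketch of that loop follows; it is illustrative Python, not NBODY6++ code, and the choices G = 1 and a Plummer softening eps are assumptions for the example:

```python
def accelerations(pos, mass, eps=1e-4):
    """Direct-summation gravitational accelerations: the O(N^2) loop
    a GRAPE board evaluates in hardware (toy version, G = 1,
    Plummer softening eps to avoid singularities)."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # separation vector from particle i to particle j
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + eps * eps
            inv_r3 = r2 ** -1.5
            for k in range(3):
                acc[i][k] += mass[j] * dx[k] * inv_r3
    return acc
```

Everything outside this double loop (time-step control, prediction, output) is the ~N work that stays on the host, which is why attaching special hardware to each cluster node pays off.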

  14. Hardware - GRAPE: ~128 Gflops for a price of ~5K USD; memory for up to 128K particles
  • GRAPE6a PCI board
  • GRAPE6a, -BL: PCI boards for PC clusters
  • PROGRAPE-4: FPGA-based board from RIKEN (Hamada)
  • GRAPE7: new FPGA-based board from Tokyo Univ. (Fukushige)
  • GRAPE-DR: new board from Makino et al., NAOJ
  • MPRACE1,2: FPGA boards from Univ. Mannheim/GRACE (Kugel et al.)

  15. ARI and RIT 32-node GRAPE6a clusters
  RIT cluster: • 32 dual-Xeon 3.0 GHz nodes • 32 GRAPE6a • 14 TB RAID • Infiniband link (10 Gb/s) • Speed: ~4 Tflops • N up to 4M • Cost: ~500K USD • Funding: NSF/NASA/RIT
  ARI cluster: • 32 dual-Xeon 3.2 GHz nodes • 32 GRAPE6a • 32 FPGA • 7 TB RAID • Dual-port Infiniband link (20 Gb/s) • Speed: ~4 Tflops • N up to 4M • Cost: ~380K EUR • Funding: Volkswagen/Baden-Württemberg

  16. ARI-ZAH + RIT GRAPE6a clusters Performance Analysis (3.2 Tflop/s): Harfst et al. 2007, New Astron.

  17. Hardware

  18. Meeting 15:15 – 17:00 Use Cases. Software: high-accuracy integrators for systems with long-range force + relaxation (gravothermal)
  • S. J. Aarseth, S. Mikkola (ca. 20,000 lines):
  • Hierarchical block time steps
  • Ahmad-Cohen neighbour scheme
  • Kustaanheimo-Stiefel and chain regularization for bound subsystems of N<6 (quaternions!)
  • 4th-order Hermite scheme (predictor/corrector)
  • Bulirsch-Stoer (for KS)
  • NBODY6 (Aarseth 1999)
  • NBODY6++ (Spurzem 1999) using MPI/shmem, copy algorithm
  • Parallel binary integration in progress
  • Parallel GRAPE use (Harfst, Gualandris, Merritt, Spurzem, Berczik, Portegies Zwart, 2007)
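The hierarchical block time-step idea from the list above can be sketched in a few lines: each particle's "natural" time step is rounded down to a power-of-two fraction of the maximum step, so particles fall into discrete blocks that come due at commensurate times. This is a sketch of the scheme only, not Aarseth's code:

```python
import math

def block_timestep(dt_natural, dt_max=1.0):
    """Quantize a particle's natural time step down to the nearest
    power-of-two fraction of dt_max (hierarchical block time steps)."""
    if dt_natural >= dt_max:
        return dt_max
    # smallest k with dt_max / 2**k <= dt_natural
    k = math.ceil(math.log2(dt_max / dt_natural))
    return dt_max / 2 ** k
```

Because every step is dt_max/2^k, particles on small steps resynchronize exactly with particles on larger ones, which is what makes block scheduling (and its parallelization) cheap.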

  19. Meeting 15:15 – 17:00 Use Cases. High-accuracy integrators: record with GRAPE cluster at 2 million particles! (Harfst, Gualandris, Merritt, Spurzem, Berczik; Baumgardt, Heggie, Hut; Baumgardt, Makino.) [Figure by D. C. Heggie, via www.maths.ed.ac.uk] Larger N needed!

  20. Meeting 15:15 – 17:00 Use Cases. ARI cluster: ~3.2 Tflop/s sustained. Harfst, Gualandris, Merritt, Spurzem, Portegies Zwart, Berczik, New Astron. 2007. Parallel PP on GRAPE6a cluster.

  21. Visualisation. With S. Dominiczak and W. Frings, John von Neumann Institute for Computing (NIC), FZ Jülich. Google for xnbody.

  22. Meeting 15:15 – 17:00 Use Cases
  • Xnbody visualization with FZ Jülich (Unicore)
  • NBODY6++ use case in AstroGrid-D (Globus GT4.0): simple JSDL job OK; parallel job + GRAPE/MPRACE request in progress
  • Participation in international networks, like MODEST, AGENA (EGEE)
  • Goal: share and load-balance GRAPE/MPRACE resources in an international grid-based framework

  23. Meeting 15:15 – 17:00 Use Cases. International GRAPE-Grid Collaboration. Members of AstroGrid-D: ARI-ZAH Univ. Heidelberg, D; Main Astron. Obs. Kiev, UA. Candidates: Univ. Amsterdam, NL; Obs. Astroph. Marseille, F; Fessenkov Obs., Almaty, KZ.

  24. Meeting 15:15 – 17:00 Use Cases. NBODY6++ requirements
  • Fortran 77 with cpp preprocessor and make
  • Data access for job chains
  • Staging of binary and ASCII input/output
  • Optional: parallel runs (PBS, mpich-mpif77, mpirun, others); GRAPE hardware; xnbody direct visualization and interaction interface
  • Future: GridMPI, runs across sites

  25. Meeting 17:30 – 18:30 WG5 with WG3. Common workgroup meeting of WG3 (Distributed Data Management) with WG5 (Resource Management for Grid Jobs). Expected list of topics:
  • How can we improve data staging together? Which steps, what is needed, action items, people?
  • Further interaction with other WGs, e.g. WG7 user interfaces, WG6 data streaming, WG1 system integration
  • Next deliverables 5.4–5.8, others...
  • Open discussion on sustainability, internationality, EGEE, follow-up project, breakout ideas, guided by the goals of last year

  26. Meeting 17:30 – 18:30 WG5 with WG3
  • How can we improve data staging together? Which steps, what is needed, action items, people?
  • Use the AstroGrid-D file management system?

  27. WP5: Resource Management for Grid Jobs. Tasks
  • Task V-1: Specification of Requirements and Architecture. AIP (8), ARI-ZAH (6), ZIB (6), AEI (2), MPE (2), MPA (1). Start Sep. 05; Deliverable D5.1 Oct. 2006. COMPLETED
  • Task V-2: Development of Grid-Job Management (Feb. 07). ZIB (24), ARI-ZAH (12), MPA (5). Start June 06; Deliverables D5.2 Feb. 2007, D5.6 June 2008. D5.2 COMPLETED
  • Task V-4: Adaptation of User and Programmer Interfaces (May 07). AIP (18), ARI-ZAH (12), AEI (5), MPE (4), MPA (1). Start Dec. 06; Deliverables D5.4 May 2007, D5.7 Sep. 2008. PENDING
  • Task V-3: Development of Link to Robotic Telescopes, Requests (Feb. 07). AIP (17), ZIB (6). Start Sep. 06; Deliverables D5.3 Feb. 2007, D5.5 Oct. 2007, D5.8 Sep. 2008. IN PROGRESS

  28. Meeting 17:30 – 18:30 WG5 with WG3. Next steps in WG-5 / WG-3
  Short term:
  • Improve the deployment by pushing the implementation of modules for at least 2–5 pioneer use cases (this year) [D5.4, 5.7]
  • Demonstrate the ability to deploy and run these use cases on more than one resource using GridWay (this year) [D5.4, 5.7]
  • Use first primitive data staging (handing data through)
  • Note: useful document "GridGateWay", 2007-10-05, by HMA et al.
  Middle term:
  • Enable GridWay as AstroGrid-D job manager (May 08) [D5.6]
  • Solve the problem of how to handle data management together with GridWay (Aug 08) [TA II-5]
  • Increase the number of use cases and prospective users [D5.4]
  • Improve international impact / compatibility, e.g. with EGEE

  29. WG5: Current status, Job Management. Decision for the Job Submission Description Language (JSDL), which is supported by the Open Grid Forum (OGF). Pipeline: GUI → JSDL → jsdlproc → RSL/XML → GT4.0. (GT4.2 is currently under development and will support JSDL directly.)
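A minimal JSDL job description can be generated with a short script. The element layout and namespaces below follow the GGF/OGF JSDL 1.0 schema; the executable path is hypothetical, and a real AstroGrid-D job would add resource and staging elements:

```python
import xml.etree.ElementTree as ET

# GGF/OGF JSDL 1.0 namespaces
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"

def simple_jsdl(executable, arguments):
    """Build a bare-bones JSDL JobDefinition with a POSIXApplication
    (sketch; real jobs would also declare resources and data staging)."""
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    return ET.tostring(job, encoding="unicode")

# hypothetical executable path, for illustration only
print(simple_jsdl("/usr/local/bin/nbody6++", ["input.par"]))
```

A translator like the jsdlproc step in the pipeline would consume such a document and emit RSL/XML for GT4.0.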

  30. WG5: Current status, Scheduler/Broker. GridWay:
  • Lightweight metascheduler on top of GT2.4/GT4
  • Central server architecture
  • Support of the GGF DRMAA standard API for job submission and management
  • Simple round-robin/flooding scheduling algorithm, but extensible

  31. WG5: Current status, Scheduler/Broker. Information System → Matchmaking → GT4 resources (hydra.ari.uni-heidelberg.de) → GridWay scheduler/broker. Job status: “gwps”

  32. WG5: Current status, Robotic Telescopes STELLA-I. First steps accomplished toward the integration into AstroGrid:
  • Adopted the Remote Telescope Markup Language (RTML) and developed a first description of STELLA-I
  • This description can contain dynamic information, e.g. about weather
  • Developed a generic transformation from RTML to RDF which we can upload to the AstroGrid information service (for this we modified the program OwlMap from the FRESCO project)
  • The user can use SPARQL queries to find appropriate telescopes
  • SPARQL queries can also be implemented in tools like the Grid-Resource Map
  Robotic telescopes STELLA-I & II in Tenerife (Canary Islands)
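The SPARQL-style matchmaking against the information service can be illustrated without a real triple store: the tuples below mimic RDF triples (all property values are invented for the example), and the filter plays the role of a two-pattern SPARQL SELECT:

```python
# RDF-like (subject, predicate, object) triples; values are illustrative,
# not real STELLA/RoboTel metadata
triples = [
    ("stella-1", "type", "RoboticTelescope"),
    ("stella-1", "site", "Tenerife"),
    ("stella-1", "weather", "clear"),
    ("robotel", "type", "RoboticTelescope"),
    ("robotel", "site", "Potsdam"),
    ("robotel", "weather", "cloudy"),
]

def telescopes_with(predicate, value, triples):
    """All robotic telescopes whose `predicate` equals `value`; the
    plain-Python analogue of a SPARQL query with two triple patterns."""
    scopes = {s for s, p, o in triples
              if p == "type" and o == "RoboticTelescope"}
    return sorted(s for s, p, o in triples
                  if s in scopes and p == predicate and o == value)

print(telescopes_with("weather", "clear", triples))  # ['stella-1']
```

The same query over the uploaded RDF would let a user, or the Grid-Resource Map, pick only telescopes currently reporting good weather.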

  33. WG5: Next Steps, Robotic Telescopes. Next steps:
  • RTML description of STELLA-II, RoboTel and other robotic telescopes
  • Develop a system that adds dynamic weather information
  • Develop a transformation from RTML to a telescope-specific language for AIP-operated telescopes, to be able to send observation requests in RTML
  • Provide access through the AstroGrid by applying grid security mechanisms and VO management
  • Development of a scheduler for a network of robotic telescopes
  • A lot of testing: the AIP has a simulator for STELLA and RoboTel
