AstroGrid-D WP 5: Resource Management for Grid Jobs
Report by: Rainer Spurzem (ZAH-ARI), spurzem@ari.uni-heidelberg.de, with T. Brüsemeister and J. Steinacker
Meeting 13:10 – 14:30 WG5
Meeting of WG5 and friends: GridWay discussion (together with Ignacio Llorente). Expected list of topics:
• The present GridWay installation in Heidelberg: solutions and problems. Which use cases work, and how? Demos or screenshots if available.
• Could more than one GridWay installation run simultaneously at different AstroGrid-D sites?
• Cooperation of the information system and job submission (in general, or in the specific case of the AstroGrid-D information system and GridWay)
• Miscellaneous (data staging postponed to the next session)
Meeting 13:10 – 14:30 WG5
GridWay
• Lightweight metascheduler on top of GT2.4/GT4
• Central server architecture
• Support of the GGF DRMAA standard API for job submission and management (a minimal sketch follows below)
• Simple round-robin/flooding scheduling algorithm, but extensible
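Since GridWay exposes the GGF DRMAA standard, jobs can be submitted programmatically as well as from the command line. The following is a minimal sketch using the DRMAA C binding; the executable is a placeholder, and the exact include/link flags depend on the local GridWay installation.

```c
/* Minimal sketch: submit one job through a DRMAA C binding such as
 * GridWay's. "/bin/hostname" is a placeholder application; error
 * handling is reduced to the essentials. */
#include <stdio.h>
#include "drmaa.h"

int main(void)
{
    char error[DRMAA_ERROR_STRING_BUFFER];
    char jobid[DRMAA_JOBNAME_BUFFER];
    char jobid_out[DRMAA_JOBNAME_BUFFER];
    drmaa_job_template_t *jt = NULL;
    int stat = 0;

    if (drmaa_init(NULL, error, sizeof(error) - 1) != DRMAA_ERRNO_SUCCESS) {
        fprintf(stderr, "drmaa_init failed: %s\n", error);
        return 1;
    }

    /* Build a job template: here only the remote command is set. */
    drmaa_allocate_job_template(&jt, error, sizeof(error) - 1);
    drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "/bin/hostname",
                        error, sizeof(error) - 1);

    /* Submit and block until the job finishes. */
    drmaa_run_job(jobid, sizeof(jobid) - 1, jt, error, sizeof(error) - 1);
    printf("submitted job %s\n", jobid);
    drmaa_wait(jobid, jobid_out, sizeof(jobid_out) - 1, &stat,
               DRMAA_TIMEOUT_WAIT_FOREVER, NULL, error, sizeof(error) - 1);

    drmaa_delete_job_template(jt, error, sizeof(error) - 1);
    drmaa_exit(error, sizeof(error) - 1);
    return 0;
}
```

On the command line, the equivalent workflow is a GridWay job template submitted with gwsubmit and monitored with gwps, as in the demo on the next slide.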
Meeting 13:10 – 14:30 WG5
A practical example with screenshots: the GridWay scheduler/broker does the matchmaking between the information system and the GT4 resources (hydra.ari.uni-heidelberg.de); job status is checked with "gwps".
Meeting 13:10 – 14:30 WG5 Our View (Thanks: Hans-Martin)
Meeting 13:10 – 14:30 WG5
• D5.1: central resource broker with queue
• Present: GridWay used as a pass-through with round-robin scheduling
• Would more installations be useful?
• Questions:
• Which parameters does GridWay need from the information system (queue status, module availability, data availability, hardware)?
• When is it feasible to have real brokerage? How?
Meeting 15:15 – 17:00 Use Cases
• Porting use cases onto the grid: NBODY6++, the astrophysical case for direct N-body (star clusters, galactic nuclei, black holes, gravitational wave generation)
• Special hardware: GRAPE, MPRACE (FPGA), future technologies (HT, Xtoll, GRAPE-DR)
• GRAPE in the grid: AstroGrid-D, international
• DEISA
N-Body + Grav. Waves @ ARI: Peter Berczik, Ingo Berentzen, Jonathan Downing, Miguel Preto, Gabor Kupi, Christoph Eichhorn, David Merritt (RIT, USA)…
In VESF/LSC collaboration on gravitational wave modelling from dense star clusters: Pau Amaro-Seoane (AEI, Potsdam, D), G. Schäfer, A. Gopakumar (Univ. Jena, D), M. Benacquista (UT Brownsville, USA)
Further collaborations: Sverre Aarseth (IoA Cambridge, UK), Seppo Mikkola (U Turku, FIN), Jun Makino and colleagues in Tokyo…
…support and cooperation over many years…
Globular Cluster ω Centauri (Central Region) Ground Based View
Detection of Gravitational Waves? Was Einstein right?
Hardware – GRAPE
• GRAPE6a PCI board: ~128 Gflops for a price of ~5K USD; memory for up to 128K particles
• GRAPE6a, -BL: PCI boards for PC clusters
• PROGRAPE-4: FPGA-based board from RIKEN (Hamada)
• GRAPE7: new FPGA-based board from Tokyo Univ. (Fukushige)
• GRAPE-DR: new board from Makino et al., NAOJ
• MPRACE1,2: FPGA boards from Univ. Mannheim/GRACE (Kugel et al.)
ARI-ZAH and RIT 32-node GRAPE6a clusters
RIT cluster: • 32 dual-Xeon 3.0 GHz nodes • 32 GRAPE6a • 14 TB RAID • Infiniband link (10 Gb/s) • Speed: ~4 Tflops • N up to 4M • Cost: ~500K USD • Funding: NSF/NASA/RIT
ARI cluster: • 32 dual-Xeon 3.2 GHz nodes • 32 GRAPE6a • 32 FPGA • 7 TB RAID • Dual-port Infiniband link (20 Gb/s) • Speed: ~4 Tflops • N up to 4M • Cost: ~380K EUR • Funding: Volkswagen/Baden-Württemberg
ARI-ZAH + RIT GRAPE6a clusters Performance Analysis (3.2 Tflop/s): Harfst et al. 2007, New Astron.
Meeting 15:15 – 17:00 Use Cases
Software: high-accuracy integrators for systems with long-range force + relaxation (gravothermal)
• S. J. Aarseth, S. Mikkola (ca. 20,000 lines):
• Hierarchical block time steps
• Ahmad-Cohen neighbour scheme
• Kustaanheimo-Stiefel and chain regularisation for bound subsystems of N < 6 (quaternions!)
• 4th-order Hermite scheme (predictor/corrector), see the sketch below
• Bulirsch-Stoer (for KS)
• NBODY6 (Aarseth 1999)
• NBODY6++ (Spurzem 1999) using MPI/shmem, copy algorithm
• Parallel binary integration in progress
• Parallel GRAPE use (Harfst, Gualandris, Merritt, Spurzem, Berczik, Portegies Zwart, 2007)
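To make the prediction phase of the Hermite scheme concrete, here is a minimal sketch in C of the predictor step only: positions and velocities are extrapolated from the acceleration and its time derivative (the jerk). Variable names are illustrative and not taken from the NBODY6++ source; the corrector and the block-time-step bookkeeping are omitted.

```c
/* Minimal sketch of the 4th-order Hermite predictor used by
 * NBODY6(++)-style integrators. Illustrative only; not NBODY6++ code. */
typedef struct {
    double x[3];    /* position                        */
    double v[3];    /* velocity                        */
    double a[3];    /* acceleration                    */
    double adot[3]; /* time derivative of a (the jerk) */
} Particle;

/* Predict position and velocity of particle p forward by dt. */
static void hermite_predict(const Particle *p, double dt,
                            double x_pred[3], double v_pred[3])
{
    const double dt2 = dt * dt / 2.0;  /* dt^2 / 2! */
    const double dt3 = dt2 * dt / 3.0; /* dt^3 / 3! */
    for (int k = 0; k < 3; ++k) {
        x_pred[k] = p->x[k] + p->v[k] * dt + p->a[k] * dt2 + p->adot[k] * dt3;
        v_pred[k] = p->v[k] + p->a[k] * dt + p->adot[k] * dt2;
    }
}
```

The corrector then recomputes force and jerk at the predicted positions and applies a 4th-order correction, while the hierarchical block time steps determine which particles are active at a given time.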
Meeting 15:15 – 17:00 Use Cases
High-accuracy integrators: record with the GRAPE cluster at 2 million particles! (Harfst, Gualandris, Merritt, Spurzem, Berczik; Baumgardt, Heggie, Hut; Baumgardt, Makino)
[Figure by D.C. Heggie, via www.maths.ed.ac.uk] Larger N needed!
Meeting 15:15 – 17:00 Use Cases
ARI cluster: ~3.2 Tflop/s sustained for parallel PP on the GRAPE6a cluster (Harfst, Gualandris, Merritt, Spurzem, Portegies Zwart, Berczik, New Astron. 2007)
Visualisation with S. Dominiczak and W. Frings, John von Neumann Institute for Computing (NIC), FZ Jülich. Google for xnbody.
Meeting 15:15 – 17:00 Use Cases
• Xnbody visualization with FZ Jülich (UNICORE)
• NBODY6++ use case in AstroGrid-D (Globus GT4.0): simple JSDL job OK; parallel job + GRAPE/MPRACE request in progress
• Participation in international networks like MODEST, AGENA (EGEE)
• Goal: share and load-balance GRAPE/MPRACE resources in an international grid-based framework
Meeting 15:15 – 17:00 Use Cases
International GRAPE-Grid Collaboration
Members of AstroGrid-D: ARI-ZAH Univ. Heidelberg, D; Main Astron. Obs. Kiev, UA
Candidates: Univ. Amsterdam, NL; Obs. Astroph. Marseille, F; Fessenkov Obs., Almaty, KZ
Meeting 15:15 – 17:00 Use Cases
NBODY6++ requirements
• Fortran 77 with cpp preprocessor and make
• Data access for the job chain
• Staging of binary and ASCII input/output (an illustrative job template follows below)
• Optional: parallel runs (PBS, mpich-mpif77, mpirun, others); GRAPE hardware; xnbody direct visualization and interaction interface
• Future: GridMPI, runs across sites
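These staging requirements map naturally onto a GridWay job template. The sketch below is illustrative only: the executable name, file names, and the requirement expression are assumptions and do not come from the actual AstroGrid-D deployment.

```
# Hypothetical GridWay job template (nbody6.jt) for a serial NBODY6++ run.
# All names below are placeholders.
EXECUTABLE   = nbody6++                            # statically linked binary, staged in
STDIN_FILE   = input.ascii                         # ASCII run parameters
STDOUT_FILE  = output.ascii
STDERR_FILE  = stderr.log
INPUT_FILES  = nbody6++, input.ascii, restart.bin  # binary + ASCII staging
OUTPUT_FILES = restart.bin                         # restart data for the job chain
REQUIREMENTS = HOSTNAME = "hydra.ari.uni-heidelberg.de"
```

Submission and monitoring would then use the usual GridWay commands (gwsubmit -t nbody6.jt; gwps).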
Meeting 17:30 – 18:30 WG5 with WG3
Common workgroup meeting of WG3 (Distributed Data Management) with WG5 (Resource Management for Grid Jobs). Expected list of topics:
• How can we improve data staging together? Which steps, what is needed, action items, people?
• Further interaction with other WGs, e.g. WG7 user interfaces, WG6 data streaming, WG1 system integration
• Next deliverables 5.4-5.8, others...
• Open discussion on sustainability, internationality, EGEE, follow-up project, breakout ideas, guided by last year's goals
Meeting 17:30 – 18:30 WG5 with WG3
• How can we improve data staging together? Which steps, what is needed, action items, people?
• Use the AstroGrid-D file management system?
WP5: Resource Management for Grid Jobs – Tasks
• Task V-1: Specification of Requirements and Architecture. AIP (8), ARI-ZAH (6), ZIB (6), AEI (2), MPE (2), MPA (1). Start Sep. 05; Deliverable D5.1 Oct. 2006. COMPLETED
• Task V-2: Development of Grid-Job Management (Feb. 07). ZIB (24), ARI-ZAH (12), MPA (5). Start June 06; Deliverables D5.2 Feb. 2007, D5.6 June 2008. D5.2 COMPLETED
• Task V-4: Adaptation of User and Programmer Interfaces (May 07). AIP (18), ARI-ZAH (12), AEI (5), MPE (4), MPA (1). Start Dec. 06; Deliverables D5.4 May 2007, D5.7 Sep. 2008. PENDING
• Task V-3: Development of Link to Robotic Telescopes, Requests (Feb. 07). AIP (17), ZIB (6). Start Sep. 06; Deliverables D5.3 Feb. 2007, D5.5 Oct. 2007, D5.8 Sep. 2008. IN PROGRESS
Meeting 17:30 – 18:30 WG5 with WG3
Next steps in WG5 / WG3
Short term:
• Improve deployment by pushing the implementation of modules for at least 2-5 pioneer use cases (this year) [D5.4, 5.7]
• Demonstrate the ability to deploy and run these use cases on more than one resource using GridWay (this year) [D5.4, 5.7]
• Use a first primitive form of data staging (handing data through)
• Note: useful document "GridGateWay", 2007-10-05, by HMA et al.
Middle term:
• Enable GridWay as the AstroGrid-D job manager (May 08) [D5.6]
• Solve the problem of how to handle data management together with GridWay (Aug 08) [TA II-5]
• Increase the number of use cases and prospective users [D5.4]
• Improve international impact / compatibility, e.g. with EGEE
WG5: Current Status, Job Management
• Decision for the Job Submission Description Language (JSDL), which is supported by the Open Grid Forum (OGF)
• Pipeline: GUI / JSDL → jsdlproc → RSL/XML → GT4.0 (a minimal JSDL example follows below)
• GT4.2 is currently under development and will support JSDL directly
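For reference, a minimal JSDL document of the kind jsdlproc would translate into RSL/XML might look as follows; this is a sketch following the JSDL 1.0 schema, and the job name, executable, and output file are placeholders rather than values from the AstroGrid-D use cases.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical minimal JSDL 1.0 job description; names are placeholders. -->
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobName>nbody6pp-test</jsdl:JobName>
    </jsdl:JobIdentification>
    <jsdl:Application>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>/bin/hostname</jsdl-posix:Executable>
        <jsdl-posix:Output>stdout.txt</jsdl-posix:Output>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
```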
WG5: Current Status, Scheduler/Broker
GridWay
• Lightweight metascheduler on top of GT2.4/GT4
• Central server architecture
• Support of the GGF DRMAA standard API for job submission and management
• Simple round-robin/flooding scheduling algorithm, but extensible
WG5: Current Status, Scheduler/Broker
The GridWay scheduler/broker does the matchmaking between the information system and the GT4 resources (hydra.ari.uni-heidelberg.de); job status is checked with "gwps".
WG5: Current Status, Robotic Telescopes
STELLA-I: first steps accomplished toward the integration into AstroGrid-D
• Adopted the Remote Telescope Markup Language (RTML) and developed a first description of STELLA-I
• This description can contain dynamic information, e.g. about the weather
• Developed a generic transformation from RTML to RDF, which we can upload to the AstroGrid-D information service (for this we modified the program OwlMap from the FRESCO project)
• Users can employ SPARQL queries to find appropriate telescopes
• SPARQL queries can also be implemented in tools like the Grid-Resource Map
Robotic telescopes STELLA-I & II on Tenerife (Canary Islands)
WG5: Next Steps, Robotic Telescopes
• RTML descriptions of STELLA-II, RoboTel, and other robotic telescopes
• Develop a system that adds dynamic weather information
• Develop a transformation from RTML to the telescope-specific language of AIP-operated telescopes, to be able to send observation requests in RTML
• Provide access through AstroGrid-D by applying grid security mechanisms and VO management
• Development of a scheduler for a network of robotic telescopes
• A lot of testing; the AIP has a simulator for STELLA and RoboTel