
The German HEP Community Grid


Presentation Transcript


  1. The German HEP Community Grid. P. Malzacher (P.Malzacher@gsi.de) for the German HEP Community Grid, 27 March 2007, ISGC2007, Taipei. Agenda: D-Grid in context, HEP Community Grid, HEP-CG Work Packages, Summary.

  2. About 10,000 scientists from 1,000 institutes in more than 100 countries investigate fundamental problems of particle physics with the help of huge accelerators.

  3. D-Grid in context: e-Science in Germany. [Timeline 2000-2010: EDG, followed by EGEE, EGEE 2 and EGEE 3 (?); LCG R&D and the WLCG ramp-up towards the 2008 pp run (Mar.-Sep.) and heavy-ion run (Oct.); GridKa/GGUS; the Berlin Initiative leading to the D-Grid Initiative with DGI, DGI 2 and the Community Grids, among them the HEP CG, aiming at commercial uptake of the services.]

  4. www.d-grid.de

  5. e-Science = knowledge management & grid computing & e-learning (see the talk by Anette Weisbecker in Life Sciences I). Community Grids: HEP CG, TextGrid, AstroGrid, MediGrid, InGrid, C3 Grid. The D-Grid Integration Project provides the generic platform and generic Grid services.

  6. D-Grid sites: RRZN, PC², TUD, FZJ, RWTH, FHG/ITWM, Uni-KA, FZK, RZG, LRZ. D-Grid WPs: Middleware & Tools, Infrastructure, Network & Security, Management & Sustainability (see the talks by Thomas Fieseler in Operation I and Michael Rambadt in Middleware II). • Middleware: Globus 4.x, gLite (LCG), UNICORE, GAT and GridSphere • Data management: SRM/dCache, OGSA-DAI, metadata schemas • VO management: VOMS and Shibboleth

  7. LHC groups in Germany • Alice: Darmstadt, Frankfurt, Heidelberg, Münster • ATLAS: Berlin, Bonn, Dortmund, Dresden, Freiburg, Gießen, Heidelberg, Mainz, Mannheim, München, Siegen, Wuppertal • CMS: Aachen, Hamburg, Karlsruhe • LHCb: Heidelberg, Dortmund

  8. German HEP institutes on the WLCG monitoring map. • WLCG: Karlsruhe (GridKa & Uni), DESY, GSI, München, Aachen, Wuppertal, Münster, Dortmund, Freiburg

  9. HEP CG partners: • Project partners: Uni Dortmund, TU Dresden, LMU München, Uni Siegen, Uni Wuppertal, DESY (Hamburg & Zeuthen), GSI • Via subcontract: Uni Freiburg, Konrad-Zuse-Zentrum Berlin • Unfunded: Uni Mainz, HU Berlin, MPI f. Physik München, LRZ München, Uni Karlsruhe, MPI Heidelberg, RZ Garching, John von Neumann Institut für Computing, FZ Karlsruhe

  10. Focus on tools to improve data analysis for HEP and astroparticle physics. Focus on gaps, do not reinvent the wheel. • Data management: advanced scalable data management, job- and data co-scheduling, extendable metadata catalogues for lattice QCD and astroparticle physics • Job monitoring and automated user support: information services, improved job-failure treatment, incremental results of distributed analysis • End-user data analysis tools: physics- and user-oriented job scheduling, workflows, automatic job scheduling. All development is based on LCG / EGEE software and will be kept compatible!

  11. HEP CG WP1: Data Management • Coordination: P. Fuhrmann, DESY • Developing and supporting a scalable Storage Element based on Grid standards (DESY, Uni Dortmund, Uni Freiburg; unfunded: FZK) • Combined job and data scheduling, accounting and monitoring of the data used (Uni Dortmund) • Development of grid-based, extendable metadata catalogues with semantic world-wide access (DESY, ZIB; unfunded: Humboldt Uni Berlin, NIC)

  12. Scalable Storage Element: dCache (dCache.org). dCache scales from a single host with ~10 TB and zero maintenance up to installations with thousands of pools, PB of disk storage and hundreds of file transfers per second, operated with no more than 2 FTEs. • The dCache project is funded by DESY, Fermilab, Open Science Grid and in part by the Nordic Data Grid Facility. • HEP CG contributes professional product management: code versioning, packaging, user support and test suites.

  13. dCache: the principle. [Architecture diagram: storage control via SRM; protocol engines for (gsi)FTP, http(g), xRoot and dCap (POSIX I/O) streaming data; an information provider; the dCache controller managing disk storage; and an HSM adapter connecting to the tape storage backend.]
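
  As an illustration of the dCap access path listed on the slide, the short ROOT macro below opens a file on a dCache Storage Element through a dcap:// URL. The door host, port and file path are invented placeholders, and the macro assumes a ROOT build with the dCache plugin (TDCacheFile) available.

  // read_from_dcache.C -- minimal sketch; host and path are hypothetical
  void read_from_dcache()
  {
     // TFile::Open dispatches dcap:// URLs to the dCache plugin (TDCacheFile)
     TFile *f = TFile::Open("dcap://dcache-door.example.de:22125/pnfs/example.de/data/user/test.root");
     if (!f || f->IsZombie()) {
        Printf("could not open file via dCap");
        return;
     }
     f->ls();     // list the objects stored in the file
     f->Close();
     delete f;
  }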

  14. CPU and data co-scheduling: distinguish online from near-line files and use information about the time needed to bring a file online.
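
  The slide only states the idea; the conceptual sketch below shows how such a decision could use staging-time estimates. The Replica structure, the helper and all numbers are invented for illustration and are not part of any HEP-CG component.

  #include <string>
  #include <vector>
  #include <limits>
  #include <iostream>

  // Hypothetical per-site view of one file replica.
  struct Replica {
      std::string site;
      bool online;          // true: on a disk pool, false: near-line (tape only)
      double stageSeconds;  // estimated time to bring the file online
      double queueSeconds;  // estimated batch queue waiting time at the site
  };

  // Pick the site where the job can start reading earliest: an online replica
  // costs only the queue time, a near-line replica additionally costs staging.
  std::string chooseSite(const std::vector<Replica>& replicas)
  {
      std::string best;
      double bestCost = std::numeric_limits<double>::max();
      for (const auto& r : replicas) {
          double cost = r.queueSeconds + (r.online ? 0.0 : r.stageSeconds);
          if (cost < bestCost) { bestCost = cost; best = r.site; }
      }
      return best;
  }

  int main()
  {
      // Example numbers are made up for illustration only.
      std::vector<Replica> replicas = {
          {"GridKa", false, 1800.0, 120.0},   // on tape, ~30 min to stage
          {"DESY",   true,     0.0, 900.0},   // on disk, but longer queue
      };
      std::cout << "schedule job at: " << chooseSite(replicas) << "\n";
  }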

  15. HEP CG WP2: Job Monitoring and User Support Tools • Coordination: P. Mättig, Uni Wuppertal • Development of a job information system (TU Dresden) • Development of an expert system to classify job failures and automatically treat the most common errors (Uni Wuppertal; unfunded: FZK) • R&D on interactive job steering and access to temporary, incomplete analysis job results (Uni Siegen)

  16. User-specific job and resource-usage monitoring.

  17. Focus on the many-jobs scenario. Ease of use: the user should not need to know more than necessary, which should be almost nothing. From general to detailed views on jobs, with information such as status, resource usage, output, time lines etc. Interactivity: zooming into the display, clicking shows detailed information. Integration into GridSphere.

  18. Development of an expert system to classify job failures and automatically treat the most common errors. • Motivation: thousands of jobs per day in the LHC Computing Grid (LCG); the job status at run time is hidden from the user; manual error tracking is difficult and can take long; current monitoring is more resource- than user-oriented (GridICE, …). • Therefore: monitoring on the script level (JEM) and automation through an expert system are necessary.

  19. JEM: Job Execution Monitor • Runs on the gLite/LCG worker node • Pre-execution test • Supervision of Bash and Python commands • Status reports via R-GMA • Visualisation via GridSphere • Expert system for error classification • Integration into the ATLAS software environment • Integration into GGUS? • Post D-Grid I: automatic error correction, ...
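
  The expert-system idea can be illustrated with a small rule-based classifier over job log lines. The patterns and error classes below are invented for illustration and do not reflect JEM's actual rule base (JEM itself works at the Bash/Python script level and reports via R-GMA).

  #include <iostream>
  #include <regex>
  #include <string>
  #include <vector>

  // One classification rule: if the pattern matches a log line,
  // the failure is assigned to the given class.
  struct Rule {
      std::regex pattern;
      std::string errorClass;
  };

  // Invented example rules -- a real expert system would load these
  // from a maintained knowledge base.
  const std::vector<Rule> rules = {
      {std::regex("No space left on device"),           "site: worker node disk full"},
      {std::regex("Proxy.*expired", std::regex::icase), "user: Grid proxy expired"},
      {std::regex("command not found"),                 "job: missing software setup"},
  };

  std::string classify(const std::vector<std::string>& logLines)
  {
      for (const auto& line : logLines)
          for (const auto& rule : rules)
              if (std::regex_search(line, rule.pattern))
                  return rule.errorClass;
      return "unclassified: forward to user support";
  }

  int main()
  {
      std::vector<std::string> log = {
          "setting up ATLAS release",
          "/bin/sh: athena.py: command not found",
      };
      std::cout << classify(log) << "\n";   // -> "job: missing software setup"
  }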

  20. HEP CG WP3: Distributed Interactive Data Analysis • Coordination: P. Malzacher, GSI • Partners: LMU, GSI; unfunded: LRZ, MPI f. Physik München, RZ Garching, Uni Karlsruhe, MPI Heidelberg • Optimize application-specific job scheduling • Analyze and test the required software environment • Job management and bookkeeping of distributed analysis • Distribution of the analysis and summing up of the results • Interactive analysis: creation of a dedicated analysis cluster, dynamic partitioning of Grid analysis clusters

  21. LMU: Investigating job-scheduler requirements for distributed and interactive analysis. The GANGA (ATLAS/LHCb) project shows good features for this task and is used for MC production, reconstruction and analysis on LCG; start with a gap analysis. GSI: Analysis based on PROOF; investigating different versions of PROOF clusters. Connect ROOT and gLite through a TGlite implementation of ROOT's abstract TGrid interface:
  class TGrid : public TObject {
  public:
     …
     virtual TGridResult *Query ( …
     static  TGrid       *Connect (const char *grid, const char *uid = 0, const char *pw = 0 …
     ClassDef(TGrid,0)
  };
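
  For orientation, a ROOT macro using this interface might look roughly as follows. The "glite://" connection string assumes the TGlite plugin is installed (ROOT ships the same pattern for AliEn via "alien://"), and the catalogue path and file pattern are placeholders.

  // query_grid.C -- hypothetical sketch of using ROOT's TGrid interface
  void query_grid()
  {
     // Connect() loads the plugin matching the URL scheme and returns
     // the concrete TGrid implementation.
     TGrid *grid = TGrid::Connect("glite://");
     if (!grid) {
        Printf("no Grid connection");
        return;
     }
     // Query the file catalogue for ROOT files below a (made-up) directory.
     TGridResult *res = grid->Query("/grid/hepcg/user/data", "*.root");
     if (res) {
        Printf("query returned %d entries", res->GetEntries());
        res->Print();
     }
  }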

  22. GANGA, job-split approach. [Diagram: the analysis manager queries the catalogue, splits the query over the data files in storage, submits jobs running myAna.C to the queues, and merges the job outputs into the final analysis.] • "Static" use of resources • Jobs frozen: 1 job per worker node • Splitting at the beginning, merging at the end • Limited monitoring (only at the end of each single job)

  23. The PROOF approach. [Diagram: the PROOF query (data file list plus myAna.C) is sent to the MASTER, which schedules the work on the farm, returns feedback in real time and delivers the merged final outputs; catalogue and storage provide the data files.] • The farm is perceived as an extension of the local PC • Same macro and syntax as in a local session • More dynamic use of resources • Real-time feedback • Automated splitting and merging
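
  To make "same macro, same syntax as in a local session" concrete, a typical PROOF session looks roughly like the sketch below; the master URL, file names and the myAna selector are placeholders.

  // run_proof.C -- hypothetical sketch of a PROOF analysis session
  void run_proof()
  {
     // Open a connection to the PROOF master.
     TProof *p = TProof::Open("proofmaster.example.de");
     if (!p) return;

     // Build the same TChain one would use in a local ROOT session.
     TChain *chain = new TChain("esdTree");
     chain->Add("root://se.example.de//data/run1.root");
     chain->Add("root://se.example.de//data/run2.root");

     // Attach the chain to PROOF: Process() is now distributed over the
     // workers, with automatic splitting, merging and real-time feedback.
     chain->SetProof();
     chain->Process("myAna.C+");   // myAna is a TSelector-based analysis macro
  }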

  24. Summary: • Rather late compared to other national Grid initiatives, a German e-science program is well under way. It is built on top of three different middleware flavors: UNICORE, Globus 4 and gLite. • The HEP-CG production environment is based on LCG / EGEE software. • The HEP-CG focuses on gaps in three work packages: data management, automated user support and interactive analysis. • Challenges for HEP: very heterogeneous disciplines and stakeholders; LCG/EGEE is not the basis for many other partners. • More information: I showed only a few highlights; for more information see http://www.d-grid.de
