Cracow Grid Workshop, October 2006 D-Grid in International Context Wolfgang Gentzsch with support from Tony Hey et al, Satoshi Matsuoka, Kazushige Saga, Hai Jin, Bob Jones, Charlie Catlett, Dane Skow, and the Renaissance Computing Institute at UNC Chapel Hill, North Carolina.


Presentation Transcript


  1. Cracow Grid Workshop, October 2006 D-Grid in International Context Wolfgang Gentzsch with support from Tony Hey et al, Satoshi Matsuoka, Kazushige Saga, Hai Jin, Bob Jones, Charlie Catlett, Dane Skow, and the Renaissance Computing Institute at UNC Chapel Hill, North Carolina

  2. Grid Initiatives

     Initiative       Time         Funding      People *)    Users
     UK e-Science-I   2001 - 2004  $180M          900        Res.
     UK e-Science-II  2004 - 2006  $220M         1100        Res., Ind.
     TeraGrid-I       2001 - 2004  $90M           500        Res.
     TeraGrid-II      2005 - 2010  $150M          850        Res.
     ChinaGrid-I      2003 - 2006  20M RMB        400        Res.
     ChinaGrid-II     2007 - 2010  50M RMB *)    1000        Res.
     NAREGI-I         2003 - 2005  $25M           150        Res.
     NAREGI-II        2006 - 2010  $40M *)        250 *)     Res., Ind.
     EGEE-I           2004 - 2006  $40M           800        Res.
     EGEE-II          2006 - 2008  $45M          1000        Res., Ind.
     D-Grid-I         2005 - 2008  $25M           220        Res.
     D-Grid-II        2007 - 2009  $25M           220 (=440) Res., Ind.
     *) estimate

  3. Main Objectives of the Grid Projects
     • UK e-Science: Enable the next generation of multi-disciplinary collaborative science and engineering, allowing faster, better or different research.
     • EGEE: Provide a seamless grid infrastructure for e-Science that is available to scientists 24 hours a day.
     • ChinaGrid: Provide a research and education platform, based on grid technology, for faculty and students at the major universities in China.
     • NAREGI: Research, development and deployment of science grid middleware.
     • TeraGrid: Create a unified cyberinfrastructure supporting a broad array of US science activities using the suite of NSF HPC facilities.
     • D-Grid: Build and operate a sustainable grid service infrastructure for German research (D-Grid 1) and for research and industry (D-Grid 2).

  4. Community Grids are all about:
     • Sharing resources:
       - Small, medium and large enterprises share networks, computers, storage, software, data, . . .
       - Researchers share the same, plus large experiments, instruments, sensor networks, etc.
     • Collaboration:
       - Enterprise departments with their suppliers and peers (e.g. design)
       - Research teams distributed around the world (HEP, astronomy, climate)
     • Doing things which have not been possible before:
       - Grand Challenges needing huge amounts of computing and data
       - Combining distributed datasets into one virtual data pool (genome)
       - "Mass grids" for the people (distributed digital libraries, digital school laboratories, etc.)

  5. UK e-Science Grid (application independent). Sites: Edinburgh, Glasgow, DL, Newcastle, Belfast, Manchester, Cambridge, Oxford, Hinxton, RAL, Cardiff, London, Southampton.

  6. TeraGrid: A National Production CI Facility
     Phase I (2001 - 2004): Design, Deploy, Expand ($90M over 4 years).
     Phase II (2005 - 2010): Operation & Enhancement ($150M over 5 years beginning August 2005).
     Sites: UW, NCAR, UC/ANL, PSC, PU, NCSA, IU, Caltech, ORNL, UNC, USC/ISI, SDSC, TACC.
     20+ distinct computing resources; 150 TF today, 400 TF by 2007.

  7. ChinaGrid (till now)

  8. EGEE Partner Landscape http://gridportal.hep.ph.ic.ac.uk/rtm/applet.html

  9. GOC and German Core Grid sites: RRZN, PC², TUD, FZJ, RWTH, FHG/ITWM, Uni-KA, FZK, RZG, LRZ.

  10. The German D-Grid Initiative *) D-Grid-1 Services for Scientists *) funded by the German Ministry for Education and Research

  11. German e-Science Initiative, Key Objectives
     • Building a grid infrastructure in Germany
     • Combine the existing German grid activities for infrastructure, middleware, and applications
     • Integration of the middleware components developed in the Community Grids
     • Development of e-Science services for the research community
     • Science Service Grid
     Important:
     • Continuing a sustainable production grid infrastructure after the end of the funding period
     • Integration of new grid communities (2nd generation)
     • Business models for grid services

  12. D-Grid Projects
     • Community Grids: Astro-Grid, C3-Grid, HEP-Grid, IN-Grid, MediGrid, TextGrid
     • Knowledge Management: WIKINGER, ONTOVERSE, WISENT, Im Wissensnetz, . . .
     • Integration Project: Generic Grid Middleware and Grid Services
     • eSciDoc, VIOLA

  13. D-Grid Structure (diagram, courtesy Dr. Krahl, PT/BMBF): Community Grids, each with grid-specific developments, applications and community-grid middleware; the Integration Project providing generic grid middleware and grid services; and an Information and Knowledge Management layer.

  14. DGI Infrastructure Project
     WP 1: D-Grid basic software components, sharing resources, large storage, data interfaces, virtual organizations, management
     WP 2: Develop, operate and support a robust core grid infrastructure; resource description, monitoring, accounting, and billing
     WP 3: Network (transport protocols, VPN), security (AAI, CAs, firewalls)
     WP 4: Business platform and sustainability, project management, communication and coordination
     • Scalable, extensible, generic grid platform for the future
     • Long-term, sustainable, SLA-based grid operation

  15. D-Grid Middleware (layered view; a job-submission sketch follows below)
     • User layer: application development and user access (GAT API, GridSphere, plug-ins, UNICORE user client)
     • High-level grid services: scheduling, workflow management, monitoring, LCG/gLite data management
     • Basic grid services: accounting/billing, user/VO management, Globus 4.0.1, security
     • Resources in D-Grid: distributed compute resources, network infrastructure, distributed data archives, data/software
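
The slide above lists Globus 4.0.1 as the basic job-submission and security layer. As a minimal sketch, assuming a hypothetical ManagedJobFactoryService endpoint, the snippet below wraps the GT4 command-line client globusrun-ws from Python; a real D-Grid site would publish its own factory address, and UNICORE or the GridSphere portal are the other access paths shown on the slide.

import subprocess

# Hypothetical GT4 WS-GRAM factory endpoint; replace with a real site's address.
FACTORY = "https://grid-node.example.org:8443/wsrf/services/ManagedJobFactoryService"

def submit_test_job(command="/bin/hostname"):
    """Submit a simple test command via the GT4 client and return its output."""
    result = subprocess.run(
        ["globusrun-ws", "-submit", "-F", FACTORY, "-c", command],
        capture_output=True, text=True, check=False,
    )
    return result.returncode, result.stdout, result.stderr

if __name__ == "__main__":
    rc, out, err = submit_test_job()
    print("exit code:", rc)
    print(out or err)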

  16. DGI Services, Available December 2006
     • Sustainable grid operation environment with a set of core D-Grid middleware services for all grid communities
     • Central registration and information management for all resources
     • Packaged middleware components for gLite, Globus and UNICORE, and for the data management systems SRB, dCache and OGSA-DAI
     • D-Grid support infrastructure for new communities, with installation and integration of new grid resources into D-Grid
     • Help desk, monitoring system and central information portal

  17. DGI Services, December 2006, continued
     • Tools for managing VOs based on VOMS and Shibboleth
     • Test implementation of monitoring & accounting for grid resources, and a first concept for a billing system (see the sketch below)
     • Network and security support for communities (firewalls in grids, alternative network protocols, ...)
     • DGI operates "Registration Authorities" with internationally accepted grid certificates from DFN and GridKa Karlsruhe
     • Partners support new D-Grid members in building their own "Registration Authorities"
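
As a purely conceptual sketch of the accounting and billing idea mentioned above (not the DGI design): a usage record per job plus a simple pricing rule is enough to illustrate "who pays for what". The field names and per-unit rates below are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class UsageRecord:
    user_dn: str            # distinguished name from the grid certificate
    vo: str                 # virtual organisation the job ran under
    site: str               # resource provider that executed the job
    cpu_hours: float
    storage_gb_days: float

def charge(rec: UsageRecord, cpu_rate=0.05, storage_rate=0.01) -> float:
    """Toy billing rule: flat price per CPU-hour and per GB-day of storage (assumed rates)."""
    return rec.cpu_hours * cpu_rate + rec.storage_gb_days * storage_rate

records = [
    UsageRecord("/O=GermanGrid/CN=Alice", "astro", "FZK", 120.0, 40.0),
    UsageRecord("/O=GermanGrid/CN=Bob",   "hep",   "FZJ", 800.0, 300.0),
]
for r in records:
    print(f"{r.vo:6s} at {r.site}: {charge(r):7.2f} EUR")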

  18. DGI Services, December 2006, continued
     • DGI will offer resources to other communities, with access via gLite, Globus Toolkit 4, and UNICORE
     • The portal framework GridSphere can be used by future users as a graphical user interface
     • For administration and management of large scientific datasets, DGI will offer dCache for testing
     • New users can use the D-Grid resources of the core grid infrastructure upon request

  19. AstroGrid

  20. C3 Grid: Collaborative Climate Community Data and Processing Grid
     Climate research is moving towards new levels of complexity: stepping from climate (= atmosphere + ocean) to Earth system modelling.
     Earth system model wishlist:
     • Higher spatial and temporal resolution
     • Quality: improved subsystem models
     • Atmospheric chemistry (ozone, sulfates, ..)
     • Bio-geochemistry (carbon cycle, ecosystem dynamics, ..)
     Resulting increase in computational demand: a factor of O(1000 - 10000); a rough decomposition of that factor is sketched below.
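
To make the O(1000 - 10000) figure concrete, here is a back-of-envelope decomposition; the individual factors are illustrative assumptions, not C3-Grid numbers.

# Illustrative breakdown of how the compute-demand factor can multiply up.
# Halving the horizontal grid spacing gives ~4x more columns and, via the
# shorter stable time step, roughly another factor of 2; extra earth-system
# components and improved subsystem models multiply on top.
horizontal_refinement = 4    # 2x finer in each horizontal direction (assumed)
vertical_refinement   = 2    # more vertical levels (assumed)
time_step_refinement  = 2    # shorter time step forced by the finer grid (assumed)
extra_components      = 5    # atmospheric chemistry, bio-geochemistry, ... (assumed)
model_quality         = 16   # improved subsystem models, more tracers (assumed)

factor = (horizontal_refinement * vertical_refinement *
          time_step_refinement * extra_components * model_quality)
print(f"combined demand factor ~ {factor}")   # 1280, i.e. within O(1000 - 10000)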

  21. HEP-Grid: p-p collisions at the LHC at CERN (from 2007 on)
     • Luminosity: low 2x10^33 cm^-2 s^-1, high 10^34 cm^-2 s^-1
     • Crossing rate: 40 MHz; event rates: ~10^9 Hz
     • Max LV1 trigger rate: 100 kHz; trigger levels: 2
     • Event size: ~1 MByte; readout network: 1 Terabit/s
     • Filter farm: ~10^7 SI2K; system dead time: ~ %
     • Online rejection: 99.9997% (100 Hz from 50 MHz); rate to tape: ~100 Hz
     • Data analysis: ~1 PB/year; event selection: ~1/10^13 "discovery" rate
     (Courtesy David Stickland; the data-volume arithmetic is checked in the sketch below.)
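
A quick arithmetic check of the ~1 PB/year figure above; the ~1e7 seconds of effective data taking per year is a common rule of thumb assumed here, not a number from the slide.

rate_to_tape_hz  = 100      # events written to tape per second
event_size_bytes = 1e6      # ~1 MByte per event
seconds_per_year = 1e7      # effective data-taking time per year (assumed)

events_per_year = rate_to_tape_hz * seconds_per_year
volume_pb = events_per_year * event_size_bytes / 1e15
print(f"events on tape ~ {events_per_year:.0e} per year")
print(f"raw data volume ~ {volume_pb:.1f} PB/year")   # matches the ~1 PB/year on the slide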

  22. InGrid: Virtual Prototyping & Modelling in Industry
     Application areas: fluid-structure / magneto-hydrodynamic interaction, molding, metal forming, fluid processes, groundwater transport.
     Grid-specific developments (with the Integration Project, work packages AP 2 - AP 4):
     • Methods and models for solving engineering problems in grids
     • Knowledge-based support for engineering-specific decision support
     • Distributed simulation-based product and process optimization
     • Support for engineering-specific workflows
     • Security and trust models
     • Cooperation and business models

  23. MediGrid: Mapping of Characteristics, Features, Raw Data, etc.
     Data levels, each with metadata: molecule, cell, organ/tissue, patient, population, illness.
     Processing chain: access control -> search, find, select -> homogenization -> target data -> correlate, process, analyze -> resulting data -> presentation -> final result. (A toy version of this chain is sketched below.)
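
A toy version of that chain, purely to illustrate the order of the stages; the function names and the list-of-dicts data model are assumptions, not MediGrid's actual interfaces.

def access_control(records, user):
    """Keep only records the user is allowed to see."""
    return [r for r in records if user in r.get("allowed_users", [])]

def search_and_select(records, keyword):
    """Search, find, select by a metadata keyword."""
    return [r for r in records if keyword in r.get("metadata", "")]

def homogenize(records):
    """Map heterogeneous raw data (molecule, cell, organ, patient, ...) onto one target schema."""
    return [{"id": r["id"], "value": r["raw"]} for r in records]

def analyze(target_data):
    """Correlate, process, analyze: here just a count and a mean."""
    n = len(target_data)
    return {"n": n, "mean": sum(d["value"] for d in target_data) / max(n, 1)}

raw = [
    {"id": 1, "raw": 0.7, "metadata": "tissue sample", "allowed_users": ["alice"]},
    {"id": 2, "raw": 0.9, "metadata": "tissue sample", "allowed_users": ["alice", "bob"]},
]
result = analyze(homogenize(search_and_select(access_control(raw, "alice"), "tissue")))
print(result)   # final result after the full chain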

  24. D-Grid-2 Call (review of proposals: Sept 19)
     • 'Horizontal' Service Grids: professional service providers for heterogeneous user groups in research and industry
     • 'Vertical' Community Service Grids using the existing D-Grid infrastructure and services, supported by service providers
     • D-Grid extensions, based on a D-Grid 1 gap analysis:
       - Tools for operating a professional grid service
       - Adding a business layer on top of the D-Grid infrastructure
       - Pilot service phase with service providers and 'customers'
     !! Reliable grid services require a sustainable grid infrastructure !!

  25. Global Grid Community

  26. Grid Middleware Stack, major modules
     • UK e-Science: Phase 1: Globus 2.4.3, Condor, SRB. Phase 2: Globus 3.9.5 and 4.0.1, OGSA-DAI, web services.
     • EGEE: gLite distribution: elements of Condor, Globus 2.4.3 (via the VDT distribution).
     • ChinaGrid: ChinaGrid Supporting Platform (CGSP) 1.0 is based on Globus 3.9.1; CGSP 2.0 is implemented on Globus 4.0.
     • NAREGI: NAREGI middleware and Globus 4.0.1 GSI and WS-GRAM.
     • TeraGrid: GT 2.4 and 4.0.1: Globus GRAM, MDS for information, GridFTP & TGCP file transfer, RLS for data replication support, MyProxy for credential management.
     • D-Grid: Globus 2.4.3 (in gLite) and 4.0.2, UNICORE 5, dCache, SRB, OGSA-DAI, GridSphere, GAT, VOMS and Shibboleth.

  27. The Architecture of Science Gateway Services (courtesy Jay Boisseau)
     • The user's desktop and a grid portal server sit on top of TeraGrid gateway services: proxy certificate server/vault, user metadata catalog, application workflow, application deployment, application events, resource broker, application/resource catalogs, replica management.
     • Core grid services underneath: security, notification service, resource allocation, grid orchestration, data management service, accounting service, policy, reservations and scheduling, administration & monitoring.
     • Built on the Web Services Resource Framework and Web Services Notification, over the physical resource layer.

  28. CGSP Architecture: ChinaGrid Supporting Platform, a grid middleware for ChinaGrid

  29. ImageGrid on ChinaGrid
     • ImageGrid applications: digital virtual man, remote-sensing image processing, medical image diagnosis
     • ImageGrid application middleware: remote visualization, grid monitor, user management, application solving environment, service management, security
     • ChinaGrid middleware (CGSP): grid security service, grid resource management, grid information service, grid data management
     • Grid resources

  30. NAREGI Software Stack (beta 1, 2006): a WS(RF)-based (OGSA) software stack
     • Grid-enabled nano-applications (WP6)
     • Grid PSE, grid visualization, grid workflow (WFML: Unicore + WF)
     • Grid programming (WP2): Grid RPC, GridMPI
     • Super scheduler, distributed information service (CIM), data (WP4)
     • Grid VM (WP1); packaging: WSRF (GT4 + Fujitsu WP1) + GT4 and other services
     • Grid security and high-performance grid networking (WP5)
     • Underlying infrastructure: SuperSINET, NII, research organizations, IMS, major university computing centers; computing resources and virtual organizations

  31. gLite Grid Middleware Services
     • Access: CLI, API
     • Security: authentication, authorization, auditing
     • Information & monitoring: information & monitoring, application monitoring
     • Data management: metadata catalog, file & replica catalog, storage element, data movement
     • Workload management: job provenance, package manager, accounting, computing element, workload management
     • Site proxy
     Overview paper: http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf

  32. D-Grid Middleware (same layered stack as slide 15)
     • User layer: application development and user access (GAT API, GridSphere, plug-ins, UNICORE user client)
     • High-level grid services: scheduling, workflow management, monitoring, LCG/gLite data management
     • Basic grid services: accounting/billing, user/VO management, Globus 4.0.1, security
     • Resources in D-Grid: distributed compute resources, network infrastructure, distributed data archives, data/software

  33. Major Challenges with Implementing Globus
     • UK e-Science, EGEE: GT 2.4 was not a robust product. In the early days it took months to install, plus numerous workarounds by EDG, LCG and the Condor team.
     • UK e-Science: The move from GT 2.4 to OGSA-based GT 3 to WS-based GT 4 during many of the UK grid projects was a disruption.
     • TeraGrid: GT is a large suite of modules, most of which need to be specially built for HPC environments. The tooling on which it is based is largely unfamiliar to system administrators and requires a training/familiarization process.
     • D-Grid: The code is very complex and difficult to install on the many different systems in a heterogeneous grid environment.

  34. Challenges
     • Scale: What works for 4 sites and identical machines is difficult to scale to 10+ sites and 20+ machines with many architectures.
     • Sociology: Requires a high level of buy-in from autonomous sites (to run software or adopt conventions not invented here...).
     • Interoperation (e.g. with other grids): Requires adoption of a common software stack (see Sociology).

  35. Main Applications
     • UK e-Science: Particle physics, astronomy, chemistry, bioinformatics, healthcare, engineering, environment, and the pharmaceutical, petro-chemical, media and financial sectors
     • EGEE: 2 pilot applications (physics, life science) and applications from 7 other disciplines
     • ChinaGrid: Bioinformatics, image processing, computational fluid dynamics, remote education, and massive data processing
     • NAREGI: Nano-science applications
     • TeraGrid: Physics (lattice QCD calculations, turbulence simulations, stellar models), molecular bioscience (molecular dynamics), chemistry, atmospheric sciences
     • D-Grid 1: Astrophysics, high-energy physics, earth science, medicine, engineering, libraries

  36. Efforts for Sustainability
     • UK e-Science: National Grid Service (NGS), Grid Operations Support Center (GOSC), National e-Science Center (NeSC), regional e-Science centers, Open Middleware Infrastructure Institute (OMII), Digital Curation Center (DCC)
     • EGEE: Plans to establish a European Grid Initiative (EGI) to provide a persistent grid service federating national grid programmes, starting in 2008
     • ChinaGrid: Increasing numbers of grid applications using CGSP grid middleware packages
     • NAREGI: Software will be managed and maintained by the Cyber Science Infrastructure Center of the National Institute of Informatics
     • TeraGrid: NSF Cyberinfrastructure Office, 5-year cooperative agreement; partnerships with peer grid efforts and commercial web services activities in order to integrate broadly
     • D-Grid: DGI WP 4: sustainability, service strategies, and business models

  37. The Open Middleware Infrastructure Institute (OMII)
     OMII is based at the University of Southampton, School of Electronics & Computer Science. Vision: to become the source for reliable, interoperable and open-source grid middleware, ensuring the continued success of grid-enabled e-Science in the UK. OMII intends to:
     • Create a one-stop portal and software repository for open-source grid middleware, including comprehensive information about its function, reliability and usability;
     • Provide quality-assured software engineering, testing, packaging and maintenance of software in the OMII repository, ensuring it is reliable and easy to both install and use;
     • Lead the evolution of grid middleware at the international level, through a managed program of research and wide-reaching collaboration with industry.

  38. The Digital Curation Center (DCC)
     The DCC is based at the University of Edinburgh. It supports UK institutions with the problems involved in storing, managing and preserving vast amounts of digital data to ensure their enhancement and continuing long-term use. The purpose of the DCC is to provide a national focus for research into curation issues and to promote expertise and good practice, both nationally and internationally, for the management of all research outputs in digital format.

  39. National Grid Service Interfaces OGSI::Lite

  40. TeraGrid Next Steps: Services-Based
     • Core services define a "TeraGrid Resource":
       - Authentication & authorization capability
       - Information service
       - Auditing/accounting/usage reporting capability
       - Verification & validation mechanism
     • This provides a foundation for value-added services.
     • Each resource runs one or more added services, or "kits":
       - Enables a smaller set of components than the previous "full" CTSS
       - Advanced capabilities, exploiting architectures or common software
       - Allows portals (science gateways) to customize service offerings
     • Core and individual kits can evolve incrementally, in parallel. (A toy model of the core-plus-kits idea is sketched below.)
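
A toy model of the core-plus-kits idea: a resource qualifies as a "TeraGrid Resource" only if it runs every core service, and its kits are whatever it offers beyond that. The service and kit names are illustrative assumptions, not the real CTSS component list.

# Conceptual sketch only: core services are mandatory, kits are optional add-ons.
CORE_SERVICES = {"authn-authz", "information-service", "accounting", "verification"}

resources = {
    "bigiron.example.edu": {"services": CORE_SERVICES | {"wide-area-filesystem-kit"}},
    "cluster.example.edu": {"services": CORE_SERVICES | {"visualization-kit", "science-gateway-kit"}},
    "testbox.example.edu": {"services": {"authn-authz", "information-service"}},  # incomplete core
}

def is_teragrid_resource(name):
    """A resource qualifies only if it runs every core service."""
    return CORE_SERVICES <= resources[name]["services"]

for name in resources:
    kits = resources[name]["services"] - CORE_SERVICES
    status = "ok" if is_teragrid_resource(name) else "MISSING core services"
    print(f"{name}: core={status}, kits={sorted(kits)}")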

  41. TeraGrid Science Gateways Initiative: Community Interface to Grids
     • Common web portal or application interfaces (database access, computation, workflow, etc.) and standards (primarily web services) across TeraGrid, Grid-X, Grid-Y
     • "Back-end" use of grid services such as computation, information management, visualization, etc.
     • Standard approaches so that science gateways may readily access resources in any cooperating grid without technical modification

  42. TeraGrid Science Gateway Partner Sites
     21 Science Gateway partners (and growing); over 100 partner institutions. Contact: Nancy Wilkins-Diehr (wilkinsn@sdsc.edu)

  43. Grid Interoperation Now
     • Multi-grid effort (20+ projects world-wide)
     • Interoperation vs. interoperability:
       - Interoperability: "the ability of software and hardware on multiple machines from multiple vendors to communicate", based on commonly agreed, documented specifications and procedures
       - Interoperation (for the sake of users!): "just make it work together"; opportunistic, exploit common software, etc.; low-hanging fruit now, interoperability later
     • Principles:
       - "The perfect is the enemy of the good enough" (Voltaire, based on an old Italian proverb)
       - Focus on security at every step (initial work aimed at auth*)

  44. Layered Infrastructure of ChinaGrid
     Application grids (remote education grid, fluid dynamics grid, image processing grid, massive information processing grid, bioinformatics grid) run on the ChinaGrid Supporting Platform (CGSP), which in turn runs on the high-performance computing environments (campus grids) of the participating universities: SJTU, PKU, XJTU, NEU, ZSU, SCUT, HUST, BUAA, NUDT, SEU, THU, SDU.

  45. NAREGI R&D Assumptions and Goals
     • Future research grid metrics for petascale: 10s of institutions/centers, various project VOs; > 100,000 users, > 100,000 - 1,000,000 CPUs; very heterogeneous machines (supercomputers, clusters, desktops), OSes and local schedulers; 24/7 usage, production deployment; server grid, data grid, metacomputing ...
     • High emphasis on standards: start with Globus, Unicore, Condor; extensive collaboration; GGF contributions, especially an OGSA reference implementation
     • Win support of users: application and experimental deployment essential; R&D for production-quality (free) software; nano-science (and now bio) involvement; large testbed

  46. List of NAREGI "Standards" (beta 1 and beyond)
     • GGF standards and pseudo-standard activities set/employed by NAREGI: "OGSA CIM profile", AuthZ, DAIS, GFS (Grid Filesystems), Grid CP (GGF CAOPs), GridFTP, GridRPC API (as Ninf-G2/G4), JSDL, OGSA-BES, OGSA-Byte-IO, OGSA-DAI, OGSA-EMS, OGSA-RSS, RUS, SRM (planned for beta 2), UR, WS-I RUS, ACS, CDDLM
     • Other industry standards employed by NAREGI: ANSI/ISO SQL, DMTF CIM, IETF OCSP/XKMS, MPI 2.0, OASIS SAML 2.0, OASIS WS-Agreement, OASIS WS-BPEL, OASIS WSRF 2.0, OASIS XACML
     • De facto standards / commonly used software platforms employed by NAREGI: Ganglia, GFarm 1.1, Globus 4 GRAM, Globus 4 GSI, Globus 4 WSRF (also Fujitsu WSRF for C binding), IMPI (as GridMPI), Linux (RH8/9 etc.), Solaris (8/9/10), AIX, ..., MyProxy, OpenMPI, Tomcat (and associated WS/XML standards), Unicore WF (as NAREGI WFML), VOMS
     (A minimal JSDL example follows below.)
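
JSDL appears in the GGF list above. As a minimal sketch, the snippet below builds a tiny JSDL 1.0 job description with the Python standard library; it is a generic JSDL example under the published 1.0 namespaces, not necessarily the exact document NAREGI's super scheduler exchanges.

import xml.etree.ElementTree as ET

# JSDL 1.0 core and POSIX application extension namespaces.
JSDL  = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)

# JobDefinition -> JobDescription -> Application -> POSIXApplication
job   = ET.Element(f"{{{JSDL}}}JobDefinition")
desc  = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
app   = ET.SubElement(desc, f"{{{JSDL}}}Application")
posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
ET.SubElement(posix, f"{{{POSIX}}}Executable").text = "/bin/hostname"
ET.SubElement(posix, f"{{{POSIX}}}Output").text = "hostname.out"

print(ET.tostring(job, encoding="unicode"))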

  47. Sustainability: Beyond EGEE-II
     • Need to prepare for a permanent grid infrastructure
     • Maintain Europe's leading position in global science grids
     • Ensure reliable and adaptive support for all sciences
     • Independent of short project funding cycles
     • Modelled on the success of GÉANT
     • Infrastructure managed in collaboration with national grid initiatives

  48. European National Grid Projects
     Austria - AustrianGrid; Belgium - BEGrid; Bulgaria - BgGrid; Croatia - CRO-GRID; Cyprus - CyGrid; Czech Republic - METACentre; Denmark - ?; Estonia - Estonian Grid; Finland; France - planned (ICAR); Germany - D-GRID; Greece - HellasGrid; Hungary; Ireland - Grid-Ireland; Israel - Israel Academic Grid; Italy - planned; Latvia - Latvian Grid; Lithuania - LitGrid; Netherlands - DutchGrid; Norway - NorGrid; Poland - Pioneer; Portugal - launched April '06; Romania - RoGrid; Serbia - AEGIS; Slovakia; Slovenia - SiGNET; Spain - planned; Sweden - SweGrid; Switzerland - SwissGrid; Turkey - TR-Grid; Ukraine - UGrid; United Kingdom - eScience

  49. D-Grid: Towards a Sustainable Infrastructure for Science and Industry
     • Government is changing policies for resource acquisition (HBFG!) to enable a service model
     • 2nd call: focus on service provisioning for sciences & industry
     • Strong collaboration with the Globus Project, EGEE, DEISA, CrossGrid, CoreGrid, GridCoord, GRIP, UniGrids, NextGrid, ...
     • Application- and user-driven, not infrastructure-driven
     • Focus on implementation and production, not grid research, in a multi-technology environment (Globus, Unicore, gLite, etc.)
     • D-Grid is the core of the German e-Science Initiative

  50. Summary: Challenges for Research and Industry
     • Sensitive data, sensitive applications (medical patient records)
     • Different organizations get different benefits
     • Accounting: who pays for what (sharing!)
     • Security policies: consistent and enforced across the grid!
     • Lack of standards prevents interoperability of components
     • The current IT culture is not predisposed to sharing resources
     • Not all applications are grid-ready or grid-enabled
     • Open source is not always equal to open source (read the small print)
     • SLAs based on open source (liability?)
     • "Static" licensing models don't embrace the grid
     • Protection of intellectual property
     • Legal issues (FDA, HIPAA, multi-country grids)
