
Presentation Transcript


  1. Overview of the GLUE Project (Grid Laboratory Unified Environment) Author: Piotr Nowakowski, M.Sc., Cyfronet, Kraków

  2. Presentation Summary • Goals of GLUE • Key GLUE contributors • GLUE schema • GLUE activities • Unresolved issues

  3. Goals of GLUE • Promote coordination between European and US Grid projects • Define, construct, test and deliver interoperable middleware to all Grid Projects • Experiment with intercontinental Grid deployment and operational issues • Establish procedures and policies regarding interoperability • Once the GLUE collaboration establishes the necessary, minimum requirements for interoperability of middleware, any future software designed by the projects covered by the umbrella of the HICB and JTB must maintain the achieved interoperability.

  4. GLUE Organizationally • Management by iVDGL and DataTAG • Guidance and oversight by the High Energy Physics Intergrid Coordination Board (HICB) and Joint Technical Board (JTB) • Participating organizations (19 entities in all): • Grid Projects (EDG, GriPhyN, CrossGrid etc.) • LHC experiments (Atlas, CMS etc.)

  5. HENP Collaboration • The HENP (High-Energy Nuclear Physics) Grid R&D projects (initially DataGrid, GriPhyN, and PPDG, as well as the national European Grid projects in the UK, Italy, the Netherlands and France) have agreed to coordinate their efforts to design, develop and deploy a consistent, open-source, standards-based global Grid infrastructure. • To that effect, their common efforts are organized in three major areas: • A HENP InterGrid Coordination Board (HICB) for high-level coordination • A Joint Technical Board (JTB) • Common Projects and Task Forces to address needs in specific technical areas

  6. The DataTAG Project Aim: Creation of an intercontinental Grid testbed using DataGrid (EDG) and GriPhyN components. Work packages: WP1: Establishment of an intercontinental testbed infrastructure WP2: High performance networking WP3: Bulk data transfer validations and performance monitoring WP4: Interoperability between Grid domains WP5: Information dissemination/exploitation WP6: Project management

  7. DataTAG WP4 • Aims: • To produce an assessment of interoperability solutions, • To provide a test environment for LHC applications to extend existing use cases to test interoperability of Grid components, • To provide input to a common Grid LHC architecture, • To plan EU-US integrated Grid deployment. WP4 Tasks: T4.1: Develop an intergrid resource discovery schema, T4.2: Develop intergrid Authentication, Authorization and Accounting (AAA) mechanisms, T4.3: Plan and deploy an „intergrid VO” in collaboration with iVDGL.

  8. DataTAG WP4 Framework and Relationships

  9. The iVDGL Project (International Virtual Data Grid Laboratory) Aim: To provide high-performance global computing infrastructure for flagship experiments in physics and astronomy (ATLAS, LIGO, SDSS etc.) • iVDGL activities: • Establishing supercomputing sites throughout the U.S. and Europe and linking them with a multi-gigabit transatlantic link • Establishing a Grid Operations Center (GOC) in Indiana • Maintaining close cooperation with partner projects in the EU and the GriPhyN project.

  10. U.S. iVDGL Network • Selected participants: • Fermilab • Brookhaven National Laboratory • Argonne National Laboratory • Stanford Linear Accelerator Center (SLAC) • University of Florida • University of Chicago • California Institute of Technology • Boston University • University of Wisconsin • Indiana University • Johns Hopkins University • Northwestern University • University of Texas • Pennsylvania State University • Hampton University • Salish Kootenai College

  11. iVDGL Organization Plan • Project Steering Group – advises iVDGL directors on important project decisions and issues. • Project Coordination Group – provides a forum for short-term planning and tracking of project activities and schedules. The PCG includes representatives of related Grid projects, particularly EDT/EDG. • Facilities Team – identification of testbed sites, hardware procurement • Core Software Team – definition of software suites and toolkits (Globus, VDT, operating systems etc.) • Operations Team – performance monitoring, networking, coordination, security etc. • Applications Team – planning the deployment of applications and the related requirements • Outreach Team – Website maintenance, planning conferences, publishing research materials etc. Note: The GLUE effort is coordinated by the Interoperability Team (aka the GLUE Team)

  12. The GriPhyN Project • Aims: • To provide the necessary IT solutions for petabyte-scale data-intensive science by advancing the Virtual Data concept, • To create Petascale Virtual Data Grids (PVDG) to meet the computational needs of thousands of scientists spread across the globe. Timescale: 5 years (2000-2005) • GriPhyN applications: • The CMS and ATLAS LHC experiments at CERN • LIGO (Laser Interferometer Gravitational Wave Observatory) • SDSS (Sloan Digital Sky Survey)

  13. The Virtual Data Concept Virtual data: the definition and delivery to a large community of a (potentially unlimited) virtual space of data products derived from experimental data. In virtual data space, requests can be satisfied via direct access and/or computation, with local and global resource management, policy, and security constraints determining the strategy used. • GriPhyN IT targets: • Virtual Data technologies: new methods of cataloging, characterizing, validating, and archiving software components to implement virtual data manipulations • Policy-driven request planning and scheduling of networked data and computational resources: mechanisms for representing and enforcing both local and global policy constraints and new policy-aware resource discovery techniques. • Management of transactions and task execution across national-scale and worldwide virtual organizations: new mechanisms to meet user requirements for performance, reliability, and cost.
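Below is a minimal Python sketch of the virtual-data idea described on this slide (class and attribute names are invented for illustration, not GriPhyN code): a request for a data product is satisfied either by direct access to an already materialized copy, or by computation, re-running the recorded derivation that produced it.

    # Hypothetical sketch of a virtual data catalog: requests are satisfied by
    # direct access when a materialized copy exists, otherwise by computation.
    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class Derivation:
        inputs: List[str]        # names of the input data products
        transform: Callable      # function deriving the product from its inputs

    class VirtualDataCatalog:
        def __init__(self):
            self.materialized: Dict[str, object] = {}     # products already on storage
            self.derivations: Dict[str, Derivation] = {}  # how each product can be derived

        def request(self, name: str):
            # Strategy 1: direct access, if a replica already exists.
            if name in self.materialized:
                return self.materialized[name]
            # Strategy 2: computation, deriving the required inputs recursively first.
            d = self.derivations[name]
            args = [self.request(i) for i in d.inputs]
            product = d.transform(*args)
            self.materialized[name] = product   # cache for later direct access
            return product

In the real system the choice between direct access and recomputation would also be driven by the policy, cost and security constraints mentioned above, which this sketch omits.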

  14. Sample VDG Architecture

  15. Petascale Virtual Data Grids Petascale – both computationally intensive (Petaflops) and data intensive (Petabytes). Virtual – containing little ready-to-use information, instead focusing on methods of deriving this information from other data. The Tier Concept: developed for use by the most ambitious LHC experiments: ATLAS and CMS. • Tier 0: CERN HQ • Tier 1: National center • Tier 2: Regional center • Tier 3: HPC center • Tier 4: Desktop PC cluster

  16. The DataGrid (EDG) Project Aim: To enable next-generation scientific exploration which requires sharing intensive computation and analysis of shared large-scale databases, from hundreds of terabytes to petabytes, across widely distributed scientific communities. DataGrid Work Packages: WP1: Workload Management WP2: Data Management WP3: Monitoring Services WP4: Fabric Management WP5: Storage Management WP6: Integration (testbeds) WP7: Network WP8: Application – Particle Physics WP9: Application – Biomedical Imaging WP10: Application – Satellite surveys WP11: Dissemination WP12: Project Management

  17. GLUE Working Model • The following actions take place once an interoperability issue is encountered: • The DataTAG/iVDGL managers define a plan and sub-tasks to address the relevant issue. This plan includes integrated tests and demonstrations which define overall success. • The DataTAG/iVDGL sub-task managers assemble all the input required to address the issue at hand. The HIJTB and other relevant experts would be strongly involved. • The DataTAG/iVDGL sub-task managers organize getting the work done using the identified solutions. • At appropriate points the work is presented to the HICB, which discusses it on a technical level. Iterations take place. • At appropriate points the evolving solutions are presented to the HICB. • At an appropriate point the final solution is presented to the HICB with a recommendation that it be accepted by Grid projects.

  18. GLUE Working Model - example • Issue: DataGrid and iVDGL use different data models for publishing resource information. Therefore resource brokers (RBs) cannot work across domains. • The HIJTB recognizes this and proposes it as an early topic to address. The DataTAG/iVDGL management is advised to discuss this early on. • DataTAG management has already identified this as a sub-task. • DataTAG/iVDGL employees are assigned to the problem. • Many possible solutions exist, from consolidation to translation at various levels (the information services level or even the RB level). The managers discuss the problem with clients in order to ascertain the optimal solution. • The group involved organizes its own meetings (independently of the monthly HIJTB meetings). [this is taking place now] • A common resource model is proposed. Once it has been demonstrated to work within a limited test environment, the HIJTB/HICB will discuss if and when to deploy it generally, taking into account the ensuing modifications which will be needed to other components such as the resource broker.
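As a loose illustration of "translation at the information services level" mentioned in this example, the sketch below maps a site's natively published resource record onto a common (GLUE-like) attribute set so that a resource broker from either domain can interpret it. All field names are invented for illustration and are not the actual EDG or iVDGL schemas.

    # Hypothetical translation of a natively published resource record into a
    # common resource model (attribute names are illustrative only).
    COMMON_FIELDS = {
        # native attribute name -> common-model attribute name
        "freecpus": "FreeCPUs",
        "totalcpus": "TotalCPUs",
        "queue": "QueueName",
        "maxwalltime_minutes": "MaxWallClockTime",
    }

    def to_common_model(native_record: dict) -> dict:
        """Translate one published resource record into the common schema."""
        return {
            COMMON_FIELDS[key]: value
            for key, value in native_record.items()
            if key in COMMON_FIELDS
        }

    # A broker from either domain now sees the same attribute names:
    print(to_common_model({"freecpus": 12, "queue": "short", "site": "cern"}))
    # -> {'FreeCPUs': 12, 'QueueName': 'short'}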

  19. GLUE Schemas • GLUE schemas: descriptions of the objects and attributes needed to describe Grid resources and their mutual relations. • GLUE schemas include: • Computing Element (CE) schema – in development • Storage Element (SE) schema – TBD • Network Element (NE) schema – TBD The development of schemas is coordinated by the JTB in collaboration with Globus, PPDG and EDG WP managers.

  20. CE Schema, version 4 – 24/05/2002 • Computing Element: an entry point into a queuing system. Each queue points to one or more clusters. • Cluster: a group of subclusters or individual nodes. A cluster may be referenced by more than one computing element. • Subcluster: a homogeneous group of individual computing nodes (all nodes must be represented by a predefined set of attributes). • Host: a physical computing node. No host may be part of more than one subcluster.
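The entity relationships on this slide can be pictured with the short Python sketch below; the attribute names are made up for illustration and are not the official GLUE attribute set.

    # Rough sketch of the CE schema hierarchy: CE -> Cluster -> SubCluster -> Host.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Host:
        hostname: str            # a host belongs to exactly one subcluster
        cpu_model: str
        memory_mb: int

    @dataclass
    class SubCluster:
        name: str
        hosts: List[Host] = field(default_factory=list)   # homogeneous set of hosts

    @dataclass
    class Cluster:
        name: str                # a cluster may be referenced by several CEs
        subclusters: List[SubCluster] = field(default_factory=list)

    @dataclass
    class ComputingElement:
        queue_name: str          # entry point into a queuing system
        clusters: List[Cluster] = field(default_factory=list)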

  21. GLUE Schema Representation In existing MDS models, GLUE schemas and their hierarchies can be represented through DITs (Directory Information Trees). Globus MDS v2.2 will be updated to handle the new schema. In future OGSA-based implementations (Globus v3.0) the structure can be converted to an XML document.
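As a rough illustration of the second (XML-based) representation, the sketch below serializes the ComputingElement hierarchy from the previous sketch into an XML document using Python's standard library; the element and attribute names are invented and do not follow the OGSA/GLUE XML schema.

    # Hypothetical XML rendering of the CE hierarchy (illustrative names only).
    import xml.etree.ElementTree as ET

    def ce_to_xml(ce: "ComputingElement") -> str:
        root = ET.Element("ComputingElement", queue=ce.queue_name)
        for cluster in ce.clusters:
            c = ET.SubElement(root, "Cluster", name=cluster.name)
            for sub in cluster.subclusters:
                s = ET.SubElement(c, "SubCluster", name=sub.name)
                for host in sub.hosts:
                    ET.SubElement(s, "Host",
                                  hostname=host.hostname,
                                  cpuModel=host.cpu_model,
                                  memoryMB=str(host.memory_mb))
        return ET.tostring(root, encoding="unicode")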

  22. GLUE Stage I Aims: Integration of the US (iVDGL) and European (EDG) testbeds; developing a permanent set of reference tests for new releases and services. • Phase I • Cross-organizational authentication • Unified service discovery and information infrastructure • Test of Phase I infrastructure • Phase II • Data movement infrastructure • Test of Phase II infrastructure • Phase III • Community authorization services • Test of the complete service Status: in progress

  23. Grid Middleware and Testbed • The following middleware will be tested in Stage I of GLUE: • EDG Work Packages WP1 (Workload management), WP2 (Data management), WP3 (Information and monitoring services), WP5 (Storage management) • GriPhyN middleware – Globus 2.0, Condor v6.3.1, VDT 1.0 • The GLUE testbed will consist of: • Computational resources: several CEs from DataTAG and iVDGL respectively. • Storage: access to mass storage systems at CERN and US Tier 1 sites. • Network: standard production networks should be sufficient.

  24. GLUE Stage I Schedule • Feb 2002: Test interoperating certificates between US and EU – done • May 2002: Review of common resource discovery schema – in progress • Jun 2002: Full testbed proposal available for review; review of common storage schema; first version of common use cases (EDG WP8); refinement of testbed proposals through HICB feedback • Jul 2002: Intercontinental resource discovery infrastructure in test mode for production deployment in September • Sep 2002: Interoperating Community and VO authorization available; implementation of common use cases by the experiments • Nov 2002: Demonstrations planned • Dec 2002: Sites integrated into the Grid, executing all goals of Stage I

  25. Unresolved Issues • Ownership of GLUE schemas • Maintenance of GLUE schemas • Ownership (and maintenance) of MDS information providers

  26. Web Addresses • GLUE Homepage at HICB: http://www.hicb.org/glue/glue.html • GLUE-Schema site: http://www.hicb.org/glue/glue-schema/schema.htm • HENP Collaboration page: http://www.hicb.org • The DataTAG Project: http://www.datatag.org • The iVDGL Project: http://www.ivdgl.org • The GriPhyN Project: http://www.griphyn.org • European DataGrid: http://www.eu-datagrid.org
