Introduction to the Grid: technologies and projects. Oxana Smirnova, Lund University. October 28, 2003, Košice. Outlook: Information Technology developments; Grid solutions; High Energy Physics challenges; Development and deployment projects.
Introduction to the Grid: technologies and projects
Oxana Smirnova
Lund University
October 28, 2003, Košice

Outlook
• Information Technology developments
• Grid solutions
• High Energy Physics challenges
• Development and deployment projects
oxana.smirnova@hep.lu.se

IT progress: some facts
(Slide adapted from the Globus Alliance)
• Network vs. computer performance:
  • Computer speed doubles every 18 months
  • Network speed doubles every 9 months
• 1986 to 2000:
  • Computers: 500 times faster
  • Networks: 340,000 times faster
• 2001 to 2010 (projected):
  • Computers: 60 times faster
  • Networks: 4,000 times faster
Bottom line: CPUs are fast enough; networks are very fast – gotta make use of it!

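The figures on this slide can be checked against the two doubling periods it quotes. The sketch below is only an arithmetic illustration; the function name `growth_factor` is made up here and is not part of the original talk. Assuming clean exponential growth, 14 years gives roughly 645 times for computers and about 416,000 times for networks, and 9 years gives exactly 64 and 4096 times. These are the same orders of magnitude as the 500 / 340,000 and 60 / 4,000 quoted on the slide.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Speed-up after `months` if performance doubles every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


# Doubling periods quoted on the slide, in months.
CPU_DOUBLING = 18
NET_DOUBLING = 9

# The two intervals quoted on the slide, expressed in years.
for label, years in (("1986 to 2000", 14), ("2001 to 2010", 9)):
    months = 12 * years
    cpu = growth_factor(months, CPU_DOUBLING)
    net = growth_factor(months, NET_DOUBLING)
    print(f"{label}: computers x{cpu:,.0f}, networks x{net:,.0f}")
```

Because the network doubling period is half the computer one, the network factor is always the square of the computer factor over the same interval, which is why the gap widens so quickly.
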
The Grid Paradigm
[Diagram labels: The Grid; Supercomputer; PC Farm; Workstation]
• Distributed supercomputer, based on commodity PCs and fast WAN
• Access to a great variety of resources with a single pass: a certificate
• The possibility of managing distributed data in a synchronous manner (e.g., LHC data analysis)
• A new commodity

Wider scope: a Grid System
(Slide adapted from A. Grimshaw)
A Grid system is a collection of distributed resources connected by a network.
Examples of distributed resources:
• Desktop
• Handheld hosts
• Devices with embedded processing resources such as digital cameras and phones
• Tera-scale supercomputers

Characteristics of a generic Grid system
(Slide adapted from A. Grimshaw)
• Numerous Resources
• Ownership by Mutually Distrustful Organizations & Individuals
• Connected by Heterogeneous, Multi-Level Networks
• Different Security Requirements & Policies Required
• Different Resource Management Policies
• Potentially Faulty Resources
• Geographically Separated
• Resources are Heterogeneous

Grid paradigm is overloaded
(Graph borrowed from A. Grimshaw)
• Global Grids
  • Multiple enterprises, owners, platforms, domains, file systems, locations, and security policies
  • Legion, Avaki, Globus
• Enterprise “Grids”
  • Single enterprise; multiple owners, platforms, domains, file systems, locations, and security policies
  • SUN SGE EE, Platform Multicluster
• Cluster & Departmental “Grids”
  • Single owner, platform, domain, file system and location
  • SUN SGE, Platform LSF, PBS
• Desktop Cycle Aggregation
  • Desktop only
  • United Devices, Entropia, Data Synapse
WARNING! Not everything that has “G” in the name is Grid! (SGE, Oracle 10g, Condor-G, etc.)

Globus: the toolkit provider
• The first and only provider of a Grid toolkit (libraries and API)
• An academic research project in the USA and now Europe
• Free software, open code
• Has supported Grid testbeds since the late 90’s
• To do:
  • Global resource management
  • Data management
  • User management, accounting
Grid features:
• Heterogeneous
• Non-interactive
• Single logon
• Optimized file transfer protocol
• Information schema

The Globus Toolkit v2 in One Slide
(Slide adapted from the Globus Alliance)
• Grid protocols (GSI, GRAM, …) enable resource sharing within virtual organizations; the toolkit provides a reference implementation (= Globus Toolkit services)
  • GSI (Grid Security Infrastructure): authenticate & create proxy credential
  • GRAM (Grid Resource Allocation & Management): reliable remote invocation
  • MDS-2 (Monitoring and Discovery Service): soft state registration; enquiry
• Protocols (and APIs) enable other tools and services for membership, discovery, data management, workflow, …
[Diagram labels: User; Proxy; Proxy #2; Gatekeeper (factory); Create process; User process #1; User process #2; Register; Reporter (registry + discovery); GIIS: Grid Information Index Server (discovery); Other GSI-authenticated remote service requests; Other service (e.g. GridFTP)]

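The "soft state registration; enquiry" idea attributed to MDS-2 above can be illustrated with a toy registry: a service's entry is only valid for a limited time and disappears unless the service keeps re-registering. This is a minimal sketch of the general soft-state technique, not the MDS-2 protocol or any Globus API; the class `SoftStateRegistry`, its methods, the 30-second lifetime, and the host names are all hypothetical.

```python
import time
from typing import Dict, List, Optional


class SoftStateRegistry:
    """Toy index in which every registration expires unless it is refreshed."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._expires_at: Dict[str, float] = {}

    def register(self, service: str, now: Optional[float] = None) -> None:
        """Register or refresh a service; the entry lives for ttl_seconds."""
        now = time.monotonic() if now is None else now
        self._expires_at[service] = now + self.ttl_seconds

    def enquire(self, now: Optional[float] = None) -> List[str]:
        """Return the services whose registration has not yet expired."""
        now = time.monotonic() if now is None else now
        # Drop stale entries: nobody has to send an explicit "unregister".
        self._expires_at = {s: t for s, t in self._expires_at.items() if t > now}
        return sorted(self._expires_at)


# Hypothetical services; explicit timestamps keep the example deterministic.
registry = SoftStateRegistry(ttl_seconds=30)
registry.register("gatekeeper.example.org", now=0)
registry.register("gridftp.example.org", now=20)
print(registry.enquire(now=25))  # both entries are still fresh
print(registry.enquire(now=35))  # the gatekeeper entry has expired
```

The design choice worth noting is that a crashed or unreachable resource simply ages out of the index, which suits the "potentially faulty resources" of a generic Grid system.
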
Globus-Based Grid Tools & Applications
(Slide adapted from the Globus Alliance)
• Data Grids
  • Distributed management of large quantities of data: physics, astronomy, engineering
• High-throughput computing
  • Coordinated use of many computers
• Collaborative environments
  • Authentication, resource discovery, and resource access
• Portals
  • Thin client access to remote resources & services
• And combinations of the above

Some architectural thoughts
[Diagram labels: User Interface (three instances); Workload manager (two instances); Information Server (three instances); Data location server; Storage (two instances)]

Who needs Grid: High Energy Physics challenges
• Data-intensive tasks
  • Large datasets, large files
  • Lengthy processing times
  • Large memory consumption
  • High throughput is necessary
• Very distributed user base
  • Distributed computing resources of modest size
  • Produced and processed data are hence distributed, too
  • Issues of coordination, synchronization and authorization are outstanding
• HEP is by no means unique in its demands, but they are first, they are many, and they badly need it

Experiment-Grid interaction
[Diagram labels: Experiment; Grid; Task; Job Description; Resource Broker; Information System; Replica Location; Input DB; Output DB; Monitoring & control; Resources (CPU, Disk); MSS; Paper]

HEP-related Grid projects
(Slide adapted from Les Robertson)
• US projects: GriPhyN, PPDG, iVDGL; The Virtual Data Toolkit (VDT)
• European projects: The DataGRID Toolkit
• Many national, regional Grid projects: GridPP (UK), INFN-grid (I), NorduGrid, Dutch Grid, …

Related Grid projects
• Other Grid-related projects do not develop Open Source-like (i.e., free) software/middleware, as of today
  • Most notably, Legion/Avaki: Globus competitor, widely used by businesses
  • Entropia: like SETI@Home
  • IBM, Platform: Globus-based
  • Sun Grid Engine EE: enterprise Grids

What Grid can do today
[Diagram labels: Broker(s); SE; MSS; several nodes marked “???”]
• Simplest Grid: users access distributed resources using a single certificate
• More complex Grid: users’ tasks are distributed between different resources by a broker
• Even more complex Grid: not only tasks, but massive amounts of data are also distributed and managed (not quite there yet, only prototypes)

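The "more complex Grid" case, in which a broker distributes users' tasks between resources, can be sketched as a simple matchmaking loop. This is an illustrative toy under stated assumptions, not the algorithm of any actual resource broker: the `Resource` and `Job` classes, the rule "the resource must already hold the input dataset", and the "most free CPUs wins" tie-break are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Resource:
    name: str
    free_cpus: int
    datasets: FrozenSet[str]  # input datasets already stored at this site


@dataclass
class Job:
    name: str
    cpus: int
    dataset: str


def match(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Pick the eligible resource with the most free CPUs, or None."""
    candidates = [
        r for r in resources
        if r.free_cpus >= job.cpus and job.dataset in r.datasets
    ]
    return max(candidates, key=lambda r: r.free_cpus) if candidates else None


def dispatch(jobs: List[Job], resources: List[Resource]) -> Dict[str, Optional[str]]:
    """Assign jobs in order, reserving CPUs on the chosen resource."""
    placement: Dict[str, Optional[str]] = {}
    for job in jobs:
        chosen = match(job, resources)
        if chosen is not None:
            chosen.free_cpus -= job.cpus  # reserve capacity for this job
        placement[job.name] = chosen.name if chosen else None
    return placement


# Hypothetical sites and jobs.
sites = [
    Resource("cluster-a", 4, frozenset({"run1"})),
    Resource("cluster-b", 8, frozenset({"run1", "run2"})),
]
jobs = [Job("j1", 6, "run1"), Job("j2", 4, "run1"),
        Job("j3", 2, "run2"), Job("j4", 1, "run3")]
print(dispatch(jobs, sites))
```

In this toy the job that needs dataset `run3` is left unplaced because no site holds it; moving the data to the job instead is exactly the "even more complex Grid" that the slide says exists only as prototypes.
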
What is missing
• Common policies, or ways of mutually respecting them
• Grid accounting systems and Grid economy
• Serious security solutions; role-based access control
• Full-blown distributed data management systems
• Tools and methods for system-wide applications environment deployment
• STANDARDS!

The Grid or many Grids?
• Globus Toolkit 2 is a basis for a great many Grid solutions
  • Which use some common tools and utilities: GSI, GridFTP
  • But they also differ a lot, architecturally and technologically
• There are several non-interoperable GT2-based Grid systems!
  • No satisfactory ready-made solutions, so developers invent their own
  • Being financed from different sources, developers and users are not always encouraged to adopt a rival project’s solution
  • Instead of “How should I use Grid?”, users ask “Which Grid should I use?”
• Grid standards body: Global Grid Forum (GGF)
  • Heavily oriented towards commercial implementations
  • No effective standards since 2001
• Meanwhile, Globus introduced the “Open Grid Services Architecture” (OGSA)
  • Globus Toolkit 3 is released
  • Not yet used by any of the development projects
  • Perhaps the first set of standards endorsed by GGF

The emergence of Open Grid standards
(Slide adapted from the Globus Alliance)
[Chart: functionality and standardization versus time, 1990 to 2010. Stage labels: Custom solutions; Globus Toolkit (Internet standards; de facto standard, single implementation); Open Grid Services Arch (Web services, etc.; real standards, multiple implementations); Managed shared virtual systems (computer science research)]

Open Grid Services Architecture
• Standard interfaces & behaviors for distributed system management
• Service orientation: Grid Services, in analogy to Web Services
  • Web services: persistent
  • Grid services: transient (issues: e.g., how are they discovered?)
  • Extending WSDL to GSDL (work with W3C)
• Standard service specifications
  • Resource management
  • Data management
  • Workflow
  • Security
  • etc.
• Paves the road towards interoperability and true modularity of Grid structures

Conclusion
• The HEP community stirred world-wide interest in the Grid
  • Next big thing after the dot-com?..
• Despite a slow start and much hype, some real work is under way
  • Rather, the next big thing after the WWW!
• Still, no complete solution exists
  • Data management?
  • Accounting?
  • Security?
  • Standardization?
• With courage and patience, we should go Grid