Grid and its applications

Grid and its applications. Oxana Smirnova, Lund / CERN, NorduGrid/LCG/ATLAS. Reykjavik, November 17, 2004. Outlook: Grid vision and history; Grid necessity: demanding applications; Information Technology developments; Grid solutions; development and deployment projects.

Presentation Transcript


  1. Grid and its applications • Oxana Smirnova, Lund / CERN, NorduGrid/LCG/ATLAS • Reykjavik, November 17, 2004

  2. Outlook • Grid vision and history • Grid necessity: demanding applications • Information Technology developments • Grid solutions • Development and deployment projects

  3. Grid vision and history

  4. From distributed resources … • Present situation: • cross-national projects • users and resources in different domains • separate access to each resource

  5. … to World Wide Grid • Future: • multinational projects • resource location is irrelevant • “plug-n-play” access to all the resources

  6. Grid history: users’ perspective • Metacomputing is a decades-old idea • Previous attempts, including Condor, failed to appeal to users • Progress in commercial hardware has always been faster than in Open Source-like middleware, so it was easier to buy a bigger supercomputer/cluster • Globus Toolkit 1 was heading into oblivion in early 2000 • Physicists in Europe and the USA realized around 2000 (Y2K) that the time for metacomputing was ripe • MONARC project (CERN) developed a multi-tiered model for distributed analysis of data • Particle Physics Data Grid (PPDG) and GriPhyN projects by US physicists started using Grid technologies • Globus was picked up by the CERN-led EU DataGrid (EDG) project • EDG failed to satisfy user demands; many simpler solutions appeared, triggered by physicists: • NorduGrid (Northern Europe and others) • Grid3 (USA) • gLite (EU, a prototype)

  7. Driven by High Energy Physics

  8. Large Hadron Collider: World’s biggest accelerator at CERN

  9. Collisions at LHC

  10. ATLAS: one of 4 detectors at LHC

  11. ATLAS: preparing for data taking

  12. ATLAS simulation flow

  13. Piling up events

  14. Characteristics of HEP computing • Event independence: • Data from each collision is processed independently: trivial parallelism • Mass of independent problems with no information exchange • Massive data storage: • Modest event size: 1 – 10 MB (although some are up to 1-2 GB) • Total is very large: Petabytes for each experiment • Mostly read-only: • Data never changed after recording to tertiary storage • But is read often! A tape is mounted at CERN every second! • Resilience rather than ultimate reliability: • Individual components should not bring down the whole system • Reschedule jobs on failed equipment • Modest floating point needs: • HEP computations involve decision making rather than calculation
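
The event independence and job rescheduling described on this slide can be illustrated with a minimal Python sketch. It is not taken from the talk: `reconstruct`, `run_with_retry` and `process_events` are hypothetical names, the "reconstruction" is a stand-in computation, and the failure pattern is simulated purely to show resubmission.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event_id, attempt):
    """Hypothetical per-event job; simulates a lost worker on the
    first attempt for every third event."""
    if event_id % 3 == 0 and attempt == 0:
        raise RuntimeError(f"worker lost while processing event {event_id}")
    return event_id * event_id  # stand-in for real reconstruction output


def run_with_retry(event_id, max_attempts=3):
    """Resubmit a failed job instead of letting it abort the whole run."""
    for attempt in range(max_attempts):
        try:
            return reconstruct(event_id, attempt)
        except RuntimeError:
            continue  # reschedule: simply try again
    raise RuntimeError(f"event {event_id} failed {max_attempts} times")


def process_events(event_ids, workers=4):
    """Events share no state, so they can be mapped over a worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_with_retry, event_ids))


results = process_events(range(6))
```

Because no event needs information from another, the only coordination is collecting the results; a single failing job costs one retry rather than the whole run.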

  15. Very demanding tasks • Data-intensive tasks • Large datasets, large files • Lengthy processing times • Large memory consumption • High throughput is necessary • Very distributed user base • Distributed computing resources of modest size • Produced and processed data are hence distributed, too • Issues of coordination, synchronization and authorization are outstanding • HEP is by no means unique in its demands, but HEP users are first, they are many, and they badly need a solution

  16. Other applications • Medical and biomedical: • Image processing (digital X-ray image analysis) • Simulation for radiation therapy • Protein folding • Chemistry • Quantum • Organic • Polymer modelling • Climate studies • Space sciences • Physics: • High Energy and other accelerator physics • Theoretical physics, lattice calculations of all sorts • Neutrino physics • Combustion • Genomics • Material sciences • Even warfare • And many others

  17. IT perspective

  18. IT progress: some facts • Network vs. computer performance: • Computer speed doubles every 18 months • Network speed doubles every 9 months • 1986 to 2000: • Computers: 500 times faster • Networks: 340000 times faster • 2001 to 2010 (projected): • Computers: 60 times faster • Networks: 4000 times faster • Bottom line: CPUs are fast enough; networks are very fast, gotta make use of it! • Slide adapted from the Globus Alliance
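
As a rough check of the figures on this slide, the doubling periods can be turned into growth factors with a short Python sketch. The function name `growth_factor` is ours, and the interval lengths (14 years for 1986 to 2000, 9 years for 2001 to 2010) are an assumption about how the slide counts them.

```python
def growth_factor(months, doubling_period_months):
    """Speed-up after `months` if performance doubles every
    `doubling_period_months` months."""
    return 2 ** (months / doubling_period_months)


# Assumed interval lengths: 14 years and 9 years respectively.
for label, months in (("1986 to 2000", 14 * 12), ("2001 to 2010", 9 * 12)):
    computers = growth_factor(months, 18)  # doubling every 18 months
    networks = growth_factor(months, 9)    # doubling every 9 months
    print(f"{label}: computers x{computers:,.0f}, networks x{networks:,.0f}")
```

For the 9-year projection this gives exactly 2^6 = 64 for computers and 2^12 = 4096 for networks, in line with the quoted 60 and 4000. For 1986 to 2000 the same formula gives values of the same order as the quoted 500 and 340000, so the slide's numbers are consistent with simple exponential doubling.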

  19. The Grid Paradigm (diagram labels: The Grid, Supercomputer, PC Farm, Workstation) • Distributed supercomputer, based on commodity PCs and fast WAN • Access to the great variety of resources by a single pass: a certificate • A possibility to manage distributed data in a synchronous manner • A new commodity

  20. Wider scope: a Grid System • A Grid system is a collection of distributed resources connected by a network • Examples of distributed resources: • Desktop • Handheld hosts • Devices with embedded processing resources such as digital cameras and phones • Tera-scale supercomputers • Slide adapted from A. Grimshaw

  21. Characteristics of a generic Grid system • Numerous resources • Ownership by mutually distrustful organizations and individuals • Connected by heterogeneous, multi-level networks • Different security requirements and policies required • Different resource management policies • Potentially faulty resources • Geographically separated • Resources are heterogeneous • Slide adapted from A. Grimshaw

  22. Grid paradigm is overloaded • Global Grids: • Multiple enterprises, owners, platforms, domains, file systems, locations, and security policies • Legion, Avaki, Globus • Enterprise “Grids”: • Single enterprise; multiple owners, platforms, domains, file systems, locations, and security policies • SUN SGE EE, Platform Multicluster • Cluster & Departmental “Grids”: • Single owner, platform, domain, file system and location • SUN SGE, Platform LSF, PBS • Desktop Cycle Aggregation: • Desktop only • United Devices, Entropia, Data Synapse • WARNING! Not everything that has “G” in the name is Grid! (SGE, Oracle 10g, Condor-G etc.) • Graph borrowed from A. Grimshaw

  23. Implementations

  24. Globus: the toolkit provider • The first and only provider of a Grid toolkit (libraries and API) • An academic research project in the USA and now Europe • Free software, open code • Supports Grid testbeds since the late 90’s • Grid features: • Heterogeneous • Non-interactive • Single logon • Optimized file transfer protocol • Information schema • To do: • Global resource management • Data management • User management, accounting

  25. The Globus Toolkit v2 in One Slide • Grid protocols (GSI, GRAM, …) enable resource sharing within virtual organizations; toolkit provides reference implementation (= Globus Toolkit 2 services) • GSI (Grid Security Infrastructure): • Authenticate & create proxy credential • Other GSI-authenticated remote service requests • GRAM (Grid Resource Allocation & Management): • Reliable remote invocation • MDS-2 (Monitoring and Discovery Service): • Soft state registration; enquiry • Diagram labels: User; Proxy; Proxy #2; Gatekeeper (factory); Create process; Register; User process #1; User process #2; Reporter (registry + discovery); GIIS: Grid Information Index Server (discovery); Other service (e.g. GridFTP) • Protocols (and APIs) enable other tools and services for membership, discovery, data management, workflow, … • Slide adapted from the Globus Alliance
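
The "soft state registration; enquiry" idea named on this slide can be sketched in a few lines of Python: a registration is only valid for a limited time, so a resource that stops refreshing it silently drops out of the index. This is an illustration of the general technique only, not the MDS-2 protocol or API; the class `SoftStateRegistry`, its methods, the 60-second lifetime and the host names are all hypothetical.

```python
class SoftStateRegistry:
    """Toy index: entries vanish unless the resource keeps re-registering."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expires = {}  # resource name -> expiry timestamp

    def register(self, resource, now):
        """Create or refresh a registration valid for `ttl` seconds."""
        self._expires[resource] = now + self.ttl

    def enquire(self, now):
        """Drop expired entries and return the names still registered."""
        self._expires = {r: t for r, t in self._expires.items() if t > now}
        return sorted(self._expires)


index = SoftStateRegistry(ttl_seconds=60)
index.register("cluster-a.example.org", now=0)
index.register("cluster-b.example.org", now=0)
alive_at_30 = index.enquire(now=30)               # both still registered
index.register("cluster-a.example.org", now=50)   # only cluster-a refreshes
alive_at_90 = index.enquire(now=90)               # cluster-b has expired
```

Timestamps are passed in explicitly to keep the sketch deterministic; a real service would use a clock. The design point is that no explicit "unregister" message is needed, which suits the potentially faulty resources a Grid has to tolerate.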

  26. Globus-Based Grid Tools & Applications • Data Grids • Distributed management of large quantities of data: physics, astronomy, engineering • High-throughput computing • Coordinated use of many computers • Collaborative environments • Authentication, resource discovery, and resource access • Portals • Thin client access to remote resources & services • And combinations of the above Slide adapted from the Globus Alliance

  27. Some architectural thoughts (diagram labels: User Interface ×3, Workload manager ×2, Information Server ×3, Data location server, Storage ×2)

  28. Some Grid projects (past and present) • Only few develop actual Grid solutions • Many more are starting • Diagram labels: US projects; European projects; GriPhyN; PPDG; iVDGL • Slide adapted from Les Robertson

  29. Some Grid projects timeline • Other Grid-related projects do not develop Open Source-like (i.e., free) software/middleware, as of today • Most notably, Legion/Avaki: Globus competitor, widely used by businesses • Entropia: like SETI@Home • IBM, Platform: Globus-based • Sun Grid Engine EE: enterprise Grids

  30. What Grid can do today (diagram labels: Broker(s), SE, MSS, ???) • Simplest Grid: users access distributed resources using a single certificate • More complex Grid: users’ tasks are distributed between different resources by a broker • Even more complex Grid: not only tasks, but massive amounts of data are also distributed and managed (not quite there yet, only prototypes)
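
The "more complex Grid" case, where a broker distributes users' tasks between resources, can be sketched as a toy matchmaker in Python. The policy used here (send each task to the resource with the most free slots) is an assumption chosen for brevity, not the algorithm of any broker mentioned in the talk; `Resource`, `broker` and the site names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_slots: int


def broker(tasks, resources):
    """Assign each task to the resource that currently has the most free slots."""
    assignment = {}
    for task in tasks:
        best = max(resources, key=lambda r: r.free_slots)
        if best.free_slots == 0:
            raise RuntimeError("no free slots left on any resource")
        best.free_slots -= 1  # the slot is now occupied by this task
        assignment[task] = best.name
    return assignment


sites = [Resource("site-a", 2), Resource("site-b", 1)]
plan = broker(["job1", "job2", "job3"], sites)
```

A real broker would also weigh where the input data is stored and what the user's certificate authorizes, which is where the data management and policy gaps listed on the next slide come in.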

  31. What is missing • Common policies, or ways of mutually respecting such • Grid accounting systems and Grid economy • Serious security solutions; role-based access control • Full-blown distributed data management systems • Tools and methods for system-wide applications environment deployment • STANDARDS!

  32. The emergence of Open Grid standards (chart: functionality and standardization vs. time; axis marks 1990, 1995, 2000, 2005, 2010) • Custom solutions • Globus Toolkit: de facto standard, single implementation • Internet standards • OGSA, WSRF: real standards, multiple implementations • Web services, etc. • Managed shared virtual systems • Computer science research • Slide adapted from the Globus Alliance

  33. The Grid or many Grids? • Globus Toolkit 2 is a basis for a great many Grid solutions • Which use some common tools and utilities: GSI, GridFTP • But they also differ a lot, architecturally and technologically • There are several non-interoperable GT2-based Grid systems! • No satisfactory ready-made solutions, so developers invent their own • Being financed from different sources, developers and users are not always encouraged to adopt a rival project’s solution • Instead of “How should I use Grid?”, users ask “Which Grid should I use?” • Grid standards body: Global Grid Forum (GGF) • Heavily oriented towards commercial implementations • No effective standards since 2001 • Globus introduced the “Open Grid Services Architecture” (OGSA) • Not yet used by any of the development projects • Perhaps the first set of standards endorsed by GGF • Globus Toolkit 3 is released • New step by Globus: “Web Services Resource Framework” (WSRF) • We face Globus Toolkit 4 very soon…

  34. Meanwhile: ATLAS Production System uses 3 Grids

  35. Conclusion • HEP community stirred a world-wide Grid interest • Next big thing after the dot-com?.. • Despite a slow start and much hype, some real work is under way • Rather, the next big thing after the WWW ! • Still, no complete solution exists • Data management? • Accounting? • Security? • Standardization? • With courage and patience, we should go Grid
