What are Grids and e-Science?

What are Grids and e-Science?. David Fergusson. EGEE is funded by the European Union under contract IST-2003-508833. Acknowledgements. This talk is based on a module of the tutorials delivered by the EDG training team and slides from Andrew Grimshaw, University of Virginia

Presentation Transcript


  1. What are Grids and e-Science? David Fergusson. EGEE is funded by the European Union under contract IST-2003-508833. Using Web Services: What is Grid? June 3rd-4th, 2004.

  2. Acknowledgements
     • This talk is based on a module of the tutorials delivered by the EDG training team and slides from:
         • Andrew Grimshaw, University of Virginia
         • Bob Jones, EGEE Technical Director
         • Mark Parsons, EPCC
         • the EDG training team
         • Roberto Barbera, INFN
         • Ian Foster, Argonne National Laboratory
         • Jeffrey Grethe, SDSC
         • The National e-Science Centre
     • Prepared by Dave Berry

  3. Goals of this module
     • Introduce grid concepts and definitions
     • Why Grids?
     • A brief outline of history leading to EGEE
     • Provide some brief examples of middleware components
     • The strategic direction will be covered tomorrow

  4. Overview
     • What is different about grids?
     • e-Science
     • Characteristics of a grid
     • Applications (what’s in it for the working scientist)
     • European grids, and the world
     • Grid components

  5. What is different about grids?

  6. The (Science) Grid Vision
     • Researchers perform their activities regardless of geographical location, interact with colleagues, share and access data
     • Scientific instruments and experiments provide huge amounts of data
     • The Grid: networked data processing centres and “middleware” software as the “glue” of resources

  7. What is Grid Computing?
     • A Virtual Organisation is:
         • People from different institutions working towards a common goal
         • Sharing distributed processing and data resources
     • Grid infrastructure enables virtual organisations
     • “Grid computing is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” (I. Foster)

  8. Grids vs. Distributed Computing
     • Existing distributed applications:
         • tend to be specialised systems
         • intended for a single purpose or user group
     • Grids go further and take into account:
         • Different kinds of resources
             • Not always the same hardware, data and applications
         • Different kinds of interactions
             • User groups or applications want to interact with Grids in different ways
         • Dynamic nature
             • Resources and users added/removed/changed frequently

  9. The main drivers behind Grid
     • The relentless increase in microprocessor performance
         • you can buy multi-gigaflop systems for less than €800
     • The availability of reliable high performance networking
         • in Europe the GEANT network links 32 countries at speeds of up to 10 Gbps (and beyond)
         • in the UK the academic backbone has gone from 100 Mbps to 10 Gbps since 2000
         • 1 Gbps is commonly available to the desktop
     • The desire to push the boundaries of scientific discovery by computational analysis and simulation: e-Science

  10. Exponential Growth (chart: performance per dollar spent against number of years, 0 to 5)
     • Optical fibre (bits per second): doubling time 9 months; Gilder’s Law (32X in 4 yrs)
     • Data storage (bits per sq. inch): doubling time 12 months; Storage Law (16X in 4 yrs)
     • Chip capacity (# transistors): doubling time 18 months; Moore’s Law (5X in 4 yrs)
     • Source: “Triumph of Light”, Scientific American, George Stix, January 2001
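
The doubling times on the exponential-growth slide can be turned into growth factors with a line of arithmetic: a quantity that doubles every d months grows by 2^(12·y/d) in y years. The sketch below is only a worked check, and the function name is made up for illustration. Exact doubling gives 16X for the 12-month curve, matching the slide, but roughly 40X and 6.3X for the 9- and 18-month curves, so the slide's 32X and 5X appear to be rounded or conservative figures (that reading is an assumption).

```python
def growth_factor(doubling_months: float, years: float) -> float:
    """Growth multiple after `years` for a quantity doubling every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


# Doubling times (months) as labelled on the slide.
curves = {
    "optical fibre": 9,
    "data storage": 12,
    "chip capacity": 18,
}

for name, months in curves.items():
    print(f"{name}: {growth_factor(months, 4):.1f}X in 4 years")
```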

  11. How Different 2004 is from 1994
     • Moore’s law everywhere
         • Instruments, detectors, sensors, scanners, …
         • Organising their effective use is the challenge
     • Enormous quantities of data: Petabytes
         • For an increasing number of communities
         • Gating step is not collection but analysis
     • Huge quantities of computing: >100 Top/s
         • Moore’s law gives us all supercomputers
         • Organising their effective use is the challenge
     • Ultra-high-speed networks: >10 Gb/s
         • Global optical networks
         • Bottlenecks: last kilometre & firewalls

  12. e-Science

  13. The Emergence of e-Science
     • Invention and exploitation of advanced computational methods
         • To generate, curate and analyse research data
             • From experiments, observations and simulations
             • Quality management, preservation and reliable evidence
         • To develop and explore models and simulations
             • Computation and data at extreme scales
             • Trustworthy, economic, timely and relevant results
         • To enable dynamic distributed virtual organisations
             • Facilitating collaboration with information and resource sharing
             • Security, reliability, accountability, manageability and agility

  14. Why use Grids for e-Science?
     • Scale of the problems
     • Science increasingly done through distributed global collaborations enabled by the internet
     • Grids provide access to:
         • Very large data collections
         • Terascale computing resources
         • High performance visualisation
         • Connected by high-bandwidth networks
     • e-Science is more than Grid technology. It is what you do with it that counts.

  15. Challenges
     • Must share data between thousands of scientists with multiple interests
     • Must ensure that all data is accessible anywhere, anytime
     • Must be scalable and remain reliable for more than a decade
     • Must cope with different access policies
     • Must ensure data security

  16. The Emergence of Global Knowledge Communities (slide from Ian Foster’s SSDBM 03 keynote)

  17. Characteristics of a grid

  18. What are the characteristics of a Grid system?
     • Numerous Resources
     • Ownership by Mutually Distrustful Organizations & Individuals
     • Different Security Requirements & Policies Required
     • Potentially Faulty Resources
     • Resources are Heterogeneous

  19. What are the characteristics of a Grid system?
     • Standards
     • Numerous Resources
     • Ownership by Mutually Distrustful Organizations & Individuals
     • Connected by Heterogeneous, Multi-Level Networks
     • Different Security Requirements & Policies Required
     • Different Resource Management Policies
     • Potentially Faulty Resources
     • Geographically Separated
     • Resources are Heterogeneous

  20. Applications (What’s in it for working scientists)

  21. Grid Applications
     • Medical/Healthcare (imaging, diagnosis and treatment)
     • Bioinformatics (study of the human genome and proteome to understand genetic diseases)
     • Nanotechnology (design of new materials from the molecular scale)
     • Engineering (design optimization, simulation, failure analysis and remote instrument access and control)
     • Natural Resources and the Environment (weather forecasting, earth observation, modeling and prediction of complex systems)

  22. CERN: Data intensive science in a large international facility
     • The Large Hadron Collider (LHC)
         • The most powerful instrument ever built to investigate elementary particle physics
     • Data Challenge:
         • 10 Petabytes/year of data!
         • 20 million CDs each year!
     • Simulation, reconstruction, analysis:
         • LHC data handling requires computing power equivalent to ~100,000 of today's fastest PC processors!
     • (Photo labels: Mont Blanc (4810 m), downtown Geneva)
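
The two data-challenge figures on the CERN slide can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 petabyte = 10^15 bytes) and a 365-day year; both are assumptions, as are the function names. It shows that 10 petabytes spread over 20 million CDs implies about 500 MB per disc, and that 10 petabytes per year corresponds to an average sustained rate of a little over 300 MB/s.

```python
PETABYTE = 10 ** 15                   # assumed decimal petabyte, in bytes
SECONDS_PER_YEAR = 365 * 24 * 3600    # assumed non-leap year


def bytes_per_disc(annual_bytes: float, discs: int) -> float:
    """Capacity each disc would need to hold one year's data."""
    return annual_bytes / discs


def average_rate(annual_bytes: float) -> float:
    """Average data rate in bytes per second over one year."""
    return annual_bytes / SECONDS_PER_YEAR


annual = 10 * PETABYTE  # figure quoted on the slide
print(f"per CD: {bytes_per_disc(annual, 20_000_000) / 1e6:.0f} MB")
print(f"average rate: {average_rate(annual) / 1e6:.0f} MB/s")
```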

  23. CrossGrid
     1. Interactive biomedical simulation and visualization
     2. Flooding crisis team support
     3. HEP distributed data analysis
     4. Weather forecasting and air pollution modelling

  24. Connecting People: Access Grid (photo labels: remote video, visualisation, microphones, cameras)

  25. European grids, and the world

  26. Grid projects. Many Grid development efforts, all over the world:
     • UK: OGSA-DAI, RealityGrid, GeoDise, Comb-e-Chem, DiscoveryNet, DAME, AstroGrid, GridPP, MyGrid, GOLD, eDiamond, Integrative Biology, …
     • Netherlands: VLAM, PolderGrid
     • Germany: UNICORE, Grid proposal
     • France: Grid funding approved
     • Italy: INFN Grid
     • Eire: Grid proposals
     • Switzerland: Network/Grid proposal
     • Hungary: DemoGrid, Grid proposal
     • Norway, Sweden: NorduGrid
     • NASA Information Power Grid
     • DOE Science Grid
     • NSF National Virtual Observatory
     • NSF GriPhyN
     • DOE Particle Physics Data Grid
     • NSF TeraGrid
     • DOE ASCI Grid
     • DOE Earth Systems Grid
     • DARPA CoABS Grid
     • NEESGrid
     • DOH BIRN
     • NSF iVDGL
     • DataGrid (CERN, ...)
     • EuroGrid (Unicore)
     • DataTag (CERN, …)
     • Astrophysical Virtual Observatory
     • GRIP (Globus/Unicore)
     • GRIA (Industrial applications)
     • GridLab (Cactus Toolkit)
     • CrossGrid (Infrastructure Components)
     • EGSO (Solar Physics)

  27. Major EU GRID projects
     • European DataGrid (EDG): www.edg.org
     • LHC Computing GRID (LCG): cern.ch/lcg
     • CrossGRID: www.crossgrid.org
     • DataTAG: www.datatag.org
     • GridLab: www.gridlab.org
     • EUROGRID: www.eurogrid.org
     • European National Projects: INFNGRID, UK e-Science Programme, NorduGrid

  28. EU DataGrid at a glance
     • Application Testbed
         • ~20 regular sites
         • > 60,000 jobs submitted (since 09/03, release 2.0)
         • Peak >1000 CPUs
         • 6 Mass Storage Systems
     • People
         • 500 registered users
         • 12 Virtual Organisations
         • 21 Certificate Authorities
         • >600 people trained
         • 456 person-years of effort, 170 years funded
     • Software
         • > 65 use cases
         • 7 major software releases (> 60 in total)
         • > 1,000,000 lines of code
     • Scientific Applications
         • 5 Earth Obs institutes
         • 10 bio-medical apps
         • 6 HEP experiments

  29. The EGEE Project
     • Leverage national resources for broader European benefit
     • 70 institutions in 27 countries, federated in regional Grids

  30. Grid components

  31. Virtual Data Toolkit
     • Grid middleware components from several projects
     • Packaged and tested together
     • Foundation of EGEE/LCG
     • Components: Globus Toolkit, Condor, Chimera, EDG & LCG tools, NCSA tools, other tools

  32. Globus Toolkit
     • Grid Security Infrastructure (GSI)
         • X.509 authentication with delegation and single sign-on
     • Grid Resource Allocation Mgmt (GRAM)
         • Remote allocation, reservation, monitoring, control of compute resources
     • GridFTP protocol (FTP extensions)
         • High-performance data access & transport
     • Grid Resource Information Service (GRIS) + Monitoring and Discovery Service (MDS)
         • Access to structure & state information
     • XIO
         • TCP, UDP, IP multicast, and file I/O
     • Others…

  33. Condor
     • “Cycle-stealing”
         • Use idle CPU cycles for productive work
     • “High Throughput Computing”
         • Using all available compute power over periods of days, weeks, …
         • “Embarrassingly parallel” problems
     • Fault tolerance
         • Algorithms must allow for failure
         • Checkpointing and process migration
     • DAGMan
         • Workflow specification
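
The ideas of a workflow specified as a directed acyclic graph and of jobs that must tolerate failure can be illustrated with a small sketch. This is not DAGMan's input syntax or its implementation; it only uses Python's standard `graphlib` module to run hypothetical jobs in dependency order and to retry a job that fails. The job names, the `run_job` callback and the `max_retries` parameter are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def run_workflow(dependencies, run_job, max_retries=2):
    """Run jobs in dependency order, retrying each failed job a few times.

    `dependencies` maps a job name to the set of jobs that must finish first.
    `run_job` is a callable that raises RuntimeError when a job fails.
    """
    completed = []
    for job in TopologicalSorter(dependencies).static_order():
        for attempt in range(max_retries + 1):
            try:
                run_job(job)
                break  # job succeeded, move on to the next one
            except RuntimeError:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
        completed.append(job)
    return completed


# Hypothetical three-stage workflow: simulate -> reconstruct -> analyse.
workflow = {
    "simulate": set(),
    "reconstruct": {"simulate"},
    "analyse": {"simulate", "reconstruct"},
}

failures_left = {"reconstruct": 1}  # pretend this job fails once, then succeeds


def flaky_job(name):
    if failures_left.get(name, 0) > 0:
        failures_left[name] -= 1
        raise RuntimeError(f"{name} failed on an unreliable resource")


print(run_workflow(workflow, flaky_job))
```

A real scheduler would run independent jobs in parallel on different machines; the sequential loop here only shows the ordering and retry logic.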

  34. Chimera
     • Technology for collaborative management of data, programs & computations
     • Virtual data system
         • Virtual data catalog
         • Virtual data language
         • Automated data derivation
         • Provenance tracking
     • Pegasus
         • AI planning system for Grid workflows

  35. Tools
     • NCSA
         • MyProxy
         • GSI OpenSSH
     • EDG & LCG
         • Make Gridmap (Authorisation control)
         • Certificate Revocation List Updater
         • GLUE Schema (Monitoring)
     • Others
         • VDT System Profiler
         • Configuration software
         • KX509 (X.509 <-> Kerberos)
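
The "Make Gridmap (Authorisation control)" entry refers to maintaining a grid-mapfile, which maps the subject of a user's X.509 certificate to a local account. The sketch below parses a file in the commonly used layout of a quoted distinguished name followed by a local user name. That layout is stated here as an assumption rather than a specification, and the distinguished names, account names and function name are invented for illustration.

```python
import shlex


def parse_gridmap(text):
    """Return a dict mapping certificate subject (DN) to local account name.

    Assumes one entry per line: a double-quoted DN, whitespace, then the account.
    Blank lines and lines starting with '#' are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"unexpected grid-mapfile line: {line!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


# Made-up entries; real files are generated from Virtual Organisation membership.
example = '''
# subject DN                              local account
"/C=UK/O=Example/OU=Physics/CN=Jane Doe" jdoe
"/C=IT/O=Example/CN=Mario Rossi" mrossi
'''

for subject, account in parse_gridmap(example).items():
    print(f"{subject} -> {account}")
```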

  36. Summary (diagram labels: COLLABORATION, Display, Store data, Internet, FLEXIBILITY, Create data, Process data, Standards)

  37. Questions?
