What are Grids and e-Science?
David Fergusson
EGEE is funded by the European Union under contract IST-2003-508833
Using Web Services: What is Grid? – June 3rd-4th, 2004
Acknowledgements
• This talk is based on a module of the tutorials delivered by the EDG training team and on slides from:
  • Andrew Grimshaw, University of Virginia
  • Bob Jones, EGEE Technical Director
  • Mark Parsons, EPCC
  • the EDG training team
  • Roberto Barbera, INFN
  • Ian Foster, Argonne National Laboratory
  • Jeffrey Grethe, SDSC
  • The National e-Science Centre
• Prepared by Dave Berry
Goals of this module
• Introduce grid concepts and definitions
• Why Grids?
• A brief outline of history leading to EGEE
• Provide some brief examples of middleware components
• The strategic direction will be covered tomorrow
Overview
• What is different about grids?
• e-Science
• Characteristics of a grid
• Applications (what’s in it for the working scientist)
• European grids, and the world
• Grid components
What is different about grids?
The (Science) Grid Vision
• Researchers perform their activities regardless of geographical location, interact with colleagues, and share and access data
• Scientific instruments and experiments provide huge amounts of data
• The Grid: networked data processing centres, with “middleware” software as the “glue” between resources
What is Grid Computing?
• A Virtual Organisation is:
  • People from different institutions working towards a common goal
  • Sharing distributed processing and data resources
• Grid infrastructure enables virtual organisations
“Grid computing is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” (I. Foster)
Grids vs. Distributed Computing
• Existing distributed applications:
  • tend to be specialised systems
  • intended for a single purpose or user group
• Grids go further and take into account:
  • Different kinds of resources
    • Not always the same hardware, data and applications
  • Different kinds of interactions
    • User groups or applications want to interact with Grids in different ways
  • Dynamic nature
    • Resources and users added/removed/changed frequently
The main drivers behind Grid
• The relentless increase in microprocessor performance
  • you can buy multi-gigaflop systems for less than €800
• The availability of reliable high-performance networking
  • in Europe the GEANT network links 32 countries at speeds of up to 10 Gbps (and beyond)
  • in the UK the academic backbone has gone from 100 Mbps to 10 Gbps since 2000
  • 1 Gbps is commonly available to the desktop
• The desire to push the boundaries of scientific discovery by computational analysis and simulation – e-Science
Exponential Growth
[Chart: performance per dollar spent against number of years (0 to 5), one curve per technology]
• Optical fibre (bits per second): doubling time 9 months; Gilder’s Law (32X in 4 yrs)
• Data storage (bits per sq. inch): doubling time 12 months; Storage Law (16X in 4 yrs)
• Chip capacity (# transistors): doubling time 18 months; Moore’s Law (5X in 4 yrs)
Source: Triumph of Light, Scientific American, George Stix, January 2001
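The relationship between a doubling time and a four-year growth multiple can be checked with a few lines of Python. This is only a back-of-envelope sketch: it assumes the 9, 12 and 18 on the chart are doubling times in months for optical fibre, data storage and chip capacity respectively, and the function name `growth_factor` is made up for illustration.

```python
def growth_factor(doubling_months: float, years: float = 4.0) -> float:
    """Multiplicative growth after `years`, given a doubling time in months."""
    return 2.0 ** (12.0 * years / doubling_months)

# Doubling times in months, as read from the chart (an assumption about its labels).
doubling_times = {
    "optical fibre": 9,
    "data storage": 12,
    "chip capacity": 18,
}

for name, months in doubling_times.items():
    print(f"{name}: about {growth_factor(months):.1f}X in 4 years")
```

Under this formula only the storage figure (16X) matches exactly. The chart's 32X and 5X are rounder than the roughly 40X and 6X that 9-month and 18-month doubling times would give, so the quoted multiples are best read as approximate.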
How Different 2004 is from 1994
• Moore’s law everywhere
  • Instruments, detectors, sensors, scanners, …
  • Organising their effective use is the challenge
• Enormous quantities of data: Petabytes
  • For an increasing number of communities
  • Gating step is not collection but analysis
• Huge quantities of computing: >100 Top/s
  • Moore’s law gives us all supercomputers
  • Organising their effective use is the challenge
• Ultra-high-speed networks: >10 Gb/s
  • Global optical networks
  • Bottlenecks: last kilometre & firewalls
e-Science
The Emergence of e-Science
• Invention and exploitation of advanced computational methods
  • To generate, curate and analyse research data
    • From experiments, observations and simulations
    • Quality management, preservation and reliable evidence
  • To develop and explore models and simulations
    • Computation and data at extreme scales
    • Trustworthy, economic, timely and relevant results
  • To enable dynamic distributed virtual organisations
    • Facilitating collaboration with information and resource sharing
    • Security, reliability, accountability, manageability and agility
Why use Grids for e-Science?
• Scale of the problems
• Science increasingly done through distributed global collaborations enabled by the internet
• Grids provide access to:
  • Very large data collections
  • Terascale computing resources
  • High performance visualisation
  • Connected by high-bandwidth networks
• e-Science is more than Grid Technology
It is what you do with it that counts
Challenges
• Must share data between thousands of scientists with multiple interests
• Must ensure that all data is accessible anywhere, anytime
• Must be scalable and remain reliable for more than a decade
• Must cope with different access policies
• Must ensure data security
The Emergence of Global Knowledge Communities
Slide from Ian Foster’s SSDBM 2003 keynote
Characteristics of a grid
What are the characteristics of a Grid system?
• Standards
• Numerous Resources
• Ownership by Mutually Distrustful Organizations & Individuals
• Connected by Heterogeneous, Multi-Level Networks
• Different Security Requirements & Policies Required
• Different Resource Management Policies
• Potentially Faulty Resources
• Geographically Separated
• Resources are Heterogeneous
Applications (What’s in it for working scientists)
Grid Applications
• Medical/Healthcare (imaging, diagnosis and treatment)
• Bioinformatics (study of the human genome and proteome to understand genetic diseases)
• Nanotechnology (design of new materials from the molecular scale)
• Engineering (design optimization, simulation, failure analysis and remote instrument access and control)
• Natural Resources and the Environment (weather forecasting, earth observation, modeling and prediction of complex systems)
CERN: Data intensive science in a large international facility
• The Large Hadron Collider (LHC)
  • The most powerful instrument ever built to investigate elementary particle physics
• Data Challenge:
  • 10 Petabytes/year of data!
  • 20 million CDs each year!
• Simulation, reconstruction, analysis:
  • LHC data handling requires computing power equivalent to ~100,000 of today's fastest PC processors!
[Image labels: Mont Blanc (4810 m), Downtown Geneva]
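As a quick sanity check on the "20 million CDs" comparison, the two figures on the slide can be divided to see what disc capacity they imply. This sketch assumes decimal units (1 Petabyte = 10^15 bytes); the variable names are illustrative only.

```python
PETABYTE = 10 ** 15  # bytes, decimal convention (an assumption)

annual_data_bytes = 10 * PETABYTE   # 10 Petabytes/year, from the slide
cds_per_year = 20_000_000           # 20 million CDs each year, from the slide

# Capacity per disc implied by the two figures above.
bytes_per_cd = annual_data_bytes / cds_per_year
print(f"Implied capacity: {bytes_per_cd / 10**6:.0f} MB per CD")
```

The implied 500 MB per disc is a little below the nominal 650 to 700 MB of a CD-ROM, so the comparison is a round-number estimate rather than an exact conversion.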
CrossGrid
1. Interactive biomedical simulation and visualization
2. Flooding crisis team support
3. HEP distributed data analysis
4. Weather forecasting and air pollution modelling
Connecting People: Access Grid
[Image labels: Remote video, Visualisation, Microphones, Cameras]
European grids and the world
Grid projects
Many Grid development efforts, all over the world
• UK – OGSA-DAI, RealityGrid, GeoDise, Comb-e-Chem, DiscoveryNet, DAME, AstroGrid, GridPP, MyGrid, GOLD, eDiamond, Integrative Biology, …
• Netherlands – VLAM, PolderGrid
• Germany – UNICORE, Grid proposal
• France – Grid funding approved
• Italy – INFN Grid
• Eire – Grid proposals
• Switzerland – Network/Grid proposal
• Hungary – DemoGrid, Grid proposal
• Norway, Sweden – NorduGrid
• NASA Information Power Grid
• DOE Science Grid
• NSF National Virtual Observatory
• NSF GriPhyN
• DOE Particle Physics Data Grid
• NSF TeraGrid
• DOE ASCI Grid
• DOE Earth Systems Grid
• DARPA CoABS Grid
• NEESGrid
• DOH BIRN
• NSF iVDGL
• DataGrid (CERN, ...)
• EuroGrid (Unicore)
• DataTag (CERN,…)
• Astrophysical Virtual Observatory
• GRIP (Globus/Unicore)
• GRIA (Industrial applications)
• GridLab (Cactus Toolkit)
• CrossGrid (Infrastructure Components)
• EGSO (Solar Physics)
Major EU GRID projects
• European DataGrid (EDG): www.edg.org
• LHC Computing GRID (LCG): cern.ch/lcg
• CrossGRID: www.crossgrid.org
• DataTAG: www.datatag.org
• GridLab: www.gridlab.org
• EUROGRID: www.eurogrid.org
• European National Projects:
  • INFNGRID
  • UK e-Science Programme
  • NorduGrid
EU DataGrid at a glance
• Application Testbed
  • ~20 regular sites
  • > 60,000 jobs submitted (since 09/03, release 2.0)
  • Peak >1000 CPUs
  • 6 Mass Storage Systems
• People
  • 500 registered users
  • 12 Virtual Organisations
  • 21 Certificate Authorities
  • >600 people trained
  • 456 person-years of effort, 170 years funded
• Software
  • > 65 use cases
  • 7 major software releases (> 60 in total)
  • > 1,000,000 lines of code
• Scientific Applications
  • 5 Earth Obs institutes
  • 10 bio-medical apps
  • 6 HEP experiments
The EGEE Project
• Leverage national resources for broader European benefit
• 70 institutions in 27 countries, federated in regional Grids
Grid components
Virtual Data Toolkit
• Grid Middleware components from several projects
• Packaged and tested together
• Foundation of EGEE/LCG
• Components: Globus Toolkit, Condor, Chimera, EDG & LCG tools, NCSA Tools, Other Tools
Globus Toolkit
• Grid Security Infrastructure (GSI)
  • X.509 authentication with delegation and single sign-on
• Grid Resource Allocation Mgmt (GRAM)
  • Remote allocation, reservation, monitoring, control of compute resources
• GridFTP protocol (FTP extensions)
  • High-performance data access & transport
• Grid Resource Information Service (GRIS) + Monitoring and Discovery Service (MDS)
  • Access to structure & state information
• XIO
  • TCP, UDP, IP multicast, and file I/O
• Others…
Condor
• “Cycle-stealing”
  • Use idle CPU cycles for productive work
• “High Throughput Computing”
  • Using all available compute power over periods of days, weeks,…
  • “Embarrassingly parallel” problems
• Fault tolerance
  • Algorithms must allow for failure
  • Checkpointing and process migration
• DAGMan
  • Workflow specification
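The two ideas on this slide that lend themselves to a small illustration are workflow specification (jobs with dependencies) and allowing for failure (retrying a job that dies). The sketch below is plain Python, not Condor or DAGMan syntax; the job names, `run_workflow` and `flaky_job` are hypothetical, and it uses the standard-library `graphlib` module (Python 3.9 or later) to order the jobs.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "simulate_a": set(),
    "simulate_b": set(),
    "merge": {"simulate_a", "simulate_b"},
    "analyse": {"merge"},
}

def run_workflow(dag, run_job, max_attempts=3):
    """Run jobs in dependency order, retrying a failed job up to max_attempts times."""
    completed = []
    for job in TopologicalSorter(dag).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                run_job(job)
                completed.append(job)
                break
            except RuntimeError:
                # Give up only once the last allowed attempt has failed.
                if attempt == max_attempts:
                    raise
    return completed

# Simulated unreliable resource: 'merge' fails once, then succeeds.
failures = {"merge": 1}

def flaky_job(name):
    if failures.get(name, 0) > 0:
        failures[name] -= 1
        raise RuntimeError(f"{name} failed")

order = run_workflow(workflow, flaky_job)
print(order)
```

The two simulation jobs have no dependency on each other, so a real scheduler could run them on different machines at the same time; this sketch runs everything sequentially to keep the dependency and retry logic visible.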
Chimera
• Technology for collaborative management of data, programs & computations
• Virtual data system
  • Virtual data catalog
  • Virtual data language
  • Automated data derivation
  • Provenance tracking
• Pegasus
  • AI planning system for Grid workflows
Tools
• NCSA
  • MyProxy
  • GSI OpenSSH
• EDG & LCG
  • Make Gridmap (Authorisation control)
  • Certificate Revocation List Updater
  • GLUE Schema (Monitoring)
• Others
  • VDT System Profiler
  • Configuration software
  • KX509 (X.509 <-> Kerberos)
Summary
[Diagram labels: COLLABORATION, FLEXIBILITY, Standards, Internet, Create data, Process data, Store data, Display]
Questions?