Towards Seamless Grid Computing: The EGEE Experience on Interoperable Grid Infrastructures
Erwin Laure, EGEE-II Technical Director, Erwin.Laure@cern.ch
eScience
• Science is becoming increasingly digital and needs to deal with increasing amounts of data and growing computational needs
• Simulations get ever more detailed
  • Nanotechnology: design of new materials from the molecular scale
  • Modelling and predicting complex systems (weather forecasting, river floods, earthquakes)
  • Decoding the human genome
• Experimental science uses ever more sophisticated sensors to make precise measurements
  • Need high statistics
  • Huge amounts of data
  • Serves user communities around the world
EuroVO Workshop - April 2008
Scientific trends
Scientific advances are more and more based on simulations using “virtual laboratories”.
Ulf Dahlsten, former Director, Emerging Technologies and Infrastructures, EU, predicts that “in five years 80 percent of all scientific papers in all areas will be made in virtual laboratories. Fifty percent of social science documents will go the same way in five to ten years.”
The size of the data an organization owns, manages, and depends on is increasing dramatically:
• Ownership cost of storage capacity goes down
• Data generated and consumed goes up
• Network capacity goes up
• Distributed computing technology matures and is more widely adopted
EGEE Main Objectives
Operate a large-scale, production-quality grid infrastructure for e-Science
Attract new resources and users from industry as well as the sciences
• Flagship grid infrastructure project co-funded by the European Commission
• Now in 2nd phase with 91 partners in 32 countries
EGEE – What do we deliver?
• Infrastructure operation
  • Sites distributed across many countries
  • Large quantity of CPUs and storage
  • Continuous monitoring of grid services and automated site configuration/management
  • Support for multiple Virtual Organisations from diverse research disciplines
• Middleware
  • Production-quality middleware distributed under a business-friendly open source licence
  • Implements a service-oriented architecture that virtualises resources
  • Adheres to recommendations on web service interoperability and is evolving towards emerging standards
• User Support - Managed process from first contact through to production usage
  • Training
  • Expertise in grid-enabling applications
  • Online helpdesk
  • Networking events (User Forum, Conferences etc.)
250 sites, 48 countries, 50,000 CPUs, 13 PetaBytes, >5000 users, >200 VOs, >140,000 jobs/day
Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
32%
Users and resources distribution
EGEE Grid Management Structure
• Operations Coordination Centre (OCC): management and oversight of all operational and support activities
• Regional Operations Centres (ROC): provide the core of the support infrastructure, each supporting a number of resource centres within its region
• Grid Operator on Duty
• Resource centres: provide resources (computing, storage, network, etc.)
• Grid User Support (GGUS): at FZK, coordination and management of user support, single point of contact for users
Example: GridMap Monitoring Visualization
Registered Collaborating Projects
25 projects have registered as of September 2007: web page
• Infrastructures: geographical or thematic coverage
• Support Actions: key complementary functions
• Applications: improved services for academia, industry and the public
GIN EGEE working with related infrastructure projects
LHC Use of Multiple Grid Infrastructures
Why is Interoperability difficult?
• Grid infrastructures use different technologies
  • And even if the same technologies are used, they are usually heavily customized
• Only a few widely adopted standards
  • gridFTP, X.509 (but used differently!)
  • Prototypes: BES, JSDL, …
• Production Grids are difficult to change – adopting standards takes time
  • Standards need to be stable before adoption
• Apart from technological differences, access policies also differ
  • Dialog among major Grid infrastructure providers started at the last OGF (OGF22)
• Strong interactions between infrastructures and application community needed
  • HEP was driving interop efforts for LHC
  • Other applications can build on these experiences
Access to Compute Resources
Batch systems: LSF, PBS/Torque, Sun Grid Engine, Load Leveler, Condor
Computing interfaces and infrastructures (diagram labels): Nordugrid, CREAM, ARC, EGEE, GRAM v2/v4, Unicore, OSG, DEISA, GRAM v4WS, NAREGI, Teragrid, Naregi
There are as many Computing Interfaces as Batch Systems!
How to Start
• Understanding the differences
  • Compatibility matrix
• Domains that have to be linked for interoperability
  • Security
  • Information Services
  • Job Management
  • Data Management
• For interoperation you have to add
  • Monitoring
  • Accounting
  • Operational links and joint policies
  • Trouble ticket systems
  • Operational security
Interoperability Matrix
• Understand both middleware stacks
• Identify the “common” interfaces
• Create an interoperability matrix

                            ARC          OSG          EGEE
Job Submission              GridFTP      GRAM         GRAM
Service Discovery           LDAP/GIIS    LDAP/GIIS    LDAP/BDII
Schema                      ARC          GLUE v1      GLUE v1.2
Storage Transfer Protocol   GridFTP      GridFTP      GridFTP
Storage Control Protocol    SRM          SRM          SRM
Security                    GSI/VOMS     GSI/VOMS     GSI/VOMS
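The interoperability matrix lends itself to a small lookup table. The following is a minimal Python sketch and not part of the original slides: the values are transcribed from the matrix, while the function names `common_interfaces` and `differing_interfaces` and the exact-string comparison are illustrative assumptions. Two stacks that name the same technology may still use it differently, so a match here only flags a candidate common interface.

```python
# Interoperability matrix transcribed from the slide:
# interface -> {middleware stack: technology}.
MATRIX = {
    "Job Submission": {"ARC": "GridFTP", "OSG": "GRAM", "EGEE": "GRAM"},
    "Service Discovery": {"ARC": "LDAP/GIIS", "OSG": "LDAP/GIIS", "EGEE": "LDAP/BDII"},
    "Schema": {"ARC": "ARC", "OSG": "GLUE v1", "EGEE": "GLUE v1.2"},
    "Storage Transfer Protocol": {"ARC": "GridFTP", "OSG": "GridFTP", "EGEE": "GridFTP"},
    "Storage Control Protocol": {"ARC": "SRM", "OSG": "SRM", "EGEE": "SRM"},
    "Security": {"ARC": "GSI/VOMS", "OSG": "GSI/VOMS", "EGEE": "GSI/VOMS"},
}


def common_interfaces(matrix, stacks):
    """Interfaces for which all given stacks list the same technology.

    Assumption: equality of the label is treated as "common"; this is a
    simplification for illustration only.
    """
    return [
        name
        for name, row in matrix.items()
        if len({row[stack] for stack in stacks}) == 1
    ]


def differing_interfaces(matrix, stacks):
    """Interfaces where the given stacks differ, with each stack's technology."""
    return {
        name: {stack: row[stack] for stack in stacks}
        for name, row in matrix.items()
        if len({row[stack] for stack in stacks}) > 1
    }
```

For example, `common_interfaces(MATRIX, ["OSG", "EGEE"])` includes "Job Submission" because both list GRAM, while `differing_interfaces(MATRIX, ["OSG", "EGEE"])` reports the service discovery and schema rows, which are the places where an adaptor, translator or gateway would be needed.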
Different Strategies
• Long term solution
  • Common interfaces
  • Standards
• Medium term solutions
  • Gateways
  • Adaptors and Translators
• Short term solutions
  • Parallel Infrastructures
  • User driven
  • Site driven
Parallel Infrastructures
• User Driven
  • The user joins both grids
  • Uses different clients, depending on which interface
  • More work for the user
  • Required for each infrastructure
  • Keyhole approach
  • Restricts functionality
  • Method initially used by ATLAS
  • Split workload between grids
Parallel Infrastructures
• Site Driven
  • The site joins both grids
  • Deploys both interfaces
  • User only sees their grid interface
  • More work for the site
  • Can only be supported by large sites
  • Reduced resources
  • Used by FZK
  • Participating in EGEE, Nordugrid and D-Grid
Gateway
• A gateway is a bridge between grid infrastructures
• Single point of failure
  • Gateway breaks, grid disappears
• Scalability bottleneck
  • All the load through one service
• Useful as a proof of concept and to demonstrate the need
• NAREGI approach using glite-CE Gateway
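The two drawbacks of a gateway can be made concrete with a toy model. This is a minimal Python sketch under stated assumptions: `Gateway`, `GatewayDown` and `remote_grid_submit` are hypothetical names, and nothing here models the actual glite-CE gateway or any NAREGI component. It only shows that every job passes through one service, so that service carries all the load and its failure hides the remote grid completely.

```python
class GatewayDown(Exception):
    """Raised when the (hypothetical) gateway service is unavailable."""


class Gateway:
    """Toy bridge: forwards every job from the local grid to a remote grid."""

    def __init__(self, remote_submit):
        self._remote_submit = remote_submit  # callable standing in for the remote grid
        self.up = True
        self.forwarded = 0  # all load passes through this one counter

    def submit(self, job):
        if not self.up:
            # Gateway breaks, grid disappears: there is no alternative path.
            raise GatewayDown("gateway unavailable: remote grid unreachable")
        self.forwarded += 1
        return self._remote_submit(job)


def remote_grid_submit(job):
    """Stand-in for the remote infrastructure; returns a made-up job id."""
    return "remote-" + job
```

Submitting through `Gateway(remote_grid_submit)` works while `up` is true, and every call increments `forwarded`; setting `up = False` makes every submission fail even though the remote grid itself is unaffected.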
Adaptors and Translators
• Adaptors allow connection
• Translators understand/modify information
• They are built into the middleware
  • The middleware can then work with both interfaces
  • Useful feature even when using standards!
• Requires modification to the grid middleware
  • Existing service interfaces can still be used
• Used in the GIN information system and in most portals
Diagram labels: Plugin, API, Plugin
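The adaptor and translator roles map naturally onto a plugin interface. The sketch below is a minimal Python illustration and not the GIN implementation: the class names, the record field names (`SiteName`, `FreeCPUs`, `cluster`, `total`, `used`) and the common output schema are all hypothetical. Each plugin's `fetch` plays the adaptor role (obtaining records from one infrastructure) and `translate` plays the translator role (mapping them to a shared form), so client code works against a single API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class InfoPlugin(ABC):
    """Plugin API: one plugin per infrastructure (hypothetical interface)."""

    @abstractmethod
    def fetch(self) -> List[Dict]:
        """Adaptor role: obtain raw records in the infrastructure's own format."""

    @abstractmethod
    def translate(self, record: Dict) -> Dict:
        """Translator role: convert one raw record to the common schema."""


class GlueLikePlugin(InfoPlugin):
    """Toy source whose records use made-up 'SiteName'/'FreeCPUs' fields."""

    def __init__(self, records):
        self._records = records

    def fetch(self):
        return list(self._records)

    def translate(self, record):
        return {"site": record["SiteName"], "free_cpus": int(record["FreeCPUs"])}


class ArcLikePlugin(InfoPlugin):
    """Toy source whose records use made-up 'cluster'/'total'/'used' fields."""

    def __init__(self, records):
        self._records = records

    def fetch(self):
        return list(self._records)

    def translate(self, record):
        return {"site": record["cluster"], "free_cpus": record["total"] - record["used"]}


def query_all(plugins):
    """Client code: sees only the plugin API and the common schema."""
    return [plugin.translate(raw) for plugin in plugins for raw in plugin.fetch()]
```

Adding another infrastructure then means writing one more plugin, while `query_all` and any existing service interfaces stay unchanged.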
GIN Worldwide Grids
APAC, DEISA, EGEE, Naregi, NDGF, NGS, OSG, Pragma, Teragrid
How mature are we?
Figure: Gartner Group; Grid on the Computing in High Energy Physics conferences timeline (Padova 2000, Beijing 2001, San Diego 2003, Interlaken 2004, Mumbai 2006, Victoria 2007)
Slide courtesy of Les Robertson, LCG Project Leader
From e-Infrastructures to Knowledge Infrastructures
Diagram layers: Network Infrastructure, Grid Infrastructure, Knowledge Infrastructure
• Network infrastructure connects computing and data resources and allows their seamless usage via Grid infrastructures
• Federated resources and new technologies enable new application fields:
  • Distributed digital libraries
  • Distributed data mining
  • Digital preservation of cultural heritage
  • Data curation
→ Knowledge Infrastructure
ICT for Science: e-Infrastructures
• Connecting the finest minds
• Sharing and federating the best scientific resources
• Building global virtual communities
• Sharing and federating scientific data
• Sharing computers, instruments and applications
• Linking at the speed of light
Communities shown: Astrophysics community, Weather Forecast community, Biomedics community, …
Mario Campolargo, Acting Director, Emerging Technologies and Infrastructures, EU
European Information Space: Infrastructures, Services and Applications Workshop, Rome, 29-30 October 2007
Evolution
Figure labels: Testbeds, Routine Usage, Utility Service
Figure labels: National, European e-Infrastructure, Global
European Grid Initiative
• Need to prepare permanent, common Grid infrastructure
• Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
• Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
• Operate the production Grid infrastructure on a European level for a wide range of scientific disciplines
Summary
• EGEE provides a dependable, production-quality Grid infrastructure to a wide variety of scientific disciplines
• Collaborations on technical and political topics are key to implementing a truly world-wide infrastructure
• Need to cover the full spectrum: from individual sites and small-scale Grids to world-wide infrastructures
• Grids are increasingly becoming an essential part of the scientific computing infrastructure – sustainability needs to be ensured