Learn about the emergence of distributed computing, establishment of Grid Services, middleware development, packaging efforts, and leveraging Grid Services for science and engineering research. Explore the history, challenges, and future of Grid technology.
Grid Services: Middleware Infrastructure for Use of Distributed Resources • John Towns • Principal Investigator, NLANR Distributed Applications Support Team • Division Director, Scientific Computing, NCSA / Univ of Illinois • jtowns@nlanr.net
Outline • Emergence of Distributed Computing • Middleware develops • Establishment of Grid Services • What are Grid Services? • How do they relate to Web Services? • Current Middleware Development Projects • Grid Services Middleware • Toolkits • Packaging Efforts • Deployment/Leverage of Grid Services • Infrastructure Projects Deploying Grid Services Infrastructure • Projects Leveraging Grid Services for Science and Engineering Research and Development iGrid 2002
Late 1980s – Early 1990s • Late 1980s - Metacomputing • Focus on Distributed Computation - running applications across several supercomputing resources • Early 1990s - Gigabit Testbeds • Networking research testbeds pushing limits of communication bandwidth to Gigabit/s levels • BLANCA, CASA, Aurora and other testbeds in the US • Additional such testbeds follow in other countries
Communications Libraries • Data communications libraries developed for wide area networks • Some use of pre-existing PVM-type libraries • Typically not good for wide area • Development of software optimized for larger messages, higher latencies • Data Transfer Mechanism (DTM) • PVM extensions • Plethora of messaging libraries is a problem • Some unification in MPI standardization process iGrid 2002
Early Infrastructure • Poor network infrastructure • Network testbeds were exactly that: experimental environments with finite lifetimes • All distributed application runs manually scheduled on networks used • Poor distributed computing infrastructure • Distributed applications were experiments, and it was difficult to schedule time for them on “production” compute resources • All distributed application runs manually scheduled on supercomputing systems used • Poor software environment • Communications libraries were relatively immature • Disparity of communications libraries required installation by applications teams on all systems of interest • Little support from system admins • Little support for anything beyond distributed simulations on supercomputers
Why “The Grid”? • New Applications Based on High-speed Coupling of People, Computers, Databases, Instruments, etc. • Computer-enhanced Instruments • Collaborative Engineering • Browsing of Remote Datasets • Use of Remote Software • Data-intensive Computing • Multi-supercomputer Simulation • Large-scale Parameter Studies Source: Ian Foster, ANL iGrid 2002
The Grid: Blueprint for a New Computing Infrastructure • Published in 1999 • Ian Foster, Carl Kesselman (Eds) • ISBN 1-55860-475-8, www.mkp.com/grids • 22 chapters by expert authors including: Andrew Chien, Jack Dongarra, Tom DeFanti, Andrew Grimshaw, Roch Guerin, Ken Kennedy, Paul Messina, Cliff Neuman, Jon Postel, Larry Smarr, Rick Stevens, and many others • “A source book for the history of the future” -- Vint Cerf
The Grid • “Dependable, Consistent, Pervasive Access to [High-end] Resources” • Dependable: • Can Provide Performance and Functionality Guarantees • Consistent: • Uniform Interfaces to a Wide Variety of Resources • Pervasive: • Ability to “Plug In” From Anywhere Source: Ian Foster, ANL iGrid 2002
I-WAY • SC95 and the Information Wide Area Year • 17 sites and 10 networks connected using early middleware at SC95 • 60+ applications, 15+ disciplines • Lots of lessons learned iGrid 2002
Middleware Emerges • Globus Develops • 1994-1996 • Initial development and experimentation • 1997-1999 • Creation of Initial Globus Toolkit (1998) • First Adoption / Deployment Successes • Partnerships With NCSA, NASA, others • UNICORE • 1997 • Development begins in Germany as a national research project • 1999-2000 • Proof of concept prototype released (1999) • First successes • Other related projects in mid-1990’s • OSF’s Distributed Computing Environment (DCE) • Object Management Group's Common Object Request Broker Architecture (CORBA) • Microsoft's COM/DCOM • Many others… iGrid 2002
Alliance Grid Model • Layered Approach to Building the GRID • [Figure: layered diagram with the labels Science Portals & Workbenches, Twenty-First Century Applications, Capability Computing, Access Grid, Computational Grid, Access Services & Technology, Computational Services, Grid Middleware (resource independent), Grid Fabric (resource dependent), Networking, Devices and Systems, Performance]
Grid Testbeds • The Alliance National Technology Grid • I-WAY • NASA’s Information Power Grid
Grid Applications • What’s the Grid about? • initially, most thought just “parallel MPI jobs” • that missed some of the real opportunities • but, how does the Grid add maximal value? • What applications most need Grids? • remote instruments and sensors • inhospitable environments • remote telescopes, environmental monitoring, … • equipment and logistics monitoring • distributed data archives • multi-spectral astronomy (NVO), LIGO, LHC, genomics, … • discipline engineering (e.g., earthquake engineering) iGrid 2002
Teams Require Grid Technologies • [Figure: collaborating sites: Paris, Hong Kong, ZIB, NCSA, AEI, WashU, Thessaloniki] • How Do We: • Maintain/develop Code? • Manage Computer Resources? • Carry Out/monitor Simulation?
Link NEXRAD Radars with Nested Simulation • Distributed data acquisition (NEXRAD radars) • Distributed dynamic computing • Distributed decision making and data dissemination • Intelligent networking and data routing • [Figure: NEXRAD Regional Linking and Abilene]
Some Projects Needing Grids • NEES earthquake engineering and simulation • integrated simulation, experiment, and collaboration • LHC CERN large hadron collider (LHC) • multiple detectors and international teams • petabytes of data • ALMA millimeter telescope array • mountain top remote site • remote data analysis and management • NEON national ecological observatory network • remote sensing and data correlation • EarthScope • USArray and San Andreas Fault Observatory at Depth iGrid 2002
Grid Services Emerge in Middleware • Global Grid Forum • Formation of Open Grid Services Infrastructure Working Group (OGSI-WG) • Globus • 2000-2002 • Push Concept of “Grid Services” into Network • Development of Application-Specific Toolkits • UNICORE • 2001-2002 • Development of Grid Services compatibility with Globus iGrid 2002
What are Grid Services? • Defined by the Open Grid Services Architecture (OGSA) • The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration • Draft 2.9, 6/22/2002 • http://www.gridforum.org/ogsi-wg/drafts/ogsa_draft2.9_2002-06-22.pdf • Based on the Open Grid Services Infrastructure (OGSI) • Grid Service Specification • Draft 3, 7/17/2002 • http://www.gridforum.org/ogsi-wg/drafts/GS_Spec_draft03_2002-07-17.pdf • defines the standard interfaces and behaviors of a Grid service • builds on a Web Services base • Open Grid Services Infrastructure Working Group • OGSI-WG formed within the Global Grid Forum • Refinement of infrastructure-related portions of OGSA • OGSA builds on Web services; likely to incorporate specifications defined elsewhere • W3C, IETF, OASIS, others… iGrid 2002
So what are Web Services? • Web Services • self-contained, self-describing, modular “applications” (components) that can be published, located, and invoked, typically over standard HTTP port 80 • new generation of capabilities using HTTP • not specific to the Web • can perform a variety of functions and can make use of other Web Services • Web Services consist of • Simplest form • HTTP and XML • Can also (generally do) include any of • SOAP, WSDL, WS-I, XML Query, Z39.50, JDBC, Jini, …
More on Web Services • Define a means to: • describe software components to be accessed • access these components • enable the identification of relevant services/service providers • They are neutral with respect to: • programming languages • programming models • system software • Web Services provide: • uniform and widely accessible interface and access glue over services • a veneer for programmatic access to existing services • interoperability between middleware solutions iGrid 2002
OK… so what are Grid Services then? • Grid Service: • a Web service that conforms to a set of conventions (interfaces and behaviors) that define how a client interacts with a Grid service • Interfaces address • discovery • dynamic service creation • lifetime management • notification • manageability • Conventions address • naming • upgradeability • Effect • Provide a useful abstraction of capabilities and a simple means of interaction that is independent of implementation iGrid 2002
Relevant Web Services Components • Plenty of web services standards being defined; most relevant for Grid Services are: • Simple Object Access Protocol (SOAP) • means of messaging between a service (provider) and a service requestor (client) • Web Services Description Language (WSDL) • an XML document for describing Web services as a set of endpoints operating on messages containing either document-oriented (messaging) or RPC payloads • WS-Inspection • a simple XML language for locating service descriptions published by a service provider iGrid 2002
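To make the SOAP bullet above more concrete, here is a minimal sketch that builds and reads back a SOAP 1.1 envelope using only the Python standard library. The operation name `getStatus`, the namespace `urn:example:grid`, and the `jobId` parameter are hypothetical placeholders, not taken from any specification cited in these slides; no network call is made.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EXAMPLE_NS = "urn:example:grid"  # hypothetical service namespace (assumption)


def build_soap_request(operation: str, params: dict) -> bytes:
    """Serialize a request for `operation` as a SOAP 1.1 envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{EXAMPLE_NS}}}{operation}")
    for name, value in params.items():
        # Parameters are sent as simple unqualified child elements.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_soap_request(payload: bytes):
    """Recover the operation name and parameters from an envelope."""
    root = ET.fromstring(payload)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first (and only) element inside the Body
    operation = call.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
    params = {child.tag: child.text for child in call}
    return operation, params
```

A service requestor would send the bytes from `build_soap_request` in an HTTP POST; the provider side would do something like `parse_soap_request` before dispatching to the named operation.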
Grid Service: Characteristics • Everything is a service • a network enabled entity that provides some capability through the exchange of messages • Dynamic entities • Can be dynamically created/destroyed • Can be upgraded dynamically • Maintain internal state for life of service • Implement one or more interfaces • MUST provide a GridService interface iGrid 2002
Grid Services are Dynamic • Can be dynamically created/destroyed • Explicitly created/destroyed • System failure can make an instance inaccessible or destroy it • Each instance assigned a unique global handle to identify it • Can be upgraded dynamically • i.e., to support new protocol versions or to add alternative protocols • Must maintain information related to a specific instance during upgrade • Provides independence in upgrading Grid services • [Figure labels: Grid service instance, Grid Service Handle, Grid Service Reference]
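The create/destroy behavior described above can be sketched as a small in-memory factory. This is an illustration under stated assumptions, not the OGSI interface: the class and method names, the `grid://` handle format, and the use of an expiry time to model instances that become unreachable are all hypothetical.

```python
import itertools
from dataclasses import dataclass


@dataclass
class ServiceInstance:
    handle: str        # globally unique name, fixed for the instance's life
    expires_at: float  # after this time the instance may be reclaimed


class Factory:
    """Creates instances and reclaims them when destroyed or expired."""

    def __init__(self, authority: str):
        self._authority = authority          # hypothetical naming authority
        self._counter = itertools.count(1)   # ensures handles never repeat
        self._instances = {}

    def create(self, now: float, lifetime: float) -> ServiceInstance:
        handle = f"{self._authority}/instance/{next(self._counter)}"
        instance = ServiceInstance(handle, now + lifetime)
        self._instances[handle] = instance
        return instance

    def destroy(self, handle: str) -> None:
        """Explicit destruction requested by a client."""
        self._instances.pop(handle, None)

    def sweep(self, now: float) -> list:
        """Reclaim instances whose lifetime ran out, e.g. after a client failed."""
        expired = [h for h, i in self._instances.items() if i.expires_at <= now]
        for handle in expired:
            del self._instances[handle]
        return expired

    def is_alive(self, handle: str) -> bool:
        return handle in self._instances
```

Giving every instance a finite lifetime is one way to keep resources from leaking when a failure prevents explicit destruction; clients that are still alive would extend the lifetime, which is omitted here for brevity.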
Grid Services: State • Each Grid Service maintains internal state for the life of service • Grid Service Handle • Globally unique name assigned to each Grid service instance • Differentiates between different instances of the service • Invariant over the lifetime of a service instance • Grid Service Reference • Instance-specific information required to interact with a specific service instance • Can change over the lifetime of the service instance iGrid 2002
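The handle/reference split above can be illustrated with a small lookup table: the handle stays fixed while the reference it resolves to may be replaced. The class names, fields, and example URLs below are assumptions made for illustration; the actual handle resolution mechanism is defined by the Grid Service Specification, not by this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridServiceReference:
    endpoint: str  # where the instance can currently be reached
    protocol: str  # binding currently in use


class HandleMap:
    """Maps invariant handles to their current references."""

    def __init__(self):
        self._refs = {}

    def bind(self, handle: str, ref: GridServiceReference) -> None:
        # Re-binding an existing handle models an upgrade or migration.
        self._refs[handle] = ref

    def resolve(self, handle: str) -> GridServiceReference:
        # Raises KeyError for unknown handles.
        return self._refs[handle]
```

A client that stores only the handle can call `resolve` again after a failure and pick up the new reference without knowing the service was upgraded.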
Grid Services: Interfaces • Grid Service Interface • Set of operations invoked by exchanging a defined sequence of messages • Correspond to portTypes in WSDL • MUST provide a GridService interface • Required for all Grid Services • Standard WSDL operation • Mechanism for obtaining service data • Basic information about a Grid service instance (XML representation) including: • Grid Service Handle • Grid Service Reference • May provide a Registry Interface • Mechanism to support service discovery using a registry service • Used to register a Grid service with a registry service • GridService interface used by registry service to get information about the Grid service • Other defined Grid service interfaces • NotificationSource, NotificationSink, Factory, HandleMap iGrid 2002
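One way to picture the relationship between the mandatory GridService interface and an optional registry is the sketch below. The method name `find_service_data`, the service data keys, and the `Registry` API are hypothetical and chosen for readability; they are not the operation names from the specification.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class GridService(ABC):
    """Interface every Grid service must provide (names are illustrative)."""

    @abstractmethod
    def find_service_data(self, name: str) -> Optional[str]:
        """Return the named service data element, or None if absent."""


class SimpleService(GridService):
    def __init__(self, handle: str, reference: str):
        # Basic service data: the instance's handle and current reference.
        self._service_data: Dict[str, str] = {
            "handle": handle,
            "reference": reference,
        }

    def find_service_data(self, name: str) -> Optional[str]:
        return self._service_data.get(name)


class Registry:
    """Supports discovery by querying services through GridService only."""

    def __init__(self):
        self._by_handle: Dict[str, GridService] = {}

    def register(self, service: GridService) -> str:
        handle = service.find_service_data("handle")
        if handle is None:
            raise ValueError("service exposes no handle")
        self._by_handle[handle] = service
        return handle

    def discover(self, handle: str) -> Optional[GridService]:
        return self._by_handle.get(handle)
```

The registry never needs to know the concrete service type: everything it learns about a service comes through the one interface all services share.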
Accessing Grid Services Example • Use WSDL to describe • multiple protocol bindings • encoding styles • messaging styles • etc. • Avoid binding-specific interactions • Client interface and proxy allow for generalized representation from client application • Allows flexibility in using alternate services that support specific bindings • Some performance implications
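The client proxy idea above can be sketched as follows: the application calls one generic `invoke` method and the proxy picks whichever binding is available. The binding names and the stand-in functions are hypothetical; a real proxy would be generated from the service's WSDL and would actually send messages.

```python
from typing import Callable, Dict, List, Tuple

Binding = Callable[[str, dict], Tuple[str, str, dict]]


def soap_http_binding(operation: str, args: dict):
    # Stand-in for a SOAP-over-HTTP call; nothing is sent over the network.
    return ("soap-http", operation, args)


def local_binding(operation: str, args: dict):
    # Stand-in for a cheaper in-process call when client and service share a host.
    return ("local", operation, args)


class ServiceProxy:
    """Hides the choice of binding behind a single client-facing interface."""

    def __init__(self, available: Dict[str, Binding], preference: List[str]):
        self._available = available    # bindings this service instance supports
        self._preference = preference  # client's order of preference

    def invoke(self, operation: str, **args):
        for name in self._preference:
            binding = self._available.get(name)
            if binding is not None:
                return binding(operation, args)
        raise LookupError("no supported binding available")
```

The extra lookup on every call hints at the performance cost the slide mentions: generality is bought with a layer of indirection.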
Making it More Interesting • Composing services • A service can be a complex composition of other services • Create higher level services: • Accounting service • Workflow service • Authentication service • Data management service • Archive management • Data transfer • Remote Access service • telnet, ssh iGrid 2002
Basic Grid Services Examples • GSI (Grid Security Infrastructure) • PKI-based single sign-on • mutual authentication • users and resources • mapping to local user identifiers and accounts • data privacy and integrity • GSI-enabled SSH • secure, remote access • GridFTP • secure, reliable, high-performance remote access • third-party transfer between storage systems iGrid 2002
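The "mapping to local user identifiers and accounts" step can be illustrated with a parser for a text file that pairs a quoted certificate subject name with a local account, modeled on the grid-mapfile convention used with the Globus Toolkit. The subject names and account names below are made up, and the parser is a simplified sketch that does not cover every form such files can take.

```python
import shlex
from typing import Dict, List, Optional


def parse_gridmap(text: str) -> Dict[str, List[str]]:
    """Parse lines of the form: "<certificate subject>" account[,account...]"""
    mapping: Dict[str, List[str]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honors the quotes around the subject name
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {line!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


def local_account(mapping: Dict[str, List[str]], subject: str) -> Optional[str]:
    """Return the first local account for an authenticated subject, if any."""
    accounts = mapping.get(subject)
    return accounts[0] if accounts else None


EXAMPLE = '''
# hypothetical entries
"/C=US/O=Example Grid/CN=Jane Doe" jdoe
"/C=US/O=Example Grid/CN=John Roe" jroe,gridguest
'''
```

After mutual authentication establishes the subject name, a lookup such as `local_account(parse_gridmap(EXAMPLE), subject)` decides which local account the request runs under; an unknown subject gets `None` and would be refused.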
Intermediate Grid Services Examples • MDS (Metacomputing Directory Service) • secure information service • distributed access to resource state and status information • GRAM (Grid Resource Allocation & Management) • secure remote access • resource allocation and management • MPICH-G2 • Grid-enabled Message Passing Interface (MPI) • based on the MPICH implementation of MPI • Distributed accounting • distributed access and management of accounting data iGrid 2002
Advanced Grid Services Examples • Replica Management Tools • secure, distributed management • location of replicas of large scientific datasets • GRAM-2 (GRAM extensions) • advance resource reservations • networks, storage, and graphics pipelines • co-reservation of multiple resources • CAS (Community Authorization Service) • group access control and policy management • Condor-G (brokering “super scheduler”) • single submission point for all resources within a virtual organization • co-allocation of multiple resources iGrid 2002
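Co-reservation of multiple resources implies an all-or-nothing outcome: if any one resource cannot be reserved, the reservations already obtained should be given back. The sketch below shows that pattern with in-memory counters. The `Resource` class and `co_reserve` function are hypothetical and do not reflect the GRAM-2 or Condor-G interfaces.

```python
class Resource:
    """A resource with a fixed amount of reservable capacity."""

    def __init__(self, name: str, capacity: int):
        self.name = name
        self.free = capacity

    def reserve(self, amount: int) -> bool:
        if amount > self.free:
            return False
        self.free -= amount
        return True

    def release(self, amount: int) -> None:
        self.free += amount


def co_reserve(requests) -> bool:
    """Reserve every (resource, amount) pair, or none of them."""
    granted = []
    for resource, amount in requests:
        if resource.reserve(amount):
            granted.append((resource, amount))
        else:
            # Roll back everything obtained so far so nothing is left half-held.
            for held, held_amount in granted:
                held.release(held_amount)
            return False
    return True
```

A real scheduler would also have to deal with time windows and with resources owned by different sites, which is where most of the difficulty lies; this sketch only shows the rollback logic.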
OGSA • Builds on: • Globus Toolkit • Web Services • New definition of a Grid: • “an extensible set of Grid services that may be aggregated in various ways to meet the needs of virtual organizations, which themselves can be defined in part by the services that they operate and share” iGrid 2002
Grid Services Middleware • The Globus Project • http://www.globus.org/ • The Condor Project • http://www.cs.wisc.edu/condor/ • The Legion Project • http://legion.virginia.edu/ • UNICORE • http://www.unicore.de/ iGrid 2002
Building on Grid Services: Toolkits • GridLab • Grid Application Toolkit • APIs for accessing Grid services from e.g. application codes, portals, data management systems, … • http://www.gridlab.org/ • Grid Application Development Software (GrADS) Project • http://hipersoft.cs.rice.edu/grads/
Integration Efforts • NSF Middleware Initiative (NMI) • NSF integration award • http://www.nsf-middleware.org/ • package and deploy NMI software and documents • transparently use and share distributed resources • develop effective scientific collaborations • GRIDS Center Software Suite • http://www.grids-center.org/ • Virtual Data Toolkit • From GriPhyN and iVDGL • Will include NMI and software from GriPhyN for virtual data • http://www.lsc-group.phys.uwm.edu/vdt/ • Grid Starter Kit • UK e-Science Grid product • Globus, SRB, Condor • http://esc.dl.ac.uk/StarterKit/ iGrid 2002
Deploying Grid Services • http://www.nordugrid.org/about.html • http://lhcgrid.web.cern.ch/LHCgrid/ • http://datatag.web.cern.ch/datatag/ • http://eu-datagrid.web.cern.ch/eu-datagrid/ • http://www.teragrid.org • http://www.eurogrid.org/ • http://www.nesc.ac.uk/ • http://server11.infn.it/grid/ • NASA Information Power Grid: http://www.ipg.nasa.gov/ • http://doesciencegrid.org/ • http://grangenet.net • Also see: • http://www-fp.mcs.anl.gov/~foster/grid-projects/#International Projects and Activities • http://www.gridcomputing.com/
Grid Projects • http://gridtest.hpcnet.ne.kr/ • http://www.astrogrid.org • Particle Physics Data Grid: http://www.ppdg.net/ • http://www.gridpp.ac.uk/ • http://www.gridlab.org/ • http://www.apbionet.org/ • http://www.eu-crossgrid.org/ • International Virtual Data Grid Laboratory: http://www.ivdgl.org/index.php • http://www.apgrid.org/ • http://www.ascportal.org/ • http://www.earthsystemgrid.org • http://www.griphyn.org/index.php
Corporate Buy-In • IBM and Globus Announce Open Grid Services for Commercial Computing • Feb 20, 2002 • http://www-916.ibm.com/press/prnews.nsf/jan/2C818325D8D4D23585256B660050DF6F • United Devices Announces Support for the Open Grid Services Architecture • May 15, 2002 • http://www.ud.com/company/press/press_releases/05132002.htm • Sun Releases Enhanced Grid Computing Services • February 15, 2002 • http://www.internetnews.com/ent-news/article.php/7_975901 • Sun ONE Grid Engine Software • http://wwws.sun.com/software/gridware/ • Platform Globus • http://www.platform.com/products/globus iGrid 2002
Forums • The Global Grid Forum • http://www.globalgridforum.org • The European Grid Forum, EGrid • Part of GGF • http://www.egrid.org/ • Grid Forum Korea • http://www.gridforumkorea.org/ iGrid 2002
References • The Grid: Blueprint for a New Computing Infrastructure • http://www.mkp.com/grids • The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration • Draft 2.9, 6/22/2002 • http://www.gridforum.org/ogsi-wg/drafts/ogsa_draft2.9_2002-06-22.pdf • Grid Service Specification • Draft 3, 7/17/2002 • http://www.gridforum.org/ogsi-wg/drafts/GS_Spec_draft03_2002-07-17.pdf • An Introduction to Web Services and related Technology for building an e-Science Grid • UK Grid Engineering Task Force – Web and Grid Services Working Group • http://esc.dl.ac.uk/WebServices/ iGrid 2002