Grid Networks
What is a Grid?
• Cluster of clusters: geographically distributed and connected with high-speed MAN and WAN links.
• Made up of tens to thousands of small commodity servers interconnected with scalable, high-performance Ethernet networks.
Typical Grid Computing Model http://www.doc.ic.ac.uk/~sjn5/INDOUK/TYM-GCG.pdf
Why Grids?
• A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
• 1,000 physicists worldwide pool resources for petaop analyses of petabytes of data
• Civil engineers collaborate to design, execute, & analyze shake table experiments
• Climate scientists visualize, annotate, & analyze terabyte simulation datasets
• An emergency response team couples real-time data, weather model, population data
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
Why Grids? (contd.)
• A multidisciplinary analysis in aerospace couples code and data in four companies
• A home user invokes architectural design functions at an application service provider
• An application service provider purchases cycles from compute cycle providers
• Scientists working for a multinational soap company design a new product
• A community group pools members’ PCs to analyze alternative designs for a local road
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
The Grid Problem
• Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (from “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”)
• Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of:
  • Central location
  • Central control
  • Omniscience
  • Existing trust relationships
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
Why Now?
• Moore’s law improvements in computing produce highly functional end-systems
• The Internet and burgeoning wired and wireless networks provide universal connectivity
• Changing modes of working and problem solving emphasize teamwork and computation
• Network exponentials produce dramatic changes in geometry and geography
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
Network Exponentials
• Network vs. computer performance
  • Computer speed doubles every 18 months
  • Network speed doubles every 9 months
• 1986 to 2000
  • Computers: x 500
  • Networks: x 340,000
• 2001 to 2010
  • Computers: x 60
  • Networks: x 4000
[Figure: Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan. 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.]
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
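The doubling periods above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming steady exponential growth and treating 1986 to 2000 as 14 years and 2001 to 2010 as 9 years; the function name `growth_factor` is made up for illustration. The second period reproduces the slide's rounded figures (64 and 4096), while the first period comes out somewhat higher than the quoted x 500 and x 340,000, which suggests those figures come from measured data rather than from the doubling rule alone.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Improvement factor after `months` of steady doubling."""
    return 2 ** (months / doubling_period_months)


# Period lengths in months (assumed: 14 years and 9 years).
periods = {"1986 to 2000": 14 * 12, "2001 to 2010": 9 * 12}

for label, months in periods.items():
    computers = growth_factor(months, 18)  # doubles every 18 months
    networks = growth_factor(months, 9)    # doubles every 9 months
    print(f"{label}: computers x{computers:,.0f}, networks x{networks:,.0f}")
```

Because the network doubling period is half the computer doubling period, the network factor over any interval is the square of the computer factor.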
Broader Context
• “Grid Computing” has much in common with major industrial thrusts
  • Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, Distributed Computing, Internet Computing…
• Sharing issues not adequately addressed by existing technologies
  • Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
  • High performance: unique demands of advanced & high-performance systems
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
The Globus Project™
• Close collaboration with real Grid projects in science and industry
• Development and promotion of standard Grid protocols to enable interoperability and shared infrastructure
• Development and promotion of standard Grid software APIs and SDKs to enable portability and code sharing
• The Globus Toolkit™: open source, reference software base for building grid infrastructure and applications
• Global Grid Forum: development of standard protocols and APIs for Grid computing
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
Layered Grid Architecture
• “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• “Sharing single resources”: negotiating access, controlling use
• “Talking to things”: communication (Internet protocols) & security
• “Controlling things locally”: access to, & control of, resources
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
The Single System Model
[Figure labels: User Interface / API; Authentication, Authorization, Accounting; Resource Discovery; Process Management; Message Passing; Data Management; Operating System; Storage; Compute]
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
What Makes a Cluster?
• Uses a Distributed Resource Manager (DRM) to manage job scheduling
• Tightly coupled: high-speed, low-latency interconnect network
• Shared storage for home directories, high-throughput scratch space, applications
• Fairly homogeneous: configuration management is important!
• Single administrative domain
• User accounts managed with traditional mechanisms
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
The Cluster Model
[Figure labels: a Master Node (User Interface/API; 3A, RD, PM, MP, DM; Cluster DRM; Configuration Management; Shared Storage) and four Cluster Nodes (each with Cluster DRM; 3A, RD, PM, MP, DM; Operating System; Storage; Compute), joined by a High Speed Interconnect]
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
How is an Enterprise Grid Different from a Cluster?
• Heterogeneous: clusters, SMP, even workstations of dissimilar configurations, but all are tied together through a grid middleware layer
• Lightly coupled: connected via 100 or 1000 Mbps Ethernet
• Introduces a resource registry and grid security service
  • But usually only a single registry and security service for the grid
• Not necessarily a single administrative domain
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
The Enterprise Grid Model
[Figure labels: Enterprise LAN or WAN; User Interface/API; Resource Registry; Security Infrastructure; Grid Interface in front of each resource; clusters (Cluster DRM, Cluster Interface) and SMP systems; every node shows AA/3A, RD, PM, MP, DM over Operating System, Storage, and Compute]
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
How is a Global Grid Different from an Enterprise Grid?
• “Grid of Grids”: collection of enterprise grids
• Loosely coupled between sites
• Mutually distrustful administrative domains
• Multiple grid resource registries and grid security services
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
The Global Grid Model
[Figure labels: Site A, Site B, and Site C, each on its own LAN and linked by a WAN; each site shows UI/API, RR, and SI boxes plus Grid interfaces in front of its Cluster and SMP resources]
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Grid Platform Examples: Globus
Grid Platform Example: Globus Toolkit V2
• Primary development occurred at Argonne National Laboratory
• Principals were Ian Foster and Carl Kesselman
• Open source
  • But architecture development was a closed process
• Toolkit approach: different “bundles” that can be installed depending upon what functions are desired
• API through CoG (Commodity Grid) kits
  • Java, Python, CORBA, Perl, Matlab, Web services, JSP (JavaServer Pages)
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Globus Toolkit V2
• Majority of its use is in university and government research environments
• Some vendors offer value-added versions
  • IBM Grid Toolbox
  • Platform Globus
• NSF Middleware Initiative (NMI) is packaging pre-built Globus with other relevant components
  • NWS (Network Weather Service)
  • KX.509/KCA (Kerberos-X.509 integration)
  • Condor-G as a “metascheduler”
  • GSI-enabled OpenSSH (GSI: Grid Security Infrastructure)
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Globus Toolkit V2 “Pillars”
• Resource Management (GRAM)
• Information Services (MDS)
• Data Management (GASS)
• Grid Security Infrastructure (GSI)
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Globus Toolkit V2 Stack
[Figure labels, top to bottom: GRAM, MDS, GASS/GridFTP; HTTP, LDAP, FTP; GSI; TLS/SSL; TCP/IP]
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Globus Toolkit V2 Key Components
• Grid Resource Allocation Manager (GRAM)
  • Server-side: “gatekeeper” process that controls execution of job managers
  • Client-side: “globusrun” UI to launch jobs
• Monitoring and Directory Service (MDS)
  • GRIS: Grid Resource Information Service collects local info
  • GIIS: Grid Index Information Service collects GRIS info
• Global Access to Secondary Storage (GASS)
  • GridFTP, implemented through “in.ftpd” daemon and “globus-url-copy” command
  • Files accessed through a URI, e.g. gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
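To make the URI form concrete, the example address above can be split into its parts with Python's standard `urllib.parse`. This is only an illustrative sketch: the helper name `split_grid_uri` is hypothetical, and nothing here contacts a server or uses any Globus API.

```python
from urllib.parse import urlparse


def split_grid_uri(uri: str) -> dict:
    """Split a gsiftp:// URI into scheme, host, and file path."""
    parts = urlparse(uri)
    if parts.scheme != "gsiftp":
        raise ValueError(f"expected a gsiftp URI, got scheme {parts.scheme!r}")
    return {
        "scheme": parts.scheme,   # transfer protocol (GridFTP)
        "host": parts.hostname,   # grid node serving the file
        "path": parts.path,       # file location on that node
    }


info = split_grid_uri("gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt")
print(info["host"], info["path"])
```

The scheme names the transfer protocol, the host names the grid node running the GridFTP daemon, and the path identifies the file on that node.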
Globus Toolkit V2 Key Components: GSI
• Uses a TLS/SSL-based PKI
• All server resources (e.g. gatekeeper, GRIS) and users have a public key that has been digitally signed by the CA (the “certificate”) and a private key
• “grid-cert-request” to generate key pair
• User/sysadmin sends the public key to the CA
• CA signs the public key with its private key and returns the signed certificate to the user/sysadmin
• The user/sysadmin stores the signed certificate in the local filesystem
• Certificate contains: the subject name, the subject’s public key, the CA’s name, and the CA’s signature
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Globus Toolkit V2 Key Components: GSI
• Logging in to the grid (“grid-proxy-init”):
  • User creates a temporary public-private key pair
  • User’s private key is used to digitally sign the temporary public key; this becomes the “proxy” certificate
  • This creates a chain of trust from the CA to the user to the proxy certificate
• The proxy certificate and associated private key are transmitted with a job
• The proxy certificate can be used to issue commands on remote servers on the user’s behalf (“delegation”)
• On remote servers, there is a “grid-mapfile” that maps user cert subject names to local userids
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
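The chain of trust and the grid-mapfile lookup can be sketched as a toy model. This is not GSI code: it compares names only and performs no cryptographic signature checks, the `Cert` class and helper functions are hypothetical, and the subject names, CA name, and userid are invented. It also assumes a grid-mapfile holds one quoted subject name followed by a local userid per line.

```python
import shlex
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cert:
    """Toy certificate: names only, no keys or signatures."""
    subject: str
    issuer: str  # subject name of whoever signed this certificate


def parse_grid_mapfile(text: str) -> Dict[str, str]:
    """Map quoted certificate subject names to local userids."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)  # honours the quotes around the subject
        if len(fields) >= 2:
            mapping[fields[0]] = fields[1]
    return mapping


def user_subject(chain: List[Cert], trusted_ca: str) -> Optional[str]:
    """Walk proxy -> user certificate; return the user's subject name if
    every link matches and the user certificate names the trusted CA."""
    if not chain:
        return None
    for cert, signer in zip(chain, chain[1:]):
        if cert.issuer != signer.subject:
            return None  # broken link in the chain
    user_cert = chain[-1]
    return user_cert.subject if user_cert.issuer == trusted_ca else None


def local_userid(chain: List[Cert], trusted_ca: str,
                 gridmap: Dict[str, str]) -> Optional[str]:
    """Local account for the chain's user, or None if not authorized."""
    subject = user_subject(chain, trusted_ca)
    return gridmap.get(subject) if subject else None


# Invented example data.
CA = "/O=Example Grid/CN=Example CA"
USER = "/O=Example Grid/OU=Research/CN=Jane Doe"
CHAIN = [Cert(USER + "/CN=proxy", USER), Cert(USER, CA)]
GRIDMAP = parse_grid_mapfile('# subject name, local userid\n"%s" jdoe\n' % USER)

print(local_userid(CHAIN, CA, GRIDMAP))  # jdoe
```

A real gatekeeper would verify each signature and the certificates' validity periods before consulting the map; the sketch shows only the order of the checks.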
Globus Toolkit V2 Additional Components
• Grid Packaging Tools (GPT)
  • Used to build (“gpt-build”), install (“gpt-install”) and localize (“gpt-postinstall”) Globus components
• MPICH-G2
  • A Globus V2 enabled version of MPI (Message Passing Interface)
  • Based on MPICH
  • Utilizes GSI, MDS and GRAM
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Globus Toolkit V2 Network Services
[Figure labels: four Grid Nodes on the Network, each running gatekeeper, GRIS, and in.ftpd; a Client Node with the GRAM Client; a GIIS Server; a Certificate Authority]
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
GRAM, MDS and GASS Interactions
[Figure labels: Client (user / proxy); GRAM: gatekeeper and job manager, reached via RSL/DUROC/HTTP 1.1, for job allocation and job management; MDS: GIIS and GRIS, reached via LDAP, for resource discovery; GASS: GridFTP in.ftpd, reached via gsiftp, for data transfer and data control; each service acts on a process or resource]
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
Globus Toolkit V2 Strengths and Weaknesses
Strengths:
• Mindshare and collaboration in both industry & academia
• Open source
• Standards-based underpinnings (e.g. SSL, LDAP)
• Flexibility and CoG APIs
• Driving OGSA with heavy resource commitment from IBM
Weaknesses:
• Significant effort required to get applications working on a grid
• Not production quality at this time
• No “metascheduler”: users have to explicitly tell their jobs where to run
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
References
• Carl Kesselman, “Grid Computing,” carl@isi.edu, Information Sciences Institute, University of Southern California. Joint work with Ian Foster, ANL and U Chicago.
• Bryan Carpenter, Geoffrey Fox, and Marlon Pierce, “e-Science e-Business e-Government and their Technologies Introduction,” dbcarpen@indiana.edu, gcf@indiana.edu, mpierce@cs.indiana.edu, Pervasive Technology Laboratories, Indiana University, http://www.grid2004.org/spring2004
• Fran Berman and Anthony J.G. Hey, “Grid Computing: Making The Global Infrastructure a Reality,” Wiley, ISBN: 0-470-85319-0, March 2003
• “High-Performance Computing with Scalable Server Cluster and Grid Networks,” FORCE10, http://www.force10networks.com/applications/pdf/ClusterGridapV1_0.pdf
• Ian Foster, et al., “The Anatomy of the Grid,” http://www.globus.org/research/papers/anatomy.pdf
• Ian Foster, et al., “Computational Grid,” http://www-fp.globus.org/research/papers/chapter2.pdf
• “Grid Networks,” ITU, http://www.itu.int/osg/spu/newslog/categories/gridNetworks/
• Steve Tuecke, “National eScience Core Programme & Grid Highlights,” http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
• J. Charles Kesler, “Grid Overview,” http://www.ncbiogrid.org/resources/slides/grid-overview.ppt