An Overview of Computational Grid Technologies Marlon Pierce Community Grids Laboratory Indiana University mpierce@cs.indiana.edu
Grids in I533 Context [Figure: layered diagram. Labels: Client Environments: Portals, Taverna, etc.; Security, Reliability, etc.; Workflow, Information, Sharing, Ontology Services; PubChem, etc.; Gaussian, Data Mining; Logical File Systems; General Data Services; General Exec Services; General File Services; Web Service Core Specifications.] (Verbal description on next slide)
Grids in I533 Context • I533 covers a diverse set of topics. • (Web) Services are the core abstraction • Execution Services: computational chemistry, data mining, text processing • Data Services: PubChem, OGSA-DAI • Information and metadata services: Ontologies, information discovery and sharing. • Orchestration services (workflow): Taverna, BPEL, etc. • Grids are collections of services with some glue • Decentralized security, information system agreements (from monitoring to metadata), abstract execution protocols, etc. • Service Oriented Architecture
Brief History of Grids • The term “Grid Computing” was coined by Dr. Larry Smarr, then director of NCSA, back in 1992. • The original concept: computing power should be available on demand, for a fee. • Just like the electrical power grid. • Today, Grids are thought of as federations of services that span organizations. • Grids are usually driven by science applications. • Most core funding from the DOE, NSF, UK e-Science, and other scientific agencies in the EU, Japan, China, Korea, etc. • These agencies all cooperate to some degree. • DOD has its own version of things, the Global Information Grid, that is currently unrelated. • IBM, MS, Oracle, Sun, etc have varying degrees of interest. ...
Grid Computing Research • Historically, grid computing has been targeted at simplifying access to high performance computing and giant scientific data sets. • Example: NSF TeraGrid includes both hardware and software along with a common administration infrastructure. • www.teragrid.org • IU is one of the partners. • There are many overviews of Grid computing. • See for example Globus World presentations from 2004, 2005 • These show lots of “gee whiz” pictures of big science problems using the Grid. • They usually mention seti@home, and more recently, Google and BitTorrent. • These annoy me. • Seti@home has nothing to do with Grid computing.
Grid Computing Research • Grid computing is large scale distributed computing research. • “Middleware” • It’s not the pervasive computing power Grid originally envisioned. • As long as it’s research, we get to keep working on it. • I’ll examine some key technologies for building a Grid installation, but not “the” Grid. There is no Grid! Dr. Dave Semeraro has his doubts.
Some Desirable Grid Characteristics • Grids are collections of services. • Accessing computational facilities to run codes. • Accessing remote databases, data warehouses and file systems. • Transferring large data sets. • Accessing remote instruments and sensors. • Collections are created from multiple partners: Virtual Organizations • Must support decentralized management. • Common security abstraction layer • Authentication: required and solved. • Authorization: Research 4Ever! • Common information infrastructure • Monitoring hardware and networks: required and solved • Finding resources (i.e. “Semantic Grid”) Research 4Ever! • Ex: TeraGrid combines NCSA, SDSC, IU, TACC, ORNL, Purdue, ... • Generations • Generation 1: UNIX daemons, command-line clients, protocol-based. • Generation 2: Based on Web Service standards
Virtual Organization View of Deployment [Figure: diagram with boxes labeled “Virtual Organisation” and “Physical Organisation”.] I. Foster, www.usipv6.com/ppt/fosteripv6andGridJune2003.ppt
Making Interoperable Tools • There are a large number of Grid-related research projects and tools. • They need some common protocols • Not just wire protocols but also security procedure protocols. • Two most important • GSI: A global security system • GRAM: a global method for executing remote operations. • Grid standards and would-be standards are defined through the Global Grid Forum. • We will concentrate on the Globus Toolkit in these lectures, but GSI and GRAM are important to several other projects. • Condor, SRB, Sun Grid Engine, etc.
Globus Services Landscape We’ll start here. www.griphyn.org/documents/document_server/uploaded_documents/doc--1515--GT4_GriPhyN.ppt
Grid Security Infrastructure An overview
Grid Security Infrastructure Keywords • Public Key Infrastructure (PKI) • Most Grids use asymmetric encryption keys • Based on OpenSSL but with GSSAPI extensions • Users have a public key and a private key. • Public keys can decrypt messages encrypted by private keys and vice versa. • Public key: encrypts a message • Private key: signs a message. Only you have the private key, so only you can generate that specific signature. • I encrypt with your public key and sign with my private key. • Only you can decrypt, and you know it came from me. • PKI tools are part of Java’s SDK, so try them out. • Certificate Authorities: establishing trust. • Can you trust a public key? • Yes, if you trust the signer. • Large Grids have CAs. • You can run your own with SimpleCA. • CAs can be hierarchical.
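The "encrypt with your public key, sign with my private key" exchange can be sketched with textbook RSA on tiny primes. This is a toy under stated assumptions: the primes, the message, and every function name are made up for illustration, there is no padding, and the key sizes are trivially breakable. It is not how GSI or OpenSSL implement it.

```python
import hashlib

def make_keypair(p, q, e=17):
    """Build a toy RSA key pair from two small primes (NOT secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)           # private exponent: inverse of e modulo phi
    return (e, n), (d, n)         # (public key, private key)

def encrypt(message_int, public_key):
    e, n = public_key
    return pow(message_int, e, n)

def decrypt(cipher_int, private_key):
    d, n = private_key
    return pow(cipher_int, d, n)

def digest(text, n):
    """Hash the text and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big") % n

def sign(text, private_key):
    d, n = private_key
    return pow(digest(text, n), d, n)

def verify(text, signature, public_key):
    e, n = public_key
    return pow(signature, e, n) == digest(text, n)

# Hypothetical parties: "me" (the sender) and "you" (the recipient).
my_public, my_private = make_keypair(61, 53)
your_public, your_private = make_keypair(89, 97)

secret = 42                                     # message encoded as a small integer
ciphertext = encrypt(secret, your_public)       # I encrypt with your public key
signature = sign(str(secret), my_private)       # I sign with my private key

recovered = decrypt(ciphertext, your_private)   # only you can decrypt
authentic = verify(str(recovered), signature, my_public)  # you check it came from me
print(recovered, authentic)
```

Real deployments use keys of 2048 bits or more together with padding schemes; the point here is only which key is used in which direction.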
More Keywords: GSS API • Generic Security Service API (GSSAPI) • PKI is slow and symmetric keys are much faster. • GSSAPI establishes a “context” between two communicators by sharing a secret symmetric session key. • Very similar protocol to WS-SecureConversation • Java implementation part of standard SDK release. • Try it out, but it requires Kerberos • GSI uses the GSSAPI to establish security contexts. • We will see how to program clients in the next lecture.
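Once a security context exists, both ends hold the same symmetric session key and can protect messages cheaply. The sketch below assumes the key has already been shared (the public key handshake is left out) and uses an HMAC tag for integrity only, with no encryption. The class and its `wrap`/`unwrap` methods are hypothetical and only loosely named after the GSS-API wrap and unwrap calls; this is not the GSSAPI itself.

```python
import hashlib
import hmac
import secrets

class SecurityContext:
    """One end of an established context; both ends hold the same session key."""

    TAG_LENGTH = 32  # size of a SHA-256 HMAC tag in bytes

    def __init__(self, session_key):
        self._key = session_key

    def wrap(self, payload: bytes) -> bytes:
        # Prepend an integrity tag computed with the shared session key.
        tag = hmac.new(self._key, payload, hashlib.sha256).digest()
        return tag + payload

    def unwrap(self, token: bytes) -> bytes:
        tag, payload = token[:self.TAG_LENGTH], token[self.TAG_LENGTH:]
        expected = hmac.new(self._key, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("message failed integrity check")
        return payload

# Assumption: the handshake already delivered this key to both sides.
session_key = secrets.token_bytes(32)
client = SecurityContext(session_key)
server = SecurityContext(session_key)

token = client.wrap(b"submit job 17")
print(server.unwrap(token))
```

The design point is that the expensive asymmetric operations happen once per context, while every message afterwards costs only a hash.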
Single Sign On and Delegation • Single Sign On • A “Grid” implies that you can access lots of machines, but not necessarily anonymously. • Charged for usage: supercomputer centers issue allocations. • SSO is the ability to log in once, get a ticket, and access many machines without constantly providing username and password. • GSI is very similar to a somewhat older system called Kerberos, which you can still get. • Delegation is the security concept that supports this. • In practice, GSI handles delegation by re-signing credentials. • This takes advantage of hierarchical CA organization for trust.
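Delegation by re-signing can be pictured as a chain: the CA signs the user's credential, and the user's key signs a short-lived proxy credential. A verifier walks the chain back to a CA it trusts. The sketch below is a toy model only, assuming textbook RSA on tiny primes, made-up subject names, and an assumed 12-hour proxy lifetime; real GSI uses X.509 certificates.

```python
import hashlib
import time
from dataclasses import dataclass

def make_keypair(p, q, e=17):
    """Toy RSA key pair from two small primes (NOT secure)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)

def _digest(text, n):
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big") % n

@dataclass
class Credential:
    subject: str
    public_key: tuple
    issuer: str
    expires: float
    signature: int = 0

    def body(self):
        # The signed content: everything except the signature itself.
        return f"{self.subject}|{self.public_key}|{self.issuer}|{self.expires}"

def issue(subject, public_key, issuer_name, issuer_private, lifetime_s):
    """The issuer signs a new credential with its private key."""
    cred = Credential(subject, public_key, issuer_name, time.time() + lifetime_s)
    d, n = issuer_private
    cred.signature = pow(_digest(cred.body(), n), d, n)
    return cred

def verify_chain(chain, trusted_ca_name, trusted_ca_public, now=None):
    """chain[0] is signed by the CA; each later entry by the one before it."""
    now = time.time() if now is None else now
    signer_name, signer_public = trusted_ca_name, trusted_ca_public
    for cred in chain:
        e, n = signer_public
        if cred.issuer != signer_name or cred.expires < now:
            return False
        if pow(cred.signature, e, n) != _digest(cred.body(), n):
            return False
        signer_name, signer_public = cred.subject, cred.public_key
    return True

# Hypothetical CA, user, and proxy keys.
ca_public, ca_private = make_keypair(61, 53)
user_public, user_private = make_keypair(89, 97)
proxy_public, proxy_private = make_keypair(101, 113)

user_cert = issue("/CN=Alice", user_public, "/CN=Toy CA", ca_private, 365 * 24 * 3600)
proxy_cert = issue("/CN=Alice/CN=proxy", proxy_public, "/CN=Alice", user_private, 12 * 3600)
print(verify_chain([user_cert, proxy_cert], "/CN=Toy CA", ca_public))
```

Because the proxy is signed by the user's key rather than the CA's, the user can create it without contacting the CA, and the short lifetime limits the damage if the proxy key leaks.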
Credential Delegation in GSI Butler et al, http://www.globus.org/alliance/publications/papers/butler.pdf
A Public Key
rainier.extreme.indiana.edu% more usercert.pem
Bag Attributes
    localKeyID: 01 00 00 00
subject=/DC=org/DC=doegrids/OU=People/CN=Marlon Pierce 64229
issuer= /DC=org/DC=DOEGrids/OU=Certificate Authorities/CN=DOEGrids CA 1
-----BEGIN CERTIFICATE-----
MIIDJjCCAg6gAwIBAgICFMYwDQYJKoZIhvcNAQEFBQAwaTETMBEGCgmSJomT8ixk
----------------------[Stuff deleted]---------------------------------
rlCbtrvQjT79qYIutfFSxwre52OV7p7f/3Uufj0wO4f4hq5Jt05uofQU
-----END CERTIFICATE-----
A Private Key
rainier.extreme.indiana.edu% more userkey.pem
Bag Attributes
    localKeyID: 01 00 00 00
    1.3.6.1.4.1.311.17.1: Microsoft Enhanced Cryptographic Provider v1.0
    friendlyName: 6f50c542f27d23ca349e371673b2ff8d_2586cc29-aa58-4f69-b023-bbcac12e129e
Key Attributes
    X509v3 Key Usage: 10
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,42533BEF0D5016EB
xxQ8IF5UL1rFeWm4hbZBNYNB5TpHl8FqeRPOJk03fltcHyETdndP4GJqLNxHMcxk
fy9As9v49HDSpHde/3jMu9L9q8LXSkG6WmFZgI35nsqjCTcstMdNnZ2P+jxp9sk7
-----------------------[Stuff Deleted]-----------------------------------------------------------
1rts6i6ZDYFzsCpnu+rOsa0kolp+r0zRI0uiiIbOxU9jOtVTiHPsUg==
-----END RSA PRIVATE KEY-----
MyProxy Credential Repository • Private keys are troublesome and dangerous. • You need to put one on every machine that you may use for initial login. • This increases chance it will get stolen. • Can be placed on expensive smart cards. • Solution: MyProxy Server • On-line credential repository. • Issues short-term keys to any client that knows the username and password. • Very convenient for Web portal applications. J. Basney, http://grid.ncsa.uiuc.edu/myproxy/talks.html
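The repository idea can be sketched as an in-memory store that keeps the long-term credential to itself and hands out short-lived stand-ins to anyone who presents the right username and password. Everything here is hypothetical (the class, its methods, the dictionary that stands in for a credential, the two-hour default lifetime); it is not the MyProxy protocol or API.

```python
import hashlib
import hmac
import secrets
import time

class ToyCredentialRepository:
    """In-memory stand-in for an online credential repository."""

    def __init__(self):
        # username -> (salt, password hash, long-term credential)
        self._accounts = {}

    def store(self, username, password, long_term_credential):
        salt = secrets.token_bytes(16)
        pw_hash = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        self._accounts[username] = (salt, pw_hash, long_term_credential)

    def get_short_term(self, username, password, lifetime_s=7200):
        if username not in self._accounts:
            raise PermissionError("unknown user")
        salt, pw_hash, credential = self._accounts[username]
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        if not hmac.compare_digest(candidate, pw_hash):
            raise PermissionError("bad password")
        # The long-term credential never leaves the repository; the caller
        # only receives a short-lived credential derived from it.
        return {
            "subject": credential["subject"] + "/CN=proxy",
            "expires": time.time() + lifetime_s,
            "id": secrets.token_hex(8),
        }

repo = ToyCredentialRepository()
repo.store("alice", "correct horse", {"subject": "/CN=Alice"})
short_term = repo.get_short_term("alice", "correct horse")
print(short_term["subject"])
```

A Web portal would call something like `get_short_term` at login, so the user's private key never has to be copied to the portal host.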
Grid as a Virtual Organization • Now that we have an SSO, we can set this up across many different partner sites. • Use one super-CA or at least mutually trust our partner CAs. • That is, my org will trust messages signed by your CA. • This is the beginning of a “Virtual Organization”. • Real organizations contribute resources to the VO. • VOs can be long-lived. • TeraGrid, Open Science Grid • Ad-hoc Grids are more of a research issue.
GSI in Action: GridFTP • GSI is not a service itself. • You use it to build secure services. • These services inherit several capabilities • They can authenticate to each other. • Messages are secure • Encrypted, non-repudiated, tamper-proof, replay-proof, etc. • You can delegate to remote services the right to take an action on your behalf. • GridFTP is an example of a GSI-enabled service. • File operations and transfers, based on the standard IETF FTP protocol. • Supports parallel TCP • Supports striping: several GridFTP servers can act as a logical GridFTP server, each working on a different data subset. • A nice summary: www.nesc.ac.uk/talks/563/Day2_1020_GridFTP.ppt
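Striping means each server in the logical GridFTP server handles its own subset of the file. One simple way to picture that is to split the file into contiguous byte ranges, one per server. The function below is an illustrative assumption; the layout a real GridFTP deployment uses may differ.

```python
def stripe_ranges(file_size, n_servers):
    """Split [0, file_size) into n contiguous byte ranges, one per server."""
    if n_servers < 1:
        raise ValueError("need at least one server")
    base, extra = divmod(file_size, n_servers)
    ranges, start = [], 0
    for i in range(n_servers):
        # The first `extra` servers take one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges

print(stripe_ranges(10, 3))   # [(0, 4), (4, 7), (7, 10)]
```

Each `(start, end)` pair is a half-open range, so the ranges cover the file exactly once with no overlap.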
GridFTP Third Party Transfer Cartoon GridFTP Client Credential “Move File X to Host B.” Host A GridFTP Source Server Host B GridFTP Destination Server Delegated Credential
GridFTP Clients • Command line clients • globus-url-copy • uberftp • Programming interfaces: build your own client. • Java and Python CoG Kits • Java CoG reviewed next lecture.
What Is GRAM? • GRAM is a protocol for mapping generic user requests to specific actions. • Heritage: must execute jobs on supercomputers. • Interactive: use Unix fork. • Queue Systems: PBS, LSF, Condor, Sun Grid Engine, etc. • This must take place as the user. • Allocation accounting, logging, general peace of mind at stodgy HPC centers. • Note this is very different from e-Business. • You don’t need a database account to buy something from Amazon.
Pre-Web Service GRAM Components [Figure: component diagram. Components: Client; MDS: Grid Index Info Server; MDS: Grid Resource Info Server; Globus Security Infrastructure; Gatekeeper; Job Manager; RSL Library; Local Resource Manager; Process (three instances). Arrow labels: MDS client API calls to locate resources; MDS client API calls to get resource info; GRAM client API calls to request resource allocation and process creation; GRAM client API state change callbacks; Query current status of resource; Request; Create; Parse; Allocate & create processes; Monitor & control. Also marked: Site boundary.] Yikes...
GRAM Job Specifications • The major purpose of GRAM is to execute one or more remote commands on the user’s behalf. • Abstract UNIX shell, PBS, Condor, etc. • So how do you specify the command? • Pre-Web Service Grids (i.e. based on Globus 2) use the Resource Specification Language (RSL). • Web Service Grids (i.e. based on Globus 4) use the XML Job Description Language.
GRAM Client Tools • You can execute remote commands using client tools • We will develop Java clients next time. • GT 2 command line examples (with RSL) • globusrun: all purpose client • globus-job-run: interactive jobs • globus-job-submit: batch jobs • globus-job-cancel: stop batch jobs • GT 4 command line examples (with JDL) • globusrun-ws: all purpose client • globus-job-run-ws: interactive job submission • globus-job-submit-ws: batch job submission • globus-job-clean-ws: stop batch jobs.
Sample RSL String • The following is an argument to globusrun. • Use this to execute “echo” and “mpi-hello”.
(* Multijob Request *)
+(&(executable = /bin/echo)
   (arguments = Hello, Grid From Subjob 1)
   (resource_manager_name = resource-manager-1.globus.org)
   (count = 1)
 )
 (&(executable = mpi-hello)
   (arguments = Hello, Grid From Subjob 2)
   (resource_manager_name = resource-manager-2.globus.org)
   (count = 2)
   (jobtype = mpi)
 )
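A client program usually assembles an RSL string like this one from structured data. The helper below is a minimal sketch that only reproduces the shape of the sample (a `+` multijob made of `&` subjobs); the function names are made up, and RSL quoting and escaping rules are not handled.

```python
def rsl_job(attributes):
    """Render one subjob as &(name = value)(name = value)..."""
    return "&" + "".join(f"({name} = {value})" for name, value in attributes.items())

def rsl_multijob(subjobs):
    """Render a multijob request as +(subjob1)(subjob2)..."""
    return "+" + "".join(f"({rsl_job(job)})" for job in subjobs)

# The two subjobs from the sample slide.
subjobs = [
    {
        "executable": "/bin/echo",
        "arguments": "Hello, Grid From Subjob 1",
        "resource_manager_name": "resource-manager-1.globus.org",
        "count": 1,
    },
    {
        "executable": "mpi-hello",
        "arguments": "Hello, Grid From Subjob 2",
        "resource_manager_name": "resource-manager-2.globus.org",
        "count": 2,
        "jobtype": "mpi",
    },
]

rsl = rsl_multijob(subjobs)
print(rsl)
```

The resulting string is what would be handed to a client such as globusrun as its RSL argument.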
A Very Simple Job Description
<job>
  <executable>/bin/echo</executable>
  <directory>/tmp</directory>
  <argument>12</argument>
  <argument>abc</argument>
  <argument>this is an example string </argument>
  <environment>
    <name>PI</name>
    <value>3.141</value>
  </environment>
  <stdin>/dev/null</stdin>
  <stdout>stdout</stdout>
  <stderr>stderr</stderr>
</job>
http://www.globus.org/toolkit/docs/4.0/execution/wsgram/user-index.html#s-wsgram-user-commandline
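The same job description can be generated programmatically rather than written by hand. The sketch below uses Python's standard `xml.etree.ElementTree` and only the element names that appear on the slide; the `build_job` helper is hypothetical, and the output is not validated against the real GT4 schema.

```python
import xml.etree.ElementTree as ET

def build_job(executable, directory, arguments, environment, stdin, stdout, stderr):
    """Build a <job> element using the element names shown on the slide."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    ET.SubElement(job, "directory").text = directory
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    for name, value in environment.items():
        env = ET.SubElement(job, "environment")
        ET.SubElement(env, "name").text = name
        ET.SubElement(env, "value").text = value
    ET.SubElement(job, "stdin").text = stdin
    ET.SubElement(job, "stdout").text = stdout
    ET.SubElement(job, "stderr").text = stderr
    return job

job = build_job(
    executable="/bin/echo",
    directory="/tmp",
    arguments=["12", "abc", "this is an example string"],
    environment={"PI": "3.141"},
    stdin="/dev/null",
    stdout="stdout",
    stderr="stderr",
)
xml_text = ET.tostring(job, encoding="unicode")
print(xml_text)
```

Writing `xml_text` to a file gives a description in the same shape as the hand-written one above.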
More Details on Job Submission • The full Job Description Schema is here: • http://www.globus.org/toolkit/docs/4.0/execution/wsgram/schemas/gram_job_description.html#SchemaProperties • You can do much more complicated things. • Run sequences of jobs. • Stage files with GridFTP. • Delegate jobs to other GRAMs. • But this is controversial. • Lots of people have worked on job management workflow systems. • Several based on Apache Ant, for example. • BPEL is the Web Service standard.
Globus Services Landscape Now we are up here. www.griphyn.org/documents/document_server/uploaded_documents/doc--1515--GT4_GriPhyN.ppt
Grids and Web Services • The requirements of Grids are very similar to those of Service Oriented Architecture-based systems. • Grid and Web Service integration began in 2002. • Open Grid Services Architecture: “Physiology of the Grid” paper by Foster et al. • Aborted start in Globus Toolkit 3, OGSI • Current Globus Toolkit 4 much more successful. • OGSA-DAI, Condor, and SRB all have Web Service interfaces. • Many UK e-Science projects also follow a similar approach. • Sometimes referred to as the “WS-I+” approach to distinguish it from the Globus/IBM approach. • See http://grids.ucs.indiana.edu/ptliupages/publications/WebServiceGrids.pdf • See OMII releases
GT4 GRAM Structure: WSRF/WSN Poster Child [Figure: architecture diagram. Labels: Client; Service host(s) and compute element(s); GT4 Java Container; GRAM services; Delegation; RFT File Transfer; Compute element; Local job control; sudo; GRAM adapter; Local scheduler; User job; GridFTP; Remote storage element(s); GridFTP. Arrow labels: Job functions; Delegate; Transfer request; FTP control; FTP data.] www.griphyn.org/documents/document_server/uploaded_documents/doc--150VDS_1.4_Plans.2005.0429.ppt
Reliable File Transfer: Third Party Transfer • Fire-and-forget transfer • Web services interface • Many files & directories • Integrated failure recovery [Figure: RFT Client, RFT Service, and two GridFTP Servers. Labels on each GridFTP Server: Protocol Interpreter, MasterDSI, SlaveDSI, IPCReceiver, Data Channel, IPC Link. Arrow labels: SOAP Messages; Notifications (Optional).] www.griphyn.org/documents/document_server/uploaded_documents/doc--150VDS_1.4_Plans.2005.0429.ppt
Grid Web Service Extensions • WSDL and SOAP form the core of Grid services. • WS-Addressing and WS-Security family are important. • Globus and friends are working to extend core Web Service standards through OASIS. • WS-Resource Framework (WSRF): modeling stateful resources. • WS-Notification: Web Service version of one-to-many messaging.
Stateful Resources and Grids • Web Service Architectures and thus Grids are really message oriented, not RPC based. • All state should be in the SOAP message. • This allows messages to go through many SOAP intermediaries. • Request/response does not really map to Grid requirements. • Services may take hours or days to complete, so need callbacks. • Ex: computational chemistry codes on TeraGrid, RFT for many TB of data. • Services may need to push information to listeners. • “Big file 1 is done, now move big file 2” • Grid resources may also come and go. • Instruments typically generate data at scheduled times. • Down for maintenance, upgrades, reconfiguration, etc. • WSRF and WS-Notification attempt to solve these Grid requirements.
Web Service Resource Framework • WSRF is a collection of WSDL specifications and associated messages. • WS-Resource • WS-ResourceProperties • WS-ResourceLifetime • WS-ServiceGroup • WS-BaseFault • See http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsrf
WS-Resource • The WS-Resource decouples a (stateful) resource from the Web Service that accesses it. • For example, a database is a resource that may be accessed through a Web Service. • The resource may be defined by metadata. • Our database needs to provide clues to the type of data it contains. • Need this for discovery. • This metadata is contained in WS-ResourceProperties
Goals of WS-ResourceProperties • Provide a metadata property framework for describing resources. • Provide a Web Service interface for performing operations on these properties. • Query and retrieve properties. • Update values on a resource (controversial). • Subscribe to property changes. • Use XML Schemas to hold WSDL message definitions that define the resource properties. • Associate these messages with WSDL portTypes. • The actual values of the Schema are in an XML document. • Store it in memory, put it in a database, derive it at query time, ... • This requires some understanding of WSDL and SOAP. Upcoming lecture will cover this.
Goals of WS-ResourceLifetime • Resources may have lifetimes. • For example, your quantum chemistry calculation may take a few hours. • This may be associated with a WS-Resource. • WS-ResourceLifetime defines methods for • Destroying a resource at some future time (and t=0 allowed). • Learning the lifetime of a resource. • Extending the lifetime of a resource.
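The property and lifetime operations described on the last two slides can be pictured as a small in-memory object: it holds named properties, notifies subscribers when one changes, and refuses all access once its termination time has passed or it has been destroyed. This is a toy model with made-up class and method names; the real specifications define these operations as WSDL messages, not as a Python class.

```python
import time

class ResourceGone(Exception):
    """Raised when a resource is used after its lifetime has ended."""

class ToyResource:
    def __init__(self, properties, lifetime_s, clock=time.time):
        self._clock = clock                      # injectable so tests can fake time
        self._properties = dict(properties)
        self._termination_time = clock() + lifetime_s
        self._destroyed = False
        self._listeners = []

    def _check_alive(self):
        if self._destroyed or self._clock() >= self._termination_time:
            raise ResourceGone("resource lifetime has ended")

    def get_property(self, name):                # query and retrieve
        self._check_alive()
        return self._properties[name]

    def set_property(self, name, value):         # update, then notify subscribers
        self._check_alive()
        self._properties[name] = value
        for callback in self._listeners:
            callback(name, value)

    def subscribe(self, callback):               # subscribe to property changes
        self._listeners.append(callback)

    def termination_time(self):                  # learn the lifetime
        return self._termination_time

    def set_termination_time(self, new_time):    # extend (or shorten) the lifetime
        self._check_alive()
        self._termination_time = new_time

    def destroy(self):                           # immediate destruction
        self._destroyed = True

# A hypothetical chemistry job with a one-hour lifetime and a fake clock.
fake_now = [0.0]
job = ToyResource({"status": "running"}, lifetime_s=3600, clock=lambda: fake_now[0])
changes = []
job.subscribe(lambda name, value: changes.append((name, value)))
job.set_property("status", "done")
job.set_termination_time(job.termination_time() + 3600)
print(job.get_property("status"), job.termination_time(), changes)
```

Passing the clock in as a parameter is a testing convenience, not part of any specification.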
WS-Notification Core Specs • WS-BaseNotification • Specs for controlling publications and subscriptions of events (e.g. resource property changes). • Subscribers subscribe directly to publishers. • WS-Topics • Topics are used to organize messages. • You may publish or subscribe to a topic rather than a specific resource endpoint. • WS-BrokeredNotification • Brokers decouple publishers from subscribers.
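The brokered, topic-based pattern can be sketched in a few lines: publishers and subscribers both talk to a broker and name a topic, so neither needs to know the other's endpoint. The class, the topic string, and the messages below are made up for illustration; real WS-Notification exchanges are SOAP messages.

```python
class ToyBroker:
    """Routes messages from publishers to subscribers by topic name."""

    def __init__(self):
        self._subscribers = {}   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber of the topic; return the count."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)

broker = ToyBroker()
received = []

# A hypothetical listener that wants to hear about finished file transfers.
broker.subscribe("transfers/done", lambda topic, msg: received.append(msg))

delivered = broker.publish("transfers/done", "Big file 1 is done")
ignored = broker.publish("transfers/failed", "nobody is listening here")
print(delivered, ignored, received)
```

Direct subscription to a publisher is the same idea without the middleman: the publisher would keep the callback list itself.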
WS-Notification • Stateful resources will need to notify one or more listeners when their state changes. • For example, a Web lecture has many events. • Beginning and end of the lecture. • Changes in slides. • To my knowledge, no one has tried this. • Real examples based on WS-GRAM, RFT.
A Skeptical View of WSRF • WSRF has several independent implementations. • WSRF.NET (UV), Python (LBL), Perl (UK), C/C++ (ANL), ... • But is this critical mass? • What about MS, Oracle, and other big Web Service players? • OASIS specification approval is glacial. • Many specs, even if approved, have died on the vine for lack of backing. • Many more are a mess because of complicated dependencies. • WS-Addressing has released many versions, screwing up many dependent specs. • Competing specs exist. • MS’s WS-Eventing, for example. • “Semantic Grid” uses an entirely different approach for metadata. • RDF, OWL provide more natural modeling of metadata than tree-based XML Schemas. • Ignores UDDI as an information system. • I ran out of room.
Future Challenges • Real time interaction • Joy of use • Intuitive user interface • Global scalability • 1000s of simultaneous users • Addictive • (Observation courtesy Prof. Fran Berman)