
Grid HTTP/HTTPS extensions 16 December 2002

Andrew McNab, University of Manchester, mcnab@hep.man.ac.uk

Presentation Transcript


  1. Grid HTTP/HTTPS extensions
     16 December 2002
     Andrew McNab, University of Manchester
     mcnab@hep.man.ac.uk

  2. Overview
     • HTTPS as a grid protocol
     • HTTP as a data protocol
     • Multistream HTTP: curl-url-get
     • Grid HTTP/HTTPS usage
     • G-HTTPS
     • Trusted Caches
     • fileGridSite HTTPS server
     • Third Party Transfers
     • curlfs for SlashGrid
     • Summary

  3. HTTPS as a Grid protocol
     • HTTPS is an interesting and important protocol for several reasons:
       • it is by far the most widely deployed secure protocol
       • has a large amount of high quality software that we could leverage
       • has excellent interaction with Firewalls, Network Address Translation and Application Proxies
       • has the potential to solve some of the problems sites have with private IP farms
       • along with HTTP, is the basis for Web and Grid Services
     • HTTPS consists of HTTP/1.1 over an SSL connection
       • security done by SSL layer, using X509 certificates (including GSI)
     • HTTP/1.1 (RFC 2616) and extensions like WebDAV (RFC 2518) have a rich set of methods (GET, PUT, DELETE, COPY etc), headers (“Expires:” etc) and errors (“413 Request Entity Too Large”)
       • so a standard way exists for many of the transfer operations we need

  4. HTTP as a data protocol
     • Same advantages as HTTPS: large amount of existing high quality software, and good operation with Firewalls, NAT etc.
     • If we build secure HTTPS information/control services, easy to provide HTTP data services:
       • Do GET during HTTPS session, but server responds with redirect to HTTP data server?
       • So GridFTP Control & Data channels --> HTTPS Negotiate and HTTP Data connections
     • Kernel-based “zero-copy” HTTP servers like tux are very efficient
       • need to do something like that to fully use a machine’s gigabit interface
     • HTTP connection and a GridFTP data channel are same at TCP layer
       • but may want a way to specify TCP parameters to be used by HTTP server responding with data

  5. Multistream HTTP
     • HTTP can support application-level multiple streams and striping by using the standard Range: header from RFC 2616 (HTTP/1.1) to set up many partial fetches.
     • This mechanism is supported by almost all modern web servers
       • eg Apache and RedHat’s tux kernel httpd
     • Multiple streams implemented by client splitting into threads
     • Each thread requests a block of the file from the server
     • As each request completes, thread finds next unfetched block and requests it
     • Striping by doing the same mechanism, but with more than one server
     • curl-url-get demonstrates both of these
       • source is 300 lines of C, in EDG CVS
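The block-by-block scheme on this slide can be sketched in a few lines. This is an illustrative Python sketch, not the curl-url-get source (which is C): the names `block_ranges`, `http_range_fetch` and `multistream_get` are invented here, the file size is assumed to be known in advance (for example from a HEAD request), and striping is approximated by assigning each thread one of several server URLs.

```python
import threading
import urllib.request
from typing import Callable, List, Optional, Sequence, Tuple


def block_ranges(size: int, block: int) -> List[Tuple[int, int]]:
    """Split [0, size) into inclusive (first, last) byte ranges of at most `block` bytes."""
    return [(start, min(start + block, size) - 1) for start in range(0, size, block)]


def http_range_fetch(url: str, first: int, last: int) -> bytes:
    """Fetch one block with a standard HTTP/1.1 Range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})
    with urllib.request.urlopen(req) as resp:
        if resp.status != 206:  # server ignored the Range header
            raise RuntimeError(f"expected 206 Partial Content, got {resp.status}")
        return resp.read()


def multistream_get(urls: Sequence[str], size: int, block: int, streams: int,
                    fetch: Callable[[str, int, int], bytes] = http_range_fetch) -> bytes:
    """Fetch `size` bytes using `streams` threads, each pulling the next unfetched block."""
    if streams < 1:
        raise ValueError("need at least one stream")
    ranges = block_ranges(size, block)
    parts: List[Optional[bytes]] = [None] * len(ranges)
    errors: List[Exception] = []
    lock = threading.Lock()
    next_index = 0

    def worker(worker_id: int) -> None:
        nonlocal next_index
        url = urls[worker_id % len(urls)]  # striping: spread threads over the servers
        while True:
            with lock:  # claim the next unfetched block
                if errors or next_index >= len(ranges):
                    return
                i = next_index
                next_index += 1
            first, last = ranges[i]
            try:
                parts[i] = fetch(url, first, last)
            except Exception as exc:  # record the failure and stop this thread
                with lock:
                    errors.append(exc)
                return

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(streams)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise errors[0]
    return b"".join(parts)
```

Passing a single URL gives plain multistream behaviour; passing several URLs that serve the same file gives a simple form of striping. The `fetch` parameter exists only so the scheduling logic can be exercised without a network.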

  6. curl-url-get examples
     • Rough tests done, copying files from Manchester to CERN
     • elapsed times in seconds, average of 10 copies of each type, alternated

     Size   curl-url-get   globus-url-copy   streams
     292M   64.6±6.1       62.1±4.9          20
     292M   96.0±9.2       74.8±3.8          5
     29M    7.1±1.4        6.9±1.8           20
     29M    31.6±0.4       15.9±0.9          1
     2.9M   0.49±0.07      2.24±0.10         20
     2.9M   3.30±0.16      2.61±0.18         1
     2.9K   2.15±0.04 [only one value in the source; its column is unclear]   20
     2.9K   0.11±0.00      1.05±0.10         1

  7. Extensions to HTTPS/HTTP
     • HTTPS/HTTP already have most of the functionality we need for Grid information/control/data transport
       • some of these come from several sources (eg the WebDAV RFC 2518, not just HTTP/1.1 itself) and can be done different ways
       • so want to specify a sufficient subset for interoperability
     • However, can identify some extensions that are also needed:
       • delegation to HTTPS
       • some way of returning access control information along with data
       • other metadata too
       • may want to specify TCP parameters for bulk data transfer

  8. “G-HTTPS”
     • A proposal by Akos and me, for backwards compatible extensions to HTTPS
       • discussed on wp2-sec and wp7-security lists
     • Adds GSI proxy delegation to HTTPS using additional methods (eg PUT-PROXY) and headers (eg Delegation-ID)
     • Allows services to return generalised metadata in headers or by URL
       • initially this allows services to return the GACL ACL of a response for more efficient caching (ie sharing cached copies with other users)
       • essential to include expiration and caching policy information too
     • Aim is to avoid breaking existing HTTPS systems and to achieve “pass through” compatibility:
       • even if HTTPS client or server software doesn’t understand extensions, they can make them available to the application which does

  9. Example of delegation by HTTPS
     • Client issues GET-PROXY-REQ request, perhaps with a message body specifying any extensions required in the proxy cert
     • Server generates a key and a certificate request, returns this in the response message body.
     • Client signs this, and returns it in the body of a PUT-PROXY request
     • Need a Delegation-ID header in the above exchanges so can keep track of the delegation session
       • may want to maintain delegation sessions for the same user at one server, but with different amounts of delegation
     • Subsequent GET, PUT etc actions carry on using the Delegation-ID
     • Non G-HTTPS server will respond with “501 Method not implemented” to above methods
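The exchange on this slide can be modelled as a toy in-memory server, purely to show how the Delegation-ID ties the steps together. Everything beyond the method names, the Delegation-ID header and the 501 response is an assumption made for illustration: the class `ToyDelegationServer`, the choice that the server allocates the Delegation-ID, the 200 and 400 status codes, and the placeholder strings standing in for the real key, certificate request and signed proxy. No cryptography is performed.

```python
import secrets
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def toy_sign(cert_request: str) -> str:
    """Placeholder for the client signing the certificate request (no real crypto)."""
    return f"SIGNED({cert_request})"


class ToyDelegationServer:
    """Hypothetical model of a server handling the G-HTTPS delegation methods."""

    def __init__(self, g_https: bool = True) -> None:
        self.g_https = g_https
        self.pending: Dict[str, str] = {}  # Delegation-ID -> outstanding certificate request
        self.proxies: Dict[str, str] = {}  # Delegation-ID -> signed proxy from the client

    def handle(self, method: str, headers: Dict[str, str], body: str = "") -> Response:
        if self.g_https and method == "GET-PROXY-REQ":
            # Assumption: the server allocates the ID unless the client supplies one.
            delegation_id = headers.get("Delegation-ID") or secrets.token_hex(8)
            # A real server would generate a key pair and certificate request here.
            cert_request = f"CERT-REQUEST-FOR-{delegation_id}"
            self.pending[delegation_id] = cert_request
            return 200, {"Delegation-ID": delegation_id}, cert_request
        if self.g_https and method == "PUT-PROXY":
            delegation_id = headers.get("Delegation-ID", "")
            if delegation_id not in self.pending:
                return 400, {}, "unknown Delegation-ID"
            del self.pending[delegation_id]
            self.proxies[delegation_id] = body  # store the signed proxy for this session
            return 200, {"Delegation-ID": delegation_id}, ""
        if method in ("GET", "PUT"):
            # Later requests name the delegation session they want to act under.
            delegated = headers.get("Delegation-ID") in self.proxies
            return 200, {}, "credential=delegated" if delegated else "credential=none"
        # Servers without the extensions reject the new methods.
        return 501, {}, "Method not implemented"
```

A client would call `handle("GET-PROXY-REQ", {})`, sign the returned body, send it back with `handle("PUT-PROXY", {"Delegation-ID": ...}, signed)`, and then include the same Delegation-ID on later GET or PUT requests.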

  10. Application of delegation: Trusted Caches
     • Many information services are going to need delegation, but Trusted Caches are one purely file transfer application of this
     • Existing HTTPS isn’t cache-able:
       • needs a connection from client to origin server for the trust mechanism to work
     • So best you get is opaque proxying/tunneling of SSL
     • With delegation, can improve this:
       • client identifies a caching server it trusts (in its VO maybe?)
       • delegates a credential to it
       • makes an HTTP proxy request via HTTPS: GET http://a.b.c/def
       • caching server fetches this using delegated credential, gives it to client
       • if can get an ACL for this file, may be able to return file from cache in subsequent requests
       • also means that only real HTTPS works, not other things hidden in SSL
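A minimal sketch of the caching decision described above, under heavy simplification. The names `TrustedCache` and `CacheEntry` are hypothetical, the ACL is reduced to a plain set of user identities rather than a real GACL document, and the origin fetch is an injected callable that is assumed to return the body, the ACL and a lifetime in seconds.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

# Assumed origin interface: (url, delegated_credential) -> (body, allowed_users, lifetime_s)
OriginFetch = Callable[[str, str], Tuple[bytes, Iterable[str], float]]


@dataclass
class CacheEntry:
    body: bytes
    allowed: FrozenSet[str]  # simplified stand-in for the GACL ACL of the response
    expires: float           # absolute expiry time, from the origin's caching policy


class TrustedCache:
    """Hypothetical cache that reuses a copy only when the stored ACL permits it."""

    def __init__(self, fetch_origin: OriginFetch,
                 now: Callable[[], float] = time.time) -> None:
        self.fetch_origin = fetch_origin
        self.now = now
        self.entries: Dict[str, CacheEntry] = {}

    def get(self, url: str, user: str, credential: str) -> bytes:
        entry = self.entries.get(url)
        if entry and self.now() < entry.expires and user in entry.allowed:
            return entry.body  # hit: the ACL says this user may read the cached copy
        # Miss, expired, or user not in ACL: go to the origin with the user's
        # delegated credential, so the origin makes the access decision itself.
        body, allowed, lifetime = self.fetch_origin(url, credential)
        self.entries[url] = CacheEntry(body, frozenset(allowed), self.now() + lifetime)
        return body
```

The point of returning the ACL with the response is visible in the first branch: a second user listed in the ACL can be served from the cache without another trip to the origin server.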

  11. fileGridSite
     • Read (GET) well supported by HTTPS servers.
     • However, write (PUT, DELETE, MOVE, COPY) usually left to CGI programs, servlets etc.
     • Access control also usually limited to client IP or HTTP passwords.
     • fileGridSite adds Grid authorisation and write operation support to Apache
       • a cut-down version of GridSite (used for https://marianne.in2p3.fr)
       • file rather than webpage orientated (no fancy headers on HTML etc)
       • uses GACL to handle the Access Control Lists
       • can work with mod_ssl-GSI so clients can authenticate with a GSI proxy
     • Turns an Apache webserver into a Grid HTTPS fileserver with the key functionality of a GridFTP server.

  12. fileGridSite examples with curl
     • Curl is a standard HTTP/HTTPS command line client (cf wget)
     • Get a file using GSI proxy in /tmp/x509up_u100
       • curl --capath /etc/grid-security/certificates/ --cert /tmp/x509up_u100 https://a.b.com/example1.txt
     • Copy a file to the fileGridSite server with HTTP PUT:
       • curl --capath /etc/grid-security/certificates/ --cert /tmp/x509up_u100 --upload-file /tmp/example2.txt https://a.b.com/example2.txt
     • Delete a file with HTTP DELETE:
       • curl --capath /etc/grid-security/certificates/ --cert /tmp/x509up_u100 --request DELETE https://a.b.com/example2.txt
     • Create a directory with PUT to …/
       • curl --capath /etc/grid-security/certificates/ --cert /tmp/x509up_u100 --request PUT https://a.b.com/newdir/

  13. Adding delegation to fileGridSite
     • Doing this as a demonstration of G-HTTPS extensions
     • Delegation needed for Third Party Transfers
     • Use COPY from WebDAV RFC 2518, which allows source or destination to be absolute URLs
     • Spec actually allows “fourth party” too, involving two remote URLs and the transfer being tunneled through the server.
     • Delegation also useful for fileservers which need credentials to access local storage
       • to get token for local AFS cell (Lyon have had to work around this with GridFTP servers)
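For orientation, a WebDAV COPY request names the source in the request line and the target in a Destination header. The sketch below only builds the request text and sends nothing. The helper name `build_copy_request` and the host `c.d.org` are invented for the example, and attaching a Delegation-ID header to COPY is an assumption based on the earlier slide saying that subsequent actions carry on using the Delegation-ID.

```python
from typing import Optional
from urllib.parse import urlsplit


def build_copy_request(source_url: str, destination_url: str,
                       delegation_id: Optional[str] = None) -> str:
    """Return the text of a WebDAV COPY request addressed to the source server."""
    src = urlsplit(source_url)
    lines = [
        f"COPY {src.path or '/'} HTTP/1.1",  # the source is the Request-URI
        f"Host: {src.netloc}",
        f"Destination: {destination_url}",   # absolute URL of the copy target
    ]
    if delegation_id:
        # Assumption: the delegation session is referenced the same way as for GET/PUT.
        lines.append(f"Delegation-ID: {delegation_id}")
    return "\r\n".join(lines) + "\r\n\r\n"


print(build_copy_request("https://a.b.com/example1.txt",
                         "https://c.d.org/copy1.txt",
                         delegation_id="abc123"))
```

In a third party transfer the client sends this to the source server, which then needs a delegated credential to write to the destination on the client's behalf.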

  14. curlfs for SlashGrid
     • curl is built on top of a general library, libcurl
       • handles persistent HTTP and HTTPS connections, SSL setup etc
     • To add HTTP and HTTPS filesystems to SlashGrid, have made a libcurl filesystem plugin: curlfs
     • This maps parts of the URL space into the local filesystem:
       • https://a.b.com/newdir/ ---> /grid/https/a.b.com/newdir/
     • Works with any standard HTTP or HTTPS server
       • rpm -i /grid/http/datagrid.in2p3.fr/distribution/globus/beta-21/RPMS/*
     • SlashGrid framework provides GSI proxy or full cert/key to curlfs so it can make authenticated requests.
     • Write with HTTP/1.1 PUT and DELETE being added to curlfs
     • Will complement fileGridSite support for these on server side
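The URL-to-path mapping shown on this slide is simple enough to write down directly. This sketch only reproduces the naming rule from the two examples given (`/grid/<scheme>/<host>/<path>`); the function names are invented, and how curlfs itself handles ports, query strings or other edge cases is not covered here.

```python
from urllib.parse import urlsplit

GRID_ROOT = "/grid"  # mount point used in the slide's examples


def url_to_path(url: str) -> str:
    """Map an HTTP(S) URL to its location under /grid."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return f"{GRID_ROOT}/{parts.scheme}/{parts.netloc}{parts.path or '/'}"


def path_to_url(path: str) -> str:
    """Inverse mapping: recover the URL that a /grid path refers to."""
    prefix = GRID_ROOT + "/"
    if not path.startswith(prefix):
        raise ValueError(f"not under {GRID_ROOT}: {path!r}")
    scheme, _, rest = path[len(prefix):].partition("/")
    if scheme not in ("http", "https") or not rest:
        raise ValueError(f"cannot map path: {path!r}")
    return f"{scheme}://{rest}"


print(url_to_path("https://a.b.com/newdir/"))
print(path_to_url("/grid/http/datagrid.in2p3.fr/distribution/globus/beta-21/RPMS/"))
```

The inverse direction is the one a filesystem plugin needs: given a path a program opens under /grid, work out which URL to request.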

  15. Summary
     • HTTPS as a grid protocol
       • G-HTTPS extensions being worked out
     • HTTP as a data protocol
       • even a quick multistream HTTP hack seems very competitive
     • fileGridSite HTTP(S) server has been written
       • supports read/write with standard utilities like curl
       • third party transfers being added as demonstration of delegation
     • curlfs written for SlashGrid: maps URLs into filesystem
     • Source code for curl-url-get, fileGridSite, curlfs is in EDG CVS
     • See http://www.gridpp.ac.uk/authz/ for more details
