RESTful Web Services for Scientific Computing. Joshua Boverhof, LBNL Shreyas Cholia, NERSC/LBNL OSCON 2011 July 28 2011, Portland OR. NERSC. National Energy Research Scientific Computing Center DOE Office of Science HPC User Facility at Lawrence Berkeley Lab
RESTful Web Services for Scientific Computing • Joshua Boverhof, LBNL • Shreyas Cholia, NERSC/LBNL • OSCON 2011 • July 28 2011, Portland OR
NERSC • National Energy Research Scientific Computing Center • DOE Office of Science HPC User Facility at Lawrence Berkeley Lab • Provides high performance compute, data, network and information services to scientists across the world
Web Gateways • Old way - SSH + command line + batch system • People now expect web interfaces for everything • Usability - scientific computing should be as easy as online banking • Users don’t want generic options/tools that are not applicable to their science • Users don’t want to deal with backend, middleware, UNIX CLI etc.
NERSC Scientific Gateways • DeepSky • Astronomical Image Database 11 million images (70TB) • The Gauge Connection • QCD Lattice Gauge Data • CXIDB • X-Ray Image Data Bank • 20th Century Reanalysis • Reanalysis of 20th Century Climate Data • Dayabay • Dayabay Neutrino Detector Gateway • ESG • Earth System Grid Climate Gateway and Data-node
Motives for developing NERSC Web Toolkit (NEWT) • Make it very easy for science teams to build web gateways to their data and computation • We have already built several science specific gateways - want to encapsulate common patterns • Provide Web APIs for access to backend resources for portal and web front-end developers.
NEWT Web Stack • Web Service • Built with Django Web Framework • Exposes NERSC Resources as HTTP URLs • Generally uses REST conventions • Access HPC Resources over the web using HTTP + JSON • Frontend Development • JavaScript library “newt.js” • AJAX
Things you can do ... • Authenticate using NERSC credentials • Check machine status • Upload and download files • Submit a compute job • Monitor a job • Get user account information • Store app data • Issue UNIX commands
Architecture (diagram) • Client: Web Application - HTML 5/AJAX • http request / JSON data • NEWT (Django) • Internal DB: session, cred, user information • Authentication: MyProxy CA • Accounting Information: NIM • System Resources (via Globus): Files, Batch Jobs, Status, Shell Commands • Persistent Store (NoSQL DB): CouchDB
RESTful Conventions • Resources represented as a set of URLs • HTTP verbs • GET: Idempotent operation, retrieve resource representation • PUT: Idempotent operation, set resource representation • DELETE: Idempotent operation, delete resource • POST: Not idempotent. Avoid overloading it as an RPC mechanism; typically use it as a factory that creates new resources.
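The verb conventions above can be illustrated with a small in-memory model. This is a sketch only: `ResourceStore` and the `/items/...` URLs are hypothetical and are not part of NEWT. It shows that repeating GET, PUT or DELETE leaves the same state, while each POST to a factory creates a new resource.

```python
class ResourceStore:
    """Hypothetical in-memory store that mimics HTTP verb semantics."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def get(self, url):
        # GET: read-only, so repeating it changes nothing.
        return self._items.get(url)

    def put(self, url, representation):
        # PUT: set the representation at a known URL; repeating it
        # leaves the store in the same state.
        self._items[url] = representation
        return url

    def delete(self, url):
        # DELETE: deleting twice leaves the same state as deleting once.
        self._items.pop(url, None)

    def post(self, representation):
        # POST as a factory: every call creates a new resource and URL.
        url = "/items/%d" % self._next_id
        self._next_id += 1
        self._items[url] = representation
        return url


store = ResourceStore()
store.put("/items/a", {"v": 1})
store.put("/items/a", {"v": 1})   # same state as after the first PUT
first = store.post({"v": 2})
second = store.post({"v": 2})     # a second, distinct resource
store.delete("/items/a")
store.delete("/items/a")          # no error, same state
```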
Login Resource: Authenticate • $.newt_ajax({ • url: "/auth/", • type: "POST", • data: {'username': username, 'password': password}, • success: function(res, textStatus, jXHR) {} • });
Login Resource • $.newt_ajax({ • url: "/login/", • type: "GET", • success: function(data){}, • }); • 200 OK • {"username": "joe", "session_lifetime": 14384, "auth": true}
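For callers outside the browser, the same login exchange could be sketched in Python with only the standard library. The helper names (`build_auth_request`, `make_session`, `is_authenticated`) are illustrative assumptions, not part of NEWT or newt.js; the base URL and the `/auth` path are the ones used in the curl examples of this talk. Nothing is sent over the network here: the code only builds the request and interprets a response body.

```python
import json
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Base URL as used in the curl examples of this talk.
NEWT_BASE = "https://portal-auth.nersc.gov/newt"


def build_auth_request(username, password, base=NEWT_BASE):
    """Build (but do not send) the form-encoded POST for the auth resource."""
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("ascii")
    return urllib.request.Request(base + "/auth", data=body, method="POST")


def make_session():
    """Return an opener that keeps the session cookie between calls."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def is_authenticated(response_text):
    """Interpret a JSON body such as {"username": ..., "auth": true}."""
    return bool(json.loads(response_text).get("auth"))
```

A caller would send the request with `make_session().open(build_auth_request(user, password))` and reuse the same opener for later calls so that the session cookie is presented each time.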
Queue Resource: PBS job submission • $.newt_ajax({ • url: "/queue/hopper/", • type: "POST", • data: {"jobfile": filename}, • success: function(data){ • $("#output").append(data.jobid); • }, • }); • This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like • {"status": "OK", "error": "", "jobid" : "hop1234.id" }
Queue Resource: PBS job submission • $.newt_ajax({ • url: "/queue/franklin/", • type: "POST", • data: {"jobscript": "#PBS -l mppwidth=8\n mpirun -n 8 /bin/hostname"}, • success: function(data){ • $("#output").append(data.jobid); • }, • }); • This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like • {"status": "OK", "error": "", "jobid": "7259874"}
Command Resource: Fork job submission • $.newt_ajax({ • url: "/command/franklin", • type: "POST", • data: {"executable": "/bin/date"}, • success: function(data){ • $("#output").append(data.output); • }, • }); • This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like • {"output": "Wed Jul 20 22:51:58 PDT 2011", "error": ""}
Simple Usage: curl • $ curl -k -c cookies.txt -X POST -d "username=boverhof&password=$PASS" https://portal-auth.nersc.gov/newt/auth; • {"username": "boverhof", "session_lifetime": 14397, "auth": true} • $ curl -k -b cookies.txt -X GET https://portal-auth.nersc.gov/newt/status/franklin; • {"status": "up", "system": "franklin"} • $ curl -k -b cookies.txt -d "executable=/bin/date" https://portal-auth.nersc.gov/newt/job/franklin/fork/; • {"status": null, "executable": "/bin/date", "user_id": 18, "url": "https://franklingrid.nersc.gov:60886/81661/1311833735/", "jobmanager": "fork", "submitted": "2011-07-20T06:15:35", "machine": "franklin", "finished": null, "output": null, "id": 47789} • $ curl -k -b cookies.txt -X GET https://portal-auth.nersc.gov/newt/job/jobs/47789; • {"status": "DONE", "executable": "/bin/date", "user_id": 18, "url": "https://franklingrid.nersc.gov:60886/81661/1311833735/", "jobmanager": "fork", "submitted": "2011-07-20T06:15:35", "machine": "franklin", "finished": "2011-07-20T06:15:36", "output": "Wed Jul 20 23:15:36 PDT 2011\n", "id": 47789}
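The last two curl calls show a submit-then-poll pattern: the job record first comes back with a null status and later with "DONE". A minimal Python sketch of the polling step follows. `wait_for_job` and its `fetch_job` argument are hypothetical; `fetch_job` stands for whatever function performs the GET on the job URL and returns the decoded JSON. Only the null and "DONE" states appear in this talk, so any other status values (for example failures) are not handled here.

```python
import time


def wait_for_job(fetch_job, job_id, interval=1.0, max_polls=30, sleep=time.sleep):
    """Poll a job record until its status is "DONE", then return the record.

    fetch_job(job_id) is supplied by the caller (an assumption of this
    sketch) and must return the job record as a dict.
    """
    for _ in range(max_polls):
        record = fetch_job(job_id)
        if record.get("status") == "DONE":
            return record
        sleep(interval)  # wait before asking again
    raise TimeoutError(
        "job %s not finished after %d polls" % (job_id, max_polls)
    )
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.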
Django settings: Pluggable Authentication • Authenticate using NERSC credentials to a myproxy-server • Add AuthenticationMiddleware • django.contrib.auth.middleware.AuthenticationMiddleware • Configure authentication backend • AUTHENTICATION_BACKENDS = ('newt.authnz.myproxy_backend.MyProxyBackend',) • Implement authentication backend • class MyProxyBackend: • def authenticate(self, username=None, password=None): • # MyProxy logon
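The backend outline above could be fleshed out roughly as follows. This is a sketch under stated assumptions, not NEWT's actual code: `myproxy_logon` is a hypothetical placeholder for a real MyProxy client call, and the `users` dictionary stands in for Django's User model so that the example runs without Django installed. The `authenticate` signature matches the slide; newer Django releases also pass a `request` argument first.

```python
def myproxy_logon(username, password):
    """Hypothetical placeholder: contact the myproxy-server and return a
    credential, or raise an exception if the logon is rejected."""
    raise NotImplementedError("plug in a real MyProxy client here")


class MyProxyBackend:
    # Class attributes, so the backend can still be created with no arguments.
    logon = staticmethod(myproxy_logon)
    users = {}  # stand-in for Django's User model: username -> user object

    def authenticate(self, username=None, password=None):
        if not username or not password:
            return None
        try:
            self.logon(username, password)
        except Exception:
            # Returning None tells Django this backend did not
            # authenticate the credentials.
            return None
        # Create the local user record on first successful logon.
        return self.users.setdefault(username, {"username": username})

    def get_user(self, user_id):
        # Django calls get_user to reload the user on later requests.
        return self.users.get(user_id)
```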
Django settings: File Upload • File Upload: Upload to portal, store in temporary file, then transfer to remote file system. • Configure file upload handler (settings.py) • FILE_UPLOAD_HANDLERS = ('newt.file.uploadhandler.RemoteCopyTemporaryFileUploadHandler',) • Implement file upload handler • from django.core.files.uploadhandler import TemporaryFileUploadHandler as _TemporaryFileUploadHandler • class RemoteCopyTemporaryFileUploadHandler(_TemporaryFileUploadHandler): • def upload_complete(self): • # Transfer to remote filesystem
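The stage-then-transfer flow described above (upload to the portal, store in a temporary file, then copy to the remote file system) can be sketched independently of Django. `stage_and_transfer` and its `transfer` callback are hypothetical names; the callback stands for whatever remote-copy mechanism the portal uses, which this talk does not spell out.

```python
import os
import tempfile


def stage_and_transfer(chunks, remote_path, transfer):
    """Write uploaded chunks to a temporary file, then hand it to `transfer`.

    transfer(local_path, remote_path) is supplied by the caller (an
    assumption of this sketch) and performs the remote copy.
    """
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        with tmp:  # closes the file so that all data is flushed to disk
            for chunk in chunks:
                tmp.write(chunk)
        return transfer(tmp.name, remote_path)
    finally:
        os.unlink(tmp.name)  # always remove the staged copy
```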
Implementation Details (Hacks) • Django v1.[1,2,3?] support for HTTP verbs is lacking • PUT: request data is not loaded; we used the “coerce_put_post” code from django-piston • Looking at using Tastypie, a web service API framework for Django that provides a convenient, yet powerful and highly customizable, abstraction for creating REST-style interfaces.
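The PUT problem comes down to the framework not parsing the request body for that verb. The standalone sketch below shows the underlying idea of parsing a form-encoded PUT body by hand. It is not django-piston's code, which works on Django's request object; `parse_put_body` is a hypothetical helper and the field names in the usage line are only examples.

```python
from urllib.parse import parse_qs

FORM_TYPE = "application/x-www-form-urlencoded"


def parse_put_body(raw_body, content_type=FORM_TYPE):
    """Decode a form-encoded PUT body (bytes) into a dict of field values."""
    if not content_type.startswith(FORM_TYPE):
        raise ValueError("unsupported content type: %s" % content_type)
    fields = parse_qs(raw_body.decode("utf-8"))
    # parse_qs returns a list per key; keep the last value for each field.
    return {name: values[-1] for name, values in fields.items()}


params = parse_put_body(b"jobfile=%2Fhome%2Fjoe%2Frun.pbs&queue=debug")
```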
NOVA: VASP portal • https://newt.nersc.gov • https://portal-auth.nersc.gov/nova/