Implementing the Data Access Protocol in Python. Dr. Rob De Almeida. Table of Contents. History Current implementation Client Server Plugins & responses WSGI & Paste Future. History. pyDAP is a free implementation of the Data Access Protocol written in Python from scratch
Implementing the Data Access Protocol in Python • Dr. Rob De Almeida
Table of Contents
• History
• Current implementation
• Client
• Server
• Plugins & responses
• WSGI & Paste
• Future
History
• pyDAP is a free implementation of the Data Access Protocol, written from scratch in Python
• It is the product of naïveté and determination :)
Why Python?
• Object-oriented, high-level programming language that prioritizes programmer effort over computer effort
• Increasing usage in science (CDAT, MayaVi) and on the web (Google, YouTube)
• Advantages: interpreter, batteries included, easy prototyping, dynamically typed, concise, fun
pyDAP 1.0
• Started in 2003
• “Afternoon project”: client only; downloaded data from the ASCII response and worked only with Grids and Arrays
• Reverse-engineering of the protocol
• Should've really been version 0.0.1
pyDAP 1.x
• Binary data using Python's xdrlib
• Server architecture based on a common core that could run as CGI, under Twisted, or using Python's BaseHTTPServer
pyDAP 2.0
• Complete rewrite, based on the DAP 2.0 specification draft
• Developed during the Google Summer of Code 2005
• Own implementation of XDR
• Server built on the WSGI specification*
• This should've been version 1.0
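To make "own implementation of XDR" concrete, here is a minimal sketch of XDR-style encoding using only Python's `struct` module. It is not pyDAP's code; the function names are hypothetical. It relies only on the general XDR rules: big-endian byte order, 4-byte alignment, and length-prefixed strings padded with zeros.

```python
import struct


def pack_int32(value):
    """Encode a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)


def pack_float64(value):
    """Encode an IEEE 754 double as 8 big-endian bytes."""
    return struct.pack(">d", value)


def pack_string(text):
    """Encode a string: 4-byte length, the bytes, zero padding to a multiple of 4."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def unpack_int32_array(buf, count):
    """Decode `count` consecutive big-endian 32-bit integers from `buf`."""
    return list(struct.unpack(">%di" % count, buf[:4 * count]))


encoded = b"".join(pack_int32(i) for i in range(3))
decoded = unpack_int32_array(encoded, 3)
```

Packing whole arrays with a single `struct` format string, as in `unpack_int32_array`, avoids a Python-level loop per value.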
pyDAP 2.1
• Fully buffered server, able to handle infinite datasets
• Automatic discovery of plugins
• Automatic installation of dependencies
• Runs with Python Paste*
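One way a server can cope with a dataset that never ends is to encode it lazily, in bounded chunks. The sketch below illustrates that idea with a generator; it is an assumption about the general technique, not a description of pyDAP's internals, and `endless_records` and `encode_stream` are hypothetical names.

```python
import itertools
import struct


def endless_records():
    """Hypothetical data source that never ends (for example, a live feed)."""
    for i in itertools.count():
        yield i


def encode_stream(records, chunk_size=4):
    """Yield big-endian int32 chunks, holding at most chunk_size records in memory."""
    records = iter(records)
    while True:
        chunk = list(itertools.islice(records, chunk_size))
        if not chunk:
            return
        yield struct.pack(">%di" % len(chunk), *chunk)


# The consumer decides how much to read; the source is never exhausted.
first_two = list(itertools.islice(encode_stream(endless_records()), 2))
```

Because a WSGI application returns an iterable of byte strings, a generator like `encode_stream` could be returned directly as a response body.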
pyDAP 2.2.5.8
• Released last Friday (2007-02-16)
• Approximately 3k LOC for client and server, including docstrings, comments and its own XDR implementation
• Support for additional plugins (for new data formats) and responses (for new output), both auto-discoverable
• Stub support for DDX on the client and server
Client
• Based on the httplib2 module
• HTTP / HTTPS
• Keep-Alive
• Auth: digest, basic, WSSE, HMAC digest
• Caching
• Compression: deflate, gzip
• Intuitive interface
Sample client session
Local file, using pynetcdf:
    >>> from pynetcdf import NetCDFFile
    >>> dataset = NetCDFFile("coads.nc")
    >>> sst = dataset.variables['SST']
    >>> print sst.shape
    (12, 90, 180)
    >>> print sst.dimensions
    ('TIME', 'COADSY', 'COADSX')
    >>> print sst[0,40,40]
    28.0669994354
Remote dataset, using dap.client:
    >>> from dap.client import open
    >>> dataset = \
    ...     open("http://server/coads.nc")
    >>> sst = dataset['SST']
    >>> print sst.shape
    (12, 90, 180)
    >>> print sst.dimensions
    ('TIME', 'COADSY', 'COADSX')
    >>> print sst[0,40,40]
    [[[ 28.06699944]]]
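Behind an expression like `sst[0,40,40]`, a DAP client has to ask the server for just that part of the array. The sketch below shows one way to turn Python indices into a DAP hyperslab of the form `[start:stride:stop]`, where `stop` is inclusive. The `hyperslab` helper is hypothetical, not pyDAP's API, and it assumes one index per dimension, non-negative integers, and non-empty slices with a positive step.

```python
def hyperslab(name, index, shape):
    """Build a DAP constraint such as SST[0:1:0][40:1:40][40:1:40]."""
    if not isinstance(index, tuple):
        index = (index,)
    parts = []
    for item, size in zip(index, shape):
        if isinstance(item, slice):
            start, stop, step = item.indices(size)
            last = stop - 1  # Python's stop is exclusive, DAP's is inclusive
        else:
            start, step, last = item, 1, item
        parts.append("[%d:%d:%d]" % (start, step, last))
    return name + "".join(parts)


point = hyperslab("SST", (0, 40, 40), (12, 90, 180))
region = hyperslab("SST", (slice(0, 2), slice(None), 40), (12, 90, 180))
```

The resulting string would be appended to the dataset URL after a `?`, so only the requested values travel over the network.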
Client usage
• Commonly used in scripts to automate downloading data from OPeNDAP servers and storing it in a different format
• Dapper-compliance validator for testing servers
Server
• “Writing a server is like writing a client backwards”
• Thin layer between plugins and responses (both auto-discoverable)
• Implemented as a WSGI application*
• Deployed using Paste Deploy*
Plugins and responses http://localhost:8080/file.nc.das
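A URL such as `/file.nc.das` carries two pieces of information: the file extension (`.nc`) selects a plugin, and the final suffix (`das`) selects a response. The sketch below illustrates that dispatch. The registries and the `split_request` function are hypothetical stand-ins; pyDAP discovers its plugins and responses automatically rather than from hard-coded tables.

```python
import posixpath

# Hypothetical registries, hard-coded here for illustration only.
PLUGINS = {".nc": "netcdf plugin", ".csv": "csv plugin"}
RESPONSES = {"das", "dds", "dods"}


def split_request(path):
    """Split '/file.nc.das' into (file path, plugin, response name)."""
    base, ext = posixpath.splitext(path)
    response = ext.lstrip(".")
    if response not in RESPONSES:
        raise ValueError("unknown response: %r" % response)
    plugin = PLUGINS.get(posixpath.splitext(base)[1])
    if plugin is None:
        raise ValueError("no plugin for: %r" % base)
    return base, plugin, response


request = split_request("/file.nc.das")
```

Keeping the two lookups independent is what lets any plugin be combined with any response.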
Installing plugins & responses
• pyDAP uses EasyInstall:
    easy_install dap.plugins.netcdf
    easy_install dap.responses.html
• Easy to create new plugins (for small values of “easy”):
    paster create -t dap_plugin myplugin
• Generates a template with skeleton code
• New plugins can be easily distributed
Available plugins
• CSV
• netCDF (reference implementation)
• SQL (compatible with most databases, but generates a “flat” dataset)
• Matlab 4/5
• GrADS grib
• HDF5 and GDAL (experimental)
• grib2? (Rob Cermak)
Available responses
• dds, das, dods
• ASCII variant
• HTML form
• JSON
• WMS / KML
• EditGrid / Google Spreadsheets
• netCDF?
JSON
• Lightweight alternative to XML for data exchange
• Based on a subset of JavaScript
• Easy to parse in the browser
• Parsers and generators for C, C++, C#, Java, Lisp, Lua, Objective-C, Perl, PHP, Python, Ruby, Squeak and several other languages
• Coincidentally, also close to a subset of Python: most JSON is valid Python code (the literals true, false and null are the main exceptions)
A JSON response
    Content-description: dods_json
    XDODS-Server: dods/2.0
    Content-type: application/json

    {"test": {"attributes": {"NC_GLOBAL": {}, "author": "Roberto De Almeida"}, "type": "Dataset", "a": {"type": "Int32", "shape": [10], "data": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}}}
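As a quick check of how easy this response is to consume, the sketch below parses the body shown on the slide with Python's standard `json` module. It also parses the same text as a Python literal with `ast.literal_eval`, which works for this particular body because it contains no `true`, `false` or `null`. The variable names are illustrative only.

```python
import ast
import json

# The response body from the slide, split across lines for readability.
body = (
    '{"test": {"attributes": {"NC_GLOBAL": {}, "author": "Roberto De Almeida"}, '
    '"type": "Dataset", '
    '"a": {"type": "Int32", "shape": [10], "data": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}}}'
)

dataset = json.loads(body)["test"]
variable = dataset["a"]

# The same text is also a valid Python dict literal.
as_python = ast.literal_eval(body)
```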
WMS
• Returns maps (images) from requested variables and regions
• Works with geo-referenced grids and sequences
• Layers can be composed together
• Data can be constrained:
    /coads.nc.wms?SST       // annual mean
    /coads.nc.wms?SST[0]    // January
WMS example request http://localhost:8080/netcdf/coads.nc.wms?LAYERS=SST&WIDTH=512
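A request like the one above can be assembled from a script with the standard library. This is only a sketch: `wms_url` is a hypothetical helper, and it uses just the two parameters that appear in the example URL (`LAYERS` and `WIDTH`).

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wms_url(base, **params):
    """Append URL-encoded query parameters to a WMS endpoint."""
    return base + "?" + urlencode(params)


url = wms_url("http://localhost:8080/netcdf/coads.nc.wms", LAYERS="SST", WIDTH=512)

# Parsing the URL back recovers the parameters (as lists of strings).
query = parse_qs(urlsplit(url).query)
```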
KML
• Generates an XML file in the Keyhole Markup Language that points to the WMS response
• Nice and simple interface for quickly visualizing data
WSGI
• Python Web Server Gateway Interface
• Simple and universal interface between web servers (like Apache) and web applications (like pyDAP)
• Allows the sharing of middleware between applications (gzip, authentication, caching, etc.)
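To show what "sharing middleware" means in practice, here is a minimal sketch of a WSGI application wrapped by a gzip middleware. Neither function comes from pyDAP or Paste; `app`, `gzip_middleware` and `call` are hypothetical names, and the middleware buffers the whole body for simplicity, which a production version would avoid.

```python
import gzip
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A trivial WSGI application with a placeholder body."""
    body = b"hello from a WSGI app\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def gzip_middleware(wrapped):
    """Compress the wrapped application's response if the client accepts gzip."""
    def middleware(environ, start_response):
        if "gzip" not in environ.get("HTTP_ACCEPT_ENCODING", ""):
            return wrapped(environ, start_response)
        captured = {}

        def capture(status, headers, exc_info=None):
            captured["status"], captured["headers"] = status, headers

        body = gzip.compress(b"".join(wrapped(environ, capture)))
        headers = [(k, v) for k, v in captured["headers"]
                   if k.lower() != "content-length"]
        headers += [("Content-Encoding", "gzip"),
                    ("Content-Length", str(len(body)))]
        start_response(captured["status"], headers)
        return [body]
    return middleware


def call(application, **extra):
    """Invoke a WSGI application in-process with a synthetic request."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update(extra)
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"], result["headers"] = status, dict(headers)

    result["body"] = b"".join(application(environ, start_response))
    return result


plain = call(gzip_middleware(app))
zipped = call(gzip_middleware(app), HTTP_ACCEPT_ENCODING="gzip")
```

The middleware knows nothing about the application it wraps, which is why the same component can sit in front of any WSGI application.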
Paste & Paste Deploy
• Python modules that facilitate the development and deployment of web applications
• Allow pyDAP to be deployed using a simple INI file that specifies server, middleware and application configuration
Running a server
    [server:main]
    use = egg:PasteScript#wsgiutils
    host = 127.0.0.1
    port = 8080

    [filter-app:main]
    use = egg:Paste#httpexceptions
    next = pyDAP

    [app:pyDAP]
    use = egg:dap
    name = Test DAP server
    root = %(here)s/data
    verbose = 0
    template = %(here)s/template
    x-wsgiorg.throw_errors = 1
    dap.responses.kml.format = image/png
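To see how the three sections fit together, the sketch below reads the same configuration with Python's standard `configparser`. This is only a stand-in for illustration: Paste Deploy has its own loader, which supplies `here` as the directory of the INI file. The value `/srv/dap` is an assumed placeholder.

```python
import configparser

CONFIG = """\
[server:main]
use = egg:PasteScript#wsgiutils
host = 127.0.0.1
port = 8080

[filter-app:main]
use = egg:Paste#httpexceptions
next = pyDAP

[app:pyDAP]
use = egg:dap
name = Test DAP server
root = %(here)s/data
verbose = 0
template = %(here)s/template
x-wsgiorg.throw_errors = 1
dap.responses.kml.format = image/png
"""

# "here" is assumed; Paste Deploy would set it to the INI file's directory.
parser = configparser.ConfigParser(defaults={"here": "/srv/dap"})
parser.read_string(CONFIG)

server = parser["server:main"]
# The filter names the application it wraps through its "next" key.
app = parser["app:" + parser["filter-app:main"]["next"]]
address = (server["host"], server.getint("port"))
```

Read this way, the file describes a chain: the server hands requests to the `httpexceptions` filter, which passes them on to the pyDAP application.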
Future
• pyDAP 2.3 almost ready
• Dapper compliance
• Faster XDR encoding/decoding
• Initial support for DDX response and parser
• Build a rich web interface (AJAX) based on the JSON + WMS + KML responses
• Not only for pyDAP, but also for other OPeNDAP servers, using pyDAP as a proxy
• Rename “pyDAP” to some cute furry animal from South America?
Acknowledgments
• OPeNDAP, for all the support
• James Gallagher, for answering all my questions about the spec on the mailing list
• Everybody who submitted bugs (bonus points for submitting patches!)