
Webification of Science Data

Zhangfan Xing, Jet Propulsion Laboratory. Webification of Science Data. July 10, 2014. Scope of our work: enable digital resources for the web platform through resource virtualization, which comprises webification (w10n), the exposure of data, and servicification (serv10n), the exposure of libraries and executables.


Presentation Transcript


  1. Webification of Science Data
     Zhangfan Xing, Jet Propulsion Laboratory
     July 10, 2014

  2. Scope of Our Work
     Enable digital resources for the web platform:
     • Resource Virtualization
       • Webification (w10n): exposure of data
       • Servicification (serv10n): exposure of libraries and executables
     • Resource Visualization
       • Libraries and widgets
       • End-user applications
     • Resource Discovery
     Examples of resources:
     • directories, files, databases, remote services, etc.
     • class methods, scripts, command-line executables, etc.
     This talk covers only w10n of science data.

  3. W10n Specification (I)
     Idea: An arbitrary data store is virtualized as a tree. Its inner components, such as attributes and data arrays, are made directly addressable and accessible by meaningful URLs. They are exposed in a fully ReSTful way.
     Summary:
     • A resource is viewed as a tree of nodes and leaves.
     • Nodes and leaves are accessible via semantic URLs.
     • A node has meta info; a leaf has meta and data info.
     • Fully ReSTful-style HTTP request/response, read and write.
     Where it has been applied:
     • Science: Earth, Planetary, Astronomy, Heliophysics, etc.
     • Mission operation: engineering, system control, etc.
     • Business information: documents, databases, etc.

  4. W10n Specification (II)
     URL syntax: conventional, with two extensions
     • Forward slash '/' denotes meta info of a node or leaf
     • Square-bracket pair '[…]' denotes data info of a leaf
     node meta: http://host:port/path/store/../node/?query_string
     leaf meta: http://host:port/path/store/../node/leaf/?query_string
     leaf data: http://host:port/path/store/../node/leaf[indexer]?query_string
     where indexer denotes a portion of the leaf data info.
     Request:
     • Standard HTTP methods: GET, PUT, etc.
     Response:
     • Meta info always in JSON; data info in JSON or other formats.
     Node meta info: {'name':string, 'attributes':[...], 'nodes':[...], 'leaves':[...], 'w10n':[...]}
     Leaf meta info: {'name':string, 'attributes':[...], 'type':string, 'w10n':[...]}
     Leaf data info: {'name':string, 'type':string, 'data':..., 'w10n':[...]}
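The three URL forms above can be composed mechanically. The following is a minimal Python sketch, not part of the w10n specification: the helper name `w10n_url`, its parameters, and the `example.com` store URL are assumptions made for illustration. A trailing '/' is appended for meta info and a '[indexer]' suffix for data info.

```python
from urllib.parse import urlencode


def w10n_url(store_url, path=(), indexer=None, **query):
    """Compose a w10n-style URL (hypothetical helper, for illustration only).

    store_url: URL of the data store, e.g. http://host:port/path/store
    path:      names leading from the store root to a node or leaf
    indexer:   None for meta info (trailing '/'),
               a string for data info ('[indexer]'; '' selects '[]')
    query:     optional query-string parameters, e.g. output="json"
    """
    url = "/".join([store_url.rstrip("/"), *path])
    url += "/" if indexer is None else f"[{indexer}]"
    if query:
        url += "?" + urlencode(query)
    return url


if __name__ == "__main__":
    store = "http://example.com/path/store"  # hypothetical endpoint
    print(w10n_url(store, ["node"]))                      # node meta
    print(w10n_url(store, ["node", "leaf"]))              # leaf meta
    print(w10n_url(store, ["node", "leaf"], indexer=""))  # leaf data
```

The indexer is passed through unescaped because the slides show it written literally inside the brackets; whether a given server also accepts a percent-encoded indexer is not stated in the talk.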

  5. A Simple W10n Example
     File system directory as a data store:
     • Sub-directories → w10n nodes
     • File entries → w10n leaves
     • Attributes include timestamp, size, etc.
     More URL syntactic sugar: the syntax is extended to support glob (aka wildcard) patterns.
     http://example.com/test/data/
     http://example.com/test/data/*/
     http://example.com/test/data/*/*/
     http://example.com/test/data/*/*/*/
     http://example.com/test/data/*/*p*/*/
     http://example.com/test/data/[a-zA-Z]*[0-9]/*/
     http://example.com/test/data/[a-zA-Z]*[0-9]/*.nc/
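One way to picture how such glob URLs select nodes and leaves is to match the pattern one path segment at a time. The sketch below uses Python's standard `fnmatch` module against a small in-memory tree; the tree contents and the `expand` helper are hypothetical, and the talk does not say how any actual w10n server implements glob expansion.

```python
from fnmatch import fnmatchcase

# Hypothetical directory tree: a dict is a sub-directory (w10n node),
# None is a file entry (w10n leaf).
TREE = {
    "a1": {"x.nc": None, "y.txt": None},
    "b2": {"z.nc": None},
    "tmp": {"w.nc": None},
}


def expand(tree, pattern):
    """Return the paths in `tree` matched by a '/'-separated glob pattern.

    Each segment is matched separately, so '*' never crosses a '/'.
    A trailing '/' (the w10n meta marker) is ignored in this sketch.
    """
    segments = [s for s in pattern.split("/") if s]
    matches = [("", tree)]
    for segment in segments:
        next_matches = []
        for prefix, node in matches:
            if not isinstance(node, dict):
                continue  # a leaf has no children to descend into
            for name, child in sorted(node.items()):
                if fnmatchcase(name, segment):
                    next_matches.append((f"{prefix}/{name}", child))
        matches = next_matches
    return [path for path, _ in matches]


if __name__ == "__main__":
    print(expand(TREE, "*/"))
    print(expand(TREE, "[a-zA-Z]*[0-9]/*.nc/"))
```

Matching per segment is a deliberate choice here: `fnmatch` on a whole path would let '*' match across '/' separators, which would blur the node/leaf levels that the URLs above keep distinct.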

  6. W10n of Science Data (I)
     This is w10n applied to science data stores, hence w10n-sci.
     Science data store examples:
     • A remote sensing granule file in HDF format
     • A climate model output in NetCDF format
     • A CSV file with field work records
     • A MySQL database with in-situ observations
     • A data model, e.g., GDAL, ImageIO, etc.
     • A remote service such as OPeNDAP
     W10n-sci maps an arbitrary science data store into a hierarchical tree, with leaves that are multi-dimensional arrays of
     • primitive type, e.g., int16, float32, etc.
     • composite type, e.g., {int16, float32}

  7. W10n of Science Data (II)
     Request:
     • HTTP GET method with unambiguous w10n URLs:
       node meta: http://host:port/path/store/../node/?query_string
       leaf meta: http://host:port/path/store/../node/leaf/?query_string
       leaf data: http://host:port/path/store/../node/leaf[indexer]?query_string
     • where indexer can be:
       • range: start:stop:step,start:stop:step,…
       • list: n0,n1,n2,…
       • specific constraint: -20<=lon<=20, -45<=lat<=45, quality>3
       • any declarative language you want to support
     • and query_string can be output=*, traverse, flatten, etc.
     Response:
     • Meta info always in JSON
     • Data info in JSON, big/little-endian binary, or NetCDF
     Leaf (array) meta info: {'name':string, 'attributes':[...], 'type':string or {...}, 'shape':[...], 'w10n':[...]}
     Leaf (array) data info: {'name':string, 'type':string or {...}, 'data':..., 'w10n':[...]}
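To make the range form of the indexer concrete, here is a small Python sketch that parses `start:stop:step,start:stop:step` into one slice per dimension and applies it to a two-dimensional array held as nested lists. It assumes Python-style slice semantics (zero-based, stop exclusive, step optional), which the slides do not confirm; the function names are hypothetical, and the list and constraint forms are not handled.

```python
def parse_range_indexer(indexer):
    """Parse 'start:stop[:step],start:stop[:step],...' into a tuple of slices.

    Assumption: Python-like semantics (zero-based, stop exclusive).
    Only the range form is handled; anything else raises ValueError.
    """
    slices = []
    for dim in indexer.split(","):
        fields = dim.split(":")
        if not 2 <= len(fields) <= 3:
            raise ValueError(f"not a range indexer: {dim!r}")
        # An empty field (e.g. ':10') leaves that bound open.
        bounds = [int(f) if f.strip() else None for f in fields]
        slices.append(slice(*bounds))
    return tuple(slices)


def take_2d(rows, slices):
    """Apply a (row_slice, col_slice) pair to a 2-D array of nested lists."""
    row_slice, col_slice = slices
    return [row[col_slice] for row in rows[row_slice]]


if __name__ == "__main__":
    # Toy 6 x 10 grid standing in for a leaf's data array.
    grid = [[r * 10 + c for c in range(10)] for r in range(6)]
    print(take_2d(grid, parse_range_indexer("1:5:2,0:10:3")))
```

A server would normally push such slices down to the underlying format library rather than load the whole array first; the nested-list version is only meant to show what the indexer selects.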

  8. A W10n-Sci Example
     Extensions to classic URL syntax:
     • forward slash '/' denotes meta info
     • square-bracket pair '[]' denotes data info
     Color key used on the slide:
     • blue: classic URL path
     • red: w10n meta/data info identifier
     • green: classic URL query string
     • orange: w10n data indexer
     sample.h5 === SMAP_L3_SM_P_20010501_D04003_001.h5
     Meta of root node:
       http://…/sample.h5/
     Meta of sub nodes:
       http://…/sample.h5/Metadata/
       http://…/sample.h5/Metadata/AcquisitionInformation/radiometer/
     Meta of a sub node containing leaves:
       http://…/sample.h5/Soil_Moisture_Retrieval_Data/
     Meta of a leaf:
       http://…/sample.h5/Soil_Moisture_Retrieval_Data/soil_moisture/
     Data of a leaf:
       http://…/sample.h5/Soil_Moisture_Retrieval_Data/soil_moisture[]
       http://…/sample.h5/Soil_Moisture_Retrieval_Data/soil_moisture[]?output=json
       http://…/sample.h5/Soil_Moisture_Retrieval_Data/soil_moisture[]?output=nc
     Sliced data of a leaf:
       http://…/sample.h5/Soil_Moisture_Retrieval_Data/soil_moisture[0:406,0:964]
       http://…/sample.h5/Soil_Moisture_Retrieval_Data/soil_moisture[10:20:2,60:80:3]
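The four parts of such a URL (path, meta/data identifier, indexer, query string) can be separated again with standard tools. The sketch below is one possible reading of the syntax, using Python's `urllib.parse`; the `split_w10n_url` helper is hypothetical, and `example.com` stands in for the host that is elided on the slide.

```python
import re
from urllib.parse import parse_qs, urlsplit

# A path ending in '[...]' requests data; the bracket content is the indexer.
_DATA_SUFFIX = re.compile(r"(?P<path>.*?)\[(?P<indexer>[^\[\]]*)\]")


def split_w10n_url(url):
    """Split a w10n-style URL into kind, path, indexer, and query parameters.

    Returns a dict with 'kind' set to 'data' (path ends in '[...]') or
    'meta' (path ends in '/'). Raises ValueError for any other path.
    """
    parts = urlsplit(url)
    match = _DATA_SUFFIX.fullmatch(parts.path)
    if match:
        kind, path, indexer = "data", match["path"], match["indexer"]
    elif parts.path.endswith("/"):
        kind, path, indexer = "meta", parts.path[:-1], None
    else:
        raise ValueError("path ends in neither '/' nor '[...]'")
    return {"kind": kind, "path": path, "indexer": indexer,
            "query": parse_qs(parts.query)}


if __name__ == "__main__":
    base = "http://example.com/sample.h5"  # hypothetical host
    print(split_w10n_url(base + "/Soil_Moisture_Retrieval_Data/soil_moisture/"))
    print(split_w10n_url(
        base + "/Soil_Moisture_Retrieval_Data/soil_moisture[10:20:2,60:80:3]?output=json"))
```

Note that the URL alone does not say whether a meta request targets a node or a leaf; that only becomes known once the server resolves the path against the data store.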

  9. Selected Implementations
     Server:
     • Pomegranate (http://pomegranate.nasa.gov)
       Formats supported: NetCDF, HDF 4/5, GRIB, FITS, etc.
       Output formats: NetCDF, big/little-endian binary, JSON.
     • Juneberry
       Formats supported: Vicar/PDS, FITS, TIFF, JPEG, GIF, etc.
       Output formats: common image formats such as GIF, PNG, etc.
     Client:
     Any HTTP-aware programming language or environment works, e.g.,
     • command-line tools, e.g., curl and wget
     • Python, PHP, Java, and JavaScript
     • MATLAB and IDL
     • Mobile applications
     • Advanced HTML5 applications such as http://rex.jpl.nasa.gov

  10. Enable Earth Science Data via W10n-Sci
      Contact: xing@jpl.nasa.gov
      Step 1. Install Server
      • Visit http://pomegranate.nasa.gov to install and configure Pomegranate.
      Or, alternatively,
      • Download Taiga, a turnkey solution, from http://scifari.org/taiga.
      • Run the command taiga-service config with a directory of your data files.
      • Run the command taiga-service start to start up the web service.
      Step 2. Use Your Browser
      • Type in the service endpoint URL from Step 1 above
      • Explore
      Or, alternatively,
      • Go to http://rex.jpl.nasa.gov
      • "Tools" -> "File Finder"
      • Type in the URL reported by the last command in Step 1
      • Explore and plot
      Helpful URLs:
      • http://data.jpl.nasa.gov/earth-science
      • http://data.jpl.nasa.gov/earth-help
      • http://data.jpl.nasa.gov/planetary-science
      • http://data.jpl.nasa.gov/planetary-help

  11. Acknowledgements
      Over the years, our effort has been kindly supported by many NASA/JPL-sponsored projects and tasks, including
      • ACCESS/Altimetry Service and Tools
      • AIST/OSCAR
      • MLS/Task Plan #88-9539 Rev A
      • MGSS-IOS/Web services
      • MSL/OPGS
      • PO.DAAC/Technology
      • SMAP/PHASE CD Implementation
      • PDS System Operation

  12. Simple, but not simpler! Thank you very much! Questions? Contact: xing@jpl.nasa.gov
