
UncertWeb lessons learnt

Dan Cornford and the UncertWeb team, Computer Science, Aston University, Birmingham, United Kingdom. UncertWeb Tools Workshop, 10 Jan 2013, Aston.



Presentation Transcript


  1. UncertWeb lessons learnt. Dan Cornford and the UncertWeb team, Computer Science, Aston University, Birmingham, United Kingdom. UncertWeb Tools Workshop, 10 Jan 2013, Aston.

  2. UncertWeb successes
     • Tools and other software:
       • hope you found them useful!
       • they will outlive the project
       • but maintaining online web-based tools is not easy … in a university once the authors have gone!
     • Applications:
       • four really strong examples
       • plus an integration example
       • but they don’t always use the tools or infrastructure!
       • sometimes it is easier to re-invent the wheel, and not all tools are always needed!

  3. UncertWeb successes - II
     • Information models:
       • profiles for O&M and GML are supported by our tools
       • for spatially and temporally distributed data
       • UncertML provides a useful vocabulary
       • NetCDF scales to larger data sets when gridded
       • a unified approach (+ tool support) to representing all types of data would be very useful, but quite a challenge!
     • Infrastructure and services:
       • improvements to web service deployment suggested
       • brokering approach is a sound architecture

  4. Did we create the ‘Model Web’?
     • No – there are only a few models ‘out there’
       • model ‘interfaces’ and deployments are complicated and variable, and typically not designed to integrate
       • this is just a problem – most models were not written for the web
       • each model is written uniquely, making automated exposure hard
       • sometimes we used encodings that were too complicated (O&M / GML)
       • always use the simplest encoding – if it is a scalar use a simple type; use O&M for e.g. spatially / temporally referenced data
     • Standards / frameworks should be distilled from implementations
       • should be developed during / after implementation and not before as a theoretical exercise

  5. Exposing models on the web
     • Exposing real models is hard
     • UncertWeb profile provides a range of data types, but:
       • tool support (e.g. conversion to/from commonly used types) and actual usage are still very low!
       • mapping from model inputs / outputs to encodings is time consuming, manual and complicated
       • it is not always clear, or unique, how to encode all inputs / outputs
       • and which inputs / outputs are best exposed on the web interface?
     • We spent far more time on this than anticipated
     • Very hard to automate!

  6. UncertWeb encodings
     • For big data, ASCII and XML are too verbose, and so is JSON
       • binary encodings are needed for big models, and some are simply not suited to web deployment
       • bring the model to the data? e.g. as done with R scripts?
       • when working with big data, avoid thin clients doing lots of data handling!
     • Use appropriately simple types for encoding model inputs and outputs
       • scalar, vector, matrix (+ type) cover most inputs
       • O&M / GML add value to spatial and temporal data
       • NetCDF is useful for gridded data
       • future will ideally combine O&M and NetCDF (and UncertML)
       • some tools only support simple encodings anyway!
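The advice on slide 6 (simple typed encodings, and text being too verbose for big data) can be illustrated with a minimal Python sketch. The dictionary layout (`name`, `kind`, `dtype`, `shape`, `values`) and the function names `encode_simple` and `text_vs_binary_size` are hypothetical and are not part of any UncertWeb profile; the second function only compares the size of a JSON text encoding with the same numbers stored as raw 64-bit floats.

```python
import json
from array import array


def encode_simple(name, value):
    """Wrap a scalar, vector or matrix in a small typed dict (hypothetical layout)."""
    if isinstance(value, (int, float)):
        return {"name": name, "kind": "scalar", "dtype": "float64",
                "shape": [], "values": [float(value)]}
    if len(value) > 0 and isinstance(value[0], (list, tuple)):
        cols = len(value[0])
        if any(len(row) != cols for row in value):
            raise ValueError("matrix rows must all have the same length")
        flat = [float(x) for row in value for x in row]  # row-major order
        return {"name": name, "kind": "matrix", "dtype": "float64",
                "shape": [len(value), cols], "values": flat}
    return {"name": name, "kind": "vector", "dtype": "float64",
            "shape": [len(value)], "values": [float(x) for x in value]}


def text_vs_binary_size(values):
    """Return (bytes as JSON text, bytes as raw 64-bit floats) for a list of numbers."""
    text_bytes = len(json.dumps(values).encode("utf-8"))
    binary_bytes = len(array("d", values).tobytes())
    return text_bytes, binary_bytes


if __name__ == "__main__":
    print(json.dumps(encode_simple("rainfall", [[1, 2], [3, 4]])))
    print(text_vs_binary_size([i / 7 for i in range(1000)]))
```

For spatially or temporally referenced data a richer encoding such as O&M or NetCDF would still be the better fit; the sketch only covers the scalar / vector / matrix case.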

  7. UncertWeb Architecture
     • This was a challenge
     • OGC stack is complex
       • and not well enough supported by complete implementations
       • we have defined our own profiles – this is a big contribution
     • WPS too generic
       • provide richer description (more metadata)
     • Our proposed annotation of services needs further testing and refinement
     • Still no ‘universal’ modelling framework out there
       • brokering approach a good idea, but requires more work
       • bit ‘chicken and egg’ – we hope good tools make it worth exposing models, but exposing models remains time consuming
       • diversity likely to remain, as in programming languages but worse!

  8. Uncertainty in Environmental Data and Models
     • UncertML is useful and will outlive the project
       • might need one more tweak to be optimal!
       • does not scale to massive data, and a better separation between the dictionary and encoding would be a plus
       • core idea is valid, probabilistic approach is right
     • Quantifying uncertainty on all inputs and models proved challenging!
       • tools help, but managing resulting uncertainties not easy
       • many uncertainties treated rather superficially
       • still a lot to do on quantifying uncertainties in models and data
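As an illustration of the probabilistic approach mentioned on slide 8, the following minimal sketch propagates input uncertainty through a model by Monte Carlo sampling. Everything specific here is an assumption made for the example: `toy_model` is a hypothetical stand-in for a real model, the two inputs are assumed to be independent and Gaussian, and the returned dictionary is just a plain summary, not an UncertML encoding.

```python
import random
import statistics


def toy_model(rain, coeff):
    """Hypothetical stand-in for a real environmental model: runoff = coeff * rain."""
    return coeff * rain


def propagate(n_samples=10000, seed=42):
    """Sample uncertain inputs, run the model on each sample, summarise the outputs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outputs = []
    for _ in range(n_samples):
        rain = rng.gauss(10.0, 2.0)    # assumed input uncertainty: N(10, 2^2)
        coeff = rng.gauss(0.5, 0.05)   # assumed parameter uncertainty: N(0.5, 0.05^2)
        outputs.append(toy_model(rain, coeff))
    cuts = statistics.quantiles(outputs, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return {
        "mean": statistics.fmean(outputs),
        "sd": statistics.stdev(outputs),
        "q05": cuts[0],
        "q95": cuts[-1],
    }


if __name__ == "__main__":
    print(propagate())
```

The sketch ignores model discrepancy and input correlations, which is exactly the kind of superficial treatment the slide warns about; it shows the mechanics only.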

  9. Research funding
     • Research funding too short term
       • if you really want to support the “model web”:
       • encourage global cooperation, but find a leader to push this and then fund them
       • make sure they have a vision but also listen!
       • fund it for the long term – make a decision and stick with this, maybe with some heavy management!
       • involve companies who can actually develop and maintain commercial strength software – be prepared to fund them
       • universities are good at developing prototypes, but cannot maintain things longer term
       • funding for open source / free use ... commercial companies then add value
       • fund things that integrate, not develop new solutions
       • the challenge now is one of software engineering in my view, not theory

  10. Open questions
      • Need to make it easier to expose models on the web
        • tools to convert to web formats (JSON / O&M / NetCDF)
        • scalability, both of models and data models
        • consider chunking big data, parallel execution, automated replication of model instances
        • semantics and automating workflow composition
        • reliability, maintainability, security
      • Tools further developed
        • Greenland (vis client) … others?
      • More complex real model workflows
        • user driven real workflows will raise more issues!
        • time stepping, model discrepancy, calibration, data assimilation, uncertainty quantification
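The suggestion on slide 10 to chunk big data and run model instances in parallel can be sketched as follows. This is a minimal illustration under stated assumptions: `run_model` is a hypothetical stand-in for one model instance, and a thread pool from the Python standard library stands in for whatever replication mechanism a real deployment would use (for CPU-bound models, separate processes or separate service instances would be more appropriate).

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Yield consecutive slices of `values`, each with at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(values), size):
        yield values[start:start + size]


def run_model(chunk):
    """Hypothetical stand-in for one model instance processing one chunk."""
    return [2.0 * x + 1.0 for x in chunk]


def run_chunked(values, size, workers=4):
    """Run the model on each chunk in parallel and stitch the results back in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(run_model, chunked(values, size))  # map preserves chunk order
        return [y for part in parts for y in part]


if __name__ == "__main__":
    print(run_chunked(list(range(10)), size=3))
```

This only works directly when chunks are independent; time stepping or spatially coupled models would need the chunks to exchange state.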

  11. What next?
      • Other initiatives: PURE Experimental Zone
        • EVO …
        • other projects?
      • We’ll do our best to maintain and enhance the tools
        • all are open source, so you can take them on too!
        • we hope to have new projects in the future to further develop the tools and basic technology
        • linking with other frameworks, e.g. OpenMI attractive … but not funded!

  12. Summary
      • Keep in touch
        • we want the tools to be used, and we want to find out when they do and don’t work!
        • you can help – everything is open source
      • Thanks for attending
        • hope you found it useful, or at least interesting …

      The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement n° [248488].
