
SimDAL discussions

Explore the integration of SimDAL and SimTAP for easy simulations access, metadata publication, and data query. Discuss critical design points, SimDM utilization, and the benefits and limitations of SimTAP.

rkillinger


Presentation Transcript


  1. SimDAL discussions • NCSA, Urbana, May 22nd 2012 • David Languignon

  2. Content • (Re) introduce SimDAL • Highlight critical design points to discuss today

  3. SimDAL goals • publish simulation results/metadata in an easy way • access simulation results/metadata in an easy and standard way • same for ‘raw’ simulation material (raw code output, etc.)

  4. Proposal Overview • SimDAL components Registries

  5. Discovery Use Case Summary • Simulation discovery • What kind of model is being offered? • What parameters characterize the model? • What is the physical meaning of those parameters? • Discovery of a simulation’s output datasets • What results can be retrieved? • What kind of results can be retrieved (meaning)? • All the information is available in SimDM

  6. Problem • SimDM is hard to • Fill data in (publisher) • Query (user)

  7. SimTAP to the rescue • “Formally, SimTAP is a TAP service on top of a table schema that is constrained by one or more instances of the SimDM:resource/protocol/Protocol class defined in the simulation data model” • as formulated by G. Lemson

  8. SimTAP to the rescue • a protocol description file (SimDM Protocol entity) • + the corresponding xsd (SimDM XML serialization) • An algorithm to make a Simple, Flat DataModel from Protocol description • a TAP service to access the new DataModel implementation • a mapping between : • the new DataModel class attributes (# tables fields) • the original Protocol elements

  9. SimTAP daily use • Query (users) must be easy • simple TAP query (flat model) • use of the metadata in protocol.xml + mapping file (semantics) • Publication (publishers) must be easy • protocol.xml file • mapping file • plain-text data file used to fill the database (CSV-like)

  10. SimTAP pros • Mainly intended for single (flat) table queries • tiny subset of SQL (ADQL): easy to use! • fast, no joins (data-warehouse-like schema) • tables easy to fill in • Publisher only has to make 2 simple files • TAP compliant • effort/code re-use! • compliant with VO tools (TOPCAT, etc.)
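A single flat-table query of the kind described on slides 9 and 10 might look like the sketch below. It only builds the URL for a synchronous TAP request and does not contact any server. The service root `http://example.org/simtap` is a placeholder, and the table and column names are borrowed from the preview URLs on slide 17 on the assumption that they match the flat SimTAP schema.

```python
from urllib.parse import urlencode

# Hypothetical service root: the slides do not give a SimTAP endpoint.
TAP_BASE_URL = "http://example.org/simtap"


def build_sync_query_url(base_url, adql):
    """Return the URL of a synchronous TAP request for one ADQL statement."""
    params = {
        "REQUEST": "doQuery",  # TAP 1.0 request type for queries
        "LANG": "ADQL",
        "FORMAT": "votable",
        "QUERY": adql,
    }
    return f"{base_url}/sync?{urlencode(params)}"


# One flat table, no joins: a few columns and a single constraint.
# Table and column names are assumed from the slide 17 preview URLs.
adql = (
    "SELECT x, y, z, mass, npart "
    "FROM simtap.objects_34_halo "
    "WHERE npart > 2e4"
)
url = build_sync_query_url(TAP_BASE_URL, adql)
print(url)
```

The same ADQL string could be pasted into a TAP-aware VO tool instead of being sent by hand.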

  11. SimTAP cons • another DM • may miss some SimDM information

  12. Access raw data: cutout • huge data: need subset extraction • UWS service: async for large data extraction • Specifications • cutout: string list list list -> dataset • cutout(dataset_id, attributes_list, attributes_restrictions_list, options_list) • extracts a subdataset of dataset_id, restricted according to attributes_restrictions_list, in which only the attributes in attributes_list are present on the subdataset's objects; the provided options are then applied

  13. cutout : example
      original_dataset <- { id: Halo23_ramses_34, data: [
        {mass:1.23e2, nbr_part:3.45e5, ener_pot:2.01, x:1, y:2, z:0},
        {mass:1.03e2, nbr_part:2.89e5, ener_pot:1.71, x:23, y:4, z:4},
        {mass:3.673e3, nbr_part:9.45e5, ener_pot:2.41, x:4, y:5, z:3},
        {mass:1.2e1, nbr_part:1.45e3, ener_pot:0.81, x:3, y:7, z:3}
      ] }
      attribute_list <- [mass, nbr_part]
      attribute_restriction_list <- [
        {attribute: x, condition: gt, restriction: 0},
        {attribute: x, condition: lt, restriction: 15},
        {attribute: y, condition: gt, restriction: 3},
        {attribute: y, condition: lt, restriction: 8},
        {attribute: z, condition: gt, restriction: 2},
        {attribute: z, condition: lt, restriction: 4},
        {attribute: mass, condition: ordered, restriction: asc}
      ]
      cutout(Halo23_ramses_34, attribute_list, attribute_restriction_list) should produce:
      [ {mass:1.2e1, nbr_part:1.45e3}, {mass:3.673e3, nbr_part:9.45e5} ]
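The cutout semantics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the service implementation: it takes the dataset itself rather than a dataset_id, ignores options_list, and supports only the gt, lt and ordered conditions that appear in the example. Following the slide 12 specification, it keeps only the attributes named in attribute_list.

```python
import operator

# Comparison conditions used in the slide example; "ordered" is handled apart.
CONDITIONS = {"gt": operator.gt, "lt": operator.lt}


def cutout(dataset, attribute_list, attribute_restriction_list):
    """Filter the rows, apply any ordering, then project on attribute_list."""
    rows = list(dataset["data"])
    # 1. Keep only the rows that satisfy every comparison restriction.
    for r in attribute_restriction_list:
        if r["condition"] in CONDITIONS:
            test = CONDITIONS[r["condition"]]
            rows = [row for row in rows
                    if test(row[r["attribute"]], r["restriction"])]
    # 2. Apply "ordered" restrictions (asc or desc) on the surviving rows.
    for r in attribute_restriction_list:
        if r["condition"] == "ordered":
            rows = sorted(rows,
                          key=lambda row, name=r["attribute"]: row[name],
                          reverse=(r["restriction"] == "desc"))
    # 3. Keep only the requested attributes of each object.
    return [{name: row[name] for name in attribute_list} for row in rows]


original_dataset = {
    "id": "Halo23_ramses_34",
    "data": [
        {"mass": 1.23e2, "nbr_part": 3.45e5, "ener_pot": 2.01, "x": 1, "y": 2, "z": 0},
        {"mass": 1.03e2, "nbr_part": 2.89e5, "ener_pot": 1.71, "x": 23, "y": 4, "z": 4},
        {"mass": 3.673e3, "nbr_part": 9.45e5, "ener_pot": 2.41, "x": 4, "y": 5, "z": 3},
        {"mass": 1.2e1, "nbr_part": 1.45e3, "ener_pot": 0.81, "x": 3, "y": 7, "z": 3},
    ],
}
attribute_list = ["mass", "nbr_part"]
attribute_restriction_list = [
    {"attribute": "x", "condition": "gt", "restriction": 0},
    {"attribute": "x", "condition": "lt", "restriction": 15},
    {"attribute": "y", "condition": "gt", "restriction": 3},
    {"attribute": "y", "condition": "lt", "restriction": 8},
    {"attribute": "z", "condition": "gt", "restriction": 2},
    {"attribute": "z", "condition": "lt", "restriction": 4},
    {"attribute": "mass", "condition": "ordered", "restriction": "asc"},
]

result = cutout(original_dataset, attribute_list, attribute_restriction_list)
print(result)
```

Only the last two objects pass all six range restrictions, and ordering by ascending mass puts the mass 1.2e1 object first.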

  14. Discussion: cutout • Which output data format should we standardize? • VOTable • FITS • HDF5 • VTK

  15. Registries Discussion: What to put in registries? • “SimDAL” service URL • SKOS concepts list • redundant with protocol.xml but allows a faster, direct search at registry level • protocol.xml (see SimTAP presentation)

  16. Discussion: Preview • Should we define a preview feature in (Sim)TAP? • per-column preview (preview field in TAP_SCHEMA) • per-row preview (column name standardized but not mandatory) • URL pointing to • a file displayable in a web browser • an XML file (Datalink?) listing several browser-displayable files • VOTable integration? • through VOTable LINK with content-role = “preview”
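The last option, a VOTable LINK carrying content-role = “preview”, could be produced as in the sketch below. The helper name `add_preview_link`, the table name and the target URL are hypothetical, and “preview” is the value proposed on the slide, not one taken from the VOTable standard.

```python
import xml.etree.ElementTree as ET


def add_preview_link(parent, href, content_type="application/xml"):
    """Attach a LINK element flagged as a preview to a VOTable element."""
    link = ET.SubElement(parent, "LINK")
    link.set("content-role", "preview")  # value proposed on the slide
    link.set("content-type", content_type)
    link.set("href", href)
    return link


# Hypothetical table and preview URL, for illustration only.
table = ET.Element("TABLE", {"name": "objects_34_halo"})
add_preview_link(table, "http://example.org/preview/objects_34_halo.xml")
serialized = ET.tostring(table, encoding="unicode")
print(serialized)
```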

  17. preview : example
      <?xml version="1.0" encoding="utf-8"?>
      <DATALINK>
        <LINK>
          <URL>http://roxxor.obspm.fr/deuvo-ui/dfiles//simtap.objects_34_halo/votable?select=x%2Cy%2Cz%2Cmass%2Cnpart&amp;where=npart+%3E+2e4</URL>
          <MIME>application/xml</MIME>
          <DESCRIPTION>subdataset of the fof halo finder postprocessing on top of a Ratra-Peebles universe simulation (boxlength 162, resolution 1024, z=1.5) output. Constraints are number of particles gt 2e4</DESCRIPTION>
          <SIZE>unknown</SIZE>
        </LINK>
        <LINK>
          <URL>http://roxxor.obspm.fr/deuvo-ui/dfiles//simtap.objects_32_halo/votable?select=x%2Cy%2Cz%2Cmass%2Cnpart&amp;where=npart+%3E+2e4</URL>
          <MIME>application/xml</MIME>
          <DESCRIPTION>subdataset of the fof halo finder postprocessing on top of a Ratra-Peebles universe simulation (boxlength 162, resolution 1024, z=2.33) output. Constraints are number of particles gt 2e4</DESCRIPTION>
          <SIZE>unknown</SIZE>
        </LINK>
      </DATALINK>
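A client could read a preview document of this shape with the Python standard library, as sketched below. The element names (DATALINK, LINK, URL, MIME, DESCRIPTION, SIZE) come from the slide. The sample URLs and descriptions are shortened placeholders, and the function name `list_previews` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the slide's document; URLs are placeholders.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<DATALINK>
  <LINK>
    <URL>http://example.org/dfiles/objects_34_halo/votable?select=x%2Cy%2Cz&amp;where=npart+%3E+2e4</URL>
    <MIME>application/xml</MIME>
    <DESCRIPTION>halo subset at z=1.5, npart gt 2e4</DESCRIPTION>
    <SIZE>unknown</SIZE>
  </LINK>
  <LINK>
    <URL> http://example.org/dfiles/objects_32_halo/votable?select=x%2Cy%2Cz&amp;where=npart+%3E+2e4</URL>
    <MIME>application/xml</MIME>
    <DESCRIPTION>halo subset at z=2.33, npart gt 2e4</DESCRIPTION>
    <SIZE>unknown</SIZE>
  </LINK>
</DATALINK>"""


def list_previews(xml_text):
    """Return one dict per LINK element, with surrounding whitespace removed."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    previews = []
    for link in root.findall("LINK"):
        previews.append({
            "url": (link.findtext("URL") or "").strip(),
            "mime": (link.findtext("MIME") or "").strip(),
            "description": (link.findtext("DESCRIPTION") or "").strip(),
            "size": (link.findtext("SIZE") or "").strip(),
        })
    return previews


previews = list_previews(SAMPLE)
for p in previews:
    print(p["mime"], p["url"])
```

Note that the ampersand inside each URL has to be written as `&amp;` for the document to parse as XML; the parser returns it as a plain `&`.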

  18. Discussion: Groups • No group feature in standard TAP • Very useful (required?) for numerical simulations • Huge number of columns • Need a grouping feature (at least for display) • groups are OK in VOTable • The information is in SimDM, so it must be in SimDAL

  19. Discussion: SKOS • How to integrate SKOS concepts in TAP? • just put them in place of the UCD? • How to integrate SKOS in VOTable? • UCD • through VOTable LINK with content-role = “skos”

  20. Discussion summary • SimTAP DM derivation algorithm : (G. Lemson) • registries : skos list • preview : url target, integration in VOTABLE • groups : how to integrate in TAP_SCHEMA • skos : how to integrate in TAP_SCHEMA, VOTABLE • cutout : output format
