How to prepare data for integration in SeaDataNet V1? M. Fichaut, R. Lowry, R. Schlitzer
Overview • 2 parts • First part gives an overview of SeaDataNet system and of the available tools that can be used by SeaDataNet partners, and details some practical use cases • Second part is dedicated to ODV version 4 presentation
SDN V1 – Data centres • In SeaDataNet version 1 : 2 types of data centres • Pilot data centres (11 TTT members + volunteers) • Automatic data download from their system to SeaDataNet portal • Requires a minimal technical infrastructure “Application server like TOMCAT or IIS” and software implementation including “Download manager” and “Coupling table” • Other data centres • Manual preparation of data for downloading by SeaDataNet Web portal
[Figure: SeaDataNet V1 data flow for a pilot data centre. Partner system : metadata input (metadata in database, in Excel files or in XML metadata files) goes through MIKADO and the XML validator, using the SeaDataNet vocabulary; data input (data in database, or collection of ASCII files in format X converted by NEMO or Med2MedSDN to a collection of ASCII files in SDN format) is linked through the coupling table to the download manager and the local copy of data to download. European portal : the SeaDataNet portal with the CSR, EDMED, EDMERP and CDI catalogues, data request, and data download (ODV).]
[Figure: SeaDataNet V1 data flow for another (non-pilot) data centre. Partner system : metadata input (metadata in database, in Excel files or in XML metadata files) goes through MIKADO and the XML validator, using the SeaDataNet vocabulary; data input (data in database, or collection of ASCII files in format X converted by NEMO or Med2MedSDN to a collection of ASCII files in SDN format). European portal : the SeaDataNet portal with the CSR, EDMED, EDMERP and CDI catalogues; the data request arrives by email, the data are prepared manually as a local copy of data to download, and are then delivered as a data download (ODV).]
Summary • SeaDataNet Vocabulary • SeaDataNet formats • SeaDataNet reformatting tools : NEMO and Med2MedSDN • MIKADO tool and XML validator • Interaction of these tools with the download manager • Some use cases • ODV version 4
SeaDataNet vocabulary • SeaDataNet vocabularies populate many metadata fields and the parameter descriptions in data • They are delivered through a Vocabulary Server • May be viewed through a client on the SeaDataNet web site (http://seadatanet.maris2.nl/v_bodc_vocab/welcome.aspx) • May be accessed programmatically as described in Athens (IMDIS 2008 conference) • Master copy of vocabularies always accessible from a well-known location (BODC) • Vocabularies developed through the group governance of the SeaDataNet TTT or wider international bodies (SeaVoX, ICES platforms)
SeaDataNet vocabulary • Vocabularies in metadata • Most partners will encounter vocabularies in metadata through the MIKADO and NEMO tools • The most common problem will be that an entry required in a vocabulary isn’t there • For example, a ship required for a CSR record isn’t present in the C174 list • If this happens, contact the SeaDataNet help desk • They will advise what you should do and contact Roy if necessary
SeaDataNet vocabulary • Vocabularies in metadata • Adding new entries involves: • Discussion of the proposed change on the appropriate e-mail list • Editing the master vocabulary database • Publication of the changes • This takes time, so please send requests as early as possible and be patient.
SeaDataNet vocabulary • Vocabularies in metadata • Ship codes • If the ship isn’t present in the full ICES list as published on the ICES web site (the SeaDataNet Ship and Platform Codes at http://www.ices.dk/datacentre/reco/reco.asp) a new code must be obtained from ICES • This has caused delays • New on-line application system now available that will streamline the process (next April)
SeaDataNet vocabulary • Vocabularies in data • Parameters are labelled using terms from the P011 vocabulary • This is comprehensive, but very large (21,000 terms) • Thesaurus navigation tool on the SeaDataNet web site (http://seadatanet.maris2.nl/v_bodc_vocab/vocabrelations.aspx?list=P081) helps a lot • Mapping for MEDATLAS parameter codes under construction and accessible through NEMO and Med2MedSDN tools • Report other mapping problems to the SeaDataNet help desk • Roy will provide whatever help he can
SeaDataNet formats • ASCII formats • defined for vertical profiles, time-series and trajectories • ODV mandatory • MEDATLAS optional • NetCDF format • CF (Climate and Forecast) compatible • For gridded data (model output, satellite data and data syntheses) • Also for other types of data difficult to handle in ASCII formats, due to their large volume or structural complexity • Still being defined • http://www.seadatanet.org/standards_software/data_transport_formats
SeaDataNet extensions to ODV and MEDATLAS (1) • SeaDataNet format extensions fulfil two functions • Provide a linkage between data and metadata • ODV : 2 additional columns : LOCAL_CDI_ID and EDMO_CODE of the data centre providing the CDI • MEDATLAS : 2 additional comment lines with key-words : * LOCAL_CDI_ID = * EDMO_CODE = • Provide a linkage to standardised SeaDataNet semantic information such as detailed parameter descriptions • ODV and MEDATLAS : additional comment lines
SeaDataNet extensions to ODV and MEDATLAS (2) • Additional comment lines for parameter mapping
• ODV
//SDN_parameter_mapping
//<subject>SDN:LOCAL:DEPH</subject><object>SDN:P011::ADEPZZ01</object><unit>SDN:P061::ULAA</unit>
//<subject>SDN:LOCAL:TEMP</subject><object>SDN:P011::TEMPPR01</object><unit>SDN:P061::UPAA</unit>
• MEDATLAS
*SDN_parameter_mapping
*<subject>SDN:LOCAL:PRES</subject><object>SDN:P011::PRESPR01</object><unit>SDN:P061::UPDB</unit>
*<subject>SDN:LOCAL:TEMP</subject><object>SDN:P011::TEMPPR01</object><unit>SDN:P061::UPAA</unit>
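The parameter mapping comment lines shown on this slide can be read back by a partner's own scripts. The following is a minimal sketch, not part of NEMO or Med2MedSDN: the function name `parse_mapping_lines` and the regular expression are illustrative assumptions, and only the line layout printed on the slide is relied upon.

```python
import re

# One mapping line as printed on the slide. The leading "//" (ODV) or "*"
# (MEDATLAS) comment marker is treated as optional in this sketch.
MAPPING_RE = re.compile(
    r"^(?://|\*)?\s*<subject>(?P<subject>[^<]+)</subject>"
    r"<object>(?P<object>[^<]+)</object>"
    r"<unit>(?P<unit>[^<]+)</unit>\s*$"
)


def parse_mapping_lines(lines):
    """Return {local parameter name: (P011 term, P061 unit)}."""
    mapping = {}
    for line in lines:
        match = MAPPING_RE.match(line.strip())
        if match is None:
            continue  # e.g. the "SDN_parameter_mapping" tag line itself
        local_name = match["subject"].split(":")[-1]  # "SDN:LOCAL:DEPH" -> "DEPH"
        mapping[local_name] = (match["object"], match["unit"])
    return mapping


odv_header = [
    "//SDN_parameter_mapping",
    "//<subject>SDN:LOCAL:DEPH</subject><object>SDN:P011::ADEPZZ01</object><unit>SDN:P061::ULAA</unit>",
    "//<subject>SDN:LOCAL:TEMP</subject><object>SDN:P011::TEMPPR01</object><unit>SDN:P061::UPAA</unit>",
]
print(parse_mapping_lines(odv_header))
```

The same function accepts the MEDATLAS variant, since only the comment marker differs.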
Tools to generate SeaDataNet ASCII formats • NEMO • JAVA tool to reformat ASCII files to SeaDataNet ODV and MEDATLAS formats - available under Windows • Version 1.2.0 and user manual available at : • http://www.seadatanet.org/standards_software/software/nemo • Med2MedSDN • Java tool to translate MEDATLAS files to SeaDataNet MEDATLAS files - available under Windows • Version 1.0 and user manual available at : • http://www.seadatanet.org/standards_software/software/Med2MedSDN
NEMO main features • Reformat any ASCII file of vertical profiles, time-series or trajectories to a SeaDataNet ASCII format (ODV, MEDATLAS). • The input ASCII files can be : • one file per station for vertical profiles or time series • one file for one cruise for vertical profiles, time series or trajectories • Related to cruises or not • If not related to cruise, only ODV re-formatting is available
NEMO main principles • Users of NEMO describe the format of the input files so that NEMO can find the information required by the SeaDataNet formats. • One prerequisite is that all input files processed together by NEMO must have the same format : the information about the stations must : • be located at the same position : same line in the file, same position on the line or same column if CSV format • be in the same format, for example : for all the stations the latitude is : • on line 3 of the station header, • from character 21 to character 27, or in the 3rd column in CSV • in the format +DD.ddd • The other prerequisite is that data must be provided in columns in the data files.
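To make the "same position, same format" principle concrete, here is a small sketch of what such a positional description amounts to. It is not NEMO code: the function `read_latitude` and the sample header are hypothetical, and only the example from the slide (line 3, characters 21 to 27, format +DD.ddd) is taken from the presentation.

```python
def read_latitude(header_lines, line_no=3, first_char=21, last_char=27):
    """Extract a +DD.ddd latitude from a fixed position (1-based, inclusive)."""
    text = header_lines[line_no - 1][first_char - 1:last_char]
    well_formed = (
        len(text) == 7
        and text[0] in "+-"
        and text[1:3].isdigit()
        and text[3] == "."
        and text[4:].isdigit()
    )
    if not well_formed:
        raise ValueError(f"latitude is not in +DD.ddd format: {text!r}")
    value = float(text)
    if not -90.0 <= value <= 90.0:
        raise ValueError(f"latitude out of range: {value}")
    return value


# Hypothetical station header: the latitude starts at character 21 of line 3.
header = [
    "CRUISE DEMO01",
    "2009-03-15 08:30",
    "STATION 0001".ljust(20) + "+43.250" + " +005.300",
]
print(read_latitude(header))
```

If one file in the collection stores the latitude elsewhere or in another notation, the check fails, which is why all files processed together must share one layout.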
NEMO : 5 steps • Description of the file • Description of the cruise : input manually or import of CSR XML V1 • Description of the station header • Description of the measured parameters • File conversion • Model can be saved and reused
NEMO new functionalities • Trajectories taken into account • SeaDataNet extensions to ODV and MEDATLAS formats • Possibility to keep quality flags if existing in input files and to map them to the SeaDataNet QC flag scale • Generation of a CDI summary file directly usable by MIKADO to generate XML CDI exports • Generation of the coupling file, which maps each LOCAL_CDI_ID (one profile, one time-series or one trajectory) to the name of the file containing this LOCAL_CDI_ID. This coupling file is used by the download manager.
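The idea behind the coupling file can be sketched as a simple lookup from LOCAL_CDI_ID to file name. The layout used below (one `LOCAL_CDI_ID;file name` pair per line) is an assumption made for illustration only; the real coupling.txt layout is defined in the download manager documentation, and the identifiers and file names are invented.

```python
import csv
import io


def write_coupling(records, out):
    """Write one 'LOCAL_CDI_ID;file name' line per record (assumed layout)."""
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    for local_cdi_id, file_name in sorted(records.items()):
        writer.writerow([local_cdi_id, file_name])


def find_file(coupling_text, local_cdi_id):
    """Return the file holding the given LOCAL_CDI_ID, or None if unknown."""
    for row in csv.reader(io.StringIO(coupling_text), delimiter=";"):
        if row and row[0] == local_cdi_id:
            return row[1]
    return None


buffer = io.StringIO()
write_coupling(
    {"CTD_0002": "cruise01_st02.txt", "CTD_0001": "cruise01_st01.txt"},
    buffer,
)
print(buffer.getvalue())
print(find_file(buffer.getvalue(), "CTD_0002"))
```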
NEMO next version will • Correct the known bugs and any new ones reported by users • Take into account the latest release of the ODV format, with ISO-8601 dates and data type ‘*’ • Improve response time for the conversion of large files and for vocabulary updates • Accept ODV multi-station files as input • Be tested under Unix and Linux • Be released in June 2009
Med2MedSDN main features • Reformats MEDATLAS files to MEDATLAS SeaDataNet format • Java tool, bilingual (French, English) • Adds the additional SeaDataNet information : mapping for parameters and LOCAL_CDI_ID and EDMO_CODE • Able to reformat one file or a large number of files (in one directory) • Linked to SeaDataNet vocabularies through Web services for parameter mapping and for the list of EDMO codes • Requires an internet connection while updating the lists
Med2MedSDN log files • Errors are recorded in a log file, which can be opened from the Med2MedSDN main screen by clicking on “see log” in the error window • One line in the log file is composed as follows : Date, Name of the Software, Error severity level, Error message
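A log with this four-field layout is easy to filter with a short script. The sketch below assumes the fields are separated by commas, which the slide suggests but does not state; the sample line, the severity label and the function name are invented for illustration.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    date: str
    software: str
    severity: str
    message: str


def parse_log_line(line):
    """Split a log line into its four fields (comma separator is assumed)."""
    # Split at most three times so that commas inside the message survive.
    parts = [part.strip() for part in line.split(",", 3)]
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
    return LogEntry(*parts)


sample = "2009-03-15 08:30:12, Med2MedSDN, ERROR, Unknown parameter code, line 12"
entry = parse_log_line(sample)
print(entry.severity, "|", entry.message)
```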
Med2MedSDN next version will • Take into account the creation of the coupling file for SeaDataNet download manager • Be released in June 2009
Tool to generate XML meta-data files • MIKADO • JAVA tool to generate XML descriptions of SeaDataNet catalogues • EDMED : catalogue of Marine Environmental Datasets • EDMERP : Marine Environmental Research Projects • CSR : Cruise Summary Reports • CDI : Common Data Index • Version 1.5 and user manual available at : • http://www.seadatanet.org/standards_software/software/mikado
MIKADO version 1.5 • New functionalities • Download EDMED files directly from the central BODC catalogue through Web services : for the time being, awaiting the new EDMED V1 user interface developments • World map to manage Marsden squares for CSR • Data centre type options for CDI (SeaDataNet, ECOOP) : to allow data websites other than SeaDataNet • Mapping download from BODC : to get existing mappings from the BODC web site • Sybase driver for JDBC • Vocabulary update without restarting MIKADO
Next versions of MIKADO • Version 1.6 • Being tested now • Available next May • Able to generate the coupling.txt file used by the download manager, for data stored in ASCII files or in a relational database • Version 1.7 • EDIOS • Released by the end of 2009
NEMO to MIKADO to SeaDataNet CDI • [Figure: a collection of ASCII files goes through NEMO, which produces the ASCII SDN files and a CDI summary CSV file; MIKADO, with Summary_CDI_NEMO.xml, produces the XML CDI files for the SeaDataNet CDI.] • Explanation in NEMO user manual
Links with SDN Download manager • [Figure: the reformatting and metadata tools, including Med2MedSDN and MIKADO, each produce a Coupling.txt file (labelled Modus 1,3 or Modus 1,2,3 in the diagram) that feeds the coupling table of the download manager, which is linked to the SeaDataNet portal.] • Modus 1 : data in mono-station file • Modus 2 : data in database • Modus 3 : data in multi-station file
Use cases • The prerequisites for all use cases are : • Preparation of the mapping between your metadata and : • SeaDataNet vocabularies : Sea areas, BODC parameters (PDV), Platform classes, SDN device categories, etc. • some mappings are already available on the BODC Web site : • MEDATLAS to PDV, MEDATLAS units to BODC storage units • EDMO : Marine organisations • EDMERP : Marine environmental projects (incremental mapping managed by MIKADO) • Quality checks of the data, done using ODV or other software, before sending metadata to CDI
Use case 1 – collection of XBTs or CTDs or Time-series files – no relational database • Verify that all files of the collection have homogeneous format • Run NEMO • to convert the files to SeaDataNet ODV • to generate a CDI summary file • [To generate the coupling.txt file for these data] • Run MIKADO to generate the XML CDI files with the configuration file delivered with NEMO for the CDI summary file • Use the XML validator to validate your XML files • [Implement the coupling file] • Send the XML CDI files to central catalogue
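The first step of this use case, verifying that the collection is homogeneous, can be partly automated before running NEMO. The sketch below is one possible check, not a NEMO feature: it assumes whitespace-separated data columns after a fixed number of header lines, and flags files whose column count differs from the most common one. File names and contents are invented.

```python
from collections import Counter


def layout_signature(text, header_lines, delimiter=None):
    """Set of column counts found in the data lines of one file."""
    data_lines = [ln for ln in text.splitlines()[header_lines:] if ln.strip()]
    return frozenset(len(ln.split(delimiter)) for ln in data_lines)


def inconsistent_files(files, header_lines, delimiter=None):
    """Names of files whose layout differs from the most common layout."""
    if not files:
        return []
    signatures = {
        name: layout_signature(text, header_lines, delimiter)
        for name, text in files.items()
    }
    reference = Counter(signatures.values()).most_common(1)[0][0]
    return sorted(name for name, sig in signatures.items() if sig != reference)


# Hypothetical collection: two header lines, then depth and temperature.
collection = {
    "st01.txt": "STATION 1\nDEPH TEMP\n5 14.2\n10 13.9\n",
    "st02.txt": "STATION 2\nDEPH TEMP\n5 14.0\n10 13.7\n",
    "st03.txt": "STATION 3\nDEPH TEMP PSAL\n5 14.1 38.1\n",
}
print(inconsistent_files(collection, header_lines=2))
```

A check like this only catches differences in column count; position and notation of the header fields still have to be inspected by the user.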
Use case 2 – collection of MEDATLAS files and metadata in relational database • Run Med2MedSDN • to convert MEDATLAS files to MEDATLAS SDN files • [To generate the coupling.txt file table for these MEDATLAS SDN files] • Run MIKADO on the metadata database • to generate the XML CDI descriptions of the stations of these MEDATLAS files. • [To generate the coupling.txt file table for these MEDATLAS SDN files] • Use the XML validator to validate your XML files • [Implement the coupling file] • Send the XML CDI files to central catalogue
Use case 3 – collection of ASCII files and metadata in relational database • Run NEMO • to convert ASCII files to ODV [and MEDATLAS] SDN files • [to generate the coupling.txt file table for these SDN files] • Run MIKADO on the metadata database • to generate the XML CDI descriptions of the stations of these files. • [to generate the coupling.txt file table for these SDN files] • Use the XML validator to validate your XML files • [Implement the coupling file] • Send the XML CDI files to central catalogue
Use case 4 – XBTs, CTDs, Time-series measurements – data and metadata in a relational database • Run MIKADO • To create the configuration to retrieve metadata on these data in the database • To export the corresponding XML CDI files • Run MIKADO to create the coupling table with the appropriate select statement to retrieve these measurements in the database • Use the XML validator to validate your XML files • Implement the coupling file • Send the XML CDI files to central catalogue
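For data held in a database, the coupling relies on a select statement that retrieves the measurements belonging to one LOCAL_CDI_ID. The sketch below illustrates that idea with an in-memory SQLite database; the table name, column names and identifiers are hypothetical, and the way MIKADO and the download manager actually store and run such statements is described in their manuals, not here.

```python
import sqlite3


def fetch_profile(connection, local_cdi_id):
    """Return (depth, temperature) rows for one LOCAL_CDI_ID, by depth."""
    query = (
        "SELECT depth_m, temperature_c FROM measurements "
        "WHERE local_cdi_id = ? ORDER BY depth_m"
    )
    return connection.execute(query, (local_cdi_id,)).fetchall()


# Hypothetical schema and rows, created in memory for the demonstration.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE measurements "
    "(local_cdi_id TEXT, depth_m REAL, temperature_c REAL)"
)
connection.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [
        ("CTD_0001", 10.0, 13.9),
        ("CTD_0001", 5.0, 14.2),
        ("CTD_0002", 5.0, 14.0),
    ],
)
print(fetch_profile(connection, "CTD_0001"))
```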
NEMO and Med2MedSDN demonstrations are possible, just ask! • Questions or problems on MIKADO are welcome too.
And now • All about ODV, version 4, by Reiner ……………