490 likes | 623 Views
Tools for data management in the frame of SeaDataNet project. M. Fichaut. SeaDatanet Vocabulary. XML Validator. Metadata in Database. XML Metadata Files. Metadata Input. MIKADO. Metadata In Excel files. CSR. MIKADO. Coupling table. Data in Database. EDMED. Collection
E N D
3rd SeaDataNet training course – Ostende – 16-19 June 2008 Tools for data management in the frame of SeaDataNet project M. Fichaut
SeaDatanet Vocabulary XML Validator Metadata in Database XML Metadata Files Metadata Input MIKADO Metadata In Excel files CSR MIKADO Coupling table Data in Database EDMED Collection of ASCII files Format SDN Data Input Collection of ASCII files Format X SEADATANET PORTAL NEMO EDMERP Med2MedSDN CDI Download Manager Local copy of data to download Data request ODV Data download Partner system : pilot data centre European portal
SeaDatanet Vocabulary XML Validator Metadata in Database Metadata Input MIKADO Metadata In Excel files CSR MIKADO Data in Database EDMED Data Input Collection of ASCII files Format X SEADATANET PORTAL NEMO EDMERP Med2MedSDN Data request by email CDI Manual preparation of data Local copy of data to download ODV Data download Partner system : other data centre European portal XML Metadata Files Collection of ASCII files Format SDN
Overview (1) • MIKADO • Objectives and technical characteristics of Mikado • Mikado’s main features • Manual input of metadata • Use of the common SDN vocabularies • Automatic XML generation • Connection to the local database • Query writing • Mapping to the local database • Batch mode • Coupling table for download manager • XML VALIDATION SERVICE
Overview (2) • Med2MedSDN • Objectives and main features • NEMO • Objectives, main features and principles of NEMO • Technical characteristics • Description of the different steps to follow to be able to reformat ASCII files to SeaDataNet format • Link between NEMO and MIKADO • Link between NEMO and SDN Download Manager – coupling table
3rd SeaDataNet training course – Ostende – 16-19 June 2008 MIKADO Tool for the generation of XML descriptions of SeaDanaNet catalogue records
Objective • MIKADO is used to generate XML catalogue descriptions, it creates XML ISO-19115 files using SDN common vocabularies for metadata exchange of • CSR - Cruise Summary Reports • EDMED - Marine Environmental Data sets • CDI - Common Data Index • EDMERP - Marine Environmental Research Projects • [EDIOS – Permanent Ocean-observing System] • Is freely available on SeaDataNet Web site http://www.seadatanet.org/standards_software/software/mikado
Technical characteristics • Written in Java Language (Version >= 1.6) • Available under multiple environments : • Microsoft : Windows 2000, XP, VISTA , APPLE • Unix - Solaris • Linux. • Use of the SeaDataNet common vocabularies web services to update lists of values • needs network connection in order to have up to date lists of values. • but Mikado works offline once the listsare up-to-date
MIKADO main features (1) • MIKADO can be used in 2 different ways • One manual way, to input manually information for the catalogues and CDI in order to generate XML files. • One automatic way, to generate XML descriptions automatically, from information catalogued in a relational database or in an Excel file. Automatic way is needed for those who have many entries referenced in a relational database • Only one interface for all catalogues • Same look • Same principles
MIKADO main features (2) CSR EDMED EDMERP CDI [EDIOS] Manual XML files for SeaDataNet catalogues MIKADO Java code Automatic DATABASE JDBC Java DataBase Connectivity EXCEL File Native Drivers MYSQL ORACLE POSTGRES SYBASE MSServer Bridge Drivers using Microsoft ODBC (ACCESS, EXCEL, SQL SERVER) Other Drivers Downloaded from ad hoc Websites (Copied in the dist/lib MIKADO directory)
Automatic check of the version of the vocabulary lists : once when MIKADO starts If “On” is clicked in the Vocabulary Update Menu MIKADO downloads locally the latest version of each list Possible to enable-disable the automatic check If “Off” is clicked Manual check Update once now MIKADO and SDN vocabulary lists (2)
MIKADO – Manual input • Available for 4 catalogues : EDMED, CSR, CDI, EDMERP • Each input generates an XML file that can be sent to the central catalogue • For EDMERP and CSR : EDMERP CMS and CSR online can also be used, but MIKADO is useful • if you have problems with the NETWORK connection • if you want to keep locally an XML description of your catalogues • For EDMED and CDI, there is no online input tools.
MIKADO manual : LOCAL Identifier • The LOCAL Identifier is vital because it is kept in the central catalogue and is the entry point to know if the record is new or if it is an update. • This LOCAL ID exists for all the catalogues and is under the responsibility of the data centre who generates the XLM descriptions
MIKADO manual – vocabulary lists (1) • Common vocabulary BODC, • list C320 (Country ISO codes)
MIKADO – Automatic XML generation • Principle • Read the information about CSR, EDMED, EDMERP or CDI in a database or in an Excel file • MIKADO has predefined variables which correspond to the XML tags definition for each catalogues • MIKADO helps user to write the SQL orders to fulfill these variables with the information available in the database or in the Excel file
MIKADO – Automatic XML generation • 4 STEPS • Connect to a database or an Excel file and test the connection • Write the queries to retrieve information in the database or in the Excel file, test the queries • Save the queries in a “Configuration file” • Generate the XML files using the “Configuration file”
MIKADO automatic - connection • Help for the connection to the database • Pre-filled information for some databases • Check of the connection
MIKADO automatic – connection OK • Green message in the Check box
MIKADO – automatic – connection KO • Red message in the Check box
MIKADO automatic – queries • Expendable trees • Main query Return the LOCAL ID • Single queries Return 1 row • Multiple queries Return 1 to n rows • Single an multiple queries related to each LOCAL ID returned by the main query.
MIKADO automatic – queries • Write the queries • SQL syntax (for Oracle, Excel, MySQL, …) and SQL variables must be adapted to your own data base • Check the Queries • Green OK • Red KO : read the error message
MIKADO automatic – single queries • All the XML variables are listed in the expendable tree • In bold : mandatory fields • 1 to n single subquery can be written • In green : fields already fulfilled • Add or delete variables in a query • Delete a full query • Check the query
MIKADO automatic – multiple queries • All the XML variables are listed in the expendable tree • Number of queries is pre-defined • The list of variables for each of these multiple queries is also pre-defined • In bold : mandatory field • In green : fields already fulfilled
MIKADO automatic – multiple queries • In a group of variables (same XML block, same pre-defined set of variable in a query), the non mandatory variables can be left to null
MIKADO automatic - Save the queries • When all the queries are written • They can be saved in an XML file to be re-used later on
MIKADO automatic - generate the XML files • Select the catalogue you want to generate • Open the corresponding configuration file • Choose the output directory • Choose the type of export files • Export the XML files • Progress bar • Cancel allowed
MIKADO automatic - local mapping • While generating the XML files for all the catalogues • Each time that MIKADO does not recognized a value (entrykey or entryterm) which should come from the common vocabulary, it asks the user for mapping • MIKADO manages a demand-driven continuous (incremental) extension of a local mapping : mapping of the local database to the common vocabulary • Mapping tables can be modified • Delete rows • Modify the LOCAL value
MIKADO automatic - local mapping • Example • CSR generation • Mapping of the platform type
MIKADO automatic - local mapping • Modification of the local mapping • If wrong entries have been input, it is possible to: • Delete one entry • Delete all the entries • Change the local code
MIKADO in batch mode • MIKADO can be run in batch mode using existing configuration files • Several arguments can be added on the command line • Java –Djava.endorsed.dirs=”dist/lib” –jar dist/Mikado.jarmikado-home=[path] argument2= … argumentn= • Log file to register the errors
Batch mode : mandatory arguments (1) • batch-type : XmlFiles, Zipfile, Both • batch-mode : CDI, EDMERP, CSR, EDMED • conf-file : path and name of the configuration file with the SQL queries • output-dir : path and name of the output directory where the XML files will be written • continue-on-error : true or false • If true : if one record with mapping missing, or one record with mandatory field(s) null in the database, warning for this record in the logfile, and MIKADO processes next records.. • If false : MIKADO stops at the first error.
Batch mode : optional arguments (2) • log-file : Path and name of the Log file of MIKADO. By default, mikado.log in Mikado home directory • trace : to have a summary of the time response of each SQL query (useful for tuning of the queries bad if time responses) • max-file-in-zip : number maximum of XML files per zip file generated by Mikado. By default, one zip file contains 1000 XML files. • zip-prefix : To personalize the names of the Zip files generated by Mikado. By default the zip file name are SeaDataNet_[catalogue]_[x].zip
Coupling file for Download manager (1) • MIKADO is able to generate this coupling file • The coupling file is used by SeaDataNet download manager to make the mapping between a LOCAL_CDI_ID (one profile, one time-series or one trajectory) and • the name of the file containing this LOCAL_CDI_ID (MODUS1 and 3) : if the metadata is in a data base and the data in files • the SQL Query to retrieve the meta data and the data of this LOCAL_CDI_ID in the local database (MODUS 2)
Coupling file for Download manager (2) • The principle to create this coupling file is the same than to create XML files for catalogue descriptions • User has • to create a configuration file that will be used for the generation of the coupling file. • to write the queries to retrieve the filename or the data for each LOCAl_CDI_ID
MIKADO – User manual • User manual is provided : • File : SDN_MIKADO_UserManual_v1_1_4.pdf • Detailed explanation for MIKADO use, lots of snapshots
VALIDATION of XML files • XML validation Services have been developed in the frame of SeaDataNet by the Russian NODC • It is a Web validation Service available at http://www.seadatanet.org/validator
3rd SeaDataNet training course – Ostende – 16-19 June 2008 XML validation tool Web service for validation of the XML files generated by MIKADO
View all CDI schema versions • View statistics about detected errors • Run the validation with the last version of XML schema
Upload the file you want to validate • Validation works only file per file
The file is displayed on the screen • Press Validate
Errors and warning are displayed • Warning are not obstacles for XML delivery • Errors must be corrected