
An SDMX based unified data catalogue (UDC)

MSIS – Meeting on the Management of Statistical Information Systems. Gabriele Becker / Massimo Bruschi, Statistical Information Systems, Monetary & Economic Department, Bank for International Settlements.


Presentation Transcript


  1. An SDMX based unified data catalogue (UDC)
     MSIS – Meeting on the Management of Statistical Information Systems
     Gabriele Becker / Massimo Bruschi
     Statistical Information Systems, Monetary & Economic Department
     Bank for International Settlements

  2. The SDMX vision
     • Need: up-to-date numbers, data documentation, good quality data
     • Data can be offered by: NSOs, CBs, IOs
     • How to choose, filter out duplication, and get the “fresher” data?
     • Data providers (originators) offer their data “in SDMX”
     • Dissemination = reporting = data sharing… single storage!
     • SDMX registries help users and organisations to find data
     • How “real” is this SDMX vision?
     • What do we still need to learn?

  3. The Unified Data Catalogue (UDC) concept
     • Can we “implement” the vision?
     • UDC: a single data catalogue that allows users to discover, select and retrieve statistical data from all registered data sources
     • Discovery implies access to metadata:
       • DSD – data structure definitions
       • concepts and code-lists
       • category schemes
     • An SDMX registry is a natural repository for this metadata
     • Unified Data Catalogue feasibility study to analyse this
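
To make the metadata listed on this slide more concrete, here is a minimal sketch of how a DSD, its concepts and its code-lists could be modelled in memory. It is an illustration only, not the SDMX information model or the BIS implementation: the class names, the `is_valid_key` helper and the identifiers `CL_FREQ`, `CL_AREA` and `EXAMPLE_DSD` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CodeList:
    """A named set of allowed codes with human-readable labels."""
    id: str
    codes: Dict[str, str]  # code -> label


@dataclass
class Dimension:
    """One dimension of a data structure: a concept tied to a code-list."""
    concept: str
    codelist: CodeList


@dataclass
class DataStructureDefinition:
    """Illustrative DSD: an ordered list of dimensions (hypothetical model)."""
    id: str
    dimensions: List[Dimension] = field(default_factory=list)

    def concepts(self) -> List[str]:
        """Concept names in dimension order."""
        return [d.concept for d in self.dimensions]

    def is_valid_key(self, key: Dict[str, str]) -> bool:
        """True if the key gives exactly one allowed code per dimension."""
        if set(key) != set(self.concepts()):
            return False
        return all(key[d.concept] in d.codelist.codes for d in self.dimensions)


# Hypothetical example metadata, not taken from the presentation.
CL_FREQ = CodeList("CL_FREQ", {"M": "Monthly", "Q": "Quarterly"})
CL_AREA = CodeList("CL_AREA", {"CH": "Switzerland", "DE": "Germany"})
EXAMPLE_DSD = DataStructureDefinition(
    "EXAMPLE_DSD",
    [Dimension("FREQ", CL_FREQ), Dimension("REF_AREA", CL_AREA)],
)
```

With metadata of this shape available from a registry, a catalogue can tell a user which dimensions a data source has and which codes are selectable for each, before any data is retrieved.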

  4. UDC study: Objectives
     • Provide centralised access to a variety of internal and external data sources
     • Generic search facilities against “registered” data sources
     • Directly retrieve data and metadata from all data sources
     • Use SDMX technical standards, SDMX registry, web services
     • Broaden SDMX knowledge within BIS (business area and IT colleagues)

  5. User stories
     • Registrations
     • Constraints
     • GUI features
     • Navigation / Search
     • Query & retrieval
     • Output handling
     • Automation
     • Security

  6. UDC prototype architecture
     • Simplistic approach: to search and retrieve data from a data source, all we need to know are the data structures and the source query language
     • If a source follows the SDMX-IM, we also need a (web) service connected to it that is able to respond to an SDMX Query
     • SDMX-enabled data source: “native” or “adaptable”
     • SDMX-ML file + DSD + “file-query-handler” = simplest SDMX-enabled source
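
The last bullet can be illustrated with a small sketch of a "file-query-handler": a function that answers a dimension-based selection by filtering the series in a data file. The XML layout below is a deliberately simplified stand-in, not real SDMX-ML; the element and attribute names, the sample values and the `query_file` function are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical file layout (NOT the SDMX-ML schema):
# each Series carries its dimension values as attributes.
SAMPLE_FILE = """
<DataSet>
  <Series FREQ="M" REF_AREA="CH">
    <Obs period="2009-07" value="1.5"/>
    <Obs period="2009-08" value="1.7"/>
  </Series>
  <Series FREQ="M" REF_AREA="DE">
    <Obs period="2009-07" value="2.1"/>
  </Series>
  <Series FREQ="Q" REF_AREA="CH">
    <Obs period="2009-Q2" value="4.0"/>
  </Series>
</DataSet>
"""


def query_file(xml_text, selection):
    """Return (key, observations) for every series matching the selection.

    selection maps a dimension name to the set of accepted codes;
    dimensions that are not mentioned are left unconstrained.
    """
    root = ET.fromstring(xml_text.strip())
    results = []
    for series in root.iter("Series"):
        key = dict(series.attrib)
        if all(key.get(dim) in codes for dim, codes in selection.items()):
            obs = [(o.get("period"), float(o.get("value")))
                   for o in series.iter("Obs")]
            results.append((key, obs))
    return results
```

For example, `query_file(SAMPLE_FILE, {"FREQ": {"M"}, "REF_AREA": {"CH"}})` returns only the monthly Swiss series. A real handler would additionally use the DSD to know which attributes are dimensions and which codes are valid.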

  7. Plan: schematic architecture
     [Diagram not reproduced. Labels: internal or external sources; SDMX files; SDMX data source web service; mappable data source web service; SDMX query adapter web service; registrations; SDMX Registry web application; SDMX; UDC GUI]

  8. Components of the UDC prototype
     • SDMX Registry (“off the shelf” SDMX tool)
       • Data structure definitions of all “connected” data sources
       • Registrations for all data flows for all connected data sources
       • URLs to SDMX files and SDMX query services
       • Updated via SDMX-ML messages or interactively (“KeyMaster”)
     • UDC (developed for the study)
       • GUI to navigate the registry information
       • Queries the data sources
       • Retrieves data and presents them to the user
     • SDMX query web services (developed for the study)
       • For the different types of data sources
     • Data query services (partly existing, partly developed)
       • For each of the connected queryable data sources
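
As a rough sketch of how these components fit together, the registry can be thought of as a lookup from a data flow to its DSD and to the URL of either a file or a query service, with the UDC choosing the retrieval path accordingly. Everything below is hypothetical: the `Registration` fields, the `Registry` class, the handler functions and the `example.org` URLs are assumptions, and the handlers only describe what they would do instead of making network calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Selection = Dict[str, List[str]]
Handler = Callable[[str, Selection], str]


@dataclass(frozen=True)
class Registration:
    """Hypothetical registration record: where a data flow can be retrieved."""
    dataflow: str
    dsd_id: str
    source_type: str  # "file" or "query-service"
    url: str


class Registry:
    """In-memory stand-in for an SDMX registry (illustration only)."""

    def __init__(self) -> None:
        self._by_dataflow: Dict[str, Registration] = {}

    def register(self, registration: Registration) -> None:
        self._by_dataflow[registration.dataflow] = registration

    def dataflows(self) -> List[str]:
        return sorted(self._by_dataflow)

    def lookup(self, dataflow: str) -> Registration:
        try:
            return self._by_dataflow[dataflow]
        except KeyError:
            raise LookupError(f"no registration for {dataflow!r}") from None


def fetch_file(url: str, selection: Selection) -> str:
    # Stub: a real handler would download the file and filter it locally.
    return f"would download {url} and filter locally by {sorted(selection)}"


def call_query_service(url: str, selection: Selection) -> str:
    # Stub: a real handler would send a query message to the service.
    return f"would send a query for {sorted(selection)} to {url}"


HANDLERS: Dict[str, Handler] = {
    "file": fetch_file,
    "query-service": call_query_service,
}


def retrieve(registry: Registry, dataflow: str, selection: Selection) -> str:
    """Look up the registration and dispatch to the matching handler."""
    reg = registry.lookup(dataflow)
    return HANDLERS[reg.source_type](reg.url, selection)
```

The design point is that the UDC itself needs no knowledge of individual sources: adding a source means adding a registration (and, for a new source type, one handler).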

  9. What we did: detailed architecture
     [Diagram not reproduced. Labels: hosts medts.a (Linux), mstat.a (Win), mstat.s (Win), v.ds03 (Linux), PC (Win); BIS Data Bank; DBQL output; SDMX-ML proxy daemon; .xml files; MSTAT Cubes; MarkIT SQL database; SDMX-ML data files; TS web service; SQL stored procedures; SDMX-ML query web services /databank/query, /mstat/query and /markit/query; SDMX Registry web application; UDC web application; SDMX-ML file browser; R/O Registry; Internet Explorer; UDC GUI]

  10. UDC GUI key features
     • Browse the Categories / Data-flows / Provision registrations
     • Browse selected DSD: dimensions, attributes, code-lists
     • Build queries based on DSD (code selection)
     • Run query and view results (simple table)
     • Download results and DSDs in SDMX-ML format
     • Search by Concept / Codelist
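
"Build queries based on DSD (code selection)" can be sketched as follows: the user picks codes per dimension, the selection is validated against the code-lists, and a query expression is produced. The dot-separated key used here ("+" between alternative codes, an empty position meaning "all codes") follows the convention of the later SDMX RESTful API and is an assumption for illustration; the prototype used SDMX 2.0 Query messages, which are not shown. The `build_key` function and the `DIMENSIONS` sample are hypothetical.

```python
def build_key(dimensions, selection):
    """Build a dot-separated series key from a code selection.

    dimensions: ordered list of (concept, allowed_codes) taken from a DSD.
    selection:  dict mapping a concept to the list of chosen codes;
                unselected dimensions become an empty (wildcard) position.
    """
    known = {concept for concept, _ in dimensions}
    unknown = set(selection) - known
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")

    parts = []
    for concept, allowed in dimensions:
        chosen = selection.get(concept, [])
        bad = [code for code in chosen if code not in allowed]
        if bad:
            raise ValueError(f"codes {bad} are not in the code-list of {concept}")
        parts.append("+".join(chosen))
    return ".".join(parts)


# Hypothetical DSD content for the example.
DIMENSIONS = [
    ("FREQ", {"M", "Q"}),
    ("REF_AREA", {"CH", "DE", "US"}),
    ("INDICATOR", {"CPI", "GDP"}),
]
```

For instance, `build_key(DIMENSIONS, {"FREQ": ["M"], "REF_AREA": ["CH", "DE"]})` gives `"M.CH+DE."`, i.e. monthly data for two areas and any indicator. Validating against the DSD up front means a bad selection is rejected in the GUI rather than by the data source.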

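  11. Search by Concept/Codelist - 1 (screenshot with numbered callouts 1 to 3; image not reproduced)

  12. Search by Concept/Codelist - 2 (screenshot with numbered callouts 4 to 6; image not reproduced)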
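
The screenshots for the two search slides are not part of the transcript. As a rough idea of what a search by concept or code-list could involve, the sketch below scans a small catalogue and returns the data flows whose structure uses a matching concept or code-list. The catalogue content, the data flow names and the `search` function are assumptions for illustration and do not describe the actual GUI.

```python
# Hypothetical catalogue: data flow -> {concept: code-list id}.
CATALOGUE = {
    "DF_RATES": {"FREQ": "CL_FREQ", "REF_AREA": "CL_AREA",
                 "CURRENCY": "CL_CURRENCY"},
    "DF_BANKING": {"FREQ": "CL_FREQ", "REPORTING_COUNTRY": "CL_AREA"},
    "DF_PRICES": {"FREQ": "CL_FREQ", "REF_AREA": "CL_AREA"},
}


def search(catalogue, term):
    """Find data flows with a concept or code-list id containing the term.

    The match is a case-insensitive substring test. Returns a list of
    (dataflow, [(concept, codelist_id), ...]) sorted by data flow name.
    """
    needle = term.upper()
    hits = []
    for dataflow, dims in sorted(catalogue.items()):
        matches = [(concept, codelist) for concept, codelist in dims.items()
                   if needle in concept.upper() or needle in codelist.upper()]
        if matches:
            hits.append((dataflow, matches))
    return hits
```

Searching by code-list as well as by concept matters when sources name the same notion differently: here `search(CATALOGUE, "area")` also finds `DF_BANKING`, whose concept is `REPORTING_COUNTRY` but whose code-list is `CL_AREA`.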

  13. UDC Prototype: some results
     • UDC can provide (unsecured) access to
       • BIS Data Bank: time series repository, SDMX-EDI IM, Linux, FAME, Sybase, own query language + query adapter
       • MSTAT OLAP: IBFS data in multi-dimensional cubes, MS Windows, SQL Server, SDMX Query to OLAP / MDX adapter
       • MSTAT Sandbox: research data in a relational database, MS Windows, SQL Server, DSD on unstructured dataset + SDMX / SQL adapter
       • SDMX-ML generic files + generic file adapter
     • Practical use of registration, provisioning, constraints processing, …
     • SDMX vision is real … with some practical issues
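
The "SDMX / SQL adapter" idea, translating a dimension-based selection into a query against a relational store, can be sketched as below. This is not the BIS adapter (the slides mention SQL Server and stored procedures); it uses Python's built-in `sqlite3` as a stand-in, and the table name, column names and functions are assumptions for illustration.

```python
import sqlite3

# Hypothetical mapping: DSD dimensions correspond one-to-one to columns.
DIMENSION_COLUMNS = ("FREQ", "REF_AREA", "INDICATOR")


def selection_to_sql(table, selection):
    """Translate {dimension: [codes]} into a parameterised SELECT.

    Codes are passed as bound parameters, never interpolated into the SQL.
    The table name is assumed to come from trusted adapter configuration.
    """
    unknown = set(selection) - set(DIMENSION_COLUMNS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")

    clauses, params = [], []
    for column in DIMENSION_COLUMNS:
        codes = selection.get(column)
        if not codes:
            continue  # unconstrained dimension
        placeholders = ", ".join("?" for _ in codes)
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(codes)

    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def run_query(conn, table, selection):
    """Execute the translated query and return all rows."""
    sql, params = selection_to_sql(table, selection)
    return conn.execute(sql, params).fetchall()
```

A usage example against an in-memory database:

```python
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations "
             "(FREQ TEXT, REF_AREA TEXT, INDICATOR TEXT, PERIOD TEXT, VALUE REAL)")
conn.execute("INSERT INTO observations VALUES ('M', 'CH', 'CPI', '2009-07', 1.5)")
rows = run_query(conn, "observations", {"FREQ": ["M"], "REF_AREA": ["CH", "DE"]})
```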

  14. Issues found (Aug. 2009, SDMX 2.0)
     • Not possible to register compact or utility files in the registry used
     • Not possible to register files using message groups and annotations, as these are not supported in the registry used
     • Missing functionality in the SDMX Query message
     • Some issues with the registry implementation used
     • Constraints processing on the registry did not work
     • ECB does not provide DSDs on their website (files are OK)
     • Cross-platform communication with security not solved
     • In general: access authorisation to queryable data sources is unresolved

  15. Conclusions
     • SDMX vision is real: the UDC works
     • Enhancements to standards are already part of SDMX 2.1
     • Enhancements to the registry implementation are needed (e.g. industrial strength required)
     • Non-SDMX issues (cross-platform connectivity and access authentication) exist and need to be looked into
     • Current SDMX offerings from other organisations are rather diverse (message types, features used, version implemented)
     • Diverse offerings make requirements for a UDC more complex

  16. Next steps for the BIS
     • UDC can be a central part of the future BIS environment
     • Road to UDC will take a few years
     • Continue the feasibility study in the next year
     • Refine UDC
     • More data sources
     • More user facilities for search and navigation
     • Work with SDMX standards experts on issues found
     • Work with other SDMX data providers

  17. Thank you! gabriele.becker@bis.org massimo.bruschi@bis.org
