Emerging Trends in Data Exchange and Data Hubbing
Jacob Assa, UN Statistics Division
Regional Workshop on Data Dissemination and Communication
Manila, the Philippines, June 20-22, 2012
United Nations Statistics Division 2012
Outline of the Presentation
• Data Dissemination in Context
• Dissemination History at UNSD
• Dissemination versus Communication
• Data Exchange and SDMX
• Data Hubbing Nationally and Globally
Data Dissemination in Context
• Virtual Value Chain (Svend and Hollensen, 2001): Define information problem → Organize, select and compile information → Synthesize information → Distribute information → Value
• Dissemination: last but not least step
• Often done as an afterthought
• Can be made more efficient and effective:
  • From Data Publishing to Data Exchange
  • From Data Silos to Data Hubbing
Dissemination History at UNSD
• League of Nations, 1919-1948: print publications
• United Nations
  • 1948-1995: print publications (yearbooks, manuals)
  • 1995-2000: CD-ROM, static web pages
  • 2000-2008: online databases, dynamic web queries (UN Comtrade, UN Common Database)
  • 2008: launch of UNdata, the UN System data portal
  • 2010: World Statistics Pocketbook app for iPhones and iPads
  • 2012: launch of CountryData, the UN national data portal
Dissemination versus Communication
One-way vs. two-way communication
• Considerable evolution of statistical communication over recent years
• Traditionally, statistical organizations focused on
  • Dissemination through printed publications
  • One-way communication through a few media channels
    • Newspapers
    • Radio
    • Television
• Since the 1990s, an acknowledged need to do more than just disseminate data
  • Employing communication professionals
  • Widespread use of the Internet
  • New methods of communication and dissemination
Dissemination versus Communication
New methods of communication:
• Web 2.0 technologies
  • Blogs
  • Wikis
  • Social networks
• Interactive websites
  • Allow users to upload data and create graphs
  • Sharing and discussion with other users
Data Exchange - Unstructured
• Paper questionnaires
• Excel sheets
• CSV files
• Email
• Semi-structured
  • XML files
  • However, XML in itself is simply a mark-up language and does not standardize data structure between exchanging parties
XML - Example
Philippines, GDP in constant 2000 US$ (World Bank)
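The XML shown on the original slide is not reproduced in this transcript. As a rough illustration of why XML alone does not standardize data structure, the sketch below encodes one observation in two hypothetical layouts. All tag names, attribute names, codes and the value 123.4 are made-up placeholders, not World Bank data or any real schema.

```python
import xml.etree.ElementTree as ET

# Two hypothetical layouts for the same observation. Every tag name,
# attribute name and the value 123.4 is a placeholder, not real data.
LAYOUT_A = """
<data>
  <record country="PHL" indicator="GDP_CONST_2000_USD" year="2010" value="123.4"/>
</data>
"""

LAYOUT_B = """
<dataset>
  <series>
    <country>PHL</country>
    <indicator>GDP_CONST_2000_USD</indicator>
    <obs><year>2010</year><value>123.4</value></obs>
  </series>
</dataset>
"""


def parse_layout_a(xml_text):
    """Read observations stored as attributes on <record> elements."""
    root = ET.fromstring(xml_text)
    return [
        (r.get("country"), r.get("indicator"), int(r.get("year")), float(r.get("value")))
        for r in root.iter("record")
    ]


def parse_layout_b(xml_text):
    """Read observations stored as child elements under <series>."""
    root = ET.fromstring(xml_text)
    rows = []
    for series in root.iter("series"):
        country = series.findtext("country")
        indicator = series.findtext("indicator")
        for obs in series.iter("obs"):
            rows.append(
                (country, indicator, int(obs.findtext("year")), float(obs.findtext("value")))
            )
    return rows


rows_a = parse_layout_a(LAYOUT_A)
rows_b = parse_layout_b(LAYOUT_B)
# Both files are well-formed XML carrying the same observation, yet each
# needs its own parser. Feeding layout B to the layout A parser yields nothing.
mismatch = parse_layout_a(LAYOUT_B)
```

Both parsers recover the same tuple, but only because each was written for one sender's layout. This per-partner parsing effort is what an agreed structure between exchanging parties is meant to remove.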
Data Exchange - Structured
Statistical Data and Metadata Exchange (SDMX)
• What is it?
  • An initiative to foster standards for the electronic exchange of statistical information
  • Goal: explore e-standards that could increase efficiency and avoid duplication
  • Sponsored by BIS, ECB, Eurostat, IMF, OECD, UN, WB
• What it is not
  • Not a technology, but implemented using technology (XML and EDIFACT syntax, GESMES/TS message)
• How does it work?
  • Exchange partners agree on Data Structure Definitions
  • Data and metadata are exported and imported accordingly
Benefits of SDMX
Protection of existing technology investments
• Many different types of existing systems:
  • Data warehouses
  • OLAP cubes
  • GESMES/TS
  • Publication systems
• SDMX standardizes formats and protocols at the point where data and metadata pass between counterparties
SDMX Registry/Repository
[Diagram: SDMX Registry interfaces]
• Registry: indexes data and metadata
• Repository, Data Set/Metadata Set: Register, Query
• Repository, Provisioning Metadata: describes data and metadata sources and reporting processes (Submit, Query)
• Repository, Structural Metadata: describes data and metadata structures (Submit, Query)
• Subscription/Notification
Impact of the SDMX Registry
• The SDMX Registry allows for one of the major efficiency gains possible with SDMX:
  • Shifting from "push"-based reporting to "pull"-based reporting
• This can save a lot of time and duplication of effort
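The shift from push to pull can be sketched with a toy in-memory registry. This is only an illustration under stated assumptions: the `Registry` class, its method names, the `FAKE_WEB` dictionary standing in for the network, and the example.org URL are all hypothetical and are not part of any SDMX registry interface.

```python
# Stand-in for the network: maps a provider URL to the file it would serve.
# The URL and payload are hypothetical placeholders.
FAKE_WEB = {"https://example.org/nso/gdp.xml": "<gdp-data/>"}


class Registry:
    """Toy registry: it stores where data lives, not the data itself."""

    def __init__(self):
        self._sources = {}      # dataset id -> URL of the provider's file
        self._subscribers = {}  # dataset id -> callbacks to notify

    def subscribe(self, dataset_id, callback):
        """A collector asks to be told when a dataset is registered."""
        self._subscribers.setdefault(dataset_id, []).append(callback)

    def register(self, dataset_id, url):
        """A provider announces a new file, then notifies subscribers."""
        self._sources[dataset_id] = url
        for callback in self._subscribers.get(dataset_id, []):
            callback(dataset_id, url)

    def query(self, dataset_id):
        """Look up the registered URL, or None if nothing is registered."""
        return self._sources.get(dataset_id)


pulled = []


def collector(dataset_id, url):
    # Pull model: the collector fetches from the provider's own site
    # instead of waiting for the provider to send a file to it.
    pulled.append((dataset_id, FAKE_WEB[url]))


registry = Registry()
registry.subscribe("GDP", collector)
registry.register("GDP", "https://example.org/nso/gdp.xml")
```

In this sketch the provider publishes once and registers the location. Any number of collectors can then pull the same file, which is where the saving in duplicated reporting effort would come from.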
What is a Data Structure Definition?
• Specifies a set of concepts which describe and identify a set of data
• Tells which concepts are the dimensions (identification and description) and which are attributes (just description)
• Tells which code lists provide the possible values for the dimensions and attributes
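The three points above can be made concrete with a minimal sketch. The class below is a simplified stand-in and not the SDMX information model; the concept names and code lists are illustrative choices made for this example, not an official Data Structure Definition.

```python
from dataclasses import dataclass


@dataclass
class DataStructureDefinition:
    """Simplified stand-in for a DSD: concepts plus their code lists."""

    dimensions: dict  # concept -> allowed codes; together they identify a series
    attributes: dict  # concept -> allowed codes; they only describe

    def validate(self, observation):
        """Return a list of problems; an empty list means the record conforms."""
        errors = []
        for concept, codes in self.dimensions.items():
            if concept not in observation:
                errors.append(f"missing dimension {concept}")
            elif observation[concept] not in codes:
                errors.append(f"invalid code for {concept}: {observation[concept]}")
        for concept, codes in self.attributes.items():
            # Attributes are treated as optional in this sketch.
            if concept in observation and observation[concept] not in codes:
                errors.append(f"invalid code for {concept}: {observation[concept]}")
        return errors

    def series_key(self, observation):
        """Only dimensions take part in identifying the data."""
        return tuple(observation[concept] for concept in self.dimensions)


# Illustrative concepts and code lists (assumptions for this example).
dsd = DataStructureDefinition(
    dimensions={
        "REF_AREA": {"PH", "KH"},
        "INDICATOR": {"GDP", "POP"},
        "FREQ": {"A", "Q"},
    },
    attributes={"OBS_STATUS": {"A", "E", "P"}},
)

good = {"REF_AREA": "PH", "INDICATOR": "GDP", "FREQ": "A", "OBS_STATUS": "E"}
bad = {"REF_AREA": "XX", "INDICATOR": "GDP", "OBS_STATUS": "E"}

good_errors = dsd.validate(good)
bad_errors = dsd.validate(bad)
key = dsd.series_key(good)
```

Here `good` conforms and is identified by its three dimension codes, while `bad` is rejected for an unknown area code and a missing frequency. Once two partners share such a definition, each can check and load the other's data without a custom agreement per file.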
What is Data Hubbing?
• In general, a hub is the central part of a wheel where the spokes come together. The term is familiar to frequent fliers who travel through airport "hubs" to make connecting flights from one point to another
• In data communications, a hub is a place of convergence where data arrives from one or more directions and is forwarded out in one or more other directions
Source: http://searchnetworking.techtarget.com
Data Hubbing at the National Level
Cambodia: DFID Project
Objectives
• Improve coordination in the National Statistical System
• Collate development data in one place/hub
• Make access to national data easier
• Reduce data request burden
• Use of latest IT software and practices
Project Dissemination Model
[Diagram. Actors: Line Ministries, National Statistical Office, United Nations. Components: Line Ministry Database, National Repository DB, DevInfo, Mapping tool, Publish Scripts, SDMX-ML, National Indicator Registry. Actions: Upload, Post notification, Register files, Download XLS.]
Data Hubbing at the International Level (1)
The Joint External Debt Hub (JEDH)
Jointly developed by
• Bank for International Settlements (BIS)
• International Monetary Fund (IMF)
• Organisation for Economic Co-operation and Development (OECD)
• World Bank (WB)
JEDH Site before SDMX
[Diagram: BIS, IMF, OECD and World Bank supply data in various formats to the website, with a 3-month production cycle.]
JEDH with SDMX
[Diagram: BIS, IMF, OECD and World Bank each provide SDMX-ML, and information about the data is registered in the SDMX Registry. An SDMX "agent" discovers data and URLs through the registry and retrieves the data from the sites, and the SDMX-ML is loaded into the JEDH DB. Data is provided in real time to the JEDH site. The diagram also carries the label "Debtor database".]
Data Hubbing at the International Level (2)
UNdata Portal
• Before, a researcher interested in analyzing the effects of population, health and education on per capita income growth would need to visit:
  • UNSD website for population figures
  • WHO website for health indicators
  • UNESCO website for education indicators
  • UNSD/World Bank/IMF website for income data
• Now all these indicators are available in one place through a single user interface
Data Hubbing at the International Level (3)
European Central Bank (ECB)
• Push vs. pull plus a hybrid approach
  • Central Hub to which all member banks submit their SDMX data
  • The ECB then pulls the entire dataset from the Central Hub
• SDMX-based visualizations
Resources
• UNSD - Handbook of Statistical Organization (3rd ed.) http://unstats.un.org/unsd/dnss/hb/default.aspx
• UNECE - Making Data Meaningful (2 parts) http://www.unece.org/stats/documents/writing/
• SDMX - http://sdmx.org/
Contacts
• United Nations Statistics Hotline - statistics@un.org
• Jacob Assa, UNSD - assaj@un.org