Learn how to map data between Di Database and MDG DSD using the mapping tool. Understand codelists, dimensions, and structures for accurate mapping.
CountryData: SDMX for Development Indicators. MDG DSD vs. the Di Database: Using the Mapping Tool
MDG Data Structure Definition (DSD) background • Developed by SDMX Task Team of IAEG on MDGs • Supports exchange of MDG Indicator data between international agencies (UN, UNICEF, UNESCO, …) • Implemented in SDMX 2.0 • Latest version (2.4) finalised in Feb 2013
DevInfo (Di) background • Data dissemination software supported and promoted by UNICEF • DevInfo7 (Di7) launched in Nov 2012 • SDMX 2.1 & 2.0 compliant • Web-based software • 9 out of 11 project countries using DevInfo • More stable than previous releases
Simple relation between Di & DSD
Di Database: • Area • Indicator • Unit • Subgroup (e.g. Sex, Age, Location) • Source • Time Period • Footnotes
MDG DSD: • Frequency (Default = “Annual”) • Reference Area • Series • Units of measurement • Unit multiplier (Default = 0) • Location (Default = “Total”) • Age group (Default = “All Ages”) • Sex (Default = “Both Sexes”) • Source Type (Default = “NA”) • Source details • Time Period • Time period details • Nature of data points (Default = “C”) • Footnotes
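The relation above can be sketched in Python as a record transformation. This is only an illustration, not the Di mapping tool itself: the field-to-concept pairing in `DI_TO_DSD`, the function name `to_dsd`, and the placeholder values "Country X" and "Survey" are assumptions; the defaults are the ones listed on the slide.

```python
# Defaults listed on the slide for DSD concepts that have no direct Di field.
DSD_DEFAULTS = {
    "Frequency": "Annual",
    "Unit multiplier": 0,
    "Location": "Total",
    "Age group": "All Ages",
    "Sex": "Both Sexes",
    "Source Type": "NA",
    "Nature of data points": "C",
}

# Assumed pairing of Di fields with DSD concepts (illustrative only).
DI_TO_DSD = {
    "Area": "Reference Area",
    "Indicator": "Series",
    "Unit": "Units of measurement",
    "Source": "Source details",
    "Time Period": "Time Period",
    "Footnotes": "Footnotes",
}


def to_dsd(di_record, subgroup_overrides=None):
    """Build a DSD-shaped dict from one Di data value.

    Starts from the DSD defaults, copies the Di fields across, then applies
    any Sex / Age group / Location values derived from the Di subgroup.
    """
    dsd = dict(DSD_DEFAULTS)
    for di_field, dsd_concept in DI_TO_DSD.items():
        if di_field in di_record:
            dsd[dsd_concept] = di_record[di_field]
    dsd.update(subgroup_overrides or {})
    return dsd


example = to_dsd(
    {
        "Area": "Country X",          # placeholder area name
        "Indicator": "Infant Mortality",
        "Unit": "Number",
        "Source": "Survey",           # placeholder source
        "Time Period": "2010",
    },
    subgroup_overrides={"Sex": "Female"},
)
```

Any dimension not supplied by the Di subgroup keeps its default, which is why LOCATION, SEX and AGE GROUP always carry a value in the result.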
Mapping to the DSD • The DSD's dimensional structure means values are mandatory for LOCATION, SEX & AGE GROUP. • Due to the nature of this domain (i.e. MDGs), it is not obvious which values should be used in these dimensions • For example, what is SEX for “Births attended by skilled personnel”: • Not Applicable? Total? Female?
Mapping to the DSD • Inconsistent mappings lead to duplications and other anomalies • In CountryData, mappings for indicators/time series are agreed before data exchange (see mapping for MDGs from 1st workshop) • However, this is just one side of the story…
Mapping to the DSD • Understanding the structure and contents of the origin database is fundamental to the mapping process • Mapping to the DSD subjects the data to certain ‘restrictions’ that do not apply in the database (and vice versa).
Mapping to the DSD • The mapping tool in the Di software is designed to work with the Di database as simply as possible… • the tool is based on mapping between the codelists of the DSD and the origin database; • certain situations require some further manual effort to map a time series; • and sometimes a “fix” to the database is required where the data simply isn’t valid or is duplicated. • Therefore it’s good to review the Di structure to understand where these issues usually occur.
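Codelist-based mapping can be pictured as a lookup from each value in a Di codelist to a value in the corresponding DSD codelist, with anything left over flagged for manual attention. The sketch below is an assumption-laden illustration, not the tool's actual logic: `map_codelist`, the DSD labels "Male" and "Female", and the Di value "Unknown" are hypothetical; only "Both Sexes" comes from the slides.

```python
def map_codelist(di_values, mapping):
    """Split Di codelist values into mapped and unmapped.

    Returns (mapped, unmapped): a dict of Di value -> DSD value, and a list
    of Di values with no entry in the mapping (these need manual effort).
    """
    mapped = {value: mapping[value] for value in di_values if value in mapping}
    unmapped = [value for value in di_values if value not in mapping]
    return mapped, unmapped


# Hypothetical mapping for the Sex sub-dimension; DSD labels are assumptions
# apart from "Both Sexes", which is the default named on the slide.
SEX_MAPPING = {"Male": "Male", "Female": "Female", "Total": "Both Sexes"}

di_sex_values = ["Male", "Female", "Total", "Unknown"]  # "Unknown" is made up
mapped, unmapped = map_codelist(di_sex_values, SEX_MAPPING)
```

Here `unmapped` holds the values that the codelist mapping alone cannot resolve, which is where the further manual effort mentioned above comes in.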
Di Data Architecture • Area: hierarchical dimension • IUS = Indicator, Unit and Subgroup: time series data are stored under the combination of these 3 dimensions (Indicator; Unit; Subgroup, a combination of one or more sub-dimensions) • Source & Time Period: together with IUS, “uniquely” define each data value • Footnote: “free text” field stored with the data value
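One way to read this architecture is that each data value has a composite key, and two values sharing a key are duplicates that need a “fix” in the database. The sketch below assumes Area is part of that key alongside IUS, Source and Time Period; the names `DataKey` and `find_duplicates` and the placeholder values are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class DataKey(NamedTuple):
    """Assumed composite key for one Di data value."""
    area: str
    indicator: str
    unit: str
    subgroup: str
    source: str
    time_period: str


def find_duplicates(keys):
    """Return the keys that occur more than once, in first-seen order."""
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


keys = [
    DataKey("Country X", "Infant Mortality", "Number", "Total", "Survey", "2010"),
    DataKey("Country X", "Infant Mortality", "Number", "Total", "Survey", "2010"),
    DataKey("Country X", "Infant Mortality", "Number", "Total", "Survey", "2011"),
]
duplicates = find_duplicates(keys)
```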
Di INDICATOR • IUS: Indicator Unit Subgroup • Indicator, for example: • Infant Mortality • AIDS Death • Malaria Death • Similar to SERIES in the DSD • Contains only Indicator-specific values
Di UNIT • IUS: Indicator Unit Subgroup • Unit, for example: • Percentage • Number • USD • Square KM • Similar to Units of measurement in the DSD • Contains only Unit-specific values
Di SUBGROUP • IUS: Indicator Unit Subgroup • Subgroup dimension: • Combination of one or more sub-dimensions • “Age”, “Sex”, “Location” and “Other” sub-dimensions are set initially in the database • Specific values can be created under each sub-dimension • Relates to SEX, AGE GROUP and LOCATION in the DSD.
Di SUBGROUP • IUS: Indicator Unit Subgroup • Formation Logic:
Sub-Dimension: Sub-Dimension values
Age: < 1 Year, < 5 Year, 5 – 10 Year
Sex: Male, Female, Total
Location: Urban, Rural
Other: Rice, Wheat
SUBGROUP (Combination): <1 Year Male; <5 Year Female Rural; Urban
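The formation logic can be sketched as picking one value from each of one or more sub-dimensions and joining them into a subgroup name. This is an illustration only: `form_subgroups` is a hypothetical helper, and enumerating every combination is an assumption made for the example (a real database would hold only the subgroups it needs).

```python
from itertools import product

# Sub-dimension values taken from the slide.
SUB_DIMENSIONS = {
    "Age": ["<1 Year", "<5 Year", "5-10 Year"],
    "Sex": ["Male", "Female", "Total"],
    "Location": ["Urban", "Rural"],
    "Other": ["Rice", "Wheat"],
}


def form_subgroups(chosen):
    """Return every subgroup name formed from the chosen sub-dimensions.

    `chosen` is a list of sub-dimension names; one value is taken from each
    and the values are joined with spaces, in the order given.
    """
    value_lists = [SUB_DIMENSIONS[name] for name in chosen]
    return [" ".join(combo) for combo in product(*value_lists)]


age_sex = form_subgroups(["Age", "Sex"])      # two sub-dimensions combined
location_only = form_subgroups(["Location"])  # a single sub-dimension
```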
Di Mapping Tool: Introduction Once data exists in the di7 web-based software, it can be mapped and published in a form that conforms to the MDG DSD. This is all done online in the di7 web-based repository, using the administration profile, so let’s begin…
Exercise 1: Codelist mapping • Use unstats.un.org/unsd/demodiweb[1-6] • Username = webmaster@xyz.com • Password = support@2012 • Map the codelists (where possible) for • Unit • Age • Sex • Location • Area • And just one indicator, “Antenatal care coverage for at least one visit”
Exercise 2: mapping time series • Use unstats.un.org/unsd/demodiweb[1-6] • Username = webmaster@xyz.com • Password = support@2012 • Map the time series for • “Antenatal care coverage for at least four visits” • “Employment to population ratio” • “Literacy rate of 15-24 year-olds” • “Death rate associated with malaria” • “Proportion of population using solid fuels”