SIMPII – Workshop on Information Technology
Day 4: Metadata
Statistics Canada, December 1st, 2011

Outline
• What is metadata?
• Standards
• Why is it important?
• Implementation example with Social Surveys
• Common Tools
Statistics Canada • Statistique Canada

What is metadata?
• Definition: “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource”*
• Describes the content, quality, condition and other characteristics of data
*NISO (2004) Understanding Metadata. Bethesda, NISO Press

What is metadata?
Metadata answers questions about your data:
• What is the concept?
• Where is the input source?
• What is it used for?
• When did it change?
• Who changed the variable last?
Helps to improve communication between data developers, data users and organizations

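The questions on this slide can be read as the fields of a metadata record. The sketch below is only an illustration of that idea: the class name `VariableMetadata`, its field names, and all sample values are assumptions made for this example and do not describe any Statistics Canada system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VariableMetadata:
    """Hypothetical record; one field per question on the slide."""
    name: str           # the variable being described
    concept: str        # What is the concept?
    input_source: str   # Where is the input source?
    used_for: str       # What is it used for?
    last_changed: date  # When did it change?
    changed_by: str     # Who changed the variable last?


def describe(meta: VariableMetadata) -> str:
    """Summarize a record in one human-readable line."""
    return (
        f"{meta.name}: {meta.concept} "
        f"(source: {meta.input_source}; "
        f"last changed {meta.last_changed.isoformat()} by {meta.changed_by})"
    )


# All values below are placeholders invented for the example.
record = VariableMetadata(
    name="EXAMPLE_VAR",
    concept="Example concept",
    input_source="Example questionnaire item",
    used_for="Illustration only",
    last_changed=date(2011, 12, 1),
    changed_by="example_user",
)

print(describe(record))
```

Keeping the answers in named fields, rather than in free text, is what lets data developers, data users and organizations read the same record the same way.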
Standards
• Intended to establish a common understanding of the meaning or semantics of the data
• As an example, at StatCan we use DDI (Data Documentation Initiative): a standard for technical documentation describing social science data

Why is it important?
• Records basic information about your data
• Provides a common understanding of your data
• Allows for reuse during the Survey Development Life Cycle
• Facilitates connections between systems & services
• Supports archiving & preservation

Example
Search terms: “dog” versus “golden retriever puppy”
Clearly, the more specific search term is better, but it only works if someone has taken the time to associate the metadata with the content.

Example
This puppy example illustrates not only the effectiveness of metadata but also the importance of tagging content with metadata. If users don’t take the time to attach metadata when they create, upload, or edit documents, the benefits will be lost.
[Diagram: a document (DOC) tagged with metadata fields: document name, expiration date, audience, version, project, department]

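The puppy example can be sketched as a small tag search. The file names, the tags, and the `search` helper are all made up for illustration. The point is only that a specific query narrows the results, and that an untagged document is never found.

```python
# Hypothetical document collection; names and tags are invented.
documents = [
    {"name": "photo_001.jpg", "tags": {"dog"}},
    {"name": "photo_002.jpg", "tags": {"dog", "golden retriever", "puppy"}},
    {"name": "photo_003.jpg", "tags": set()},  # never tagged, so never found
]


def search(docs, *required_tags):
    """Return the names of documents that carry every requested tag."""
    wanted = set(required_tags)
    return [d["name"] for d in docs if wanted <= d["tags"]]


broad = search(documents, "dog")
specific = search(documents, "golden retriever", "puppy")

print(broad)     # both tagged dog photos
print(specific)  # only the photo tagged in detail
```

The third document may well show a puppy, but with no metadata attached, neither query can return it.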
Enterprise Metadata Classification
Common Tools Logo
Common Tools Technical Architecture

Solution Overview
• Social Survey Metadata Environment (SSME)
• Supporting environment of a metadata-driven processing system
• Interfaces are developed to access and manipulate appropriate metadata in support of a particular business process:
• Questionnaire Development (QDT)
• Data Dictionary (DDT)
• Processing and Specifications (PST)
• Derived Variable (DVT)

Solution Overview
• Social Survey Processing Environment (SSPE)
• A set of generalized processes that can be used in the processing activities of the Survey Life Cycle
• The purpose of these processes is to allow subject matter and survey support staff to specify and run the processing of a survey in a timely fashion, with high-quality outputs

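One way to picture "metadata-driven processing" is a generic routine whose behaviour comes entirely from specifications stored as data. The sketch below is not the SSME or SSPE. The specification layout, the variable names `AGE` and `AGE_GROUP`, and the `derive` function are assumptions chosen to show the general idea of a derived variable defined by metadata rather than by survey-specific code.

```python
# Hypothetical derived-variable specification, held as metadata.
# Each bin is (lowest value, highest value, output code).
derived_variable_specs = [
    {
        "name": "AGE_GROUP",
        "source": "AGE",
        "bins": [(0, 14, 1), (15, 64, 2), (65, 120, 3)],
    },
]


def derive(record, specs):
    """Apply every specification to one record and return an enriched copy."""
    out = dict(record)
    for spec in specs:
        value = record[spec["source"]]
        # Pick the code of the first bin that contains the value,
        # or None when no bin matches.
        out[spec["name"]] = next(
            (code for low, high, code in spec["bins"] if low <= value <= high),
            None,
        )
    return out


result = derive({"AGE": 37}, derived_variable_specs)
print(result)
```

Under this design, changing a survey's processing means editing the specification, not the generic `derive` routine, which is the kind of reuse a generalized process aims for.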
Questionnaire Development Tool (screenshots, 2 slides)
QDT Auto-generated Report
Processing Specifications Tool (3 slides)

Data Dictionary Tool output
Variable Name: CELL_03A
Length: 1
Position: 5
Question Name: CELL_Q03
Concept: Reasons to get a cell phone – Gift
Question: For which of the following reasons did you get your cell phone? – Gift
Universe: Respondents who answered CELL_1=1

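The Length and Position fields suggest how a dictionary entry like this one could be used to read a variable out of a fixed-width data record. The sketch below rests on assumptions the slide does not state: that Position is 1-based and counts characters from the start of the record, and that the record is a plain text line. The sample record is invented.

```python
# Entry copied from the Data Dictionary Tool output above.
entry = {"variable": "CELL_03A", "length": 1, "position": 5}


def extract(record_line, entry):
    """Slice one variable out of a fixed-width record.

    Assumes 'position' is 1-based, so it is shifted to a 0-based index.
    """
    start = entry["position"] - 1
    return record_line[start:start + entry["length"]]


line = "00121457"  # made-up record, not real survey data
value = extract(line, entry)
print(entry["variable"], "=", value)
```

If the real files use a different position convention, only the offset in `extract` would need to change.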
Common Tools Entity Relationship Diagram
Common Tools Portal

SDMX
Statistical Data and Metadata eXchange (born in 2002)
- Standardization for statistical data and metadata access and exchange
- Between NSOs and international organizations
- Within a national statistical system
- Within an organization
- For dissemination
Sponsors: BIS, ECB, EUROSTAT, IMF, OECD, UN, World Bank
1) Technical standards (v1: ISO 17369)
- XML-based message formats (SDMX-ML)
- GESMES and the UN/EDIFACT-based message formats
- Guidelines for SDMX web service implementations
- SDMX registry specification (“yellow pages”)
2) SDMX Content-Oriented Guidelines
- Statistical subject-matter domains (to locate data and working groups)
- Cross-domain concepts/code lists (incl. metadata concepts, mapping if difficult to agree)
- Metadata common vocabulary (terminology)

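The "mapping" item under cross-domain code lists can be illustrated with a small translation table from an organization's local codes to a shared code list. This sketch does not use any SDMX library or message format, and the codes on both sides are invented for the example. They should not be read as an official SDMX code list.

```python
# Hypothetical mapping from local codes to a shared code list.
LOCAL_TO_SHARED = {"1": "M", "2": "F"}


def to_shared(local_codes, mapping):
    """Translate local codes; collect any code with no agreed mapping."""
    mapped, unmapped = [], []
    for code in local_codes:
        if code in mapping:
            mapped.append(mapping[code])
        else:
            unmapped.append(code)
    return mapped, unmapped


mapped, unmapped = to_shared(["1", "2", "9"], LOCAL_TO_SHARED)
print(mapped)
print(unmapped)
```

Returning the unmapped codes separately, instead of dropping them, makes the cases where agreement is still missing visible before data is exchanged.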
SDMX Plans for Statistics Canada
- Create SDMX-ML outputs from CANSIM
- Investigate OECD implementation of SDMX using .STAT software
- Participate in the Statistical Network: Innovation in dissemination, machine-to-machine transfer stream, with Stats New Zealand and the Australian Bureau of Statistics
- Investigate implementation of the SDMX Reference Infrastructure from Eurostat

Conclusion
• Communication is key to collaboration
• Helps decision making
• Reduces system and data redundancy
• Enables enterprise-wide application development

Xiexie (thank you)
Jean Labbé
Field IT Manager
Statistical Information System Division, Informatics Branch
(613) 951-2584
Jean.Labbe@statcan.gc.ca