
A two-phase life-cycle model of integrated statistical micro data

Explore the two-phase life-cycle model for integrated statistical micro data, which combines register-based statistics and survey sampling. Covers measurement, representation, validity, and sampling and measurement errors. Welcome to a new age of data integration!

Presentation Transcript


  1. A two-phase life-cycle model of integrated statistical micro data. Li-Chun Zhang, Statistics Norway, lcz@ssb.no

  2. Register-based statistics & early years of survey sampling (Source: UNECE 2007)
     Register-based statistics: 20??
     N. Kiær (1895). The representative method. ISI Session, Bern.
     A. Jensen (ISI-committee, 1924): “When ISI discussed the matter twenty-two years ago, it was the question of the recognition of the method in principle that claimed most interest. Now it is otherwise. I think I may venture to say that nowadays there is hardly one statistician, who in principle will contest the legitimacy of the representative method. Nevertheless, I believe that the representative method is capable of being used to a much greater extent than now is the case.”
     J. Neyman (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. JRSS 97, 558-606.

  3. Survey life cycle from a quality perspective (Groves et al., 2004, Survey Methodology, Figure 2.5). Error sources are shown in brackets.
     Measurement: Construct -> [Validity] -> Measurement -> [Measurement Error] -> Response -> [Processing Error] -> Edited Response
     Representation: Target Population -> [Coverage Error] -> Sampling Frame -> [Sampling Error] -> Sample -> [Nonresponse Error] -> Respondents -> [Adjustment Error] -> Postsurvey Adjustments
     Both sides lead to the Survey Statistic.

  4. A two-phase life-cycle model
     • Secondary use
     • Combination of sources

  5. Single-source primary-phase statistical micro data. Error sources are shown in brackets.
     Measurement (Variables): Target Concept -> [Validity] -> Measurement -> [Measurement] -> Response/Registration -> [Processing] -> Editing
     Representation (Objects): Target Set -> [Frame] -> Accessible Set -> [Selection] -> Accessed Set -> [Missing/Redundancy] -> Observed/Validated Set
     Both sides lead to Single-source Micro Data (Primary).

  6. Integrated secondary-phase statistical micro data. Error sources are shown in brackets.
     Contrasts with the primary phase: Unit vs. Object; Measurement vs. Representation; Missing Values vs. Coverage
     Transformation (Object to Unit)
     Measurement (Variables): Target Concept -> [Relevance] -> Harmonization -> [Mapping] -> Classification -> [Compatibility] -> Adjustment
     Representation (Units): Target Population -> [Coverage] -> Data Linkage -> [Identification] -> Alignment -> [Unit] -> Statistical Units
     Unit diagram labels: Base Unit No. 1, No. 2, ..., No. N; Composite Unit No. 1, No. 2, ..., No. K / No. M / No. H; relations m:1
     Both sides lead to Integrated Micro Data (Secondary).

  7. An illustration of register-based household data: Kongsvinger at the time point of census 2001

  8. Representing unit error by allocation matrix (Equivalence on row permutation & sequential upper-triangular by definition)

  9. Value matrix (or vector): X. Statistics: y = A X

  10. Two more examples of statistics

  11. Results: Statistical uncertainty w.r.t. unit errors

  12. The 20th Century = Survey Sampling. The 21st Century = Data Integration. Welcome to a new age!
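
The allocation-matrix idea on slides 8 and 9 can be illustrated with a small sketch. The transcript does not define the matrix, so the following is an assumption for illustration only: A is a 0/1 matrix with A[i][j] = 1 when person j is allocated to household i, X is a vector of person-level values, and y = A X gives household-level statistics. The toy data, the helper names (`allocation_matrix`, `matvec`, `canonical`) and the canonical row order (households sorted by their first member, one possible reading of the slide's "sequential upper-triangular" convention) are all hypothetical and not taken from the presentation.

```python
def allocation_matrix(household_of, n_households):
    """Build a 0/1 matrix A with A[i][j] = 1 if person j is in household i.

    household_of[j] is the (hypothetical) household index of person j.
    """
    A = [[0] * len(household_of) for _ in range(n_households)]
    for j, i in enumerate(household_of):
        A[i][j] = 1
    return A


def matvec(A, x):
    """Compute y = A x for a list-of-rows matrix and a value vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def canonical(A):
    """Order non-empty rows by the index of their first member.

    Two allocation matrices that differ only by a row permutation
    (a relabelling of households) get the same canonical form.
    """
    nonempty = [row for row in A if any(row)]
    return sorted(nonempty, key=lambda row: row.index(1))


# Toy example (made-up data): 5 persons allocated to 3 households.
A = allocation_matrix([0, 0, 1, 2, 1], 3)

ones = [1, 1, 1, 1, 1]             # X = 1 for every person
income = [300, 200, 400, 150, 0]   # X = hypothetical person-level income

household_size = matvec(A, ones)       # persons per household
household_income = matvec(A, income)   # income per household

# A unit error: person 4 is placed in household 2 instead of household 1.
A_err = allocation_matrix([0, 0, 1, 2, 2], 3)
household_size_err = matvec(A_err, ones)

# Relabelling households permutes the rows but describes the same allocation.
A_perm = [A[2], A[0], A[1]]
same_allocation = canonical(A_perm) == canonical(A)
```

In this sketch the value vector X is unchanged by a unit error; only A changes, so any uncertainty in the household statistics y comes from uncertainty about the allocation itself.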
