Progress on Real Time Remote Access
Michelle Simard
Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality
Tarragona, Spain, November 23, 2011
Since 2009
• Developed a prototype offering tabulated counts
• Developed Statistical Disclosure Control (SDC)
• Continued development on different fronts
Statistics Canada • Statistique Canada
The Prototype
The Prototype
Spring 2010
• Tabular (counts) outputs only, SAS only
• Modified PROC FREQ, DATA steps
• Limited to particular household survey data sets
• Confidentiality automated, no manual intervention
• Limited to some Canadian federal departments only
• Ability to query RTRA microdata at any time
• Access from any computer with internet access, using a secure username and password
• No travel to Research Data Centres
The Prototype
• Minimum 4 minutes plus process time
• Maximum 3 hours plus process time
• Email notification for outputs with 7-day retention
• Formatted table in HTML or in SAS
Current SDC
Additive and Controlled Rounding (ACROUND)
• Create a rounded additive table close to the original table, with a controlled grand total → semi-controlled rounding
• Use an iterative process to improve the semi-controlled result → controlled rounding
• Protects against possible matching of information with PUMF, with a small impact on precision
• Maximum: 5 dimensions
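To make the idea of an additive rounded table concrete, here is a minimal one-dimensional sketch in Python. It is not ACROUND itself, which the slide describes as iterative and able to handle up to 5 dimensions. The function name `round_counts_additively` and the rounding base of 5 are assumptions made for illustration only. Each cell is moved to an adjacent multiple of the base, and the cells are chosen so that they still sum to the rounded total.

```python
def round_counts_additively(counts, base=5):
    """Round non-negative integer counts to multiples of `base` so that the
    rounded cells sum exactly to the rounded total (1-D illustration only)."""
    total = sum(counts)
    # Round the total to the nearest multiple of the base (integer arithmetic).
    rounded_total = base * ((total + base // 2) // base)
    # Start with every cell rounded down to a multiple of the base.
    floors = [base * (c // base) for c in counts]
    # Number of cells that must be rounded up instead to reach the rounded total.
    round_up = (rounded_total - sum(floors)) // base
    # Round up the cells with the largest remainders first.
    order = sorted(range(len(counts)), key=lambda i: counts[i] % base, reverse=True)
    rounded = floors[:]
    for i in order[:round_up]:
        rounded[i] += base
    return rounded, rounded_total


cells, total = round_counts_additively([3, 7, 12, 1])
print(cells, total)
```

Choosing the largest remainders keeps every rounded cell within one base unit of its original value, which is the sense in which the rounded table stays "close to the original table".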
Recent Development
Proc Percentile
Release the percentile only if:
1. there are at least n1 observations ≥ the percentile value and at least n2 observations ≤ the percentile value
2. it is ≠ the minimum or maximum value
3. the total number of unweighted observations is ≥ m
4. the rounded frequency associated with the percentile (from ACROUND) is ≠ 0
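The four release conditions above can be sketched as a single check. This is a Python illustration rather than the SAS implementation; the function names and the nearest-rank percentile definition are assumptions (SAS supports several percentile definitions), and the thresholds n1, n2 and m are passed in because the slides do not give their values.

```python
import math


def percentile_value(values, p):
    """Nearest-rank percentile (an assumed definition, not necessarily RTRA's)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def can_release_percentile(values, p, n1, n2, m, rounded_freq):
    """Return True only if all four release conditions hold.

    values       -- unweighted observations in the domain
    rounded_freq -- the ACROUND-rounded frequency for the same cell
    """
    if not values:
        return False
    q = percentile_value(values, p)
    at_or_above = sum(1 for v in values if v >= q)
    at_or_below = sum(1 for v in values if v <= q)
    return (
        at_or_above >= n1 and at_or_below >= n2   # condition 1
        and q != min(values) and q != max(values)  # condition 2
        and len(values) >= m                       # condition 3
        and rounded_freq != 0                      # condition 4
    )


observations = list(range(1, 21))
print(can_release_percentile(observations, 50, n1=5, n2=5, m=10, rounded_freq=20))
```

In this example the 50th percentile has enough observations on both sides and is not an extreme value, so it would be released; asking for the 100th percentile instead would fail condition 2.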
Recent Development
Proc Mean
Release the mean only if:
1. there are at least n3 observations present in the domain
2. the rounded frequency associated with the mean (from ACROUND) is ≠ 0
For both PROCs, “magnitude” rounding will be applied to statistics to balance precision and noise
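A rough Python sketch of the mean release rule follows. The slides do not define “magnitude” rounding, so rounding to a fixed number of significant digits is used here purely as an assumed stand-in; the function names and the default of 3 significant digits are likewise hypothetical.

```python
import math


def round_to_magnitude(x, sig_digits=3):
    """Round x to `sig_digits` significant digits.

    Assumed interpretation of "magnitude" rounding, not the RTRA definition.
    """
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, sig_digits - 1 - exponent)


def release_mean(values, n3, rounded_freq, sig_digits=3):
    """Return the rounded mean, or None when the release rule is not met."""
    if len(values) < n3 or rounded_freq == 0:
        return None  # suppressed
    return round_to_magnitude(sum(values) / len(values), sig_digits)


print(release_mean([10, 20, 30, 40], n3=3, rounded_freq=5))
print(release_mean([10, 20, 30], n3=5, rounded_freq=5))
```

Rounding by significant digits rather than decimal places keeps the relative noise similar for small and large statistics, which is one way to read "balance precision and noise".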
Challenges and Issues
• Not only balancing confidentiality and precision BUT quality measures as well
• Evaluating the risk
• Displaying information (what and how)
• Statistics, Standard Error (SE), Variance, Coefficient of Variation (CV), Confidence Interval (CI), Quality Indicator (QI), weighted counts, unweighted counts, ACROUND outputs?
Quality Measures
• Release estimate with SE and a Quality Indicator (QI)
• If not releasable ==> put ‘X’s or other symbol
• otherwise release SE and QI as follows:
*Note: CV is calculated from original non-rounded S.E. and percentile
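As a small illustration of the suppression rule and the note on CV, the Python sketch below replaces non-releasable results with ‘X’ and otherwise computes the CV as the standard error divided by the estimate, expressed as a percentage. The function name `quality_row` and the dictionary layout are assumptions. The mapping from CV to a Quality Indicator is not reproduced in the text above, so the sketch stops at the CV.

```python
def quality_row(estimate, se, releasable):
    """Build one output row from the original (non-rounded) estimate and SE."""
    if not releasable:
        # Suppressed cell: show 'X' in place of every statistic.
        return {"estimate": "X", "se": "X", "cv_percent": "X"}
    # CV is undefined when the estimate is zero.
    cv = None if estimate == 0 else 100.0 * se / abs(estimate)
    return {"estimate": estimate, "se": se, "cv_percent": cv}


print(quality_row(200.0, 10.0, releasable=True))
print(quality_row(200.0, 10.0, releasable=False))
```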
Next steps
• Used output control techniques rather than input control techniques
• Next step: proportions, ratios, totals, models
• May need input control techniques when going into modeling
• Expansion to the academic community
• Expansion to Censuses, then administrative data
• Streamlining the approval processes
• Developing a “fee” structure and “penalty” processes
For more information, please contact:
Pour plus d’information, veuillez contacter :
Michelle Simard
Michelle.Simard@statcan.gc.ca
THANK YOU