
The effect of disclosure control on the SWS data Eileen Howes 30 September 2004


Presentation Transcript


  1. The effect of disclosure control on the SWS data • Eileen Howes • 30 September 2004

  2. With much assistance from: Giorgio Finella, Bill Armstrong

  3. Summary of presentation • Early concerns with output area matrix data • Early comparison with Theme Table 10 • Commissioned aggregations • Recent comparisons with commissioned table • What can we do next?

  4. OA matrix data quality • Major problems with output area matrix data • Clementswood and Hyde Park wards – difference between Theme Table 10 and SWS301, means of travel to work for all workers

  5. Clementswood ward – workers (columns: SWS301 / TT10 / difference) • Total workers: 8,810 / 8,059 / 751 • Underground: 303 / 263 / 40 • Car driver: 4,646 / 4,005 / 641 • Bus: 1,567 / 1,554 / 13

  6. Hyde Park ward – workers (columns: SWS301 / TT10 / difference) • Total workers: 17,024 / 16,712 / 312 • Underground: 5,739 / 5,298 / 441 • Car driver: 2,779 / 2,954 / -175 • Bus: 1,835 / 1,852 / -17

  7. Why are they different? • Disclosure control applied to a large number of flows • 30 OAs in Clementswood ward, all flows into each OA (about 25,000) • Problem with disappearing flows

  8. Commissioned aggregations • Part of Table SWS301 for all OAs of workplace in London • Areas of residence aggregated to: wards; districts; Inner and Outer London; GORs; counties/former counties in East and SE

  9. Commissioned aggregations • Tables received • Big files • Supertable format • One big mistake in the spec – no area codes • More of a challenge

  10. Nightmare at City Hall • Export csv files from Supertable • Files with 15,200,000 records • Adding area codes • Checking data

  11. Commissioned table C0310 • Finally, we have csv and SASPAC system files for: • OA of workplace in London, ward of residence • Means of travel to work

  12. Comparisons of C0310 with OA to OA SWS data (columns: C0310 / SWS301 / % diff.) • Residents of Clementswood ward who are in work: 3,843 / 3,846 / -0.1 • Works at/from home: 355 / 357 / -0.1 • Travels by: • Underground: 463 / 483 / -4.3 • Train: 605 / 621 / -2.6 • Bus: 424 / 432 / -1.9

  13. Comparisons of C0310 with OA to OA SWS data (cont.) (columns: C0310 / SWS301 / % diff.) • Taxi: 27 / 15 / +44.0 • Car driver: 1,351 / 1,340 / +0.8 • Car passenger: 145 / 135 / +7.0 • Motor cycle: 18 / 27 / -50.0 • Bicycle: 33 / 33 / 0.0 • On foot: 404 / 391 / +3.2 • Other: 18 / 12 / +33.0
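The "% diff." column in the two slides above appears to express the gap between the tables as a percentage of the C0310 figure. That base is an inference from the numbers, not something the slides state. A minimal sketch under that assumption, using a few rows copied from the slides:

```python
# (C0310, SWS301) counts for Clementswood residents, copied from the slides.
rows = {
    "Underground": (463, 483),
    "Train": (605, 621),
    "Taxi": (27, 15),
    "Motor cycle": (18, 27),
    "Bicycle": (33, 33),
}


def pct_diff(c0310, sws301):
    """Difference as a percentage of the C0310 value (assumed base)."""
    return 100.0 * (c0310 - sws301) / c0310


for mode, (c, s) in rows.items():
    print(f"{mode:12s} {c:6d} {s:6d} {pct_diff(c, s):+6.1f}")
```

Small counts such as taxi and motor cycle swing by tens of percent because a change of a few workers is large relative to the base.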

  14. Comparisons of C0310 with SWS data • Origin: ward of residence (Clementswood in London Borough of Redbridge) • C0310: aggregated by ONS, rounded once • SWS: OAs aggregated by user, thousands of rounded numbers • Destination: London (sum of all flows into all OAs in London – quite a lot of small numbers rounded differently)
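The contrast between "rounded once" and "thousands of rounded numbers" can be illustrated with a small simulation. The slides do not describe the adjustment rule, so the function below is a hypothetical stand-in: it assumes counts of 1 or 2 become 0 or 3, with a probability chosen to keep the expected value unchanged. The flows are made up.

```python
import random


def adjust_small_count(n, rng):
    """Hypothetical small-cell adjustment: 1 or 2 becomes 0 or 3.

    Assumption for illustration only: rounding up happens with
    probability n/3, so the expected value is unchanged.
    """
    if n in (1, 2):
        return 3 if rng.random() < n / 3 else 0
    return n


rng = random.Random(2004)
# Made-up OA-level flows into one destination: mostly zeros, ones and twos.
true_flows = [rng.choice([0, 0, 0, 1, 1, 2]) for _ in range(600)]

true_total = sum(true_flows)
# Route A: aggregate first, then adjust once (as in a commissioned table).
adjusted_once = adjust_small_count(true_total, rng)
# Route B: adjust every small flow, then let the user add them up.
adjusted_flows = [adjust_small_count(n, rng) for n in true_flows]
user_total = sum(adjusted_flows)

print("true total:", true_total)
print("adjusted once:", adjusted_once)
print("sum of adjusted flows:", user_total)
```

Under this assumed rule the aggregate adjusted once stays at its true value, while the user's sum of individually adjusted flows is always a multiple of 3 and drifts from the true total by an amount that depends on how the individual adjustments happened to fall.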

  15. Commissioned Table C0310 – Clementswood ward to OAs in London • Number of flows in table: 24,140 • Number of zero flows: 23,502 • Number of non-zero flows: 638 • Flows with value of 3: 424 (66%) • Flows with value of 4: 21 • Flows with value of 5: 10 • Flows with value of 6: 70 • Flows with value of 7: 17 • Flows with value of 8: 12 • etc.

  16. Comparisons of C0310 with SWS301 – Clementswood ward to OAs in London • Number of non-zero flows in C0310: 638 • Number of non-zero flows in SWS301: 595 • But: only 412 flows are common to both tables • 182 out of 595 flows in SWS301 are not in C0310 (31%) • 226 out of 638 flows in C0310 are not in SWS301 (35%)
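A comparison of this kind comes down to set operations on flow keys. The sketch below is not the method used for the presentation; the ward and OA codes and the counts are made up, and `compare_flows` is a hypothetical helper.

```python
# Made-up flows keyed by (ward of residence, OA of workplace).
c0310 = {("W1", "OA01"): 3, ("W1", "OA02"): 6, ("W1", "OA03"): 4, ("W1", "OA05"): 3}
sws301 = {("W1", "OA01"): 3, ("W1", "OA02"): 9, ("W1", "OA04"): 3}


def compare_flows(a, b):
    """Count non-zero flows that are shared, unique to each table, or equal."""
    keys_a = {k for k, v in a.items() if v > 0}
    keys_b = {k for k, v in b.items() if v > 0}
    common = keys_a & keys_b
    return {
        "common": len(common),
        "only_in_a": len(keys_a - keys_b),
        "only_in_b": len(keys_b - keys_a),
        "same_value": sum(1 for k in common if a[k] == b[k]),
    }


result = compare_flows(c0310, sws301)
print(result)
```

The `same_value` count corresponds to asking how many common flows carry the same number in both tables.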

  17. Comparisons of C0310 with SWS301 • 214 flows value 4+ on C0310 • 63 of these are not on SWS301 • 29% not on SWS301 • Values of these 63 NOT all multiples of 3 (some 4,5,7,8,10)

  18. Comparisons of C0310 with SWS301 • 217 flows of 4+ on SWS301 • 66 of these flows are not on C0310 • 30% not on C0310 • Values of these 66 are all multiples of 3

  19. Comparisons of C0310 with SWS301 • Of the 151 common flows (on both C0310 and SWS301): • Only 28 (19%) have the same value in both tables • Some are very different

  20. Comparisons of C0310 with SWS301 • Less than 5% of non-zero flows have the same value in each table • 95% are different numbers or are not there

  21. And if you thought the ward to ward data would be better: ward to ward SWS • Total number of flows in data: 2,123,432 • Flows which appear in only 1 table: 294,053 • Flows which appear in only 2 tables: 351,352 • Flows which appear in only 3 tables: 335,690 • Flows which appear in only 4 tables: 259,260 • Flows which appear in only 5 tables: 222,628 • Flows which appear in all 6 tables: 660,449
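The breakdown in the slide above can be checked against the stated total and turned into shares. The figures are copied from the slide and the variable names are illustrative.

```python
total_flows = 2_123_432
# Number of tables a flow appears in -> number of such flows (from the slide).
flows_by_table_count = {
    1: 294_053,
    2: 351_352,
    3: 335_690,
    4: 259_260,
    5: 222_628,
    6: 660_449,
}

# The six categories should account for every flow in the data.
assert sum(flows_by_table_count.values()) == total_flows

shares = {k: 100.0 * n / total_flows for k, n in flows_by_table_count.items()}
for k, share in shares.items():
    print(f"appears in {k} table(s): {share:5.1f}%")
```

On these figures, a little under a third of the ward to ward flows appear in all six tables.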

  22. Is this data fit for purpose? • No

  23. So what can we do next? • More of this type of analysis • Lobby for separate class of user • Public sector/academic users • 1920 Census Act