Local and Global Scores in Selective Editing

Dan Hedlin, Statistics Sweden

  1. Local and Global Scores in Selective Editing • Dan Hedlin, Statistics Sweden

  2. Local score • Common local (item) score for item j in record k: $\delta_{kj} = w_k \, |z_{kj} - \hat{z}_{kj}| / \sigma_j$ • $w_k$: design weight • $\hat{z}_{kj}$: predicted value • $z_{kj}$: reported value • $\sigma_j$: standardisation measure
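
A minimal sketch of the local score, assuming it takes the familiar form of design weight times the absolute deviation of the reported value from the predicted value, divided by the standardisation measure. The function name `local_score` and the numbers are made up for illustration and do not come from the presentation.

```python
def local_score(design_weight, reported, predicted, std_measure):
    """Hypothetical helper: weighted, standardised absolute deviation.

    Assumed form: w_k * |z_kj - zhat_kj| / sigma_j.
    """
    if std_measure <= 0:
        raise ValueError("standardisation measure must be positive")
    return design_weight * abs(reported - predicted) / std_measure


# Illustrative values only: weight 10, reported 1200, predicted 1000, sigma 50.
score = local_score(10.0, 1200.0, 1000.0, 50.0)
print(score)  # 40.0
```

A reported value that equals its prediction gets a score of zero, so only suspicious deviations contribute to the unit score.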

  3. Global score • Which function of the local scores should form the global (unit) score? • Assume the same number of items in all records • p items, j = 1, 2, …, p • Let a local score be denoted by $\delta_{kj}$ • … and a global score by $\delta_k$

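  4. Common global score functions • In the editing literature, with $\delta_{kj}$ the local score for item j in record k: • Sum function: $\delta_k = \sum_{j=1}^{p} \delta_{kj}$ • Euclidean score: $\delta_k = \sqrt{\sum_{j=1}^{p} \delta_{kj}^2}$ • Max function: $\delta_k = \max_j \delta_{kj}$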

  5. Farwell (2004): “Not only does the Euclidean score perform well with a large number of key items, it appears to perform at least as well as the maximum score for small numbers of items.”

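  6. Unified by… • Minkowski’s distance of the local scores $\delta_{kj}$: $\delta_k(\lambda) = \left( \sum_{j=1}^{p} \delta_{kj}^{\lambda} \right)^{1/\lambda}$ • Sum function if $\lambda = 1$ • Euclidean if $\lambda = 2$ • Maximum function if $\lambda \to \infty$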
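
The unification can be sketched in a few lines, assuming non-negative local scores and the usual Minkowski form (the sum of the scores raised to a power, then the root of the same order). The name `global_score` and the parameter name `lam` are illustrative choices, and the three local scores are made up.

```python
import math


def global_score(local_scores, lam):
    """Minkowski-type global score of one record's local scores.

    lam = 1 gives the sum, lam = 2 the Euclidean score,
    and lam = math.inf the maximum.
    """
    if lam < 1:
        raise ValueError("lam must be at least 1")
    if math.isinf(lam):
        return max(local_scores)
    return sum(s ** lam for s in local_scores) ** (1.0 / lam)


scores = [3.0, 4.0, 0.0]  # made-up local scores for one record
print(global_score(scores, 1))         # 7.0
print(global_score(scores, 2))         # 5.0
print(global_score(scores, math.inf))  # 4.0
```

Treating the maximum as a special case avoids raising scores to very large powers, which would overflow for large local scores.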

  7. NB: the extreme choices are the sum and the max • Infinitely many choices in between • A Minkowski parameter of $\lambda = 20$ will suffice to approximate the maximum unless local scores in the same record are of similar size
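
A small numerical check of this claim, using made-up local scores: with a Minkowski parameter of 20 the global score is essentially the maximum when one local score dominates, but overshoots it noticeably when all local scores in the record are equal. The helper name `minkowski` is an illustrative choice.

```python
def minkowski(scores, lam):
    """Minkowski-type global score for non-negative local scores."""
    return sum(s ** lam for s in scores) ** (1.0 / lam)


dominated = [10.0, 2.0, 1.0]  # one local score dominates the record
similar = [10.0, 10.0, 10.0]  # all local scores are the same size

for scores in (dominated, similar):
    ratio = minkowski(scores, 20) / max(scores)
    print(scores, round(ratio, 3))
# [10.0, 2.0, 1.0] 1.0
# [10.0, 10.0, 10.0] 1.056
```

With p equal local scores the ratio is p to the power 1/20, so the overshoot grows slowly with the number of items.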

  8. Global score as a distance • The axioms of a distance are sensible properties such as being non-negative • Also, the triangle inequality • It can be shown that a global score function that does not satisfy the triangle inequality yields inconsistencies

  9. Hence a global score function should be a distance • Minkowski’s distance appears to be adequate for practical purposes • Minkowski’s distance does not satisfy the triangle inequality if its parameter $\lambda < 1$ • Hence it is not a distance for $\lambda < 1$
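
A minimal counterexample for the failure below 1, stated in terms of the norm: for a translation-invariant distance the triangle inequality is equivalent to the norm of a sum being at most the sum of the norms. The two vectors and the helper name `minkowski_norm` are illustrative choices.

```python
def minkowski_norm(vec, lam):
    """Minkowski-type 'norm'; a genuine norm only when lam >= 1."""
    return sum(abs(v) ** lam for v in vec) ** (1.0 / lam)


x, y = (1.0, 0.0), (0.0, 1.0)
total = tuple(a + b for a, b in zip(x, y))  # (1.0, 1.0)

for lam in (0.5, 1, 2):
    lhs = minkowski_norm(total, lam)
    rhs = minkowski_norm(x, lam) + minkowski_norm(y, lam)
    print(lam, lhs, rhs, lhs <= rhs)
# With lam = 0.5 the left-hand side is 4.0 and the right-hand side 2.0,
# so the inequality fails; it holds for lam = 1 and lam = 2.
```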

  10. Parametrised by $\lambda$ • Advantages: a unified global score simplifies presentation and software implementation • Also gives structure: $\lambda$ orders the feasible choices… from smallest: $\lambda = 1$ … to largest: $\lambda \to \infty$

  11. Turning to geometry…

  12. Sum function = city block distance • p = 3, i.e. three items

  13. Euclidean distance

  14. Supremum (maximum, Chebyshev’s) distance

  15. Imagine questionnaires with three items • [Figure: record k and its Euclidean distance]

  16. The Euclidean function, two items • [Figure: threshold curve] • A sphere in 3D

  17. The max function • A cube in 3D • Same threshold

  18. The sum function • An octahedron in 3D

  19. With the same threshold, the sum function will always flag at least as many records for editing as any other choice of the Minkowski parameter
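
This ordering can be illustrated with a toy data set. The records, the threshold and the names `global_score` and `flagged` are all made up for the sketch. Because the Minkowski-type score never increases as its parameter grows, the set of records above a fixed threshold can only shrink when moving from the sum towards the maximum.

```python
import math


def global_score(scores, lam):
    """Minkowski-type global score; lam = math.inf gives the maximum."""
    if math.isinf(lam):
        return max(scores)
    return sum(s ** lam for s in scores) ** (1.0 / lam)


# Made-up local scores for four records with three items each.
records = {
    "a": [3.0, 3.0, 3.0],
    "b": [6.0, 1.0, 0.5],
    "c": [2.0, 2.0, 1.0],
    "d": [8.5, 0.0, 0.0],
}
threshold = 6.05  # illustrative threshold, identical for every score function


def flagged(lam):
    """Names of the records whose global score exceeds the threshold."""
    return {name for name, s in records.items() if global_score(s, lam) > threshold}


for lam in (1, 2, math.inf):
    print(lam, sorted(flagged(lam)))
# 1 ['a', 'b', 'd']
# 2 ['b', 'd']
# inf ['d']
```

Record "a" has many moderate local scores and is caught only by the sum, while record "d" has a single large score and is caught by every choice.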

  20. Three editing situations • Situation 1: large errors remain in the data, such as unit errors • Situation 2: no large errors, but there may be bias due to many small errors in the same direction • Situation 3: little bias, but there may be many errors

  21. Can show that if… • Situation 3 • Variance of error is • Local score is • Then the Euclidean global score will minimise the sum of the variances of the remaining error in estimates of the total

  22. Summary • Minkowski’s distance unifies many reasonable global score functions • Scaled by one parameter • The sum and the maximum functions are the two extreme choices • The Euclidean unit score function is a good choice under certain conditions
