An Introduction to Multiple Systems Estimation for Estimating a Count of Adverse Events. Jana Asher Carnegie Mellon University October 16, 2002. Outline. Background Overview of capture-recapture Capture-recapture model assumptions Multiple systems estimation log-linear models
An Introduction to Multiple Systems Estimation for Estimating a Count of Adverse Events Jana Asher Carnegie Mellon University October 16, 2002
Outline • Background • Overview of capture-recapture • Capture-recapture model assumptions • Multiple systems estimation • log-linear models • Rasch models • Example • Ethnic Albanian deaths in Kosovo, March – June 1999
Background • Used for estimating a population count. • The size of a wildlife population. • The number of WWW pages. • The number of people in the USA. • The number of human rights violations (civilian deaths) in Guatemala and Kosovo. • Capture-recapture = dual systems estimation. • Multiple capture-recapture = multiple systems estimation.
Overview of Capture-Recapture [Figure: two captures (lists), labeled Capture 1 and Capture 2, with their shared members labeled Overlap]
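The role of the overlap can be illustrated with the classical two-list (Lincoln-Petersen) estimator, which scales the product of the two list sizes by the number of individuals found on both. The slide does not name this estimator, so the function name and the counts below are assumptions chosen only for illustration; this is a minimal sketch, not the presenter's code.

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Dual-systems estimate of total population size.

    n1, n2  -- number of individuals on list 1 and on list 2
    overlap -- number of individuals found on both lists
    """
    if overlap <= 0:
        raise ValueError("the lists must share at least one individual")
    if overlap > min(n1, n2):
        raise ValueError("overlap cannot exceed either list size")
    # If list 2 is a random sample, overlap / n2 estimates the share of the
    # population that list 1 captured, so N is roughly n1 / (overlap / n2).
    return n1 * n2 / overlap


# Hypothetical counts, chosen only for illustration.
estimate = lincoln_petersen(n1=200, n2=150, overlap=30)
print(estimate)  # 1000.0
```

A small overlap relative to the list sizes implies a large unobserved population; a large overlap implies the lists already cover most of it.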
Capture-Recapture Assumptions • Independence of lists • Homogeneity of capture probabilities • Error-free matching across lists • No in- or out-migration • No duplicates within a list • Lists are random samples
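Why the independence assumption matters can be seen in a small simulation. The sketch below is an illustration under assumed capture probabilities (all numbers are hypothetical): when being on the first list makes an individual more likely to appear on the second, the overlap is inflated and the two-list estimate falls well below the true population size.

```python
import random


def simulate_two_lists(n_total, p1, p2_if_on_list1, p2_if_not_on_list1, seed=0):
    """Simulate two lists over a population and return (n1, n2, overlap).

    p1 is the chance of appearing on list 1. The chance of appearing on
    list 2 may depend on list-1 membership, which models list dependence.
    """
    rng = random.Random(seed)
    n1 = n2 = overlap = 0
    for _ in range(n_total):
        on1 = rng.random() < p1
        p2 = p2_if_on_list1 if on1 else p2_if_not_on_list1
        on2 = rng.random() < p2
        n1 += on1
        n2 += on2
        overlap += on1 and on2
    return n1, n2, overlap


N = 10_000  # true population size (hypothetical)

# Independent lists: list-2 capture probability is the same for everyone.
n1, n2, m = simulate_two_lists(N, 0.3, 0.4, 0.4)
independent_estimate = n1 * n2 / m

# Positively dependent lists: people on list 1 are more likely to be on list 2.
n1, n2, m = simulate_two_lists(N, 0.3, 0.6, 0.2)
dependent_estimate = n1 * n2 / m

print(round(independent_estimate), round(dependent_estimate))
```

With these settings the independent case lands near the true size, while the dependent case is biased downward. Negative dependence would bias the estimate in the opposite direction.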
Multiple Systems Estimation for Three Lists • Three lists allow for modeling of dependency and/or heterogeneity. • Model for dependency: [equation not preserved in this transcript] • Full quasi-symmetry (Rasch) model for heterogeneity: [equation not preserved in this transcript] • The Rasch model enables projection to the missing cell via moment constraints (inequality restrictions).
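The slide's own equations are not available here, so the following is a standard textbook form offered as an assumption rather than a reconstruction. For three lists, a log-linear model with all pairwise dependencies and no three-way interaction can be written as log m_ijk = u + u_1(i) + u_2(j) + u_3(k) + u_12(ij) + u_13(ik) + u_23(jk), where m_ijk is the expected count of individuals with capture pattern (i, j, k). Under that particular model the unobserved cell (on none of the lists) has a closed-form estimate, sketched below with hypothetical counts and a hypothetical function name.

```python
def missing_cell_no_three_way(counts: dict) -> float:
    """Estimate the unobserved cell for three lists.

    counts maps capture histories such as "110" (on lists 1 and 2, not on
    list 3) to observed counts. Assumes the log-linear model with all
    pairwise interactions and no three-way interaction, for which the
    missing-cell estimate has a closed form.
    """
    numerator = counts["111"] * counts["100"] * counts["010"] * counts["001"]
    denominator = counts["110"] * counts["101"] * counts["011"]
    if denominator == 0:
        raise ValueError("closed form is undefined when a two-list cell is empty")
    return numerator / denominator


# Hypothetical observed counts for the seven observable capture histories.
observed = {
    "100": 60, "010": 50, "001": 40,
    "110": 20, "101": 15, "011": 10,
    "111": 5,
}

m000 = missing_cell_no_three_way(observed)
total = sum(observed.values()) + m000
print(m000, total)  # 200.0 400.0
```

Simpler models (for example, dropping one or more pairwise terms) generally require iterative fitting, and model choice changes the estimate, so this closed form is only one point in the family of models the slide describes.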
Multiple Systems Estimation for More than Three Lists • Same modeling techniques, more parameters. • Having more high-quality lists available means fewer assumptions are required.
Example: Kosovo • Analysis required for the trial of former Yugoslav President Slobodan Milosevic for war crimes allegedly committed in Kosovo. • Question of interest: Did a systematic campaign by Yugoslav forces cause Kosovar Albanian deaths and drive Kosovar Albanians from their homes?
Example: Kosovo • Migration data from two sources; analyzed using standard demographic techniques. • Ethnic Albanian death data from four sources; estimates of number of deaths derived via multiple systems estimation.
Kosovo: Data Sources • The American Bar Association Central and East European Law Initiative: 1,674 interviews; 5,089 incidents. • Exhumations by international teams on behalf of the International Criminal Tribunal for the Former Yugoslavia: 1,767 exhumations. • Human Rights Watch: 337 interviews; 1,717 incidents. • The Organization for Security and Cooperation in Europe: 1,837 interviews; one or more incidents per interview.
Kosovo: Data Matching • Duplicates within each list removed. • Six matches performed, one for each pair of lists. • Human coders used match-facilitation software. • Each list pair matched 2-4 times by different coders. • Number of individual deaths (killings where the victim can be named): 4,400.
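The count of six pairwise matches follows from choosing 2 of the 4 lists. Once matching is done, each named victim has a capture history recording which lists include them, and tallying those histories gives the cross-classification table used for estimation. The sketch below uses short hypothetical labels for the four sources and invented victim records; it shows only the bookkeeping, not the matching procedure the coders followed.

```python
from collections import Counter
from itertools import combinations

# Hypothetical short labels for the four data sources.
LISTS = ["ABA/CEELI", "EXH", "HRW", "OSCE"]

# One matching pass is needed for each unordered pair of lists.
pairs = list(combinations(LISTS, 2))
print(len(pairs))  # 6

# Invented matched records: victim id -> set of lists that mention the victim.
matched = {
    "v1": {"ABA/CEELI", "HRW"},
    "v2": {"EXH"},
    "v3": {"ABA/CEELI", "EXH", "OSCE"},
    "v4": {"HRW"},
    "v5": {"ABA/CEELI", "HRW"},
}


def capture_history(sources):
    """Encode list membership as a string of 0/1 flags in LISTS order."""
    return "".join("1" if name in sources else "0" for name in LISTS)


# Tally victims by capture history to form the cross-classification table.
table = Counter(capture_history(s) for s in matched.values())
print(table["1010"])  # 2

# The history "0000" (on no list) can never be observed; estimating its
# count is the purpose of multiple systems estimation.
print("0000" in table)  # False
```

With four lists there are 2^4 - 1 = 15 observable capture histories; the sixteenth cell is the one the model must project.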
Kosovo: Death Count Estimates • Estimate of overall number of deaths created from a log-linear model of the four-way cross-classification table: 10,356 (9,002, 12,122). • Two-day time period estimates of number of deaths created from log-linear models of three-way cross-classification tables; four such cross-classification tables per time period.
Kosovo: Analysis • Regression analysis performed using KLA and NATO activity data as independent variables and death/migration estimates as dependent variables. • The analysis supports the conclusion that a systematic campaign of Yugoslav forces was responsible for ethnic Albanian migrations and deaths in Kosovo between March and June of 1999.
Overall Conclusions • Where several high-quality pre-existing incomplete lists of adverse events exist, multiple systems estimation is a viable technique for estimating a total count of adverse events. • Relatively sophisticated technical expertise is required to use this estimation technique well.
Further Reading • Ball, P., Betts, W., Scheuren, F., Dudukovich, J., and Asher, J. (2002). Killings and Refugee Flow in Kosovo March - June 1999: A Report to the International Criminal Tribunal for the Former Yugoslavia. American Association for the Advancement of Science, Washington, DC. • Contains a good reference list. • Available on my website: http://www.stat.cmu.edu/~asher/PAPERS2002/polkilkos_020109.pdf