Cold Hits: The Bayesian Perspective S. L. Zabell, Northwestern University Dayton Conference, August 13, 2005
Motivating Example • Database size: 10,000 • Match probability: 1 in 100,000 • Suspect population: 1,000,000
The NRC2 Approach • Np = (database size N) × (match probability p) = 10,000 × (1/100,000) = 1/10 • Suggests modest evidence
But consider … • One expects about 10 people in the suspect population to have the type • 1,000,000 × (1/100,000) = 10 • The database search has identified one of them • So the probability this is the perp is about 1 in 10 (!)
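A minimal Python sketch of this arithmetic, using the figures on these slides (the variable names are my own):

```python
# Numbers from the motivating example above.
database_size = 10_000
match_prob = 1 / 100_000            # random match probability (RMP)
suspect_population = 1_000_000

# NRC2 statistic: expected number of matches inside the database.
np_statistic = database_size * match_prob
print(f"NRC2 Np = {np_statistic}")                               # 0.1

# Expected number of people in the whole suspect population with the type.
expected_with_type = suspect_population * match_prob
print(f"expected people with type = {expected_with_type:.0f}")   # 10

# The search turned up one of roughly ten such people, so a naive
# estimate that this person is the perp is about 1 in 10.
print(f"naive P(perp) = 1 in {expected_with_type:.0f}")
```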
This seems paradoxical … • Is the database a random sample? • The analysis assumes it is. • Intuition: the database is more likely than a random sample to contain the perp. • Is that assumption legally appropriate?
Bayesian inference • Due to the Reverend Thomas Bayes • Models changes in belief • Lingo: prior and posterior odds
Brief Historical Interlude • Laplace (1749 - 1827) • Championed this viewpoint • Many practical applications • Dominant view in 19th century • Attacked in 20th century by • R. A. Fisher • Jerzy Neyman
Gradual Rehabilitation • Frank Ramsey (1926) • Bruno de Finetti (1937) • I. J. Good (“Probability and the Weighing of Evidence”, 1950) • L. J. Savage (1957) … and many others
The Enigma • Modified commercial device • Offline encryption • Short tactical messages • Army, Navy, Luftwaffe, Abwehr versions
“TUNNY” • Lorenz SZ (“Schlüsselzusatz”) 40/42 • Online teleprinter encryption device • Long messages (several thousand characters) • Used by Hitler and his generals • Came into general use in 1942
Links • System in extensive use • By the time of the Normandy invasion: • 26 links • 2 central exchanges • Example: JELLYFISH: Berlin - Paris (Oberbefehlshaber West)
“General Report on Tunny” • In-house report written in 1945 • > 500 pages long • I. J. Good, D. Michie, G. Timms • Declassified in 2000
Some basic terms • Probability: p • Odds: p/(1 - p) • Initial (prior) odds • Final (posterior) odds
Bayes’s Theorem • posterior odds = likelihood ratio × prior odds
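As a concrete illustration of these definitions and of the theorem in odds form, a small Python sketch (the function names are mine, not from the talk):

```python
def odds(p: float) -> float:
    """Convert a probability p to odds p / (1 - p)."""
    return p / (1 - p)

def probability(o: float) -> float:
    """Convert odds back to a probability o / (1 + o)."""
    return o / (1 + o)

def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayes's theorem in odds form: posterior = LR x prior."""
    return likelihood_ratio * prior_odds
```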
The theorem is not controversial • It is a simple consequence of the axioms • The status of the prior odds is at issue: • Must they be objective frequencies, OR • Can they be subjective degrees of belief?
Example 1: The blood matches • Suspect and evidence sample match • P[E | H0] = p (H0: suspect not the source; p is the RMP) • P[E | H1] = 1 (H1: suspect is the source) • LR = 1/p
Hypothetical • Prior odds of guilt: 2 to 1 • RMP p: 1 in 400,000 • LR 1/p: 400,000 • Posterior odds: LR × prior odds = 800,000 to 1
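Plugging the hypothetical's numbers into the odds form of Bayes's theorem (a sketch; variable names are mine):

```python
prior = 2.0                 # prior odds of guilt: 2 to 1
rmp = 1 / 400_000           # random match probability p
lr = 1 / rmp                # likelihood ratio: 400,000

posterior = lr * prior      # 800,000 to 1
print(f"posterior odds = {posterior:,.0f} to 1")
print(f"P(guilt) = {posterior / (1 + posterior):.7f}")  # ~0.9999988
```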
Example 2: Paternity • The LR is the “paternity index” (PI) • Posterior odds of paternity = PI × initial (prior) odds of paternity
PI ≠ “probability of paternity” • The two correspond only provided • the prior odds are “50-50” (1 : 1) • This may be appropriate in civil cases • Mother and father on an equal footing • NOT appropriate in criminal cases • Contrary to the “presumption of innocence”
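A hypothetical numerical sketch of the point; the PI value and the criminal-case prior below are made-up illustrations, not figures from the talk:

```python
pi = 1_000.0                          # illustrative paternity index (LR)

# Civil case: "50-50" prior, i.e. prior odds of 1 : 1.
civil_post = pi * 1.0
print(f"civil:    P(paternity) = {civil_post / (1 + civil_post):.4f}")        # ~0.999

# Criminal case: the presumption of innocence implies far lower prior
# odds (1 in 10,000 here is purely illustrative).
criminal_prior = 1 / 10_000
criminal_post = pi * criminal_prior   # 0.1
print(f"criminal: P(paternity) = {criminal_post / (1 + criminal_post):.4f}")  # ~0.091
```

The same PI yields a near-certain posterior under the 1 : 1 prior but a weak one under a criminal-case prior, which is why reporting PI as "the probability of paternity" smuggles in the 50-50 assumption.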
Example 3: Database search • Balding and Donnelly: LR unchanged • This makes sense: • P[E | H0] the same (match probability p) • P[E | H1] the same (1) • So their ratio is still the same (1/p).
Isn’t this paradoxical? • Common intuition: difference between • “Probable cause” scenario • “Database trawl” scenario • Paradox resolved: • LR the same • The priors are different
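A sketch contrasting the two scenarios; the LR is fixed by the match probability, while the prior odds below are illustrative values of my own:

```python
rmp = 1 / 100_000
lr = 1 / rmp                  # likelihood ratio: identical in both scenarios

scenarios = {
    "probable cause": 1.0,            # other evidence points at the suspect
    "database trawl": 1 / 1_000_000,  # one of a million suspects, a priori
}

for label, prior in scenarios.items():
    post = lr * prior
    print(f"{label}: posterior odds = {post:,.1f}, "
          f"P = {post / (1 + post):.3f}")
```

The same likelihood ratio produces very different posteriors; the difference lies entirely in the priors.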
Problems in practical use • “Suspect population” ill-defined • Assigning prior probabilities (not uniform in practice) • Communicating this to a jury • But the qualitative insight is key
Didn’t Balding and Donnelly say … • the evidence is stronger in this case? • Yes: some individuals are ruled out • the people in the databank who don’t match • Realistically this is rarely important • Exception: a jailhouse homicide, where the databank covers much of the suspect population
Relatives and databank searches • Suspect population: 1,000,000 • No matches in a databank of 100,000 • Three close calls (near matches, suggesting relatives) • LR for siblings: 25,000
This means • Odds for each of the three relatives increase: • from 1 in 1,000,000 to 1 in 40 • Odds for everyone else decrease • proportionately
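Running the slide's numbers through the odds form (a sketch; the roughly uniform prior over the suspect population is the assumption being made):

```python
prior = 1 / 1_000_000      # roughly uniform prior over the suspect population
lr_sibling = 25_000        # LR for a sibling, from the slide

post = lr_sibling * prior  # 0.025, i.e. odds of 1 in 40
print(f"posterior odds = 1 in {1 / post:.0f}")   # 1 in 40
```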