Statistical Inference

Statistical Inference • Estimation • Confidence Intervals • Estimate the proportion of the electorate who support Candidate X • Hypothesis Tests • Make a decision • Will Candidate X win the election?

Hypothesis Test: An example • Castaneda v. Partida, 430 U.S. 482 • Background • Rodrigo Partida was arrested and convicted of a felony in Hidalgo County, Texas in 1972 • Partida, a Mexican-American, argued discrimination in the selection of the grand jury that indicted him • Argued before the Supreme Court on Nov. 9, 1976 • Filed habeas corpus petition alleging denial of due process and equal protection under the 14th Amendment because of gross under-representation of Mexican-Americans on the county grand juries

Castaneda v. Partida • Statistical evidence presented to court • Population • Hidalgo County: 181,535 • Mexican-American: 143,611 (79.1%) • Grand Juries (1962-1972) • 870 persons summoned • Mexican-American: 339 (39.0%)

Castaneda v. Partida • The population is all potential grand jurors • The sample is the 1962-1972 jury data • Let p be the probability that a random juror is Mexican-American • If no discrimination, then p = .791 (State’s position) • Otherwise, p < .791 (Defendant’s position) • Set up null and alternative hypotheses • H0: p = .791 vs. Ha: p < .791

Castaneda v. Partida • The hypotheses involve the population parameter p • The evidence is based on the sample proportion • The central question: If the state’s case is true and p = .791, what’s the chance of observing a sample proportion as small as .390? • Central logic of hypothesis tests: Assume the null hypothesis is true and ask what’s the probability of observing a result at least as extreme as the one we actually did observe. • That probability is called the P-value
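The central question can also be explored by simulation. The sketch below is only an illustration, not part of the evidence in the case: it assumes each of the 870 summoned persons is drawn independently with probability p = .791, and the names `simulate_proportion` and `num_sims` are hypothetical choices made for this example.

```python
import random

n = 870                # persons summoned, 1962-1972
p0 = 0.791             # proportion under the null hypothesis
observed_prop = 0.390  # observed sample proportion
num_sims = 2000        # number of simulated samples (arbitrary choice)

def simulate_proportion(n, p, rng):
    """Draw n independent jurors and return the Mexican-American proportion."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is repeatable
props = [simulate_proportion(n, p0, rng) for _ in range(num_sims)]

# Count simulated samples at least as extreme as the observed one.
as_small = sum(prop <= observed_prop for prop in props)

print("smallest simulated proportion:", round(min(props), 3))
print("simulated samples as small as .390:", as_small, "out of", num_sims)
```

A simulation of this size can only show that such an outcome did not turn up in a few thousand tries; it cannot measure a probability as small as the one in this case, which is why the normal-curve calculation is used instead.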

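Castaneda v. Partida • Computing the P-value (you won’t have to do it!) • $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{.390 - .791}{\sqrt{(.791)(.209)/870}} = \frac{-.401}{.0138} = -29.058$ • The observed sample proportion is over 29 standard deviations below the mean! • The P-value is the chance of a z-score this small or smaller: that’s a number with a decimal point and then 186 zeros before it gets to 6. • It’s about the same chance as winning the Powerball Lottery 23 times in a row!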
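As a rough check of the arithmetic above, the following Python sketch recomputes the z-score and the one-sided P-value from the jury counts. It assumes the usual normal approximation for a sample proportion; the variable names are chosen for this example only. Because it uses unrounded inputs (339/870 rather than .390), the z-score comes out slightly different from the rounded slide value, at roughly -29.1.

```python
import math

n = 870          # persons summoned
observed = 339   # Mexican-Americans among those summoned
p0 = 0.791       # proportion under the null hypothesis

p_hat = observed / n                     # sample proportion
std_error = math.sqrt(p0 * (1 - p0) / n) # standard error of p_hat under H0
z = (p_hat - p0) / std_error             # z-score of the observed proportion

# One-sided (lower-tail) P-value, P(Z <= z), under the normal approximation.
# erfc is used because it stays accurate far out in the tail.
p_value = 0.5 * math.erfc(-z / math.sqrt(2))

print("sample proportion:", round(p_hat, 3))
print("standard error:", round(std_error, 4))
print("z-score:", round(z, 2))
print("P-value:", p_value)
```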

Castaneda v. Partida • Conclusion: reject the null hypothesis • If there were no discrimination, and if the true probability of picking a Mexican-American juror were in fact 79.1%, then the chance of getting jurors who are only 39% Mexican-American would be astronomically small. • Conclusion: We reject chance as the explanation

Castaneda v. Partida • Legal Issue: a finding of statistical significance does not constitute proof of discrimination • But it’s used to establish a prima facie case • The burden of proof then shifts to the State to give another reasonable explanation for the disparities on the juries • On March 23, 1977, the Supreme Court ruled 5-4 in favor of Partida, holding that he had established his prima facie case.

Castaneda v. Partida: Legal Issue • Justice Blackmun: “As a general rule for such large samples, if the difference between the expected value and the observed number [of Mexican-Americans] is more than 2 or 3 standard deviations, then the hypothesis that the jury selection was random would be suspect by a social scientist.” • Federal courts have fashioned the “two or three standard deviation” norm into a rule of law, particularly in discrimination cases
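The “two or three standard deviation” comparison is stated in terms of counts rather than proportions. A minimal sketch of that comparison for the jury data follows, assuming a binomial model for the number of Mexican-Americans among those summoned. The function name `standard_deviations_from_expected` and the threshold of 3 are illustrative choices for this example, not anything taken from the opinion.

```python
import math

def standard_deviations_from_expected(n, p0, observed):
    """Distance between expected and observed counts, in binomial SD units."""
    expected = n * p0                      # expected count under random selection
    sd = math.sqrt(n * p0 * (1 - p0))      # binomial standard deviation
    return (expected - observed) / sd

n = 870          # persons summoned
p0 = 0.791       # Mexican-American share of the county population
observed = 339   # Mexican-Americans actually summoned

num_sds = standard_deviations_from_expected(n, p0, observed)
threshold = 3    # upper end of the "two or three" norm (illustrative)

print("expected count:", round(n * p0))
print("standard deviations below expected:", round(num_sds, 1))
print("exceeds threshold:", num_sds > threshold)
```

On the count scale the answer agrees with the proportion-scale z-score, since dividing both the difference and the standard deviation by n leaves the ratio unchanged.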

Summary • A confidence interval estimates an unknown parameter. • A hypothesis test assesses the evidence for some claim about the value of an unknown parameter. • Central Question: “Could the effect we see in the data just be an accident due to chance, or is it good evidence that the effect is really there in the population?” • Answer by computing the P-value: the probability, assuming the null hypothesis is true, of observing an effect at least as large as the one actually observed. • A small P-value means that the outcome would be unlikely to occur by chance alone. This is evidence against the null hypothesis. • A moderate or large P-value means the outcome is consistent with chance as an explanation. There is insufficient evidence to reject the null hypothesis.
