Week 10: Contingency Tables & Measures of Association
Administrative Tasks • The HW • Tough stuff, but you did well • Stata is easy, interpretation of output isn’t • Let’s talk about 1.E. & 2.C.
Section 901 vs. 908 Showdown! • Draft a half page memo describing how your group proposes to evaluate the project • The class submitting the best proposal will have their last HW grade increased one level • Evaluation based on utility of the testing procedure & quality of memo presentation • Be specific, thorough, and yet concise • You have 5 minutes…go!
Here’s the project: U.S. Customs and Border Protection has recently equipped half of its facilities with new robotic drug-sniffing dogs, the so-called “RoboDogs”. The RoboDogs were randomly positioned at inspection points around the nation. You have data on the total $ amount of drugs seized last month at a sample of 20 points, 10 with RoboDogs and 10 without. The single RoboDog at each facility costs $1 million/month. How could you determine if it was worth the investment?
Oh crap, what if we’re not working with interval data? • Recall Contingency Tables • A.k.a. cross-tabulations (cross tabs) • Categories of IV are columns • Categories of DV are rows • Categories should increase as you move from the top left corner to the bottom right • Calculate % within each category of the IV (column percentages) • Write total cell frequency in parentheses • Interpret by comparing % across columns • Let’s set up a table:
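As a minimal sketch of the setup rules above, the following Python computes column percentages (the % within each IV category) from raw cell counts. The counts are made up for illustration and are not course data; the function name is likewise hypothetical.

```python
# Hypothetical counts: rows are DV categories, columns are IV categories.
# Both are ordered low -> high, so values increase from top left to bottom right.
counts = [
    [30, 10],  # DV = low
    [20, 40],  # DV = high
]


def column_percentages(table):
    """Return each cell as a percentage of its column (IV category) total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100.0 * cell / col_totals[j] for j, cell in enumerate(row)]
        for row in table
    ]


pcts = column_percentages(counts)
for row_counts, row_pcts in zip(counts, pcts):
    # Percentage first, cell frequency in parentheses, as the slide suggests.
    print("  ".join(f"{p:5.1f}% ({n})" for n, p in zip(row_counts, row_pcts)))
```

Reading across a row then compares the same DV category between IV categories, which is the "compare % across columns" step.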
The inter-ocular test • With a table set up appropriately the diagonals are very informative • They indicate a relationship’s directionality, or lack thereof • Perfect relationships: • All observations fall into the diagonal cells • Downward slope = positive • Upward slope = negative • Counterintuitive, but a byproduct of table layout • With absolute certainty you know the value of the DV, just by knowing the value of the IV
The inter-ocular test (cont.) • Null relationships • % is constant within rows • Knowing the value of the IV does not improve our ability to predict the DV • There is absolutely no relationship between these variables • A continuum exists between these two extremes • The closer a relationship is to being perfect the more confident we can be that the IV is statistically related to the DV
The Chi-square test • Definition: • “a procedure for evaluating the level of statistical significance attained by a bivariate relationship in a cross-tabulation” • Four steps • 1. Calculate expected frequencies • 2. For each cell, square the difference between observed and expected and divide by expected • 3. Sum all of these to obtain a test chi-square statistic • 4. Compare test statistic with chi-square table (pg. 529) • Degrees of freedom is (# of columns - 1) x (# of rows - 1) • Let’s walk through an example:
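The first three steps and the degrees-of-freedom rule can be sketched in plain Python as follows. The 2x2 table is hypothetical and the function name is an assumption for illustration; step 4 (looking up the critical value) is left to the chi-square table.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Step 1: expected frequency if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            # Steps 2 and 3: squared difference over expected, summed over cells.
            stat += (obs - expected) ** 2 / expected

    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return stat, df


# Hypothetical table: rows = DV categories, columns = IV categories.
table = [[30, 10], [20, 40]]
stat, df = chi_square(table)
print(f"chi-square = {stat:.2f}, df = {df}")
```

For this made-up table the statistic is about 16.67 with 1 degree of freedom, which exceeds the conventional .05 critical value of 3.84 for df = 1.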
Limitations of the chi-square test • Nature and direction of relationship??? • Irrelevant! • Inflated by sample size • Does not assess magnitude • Probability of existence of relationship • Need other measures to account for relationship strength
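To illustrate the sample-size point, here is a small sketch with made-up counts: the second table has exactly the same column percentages as the first but ten times as many observations. The helper name is hypothetical.

```python
def chi_square_stat(observed):
    """Chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, obs in enumerate(row)
    )


# Hypothetical tables: same percentages (60/40 vs. 40/60), different N.
small = [[30, 20], [20, 30]]                            # N = 100
large = [[cell * 10 for cell in row] for row in small]  # N = 1000

stat_small = chi_square_stat(small)
stat_large = chi_square_stat(large)
print(stat_small, stat_large)
```

The relationship is equally strong in both tables, yet the statistic grows in proportion to N, which is why a separate measure of association is needed for magnitude.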
What if we want to know the magnitude of association? • 2 basic criteria: • Proportional Reduction in Error (PRE) • Prediction-based metric that varies between 0 & 1 • 0 = the independent variable does not increase our ability to predict the dependent variable at all • 1 = the independent variable allows us to predict all observations of the dependent variable • Asymmetric measures of association • Model the independent variable as the cause and the dependent variable as the effect
Lambda Lambda Lambda • Recall that our best guess for any variable is its central tendency • If we made this guess for every observation we would err every time a value was not the central tendency • If we know the IV, how much of this error can we eliminate?
Lambda Cont. (a.k.a. Revenge of the Nerds II) • Lambda measures: • 2 categorical vars. • % of prediction errors eliminated by accounting for IV • Rough guideline for Lambda values • < .1 = Weak • .1 to .2 = Moderate • .2 to .3 = Strong • > .3 = Very Strong
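A minimal sketch of the lambda calculation, assuming rows are DV categories, columns are IV categories, and the "central tendency" guess for a categorical variable is its mode. The table is hypothetical and the function name is chosen for illustration.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the DV (rows) from the IV (columns)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # Errors if we always guess the modal DV category, ignoring the IV.
    errors_without_iv = n - max(row_totals)
    # Errors if we guess the modal DV category within each IV category.
    errors_with_iv = sum(sum(col) - max(col) for col in zip(*table))
    # Assumes errors_without_iv > 0, i.e. the DV is not a constant.
    return (errors_without_iv - errors_with_iv) / errors_without_iv


# Hypothetical counts: rows = DV categories, columns = IV categories.
table = [[30, 10], [20, 40]]
lam = goodman_kruskal_lambda(table)
print(f"lambda = {lam:.2f}")
```

Here guessing without the IV produces 40 errors and guessing with it produces 30, so lambda is 10/40 = .25, which the rough guideline above would call strong.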
Somer(s’) time • Ordinal measure of association • Terminology: • Concordant pairs: • Positive relationship • Discordant pairs: • Negative relationship • Think about diagonals • Somers’ d measures (Concordant – Discordant) divided by all pairs not tied on the IV • Let’s do an example:
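One way to sketch the pair counting in Python, assuming rows are DV categories and columns are IV categories, both ordered low to high. This uses the standard asymmetric form of Somers' d, whose denominator is concordant plus discordant pairs plus pairs tied only on the DV (that is, every pair not tied on the IV). The counts are made up for illustration.

```python
def somers_d(table):
    """Somers' d for predicting the DV (rows) from the IV (columns)."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = tied_dv_only = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i, n_rows):
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if k > i and l > j:
                        concordant += pairs    # higher on both variables
                    elif k > i and l < j:
                        discordant += pairs    # higher on DV, lower on IV
                    elif k == i and l > j:
                        tied_dv_only += pairs  # same DV, different IV
    return (concordant - discordant) / (concordant + discordant + tied_dv_only)


# Hypothetical counts: rows = DV (low, high), columns = IV (low, high).
table = [[30, 10], [20, 40]]
d = somers_d(table)
print(f"Somers' d = {d:.2f}")
```

With these counts there are 1,200 concordant pairs (the downward diagonal), 200 discordant pairs (the upward diagonal), and 1,100 pairs tied only on the DV, so d = 1,000 / 2,500 = .40.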