Economics 105: Statistics Any questions? GH 17 and 18 due Wednesday
Interpreting Electoral Polls*
• WASHINGTON (Reuters - 09/09/00) - A new Newsweek poll on Saturday showed Vice President Al Gore maintaining a strong lead over Texas Gov. George Bush in the presidential race, but a CNN/USA Today survey found the candidates virtually tied.
• According to the Newsweek poll, Democratic nominee Gore leads Republican nominee Bush 47 percent to 39 percent among registered voters, with Green Party candidate Ralph Nader at 3 percent and Reform Party candidate Pat Buchanan at 1 percent.
• Among likely voters, Gore led Bush 49 percent to 41 percent, the same margin as among registered voters.
• The poll was conducted by Princeton Survey Research Associates Sept. 7-8 among 756 registered voters, including 595 who said they were likely to vote in the election.
• The margin of error was 4 percentage points for the survey of registered voters and 5 percentage points for likely voters.
*Source: http://www.kellogg.northwestern.edu/faculty/weber/decs-433/Presidential_Polls.htm
Interpreting Electoral Polls
• Where do the 4% points and 5% points come from?
• Recall …
• A rough calculation for margin of error is $1/\sqrt{n}$, where $n$ is the sample size

Sample Size    Margin of Error
10,000         .01
2,500          .02
1,112          .03
625            .04
400            .05
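The values in the sample-size table are consistent with the rough rule "margin of error is about 1/√n". The sketch below assumes that rule (it is inferred from the table, not quoted from the slide) and applies it to the table's sample sizes and to the poll's 756 registered and 595 likely voters. The function name is made up for illustration.

```python
import math


def rough_margin_of_error(n: int) -> float:
    """Rough 95% margin of error for a sample proportion, assumed to be 1/sqrt(n)."""
    return 1.0 / math.sqrt(n)


# Sample sizes from the table, plus the two sample sizes reported for the poll.
for n in (10_000, 2_500, 1_112, 625, 400, 756, 595):
    print(f"n = {n:>6,}: rough margin of error = {rough_margin_of_error(n):.3f}")
```

Under this rule the poll's samples give roughly 3.6 and 4.1 percentage points, which is in line with the 4 and 5 points quoted in the article if the pollster rounded up.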
Interpreting Electoral Polls
• What are the precise margins of error on each candidate’s support?

Candidate    Sample p (%)    Precise margin of error (% points)
Gore         47              3.55
Bush         39              3.47
Nader        3               1.21
Buchanan     1               .71

• Conclusions
  • $1/\sqrt{n}$ is an upper bound for the margin of error at the 95% confidence level
  • But it is much too big for proportions far from 50%
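The "precise" margins in the table look like the usual normal-approximation margin for a proportion, z·√(p(1−p)/n), with n = 756 and z ≈ 1.96. The slide does not state that formula, so treat it as an assumption in this minimal sketch. The computed values may differ from the slide in the last digit because of rounding.

```python
import math

Z_95 = 1.96  # standard normal critical value for a 95% confidence level


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Normal-approximation margin of error for a sample proportion p."""
    return z * math.sqrt(p * (1.0 - p) / n)


n = 756  # registered voters in the Newsweek poll
shares = {"Gore": 0.47, "Bush": 0.39, "Nader": 0.03, "Buchanan": 0.01}

for name, p in shares.items():
    print(f"{name:<9} p = {p:.2f}  margin = {100 * margin_of_error(p, n):.2f} points")

# The rough rule 1/sqrt(n) bounds the margin from above because p(1 - p) <= 1/4.
print(f"rough bound: {100 / math.sqrt(n):.2f} points")
```

Since p(1−p) is largest at p = 0.5, the margin shrinks as support moves away from 50%, which is why the rough rule overstates the uncertainty for Nader and Buchanan.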
Interpreting Electoral Polls
• Is Gore statistically ahead?
• Rule of thumb to use when watching Fox/MSNBC/etc:
  • Double the margin of error reported in the news article & compare that to the difference in sample proportions
• D will then be the difference between the proportion of voters supporting Gore and the proportion supporting Bush
• Find E[D] and Var[D]
Interpreting Electoral Polls
• Is Gore statistically ahead? Yes …
• Rule of thumb to use when watching Fox/MSNBC/etc:
  • Double the margin of error reported in the news article & compare that to the difference in sample proportions
• 95% CI for the “lead”
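One way to get a 95% CI for the lead is to score each respondent +1 for Gore, −1 for Bush, and 0 otherwise. The sample mean of those scores is the lead D, with E[D] = p₁ − p₂ and Var[D] = (p₁ + p₂ − (p₁ − p₂)²)/n. The sketch below assumes simple random sampling and the normal approximation; the slide does not spell out its own derivation, and the function name is illustrative.

```python
import math


def lead_confidence_interval(p1: float, p2: float, n: int, z: float = 1.96):
    """Approximate CI for p1 - p2 when both shares come from the same sample."""
    d = p1 - p2
    # Per-respondent scores are +1, -1, or 0, so the variance is p1 + p2 - d^2.
    var_d = (p1 + p2 - d ** 2) / n
    half_width = z * math.sqrt(var_d)
    return d - half_width, d + half_width


low, high = lead_confidence_interval(0.47, 0.39, 756)
print(f"95% CI for Gore's lead: ({100 * low:.1f}, {100 * high:.1f}) points")
```

Under these assumptions the interval excludes zero, which agrees with the slide's "Yes".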
The Structure of Research (aka “the scientific method”)
An "hourglass" notion of research
• Begin with broad questions to a problem.
• Narrow down, focus in.
• Operationalize.
• OBSERVE
• Analyze data.
• Reach conclusions.
• Generalize back to questions.
The Research Process
1. Identify a problem/issue/topic of interest (or be intellectually curious)
   e.g., obesity, low literacy, healthcare, income inequality, poverty, crime, investing, child mortality
2. Read the existing literature and write a thorough literature review
The Research Process
3. Narrow the problem/issue down to a focused research question
   e.g., What is the effect of education on income? What are the determinants of child mortality?
4. Outline a theory or conceptual framework
5. State a testable hypothesis
6. Design a study to answer the research question ... how does one infer causality?
The Research Process
7. Collect data
   • measure outcomes and the factors causing those outcomes
8. Analyze and interpret the data
9. Revise the theory as needed
10. Replicate the study/repeat for different populations (no single study is definitive)
Where Do Research Topics Come From? • Practical problems in each field/discipline • Literature in the field/discipline • Your own thinking, experience, knowledge
The Structure of Research
What kinds of questions does science address?
• Descriptive: What is the general opinion about democracy in a country?
• Relational: Do women and men view democracy differently?
  **Note: gender does not “cause” different opinions about democracy
• Causal: Does an increase in per capita income cause a country to become more democratic?
http://www.vanderbilt.edu/AEA/students/Econliterature.htm
Elements of a Research Question
• What? Obesity (BMI)
• Who? Children, Adolescents, Adults, “Vulnerable” population
• Where? Hospital, Worksite, Neighborhood, State, Country
• When? Today, last week, last month, last year
• Why? Diet, exercise, socioeconomic status, availability of sidewalks, aggressive fast food advertising
Types of Association Between Variables
[Figure: two scatterplot panels. Panel 1: dependent variable Body Mass Index against independent variable Exercise. Panel 2: dependent variable Body Mass Index against independent variable # of Fast Food Burgers.]
Types of Association Between Variables
[Figure: two panels. Panel 1: dependent variable Weight (lbs) against independent variable Time in Weight Management Program, with series for Cardio Exercise and Strength Training. Panel 2: dependent variable Body Mass Index against independent variable Hair Color.]
• Remember that a sample correlation coefficient can quantify the magnitude of an association, but it does not imply causality!
How do I determine if higher per capita income causes a country to become more democratic?
• Three criteria must be met in order to infer causality
  1. Association
  2. Direction of Influence
     • The cause must precede its effect.
     • Hypothesized relationships should always specify direction of influence whenever possible.
  3. Non-spuriousness
     • Elimination of rival hypotheses
Nonspuriousness
[Figure: Number of storks and Number of births, with a "?" on the link from storks to births. Source: Wallis & Roberts 1956]
Nonspuriousness
[Figure: two diagrams, each with a "?" on the link between the two variables. Diagram 1: Number of firefighters and Amount of damage. Diagram 2: college G.P.A. and Income 20 years after college.]
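The firefighter example can be sketched as a small simulation. It assumes a lurking variable, here called fire size, that drives both the number of firefighters sent and the damage done. All numbers and names below are made up for illustration and do not come from the slides. The raw correlation is strong, but it nearly vanishes once fire size is held fixed by correlating regression residuals.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(ys, xs):
    """Residuals from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical data: fire size causes both variables; neither causes the other.
fire_size = [rng.uniform(1, 10) for _ in range(2000)]
firefighters = [3 * s + rng.gauss(0, 2) for s in fire_size]
damage = [10 * s + rng.gauss(0, 8) for s in fire_size]

r_raw = pearson_r(firefighters, damage)
r_partial = pearson_r(residuals(firefighters, fire_size), residuals(damage, fire_size))

print(f"raw correlation:                  {r_raw:.2f}")
print(f"correlation holding size fixed:   {r_partial:.2f}")
```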
Nonspuriousness
• Consider the following headline:
  • “BOTTLED WATER LINKED TO HEALTHIER CHILDREN”
  • It invites a causal inference, but is one warranted?
• “LONGER HAIR AND HIGHER G.P.A. GO HAND IN HAND AT COLLEGE X”
• And I saw this ad on TV this weekend:
  • “Companies that use SAP software earn 32% higher profits.”
Nonspuriousness • Famous orchestra conductors were found to have a sample mean life expectancy of 73.4 years (Atlas, 1978). • Is this relatively long? What’s the relevant comparison group? • Orchestra musicians? Nonfamous conductors? General public? • Researcher chose the U.S. population, and mean life expectancy at the time was 68.5 years. • Famous conductors get 5 extra years! Causal?
Brief Introduction to Research Design
• Design Notation
• Internal Validity
• Experimental Design
Design Notation
• Observations or measures are indicated with an “O”
• Treatments or programs with an “X”
• Groups are shown by the number of rows
• Assignment to group is by “R, N, C”
  • R: Random assignment to groups
  • N: Nonequivalent assignment to groups
  • C: Cutoff assignment to groups
• Time
Design Notation Example

R   O1,2   X   O1,2
R   O1,2       O1,2

• R indicates the groups are randomly assigned.
• There are two lines, one for each group.
• Os indicate different waves of measurement.
• Subscripts indicate subsets of measures.
• X is the treatment.
• Vertical alignment of Os shows that pretest and posttest are measured at the same time.
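The "R" in the notation can be sketched in a few lines: shuffle the units, then split them into a treatment row and a control row. This is only one simple way to randomize, and the function name and the 20 numbered units are hypothetical.

```python
import random


def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Twenty hypothetical participants, identified by number.
treatment, control = randomly_assign(range(20), seed=42)
print("treatment (row with X):", sorted(treatment))
print("control (row without X):", sorted(control))
```

Because chance alone decides who lands in which row, the two groups should differ only randomly before the treatment is applied.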
Types of Designs
• Random assignment?
  • Yes: Randomized (true experiment)
  • No: Control group or multiple measures?
    • Yes: Quasi-experiment
    • No: Nonexperiment
Non-Experimental Designs

• Post-test only (case study)
      X   O

• Single-group, pre-test, post-test
  O   X   O

• Two-group, post-test only (static group comparison)
      X   O
          O
Experimental Designs
• Pretest-Posttest Randomized Experiment Design
  • If continuous measures, use t-test
  • If categorical outcome, use chi-squared test
• Posttest only Randomized Experiment Design
  • Less common due to lack of pretest
  • Probabilistic equivalence between groups
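The two test statistics named above can be computed by hand, as in this rough sketch. It uses Welch's unequal-variance form of the two-sample t statistic, which may not be the version the course intends, and Pearson's chi-squared statistic for a contingency table. The scores and counts are invented for illustration, and p-values are left out because they need the t and chi-squared distributions.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def chi_squared(table):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical gain scores (posttest minus pretest) for each group.
treatment_gain = [5, 7, 6, 9, 4, 8]
control_gain = [2, 3, 1, 4, 2, 3]
print(f"t statistic: {welch_t(treatment_gain, control_gain):.2f}")

# Hypothetical counts: rows are groups, columns are improved / not improved.
print(f"chi-squared statistic: {chi_squared([[20, 10], [10, 20]]):.2f}")
```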
Experimental Designs
Solomon Four-Group Design
• Advantages
  • Information is available on the effect of treatment (independent variable), the effect of pretesting alone, possible interaction of pretesting & treatment, and the effectiveness of randomization
• Disadvantages
  • Costly and more complex to implement