Probability, Statistics and the Logic of Scientific Causality: An Introduction to Social Science Data Analysis
Logic of Scientific Inquiry • Inter-subjective standards: mathematics, logic • Subjective standards: religion, normative philosophy
What is the job of a social scientist? • To use whatever tools we have at our disposal to try to prove our own causal theory wrong. • Tools: logic, empirical observation • Try to disprove our theory as much as possible • Unfortunately, we cannot prove anything • Always estimate the level of uncertainty in any claim
Approaches to Political Science • Interpretivism – if we describe the phenomenon, what does it mean? • Behavioralism – the underlying roots of our attitudes and behaviors • Neo-institutionalism – the relationships among attitudes and behaviors depend on rules and other constraints • Rational Choice – assume preferences, deduce outcomes
The Research Question: Why does ‘y’ vary? • Why do some people vote for Democrats and others for Republicans? • Why do some ethnic conflicts get resolved and others end in holocausts? • Why do some democracies remain stable and others fall apart? • Why are some economies successful and others not? • Why do some civil conflicts result in revolution and others do not? • What causes some people to support the civil liberties of political enemies? • What causes some people to trust one another and others not to trust? • What causes some people to participate in their government? • What causes some people to bring litigation against their government?
Measurement • Inter-subjective measures: inches, degrees Fahrenheit, dollars • Continuous v. discrete
Concepts that are difficult to measure inter-subjectively • Democracy • Self-Esteem • Ideology
Face Validity: Political Tolerance • If your worst political enemy (e.g. Nazis, KKK) came to your town, would you support their right to march downtown? • 1 = Not support at all • 2 = Not really support • 3 = Somewhat support • 4 = Strongly support
Reliability • Repeat study: the same people who scored a 1 should score about a 1 again, and so on. • Multiple questions at the same time.
High Variance [Figure: data points relative to the Mean]
Low Variance [Figure: data points relative to the Mean]
Variance • Total Variance: the sum of the squared distances from each point to the mean, divided by the number of observations. • Squared distances: 4² = 16, 3² = 9, 0² = 0, 2² = 4, 1² = 1, 1² = 1, 1² = 1 (a point that sits on the mean has no error) • Total Variance = 32/7 = 4.57
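The arithmetic on this slide can be sketched in a few lines of Python. The seven data values below are hypothetical: the slide gives only the distances from the mean, so the values were chosen to reproduce those distances (4, 3, 0, 2, 1, 1, 1).

```python
# Hypothetical data, chosen so the distances from the mean match the slide
# (4, 3, 0, 2, 1, 1, 1). The actual data behind the slide are not given.
data = [9, 2, 5, 3, 6, 6, 4]

mean = sum(data) / len(data)                        # 35 / 7 = 5.0
squared_deviations = [(x - mean) ** 2 for x in data]
total_variance = sum(squared_deviations) / len(data)

print(squared_deviations)        # [16.0, 9.0, 0.0, 4.0, 1.0, 1.0, 1.0]
print(round(total_variance, 2))  # 4.57
```

Dividing by the number of observations gives the population variance, as on the slide. Dividing by one less than the number of observations would give the sample variance instead.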
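Variance: Political Tolerance [Figure: distribution of responses; vertical axis: Number of People]
Variance – Skewed Distribution [Figure: distribution of responses; vertical axis: Number of People]
Low Variance [Figure: distribution of responses; vertical axis: Number of People]
High Variance [Figure: distribution of responses; vertical axis: Number of People]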
Standard Deviation The square root of the variance. We square the deviations from the mean, but squared units are hard to interpret, so we take the square root to return to the original units of measurement.
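Standard Deviation • The standard deviation is the square root of the variance, where the variance is the sum of the squared distances from each point to the mean, divided by the number of observations. • Squared distances: 4² = 16, 3² = 9, 0² = 0, 2² = 4, 1² = 1, 1² = 1, 1² = 1 (a point that sits on the mean has no error) • Total Variance = 32/7 = 4.57; Standard Deviation = √(32/7) = 2.14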
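As a quick check on the step from variance to standard deviation, here is a minimal Python sketch using the standard library. The data values are hypothetical, picked only so that their squared distances from the mean sum to 32 over 7 observations, as on the slide.

```python
import math
import statistics

# Hypothetical data whose squared distances from the mean are
# 16, 9, 0, 4, 1, 1, 1 (sum 32), matching the slide.
data = [9, 2, 5, 3, 6, 6, 4]

variance = statistics.pvariance(data)  # population variance: 32 / 7
std_dev = math.sqrt(variance)          # back in the original units

print(round(variance, 2))  # 4.57
print(round(std_dev, 2))   # 2.14
```

`statistics.pstdev(data)` returns the same standard deviation in one call. The two-step version is shown here because it mirrors the slide.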
Measurement We will spend a great deal of time on measurement in this class
Probability • The probability of an outcome … • Is the frequency of that outcome • if the process were repeated a large number of times… • Under similar conditions
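The long-run frequency idea can be illustrated with a small simulation. This is only a sketch: the fair six-sided die, the seed, and the function name `relative_frequency` are assumptions made for the example, not part of the lecture.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def relative_frequency(n_rolls):
    """Share of n_rolls fair-die rolls that come up six."""
    sixes = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == 6)
    return sixes / n_rolls

# Repeat the same process under the same conditions, more and more times.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

With few rolls the observed frequency can be far from 1/6. As the number of repetitions grows, it settles close to 1/6, which is what the slide means by the probability of the outcome.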
Probability is not causality • Fire trucks and fire damage • Storks and babies • In each pair the two go together, yet the first does not cause the second
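The fire truck example can be sketched as a simulation in which a third factor drives both variables. Everything here is assumed for illustration: the data-generating process, the coefficients, and the helper `pearson` are not from the lecture.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Assumed process: the size of the fire drives both the number of trucks
# sent and the damage done. Trucks have no effect on damage here.
fire_size = [random.uniform(0, 10) for _ in range(5000)]
trucks = [s + random.gauss(0, 1) for s in fire_size]
damage = [2 * s + random.gauss(0, 1) for s in fire_size]

raw = pearson(trucks, damage)

# Remove the part of each variable that is due to fire size
# (possible only because we wrote the process ourselves).
trucks_net = [t - s for t, s in zip(trucks, fire_size)]
damage_net = [d - 2 * s for d, s in zip(damage, fire_size)]
net = pearson(trucks_net, damage_net)

print(round(raw, 2), round(net, 2))
```

The raw correlation is strong, but once fire size is accounted for the remaining correlation is close to zero. The statistical relationship is real, while the causal story "trucks cause damage" is not.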
Causal Theory We will spend a great deal of time on causal theory in this class
Statistical Theory • Frequentist statistical theory assumes repeated observations. • With a large sample, we assume that we have repeated observations. • How large is large? 60.
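One way to see what "repeated observations" buys us is to draw many samples from the same population and watch how much the sample mean moves around. The sketch below assumes a hypothetical population of tolerance scores on a 1 to 4 scale. The sample sizes 10, 60, and 600 are chosen only for comparison, and `sample_means` is an illustrative helper.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical population: tolerance scores on a 1 to 4 scale.
population = [random.choice([1, 2, 3, 4]) for _ in range(100_000)]

def sample_means(sample_size, n_samples=2000):
    """Means of n_samples random samples, each of size sample_size."""
    return [statistics.fmean(random.sample(population, sample_size))
            for _ in range(n_samples)]

# Spread of the sample mean across repeated samples, by sample size.
for n in (10, 60, 600):
    spread = statistics.pstdev(sample_means(n))
    print(n, round(spread, 3))
```

The spread of the sample means shrinks as the sample size grows, so a single large sample behaves much like an average over many repetitions. The sketch does not establish that 60 is the right threshold. It only shows the direction of the effect.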
Statistical Relationships [Figure: Political Tolerance (horizontal axis, 0 to 5) by Education (Lowest, Low, Medium, High, Highest); vertical axis 0 to 40]
Statistical Relationships [Figure: Political Tolerance (vertical axis, 0 to 6) plotted against Education (horizontal axis, 0 to 6), with the Slope and the Mean marked]
Probability: Best Guess • What is the probable value of tolerance, given a level of education? • This is what the slope tells us. [Figure: Political Tolerance (vertical axis, 0 to 6) plotted against Education (horizontal axis, 0 to 6), with the Slope and the Mean marked]
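To make "the best guess given education" concrete, here is a minimal least-squares sketch. The ten observations are hypothetical (the data behind the figure are not available), and `best_guess` is an illustrative helper name.

```python
# Hypothetical survey data (not from the slides): education level and
# political tolerance, both coded 1 to 5.
education = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
tolerance = [1, 2, 2, 3, 2, 4, 3, 5, 4, 5]

n = len(education)
mean_x = sum(education) / n
mean_y = sum(tolerance) / n

# Least-squares slope: covariation of x and y over variation of x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(education, tolerance))
s_xx = sum((x - mean_x) ** 2 for x in education)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

def best_guess(edu_level):
    """Predicted tolerance for a given education level."""
    return intercept + slope * edu_level

print("mean of tolerance:", round(mean_y, 2))
for level in range(1, 6):
    print(level, round(best_guess(level), 2))
```

Without education, the best guess for everyone is the mean of tolerance. With the slope, the guess moves up or down depending on the education level. In this made-up data the slope is 0.75 tolerance points per education level.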
The Mean • The mean is the best guess if all you have is a single variable • The purpose of the mean is to minimize squared error in guessing • The mean is the expected value [Figure: data points around the Mean; a point that sits on the mean has no error]
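The claim that the mean minimizes error can be checked directly, taking "error" to mean the sum of squared distances. The data below are hypothetical and `sum_squared_error` is an illustrative helper.

```python
# Hypothetical data with mean 5.
data = [9, 2, 5, 3, 6, 6, 4]
mean = sum(data) / len(data)

def sum_squared_error(guess):
    """Total squared distance between a single guess and every data point."""
    return sum((x - guess) ** 2 for x in data)

# Compare the mean against a few other candidate guesses.
for guess in (3, 4, mean, 6, 7):
    print(guess, sum_squared_error(guess))
```

Any guess other than the mean produces a larger sum of squared errors, and the penalty grows the further the guess is from the mean.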
Our job in statistical analysis • Is to do better than the mean at making the ‘best guess’
Variance: Explained and Unexplained • Unexplained: distance from the points to the slope • Explained: distance from the slope to the mean [Figure: Political Tolerance (vertical axis, 0 to 6) plotted against Education (horizontal axis, 0 to 6), with the Slope and the Mean marked]
Remember when we said that our job in statistical analysis is to do better than the mean? We use the slope to: • Minimize error: the distance between the points and the slope • Maximize explained variance: the distance between the mean and the slope • By definition, doing one does the other.
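The split between explained and unexplained variance can be sketched numerically. The data are hypothetical, and the fit is ordinary least squares, which is assumed here to be what the slides mean by "the slope".

```python
# Hypothetical survey data (not from the slides), both coded 1 to 5.
education = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
tolerance = [1, 2, 2, 3, 2, 4, 3, 5, 4, 5]

n = len(education)
mean_x = sum(education) / n
mean_y = sum(tolerance) / n

# Ordinary least-squares line through the points.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(education, tolerance))
s_xx = sum((x - mean_x) ** 2 for x in education)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in education]

# Total: points to mean. Explained: line to mean. Unexplained: points to line.
total = sum((y - mean_y) ** 2 for y in tolerance)
explained = sum((p - mean_y) ** 2 for p in predicted)
unexplained = sum((y - p) ** 2 for y, p in zip(tolerance, predicted))

print(round(total, 2), round(explained, 2), round(unexplained, 2))
print("share explained:", round(explained / total, 3))
```

For a least-squares line the explained and unexplained parts add up to the total, so shrinking the unexplained part is the same thing as growing the explained part. The share explained is the quantity usually reported as R-squared.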
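Decrease Error [Figure: Political Tolerance (vertical axis, 0 to 6) with the Slope and the Mean marked; horizontal axis 0 to 6]
Variance: Explained and Unexplained [Figure: Political Tolerance (vertical axis, 0 to 6) plotted against Education (horizontal axis, 0 to 6), with the Slope and the Mean marked]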
Why does ‘y’ vary? • Our job as social scientists is to explain variance. • Statistically, we do that by separating explained from unexplained variance. [Figure labels: Unexplained, Explained]
Aspects of Political Science Data Analysis • Causal Theory • Measurement Theory • Explain Variation