Controlling for a third variable: Examples (with one exception) from the NES data
Note: The example of a spurious relationship I gave in class was incorrect. It was incorrect because education was not the cause of race. (The expression “duh” comes to mind here.) • The substantive point of work on this subject was that lower participation on the part of African Americans could be explained by their lower levels of education; i.e., the explanation was not race per se. • That point is correct. But Professor Powell’s point, that education was an intervening variable, was also correct. • The example of a spurious relationship that follows is corrected. (Same as before, but with different variables.)
Example: Spurious relationship • Spurious relationships, where the original relationship more or less completely disappears when you control for a second variable, are quite rare. • So, for this one example, I’m going to show you hypothetical numbers.
Spurious relationship (cont.) • Hypothesis: People with low levels of political efficacy participate less than those with high efficacy. • But, this effect could be explained by lower levels of education, so controlling for education would make the relationship between efficacy and participation disappear. • That is, the relationship between efficacy and participation is spurious.
The simple (uncontrolled) relationship might look like this. Tau-c = -.33. The difference is not large, but it clearly shows that those with low efficacy participate less.
When we control for education, we might get something like this. Low education: tau-c = -.05. Medium education: tau-c = .01. High education: tau-c = .04.
Spurious relationship (cont.) • How do we interpret the results? • As strong evidence that the original relationship was spurious—i.e., that the difference in participation rates between more and less efficacious people was due to their differing education levels. • Question: Does this mean that people with low efficacy actually participate as much as those with high efficacy?
Spurious relationship (cont.) • No. The original relationship is not wrong. But it is misleading if left unexplained. • A sensible interpretation is that those with low efficacy lag in education and that is why they participate less. They evidently do not participate less because of various psychological or political motivations that are associated with low efficacy.
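To see the logic of a spurious relationship in numbers, here is a minimal Python sketch. The counts are hypothetical (they are not NES data and do not reproduce the tau-c values above), and `tau_c` is an illustrative helper implementing Stuart's tau-c from a table of counts. Rows are efficacy (low, high) and columns are participation (low, high), so the sign of the result depends on that coding.

```python
from itertools import product


def tau_c(table):
    """Stuart's tau-c for a crosstab given as a list of rows of counts.

    Rows and columns must both be listed in ascending category order.
    """
    n_rows, n_cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    m = min(n_rows, n_cols)  # the smaller table dimension
    concordant = discordant = 0
    for i, j in product(range(n_rows), range(n_cols)):
        # Compare cell (i, j) with every cell in a later row.
        for k, l in product(range(i + 1, n_rows), range(n_cols)):
            if l > j:
                concordant += table[i][j] * table[k][l]
            elif l < j:
                discordant += table[i][j] * table[k][l]
    return 2 * m * (concordant - discordant) / (n * n * (m - 1))


# Hypothetical counts: rows = efficacy (low, high), cols = participation (low, high).
strata = {
    "low education": [[56, 24], [14, 6]],
    "medium education": [[25, 25], [25, 25]],
    "high education": [[6, 14], [24, 56]],
}

# The uncontrolled table is the cell-by-cell sum of the three sub-tables.
overall = [
    [sum(t[i][j] for t in strata.values()) for j in range(2)]
    for i in range(2)
]

print(f"uncontrolled: tau-c = {tau_c(overall):.2f}")
for label, table in strata.items():
    print(f"{label}: tau-c = {tau_c(table):.2f}")
```

With these made-up counts, the uncontrolled tau-c is .16, while within each education category it is 0. Efficacy and participation are related overall only because both vary with education.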
Example: Conditional (specification) relationship • Hypothesis: People’s overall liberal-conservative views (judged by their self-placement) influence (cause) their feelings of attachment to the political parties (measured by their three-point party identification). • However, this relationship is likely not to be as strong for African Americans (who, overall, very often consider themselves Democrats).
Conditional relationship (cont.) • So, let’s first look at the relationship between lib-cons self placement and partisanship (uncontrolled).
Conditional relationship (cont.) • Now, let’s control for race (blacks vs. whites only).
Conditional relationship (cont.) • Summary of relationships: self-placement x party (tau-b). Overall: .38. Whites: .42. Blacks: .22.
Conditional relationship (cont.) • How do we interpret the results? • As strong support for the hypothesis—both that people’s liberal-conservative views influence their partisanship and that this relationship is conditioned by race (i.e., is stronger for whites than blacks). • How do we know that? Because when we control, the tau-c (.38) is strengthened for whites (.42) and considerably reduced (.22) for blacks.
Conditional relationship (cont.) • Note that this conclusion does not say that African Americans are more often Democratic—though that is true. • Rather, it says that the relationship between liberal-conservative views and partisanship is weaker for blacks. Presumably, blacks’ ideological views play less of a role (than for whites) in determining their partisanship.
Example: Intervening variable • Hypothesis: Partisanship has a very strong effect on who one votes for. • But, it has this effect because partisanship causes people to have very different views of issues and candidates, which in turn, influence the vote. • That is, issue and candidate views intervene between partisanship and voting choices.
Intervening variable (cont.) • So, let’s first look at the relationship between partisanship and vote choice (uncontrolled).
Intervening variable (cont.) • Now, let’s control for Clinton’s handling of the economy. (Yes, I’m aware that we’re looking at 2000 and Gore was the Dem. candidate. But how Clinton handled the economy might still have been important.)
Tau-c for the four categories of handling of the economy
Intervening variable (cont.) • How do we interpret the results? • First, as (pretty) strong support for the hypothesis (there is that messy .72). It appears as if views of Clinton’s handling of the economy intervene between partisanship and voting choices. • How do we know that? Because when we control, the tau-c’s are (with one exception) considerably reduced from the original value of .78 and because of our theory.
Intervening variable (cont.) • Second, partisanship makes some difference above and beyond that of Clinton’s handling of the economy. • How do we know that? The tau-c values are still quite large, even after controlling. (Also, look at the percentages in each of the sub-tables.)
Intervening variable (cont.) • Third, it’s another example of a conditional relationship. Note that (again, with the exception of the one category), as Clinton approval goes down, the relationship between partisanship and the vote is weaker. • NOTE: this does not simply mean that fewer people voted for Gore, though this is true—but that the relationship between partisanship and the vote was weaker.
Intervening variable (cont.) • What do we do with that pesky .72? • Don’t totally ignore it: the results are not perfect. Reality isn’t always simple. • If possible, try to explain why the “oddity” occurs. (In this case, I think it would be very difficult.) Try to find support for any explanation you come up with. • Don’t over-interpret; i.e., don’t come up with some unlikely, unsupported explanation.
Intervening variable (cont.) • One thing you should do is to look at the n (number of cases) underlying the odd result. (Here I’ve suppressed the n’s simply to make things big enough to read.) • In this case, the n is not small. Good try, but it doesn’t work. • We’re left with (as is often the case) a good, but not perfect (or perfectly explicable) analysis and interpretation.
Example: Antecedent variable • Hypothesis: Interest in the campaign causes people to be more informed about politics. • But, education is an antecedent variable—i.e., education causes people to be interested in politics and thus, indirectly, is a cause of knowledge. • If this sounds rather like the intervening variable case, it should (as you will see).
Antecedent variable (cont.) • What I’m going to do is simply follow the steps outlined by Professor Powell (next slide).
Using a third variable to find an antecedent cause: a causes b, but we can learn more by finding that a is caused by c. Here we start with: a → b (+). We ascertain: c → a. With… Then we identify a as intervening by predicting b with c and controlling for a. To the extent the relationship is attenuated by the control, c is antecedent and works through a.
Example: Antecedent variable • So, let’s first look at the relationship between campaign interest and political knowledge (uncontrolled). (a & b) • For convenience of presentation, I’ve collapsed the six-item knowledge scale into three categories. Generally, I would not do this. I would normally prefer to have more, rather than fewer, categories (especially in my dependent variable).
Antecedent variable (cont.) • Now we need to see if education is related to interest in the campaign. (c & a) • So, we crosstab education and interest.
Antecedent variable (cont.) • Note that we have a tau-c here of .15 (the output says -.15, but effectively, it’s a positive relationship). • Two steps left. • We next check to see if education is related to knowledge. (c & b)
Antecedent variable (cont.) • Education and knowledge are related (tau-c = .25) Note: if you are really quick-eyed, you will note that I used a tau-c here on a 3x3 table. I did so for consistency. • Finally, we again look at the relationship between education and knowledge, but now controlling for interest. (c & b, controlling for a)
Antecedent variable (cont.) • How do we interpret the results? • As some support for the hypothesis. It appears as if education is an antecedent cause of the relationship between campaign interest and political knowledge. • How do we know that? Because, when we controlled, the tau-c’s were reduced from the original value of .25 (to .24 for high interest and .16 for low interest) and because of our theory.
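The four steps used in this example (a & b; c & a; c & b; c & b controlling for a) can be sketched in Python. This is only an illustration under stated assumptions: the counts are hypothetical, each variable is reduced to two categories (0 = low, 1 = high), and `tau_c` and `tau_between` are made-up helper names. In this made-up data knowledge depends only on interest, so the control removes the education effect completely; the NES results above show only partial attenuation.

```python
from itertools import combinations


def tau_c(xs, ys):
    """Stuart's tau-c from paired ordinal scores (tied pairs count for neither side).

    Assumes each variable takes at least two distinct values in the data passed in.
    """
    n = len(xs)
    m = min(len(set(xs)), len(set(ys)))  # smaller number of observed categories
    s = 0  # concordant pairs minus discordant pairs
    for (x1, y1), (x2, y2) in combinations(list(zip(xs, ys)), 2):
        d = (x1 - x2) * (y1 - y2)
        s += (d > 0) - (d < 0)
    return 2 * m * s / (n * n * (m - 1))


# Hypothetical counts: (education c, interest a, knowledge b) -> number of people.
counts = {
    (0, 0, 0): 56, (0, 0, 1): 14, (0, 1, 0): 6, (0, 1, 1): 24,
    (1, 0, 0): 24, (1, 0, 1): 6, (1, 1, 0): 14, (1, 1, 1): 56,
}
people = [key for key, k in counts.items() for _ in range(k)]
C, A, B = 0, 1, 2  # positions of c, a, b in each tuple


def tau_between(first, second, rows):
    """Tau-c between two of the three variables for the given rows."""
    return tau_c([r[first] for r in rows], [r[second] for r in rows])


step1 = tau_between(A, B, people)  # interest x knowledge (a & b)
step2 = tau_between(C, A, people)  # education x interest (c & a)
step3 = tau_between(C, B, people)  # education x knowledge (c & b)
step4 = {                          # c & b, controlling for a
    level: tau_between(C, B, [r for r in people if r[A] == level])
    for level in (0, 1)
}

print(step1, step2, step3, step4)
```

Comparing `step3` with the two values in `step4` is the attenuation check: the more the controlled values shrink relative to the uncontrolled one, the more c appears to work through a.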
Some cautionary notes • Be careful about using a control variable with too many categories (or recode so there aren’t too many categories).
I took this relationship and then, unthinkingly, controlled for education, with 7 categories. (Education is not a very sensible control here theoretically, but set that aside.)
Cautionary notes (cont.) • There’s a lot of variation in the tau-c values, but no sensible pattern. • Some of the variability may be caused by small numbers of cases.
Similarly, in an earlier example of a specified relationship, I looked only at liberal-conservative views x partisanship for blacks and whites, not for all races. That’s because my theory told me what to expect for these two groups, not for others. • Which leads to the next point.
Cautionary notes (cont.) • Theory/reasoning is important. • What it makes sense to control for, and what the interpretation is once you’ve controlled, depends heavily on your reasoning about what causes what. • In particular, whether you have an intervening variable or an antecedent variable isn’t determined simply by the tables you run (or the measures you calculate).
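The first cautionary note (too many control categories, and small numbers of cases) can be checked before running any controlled tables. A minimal Python sketch, with assumptions labeled: the education codes and their frequencies are hypothetical, the 1-2 / 3-4 / 5-7 cutpoints are an arbitrary illustration, and the threshold of 30 cases is a rule of thumb chosen for the example, not a value from the slides.

```python
from collections import Counter

# Hypothetical recode of a 7-category education variable (1 = lowest, 7 = highest).
RECODE = {1: "low", 2: "low", 3: "medium", 4: "medium",
          5: "high", 6: "high", 7: "high"}


def collapse(codes, mapping=RECODE):
    """Recode each original category into its collapsed category."""
    return [mapping[c] for c in codes]


def small_groups(labels, min_n=30):
    """Return the categories with fewer than min_n cases, with their counts."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < min_n}


# Hypothetical sample of 200 respondents.
education = [1] * 8 + [2] * 25 + [3] * 60 + [4] * 45 + [5] * 40 + [6] * 15 + [7] * 7

print("before recoding:", small_groups(education))
print("after recoding: ", small_groups(collapse(education)))
```

In this made-up sample, four of the seven original categories fall below the threshold, and none of the three collapsed categories does. Where to put the cutpoints is still a question for theory, not for the code.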
An explanatory point • As shown in the text (pp. 87-92), the same sort of reasoning (about kinds of relationships) is applicable when you have interval-level variables and use means instead of crosstabs.
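As an illustration of the same reasoning with means, here is a minimal Python sketch. Everything in it is hypothetical: `y` stands for some interval-level dependent variable, `x` for the independent variable, `z` for the control, and `group_means` and `make` are made-up helpers. It does not reproduce the example in the text.

```python
from collections import defaultdict
from statistics import mean


def group_means(rows, by):
    """Mean of 'y' for each combination of the keys named in `by`."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[tuple(row[k] for k in by)].append(row["y"])
    return {key: mean(values) for key, values in buckets.items()}


def make(x, z, ys):
    """Build one row per y value for a given x category and z category."""
    return [{"x": x, "z": z, "y": y} for y in ys]


# Hypothetical data: y depends on z only, and x is unevenly spread across z.
rows = (
    make("low", "low", [1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
    + make("low", "high", [3.0, 5.0])
    + make("high", "low", [1.0, 3.0])
    + make("high", "high", [3.0, 4.0, 5.0, 3.0, 4.0, 5.0])
)

overall = group_means(rows, ["x"])       # uncontrolled comparison of means
within = group_means(rows, ["z", "x"])   # same comparison inside each z category

print(overall)
print(within)
```

Here the uncontrolled means differ (2.5 for low x, 3.5 for high x), but within each category of z the two x groups have the same mean (2.0 when z is low, 4.0 when z is high). That is the means version of a relationship disappearing under control.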
A look forward • It might occur to you to ask about controlling for more than one variable. • Good thought. We will. But we generally do not do it by using crosstabs—for obvious reasons about complexity and interpretability. • We will get into this soon by looking at correlation and regression.
Data Analysis #2: Due one week from today (by individuals, not pairs) • Directions are on the syllabus • Reminders (unnecessary for most of you): Do not simply give us SPSS tables. Do create tables with meaningful labels, only the entries that are necessary, and so on. Explain your results. (More than “yes, my hypothesis is supported.”)
Usually c. 3 pp. (double-spaced) + tables. Tables should go on a separate page. • Writing is important. Use clear, straightforward prose. Proper grammar; correct spelling, punctuation, and capitalization; typo-free.