Research Methods in Politics
15 Testing for Association
Research Methods in Politics Chapter 15
Teaching and Learning Objectives
• to consider the concept of association between variables
• to learn how to test for association by the application of correlation analysis and by applying tests of significance to the results
• to learn how association between variables can be expressed as equations
• to learn what is meant by regression and how to produce explanatory equations between two or more variables by applying linear regression analysis and multiple linear regression analysis
Association
• univariate statistics: x
• bivariate statistics: x, y
• multivariate statistics: x1, x2, x3, . . . xn, y
• independent variable (cause, driver): x
• dependent variable (effect, result, outcome): y
• expressed as y = f(x), i.e. y is a function of x
• the relationship may be linear, exponential, logarithmic, curvilinear, etc.
Example: Pay and Union Density: is there any association (relationship)?

Industry                  Pay as % of all industries   Union density %
Highest industry pay
Newspaper Printing        157.8                        94
Mineral Oil Refining      143.2                        59
Underground Workers       145.9                        99
Coal Mining               133.6                        97
Air Transport             130.8                        85
Electricity and Gas       122.9                        95
Other Printing            120.7                        94
Port and Inland Water     119.8                        83
General Chemicals         117.5                        59
Aerospace Engineering     117.3                        80
. . .
Lowest industry pay
Wholesale Distribution    86.7                         15
Textiles                  86.2                         99
Motor Repairs             85.5                         60
Industrial Materials      85.3                         15
Clothing                  84.0                         42
Retail Distribution       83.6                         15
Woollen Worsted           82.3                         47
Education Services        80.6                         78
Catering                  77.9                         8
Agriculture               72.4                         23

Table 15.1 Pay and union density. Source: Department of Employment, 1981: Table 54
X-Y graph of Table 15.1: Union density and pay, 1981
Analysis
• as union density increases, pay appears to increase
• the association appears to follow a straight line
• Pay = P0 + g × Union Density
• where P0 is the pay level where union density is 0 and g is the gradient of the straight line
• the strength of the relationship can be calculated: the coefficient of correlation, r
Regression
• coefficient of correlation, r
• devised by Galton (1822-1911) to measure regression
• regression: the tendency of children to have height etc. nearer the mean than their parents (‘going back’)
• r measures the tendency of paired data to regress
• the relationship between two paired variables can be expressed as a linear regression equation: y = a + bx + ε
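The slide gives the form of the regression equation but not how a and b are obtained. As a reference sketch, the usual least-squares estimates are shown below. These are the standard textbook formulas rather than anything stated on the slides, and the symbols \bar{x}, \bar{y} (sample means) and n (number of pairs) are notation assumed here.

```latex
\[
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
\]
```

One consequence of the second formula is that the fitted line always passes through the point of means, (\bar{x}, \bar{y}).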
coefficient of correlation, r
• r is expressed on a scale between +1.00 and -1.00
• r = +1.00: perfect positive correlation
• r = -1.00: perfect negative correlation
• r = 0.00: no linear correlation
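For reference, the scale above follows from the usual definition of the (Pearson) coefficient of correlation. The formula is the standard one and is not quoted from the slides. The notation \bar{x}, \bar{y} for the sample means is assumed here.

```latex
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
```

The numerator can be positive or negative, while the denominator is never smaller than the numerator in absolute value, which is why r is confined to the range -1.00 to +1.00.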
coefficient of correlation, r
• r = 0.1: the association is termed ‘of small importance’
• r = 0.3: the association is termed ‘of medium importance’
• r = 0.5: the association is termed ‘of large importance’
• r > 0.7: beware of collinearity, x ← z (concealed variable) → y
• interpretation from the coefficient of determination, R²: the proportion of the variance (change) in one variable that can be attributed to another
• evidence of correlation (association) does not necessarily mean causation
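A minimal Python sketch of these ideas, applied to the 20 pairs in Table 15.1, might look as follows. The helper names `pearson_r` and `describe` are hypothetical. Treating the slide's values 0.1, 0.3 and 0.5 as lower bounds of bands, and labelling anything below 0.1 "negligible", are assumptions made for illustration.

```python
import math

# Table 15.1 as (union density %, pay as % of all industries)
DATA = [
    (94, 157.8), (59, 143.2), (99, 145.9), (97, 133.6), (85, 130.8),
    (95, 122.9), (94, 120.7), (83, 119.8), (59, 117.5), (80, 117.3),
    (15, 86.7), (99, 86.2), (60, 85.5), (15, 85.3), (42, 84.0),
    (15, 83.6), (47, 82.3), (78, 80.6), (8, 77.9), (23, 72.4),
]


def pearson_r(pairs):
    """Coefficient of correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def describe(r):
    """Label |r| using the slide's values as band lower bounds (assumption)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # label not on the slide; added for completeness


r = pearson_r(DATA)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}, importance: {describe(r)}")
```

On this reading, roughly 47% of the variance in relative pay is associated with union density. That figure describes association only and says nothing about causation.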
Calculating Regression Statistics
• using MS Excel: apply to Pay/Union Density paired data

SUMMARY OUTPUT [part]
Regression Statistics
Multiple R           0.688452
R Square             0.473966
Adjusted R Square    0.444742   [most reliable R²]
Standard Error       20.01117
Observations         20

               Coefficients   Standard Error   t Stat   P-value    Lower 95%   Upper 95%
Intercept      71.48414       9.822873         7.27     9.18E-07   50.84703    92.12124
X Variable 1   0.564809       0.140249         4.02     0.00079    0.270157    0.859461

Pay = 71.5 + 0.56 Union Density
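The slides use Excel. As an alternative sketch, the same figures can be approximated in plain Python from the Table 15.1 data. The function name `simple_ols` is hypothetical, and the formulas are the standard least-squares ones rather than anything specific to Excel.

```python
import math

# Table 15.1: union density % (x) and pay as % of all industries (y)
DENSITY = [94, 59, 99, 97, 85, 95, 94, 83, 59, 80,
           15, 99, 60, 15, 42, 15, 47, 78, 8, 23]
PAY = [157.8, 143.2, 145.9, 133.6, 130.8, 122.9, 120.7, 119.8, 117.5, 117.3,
       86.7, 86.2, 85.5, 85.3, 84.0, 83.6, 82.3, 80.6, 77.9, 72.4]


def simple_ols(xs, ys):
    """Least-squares fit of y = a + b*x.

    Returns (intercept, slope, standard error of regression,
    standard error of the slope).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(sse / (n - 2))
    slope_se = std_error / math.sqrt(sxx)
    return intercept, slope, std_error, slope_se


intercept, slope, std_error, slope_se = simple_ols(DENSITY, PAY)
print(f"Pay = {intercept:.1f} + {slope:.2f} * Union Density")
print(f"standard error = {std_error:.3f}, slope standard error = {slope_se:.4f}")
```

The intercept, slope and standard errors should agree with the Intercept, X Variable 1 and Standard Error entries of the summary output to within rounding.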
Null Hypothesis
• H0: no relationship between Pay and Union Density
• H1: a relationship between Pay and Union Density
• refer back to the summary output

               Coefficients   Standard Error   t Stat   P-value    Lower 95%   Upper 95%
Intercept      71.48414       9.822873         7.27     9.18E-07   50.84703    92.12124
X Variable 1   0.564809       0.140249         4.02     0.00079    0.270157    0.859461

• the P-value is the observed level of significance; the conventional threshold in Politics is 0.05
• ‘if the p-value is low, then H0 can go’
• both P-values are below 0.05
• the Null Hypothesis can be rejected
• there is a statistically significant association between Pay and Union Density
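To make the link between the t Stat and P-value columns concrete, the sketch below recomputes both for the slope from the reported coefficient and standard error. It assumes n - 2 = 18 degrees of freedom and a two-tailed test. It obtains the p-value by numerically integrating the Student t density with Simpson's rule, a stand-in chosen to avoid external libraries and not the method Excel uses. The function names are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=10_000):
    """P(|T| >= |t_stat|), via Simpson's rule on [0, |t_stat|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t_stat|)
    return 1 - 2 * central_area


# Values taken from the summary output (X Variable 1 row)
slope, slope_se, n = 0.564809, 0.140249, 20
df = n - 2                    # assumption: simple regression with one predictor
t_stat = slope / slope_se     # t Stat = coefficient / standard error
p_value = two_tailed_p(t_stat, df)
reject_h0 = p_value < 0.05    # the 0.05 threshold quoted on the slide

print(f"t = {t_stat:.2f}, p = {p_value:.5f}, reject H0: {reject_h0}")
```

The design choice here is transparency over convenience: a statistics library would return the p-value in one call, but the integration shows that the p-value is simply the area in both tails beyond the observed t.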
Multiple Regression Analysis
• y = A + B1x1 + B2x2 + B3x3 + . . . + Bnxn + ε
• where x1, x2, x3, . . . xn are independent variables and ε is the residual error
• additional data for Pay/Union Density shows
• the multiple linear regression equation is
• y = 35.2 + 0.46x1 + 0.14x2 + 0.31x3 + 0.21x4 + ε
• Relative Pay % = 35.2 + 0.46 [Union Density %] + 0.14 [% workers in plants of 500+] + 0.31 [% male workers] + 0.18 [% UK market share] + residual error
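The additional pay data are not reproduced in the slides, so the sketch below illustrates the technique on a small made-up data set instead. It fits y = A + B1x1 + B2x2 by solving the least-squares normal equations with Gauss-Jordan elimination. The function name `fit_multiple_ols` and every number in the example are hypothetical. In practice Excel or a statistics package would do this step.

```python
def fit_multiple_ols(rows, ys):
    """Least-squares coefficients [A, B1, ..., Bn] for y = A + B1*x1 + ... + Bn*xn.

    Solves the normal equations (X'X) beta = X'y by Gauss-Jordan elimination.
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]  # leading 1 = intercept
    k = len(design[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, ys))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry into the pivot row
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for row in range(k):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][k] for i in range(k)]


# Hypothetical data, constructed so that y = 10 + 0.5*x1 + 2*x2 exactly
X = [(10, 1), (20, 3), (30, 2), (40, 5), (50, 4), (60, 6)]
Y = [10 + 0.5 * x1 + 2 * x2 for x1, x2 in X]

coefficients = fit_multiple_ols(X, Y)
print([round(c, 6) for c in coefficients])
```

Because the made-up y values contain no residual error, the fit recovers the generating coefficients. With real data the coefficients would be estimates, and strongly collinear independent variables would make the normal equations unstable.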
R-matrix
Questions for Discussion or Assignments
1. ‘Correlation does not necessarily mean causation’. Discuss. Explain how you would investigate a high correlation for causation.
2. The table below shows paired data for the total number of UK workers registered as unemployed and membership of the British Communist Party, 1929-39.

Year   UK unemployed (000s)   BCP membership
1929   1,216                  3,200
1930   1,917                  2,555
1931   2,630                  6,279
1932   2,745                  5,600
1933   2,521                  5,700
1934   2,159                  5,800
1935   2,036                  7,700
1936   1,755                  11,500
1937   1,484                  12,250
1938   1,791                  15,570
1939   1,514                  17,756
Questions for Discussion or Assignments II
2. (Continued) Is there any evidence of association between unemployment and party membership between 1929 and 1939?
• Using Excel, draw an X-Y graph of unemployment on the x-axis and CP membership on the y-axis.
• Calculate the coefficient of correlation.
• Calculate the linear regression equation.
• Calculate the contribution made by unemployment to CP membership.
• Test the statistical significance of the calculation. Can the null hypothesis be dismissed?
• The data shows that, after 1934, CP membership increased whilst unemployment fell. What other causes of increasing CP membership can you suggest and why?
Questions for Discussion or Assignments III
3. Read carefully the extract from the publication by Rallings, C., and Thrasher, M., (1997) Local Government Elections in Britain, London, Routledge, pp. 46-63.
• Examine the relevant 2001 census data for the wards of a UK city of your choice from the website www.neighbourhood.statistics.gov.uk
• On the basis of the information given in the selected text and the census data available, set out a hypothesis in answer to the research question: which three factors are most likely to have affected electoral turnout in local elections in your chosen city?
• Find and save the most recent headline election data for ward turnout for all of the wards in your selected city from its website www.[selected city].gov.uk
• Select three independent/collinear variables from the census information for testing your hypothesis.
• Transpose the data for ward names, turnout and your three selected census variables into a single spreadsheet. Produce X-Y charts of the data.
• Using Microsoft Excel spreadsheet software, create a spreadsheet consisting of all wards, the turnout data and the three variables you have selected from the census data.
• Calculate the coefficients of correlation between turnout and the selected census characteristics. What inferences can you draw from the association of turnout and the selected census data? Are the results significant? What limitations do you attach to these inferences?
• Using the appropriate formula within Excel, calculate the multiple linear regression equation between ward turnout (Y) and the independent variables (X1 . . Xn).
Your submission should be no fewer than 2,000 words, in report form. It must critically review the text by Rallings and Thrasher and clearly justify your choice of potential independent variables. You must then explicitly describe, justify and explain the analytic techniques you have adopted, the ‘results’, the linear regression equation calculated and the limitations attached to the output.