God Helps Those Who Help Themselves? The Effects of Religious Affiliation, Religiosity, and Deservedness on Generosity Toward the Poor
• The problem for this class is taken from the journal article:
• Jeffrey A. Will and John K. Cochran, "God Helps Those Who Help Themselves?: The Effects of Religious Affiliation, Religiosity, and Deservedness on Generosity Toward the Poor." Sociology of Religion, 1995, 56:3, 327-338.
• This analysis adds additional variables for respondents to the analysis reported in the article:
• Jeffrey A. Will, "The Dimensions of Poverty: Public Perceptions of the Deserving Poor." Social Science Research, 22, 312-332 (1993).
• This analysis presumes that the dimensions of poverty article has been reviewed.
• The data for this problem is available in the data set, DeservingPoor.Sav, which can be downloaded from the download web page. The data has been recoded from the raw GSS data to the format presented in the article.
God Helps Those Who Help Themselves
Stage 1 Summary: Definition Of The Research Problem

Relationship to be Analyzed
"Our concern in this study, therefore, is to examine the influence of religious variables on generosity toward the poor...we examine how specific characteristics of poor families influence generosity and how the effects of deservedness vary across faith groups." (Page 328)

Specifying the Dependent and Independent Variables
The dependent variable is generosity, defined as the level of economic support respondents award the hypothetical welfare families depicted in the vignettes, AMT 'Total Amount Family Gets'.
There are three sets of independent variables:
1. The dimensions of deservedness described in the vignette, e.g. number of children, mother's marital status, etc., which we used in the problem above.
2. Respondent religious variables: religious affiliation (Conservative Protestant, Moderate Protestant, Liberal Protestant, Catholic, Jewish, No Affiliation), attendance at religious services, and religious identity salience (how strongly respondents identified with their faith group).
3. Respondent control variables: age, education, race, gender, and household income.
Stage 1 Summary: Definition Of The Research Problem

Method for including independent variables: standard, hierarchical, stepwise
The research question would suggest a three-block hierarchical regression, with deservedness in the first block, respondent control variables in the second block, and respondent religious variables in the third block. However, the author uses standard multiple regression, and we will conform to his analysis.
Stage 2 Summary: Develop The Analysis Plan: Sample Size Issues

Missing data analysis
Since there are a large number of independent variables in the analysis, I ran frequency distributions on the variables to identify those that had missing data. Variables without any missing data were excluded from the missing data analysis. Missing values were present for the variables: age, income, the religious groups (Catholic, Jewish, etc.), religious identity, and attendance.
The correlation matrix for the valid/missing variables had correlations of 1.0 between all of the religious groups. This is because the religious groups had the same missing cases, i.e. a missing value on the original religion variable produces a missing value on each of the dummy-coded groups. Ignoring these 1.00 correlations, the next highest correlation is .176. The correlations for missing values are very weak, so we do not have a missing data process that will be problematic. Since our sample size is large, we will eliminate all cases with missing data from our analyses.

Power to Detect Relationships: Page 165 of Text
With over 9,000 cases, we exceed the dimensions of the power table, so even R² values of less than 2% will be found to be statistically significant.

Minimum Sample Size Requirement: 15-20 Cases Per Independent Variable
For the analysis of respondent categories, we add eight new independent variables: Religious Preference, Age, Sex, Race, Education, Income, Strength of Religious Identity, and Church Attendance. When we complete the dummy coding, we will have added a total of twelve independent variables to the analysis. The ratio of observations to independent variables is 9,555 divided by 35, which equals 273 cases per independent variable.
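The valid/missing correlation check described on this slide can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure used for the class: the toy values and the helper names missing_indicator and pearson are hypothetical, and None stands in for a missing value. It shows why two dummy variables built from the same religion item have missing-value indicators that correlate at exactly 1.0.

```python
from math import sqrt

def missing_indicator(values):
    """Return 1 where a value is missing (None) and 0 where it is valid."""
    return [1 if v is None else 0 for v in values]

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy data. The two religion dummies share the same missing
# cases because both were derived from one original religion variable.
catholic = [1, 0, None, 0, 1, None, 0, 1]
jewish = [0, 0, None, 1, 0, None, 0, 0]
income = [30, None, 45, 52, None, 38, 41, 60]

r_same = pearson(missing_indicator(catholic), missing_indicator(jewish))
r_diff = pearson(missing_indicator(catholic), missing_indicator(income))
print(round(r_same, 3))  # 1.0
print(round(r_diff, 3))  # -0.333
```

In the real data set the interesting values are the correlations between different source variables, such as the .176 reported above; the 1.0 values are an artifact of dummy coding.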
Stage 2 Summary: Develop The Analysis Plan: Measurement Issues

Incorporating Nonmetric Data with Dummy Variables
All variables requiring dummy coding were recoded when the data set was constructed. In particular, religion was coded into a set of dichotomous religious groups, e.g. Catholic, Jewish, etc.

Representing Curvilinear Effects with Polynomials
We do not have any evidence of curvilinear effects at this point in the analysis.

Representing Interaction or Moderator Effects
We do not have any evidence at this point in the analysis that we should add interaction or moderator variables.
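As a rough illustration of the dummy coding mentioned on this slide, the sketch below turns a single religion value into a set of 0/1 indicators. The group labels come from the slides, but the function name dummy_code and the choice of "No Affiliation" as the omitted reference category are assumptions made for this example; the slides do not say which group the data set uses as the reference.

```python
# Faith groups named in the slides. Treating "No Affiliation" as the
# reference (omitted) category is an assumption made for this illustration.
GROUPS = [
    "Conservative Protestant",
    "Moderate Protestant",
    "Liberal Protestant",
    "Catholic",
    "Jewish",
    "No Affiliation",
]
REFERENCE = "No Affiliation"

def dummy_code(religion, groups=GROUPS, reference=REFERENCE):
    """Return one 0/1 indicator per non-reference group.

    A missing religion (None) yields a missing value on every indicator.
    """
    indicator_groups = [g for g in groups if g != reference]
    if religion is None:
        return {g: None for g in indicator_groups}
    return {g: int(religion == g) for g in indicator_groups}

print(dummy_code("Catholic"))        # only the Catholic indicator is 1
print(dummy_code("No Affiliation"))  # reference group: all indicators are 0
print(dummy_code(None))              # missing religion: all indicators missing
```

The last call shows why a missing value on the original religion variable produces a missing value on every dummy-coded group.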
Stage 3 Summary: Evaluate Underlying Assumptions

Metric Dependent Variable and Metric or Dummy-coded Independent Variables
All of the variables in the analysis are metric or dummy-coded. Note that Family Savings was dichotomously coded as either 0 or 1000. This scheme will make the slope more interpretable.

Normality of metric variables
Five new metric variables were added to the analysis:
· R_AGE 'Age Of Respondent'
· R_EDUC 'Education Of Respondent'
· R_INCOM 'Income Of Respondent'
· R_SALIEN 'Religious Identity Of Respondent'
· R_ATTEND 'Church Attendance Of Respondent'
None of these variables is normally distributed, and none of the transformations induces normality.

Linearity between metric independent variables and dependent variable
There is no evidence of a nonlinear relationship between these five added variables and the dependent variable.
Stage 3 Summary: Evaluate Underlying Assumptions

Constant variance across categories of nonmetric independent variables
We do not pass the homogeneity test for the variables: R_MALE 'Gender Of Respondent', R_WHITE 'Race Of Respondent', R_CONSPR 'Respondent Is Conservative Protestant', and R_MODEPR 'Respondent Is Moderate Protestant'. The only remedy for this problem would be a transformation of the dependent variable, but given the normal appearance of the histogram of the dependent variable compared to the histograms of the transformations of the dependent variable, I will forgo any transformations.
Stage 4: Compute the Statistics And Test Model Fit: Computations
In this stage, we compute the actual statistics to be used in the analysis. Regression requires that we specify a variable selection method. The article uses a standard multiple regression.

Compute the Regression Model
The first task in this stage is to request the initial regression model and all of the statistical output we require for the analysis.
Request the Regression Analysis
Specify the Dependent and Independent Variables
Specify the Statistics Options
Specify the Plots to Include in the Output
Specify Diagnostic Statistics to Save to the Data Set
Complete the Regression Analysis Request
Stage 4: Compute the Statistics And Test Model Fit: Model Fit
In this stage, we examine the relationships between our independent variables and the dependent variable.
First, we look at the test of R Square, which represents the relationship between the dependent variable and the set of independent variables. This analysis tests the hypothesis that there is no relationship between the dependent variable and the set of independent variables, i.e. the null hypothesis is: R² = 0. If we cannot reject this null hypothesis, then our analysis is concluded; there is no relationship between the dependent variable and the independent variables that we can interpret.
If we reject the null hypothesis and conclude that there is a relationship between the dependent variable and the set of independent variables, then we examine the table of coefficients to identify which independent variables have a statistically significant individual relationship with the dependent variable. For each independent variable in the analysis, a t-test is computed to test whether the slope of the regression line (B) between the independent variable and the dependent variable differs from zero. The null hypothesis is that the slope is zero, i.e. B = 0, implying that the independent variable has no relationship to scores on the dependent variable.
Significance Test of the Coefficient of Determination R Square
The R square value for this analysis, 0.188, is statistically significant at p<0.0001.
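The test of R² = 0 is an F test, and the statistic can be reconstructed approximately from the figures in this deck using the standard formula F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below assumes n = 8889 cases and k = 33 independent variables, the counts given on the Cook's distance slide; the function name f_statistic is hypothetical, and the F value in the actual SPSS output may differ slightly because R² is rounded here.

```python
def f_statistic(r_squared, n_cases, n_predictors):
    """Overall F statistic for the null hypothesis R squared = 0."""
    df_regression = n_predictors
    df_residual = n_cases - n_predictors - 1
    f_value = (r_squared / df_regression) / ((1 - r_squared) / df_residual)
    return f_value, df_regression, df_residual

# Assumed inputs: R squared from this slide, n and k from the
# Cook's distance slide later in the deck.
f_value, df1, df2 = f_statistic(0.188, 8889, 33)
print(round(f_value, 1), df1, df2)  # roughly 62.1 on 33 and 8855 degrees of freedom
```

An F value this large on these degrees of freedom is consistent with the very small p-value reported above.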
Significance Test of Individual Regression Coefficients
The individual variables that had a statistically significant relationship to the dependent variable are highlighted in the table below, using 0.01 as the alpha level because the sample size was so large.
For the highly significant variables in this analysis, our results concur with the article's for all variables except mother's education. We find a statistically significant relationship for mother's education, while the article does not.
Significance Test of Individual Regression Coefficients (continued)
Stage 4: Compute the Statistics And Test Model Fit: Meeting Assumptions
Using output from the regression analysis to examine the conformity of the regression analysis to the regression assumptions is often referred to as "Residual Analysis" because it focuses on the component of the variance which our regression model cannot explain.
Using the regression equation, we can estimate the value of the dependent variable for each case in our sample. This estimate will differ from the actual score for each case by an amount referred to as the residual. Residuals are a measure of the unexplained variance or error that remains in the dependent variable and cannot be explained or predicted by the regression equation.
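To make the definition of a residual concrete, the sketch below fits a least-squares line to a handful of made-up points and computes residual = actual score - predicted score for each case. It uses a single predictor purely for illustration, whereas the class problem uses many; the data and the helper name fit_line are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical toy data, not values from DeservingPoor.Sav.
x = [1, 2, 3, 4]
y = [2, 4, 5, 8]

slope, intercept = fit_line(x, y)
predicted = [intercept + slope * a for a in x]
residuals = [actual - estimate for actual, estimate in zip(y, predicted)]

print([round(e, 2) for e in residuals])
print(round(sum(residuals), 10))  # least-squares residuals sum to zero
```

The diagnostic plots and statistics on the following slides are all computed from this list of residuals.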
Linearity and Constant Variance for the Dependent Variable: Residual Plot
The residual plot shows the pattern that is associated with a discrete dependent variable. There is no evidence of nonlinearity. In the plot of residuals, we see that the spread of the residuals is constant (same height) across all of the values of the dependent variable, so we do not have a pattern of heteroscedasticity.
Normal Distribution of Residuals: Normality Plot of Residuals
If we examine the normal p-p plot produced by the regression, the residuals appear to be normally distributed.
Linearity of Independent Variables: Partial Plots
The partial plots, such as the one for Respondent's Age, do not suggest a pattern of nonlinearity.
Independence of Residuals: Durbin-Watson Statistic
The value of the Durbin-Watson statistic (1.084) for this problem points to the same issue with serial correlation that we had in the problem above. As in the prior problem, this evidence of serial correlation is an artifact of the way the data set was structured, i.e. each respondent reviewed seven vignettes, which were added to the data set in sequential order. There is a tendency for a respondent to be generous or punitive across the cases he or she reviewed.
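The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. It ranges from 0 to 4, with values near 2 indicating no serial correlation and values well below 2 indicating positive serial correlation. The sketch below uses made-up residuals to show how runs of same-sign residuals, like those produced when one respondent's vignettes sit next to each other in the file, pull the statistic below 2. The function name durbin_watson and the toy data are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in data-set order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals: a generous respondent followed by a punitive one,
# with each respondent's vignettes stored consecutively.
clustered = [2, 2, 2, -2, -2, -2]
# Hypothetical residuals that switch sign on every case.
alternating = [2, -2, 2, -2]

print(round(durbin_watson(clustered), 3))    # 0.667, well below 2
print(round(durbin_watson(alternating), 3))  # 3.0, well above 2
```

Because the statistic depends on the order of the cases, sorting the same data differently would change its value, which is why the slide treats 1.084 as an artifact of file structure.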
Identifying Dependent Variable Outliers: Casewise Plot of Standardized Residuals
We have 100 outliers on the dependent variable listed for this problem. This amounts to about 1% of the cases in the sample.
Identifying Influential Cases - Cook's Distance
Cook's distance identifies cases that are influential or have a large effect on the regression solution and may be distorting the solution for the remaining cases in the analysis. While we cannot associate a probability with Cook's distance, we can identify problematic cases that have a score larger than the criterion computed using the formula: 4/(n - k - 1), where n is the number of cases in the analysis and k is the number of independent variables.
For this problem, which has 8889 subjects with nonmissing data and 33 independent variables, the formula equates to: 4 / (8889 - 33 - 1) = 0.00045.
A total of 530 cases had a Cook's distance of 0.00045 or larger, or about 6% of the sample. Since only 1 percent of the cases were outliers on the dependent variable, it is likely that the majority of these cases are outliers on the combination of independent variables. If that is the case, it could be argued that we should not consider omitting these cases because they are a consequence of the factorial design, which randomly assigned the values or conditions to the independent variables.
If we reject that argument and run the regression without these cases, the results are even more positive. The R² value increases from 18.8% to 30.5%. Moreover, some of the individual relationships with independent variables change.
Though it is obvious that the influential cases have a negative impact on the analysis, we will retain these cases to maintain consistency with the author.
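The cutoff arithmetic on this slide is easy to reproduce. The sketch below computes 4/(n - k - 1) for the n and k given above and flags any case at or above the cutoff. The function names and the short list of Cook's distances are hypothetical; in practice the distances would be the values SPSS saves to the data set.

```python
def cooks_cutoff(n_cases, n_predictors):
    """Rule-of-thumb cutoff for Cook's distance: 4 / (n - k - 1)."""
    return 4 / (n_cases - n_predictors - 1)

def flag_influential(cooks_distances, n_cases, n_predictors):
    """Return the positions of cases whose distance meets or exceeds the cutoff."""
    cutoff = cooks_cutoff(n_cases, n_predictors)
    return [i for i, d in enumerate(cooks_distances) if d >= cutoff]

cutoff = cooks_cutoff(8889, 33)
print(round(cutoff, 5))  # 0.00045

# Hypothetical saved Cook's distances for four cases.
example_distances = [0.0001, 0.0009, 0.0004, 0.0020]
print(flag_influential(example_distances, 8889, 33))  # [1, 3]

# Share of the sample flagged on the slide: 530 of 8889 cases.
print(round(530 / 8889 * 100, 1))  # 6.0
```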
Stage 5: Interpret The Findings - Regression Coefficients

Direction of relationship and contribution to dependent variable and Importance of Predictors
With the exception of mother's education, our interpretation of the coefficients agrees substantially with the author's.

Impact of multicollinearity
Multicollinearity does not appear to be a problem in this analysis. SPSS did not alert us to any tolerance problems. The correlations among independent variables were weak or very weak, a consequence of the factorial design of the study.
Stage 6: Validate The Model

Interpreting adjusted R square
The R Square value (.188) drops to an Adjusted R Square of .185, a minor decline that indicates the model is not overfitted to the data.

Split-sample validation
We can use the same selection variable that we used in the analysis above. The results of the validation analysis are shown in the table on the next slide.
The validation analysis supports the generalizability of the model. Some of the variables in the full model may have required the larger sample size of the full model to achieve statistical significance. In particular, note that many of the religion variables may not be as stable and generalizable as the full model would suggest.
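The adjusted R Square reported on this slide can be checked with the usual formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch assumes n = 8889 cases and k = 33 independent variables, the counts given on the Cook's distance slide; the function name adjusted_r_squared is hypothetical.

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Adjust R squared for the number of predictors relative to sample size."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)

# Assumed inputs: R squared from this slide, n and k from the
# Cook's distance slide.
adjusted = adjusted_r_squared(0.188, 8889, 33)
print(round(adjusted, 3))  # 0.185
```

With so many cases per independent variable, the penalty for the number of predictors is small, which is why the two values differ only in the third decimal place.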