Chapter Fourteen Examining Associations: Correlation and Regression
Did You Know that Degree, Color and Race Make a Difference in Home Refinancing? • A study found that purchasers without a college degree paid $1,472 more in broker fees than those with a college degree • Not only did a degree matter, but race was also a factor. African Americans on average paid $500 more than whites, and Hispanics $275 more than whites. • Regression analysis was used to determine whether various borrower characteristics had a bearing on the amount of broker fees and closing costs paid.
Did You Know that Disaster Area Declarations are Related to Electoral Votes? • A study revealed states that had been declared disaster areas are crucial to presidential elections • Regression analysis revealed that states that are likely to be declared disaster areas were the states that were highest in electoral votes
Did You Know that the Presence of an NFL Team Boosts Rental Costs? • Regression analysis has revealed that in cities with an NFL team, rental costs for apartments in the central city area were 8 percent higher than in cities without an NFL team • Property tax receipts were also found to be higher in cities with NFL teams
Overview of Techniques for Examining Associations • Spearman Correlation Coefficient Technique • The technique is appropriate when • The degree of association between two sets of ranks (pertaining to two variables) is to be examined • Illustrative research question(s) this technique can answer • Is there a significant relationship between motivation levels of salespeople and the quality of their performance? • Assume that the data on motivation and quality of performance are in the form of ranks, say, 1 through 20, for 20 salespeople who were evaluated subjectively by their supervisor on each variable
Overview of Techniques for Examining Associations (Cont’d) • Pearson Correlation Coefficient Technique • This technique is appropriate when • The degree of association between two metric-scaled (interval or ratio) variables is to be examined • Illustrative research question(s) this technique can answer • Is there a significant relationship between customers' age (measured in actual years) and their perceptions of our company's image (measured on a scale of 1 to 7)?
Overview of Techniques for Examining Associations (Cont’d) • Simple Regression Analysis Technique • This technique is appropriate when • A mathematical function or equation linking two metric-scaled (interval or ratio) variables is to be constructed, under the assumption that the values of one of the two variables depend on the values of the other
Overview of Techniques for Examining Associations – Simple Regression Analysis(Cont’d) • Illustrative Research Question(s) this Technique Can Answer • Are sales (measured in dollars) significantly affected by advertising expenditures (measured in dollars)? • What proportion of the variation in sales is accounted for by variation in advertising expenditures? How sensitive are sales to changes in advertising expenditures?
Overview of Techniques for Examining Associations(Cont’d) • Multiple Regression Analysis Technique • This technique is appropriate • Under the same conditions as simple regression analysis except that more than two variables are involved wherein one variable is assumed to be dependent on the others
Overview of Techniques for Examining Associations(Cont’d) • Illustrative Research Question(s) this Technique Can Answer • Are sales significantly affected by advertising expenditures and price (where all three variables are measured in dollars)? • What proportion of the variation in sales is accounted for by advertising and price? How sensitive are sales to changes in advertising and price?
Spearman Correlation Coefficient • A Spearman correlation coefficient is a measure of association between two sets of ranks
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
• $d_i$ = the difference between the ith sample unit's ranks on the two variables • $n$ = the total sample size
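The formula above can be sketched in a few lines of Python. This is a minimal illustration that assumes untied ranks. The function name `spearman_rs` and the five-unit rank lists are hypothetical, and they are not the data from Table 14.2.

```python
def spearman_rs(ranks_x, ranks_y):
    """Spearman coefficient from two equal-length lists of untied ranks."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_x)
    if n < 2:
        raise ValueError("at least two sample units are required")
    # d_i is the difference between unit i's ranks on the two variables.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical ranks for five units (illustrative only).
prestige = [1, 2, 3, 4, 5]
performance = [2, 1, 3, 5, 4]
print(round(spearman_rs(prestige, performance), 3))  # 0.8
```

Identical rankings give a coefficient of 1 and exactly reversed rankings give -1, which makes a quick sanity check.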
Example: Industrial Marketing Firm • An industrial marketing firm has been hiring all its salespeople from among the graduates of 10 business schools in the vicinity of its headquarters • The firm developed a subjective ranking of the perceived prestige levels of the 10 schools and the performance levels of the groups of graduates recruited from these schools • Question • What is the degree of association between the prestige levels of the schools and the sales performance levels of their graduates hired by this company?
Table 14.2 Association Between School Prestige and Performance of Graduates
Results • The first step is to calculate the Spearman correlation coefficient. • The result is rs = .661 • The next step is to test its significance using the t-distribution
Spearman Correlation Coefficient Hypotheses • H0: ρs = 0 • Ha: ρs ≠ 0
t-Distribution • For α = .05, • t for 8 degrees of freedom (d.f. = n - 2 = 10 - 2 = 8) • tc = +2.31 and -2.31 • Decision Rule: • “Reject H0 if t ≥ 2.31 or if t ≤ -2.31.” • Since t > 2.31, we reject H0 and conclude that there is a true association between the prestige of business schools and the job performance of their graduates. In other words, the sample correlation of .661 is unlikely to have occurred because of chance.
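The slide does not show how t is obtained. A common choice, assumed here, is the standard test statistic for a correlation coefficient, t = r·sqrt((n - 2) / (1 - r²)). The sketch below applies it to the slide's values (rs = .661, n = 10) and compares the result with the quoted critical value of 2.31. The function name `correlation_t` is hypothetical.

```python
import math


def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("n must exceed 2")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r_s, n = 0.661, 10          # values reported in the slides
t_c = 2.31                  # critical value quoted for 8 d.f., alpha = .05
t = correlation_t(r_s, n)

print(round(t, 2))          # 2.49
print("reject H0" if abs(t) >= t_c else "do not reject H0")
```

Under this assumption the computed t of about 2.49 exceeds 2.31, which agrees with the decision to reject H0.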
Pearson Correlation Coefficient • The Pearson correlation coefficient is the degree of association between variables that are interval- or ratio-scaled. • The Pearson correlation coefficient ($r_{xy}$) between two variables X and Y is given by
$$r_{xy} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n-1)\, s_x s_y}$$
• $n$ = sample size (total number of data points) • $\bar{X}$ and $\bar{Y}$ = means • $X_i$ and $Y_i$ = values for any sample unit i • $s_x$ and $s_y$ = standard deviations
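As a rough sketch, the Pearson formula translates directly into code. It assumes that sx and sy are sample standard deviations (divisor n - 1), which is what the (n - 1) term in the denominator implies. The function name `pearson_r` and the small advertising and sales lists are hypothetical, and they are not the data behind Exhibit 14.1.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation using sample standard deviations (divisor n - 1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)


# Hypothetical data (illustrative only).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
print(round(pearson_r(advertising, sales), 3))  # 0.775
```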
Scatter Diagram • Plot in a two-dimensional graph • Indicates how closely and in what fashion the variables are associated
Exhibit 14.1 Scatter Diagram of Sales and Advertising Data • What is the relationship between dollar sales and advertising expenditure ?
Exhibit 14.2 Scatter Diagram of Sales and Number of Competing Brands • What is the relationship between dollar sales and number of competing detergents ?
Pearson Correlation • Correlation between sales and advertising is .927 • Correlation between sales and number of competing brands is .910
Two-Tailed Hypothesis Test For Correlations • H0: ρ = 0; • Ha: ρ ≠ 0, • For α = .05, • 19 degrees of freedom (d.f. = n - 1 = 19) • rc = +.433 and rc = -.433 • Decision rule is: • “Reject H0 if r ≥ .433 or if r ≤ -.433.” • Reject H0 in both cases
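One way to see where a critical value such as .433 comes from is to invert the t statistic for a correlation, which gives rc = tc / sqrt(tc² + d.f.). The sketch below takes the 19 degrees of freedom stated on the slide and assumes the standard two-tailed t-table value of 2.093 for α = .05, which the slide does not show. The function name `critical_r` is hypothetical.

```python
import math


def critical_r(t_critical, df):
    """Critical correlation implied by a critical t value and its d.f."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return t_critical / math.sqrt(t_critical ** 2 + df)


t_c = 2.093                 # assumed t-table value: 19 d.f., alpha = .05, two-tailed
r_c = critical_r(t_c, 19)
print(round(r_c, 3))        # 0.433

# Correlations reported on the preceding slide.
for label, r in [("sales vs. advertising", 0.927),
                 ("sales vs. competing brands", 0.910)]:
    decision = "reject H0" if abs(r) >= r_c else "do not reject H0"
    print(label, decision)
```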
Exhibit 14.3 Scatter Diagram Showing a Nonlinear Association Between Variables
National Insurance Company – Computing Pearson Correlation Among Service Quality Constructs • National Insurance Company was interested in the correlations between respondents’ overall service-quality perceptions (on the 10-point scale) and their average ratings along each of the five dimensions of service quality
National Insurance Company – Computing Pearson Correlation Among Service Quality Constructs (Cont’d) • Click ANALYZE • Select CORRELATE • Select BIVARIATE • Move “oq, reliable, empathy, tangible, response, and assure” to VARIABLES box • Click OK
National Insurance Company– Computing Pearson Correlation Among Service Quality Constructs Using SPSS
Interpreting Pearson Correlation Coefficients • Each of the five service-quality measures (reliability, empathy, tangibles, responsiveness, and assurance) is significantly related to the overall quality (OQ) at the .001 level of significance • Responsiveness has the strongest correlation (.8625) • Tangibles have the weakest correlation (.5038) • All the correlations are strong enough to be meaningful
Simple Regression Analysis • Generates a mathematical relationship (called the regression equation) between one variable designated as the dependent variable (Y) and another designated as the independent variable (X)
Independent Variable vs. Dependent Variable • Independent variable • Explanatory or predictor variable • Often presumed to be a cause of the other • Dependent variable • Criterion variable • Influenced by the independent variable
Scenario: Curtis Construction Industry Lobbyist • Curtis, a construction industry lobbyist, is in an area of the country that has a high unemployment rate and a number of economically depressed construction projects • His current charge is to convince local government officials to vote in favor of several tax concessions for the construction industry • He is wondering whether he can generate any concrete evidence to show that increased construction activity (presumably spurred by the proposed tax concessions) would greatly benefit the state
Scenario: Curtis Construction Industry Lobbyist (Cont’d) • Possible Dependent Variable • Number of people unemployed or the unemployment rate • Data on this variable may be gathered from a sample of areas from around the country • Possible Independent Variable • Number of construction permits issued or number of ongoing construction projects • Data on this variable should be gathered from the same sample
Scenario: Carol, Chief Librarian • Carol, chief librarian in a major university, is eager to increase the number of students borrowing books from the library as well as the number of books borrowed per student • She needs some persuasive evidence to show how increased borrowing of books might benefit students
Scenario: Carol, Chief Librarian (Cont’d) • Possible Dependent Variable • Cumulative grade point ratio • Data on this variable should be gathered for a sample of students who have borrowed books in the past • Possible Independent Variable • Number of books borrowed • Assuming that the library has records of the books borrowed by students, data on this variable can be obtained from those records for the same sample of students
Scenario: Jack, Trade Show Officer • Jack, an officer in an association in charge of putting together and promoting industrial trade shows, is wondering about the impact of the number of exhibitors in a trade show on trade show attendance
Scenario: Jack, Trade Show Officer (Cont’d) • Possible Dependent Variable • Number of people visiting a trade show • Data on this variable can be obtained for a representative sample of trade shows from the association’s past records • Possible Independent Variable • Number of exhibitors in a trade show • Necessary data can be obtained from the past records
Deriving a Regression Equation • Y = a + bX, where a and b are constants • Y -> Dependent Variable • X -> Independent Variable
Exhibit 14.4 Several Subjectively Constructed Regression Lines
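The slides leave the estimation of a and b to SPSS. A minimal sketch, assuming the usual ordinary least squares criterion, is given below, together with R² as the proportion of variation in Y accounted for by X. The function names and the small data set are hypothetical, and they are not the Bright sales and advertising data.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must vary to estimate a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(xs, ys, a, b):
    """Share of the variation in Y accounted for by the fitted line."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data (illustrative only).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
a, b = fit_line(advertising, sales)
print(round(a, 3), round(b, 3))                         # 2.2 0.6
print(round(r_squared(advertising, sales, a, b), 3))    # 0.6
```

Here the slope b = 0.6 is the estimated change in Y per unit change in X, and the prediction for any X is a + bX.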
Regression Using SPSS – Sales and Advertising Data • Click ANALYZE • Select REGRESSION • Click LINEAR • Move “Dollar Sales for Bright” to DEPENDENT box • Move “advertising expenditures for Bright” to INDEPENDENT(S) box • Click OK
Exhibit 14.5 SPSS Computer Output for Simple Regression Analysis of Sales and Advertising Data
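Standard Error
$$s_{y/x} = \sqrt{\frac{SSE}{n - k - 1}}$$
• SSE is the error sum of squares, n the sample size, and k the number of independent variables • The value of the standard error ($s_{y/x}$) is shown in the computer output as 2.277, which is the square root of the error mean square value of 5.186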
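The arithmetic on this slide can be checked quickly. The sketch below assumes the standard error is the square root of SSE / (n - k - 1), the error mean square. The function name `standard_error` and the SSE, n, and k values in the second call are hypothetical. Only the error mean square of 5.186 comes from the slide.

```python
import math


def standard_error(sse, n, k):
    """Standard error of estimate: sqrt(SSE / (n - k - 1))."""
    df = n - k - 1
    if df <= 0:
        raise ValueError("n must exceed k + 1")
    return math.sqrt(sse / df)


# Error mean square reported in the SPSS output on the slide.
print(round(math.sqrt(5.186), 3))                     # 2.277

# Hypothetical case: SSE = 2.4 from 5 data points and 1 independent variable.
print(round(standard_error(2.4, n=5, k=1), 3))        # 0.894
```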
Practical Applications of Regression Equations • The regression coefficient, or slope, can indicate how sensitive the dependent variable is to changes in the independent variable • The regression equation is a forecasting tool for predicting the value of the dependent variable for a given value of the independent variable
Precautions In Using Regression Analysis • Only capable of capturing linear associations between dependent and independent variables • A significant R²-value does not necessarily imply a cause-and-effect association between the independent and dependent variables • A regression equation may not yield a trustworthy prediction of the dependent variable when the value of the independent variable at which the prediction is desired is outside the range of values used in constructing the equation
Precautions In Using Regression Analysis(Cont’d) • A regression equation based on relatively few data points cannot be trusted • The ranges of data on the dependent and independent variables can affect the meaningfulness of a regression equation
Multiple Regression Analysis
$$Y_i = a + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki}$$
• $Y_i$ is the predicted value of the dependent variable for some unit i • $X_{1i}, X_{2i}, \dots, X_{ki}$ are values on the independent variables for unit i • $b_1, b_2, \dots, b_k$ are the regression coefficients • $a$ is the Y-intercept representing the prediction for Y when all independent variables are set to zero
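To make the equation concrete, the sketch below estimates a and b1, ..., bk by ordinary least squares, solving the normal equations with plain Python. This is an assumption about the estimation method, since the slides delegate the fitting to SPSS. All function names and the tiny data set, which is generated exactly from Y = 1 + 2·X1 + 3·X2, are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(rows, ys):
    """Return [a, b1, ..., bk] that minimize the sum of squared errors."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Prediction a + b1*X1 + ... + bk*Xk for one unit."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


# Hypothetical data built from Y = 1 + 2*X1 + 3*X2 (illustrative only).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [9, 8, 19, 18, 26]
coeffs = fit_multiple_regression(rows, ys)
print([round(c, 6) for c in coeffs])
print(round(predict(coeffs, (0, 0)), 6))  # intercept: all predictors at zero
```

Because the data contain no noise, the fit recovers the generating coefficients, and the prediction with all independent variables set to zero equals the intercept a.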
National Insurance Company – Multiple Regression Using SPSS • Jill and Tom were interested in conducting a multiple regression analysis in which overall service-quality perception is the dependent variable and the average ratings along the five dimensions are the independent variables
National Insurance Company – Multiple Regression Using SPSS(Cont’d) • Click ANALYZE • Select REGRESSION • Click LINEAR • Move “OQ” to DEPENDENT Box • Move “reliable, empathy, tangible, response, and assure” to INDEPENDENT(S) box • Click OK
National Insurance Company – Multiple Regression Using SPSS (Cont’d) • The R-square of .810 indicates a strong relationship between these variables and overall quality.