Marketing Research Aaker, Kumar, Day and Leone Tenth Edition Instructor’s Presentation Slides
Chapter Fourteen Correlation Analysis and Regression Analysis
Definitions
• Correlation analysis: measures the strength of the relationship between two variables
• Correlation coefficient: provides a measure of the degree to which there is an association between two variables (X and Y)
Regression Analysis
• Statistical technique that is used to relate two or more variables
• Objective is to build a regression model or a prediction equation relating the dependent variable to one or more independent variables
• The model can then be used to describe, predict, and control the variable of interest on the basis of the independent variables
• Multiple regression analysis: regression analysis that involves more than one independent variable
Correlation Analysis
Pearson correlation coefficient
• Measures the degree to which there is a linear association between two interval-scaled variables
• A positive correlation reflects a tendency for a high value in one variable to be associated with a high value in the second
• A negative correlation reflects an association between a high value in one variable and a low value in the second variable
Correlation Analysis (Contd.)
• Population correlation (ρ): used if the database includes an entire population
• Sample correlation (r): used if the measure is based on a sample
• r lies between -1 and +1, i.e. -1 ≤ r ≤ +1
• r = 0 ---> absence of linear association
Correlation Coefficient
• Simple Correlation Coefficient
• Pearson Product-moment Correlation Coefficient
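The formula on this slide did not survive extraction. As a minimal sketch, assuming the slide showed the usual sample Pearson coefficient r = Σ(xi − x̄)(yi − ȳ) / √[Σ(xi − x̄)² Σ(yi − ȳ)²], it could be computed as follows. The function name `pearson_r` and the data are hypothetical, chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Sample Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: y is an exact increasing linear function of x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = pearson_r(x, y)
print(r)
```

Reversing the order of `y` gives a perfectly decreasing relationship and a coefficient at the opposite end of the range.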
Testing the Significance of the Correlation Coefficient
• Null hypothesis: H0: ρ = 0
• Alternative hypothesis: Ha: ρ ≠ 0
• Test statistic: t = r √(n − 2) / √(1 − r²), with n − 2 degrees of freedom
• Example: n = 6 and r = .70, so t = .70 √4 / √(1 − .49) = 1.96
• At α = .05 and n − 2 = 4 degrees of freedom, critical value of t = 2.78
• Since 1.96 < 2.78, we fail to reject the null hypothesis.
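The slide's example can be checked with a few lines of code. This is a sketch assuming the standard t statistic for a correlation, t = r √(n − 2) / √(1 − r²); the critical value 2.78 is taken from the slide rather than computed, and the function name is hypothetical.

```python
import math

def t_for_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Values from the slide's example
t = t_for_correlation(0.70, 6)
critical_t = 2.78  # two-tailed, alpha = .05, 4 degrees of freedom (from the slide)
reject_null = abs(t) > critical_t

print(round(t, 2), reject_null)  # 1.96 False
```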
Partial Correlation Coefficient • Measure of association between two variables after controlling for the effects of one or more additional variables
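The slide gives no formula. For the simplest case of one control variable Z, a common textbook expression (stated here as an assumption, not as the slide's content) is r_xy.z = (r_xy − r_xz r_yz) / √[(1 − r_xz²)(1 − r_yz²)]. A minimal sketch with hypothetical correlations:

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations, for illustration only
r_xy_given_z = partial_corr(0.6, 0.5, 0.4)
print(round(r_xy_given_z, 3))
```

When Z is uncorrelated with both X and Y, the partial correlation reduces to the simple correlation r_xy.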
Regression Analysis
Simple Linear Regression Model: Yi = β0 + β1xi + εi
Where
• Y = Dependent variable
• X = Independent variable
• β0 = Model parameter that represents mean value of dependent variable (Y) when the independent variable (X) is zero
• β1 = Model parameter that represents the slope that measures change in mean value of dependent variable associated with a one-unit increase in the independent variable
• εi = Error term that describes the effects on Yi of all factors other than value of Xi
Assumptions of the Simple Linear Regression Model
• Error term is normally distributed (normality assumption)
• Mean of error term is zero [E(εi) = 0]
• Variance of error term is a constant and is independent of the values of X (constant variance assumption)
• Error terms are independent of each other (independence assumption)
• Values of the independent variable X are fixed (non-stochastic X)
Estimating the Model Parameters
• Calculate point estimates b0 and b1 of the unknown parameters β0 and β1
• Obtain a random sample and use the information from the sample to estimate β0 and β1
• Obtain a line of best "fit" for the sample data points: the least squares line
• Predicted value of Yi: ŷi = b0 + b1xi
• b0 and b1 minimize the residual or error sum of squares (SSE)
  SSE = Σ ei² = Σ (yi − ŷi)² = Σ [yi − (b0 + b1xi)]²
Residual Value
• Difference between the actual and predicted values
• Estimate of the error in the population
  ei = yi − ŷi = yi − (b0 + b1xi)
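The least squares estimates have a closed form in simple regression: b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and b0 = ȳ − b1x̄. The sketch below applies these standard formulas to hypothetical data and then computes the residuals and SSE as defined above; the function name `fit_line` is illustrative.

```python
def fit_line(x, y):
    """Least squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical sample, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
# Residual e_i = actual value minus predicted value
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
print(b0, b1, sse)
```

With an intercept in the model, the least squares residuals sum to zero, which is a quick sanity check on any fitted line.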
Standard Error
• Mean Square Error
• Standard Error of b1
• Standard Error of b0
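The formulas for these three quantities were lost in extraction. The standard textbook expressions for simple linear regression, given here as an assumption about what the slide showed, are:

```latex
\begin{aligned}
s^2 &= \mathrm{MSE} = \frac{\mathrm{SSE}}{n-2}, \qquad s = \sqrt{\mathrm{MSE}} \\
s_{b_1} &= \frac{s}{\sqrt{\sum_i (x_i - \bar{x})^2}} \\
s_{b_0} &= s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}}
\end{aligned}
```

The divisor n − 2 reflects the two parameters (b0 and b1) estimated from the sample.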
Testing the Significance of Independent Variables
• Null Hypothesis (H0: β1 = 0): there is no linear relationship between the independent and dependent variables
• Alternative Hypothesis (Ha: β1 ≠ 0): there is a linear relationship between the independent and dependent variables
Testing the Significance of Independent Variables (Contd.)
• Test Statistic: t = (b1 − β1) / sb1
• Degrees of Freedom: ν = n − 2
• Testing for a Type II Error: H0: β1 = 0, Ha: β1 ≠ 0
• Decision Rule: Reject H0: β1 = 0 if α > p value
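A worked sketch of the slope test on hypothetical data follows. It assumes the standard formula sb1 = s / √Σ(xi − x̄)² with s² = SSE / (n − 2), and it uses the critical-value form of the decision rule (reject if |t| exceeds the tabled value), which is equivalent to comparing α with the p value. The critical value 3.182 is the two-tailed t table entry for α = .05 and 3 degrees of freedom.

```python
import math

# Hypothetical sample, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least squares estimates
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x

# Standard error of b1 from the residual sum of squares
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
s_b1 = s / math.sqrt(s_xx)

# Test statistic under H0: beta1 = 0
t = (b1 - 0) / s_b1
df = n - 2
critical_t = 3.182  # two-tailed t table value, alpha = .05, df = 3
reject_h0 = abs(t) > critical_t
print(round(t, 3), df, reject_h0)
```

With only five observations the slope is not significant at the .05 level, even though the fitted line slopes upward.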
Sum of Squares
• SST: Sum of squared prediction error that would be obtained if we do not use x to predict y
• SSE: Sum of squared prediction error that is obtained when we use x to predict y
• SSM: Reduction in sum of squared prediction error that has been accomplished using x in predicting y
Predicting the Dependent Variable
• Predicted value of the dependent variable: ŷi = b0 + b1xi
• Error of prediction is yi − ŷi
• Total variation (SST) = Explained variation (SSM) + Unexplained variation (SSE)
Coefficient of Determination (r²)
• Measure of regression model's ability to predict
• Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σ(Yi − Ŷi)²
• r² = (SST − SSE) / SST = SSM / SST = Explained Variation / Total Variation
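The decomposition of total variation can be verified numerically. The sketch below fits a least squares line to hypothetical data with the standard closed-form estimates, then computes SST, SSM, SSE and r² as defined on this slide.

```python
# Hypothetical sample, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares line
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - mean_y) ** 2 for yi in y)               # total variation
ssm = sum((yh - mean_y) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation

r_squared = ssm / sst
print(sst, ssm, sse, r_squared)
```

For these data SST splits into an explained part and an unexplained part, and the two ways of writing r², (SST − SSE) / SST and SSM / SST, agree.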
Multiple Regression
• A linear combination of predictor factors is used to predict the outcome or response factors
• The general form of the multiple regression model is:
  Y = α + β1X1 + β2X2 + . . . + βkXk + ε
  where β1, β2, . . . , βk are regression coefficients associated with the independent variables X1, X2, . . . , Xk and ε is the error or residual.
Multiple Regression (Contd.)
• The prediction equation in multiple regression analysis is
  Ŷ = α + b1X1 + b2X2 + . . . + bkXk
  where Ŷ is the predicted Y score and b1, . . . , bk are the partial regression coefficients.
Partial Regression Coefficients
Y = α + b1X1 + b2X2 + error
• b1 is the expected change in Y when X1 is changed by one unit, keeping X2 constant or controlling for its effects.
• b2 is the expected change in Y for a unit change in X2, when X1 is held constant.
• If X1 and X2 are each changed by one unit, the expected change in Y will be (b1 + b2)
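Because the model is additive, the effects of unit changes in X1 and X2 add up. A small sketch with hypothetical coefficients (not taken from the slides) makes this concrete:

```python
# Hypothetical coefficients, for illustration only
alpha, b1, b2 = 10.0, 2.0, 3.0

def predict(x1, x2):
    """Predicted Y from the two-variable prediction equation."""
    return alpha + b1 * x1 + b2 * x2

base = predict(4, 5)
change_x1_only = predict(5, 5) - base   # X2 held constant
change_x2_only = predict(4, 6) - base   # X1 held constant
change_both = predict(5, 6) - base      # both increased by one unit

print(change_x1_only, change_x2_only, change_both)  # 2.0 3.0 5.0
```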
Evaluating the Importance of Independent Variables
• Consider t-value for βi's
• Use beta coefficients when independent variables are in different units of measurement
  Standardized βi = bi × (Standard deviation of xi / Standard deviation of Y)
• Check for multicollinearity
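The standardization formula can be sketched as follows. The data are hypothetical, and the unstandardized coefficient b = 0.6 is the least squares slope for these data; with a single predictor the standardized coefficient equals the Pearson correlation, which gives a convenient check.

```python
import statistics

def standardized_beta(b, x, y):
    """Beta coefficient: b rescaled by the ratio of sample standard deviations."""
    return b * statistics.stdev(x) / statistics.stdev(y)

# Hypothetical sample; 0.6 is the least squares slope of y on x for these data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
beta = standardized_beta(0.6, x, y)
print(round(beta, 4))
```

Because the result is unit-free, betas for predictors measured in different units can be compared directly.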
Stepwise Regression
• Predictor variables enter or are removed from the regression equation one at a time
• Forward Addition
  • Start with no predictor variables in regression equation, i.e. y = β0 + ε
  • Add variables if they meet certain criteria in terms of F-ratio
Stepwise Regression (Contd.)
• Backward Elimination
  • Start with full regression equation, i.e. y = β0 + β1x1 + β2x2 + ... + βrxr + ε
  • Remove predictors based on F-ratio
• Stepwise Method
  • Forward addition method is combined with removal of predictors that no longer meet specified criteria at each step
Residual Plots
• Random distribution of residuals
• Nonlinear pattern of residuals
• Heteroskedasticity
• Autocorrelation
Predictive Validity
• Examines whether any model estimated with one set of data continues to hold good on comparable data not used in the estimation.
• Estimation Methods
  • The data are split into the estimation sample (with more than half of the total sample) and the validation sample, and the coefficients from the two samples are compared.
  • The coefficients from the estimated model are applied to the data in the validation sample to predict the values of the dependent variable Yi in the validation sample, and then the model fit is assessed.
  • The sample is split into halves (estimation sample and validation sample) for conducting cross-validation. The roles of the estimation and validation halves are then reversed, and the cross-validation is repeated.
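The third method (split into halves, then reverse the roles) can be sketched for a simple linear regression. This is an illustrative outline, not the textbook's procedure: the function names and data are hypothetical, fit is assessed by the squared prediction error on the validation half, and the observations are assumed to be in random order already.

```python
def fit_line(x, y):
    """Least squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
          / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - b1 * mean_x, b1

def validation_sse(b0, b1, x, y):
    """Sum of squared prediction errors on data not used in estimation."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

def double_cross_validate(x, y):
    """Estimate on one half, validate on the other, then swap the roles."""
    half = len(x) // 2
    first = (x[:half], y[:half])
    second = (x[half:], y[half:])
    results = []
    for estimation, validation in ((first, second), (second, first)):
        b0, b1 = fit_line(*estimation)
        results.append({"b0": b0, "b1": b1,
                        "validation_sse": validation_sse(b0, b1, *validation)})
    return results

# Hypothetical data, assumed to be in random order
x = [3, 7, 1, 5, 8, 2, 6, 4]
y = [7.2, 14.8, 3.1, 11.3, 17.1, 4.9, 12.8, 9.0]
for run in double_cross_validate(x, y):
    print(run)
```

Similar coefficients and small validation errors in both runs would suggest that the model holds on comparable data.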
Regression with Dummy Variables
Yi = a + b1D1 + b2D2 + b3D3 + error
• For rational buyer, Ŷi = a
• For brand-loyal consumers, Ŷi = a + b1
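The two cases on this slide follow from setting the dummy variables to 0 or 1. In the sketch below the coefficient values are hypothetical; the slide does not say which groups D2 and D3 represent, so only the two listed cases are evaluated.

```python
def predict_with_dummies(a, coefficients, dummies):
    """Predicted Y: intercept plus the coefficients of the dummies set to 1."""
    return a + sum(b * d for b, d in zip(coefficients, dummies))

# Hypothetical estimates for a, b1, b2, b3
a = 5.0
b = [1.5, -0.5, 2.0]

rational = predict_with_dummies(a, b, [0, 0, 0])      # all dummies 0: a
brand_loyal = predict_with_dummies(a, b, [1, 0, 0])   # D1 = 1: a + b1
print(rational, brand_loyal)  # 5.0 6.5
```

The group with all dummies at zero serves as the baseline, and each coefficient measures the difference between its group and that baseline.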