Lecture 20 – Tues., Nov. 18th

This lecture covers multiple regression analysis in the context of case studies and interpretation of regression coefficients. Topics include the estimation of multiple linear regression models, JMP output, and the application of regression analysis in various scenarios.

haack

Presentation Transcript


  1. Lecture 20 – Tues., Nov. 18th • Multiple Regression: • Case Studies: Chapter 9.1 • Regression Coefficients in the Multiple Linear Regression Model: Chapter 9.2 • JMP Output: Chapter 9.6.1 • Office Hours: Today and Thursday after class, tomorrow (Wednesday) 11-12 instead of 1:30-2:30 or by appointment.

  2. Midterm 2 Scores

  3. Multiple Regression • Multiple Regression: Seeks to estimate the mean of Y given multiple explanatory variables X1,…,Xp, denoted by $\mu(Y \mid X_1,\ldots,X_p)$ • Examples: • Y=1st year GPA, X1=Math SAT, X2=Verbal SAT, X3=High School GPA • Y=Sales price of house, X1=Square Footage, X2=Number of Rooms • Uses of Regression Analysis • Describe association between mean of Y and X1,…,Xp; describe association between mean of Y and X1 after taking into account X2,…,Xp. • Passive Prediction: Predict Y based on X1,…,Xp. • Control: Predict what Y will be if you change X1,…,Xp.

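  4. Multiple Linear Regression Model • There is a normally distributed subpopulation of responses for each combination of the explanatory variables, with mean $\mu(Y \mid X_1,\ldots,X_p) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$ and standard deviation $SD(Y \mid X_1,\ldots,X_p) = \sigma$. • The observations are independent of one another.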

  5. Case Study 9.1.1 • Meadowfoam is a small plant found growing in moist meadows in the Northwest. • Researchers conducted a randomized experiment to find out how to elevate meadowfoam production. • In a controlled growth chamber, they focused on the effects of two light-related factors: light intensity and timing of onset of light treatment. • Light intensity levels: 150, 300, 450, 600, 750, 900 • Timing of onset: Early, Late

  6. Case Study 9.1.1, Cont. • Variables: • Y = average number of flowers per meadowfoam plant • X1 = light intensity • X2 = 1 if late timing, 0 if early timing • Multiple Linear Regression Model: $\mu(Y \mid X_1, X_2) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

  7. Interpretation of Coefficients • $\beta_1$ = the change in the mean of y that is associated with a one unit increase in $x_1$ where $x_2$ is held fixed. • $\beta_2$ = the change in the mean of y that is associated with a one unit increase in $x_2$ where $x_1$ is held fixed. • $\beta_0$ = mean of y when $x_1 = x_2 = 0$
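
The "held fixed" reading follows from subtracting two means of the two-variable model $\mu(Y \mid x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$. A short worked step, assuming that model form:

```latex
\[
\begin{aligned}
\mu(Y \mid x_1 + 1,\, x_2) - \mu(Y \mid x_1,\, x_2)
  &= \bigl[\beta_0 + \beta_1 (x_1 + 1) + \beta_2 x_2\bigr]
   - \bigl[\beta_0 + \beta_1 x_1 + \beta_2 x_2\bigr] \\
  &= \beta_1 .
\end{aligned}
\]
```

The $\beta_2 x_2$ terms cancel because $x_2$ is the same in both means. For a 0/1 indicator variable, a "one unit increase" is the switch from the group coded 0 to the group coded 1.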

  8. Coefficients in Meadowfoam Study • For meadowfoam study: • $\beta_1$ = change in mean flowers per plant associated with a 1 unit increase in light intensity for fixed time of onset. • $\beta_2$ = change in mean flowers per plant associated with switching from early to late onset (X2 going from 0 to 1) for fixed light intensity.

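  9. Estimation of Multiple Linear Regression Model • The coefficients $\beta_0,\ldots,\beta_p$ are estimated by choosing $\hat{\beta}_0,\ldots,\hat{\beta}_p$ to make the sum of squared prediction errors as small as possible, i.e., choose $\hat{\beta}_0,\ldots,\hat{\beta}_p$ to minimize $\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \cdots - \hat{\beta}_p x_{ip})^2$ • Predicted value of y given x1,…,xp: $\hat{\mu}(y \mid x_1,\ldots,x_p) = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_p x_p$ • $\sigma$ = SD(Y|X1,…,Xp), estimated by $\hat{\sigma}$ = root mean square error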
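
The least squares idea can be sketched outside JMP. The example below is a minimal illustration in Python with NumPy, not the course's method: `fit_least_squares` is a hypothetical helper, the coefficient values in `true_beta` are made up, and the data are simulated rather than taken from the meadowfoam study. Only the design (six intensity levels, a 0/1 late-onset indicator) mirrors the case study.

```python
import numpy as np


def fit_least_squares(X, y):
    """Return (coefficients, rmse) for the model y = b0 + b1*x1 + ... + bp*xp."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # intercept column first
    # lstsq picks the coefficients that minimize the sum of squared residuals
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    # root mean square error: divide by n - (p + 1) degrees of freedom
    rmse = np.sqrt(residuals @ residuals / (n - p - 1))
    return coef, rmse


# Simulated data only (NOT the meadowfoam measurements)
rng = np.random.default_rng(0)
intensity = np.tile([150.0, 300.0, 450.0, 600.0, 750.0, 900.0], 4)
late = np.repeat([0.0, 1.0], 12)             # 0 = early onset, 1 = late onset
true_beta = np.array([80.0, -0.04, -12.0])   # hypothetical values for illustration
y = true_beta[0] + true_beta[1] * intensity + true_beta[2] * late
y = y + rng.normal(0.0, 6.0, size=y.size)    # normal errors with SD 6

coef, rmse = fit_least_squares(np.column_stack([intensity, late]), y)
print("estimated coefficients:", coef)
print("root mean square error:", rmse)
```

The estimates should land near the hypothetical `true_beta` values, and `rmse` near the simulated error SD, which is the sense in which the root mean square error estimates $\sigma$.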

  10. Multiple Linear Regression in JMP • Analyze, Fit Model • Put response variable in Y • Click on explanatory variables and then click Add under Construct Model Effects • Click Run Model.

  11. JMP Output from Meadowfoam Study

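  12. Reading JMP Output • Estimated multiple linear regression model: $\hat{\mu}(Y \mid X_1, X_2) = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2$, with the estimates read from the JMP parameter estimates. • $\hat{\sigma} = 6.44$. Approximately 95% of flowers per plant will lie within 2*6.44 = 12.88 flowers per plant of the estimated mean $\hat{\mu}(Y \mid X_1, X_2)$. • p-values for coefficients indicate that there is strong evidence that higher light intensity is associated with fewer flowers per plant on average for fixed time of onset and that early time of onset is associated with more flowers per plant on average for fixed light intensity.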
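
The "within 2 root mean square errors" rule can be checked with a few lines of Python. This is a sketch under the model's normality assumption: the value 6.44 comes from the slide, while `predicted_mean` is a hypothetical number chosen only to show the arithmetic.

```python
import numpy as np

rmse = 6.44                # root mean square error reported on the slide
half_width = 2 * rmse      # the "within 2 * RMSE" rule

predicted_mean = 50.0      # hypothetical predicted mean, for illustration only
lower = predicted_mean - half_width
upper = predicted_mean + half_width
print(f"approximate 95% range: {lower:.2f} to {upper:.2f}")

# Check the 95% figure by simulating normal deviations with SD equal to rmse
rng = np.random.default_rng(1)
errors = rng.normal(0.0, rmse, size=100_000)
coverage = np.mean(np.abs(errors) <= half_width)
print(f"simulated coverage: {coverage:.3f}")
```

The simulated coverage comes out close to 95% because about 95% of a normal distribution lies within two standard deviations of its mean.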

  13. Case Study 9.1.2 • What characteristics are associated with bigger brain size after accounting for body size, i.e., what characteristics are associated with bigger brain size holding body size fixed? • Y=brain weight, X1=body weight, X2=gestation period, X3=litter size • Multiple Linear Regression Model: $\mu(Y \mid X_1, X_2, X_3) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$

  14. Interpretation in Randomized Experiments vs. Obs. Studies • Randomized Experiments: Interpretation of an “effect” of an explanatory variable is straightforward and causation is implied. Example: “A 1-unit increase in light intensity causes the mean number of flowers to increase by $\beta_1$.” • Observational Studies: Cannot make causal conclusions from statistical association. “For any subpopulation of mammal species with the same body weight and litter size, a 1-day increase in the species’ gestation length is associated with a $\beta_2$-gram increase in mean brain weight.” The interpretation is only useful if subpopulations of mammals with fixed values of body weight and litter size, but varying gestation lengths, exist.

  15. Interpreting Coefficients • Interpretation depends on what other X’s are included. • In the model $\mu(Y \mid X_2) = \beta_0 + \beta_2 X_2$, $\beta_2$ measures the rate of change in mean brain weight with changes in gestation length in the population of all mammal species, where body size is variable. • In the model $\mu(Y \mid X_1, X_2) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$, $\beta_2$ measures the rate of change in mean brain weight with changes in gestation length within subpopulations of fixed body size.
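
A small simulation can make this dependence on the other X's concrete. The sketch below uses made-up, standardized variables named after the case study (they are not the mammal data), a hypothetical helper `ols`, and invented coefficients: brain weight depends on both body size and gestation, and gestation is itself correlated with body size.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Simulated, standardized variables (NOT the mammal data)
body = rng.normal(0.0, 1.0, size=n)
gestation = 0.8 * body + rng.normal(0.0, 0.6, size=n)   # correlated with body
brain = 1.0 * body + 0.5 * gestation + rng.normal(0.0, 0.5, size=n)


def ols(columns, y):
    """Least squares coefficients, intercept first, for the given columns."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Gestation coefficient when body size is left out of the model
slope_alone = ols([gestation], brain)[1]
# Gestation coefficient when body size is held fixed
slope_adjusted = ols([body, gestation], brain)[2]

print("gestation slope, body size not in model:", slope_alone)
print("gestation slope, body size in model:    ", slope_adjusted)
```

With these invented settings the slope from the gestation-only model is noticeably larger than the slope from the model that includes body size, because the first also absorbs the association between gestation and body size.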
