
Regression and Correlation


Presentation Transcript


  1. Regression and Correlation, Module 8

  2. Relationship between two variables • Changing wind speed, humidity, or other meteorological (met) parameters, and • Pollutant concentrations

  3. Causality? • Simultaneous change does not imply causality • Seat belt use on airplanes • ADHD rate among children and the number of child therapists in the U.S. • Snoring and sleeping with someone else in your bedroom • Other factors may be the root cause of both, or the relationship may be an “artifact” of your data

  4. Linear regression: • y = mx + b • The difference between each actual y value and the y value predicted by the straight line is the “residual” • The residuals are used to calculate R squared • R squared measures how closely your data follow the fitted line: the fraction of the variation in y that the line accounts for
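The residual and R squared calculations on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method Excel uses internally: the function names `fit_line` and `r_squared` and the wind speed and concentration numbers are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def r_squared(xs, ys, m, b):
    """R squared = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: wind speed (m/s) and pollutant concentration (ug/m3).
wind = [1.0, 2.0, 3.0, 4.0, 5.0]
conc = [52.0, 47.0, 41.0, 38.0, 30.0]

m, b = fit_line(wind, conc)
residuals = [y - (m * x + b) for x, y in zip(wind, conc)]
r2 = r_squared(wind, conc, m, b)

print("slope:", round(m, 3), "intercept:", round(b, 3))
print("residuals:", [round(r, 3) for r in residuals])
print("R squared:", round(r2, 4))
```

With a least-squares fit that includes an intercept, the residuals sum to zero, so it is their squares that carry the information about how well the line fits.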

  5. Methods in Excel: • Create an XY (scatter) chart • With the chart selected, click Chart, then Add Trendline

  6. Within the chart method, cont.: • Click Options • Select “Display equation on chart” and “Display R-squared value on chart” • You can also fit a nonlinear trendline (for example, exponential, logarithmic, or polynomial)

  7. Excel method: • Use functions • =SLOPE(Ys, Xs) (Ys FIRST) • =INTERCEPT(Ys, Xs) • =STEYX(Ys, Xs) • =FORECAST(x, Ys, Xs), where x is the value at which to predict y
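To make clear what these worksheet functions compute, here is a rough Python sketch of each. The Python function names are hypothetical stand-ins that mirror the argument order of Excel's SLOPE, INTERCEPT, STEYX, and FORECAST (FORECAST takes the x value first, then the known Ys, then the known Xs); the humidity and concentration data are invented.

```python
import math


def slope(known_ys, known_xs):
    """Least-squares slope, like Excel's SLOPE(known_y's, known_x's)."""
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in known_xs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(known_xs, known_ys))
    return sxy / sxx


def intercept(known_ys, known_xs):
    """Least-squares intercept, like Excel's INTERCEPT."""
    n = len(known_xs)
    return sum(known_ys) / n - slope(known_ys, known_xs) * sum(known_xs) / n


def steyx(known_ys, known_xs):
    """Standard error of the predicted y, like Excel's STEYX.

    Uses n - 2 degrees of freedom because two parameters were fitted.
    """
    m = slope(known_ys, known_xs)
    b = intercept(known_ys, known_xs)
    ss_res = sum((y - (m * x + b)) ** 2
                 for x, y in zip(known_xs, known_ys))
    return math.sqrt(ss_res / (len(known_xs) - 2))


def forecast(x, known_ys, known_xs):
    """Predicted y at a new x, like Excel's FORECAST(x, known_y's, known_x's)."""
    return intercept(known_ys, known_xs) + slope(known_ys, known_xs) * x


# Hypothetical data: relative humidity (%) and pollutant concentration.
humidity = [10.0, 20.0, 30.0, 40.0, 50.0]
conc = [12.0, 15.0, 21.0, 24.0, 28.0]

print("slope:", round(slope(conc, humidity), 4))
print("intercept:", round(intercept(conc, humidity), 4))
print("steyx:", round(steyx(conc, humidity), 4))
print("forecast at 60%:", round(forecast(60.0, conc, humidity), 4))
```

Note that the Ys come before the Xs in every call, which is the easiest argument to get backwards.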

  8. Method in Excel: • Data Analysis (Analysis ToolPak) • Regression • Advantage: creates a normal probability plot if you select that option • Creates tabular output (be careful not to overwrite your data)

  9. R squared: • Ranges from 0 to 1 • Shows how closely the estimated values for the trendline correspond to your actual data • A trendline is most reliable when its R-squared value is at or near 1 • Also known as the coefficient of determination
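As a quick illustration of how R squared responds to scatter, the sketch below fits a line to two invented data sets that share the same x values: one that hugs a straight line and one that does not. The helper name `r_squared_of_fit` is hypothetical.

```python
def r_squared_of_fit(xs, ys):
    """Fit y = m*x + b by least squares and return R squared of that fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
tight_ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]   # close to y = 2x
scattered_ys = [5.0, 1.0, 6.0, 2.0, 6.0, 3.0]  # no clear trend

tight_r2 = r_squared_of_fit(xs, tight_ys)
scattered_r2 = r_squared_of_fit(xs, scattered_ys)

print("tight data R squared:", round(tight_r2, 4))
print("scattered data R squared:", round(scattered_r2, 4))
```

The tight data give a value close to 1 and the scattered data give a value close to 0, so a trendline drawn through the second set says very little about the data.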

  10. Regression vs. correlation: • Regression is based on how far the Ys differ from their predicted values • Regression uses the variability in X to predict the variability in Y • Correlation (the Pearson correlation coefficient, r) measures the strength and direction of the linear x-y relationship; its square, R squared, is the proportion of the variation in y accounted for by that relationship • =RSQ(known_y's, known_x's)
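One way to see the difference between the two ideas: regression depends on which variable is treated as the predictor, while correlation does not. The sketch below, using invented humidity and concentration data and a hypothetical helper `centered_sums`, fits y on x and x on y and compares both slopes with the Pearson coefficient r.

```python
import math


def centered_sums(xs, ys):
    """Return (Sxx, Syy, Sxy), the sums of squares and cross-products about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


# Hypothetical data: relative humidity (%) and pollutant concentration.
humidity = [10.0, 20.0, 30.0, 40.0, 50.0]
conc = [12.0, 15.0, 21.0, 24.0, 28.0]

sxx, syy, sxy = centered_sums(humidity, conc)

r = sxy / math.sqrt(sxx * syy)   # Pearson r: symmetric in x and y
slope_y_on_x = sxy / sxx         # regression predicting conc from humidity
slope_x_on_y = sxy / syy         # regression predicting humidity from conc

print("r:", round(r, 4))
print("slope, y on x:", round(slope_y_on_x, 4))
print("slope, x on y:", round(slope_x_on_y, 4))
print("product of slopes:", round(slope_y_on_x * slope_x_on_y, 4))
print("r squared:", round(r ** 2, 4))
```

The two regression lines are not simply inverses of each other unless the fit is perfect, but the product of their slopes always equals r squared.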

  11. Correlation: • Three Excel methods: • =RSQ(Ys, Xs) • =CORREL(array1, array2) • =PEARSON(array1, array2) • Cautions: you must arrange the data into arrays first • Check whether r or R squared is returned (RSQ returns R squared; CORREL and PEARSON return r)
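The caution about r versus R squared matters most when the relationship is negative, because squaring discards the sign. The following sketch uses hypothetical Python functions `correl` and `rsq`, named after the Excel functions they imitate, on invented wind speed and concentration data.

```python
import math


def correl(array1, array2):
    """Pearson correlation coefficient r, like Excel's CORREL or PEARSON."""
    n = len(array1)
    mean1 = sum(array1) / n
    mean2 = sum(array2) / n
    s11 = sum((a - mean1) ** 2 for a in array1)
    s22 = sum((b - mean2) ** 2 for b in array2)
    s12 = sum((a - mean1) * (b - mean2) for a, b in zip(array1, array2))
    return s12 / math.sqrt(s11 * s22)


def rsq(known_ys, known_xs):
    """Square of the Pearson coefficient, like Excel's RSQ(known_y's, known_x's)."""
    return correl(known_ys, known_xs) ** 2


# Hypothetical data: concentration falls as wind speed rises.
wind = [1.0, 2.0, 3.0, 4.0, 5.0]
conc = [52.0, 47.0, 41.0, 38.0, 30.0]

r = correl(wind, conc)
r2 = rsq(conc, wind)

print("r:", round(r, 4))            # negative: inverse relationship
print("R squared:", round(r2, 4))   # positive: the sign is lost
```

Reporting only R squared would hide that concentration decreases with wind speed here, so it is worth checking which quantity a function returns before interpreting it.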
