
Happiness comes not from material wealth but less desire.

Applied Statistics Using SAS and SPSS. Topic: Simple linear regression. By Prof Kelly Fan, Cal State Univ, East Bay.


Presentation Transcript


  1. Happiness comes not from material wealth but less desire.

  2. Applied Statistics Using SAS and SPSS Topic: Simple linear regression By Prof Kelly Fan, Cal State Univ, East Bay

  3. Example: Computer Repair. A company markets and repairs small computers. How fast (Time) an electronic component (Computer Unit) can be repaired is very important to the company's efficiency. The variables in this example are Time and Units.

  4. Hmm… How long will it take me to repair this unit? Goal: to predict the length of repair Time for a given number of computer Units.

  5. Computer Repair Data

  6. Graphical Summary of Two Quantitative Variables: Scatterplot of the response variable against the explanatory variable • What is the overall (average) pattern? • What is the direction of the pattern? • How much do data points vary from the overall (average) pattern? • Any potential outliers?

  7. Summary for Computer Repair Data: Scatterplot (Time vs Units). Some Simple Conclusions: Time is linearly related to computer Units. (The length of) Time increases as (the number of) Units increases. Data points are close to the line. No potential outliers.

  8. Numerical Summary of Two Quantitative Variables • Regression equation • Correlation

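  9. Review: Math Equation for a Line • Y: the response variable • X: the explanatory variable • Y = b0 + b1X, where b0 is the intercept (the value of Y at X = 0) and b1 is the slope (the change in Y per one-unit increase in X)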

  10. Regression Equation • The regression line models the relationship between X and Y on average. • The math equation of a regression line is called the regression equation.

  11. The Usage of the Regression Equation • Predict the value of Y for a given X value. E.g., how long will it take to repair 3 computer units?
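As a rough illustration of predicting Y for a given X, the sketch below fits a least-squares line and evaluates it at 3 units. The data values are hypothetical (the slides' Computer Repair data did not survive in this transcript), and the helper name `fit_line` is made up for the example.

```python
from statistics import mean

def fit_line(x, y):
    """Return (b0, b1) of the least-squares line Y = b0 + b1*X."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1

# Hypothetical data, NOT the Computer Repair data from the slides.
units = [1, 2, 3, 4, 5]
minutes = [20, 35, 50, 65, 80]

b0, b1 = fit_line(units, minutes)
predicted = b0 + b1 * 3        # predicted repair time for 3 units
print(f"Time = {b0:.1f} + {b1:.1f} * Units; predicted time for 3 units: {predicted:.1f}")
```

SPSS reports the same two coefficients (constant and slope) in its Coefficients table; the sketch only shows where they come from.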

  12. General Notation • ŷ is called “predicted Y,” pronounced “y hat,” and estimates the average Y value for a specified X value. E.g., ŷ is the predicted repair time for a given number of units.

  13. The Limitation of the Regression Equation • The regression equation cannot be used to predict Y for X values that are (far) beyond the range in which the data were observed. E.g., the predicted weight (WT) for a given height (HT): given an HT of 40”, the regression equation gives a WT of -205 + 5 × 40 = -5 pounds!!
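The height and weight example can be checked with a few lines of code. The equation WT = -205 + 5 × HT is the one implied by the slide; the observed height range (60 to 76 inches) and the guard function are assumptions added only to illustrate refusing to extrapolate.

```python
def predicted_weight(height_in):
    """Predicted weight in pounds from height in inches, per the slide's equation."""
    return -205 + 5 * height_in

def predict_within_range(height_in, observed_min=60, observed_max=76):
    """Predict only inside the observed X range (the range here is hypothetical)."""
    if not observed_min <= height_in <= observed_max:
        raise ValueError(
            f"height {height_in} is outside the observed range "
            f"[{observed_min}, {observed_max}]; prediction would be extrapolation"
        )
    return predicted_weight(height_in)

print(predicted_weight(40))      # -5, a nonsensical negative weight
print(predict_within_range(70))  # 145
```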

  14. The Unpredicted Part • The value Y − ŷ (observed minus predicted) is the part the regression equation (model) cannot predict, and it is called the “residual.”

  15. [Figure: illustration of a residual]
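A residual is the observed Y minus the predicted Y. The sketch below computes residuals for a line that is assumed to be already fitted; both the coefficients and the data are hypothetical.

```python
# Hypothetical fitted line: Time = 6.2 + 14.6 * Units (illustrative values only).
b0, b1 = 6.2, 14.6

# Hypothetical observations, NOT the slides' data.
units = [1, 2, 3, 4, 5]
minutes = [22, 33, 52, 63, 80]

predicted = [b0 + b1 * u for u in units]             # y hat for each case
residuals = [y - y_hat for y, y_hat in zip(minutes, predicted)]

for u, y, y_hat, e in zip(units, minutes, predicted, residuals):
    print(f"Units={u}  observed={y}  predicted={y_hat:.1f}  residual={e:+.1f}")
```

A positive residual means the point lies above the regression line, and a negative one means it lies below.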

  16. Correlation between X and Y • X and Y might be related to each other in many ways: linear or curved.

  17. Examples of Different Levels of Correlation • r = .71: moderate linearity • r = .98: strong linearity

  18. Examples of Different Levels of Correlation • r = .00: nearly curved • r = -.09: nearly uncorrelated

  19. (Pearson) Correlation Coefficient of X and Y • A measure of the strength of the “LINEAR” association between X and Y • Sx: the standard deviation of the data values in X; Sy: the standard deviation of the data values in Y. The correlation coefficient of X and Y is: r = [1/(n − 1)] Σ [(xᵢ − x̄)/Sx] [(yᵢ − ȳ)/Sy], where n is the number of data pairs and x̄, ȳ are the sample means.
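The sketch below computes the Pearson coefficient from the standard deviations Sx and Sy, using the common textbook form: the sum of products of standardized values, divided by n − 1. That this is the exact form shown on the slide is an assumption, and the data are hypothetical.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation via standardized values (sample SDs use n - 1)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    total = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
                for xi, yi in zip(x, y))
    return total / (n - 1)

# Hypothetical data: a perfectly increasing and a perfectly decreasing line.
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [20, 35, 50, 65, 80])
r_down = pearson_r(x, [80, 65, 50, 35, 20])
print(round(r_up, 6), round(r_down, 6))
```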

  20. Correlation Coefficient of X and Y • -1 ≤ r ≤ 1 • The magnitude of r measures the strength of the linear association between X and Y • The sign of r indicates the direction of the association: “-” → negative association, “+” → positive association

  21. Goodness of Fit • R^2 is the proportion of the variance in Y explained (accounted for) by the model we use to fit the data • When there is only one X (simple linear regression), R^2 = r^2.
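The identity R^2 = r^2 for simple linear regression can be checked numerically. In this sketch R^2 is computed as 1 − SSE/SST (one standard definition, assumed here) and compared with the squared correlation; the data are hypothetical.

```python
from statistics import mean

# Hypothetical data, NOT the slides' data.
x = [1, 2, 3, 4, 5]
y = [22, 33, 52, 63, 80]

x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)        # total sum of squares (SST)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx                                  # least-squares slope
b0 = y_bar - b1 * x_bar                         # least-squares intercept
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / syy                       # proportion of variance explained
r = sxy / (sxx * syy) ** 0.5                    # Pearson correlation

print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
```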

  22. SPSS Output Analyze >> Regression >> Linear

  23. Confidence Intervals

  24. Check for Normality

  25. Check for Equal Variances >> plots >> zresid & zpred

  26. The Influence of Outliers • The slope becomes smaller (toward outliers) • The r value becomes smaller (less linear)

  27. The Influence of Outliers • The slope becomes clear (toward outliers) • The | r | value becomes larger (more linear: 0.159 → 0.935)
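Both outlier effects can be reproduced with small made-up data sets. In the sketch below, one outlier pulls the slope and r down in otherwise perfectly linear data, and another inflates | r | in otherwise uncorrelated data. All values are hypothetical and do not reproduce the 0.159 and 0.935 figures from the slide.

```python
from statistics import mean

def slope_and_r(x, y):
    """Return the least-squares slope and the Pearson correlation."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Case 1: linear data plus a point far below the line -> smaller slope, smaller r.
slope_clean, r_clean = slope_and_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
slope_out, r_out = slope_and_r([1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 0])

# Case 2: uncorrelated cloud plus one distant point -> larger |r|.
_, r_cloud = slope_and_r([1, 2, 3, 4, 5], [2, 1, 3, 1, 2])
_, r_cloud_out = slope_and_r([1, 2, 3, 4, 5, 20], [2, 1, 3, 1, 2, 20])

print(f"Case 1: slope {slope_clean:.2f} -> {slope_out:.2f}, r {r_clean:.2f} -> {r_out:.2f}")
print(f"Case 2: r {r_cloud:.2f} -> {r_cloud_out:.2f}")
```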

  28. Identify Outliers Using Residual Plots • Use “standardized” residuals!! • Cases with standardized residuals of size 3 or more (in absolute value) are outliers
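A minimal sketch of the rule above: fit the line, divide each residual by the residual standard error, and flag cases whose standardized residual is 3 or more in absolute value. Dividing by sqrt(SSE / (n − 2)) is an assumption about how the residuals are standardized, and the data are hypothetical with one deliberately planted outlier.

```python
from statistics import mean

def standardized_residuals(x, y):
    """Residuals of the least-squares line divided by sqrt(SSE / (n - 2))."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    rmse = (sum(e ** 2 for e in residuals) / (n - 2)) ** 0.5
    return [e / rmse for e in residuals]

# Hypothetical data: points near Y = 2X, with the case at X = 10 shifted up by 5.
x = list(range(1, 21))
y = [2 * xi + (0.1 if xi % 2 == 0 else -0.1) for xi in x]
y[9] = 25.0

z = standardized_residuals(x, y)
flagged = [xi for xi, zi in zip(x, z) if abs(zi) >= 3]
print("X values flagged as outliers:", flagged)
```

With very small samples no case can reach the cutoff under this scaling, since the largest possible standardized residual is bounded by sqrt(n − 2), so the rule is only meaningful for reasonably sized data sets.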
