Probability Distribution of Random Error



  1. Probability Distribution of Random Error EPI 809/Spring 2008

  2. Regression Modeling Steps • 1. Hypothesize Deterministic Component • 2. Estimate Unknown Model Parameters • 3. Specify Probability Distribution of Random Error Term • Estimate Standard Deviation of Error • 4. Evaluate Model • 5. Use Model for Prediction & Estimation

  3. Linear Regression Assumptions • Assumptions on the errors ε1, ..., εn (Gauss-Markov conditions) • Independent errors • Mean of the probability distribution of the errors is 0 • Errors have constant variance σ², for which an estimator is S² • Probability distribution of the errors is normal • Potential violation of the G-M conditions

  4. Error Probability Distribution

  5. Random Error Variation

  6. Random Error Variation • 1. Variation of Actual Y from Predicted Y

  7. Random Error Variation • 1. Variation of Actual Y from Predicted Y • 2. Measured by Standard Error of Regression Model • Sample Standard Deviation of ε̂, s

  8. Random Error Variation • 1. Variation of Actual Y from Predicted Y • 2. Measured by Standard Error of Regression Model • Sample Standard Deviation of ε̂, s • 3. Affects Several Factors • Parameter Significance • Prediction Accuracy
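The formula for s is not preserved in the transcript. As a hedged reconstruction, the usual definition for simple linear regression is given below; the symbols SSE, n, and the fitted values ŷ_i are standard notation assumed here, and the result agrees with the Root MSE of 0.60553 reported later in the computer output.

```latex
s^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2},
\qquad
s = \sqrt{s^2}
```

The divisor is n - 2 because two parameters (intercept and slope) are estimated before the residuals are computed.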

  9. Evaluating the Model Testing for Significance

  10. Regression Modeling Steps • 1. Hypothesize Deterministic Component • 2. Estimate Unknown Model Parameters • 3. Specify Probability Distribution of Random Error Term • Estimate Standard Deviation of Error • 4. Evaluate Model • 5. Use Model for Prediction & Estimation

  11. Test of Slope Coefficient • 1. Shows If There Is a Linear Relationship Between X & Y • 2. Involves Population Slope β1 • 3. Hypotheses • H0: β1 = 0 (No Linear Relationship) • Ha: β1 ≠ 0 (Linear Relationship) • 4. Theoretical basis of the test statistic is the sampling distribution of the slope

  12. Sampling Distribution of Sample Slopes

  13. Sampling Distribution of Sample Slopes

  14. Sampling Distribution of Sample Slopes • All Possible Sample Slopes • Sample 1: 2.5 • Sample 2: 1.6 • Sample 3: 1.8 • Sample 4: 2.1 • ... (very large number of sample slopes)

  15. Sampling Distribution of Sample Slopes • All Possible Sample Slopes • Sample 1: 2.5 • Sample 2: 1.6 • Sample 3: 1.8 • Sample 4: 2.1 • ... (large number of sample slopes) • Sampling Distribution of β̂1, with standard error S_β̂1
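The idea of "all possible sample slopes" can be illustrated with a small simulation. This is only a sketch: the population values `BETA0`, `BETA1`, and `SIGMA`, the seed, and the helper `ols_slope` are hypothetical choices made for illustration, not values from the slides. It draws many samples from one population line and looks at how the fitted slopes spread.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope, computed as SSxy / SSxx."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    return ss_xy / ss_xx


# Hypothetical population: Y = BETA0 + BETA1 * X + eps, eps ~ N(0, SIGMA^2)
BETA0, BETA1, SIGMA = -0.1, 0.7, 0.6
X = [1, 2, 3, 4, 5]

rng = random.Random(809)  # fixed seed so the run is repeatable
slopes = []
for _ in range(2000):
    y = [BETA0 + BETA1 * xi + rng.gauss(0, SIGMA) for xi in X]
    slopes.append(ols_slope(X, y))

mean_slope = statistics.fmean(slopes)
sd_slope = statistics.stdev(slopes)
# Theoretical standard deviation of the slope: sigma / sqrt(SSxx); SSxx = 10 here
theoretical_sd = SIGMA / 10 ** 0.5

print(f"mean of sample slopes: {mean_slope:.3f} (population slope {BETA1})")
print(f"sd of sample slopes:   {sd_slope:.3f} (theory {theoretical_sd:.3f})")
```

The sample slopes center on the population slope, and their spread is close to σ/√SSxx, which is the quantity that S_β̂1 estimates from a single sample.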

  16. Slope Coefficient Test Statistic
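The formula on this slide did not survive extraction. A hedged reconstruction, consistent with the "t = β̂k / S_β̂k" annotation on the later computer-output slide, is the standard t statistic for the slope; SS_xx and s are standard notation assumed here.

```latex
t = \frac{\hat{\beta}_1 - 0}{S_{\hat{\beta}_1}},
\qquad
S_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}},
\qquad
SS_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
\text{df} = n - 2
```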

  17. Test of Slope Coefficient Rejection Rule • Reject H0 in favor of Ha if t falls in the rejection region (the colored tails) • Reject H0 in favor of Ha if P-value = P(|T| > |t|) < α • [Figure: density of T ~ t(n-2) centered at 0, with rejection regions of area α/2 each, below -t1-α/2, (n-2) and above t1-α/2, (n-2)]

  18. Test of Slope Coefficient Example • Reconsider the Obstetrics example with the following data: Estriol (mg/24h), B.w. (g/1000): (1, 1), (2, 1), (3, 2), (4, 2), (5, 4) • Is the Linear Relationship between Estriol & Birthweight significant at the .05 level?

  19. Solution Table For β’s
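The solution table itself is missing from the transcript. The following sketch recomputes the least-squares estimates from the five data points of the example, using the standard formulas β̂1 = SSxy / SSxx and β̂0 = ȳ - β̂1·x̄; the variable names are illustrative only.

```python
# Obstetrics example: estriol (mg/24h) and birthweight (g/1000)
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar = sum(x) / n  # 3.0
y_bar = sum(y) / n  # 2.0

# Sums of squares and cross-products about the means
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar  # 37 - 30 = 7
ss_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2                 # 55 - 45 = 10

beta1_hat = ss_xy / ss_xx               # slope
beta0_hat = y_bar - beta1_hat * x_bar   # intercept

print(f"slope = {beta1_hat:.5f}, intercept = {beta0_hat:.5f}")
```

These values agree with the parameter estimates in the computer output later in the transcript (0.70000 and -0.10000).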

  20. Solution Table for SSE
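The SSE table is not preserved either. As a sketch, the residuals and SSE can be recomputed from the data and the fitted line ŷ = -0.1 + 0.7x (coefficients taken from the computer output in the transcript); the variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

# Fitted line from the transcript's parameter estimates
beta0_hat, beta1_hat = -0.1, 0.7

fitted = [beta0_hat + beta1_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

sse = sum(e ** 2 for e in residuals)   # unexplained sum of squares
s = math.sqrt(sse / (n - 2))           # standard error of the regression model

for xi, yi, fi, e in zip(x, y, fitted, residuals):
    print(f"x={xi}  y={yi}  fitted={fi:.1f}  residual={e:+.1f}")
print(f"SSE = {sse:.2f}, s = {s:.5f}")
```

The residuals match the Residual column of the SAS output (0.4, -0.3, 0, -0.7, 0.6), and s matches the reported Root MSE of 0.60553.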

  21. Test of Slope Parameter Solution • H0: β1 = 0 • Ha: β1 ≠ 0 • α = .05 • df = 5 - 2 = 3 • Critical Value(s): • Test Statistic:

  22. Test Statistic Solution From Table

  23. Test of Slope Parameter • H0: β1 = 0 • Ha: β1 ≠ 0 • α = .05 • df = 5 - 2 = 3 • Critical Value(s): ±3.182 • Test Statistic: t = 3.66 • Decision: Reject H0 at α = .05 • Conclusion: There is evidence of a linear relationship
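The test can be reproduced in a few lines. This is a sketch rather than the slide's own table: the critical value `T_CRIT` is the tabulated 97.5th percentile of the t distribution with 3 degrees of freedom, hard-coded here because the Python standard library has no t quantile function.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar

sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

se_beta1 = s / math.sqrt(ss_xx)   # estimated standard error of the slope
t_stat = beta1_hat / se_beta1     # test statistic for H0: beta1 = 0

T_CRIT = 3.182                    # t table value, df = 3, two-sided alpha = .05
reject_h0 = abs(t_stat) > T_CRIT

print(f"SE(slope) = {se_beta1:.5f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

The standard error and t value agree with the Estriol row of the SAS parameter estimates (0.19149 and 3.66).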

  24. Test of Slope Parameter Computer Output

      Parameter Estimates

      Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
      Intercept    1             -0.10000          0.63509     -0.16     0.8849
      Estriol      1              0.70000          0.19149      3.66     0.0354

      Annotations: Parameter Estimate is β̂k; Standard Error is S_β̂k; t Value is t = β̂k / S_β̂k; Pr > |t| is the P-Value

  25. Measures of Variation in Regression • 1. Total Sum of Squares (SSyy) • Measures Variation of Observed Yi Around the Mean Ȳ • 2. Explained Variation (SSR) • Variation Due to Relationship Between X & Y • 3. Unexplained Variation (SSE) • Variation Due to Other Factors

  26. Variation Measures • Unexplained sum of squares (Yi - Ŷi)² • Total sum of squares (Yi - Ȳ)² • Explained sum of squares (Ŷi - Ȳ)²

  27. Coefficient of Determination • 1. Proportion of Variation ‘Explained’ by Relationship Between X & Y • 0 ≤ r² ≤ 1
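The defining formula is missing from the transcript. A hedged reconstruction in the notation of the "Measures of Variation" slide (SSyy total, SSR explained, SSE unexplained) is:

```latex
r^2 = \frac{\text{Explained variation}}{\text{Total variation}}
    = \frac{SSR}{SS_{yy}}
    = \frac{SS_{yy} - SSE}{SS_{yy}},
\qquad 0 \le r^2 \le 1
```

For the Obstetrics data, SSyy = 6 and SSE = 1.1, so r² = 4.9 / 6 ≈ 0.8167.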

  28. Coefficient of Determination Examples [Four scatterplot panels: r² = 1, r² = 1, r² = .8, r² = 0]

  29. Coefficient of Determination Example • Reconsider the Obstetrics example. Interpret a coefficient of determination of 0.8167. • Answer: About 82% of the total variation in birthweight is explained by the mother’s Estriol level.

  30. r² Computer Output

      Root MSE          0.60553    R-Square   0.8167
      Dependent Mean    2.00000    Adj R-Sq   0.7556
      Coeff Var        30.27650

      Annotations: Root MSE is S; R-Square is r²; Adj R-Sq is r² adjusted for number of explanatory variables & sample size
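The summary statistics in this output can be recomputed from the five data points. The sketch below assumes the standard definitions: adjusted r² = 1 - (1 - r²)(n - 1)/(n - p - 1) with p = 1 explanatory variable, and coefficient of variation = 100·s/ȳ. The variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n, p = len(x), 1   # p = number of explanatory variables
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_yy = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares

beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar
sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))

root_mse = math.sqrt(sse / (n - p - 1))      # S
r2 = 1 - sse / ss_yy                         # R-Square
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
coeff_var = 100 * root_mse / y_bar

print(f"Root MSE  = {root_mse:.5f}")
print(f"R-Square  = {r2:.4f}")
print(f"Adj R-Sq  = {adj_r2:.4f}")
print(f"Coeff Var = {coeff_var:.5f}")
```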

  31. Using the Model for Prediction & Estimation

  32. Regression Modeling Steps • 1. Hypothesize Deterministic Component • 2. Estimate Unknown Model Parameters • 3. Specify Probability Distribution of Random Error Term • Estimate Standard Deviation of Error • 4. Evaluate Model • 5. Use Model for Prediction & Estimation

  33. Prediction With Regression Models What Is Predicted? • Population Mean Response E(Y) for Given X • Point on Population Regression Line • Individual Response (Yi) for Given X

  34. What Is Predicted?

  35. Confidence Interval Estimate of Mean Y
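The interval formula is not preserved in the transcript. A hedged reconstruction of the standard (1 - α) confidence interval for the mean response at X = x_p, written with the t-quantile notation used on the rejection-rule slide, is:

```latex
\hat{y} \;\pm\; t_{1-\alpha/2,\,(n-2)} \; s \,
\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}},
\qquad
\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_p
```

Here s is the standard error of the regression model and SS_xx = Σ(x_i - x̄)²; both symbols are standard notation assumed for this sketch.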

  36. Factors Affecting Interval Width • 1. Level of Confidence (1 - α) • Width Increases as Confidence Increases • 2. Data Dispersion (s) • Width Increases as Variation Increases • 3. Sample Size • Width Decreases as Sample Size Increases • 4. Distance of Xp from Mean X̄ • Width Increases as Distance Increases

  37. Why Distance from Mean? [Figure; labels: "Greater dispersion than X1", X̄]
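The effect of distance from the mean can be checked numerically. As a sketch, the standard error of the estimated mean response, s·sqrt(1/n + (xp - x̄)²/SSxx), is evaluated at each X of the Obstetrics data; the helper name `se_mean_response` is hypothetical.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar
sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))


def se_mean_response(xp):
    """Standard error of the estimated mean response at X = xp."""
    return s * math.sqrt(1 / n + (xp - x_bar) ** 2 / ss_xx)


se_by_x = [se_mean_response(xp) for xp in x]
for xp, se in zip(x, se_by_x):
    print(f"xp = {xp}: distance from mean = {abs(xp - x_bar):.0f}, SE = {se:.4f}")
```

The standard error is smallest at xp = x̄ = 3 and grows symmetrically with distance from the mean. The values match the "Std Error Mean Predict" column of the SAS output in the transcript.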

  38. Confidence Interval Estimate Example • Reconsider the Obstetrics example with the following data: Estriol (mg/24h), B.w. (g/1000): (1, 1), (2, 1), (3, 2), (4, 2), (5, 4) • Estimate the mean BW and a subject’s BW response when the Estriol level is 4 at the .05 level.

  39. Solution Table

  40. Confidence Interval Estimate Solution - Mean BW X to be predicted
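The worked solution is missing from the transcript. The sketch below computes a 95% confidence interval for mean birthweight at an Estriol level of 4, assuming the standard interval formula; `T_CRIT` is the tabulated 97.5th percentile of the t distribution with 3 degrees of freedom, hard-coded because the standard library has no t quantile function.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar
sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

T_CRIT = 3.182446   # t table value, df = 3, 95% two-sided
xp = 4              # X to be predicted

y_hat = beta0_hat + beta1_hat * xp
se_mean = s * math.sqrt(1 / n + (xp - x_bar) ** 2 / ss_xx)
ci_lower = y_hat - T_CRIT * se_mean
ci_upper = y_hat + T_CRIT * se_mean

print(f"predicted mean BW at x = {xp}: {y_hat:.4f}")
print(f"95% CI for mean BW: ({ci_lower:.4f}, {ci_upper:.4f})")
```

The result agrees with the "95% CL Mean" entries for observation 4 in the SAS output (1.6445, 3.7555).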

  41. Prediction Interval of Individual Response Note!

  42. Why the Extra ‘S’?
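The prediction-interval formula is not preserved in the transcript. A hedged reconstruction of the standard form is shown below; the extra 1 under the square root is the additional term these two slides point to.

```latex
\hat{y} \;\pm\; t_{1-\alpha/2,\,(n-2)} \; s \,
\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}
```

An individual response varies around its own mean with variance σ² (estimated by s²), in addition to the uncertainty in estimating that mean. The prediction interval is therefore always wider than the confidence interval for the mean at the same x_p.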

  43. SAS code for computing mean and prediction intervals

        Data BW;                  /* Reading data in SAS */
          input estriol birthw;
          cards;
        1 1
        2 1
        3 2
        4 2
        5 4
        ;
        run;

        PROC REG data=BW;         /* Fitting a linear regression model */
          model birthw=estriol / CLI CLM alpha=.05;
        run;

  44. Interval Estimate from SAS Output

      The REG Procedure
      Dependent Variable: y
      Output Statistics

      Obs   Dep Var y   Predicted Value   Std Error Mean Predict      95% CL Mean        95% CL Predict     Residual
        1      1.0000            0.6000                   0.4690   -0.8927   2.0927    -1.8376   3.0376      0.4000
        2      1.0000            1.3000                   0.3317    0.2445   2.3555    -0.8972   3.4972     -0.3000
        3      2.0000            2.0000                   0.2708    1.1382   2.8618    -0.1110   4.1110           0
        4      2.0000            2.7000                   0.3317    1.6445   3.7555     0.5028   4.8972     -0.7000
        5      4.0000            3.4000                   0.4690    1.9073   4.8927     0.9624   5.8376      0.6000

      Annotations: Predicted Y when X = 3; Std Error Mean Predict is S_Ŷ; 95% CL Mean is the Confidence Interval; 95% CL Predict is the Prediction Interval
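As a cross-check on the SAS output, the sketch below recomputes the 95% prediction interval for an individual birthweight at an Estriol level of 4, assuming the standard formula with the extra 1 under the square root. `T_CRIT` is the tabulated t value for 3 degrees of freedom, hard-coded for this illustration.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar
sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

T_CRIT = 3.182446   # t table value, df = 3, 95% two-sided
xp = 4

y_hat = beta0_hat + beta1_hat * xp
# The extra "1 +" accounts for the variability of an individual response
se_pred = s * math.sqrt(1 + 1 / n + (xp - x_bar) ** 2 / ss_xx)
pi_lower = y_hat - T_CRIT * se_pred
pi_upper = y_hat + T_CRIT * se_pred

print(f"95% prediction interval at x = {xp}: ({pi_lower:.4f}, {pi_upper:.4f})")
```

This reproduces the "95% CL Predict" entries for observation 4 (0.5028, 4.8972), which are wider than the corresponding "95% CL Mean" limits.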

  45. Hyperbolic Interval Bands

  46. Correlation Models

  47. Types of Probabilistic Models

  48. Correlation vs. regression • Both variables are treated the same in correlation; in regression there is a predictor and a response • In regression the x variable is assumed non-random or measured without error • Correlation is used in looking for relationships, regression for prediction

  49. Correlation Models • 1. Answer ‘How Strong Is the Linear Relationship Between 2 Variables?’ • 2. Coefficient of Correlation Used • Population Correlation Coefficient Denoted ρ (Rho) • Values Range from -1 to +1 • Measures Degree of Association • 3. Used Mainly for Understanding

  50. Sample Coefficient of Correlation • 1. Pearson Product Moment Coefficient of Correlation between x and y:
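The formula that followed the colon is not preserved. The standard Pearson product-moment coefficient is r = SSxy / sqrt(SSxx · SSyy). The sketch below applies it to the Obstetrics data as an illustration; the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation: SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


estriol = [1, 2, 3, 4, 5]
birthweight = [1, 1, 2, 2, 4]

r = pearson_r(estriol, birthweight)
print(f"r = {r:.4f}, r squared = {r ** 2:.4f}")
```

In simple linear regression the square of r equals the coefficient of determination, so r² here matches the R-Square of 0.8167 reported in the computer output.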
