
Lecture 3


chace

Presentation Transcript


  1. Lecture 3 HSPM J716

  2. Efficiency in an estimator • Efficiency = low bias and low variance • Unbiased with high variance: not very useful • Biased with low variance: worthless

  3. A no-variance, reliable estimator? • The 0 estimator
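A minimal simulation sketch of the zero-estimator point. It assumes a made-up true line (intercept 1, slope 2) with normal errors; none of these numbers come from the lecture, and `ls_slope` and `zero_estimator` are illustrative names.

```python
import random
import statistics

def ls_slope(xs, ys):
    """Least squares slope for a simple regression of ys on xs."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def zero_estimator(xs, ys):
    """Ignore the data and always answer 0."""
    return 0.0

random.seed(0)
TRUE_SLOPE = 2.0            # hypothetical true slope
xs = [1, 2, 3, 4, 5, 6, 7]  # hypothetical X values

ls_estimates, zero_estimates = [], []
for _ in range(2000):
    # Draw a fresh data set from the assumed true line plus random error.
    ys = [1.0 + TRUE_SLOPE * x + random.gauss(0, 3) for x in xs]
    ls_estimates.append(ls_slope(xs, ys))
    zero_estimates.append(zero_estimator(xs, ys))

for name, estimates in [("least squares", ls_estimates), ("zero", zero_estimates)]:
    bias = statistics.fmean(estimates) - TRUE_SLOPE
    variance = statistics.pvariance(estimates)
    print(f"{name:>13}: bias = {bias:+.3f}, variance = {variance:.3f}")
```

The zero estimator never varies from sample to sample, but it is off by the full true slope every time, which is why low variance alone does not make an estimator useful.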

  4. Eyeball vs. Least squares for assignment 1 • http://hspm.sph.sc.edu/COURSES/J716/demos/StudentLines/StudentLines.html

  5. Hypothesis testing – parallels among the coin toss, card trick, and assignment 1A experiments • A statistic calculated from our data • A critical value for that statistic calculated theoretically based on a hypothesis about how the data were generated • If our statistic were greater than the critical value, we would reject the hypothesis.

  6. Hypothesis testing – all about calculating the probability of what you got and drawing an inference • With the coin toss experiment • A statistic calculated from our data • Counted how many tails came up • A critical value for that statistic calculated theoretically based on the hypothesis that the coin was fair • 5 consecutive results that are all the same • When our statistic was greater than the critical value, we rejected the hypothesis

  7. Hypothesis testing – all about calculating the probability of what you got and drawing an inference • With the card experiment • A statistic calculated from our data • Counted how many times I guessed the card • A critical value for that statistic calculated theoretically based on the hypothesis that any of the 52 cards could come up • Even one right guess has a probability less than 0.05, so the critical value is 1. • When our statistic was as big as the critical value, we rejected the hypothesis
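The critical values in the coin and card experiments can be checked with a short binomial calculation. This is a sketch under stated assumptions: the coin statistic is taken to be the number of tails in 5 tosses, the card statistic is one guess at one card, and `prob_at_least` and `critical_value` are illustrative names.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(at least k successes in n independent tries, each with chance p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

def critical_value(n, p, alpha=0.05):
    """Smallest count whose probability under the hypothesis is below alpha."""
    for k in range(n + 1):
        if prob_at_least(k, n, p) < alpha:
            return k
    return None  # no outcome is unlikely enough to reject the hypothesis

# Fair coin: chance that all 5 tosses come up tails.
coin_p = prob_at_least(5, 5, 0.5)
# Full deck: chance of naming the card correctly on a single guess.
card_p = prob_at_least(1, 1, 1 / 52)

print(f"5 tails in 5 tosses: {coin_p:.5f}")
print(f"1 right guess in 1 try: {card_p:.5f}")
print("coin critical value (5 tosses):", critical_value(5, 0.5))
print("coin critical value (4 tosses):", critical_value(4, 0.5))
print("card critical value (1 guess):", critical_value(1, 1 / 52))
```

With only 4 tosses, no outcome has probability below 0.05 under the fair-coin hypothesis, so the function returns `None`; it takes 5 tosses before a run of tails can reject.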

  8. T statistic hypothesis tests calculate a probability and draw an inference • With the assignment 1A spreadsheet • A statistic calculated from our data • The estimated coefficient divided by its standard error • A critical value for that statistic calculated theoretically based on the hypothesis that the true line’s slope is 0. • 2.571 • When our statistic is greater than the critical value, we reject the hypothesis
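A sketch of the t-statistic calculation in plain Python rather than a spreadsheet. The data are made up. The value 2.571 is the two-sided 5% t-table entry for 5 degrees of freedom, which corresponds to 7 points in a straight-line fit; that the assignment uses 7 points is an assumption here.

```python
from math import sqrt

T_CRIT = 2.571  # t table, 5 degrees of freedom, two-sided 5% level

def slope_t_statistic(xs, ys):
    """Return (slope, standard error, t statistic) for a least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual variance uses n - 2 because two parameters were estimated.
    s2 = sum(e * e for e in residuals) / (n - 2)
    se = sqrt(s2 / sxx)
    return slope, se, slope / se

# Hypothetical data: 7 points, so n - 2 = 5 degrees of freedom.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8]

slope, se, t_stat = slope_t_statistic(xs, ys)
print(f"slope = {slope:.4f}, standard error = {se:.4f}, t = {t_stat:.2f}")
if abs(t_stat) > T_CRIT:
    print("Reject the hypothesis that the true slope is 0.")
else:
    print("Cannot reject the hypothesis that the true slope is 0.")
```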

  9. Not rejecting a false hypothesis • Type II error in assignment 1A part 2

  10. How the assumptions apply to the eyeball line and the least squares line

  11. Assumption 1 is that there is a true line and that what you see differs from the true line because of random errors up or down for each point. • Eyeball line: It's why you drew a line through the points, instead of using a curve or a wiggly line that goes from one point to the next.  • Least squares: It’s why you built a spreadsheet that calculates the slope and intercept of a line.

  12. Assumption 2 is that the errors have an expected value of 0. • Eyeball line: It's why you try to draw the line through the middle of the points, rather than off to one side or tilting differently. • Least squares: The average of the residuals is 0. • (The residuals are your estimates of the errors.)
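A quick numerical check that the residuals from a least squares line average to 0. The data are made up and `fit_line` is an illustrative name; any data set with an intercept in the fit should behave the same way, up to rounding.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data, deliberately scattered around a rising line.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [3.0, 2.5, 6.1, 5.2, 9.4, 8.8, 12.0]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)
print(f"average residual = {mean_residual:.2e}")  # zero apart from rounding error
```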

  13. Assumption 3 is that the errors all have the same variance. • Eyeball line: It's why you don't favor one point over another in drawing the line. • Least squares: The spreadsheet’s sum and average rows are simple sums and averages. No data row gets a different weight from another.

  14. Assumption 4 is that the errors are independent, not correlated with each other. • Eyeball line: It's why you predict for X=800 using a point on the line. • Least squares: It's why you predict for X=800 with 800*slope + intercept.

  15. Confidence interval for a coefficient • Coefficient ± its standard error × t from table • 95% probability that the true coefficient is in the 95% confidence interval? • If you do a lot of studies, you can expect that, for 95% of them, the true coefficient will be in the 95% confidence interval. • If 0 is in the confidence interval, then the coefficient is not significant.
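The "lot of studies" reading of the confidence interval can be sketched as a simulation. Everything numeric here is an assumption for illustration: a true line with intercept 2 and slope 0.5, normal errors, 7 points per study, and the 5-degrees-of-freedom t-table value 2.571.

```python
import random
from math import sqrt

T_CRIT = 2.571  # t table, 5 degrees of freedom (7 points), 95% level

def slope_and_se(xs, ys):
    """Least squares slope and its standard error."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, sqrt(sse / (n - 2) / sxx)

def confidence_interval(coef, se, t_crit=T_CRIT):
    """Coefficient plus or minus its standard error times t from the table."""
    half_width = t_crit * se
    return coef - half_width, coef + half_width

random.seed(1)
TRUE_SLOPE = 0.5            # hypothetical true coefficient
xs = [1, 2, 3, 4, 5, 6, 7]
trials = 2000               # number of simulated "studies"

hits = 0
for _ in range(trials):
    ys = [2.0 + TRUE_SLOPE * x + random.gauss(0, 1) for x in xs]
    slope, se = slope_and_se(xs, ys)
    low, high = confidence_interval(slope, se)
    if low <= TRUE_SLOPE <= high:
        hits += 1

coverage = hits / trials
print(f"share of intervals containing the true slope: {coverage:.3f}")
```

Each simulated study gets its own interval; the true slope stays fixed, and the intervals are what move. The share that contain the true slope should land near 0.95.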

  16. Assignment 2 • All regression results are the same • Graphs differ • Need reason to use or doubt least squares prediction • The reason is in the form of rejecting one or more of the assumptions

  17. Durbin-Watson statistic • Serial correlation • Finds significant pattern for clinic 2
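A sketch of how the Durbin-Watson statistic is computed from residuals taken in order. The residual lists are made up to show the two extremes and are not the clinic data; judging significance still requires the Durbin-Watson table, which this sketch does not reproduce.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift steadily: neighbors resemble each other.
trending = [3, 2, 1, -1, -2, -3]
# Hypothetical residuals that flip sign every time: neighbors oppose each other.
alternating = [1, -1, 1, -1, 1, -1]

print(f"trending residuals:    DW = {durbin_watson(trending):.3f}")
print(f"alternating residuals: DW = {durbin_watson(alternating):.3f}")
```

Values near 2 are what independent errors tend to produce. Values well below 2 point to positive serial correlation, and values well above 2 point to negative serial correlation.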

  18. Confidence interval for prediction • The hyperbolic outline
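The hyperbolic shape comes from a term that grows with the squared distance of the prediction point from the mean of X. The sketch below assumes the textbook interval for predicting a single new observation, made-up data, and the 5-degrees-of-freedom t value 2.571; `prediction_half_width` is an illustrative name.

```python
from math import sqrt

T_CRIT = 2.571  # t table, 5 degrees of freedom (7 points), 95% level

def prediction_half_width(xs, ys, x0, t_crit=T_CRIT):
    """Half-width of the prediction interval for a new observation at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))  # estimated standard deviation of the errors
    # The (x0 - x_bar)^2 term widens the interval away from the middle of the data.
    return t_crit * s * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

# Hypothetical data.
xs = [100, 200, 300, 400, 500, 600, 700]
ys = [12, 19, 33, 38, 52, 57, 71]

for x0 in (100, 400, 700, 800):
    width = prediction_half_width(xs, ys, x0)
    print(f"X = {x0}: prediction plus or minus {width:.2f}")
```

The interval is narrowest at the mean of X (400 in this made-up data) and widens symmetrically on either side, so a prediction at X = 800, outside the data, carries the widest band of the four.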

  19. Formal outlier test? • Use confidence interval of prediction • With and without the suspect point? • How do you predict when your data have an outlier? • Totally ignoring it seems wrong. • So does letting it sway your results too much. • Investigate and use judgment.

  20. Multiple regression • 3 or more dimensions • 2 or more X variables • Y = α + βX + γZ + error • Y = α + β1X1 + β2X2 + … + βpXp + error

  21. Fitting a plane in 3D space • Linear assumption • Now a flat plane • The effect of a change in X1 on Y is the same at all levels of X1 and X2 and any other X variables. • Residuals are vertical distances from the plane to the data points floating in space.
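A sketch of fitting a plane by least squares with two X variables, using the normal equations and a small Gaussian elimination routine. The data are made up to lie exactly on the plane Y = 1 + 2·X1 + 3·X2, so the fit should recover those coefficients; `fit_plane` and `solve_linear_system` are illustrative names.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_plane(x1, x2, y):
    """Return (alpha, beta1, beta2) minimizing the sum of squared residuals."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]  # leading 1 is for the intercept
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data lying exactly on a known plane.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

alpha, beta1, beta2 = fit_plane(x1, x2, y)
# Residuals are vertical distances from each data point to the fitted plane.
residuals = [yi - (alpha + beta1 * a + beta2 * b) for a, b, yi in zip(x1, x2, y)]
print(f"alpha = {alpha:.3f}, beta1 = {beta1:.3f}, beta2 = {beta2:.3f}")
print(f"largest residual = {max(abs(e) for e in residuals):.2e}")
```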

  22. Multiple regression • Separating effects • Example from literature • Example from handout

  23. β interpretation • in Y = α + βX + γZ + error • β is the effect on Y of changing X by 1, holding Z constant. • When X is one unit bigger than you would predict it to be from what Z is, then we expect Y to be β more than what you would predict it would be from what Z is. • Those predictions are based on linear relationships.
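The "bigger than you would predict from Z" reading of β can be checked numerically. The sketch computes β two ways on made-up data: directly from the two-regressor least squares solution, and as the slope of the part of Y not predicted by Z on the part of X not predicted by Z. Function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def simple_fit(xs, ys):
    """Return (intercept, slope) of the least squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def unexplained_by(zs, vs):
    """Residuals of vs after a linear prediction from zs."""
    intercept, slope = simple_fit(zs, vs)
    return [v - (intercept + slope * z) for z, v in zip(zs, vs)]

def beta_direct(x, z, y):
    """Coefficient on X in the least squares fit of Y = alpha + beta X + gamma Z."""
    x_bar, z_bar, y_bar = mean(x), mean(z), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    szz = sum((b - z_bar) ** 2 for b in z)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    sxy = sum((a - x_bar) * (c - y_bar) for a, c in zip(x, y))
    szy = sum((b - z_bar) * (c - y_bar) for b, c in zip(z, y))
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)

# Hypothetical data in which X and Z are correlated with each other.
x = [1, 2, 3, 4, 5, 6, 7, 8]
z = [2, 1, 4, 3, 6, 5, 8, 7]
y = [10.2, 9.1, 19.8, 18.7, 30.5, 28.9, 40.1, 39.2]

b_direct = beta_direct(x, z, y)
# Slope of "Y beyond its Z-based prediction" on "X beyond its Z-based prediction".
_, b_partial = simple_fit(unexplained_by(z, x), unexplained_by(z, y))
print(f"beta from multiple regression: {b_direct:.6f}")
print(f"beta from the residual route:  {b_partial:.6f}")
```

The two numbers agree up to rounding, which is the sense in which β measures the effect of X "holding Z constant" within a linear model.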

  24. β-hat formula
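For reference, the standard textbook forms of the least squares estimator are sketched below; the notation ($X$ as the data matrix with a leading column of ones, $\mathbf{y}$ as the vector of Y values) is an assumption and may differ from what the lecture used.

```latex
% Multiple regression, matrix form
\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}

% Simple regression (one X variable)
\hat{\beta} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
\hat{\alpha} = \bar{Y} - \hat{\beta}\,\bar{X}
```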

  25. LS • Spreadsheet as front end • Word processor as back end • Interpretation of results • Coefficients • Standard errors • T-statistics • P-values • Prediction
