Explore the relationship between father's and son's heights using Galton's survey data in a simple linear regression model. Learn about assumptions, inferences, and properties of estimators in Utopia JMP simulations.
Stat 112: Notes 2 • This class: Start Section 3.3. • Thursday’s class: Finish Section 3.3. • I will e-mail and post on the web site the first homework tonight. It will be due next Thursday.
Father and Son’s Heights • Francis Galton was interested in the relationship between • Y=son’s height • X=father’s height • Galton surveyed 952 father-son pairs in 19th Century England. • Data is in Galton.JMP
Sample vs. Population • We can view the data (x1, y1), ..., (xn, yn) as a sample from a population. • Our goal is to learn about the relationship between X and Y in the population: • We don’t care about how fathers’ heights and sons’ heights are related among the particular 952 father-son pairs sampled, but among all fathers and sons. • As in Notes 1, we don’t care about the relationship between tracks counted and the density of deer for the particular sample, but about the relationship in the population of all tracks; this enables us to predict the density of deer in the future from the number of tracks counted.
Sampling Distribution of b0,b1 • Utopia.JMP contains simulations of pairs (xi, yi) and (xi, yi*), both drawn from a simple linear regression model. • Notice the difference in the estimated coefficients from the y’s and y*’s. • The sampling distribution of b0, b1 describes the probability distribution of the estimates over repeated samples from the simple linear regression model with x1, ..., xn fixed.
Utopia

Linear Fit: y = 1.4977506 + 0.9876713 x
Parameter Estimates
Term        Estimate    Std Error   t Ratio   Prob>|t|
Intercept   1.4977506   0.300146     4.99     <.0001
x           0.9876713   0.016907    58.42     <.0001

Linear Fit: y* = 0.9469452 + 1.0216591 x
Parameter Estimates
Term        Estimate    Std Error   t Ratio   Prob>|t|
Intercept   0.9469452   0.364246     2.60     0.0147
x           1.0216591   0.020517    49.79     <.0001
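The Utopia comparison can be imitated outside JMP. The sketch below is only an illustration: the parameter values BETA0, BETA1, SIGMA, the x values 1 to 30, and the seed are hypothetical choices, not the settings used to build Utopia.JMP, and the helper names `ols_fit` and `simulate_y` are made up for this example. It draws two response vectors, y and y*, at the same x values and fits a least squares line to each.

```python
import random


def ols_fit(x, y):
    """Return (b0, b1), the least squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def simulate_y(x, beta0, beta1, sigma, rng):
    """Draw one response per x from y = beta0 + beta1*x + e, e ~ N(0, sigma^2)."""
    return [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]


# Hypothetical settings, NOT the values behind Utopia.JMP.
BETA0, BETA1, SIGMA = 1.0, 1.0, 1.0
rng = random.Random(112)
x = [float(i) for i in range(1, 31)]  # the same x values are used for both samples

y = simulate_y(x, BETA0, BETA1, SIGMA, rng)
y_star = simulate_y(x, BETA0, BETA1, SIGMA, rng)

b0, b1 = ols_fit(x, y)
b0_star, b1_star = ols_fit(x, y_star)
print(f"fit to y : b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"fit to y*: b0 = {b0_star:.4f}, b1 = {b1_star:.4f}")
```

Both fits come from the same model and the same x values, yet the estimates differ because the errors differ from sample to sample. That sample-to-sample variation is what the sampling distribution describes.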
Sampling distributions • Sampling distribution of b0: normally distributed. • Sampling distribution of b1: normally distributed. • Even if the normality assumption fails and the errors e are not normal, the sampling distributions of b0 and b1 are still approximately normal if n>30.
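A small simulation can make these statements concrete. Under the simple linear regression model with fixed x values, the standard result is that b1 has mean β1 and standard deviation σ / sqrt(Σ(xi − x̄)²). The sketch below checks this by repeated sampling, once with normal errors and once with uniform errors of the same standard deviation. All settings (BETA0, BETA1, SIGMA, n = 40, 2000 repetitions, the seed) and the helper names are assumptions made for illustration.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Least squares slope b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def slope_draws(x, beta0, beta1, draw_error, reps):
    """Refit the slope on `reps` fresh samples that share the same fixed x."""
    draws = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + draw_error() for xi in x]
        draws.append(ols_slope(x, y))
    return draws


def share_within(draws, center, half_width):
    """Fraction of draws within `half_width` of `center`."""
    return sum(abs(d - center) <= half_width for d in draws) / len(draws)


# Hypothetical settings chosen for illustration only.
BETA0, BETA1, SIGMA = 1.0, 1.0, 1.0
REPS = 2000
rng = random.Random(2)
x = [float(i) for i in range(1, 41)]  # n = 40, held fixed across samples

x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)
theory_sd = SIGMA / math.sqrt(sxx)  # theoretical SD of b1

normal_draws = slope_draws(x, BETA0, BETA1, lambda: rng.gauss(0.0, SIGMA), REPS)

# Uniform errors on [-a, a] have SD a / sqrt(3); pick a so that the SD equals SIGMA.
half_range = SIGMA * math.sqrt(3.0)
uniform_draws = slope_draws(
    x, BETA0, BETA1, lambda: rng.uniform(-half_range, half_range), REPS
)

for label, draws in (("normal errors", normal_draws), ("uniform errors", uniform_draws)):
    print(
        f"{label}: mean b1 = {statistics.mean(draws):.4f}, "
        f"SD b1 = {statistics.stdev(draws):.4f} (theory {theory_sd:.4f}), "
        f"share within 2 SD = {share_within(draws, BETA1, 2 * theory_sd):.3f}"
    )
```

If the sampling distribution is close to normal, roughly 95% of the simulated slopes should fall within two theoretical standard deviations of β1, whichever error distribution is used.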
Properties of b0 and b1 as estimators of β0 and β1 • Unbiased Estimators: The mean of the sampling distribution is equal to the population parameter being estimated. • Consistent Estimators: As the sample size n increases, the probability that the estimator is as close as you specify to the true parameter converges to 1. • Minimum Variance Estimator: The variance of the estimator is smaller than the variance of any other linear unbiased estimator of the same parameter.
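The three properties can be illustrated by simulation. The sketch below compares the least squares slope with a competing linear unbiased estimator, the slope of the line through the first and last points. The competitor, the parameter values, the x design 1, ..., n, the tolerance 0.05, and the seed are all assumptions made for this example. In particular, the consistency check lets the x values spread out as n grows, which is only one possible way for the sample to grow.

```python
import random
import statistics


def ols_slope(x, y):
    """Least squares slope b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def endpoint_slope(x, y):
    """Competing linear unbiased estimator: slope through the first and last points."""
    return (y[-1] - y[0]) / (x[-1] - x[0])


# Hypothetical settings chosen for illustration only.
BETA0, BETA1, SIGMA = 1.0, 1.0, 1.0
REPS = 2000
TOLERANCE = 0.05
rng = random.Random(3)


def estimator_draws(estimator, n, reps, rng):
    """Apply `estimator` to `reps` simulated samples with x = 1, ..., n."""
    x = [float(i) for i in range(1, n + 1)]
    draws = []
    for _ in range(reps):
        y = [BETA0 + BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in x]
        draws.append(estimator(x, y))
    return draws


# Unbiasedness and minimum variance, at n = 20.
ols_draws = estimator_draws(ols_slope, 20, REPS, rng)
endpoint_draws = estimator_draws(endpoint_slope, 20, REPS, rng)
print(f"OLS      : mean = {statistics.mean(ols_draws):.4f}, "
      f"variance = {statistics.variance(ols_draws):.5f}")
print(f"endpoint : mean = {statistics.mean(endpoint_draws):.4f}, "
      f"variance = {statistics.variance(endpoint_draws):.5f}")

# Consistency: share of least squares slopes within TOLERANCE of BETA1 as n grows.
shares = []
for n in (10, 20, 40):
    draws = estimator_draws(ols_slope, n, REPS, rng)
    share = sum(abs(d - BETA1) <= TOLERANCE for d in draws) / REPS
    shares.append(share)
    print(f"n = {n:2d}: share within {TOLERANCE} of beta1 = {share:.3f}")
```

Both estimators should average close to β1 (unbiased), the least squares slope should show the smaller variance (minimum variance among linear unbiased estimators), and the share of least squares slopes near β1 should climb toward 1 as n increases (consistent).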