
Chapter 3


cricket

Presentation Transcript


  1. Chapter 3: Some basic concepts of statistics

  2. Population versus Sample • Sample: numbers that describe the sample are called __________________ • Sample mean is represented by ________ • Sample variance is represented by ________ • Population: numbers that describe the population are called _________________ • Population mean is represented by ________ • Population variance is represented by ________

  3. Sample mean and variance • Calculate sample mean: • Calculate sample variance: • Sample standard deviation:
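
The three calculations on this slide can be sketched in R. This is a minimal illustration, assuming the usual n - 1 divisor for the sample variance; the data vector `y` is made up for the example.

```r
# Hypothetical data, chosen so the arithmetic is easy to follow
y <- c(2, 4, 4, 5, 10)
n <- length(y)

y_bar <- sum(y) / n                     # sample mean
s2    <- sum((y - y_bar)^2) / (n - 1)   # sample variance (n - 1 divisor)
s     <- sqrt(s2)                       # sample standard deviation

# R's built-in functions use the same definitions
c(by_hand = y_bar, built_in = mean(y))
c(by_hand = s2,    built_in = var(y))
c(by_hand = s,     built_in = sd(y))
```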

  4. Population Mean and Standard deviation • $\mu = E(Y) = \sum_y y\,p(y)$ • Population variance: $\sigma^2 = \sum_y (y-\mu)^2\,p(y)$ • Population standard deviation: $\sigma = \sqrt{\sigma^2}$
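
As a worked instance of these sums, the sketch below evaluates the population mean and variance for one roll of a fair six-sided die. The die is an assumed example population; the names `y` and `p` stand for the possible values and their probabilities p(y).

```r
# Assumed population: one roll of a fair die
y <- 1:6
p <- rep(1 / 6, 6)
stopifnot(isTRUE(all.equal(sum(p), 1)))   # probabilities must sum to 1

mu     <- sum(y * p)             # population mean, E(Y)
sigma2 <- sum((y - mu)^2 * p)    # population variance
sigma  <- sqrt(sigma2)           # population standard deviation

c(mu = mu, sigma2 = sigma2, sigma = sigma)
```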

  5. Sampling distribution • The distribution of all possible $\bar{y}$ values for samples of size n = 50. • $E(\bar{y}) = \mu$ • $\mathrm{Var}(\bar{y}) = \sigma^2/n$
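
A small simulation can make the two properties concrete. This is only a sketch: the exponential population (for which the mean and variance are both 1), the seed, and the number of repetitions are assumptions chosen for illustration.

```r
set.seed(1)
n    <- 50      # sample size, as on the slide
reps <- 5000    # number of simulated samples (assumed)

# Assumed population: exponential with rate 1, so mu = 1 and sigma^2 = 1
y_bars <- replicate(reps, mean(rexp(n, rate = 1)))

mean(y_bars)   # should be close to mu = 1
var(y_bars)    # should be close to sigma^2 / n
1 / n          # sigma^2 / n for this population
hist(y_bars)   # approximate picture of the sampling distribution
```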

  6. Section 3.3 Summarizing Information in Populations and Samples: The Finite Population Case • If the population is infinitely large, sampling behaves like sampling with replacement (the selections are independent, so the selection probabilities do not change) • However, if the population is finite, the probabilities of selecting elements change as more elements are selected (Example: rolling a die versus drawing cards from a standard 52-card deck)
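
The die-versus-deck contrast can be checked with a few lines of arithmetic. The specific events (two sixes, two aces) are assumed for illustration.

```r
# Die rolls are independent: the chance of a six never changes
p_two_sixes <- (1 / 6) * (1 / 6)

# Cards drawn without replacement: after one ace is removed,
# only 3 aces remain among 51 cards
p_two_aces <- (4 / 52) * (3 / 51)

# What the answer would be if the draws were (wrongly) treated as independent
p_two_aces_indep <- (4 / 52)^2

c(two_sixes = p_two_sixes,
  two_aces = p_two_aces,
  two_aces_if_independent = p_two_aces_indep)
```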

  7. Estimating the population total • (Infinitely large population) Let $\tau$ denote the population total (a parameter) and let $\hat{\tau}$ denote the estimated total (a statistic); let $y_1, \ldots, y_n$ be a random sample of size n from the population, and let $\delta_1, \ldots, \delta_n$ be the selection probabilities of the respective sample observations. • The estimated population total is $\hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\delta_i}$, which is unbiased for the true parameter $\tau$. • This can be thought of as a weighted estimator with weights $w_i = 1/\delta_i$, so $\hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} w_i y_i$. • The estimated variance of $\hat{\tau}$ is $\hat{V}(\hat{\tau}) = \frac{1}{n}\cdot\frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{y_i}{\delta_i} - \hat{\tau}\right)^2$. • Calculate the estimated variance for each scenario.
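
The estimator and its estimated variance can be sketched as a small R function. The function name `estimate_total` and the example values are hypothetical, and the variance is computed around the estimate $\hat{\tau}$, since the true total is unknown in practice.

```r
# Hypothetical helper: estimated total and its estimated variance
# y     = sampled values
# delta = selection probability of each sampled value
estimate_total <- function(y, delta) {
  stopifnot(length(y) == length(delta), length(y) >= 2)
  n       <- length(y)
  scaled  <- y / delta                           # y_i / delta_i, i.e. w_i * y_i
  tau_hat <- sum(scaled) / n                     # estimated population total
  var_hat <- sum((scaled - tau_hat)^2) / (n * (n - 1))
  list(tau_hat = tau_hat, var_hat = var_hat)
}

# Made-up scenario: two observations, each selected with probability 1/4
est <- estimate_total(y = c(1, 4), delta = c(0.25, 0.25))
est$tau_hat   # (1/2) * (4 + 16) = 10
est$var_hat   # (1/2) * (1/1) * ((4 - 10)^2 + (16 - 10)^2) = 36
```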

  8. Sampling without replacement • The same idea can be used when sampling without replacement, but the selection probabilities become more difficult to find (STT 315 helps to understand how to calculate these).

  9. 3.4 Sampling distribution • In your introductory statistics class, you discovered that the sampling distribution of $\bar{y}$ is approximately normal (if n is large enough) with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

  10. Tchebysheff’s theorem • If n is NOT large enough to rely on the CLT and the population distribution is NOT normal, then we can still use Tchebysheff’s theorem to get a lower bound: for any $k \ge 1$, at least $1 - 1/k^2$ of the observations fall within k standard deviations of the mean (this is a LOWER BOUND!!). Therefore, within 1 standard deviation, at least 0% (not very useful); within 2 standard deviations, at least 75%; within 3, at least 88.9%
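
The bounds quoted on this slide can be reproduced and compared with data from a skewed distribution. This is a sketch under assumptions: the gamma sample, its size, and the seed are illustrative only.

```r
k     <- c(1, 2, 3)
bound <- 1 - 1 / k^2          # Tchebysheff lower bounds: 0, 0.75, 8/9

# Assumed non-normal data: a strongly right-skewed gamma sample
set.seed(1)
x <- rgamma(10000, shape = 0.5, scale = 9)

# Observed proportion of values within k standard deviations of the mean
observed <- sapply(k, function(kk) mean(abs(x - mean(x)) <= kk * sd(x)))

data.frame(k = k, bound = bound, observed = observed)
```

The observed proportions are typically far above the bound, which is why the theorem is described as a lower bound only.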

  11. Finite population size • All the theory in an introductory statistics class (and so far in this class) assumes INDEPENDENT observations (an infinite population, or one so large that we can treat it as infinite) • What happens when this is not true? R code:

      # Treat 80 gamma draws as a finite population
      x <- rgamma(80, shape = 0.5, scale = 9)
      hist(x)

      # Means of 100 samples of size n drawn from x without replacement
      x.bar.dist <- function(x, n) {
        xbar <- numeric(100)
        for (i in 1:100) {
          temp <- sample(x, n, replace = FALSE)
          xbar[i] <- mean(temp)
        }
        return(xbar)
      }
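
One way to explore the question on this slide is to compare the variance of the sample mean with and without replacement. The sketch below is self-contained and makes several assumptions: the sample size n = 40, the number of repetitions, and the seed are illustrative. The last line uses the standard finite population correction (N - n)/(N - 1), which is not stated on the slide.

```r
set.seed(1)
N <- 80
x <- rgamma(N, shape = 0.5, scale = 9)   # the finite population
n    <- 40                               # assumed sample size (half the population)
reps <- 2000                             # assumed number of simulated samples

sigma2 <- mean((x - mean(x))^2)          # population variance (divisor N)

xbar_wor <- replicate(reps, mean(sample(x, n, replace = FALSE)))
xbar_wr  <- replicate(reps, mean(sample(x, n, replace = TRUE)))

c(var_without_replacement = var(xbar_wor),
  var_with_replacement    = var(xbar_wr),
  sigma2_over_n           = sigma2 / n,
  with_fpc                = (sigma2 / n) * (N - n) / (N - 1))
```

With half the population sampled, the without-replacement variance should come out at roughly half of $\sigma^2/n$, so the independent-observations formula overstates the variability.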

  12. 3.5 Covariance and Correlation • Relationship between two random variables: covariance • The covariance indicates how two variables “covary” • Positive covariance indicates a positive association (the variables tend to move in the same direction) • Negative covariance indicates a negative association (the variables tend to move in opposite directions) • Zero covariance indicates no linear association (NOT necessarily independence!!!)

  13. More on Covariance • We calculate covariance as $\mathrm{cov}(y_1, y_2) = E[(y_1-\mu_1)(y_2-\mu_2)]$. • Look at graphs to discuss covariance (a measure of LINEAR dependency) • However, covariance depends on the scale of the two variables • Correlation “standardizes” the covariance • Correlation: $\rho = \mathrm{cov}(y_1,y_2)/(\sigma_1\sigma_2)$ • Note that $-1 \le \rho \le 1$
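
The scale dependence of covariance, and the fact that zero covariance does not imply independence, can both be seen in a short R sketch. The data are made up, and note that R's `cov()` and `cor()` compute the sample versions (n - 1 divisor) rather than the population expectation above.

```r
# Made-up paired data
y1 <- c(1, 2, 3, 4, 5)
y2 <- c(2, 4, 5, 4, 5)

cov_orig <- cov(y1, y2)
cor_orig <- cor(y1, y2)

# Change the units of y1 (for example metres to centimetres)
cov_scaled <- cov(100 * y1, y2)   # covariance is multiplied by 100
cor_scaled <- cor(100 * y1, y2)   # correlation is unchanged

c(cov_orig = cov_orig, cov_scaled = cov_scaled,
  cor_orig = cor_orig, cor_scaled = cor_scaled)

# Zero covariance without independence: v is completely determined by u
u <- -2:2
v <- u^2
cov_uv <- cov(u, v)
cov_uv
```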

  14. 3.6 Estimation • Since we do not know parameters, we estimate them with statistics!! If $\theta$ is the parameter of interest, then $\hat{\theta}$ is the estimator of $\theta$. We want the following properties to hold: • $E(\hat{\theta}) = \theta$ (the estimator is unbiased) • $V(\hat{\theta}) = \sigma^2_{\hat{\theta}}$ is small
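
The first property can be illustrated by simulation. The sketch below compares two estimators of a population variance: the sample variance with the n - 1 divisor and the version with divisor n. The normal population, its variance of 4, the sample size, and the seed are all assumptions made for the example.

```r
set.seed(1)
n      <- 5       # assumed sample size
reps   <- 10000   # assumed number of simulated samples
sigma2 <- 4       # true population variance (the parameter theta)

# Each row of the matrix is one simulated sample of size n
samples <- matrix(rnorm(n * reps, mean = 0, sd = sqrt(sigma2)), nrow = reps)

s2_unbiased <- apply(samples, 1, var)        # divisor n - 1
s2_biased   <- s2_unbiased * (n - 1) / n     # divisor n

mean(s2_unbiased)   # close to 4: E(theta-hat) = theta
mean(s2_biased)     # close to 3.2: systematically too small
```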

  15. Error of Estimation and Bounds • The error of estimation is defined as $|\hat{\theta} - \theta|$ • Set a bound B on this error of estimation such that $P(|\hat{\theta} - \theta| < B) = 1 - \alpha$. The value of B (the bound) can be thought of as the margin of error. In fact, this is how confidence intervals are constructed (when the sampling distribution of the statistic is normally distributed).
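
When the estimator is a sample mean with an approximately normal sampling distribution and standard deviation $\sigma/\sqrt{n}$, the bound works out to the normal quantile times that standard deviation. The sketch below computes B and checks the coverage by simulation; the values of `alpha`, `sigma`, `n`, and `mu` are assumed for illustration.

```r
alpha <- 0.05
sigma <- 10     # assumed known population standard deviation
n     <- 100    # assumed sample size

# Bound on the error of estimation for the sample mean
B <- qnorm(1 - alpha / 2) * sigma / sqrt(n)
B

# Simulation check: how often is |y-bar - mu| below B?
set.seed(1)
mu     <- 50    # assumed true mean
y_bars <- replicate(5000, mean(rnorm(n, mean = mu, sd = sigma)))
coverage <- mean(abs(y_bars - mu) < B)
coverage        # should be close to 1 - alpha
```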
