Population proportion and sample proportion

illana-york


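  1. Population proportion and sample proportion • Many everyday surveys ask only whether respondents approve of ... or support ..., and then compute the proportion of the total accounted for by the count of "approve" and "oppose" answers. • This chapter introduces statistical methods for inference about a single proportion. The next chapter covers inference about the distribution of a set of proportions. Social Statistics (I)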

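  2. Population proportion and sample proportion • Suppose we want to estimate A-bian's vote share in the presidential election, that is, the proportion of all voters who vote for A-bian. We can draw a sample of size n using an appropriate sampling method and observe the share of the n sampled people who support A-bian. This gives A-bian's support rate in the sample, called the sample proportion. • If we know the sampling distribution of the sample proportion (its expected value, variance, and shape), we can use the sample proportion to estimate the population proportion.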

  3. Sampling Distribution of the Sample Proportion • Let p denote the proportion of items in a population that possess a certain characteristic (for example, unemployed, or income below the poverty level). • To estimate p, we take a random sample of n observations from the population and count the number X of items in the sample that possess the characteristic. • The sample proportion $\hat{p} = X/n$ is used to estimate the population proportion p.

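  4. Sampling Distribution of the Sample Proportion (definition) • Suppose a random experiment has only two possible outcomes (X = 1: supports A-bian; X = 0: does not support A-bian). If the population size is N (all voters) and K people in the population will vote for A-bian, then the population proportion supporting A-bian is • $p = K/N$ (N = population size, K = total number of A-bian supporters)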

  5. Sampling Distribution of the Sample Proportion (definition) • Valid votes cast in the last presidential election: 12,664,393 (N) • Votes for A-bian: 4,977,697 (K) • The population proportion is $p = K/N = 39.30\%$

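  6. Sampling Distribution of the Sample Proportion (definition) • If n elements are drawn at random from the population of size N as a sample, written $(X_1, X_2, \ldots, X_n)$, and k of the n sampled people support A-bian, the share supporting A-bian is called the sample proportion: $\hat{p} = k/n = \frac{1}{n}\sum_{i=1}^{n} X_i$ • (n = sample size, k = number of supporters in the sample) • k is the total count of sampled people who support A-bian (X = 1).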

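  7. Sampling Distribution of the Sample Proportion (definition) • Before the election, a polling center surveyed 1500 people (n = 1500), of whom 573 supported A-bian (k = 573), so the sample proportion is $\hat{p} = 573/1500 = 38.2\%$. • The sampling error is $\hat{p} - p = 0.382 - 0.393 = -0.011$. • Because each sample draws different people, the computed sample proportion differs from sample to sample, so the sample proportion is itself a random variable.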
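
The election figures and the poll quoted above can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative, and the sampling error is taken here as the sample proportion minus the population proportion.

```python
# Figures quoted in the slides above.
N = 12_664_393   # valid votes (population size)
K = 4_977_697    # votes for A-bian
n = 1500         # poll sample size
k = 573          # A-bian supporters in the poll

p = K / N                    # population proportion
p_hat = k / n                # sample proportion
sampling_error = p_hat - p   # assumed definition: p_hat - p

print(f"p = {p:.4f}, p_hat = {p_hat:.4f}, error = {sampling_error:+.4f}")
```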

  8. The Bernoulli Distribution (definition) • $P(X=1) = p$ • $P(X=0) = 1-p$ • If we let $q = 1-p$, then the p.f. of X can be written as follows: $f(x) = p^x q^{1-x}$, $x = 0, 1$

  9. The Bernoulli Distribution (definition) • $E(X) = 1 \cdot p + 0 \cdot q = p$ (the expected value of X equals the population proportion) • $E(X^2) = \sum x^2 f(x) = 1^2 \cdot p + 0^2 \cdot q = p$ • $Var(X) = E(X^2) - [E(X)]^2 = p - p^2 = p(1-p) = pq$
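
The two moment calculations can be reproduced directly from the probability function. The helper name `bernoulli_moments` and the value p = 0.393 are illustrative choices, not part of the slides.

```python
def bernoulli_moments(p):
    """Return (mean, variance) of a Bernoulli(p) variable from its p.f."""
    q = 1 - p
    pf = {1: p, 0: q}                                # P(X=1) = p, P(X=0) = q
    mean = sum(x * f for x, f in pf.items())         # E(X)
    second = sum(x**2 * f for x, f in pf.items())    # E(X^2)
    return mean, second - mean**2                    # Var(X) = E(X^2) - [E(X)]^2

mean, var = bernoulli_moments(0.393)
print(mean, var)   # the variance should equal p * q
```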

  10. Sampling Distribution of the Sample Proportion • The Normal Approximation Rule for Proportion: Let p denote the proportion of a population possessing some characteristic of interest. Take a random sample of n observations from the population. Let X denote the number of items in the sample possessing the characteristic. We estimate the population proportion p by the sample proportion $\hat{p} = X/n$. If $np \ge 5$ and $nq \ge 5$, the random variable $\hat{p}$ has approximately a normal distribution with mean $E(\hat{p}) = p$ and variance $Var(\hat{p}) = pq/n$.

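  11. Sampling Distribution of the Sample Proportion • Proof: $E(\hat{p}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{np}{n} = p$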

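  12. Sampling Distribution of the Sample Proportion • Proof (assume $X_1, X_2, \ldots, X_n$ are independent): $Var(\hat{p}) = Var\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} Var(X_i) = \frac{npq}{n^2} = \frac{pq}{n}$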
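
As a numerical check that the sample proportion has mean p and variance pq/n, the sketch below computes both moments exactly from the binomial distribution of X. The function name and the values n = 50, p = 0.4 are illustrative assumptions.

```python
from math import comb

def phat_moments(n, p):
    """Exact mean and variance of p_hat = X/n when X ~ Binomial(n, p)."""
    q = 1 - p
    pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
    mean = sum((k / n) * f for k, f in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * f for k, f in enumerate(pmf))
    return mean, var

n, p = 50, 0.4                 # illustrative values, not from the slides
mean, var = phat_moments(n, p)
print(mean, var, p * (1 - p) / n)   # var should agree with p*q/n
```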

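  13. Sampling Distribution of the Sample Proportion • If the distribution of $\hat{p}$ is approximately normal with mean $p$ and variance $pq/n$, then the random variable $Z = \frac{\hat{p} - p}{\sqrt{pq/n}}$ has approximately a standard normal distribution.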

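  14. Example • Suppose 55% of voters will support A-bian in this election. If we take a random sample of n = 400 people to predict whether A-bian will win, what is the probability that we predict A-bian will lose?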
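
One way to work this example is with the normal approximation, assuming that "predicting a loss" means observing a sample proportion below 0.5. The sketch uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.55, 400
se = sqrt(p * (1 - p) / n)                # standard deviation of p_hat
z = (0.5 - p) / se                        # standardized cutoff for p_hat = 0.5
prob_predict_loss = NormalDist().cdf(z)   # P(p_hat < 0.5) under the approximation

print(f"z = {z:.2f}, P(p_hat < 0.5) = {prob_predict_loss:.4f}")
```

Under these assumptions z is about -2.01, so the chance of wrongly predicting a loss is roughly 2%.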

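  15. Example • Of your first 15 grandchildren, what is the chance there will be more than 10 boys? (Assume equal probability of male and female.) • "More than 10 boys" is equivalent to "the proportion of boys is more than 10/15". • Use the Normal Approximation Rule: $P(\hat{p} > 10/15) = P\left(Z > \frac{10/15 - 0.5}{\sqrt{(0.5)(0.5)/15}}\right) = P(Z > 1.29) \approx 0.0985$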
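
Because n = 15 is small, it can be instructive to set the normal approximation beside the exact binomial probability. The sketch below computes both; the comparison with the exact value is an addition for illustration, not something the slide does.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 15, 0.5
se = sqrt(p * (1 - p) / n)

# Normal approximation to P(p_hat > 10/15), without continuity correction.
approx = 1 - NormalDist().cdf((10 / n - p) / se)

# Exact P(X > 10) = P(X >= 11) for X ~ Binomial(15, 0.5).
exact = sum(comb(n, k) for k in range(11, n + 1)) * p**n

print(f"normal approximation: {approx:.4f}, exact binomial: {exact:.4f}")
```

The approximation gives about 0.098 while the exact probability is about 0.059, a reminder that the rule is rough when np is only slightly above 5.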

  16. Confidence intervals for proportions (large samples) • We know that $\hat{p} \sim N(p, pq/n)$, where $q = 1-p$, $np \ge 5$, and $nq \ge 5$.

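  17. Value of $z_{\alpha}$ • $P(Z \ge z_{\alpha/2}) = \alpha/2$ • $P(Z \le -z_{\alpha/2}) = \alpha/2$ • $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1-\alpha$ • (Figure: standard normal curve with area $\alpha/2$ in each tail and central area $1 - \alpha/2 - \alpha/2 = 1-\alpha$.)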
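
The critical values defined by these tail areas can be looked up numerically instead of from a table. A minimal sketch using the standard library's `NormalDist.inv_cdf`; the helper name `z_upper` is illustrative.

```python
from statistics import NormalDist

def z_upper(tail_area):
    """Return z such that P(Z >= z) = tail_area for a standard normal Z."""
    return NormalDist().inv_cdf(1 - tail_area)

for conf in (0.90, 0.95, 0.99):
    alpha = 1 - conf
    print(f"{conf:.0%} confidence: z_(alpha/2) = {z_upper(alpha / 2):.3f}")
```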

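  18. Confidence intervals for proportions (large samples) • $\hat{p} \pm z_{\alpha/2}\sqrt{pq/n}$ • This formula requires the population proportion p in order to compute the standard error.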

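  19. Confidence intervals for proportions (large samples) • Because p and q are unknown, when the sample is large enough we usually use the sample proportion $\hat{p}$ to estimate the standard error: $\hat{\sigma}_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.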

  20. Confidence interval for the population proportion p (definition) • Let p denote the population proportion. Suppose we take a large random sample of n observations and obtain the sample proportion $\hat{p}$. A confidence interval for the population proportion having level of confidence $100(1-\alpha)\%$ is given by $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.
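
A minimal sketch of this large-sample interval, assuming the usual form of the sample proportion plus or minus a normal critical value times the estimated standard error. The function name `proportion_ci` is illustrative; the data are the pre-election poll quoted earlier (573 supporters out of 1500).

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(k, n, conf=0.95):
    """Large-sample CI for p: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = k / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # z_(alpha/2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)      # estimated margin of error
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(573, 1500)   # poll: k = 573, n = 1500
print(f"95% CI: ({low:.4f}, {high:.4f})")
```

For the poll this gives roughly (0.357, 0.407), an interval that contains the actual vote share of 39.30%.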

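  22. Wilson estimate • Using the sample proportion in place of the population proportion to estimate the standard error is not always appropriate. • Example: toss a coin three times and get heads all three times. Then • $\hat{p} = 3/3 = 1$ • $\hat{p}(1-\hat{p})/n = 1(1-1)/3 = 0$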

  23. Wilson estimate • We must know the s.d. of the population to get a CI for p, so in practice it is estimated from the sample proportion. • Unfortunately, modern computer studies reveal that confidence intervals based on this approach can be quite inaccurate, even for large samples. -- When the sample is not an SRS. -- When the sample size is small.

  24. Wilson estimate • The Wilson estimate: add 2 successes and 2 failures, so that the sample proportion is moved slightly away from 0 and 1. -- Because this estimate was first suggested by Edwin Bidwell Wilson in 1927, we call it the Wilson estimate.

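  25. Wilson estimate • With $\tilde{p} = (X+2)/(n+4)$, the sampling distribution of $\tilde{p}$ is approximately normal with mean $p$ and standard deviation $\sqrt{p(1-p)/(n+4)}$. • An approximate level C confidence interval for p is $\tilde{p} \pm z^{*}\sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$ • The margin of error is $m = z^{*}\sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$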
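
The "add 2 successes and 2 failures" idea can be sketched as follows, applied to the three-heads-in-three-tosses case from the slides. The standard error is taken as sqrt(p~(1-p~)/(n+4)), which is an assumption about the formula intended on the slide; clipping the interval to [0, 1] is an added convenience, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def plus_four_ci(successes, n, conf=0.95):
    """Interval based on the estimate (X + 2) / (n + 4)."""
    p_tilde = (successes + 2) / (n + 4)            # shifted away from 0 and 1
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = sqrt(p_tilde * (1 - p_tilde) / (n + 4))   # assumed standard error
    low = max(0.0, p_tilde - z * se)               # clip to the valid range
    high = min(1.0, p_tilde + z * se)
    return p_tilde, se, (low, high)

p_tilde, se, interval = plus_four_ci(3, 3)   # three heads in three tosses
print(p_tilde, se, interval)
```

Unlike the plain sample proportion, which gives an estimated standard error of exactly 0 here, the adjusted estimate 5/7 yields a positive standard error and a non-degenerate interval.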

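  26. Confidence interval for the population proportion p (example) • The government wants to estimate the proportion of households with monthly income below NT$25,000. 500 households were interviewed, of which 200 had monthly income below NT$25,000. Find the 95% confidence interval for p. • Answer: (.3572, .4428)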

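  27. Example • A random sample of 500 people from Taipei City was asked whether they favor holding a referendum, and 312 said yes. Find the 95% confidence interval for the proportion of Taipei City residents in favor. • $\hat{p} = 312/500 = 0.624$, and the confidence interval for p is $0.624 \pm 1.96\sqrt{(0.624)(0.376)/500} = 0.624 \pm 0.0425$, that is, (0.5815, 0.6665).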

  28. One-sided confidence intervals for the population proportion • Suppose that we take a random sample of n observations from some population having unknown proportion p. Suppose we wish to find the lower confidence limit LCL such that the probability is $(1-\alpha)$ that p exceeds LCL. The one-sided interval (LCL, 1.00) is a left-sided confidence interval. The LCL is given by: $LCL = \hat{p} - z_{\alpha}\sqrt{\hat{p}\hat{q}/n}$

  29. One-sided confidence intervals for the population proportion • Construct a right-sided 95% CI for the proportion of defective items produced by a machine if 16 items are found to be defective in a random sample of 100 items. • $UCL = \hat{p} + z_{.05}\sqrt{\hat{p}\hat{q}/n} = .16 + 1.645\sqrt{(.16)(.84)/100} = .2203$ • The 95% right-sided CI for p is (0, .2203). This means that we can be 95% confident that the population proportion is less than .2203.

  30. Determining the sample size • Margin of error: Suppose that we take a random sample from some population. Then a $100(1-\alpha)\%$ confidence interval for the population proportion extends at most a distance m on each side of the sample proportion if the number of observations is $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{m^2}$

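  31. Determining the sample size • The problem is that we do not yet know $\hat{p}$ (the sample size has not even been decided), so the formula above cannot be used unless we have an estimate of p. (1) We can run a pilot study to obtain an estimate of p. (2) When the sample proportion is unknown, we can take the most conservative approach and use the largest possible variance, $.5 \times .5 = .25$, to determine n.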

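  32. Sample size and confidence interval for the proportion • If the population proportion cannot be estimated in advance, the sample size is $n = \frac{z_{\alpha/2}^2 (0.5)(0.5)}{m^2}$ • If the population proportion p can be estimated in advance, the sample size is $n = \frac{z_{\alpha/2}^2\, pq}{m^2}$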

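  33. Sample size and confidence interval for the proportion • A polling organization wants to know a presidential candidate's share of the vote. What is the minimum sample size needed so that, with 95% confidence, the margin of error does not exceed .03?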
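
A short sketch of the sample-size calculation for this question, assuming the conservative choice p = 0.5 because no prior estimate of p is given. The function name `sample_size` and its parameters are illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size(margin, conf=0.95, p_guess=0.5):
    """Smallest n with z^2 * p * q / n <= margin^2, for a planning value of p."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

print(sample_size(0.03))   # margin of error .03 at 95% confidence
```

Rounding up matters here: the formula gives a little over 1067, so 1068 respondents are needed.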

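  34. Sample size and confidence interval for the proportion • A polling organization wants to know a presidential candidate's share of the vote. Suppose it requires that the error between the sample proportion and the population proportion not exceed 0.01, with 95% confidence. What should the sample size be? • Substituting $z_{.025} = 1.96$ and $m = 0.01$, and using $p = 0.5$ because p is unknown: $n = \frac{(1.96)^2 (0.5)(0.5)}{(0.01)^2} = 9604$. So at least 9,604 sample points should be selected.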

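  35. Tests of the population proportion • Sampling distribution of the sample proportion, $f(\hat{p})$: if the population proportion is p, and $np \ge 5$ and $nq \ge 5$, then the sample proportion $\hat{p}$ is approximately normal, $\hat{p} \sim N(p, pq/n)$. • The Normal Approximation Rule for Proportion: If $np \ge 5$ and $nq \ge 5$, the random variable $\hat{p}$ has approximately a normal distribution with mean $p$ and variance $pq/n$.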

  36. Sampling Distribution of the Sample Proportion • If the distribution of $\hat{p}$ is approximately normal, then the random variable $Z = \frac{\hat{p} - p}{\sqrt{pq/n}}$ has approximately a standard normal distribution.

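  37. Tests of the population proportion • Assume $np \ge 5$ and $nq \ge 5$, and test the hypotheses $H_0: p = p_0$ (or $H_0: p \ge p_0$) against $H_1: p < p_0$. • If $H_0$ is true, the sample proportion is approximately $N(p_0, p_0 q_0/n)$, where $p_0$ is the population proportion when the hypothesis is true and $q_0 = 1 - p_0$. • Reject $H_0$ if $Z < -z_{\alpha}$ or, equivalently, $\hat{p} < \hat{p}^{*}$ (critical value approach).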

  41. Page 614, Procedure 12.2B (cont.)

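  42. Example: Testing a population proportion • A pan-Blue legislator claims that polls show 60% of the public support Lien Chan's visit to China, while a pan-Green group claims that support does not exceed 60%. You test this with a sample of 100: $H_0: p = .6$ vs. $H_1: p < .6$. Suppose 55 people in the sample support Lien Chan's visit. At the 5% significance level, can we reject the pan-Blue legislator's claim?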

  43. Example: Testing a population proportion • Solution: If $H_0$ is true, then $\hat{p}$ has approximately a normal distribution with mean $p = .6$ and variance $pq/n = (.6)(.4)/100 = .0024$. If we use a one-tailed test at the 5% level of significance, the critical region consists of all values of Z less than $-z_{\alpha} = -z_{.05} = -1.645$. From the sample, $\hat{p} = x/n = 55/100 = .55$.

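  44. Example: Testing a population proportion • The test statistic is $Z = -1.02 > -1.645$, so we do not reject $H_0$. • Equivalently, the observed sample proportion is $.55 > .519$ (the critical value $\hat{p}^{*}$), so the null hypothesis cannot be rejected.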
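
The numbers in this example can be reproduced with a short script. It follows the one-tailed test described above, with the standard error computed under the null hypothesis; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, x, alpha = 0.6, 100, 55, 0.05

p_hat = x / n
se0 = sqrt(p0 * (1 - p0) / n)          # standard error under H0
z = (p_hat - p0) / se0                 # test statistic
z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -1.645
p_crit = p0 + z_crit * se0             # critical sample proportion
reject = z < z_crit

print(f"z = {z:.2f}, critical z = {z_crit:.3f}, critical p_hat = {p_crit:.3f}")
print("reject H0" if reject else "do not reject H0")
```

Both forms of the decision rule agree: z is above the critical value and the observed .55 is above the critical proportion of about .519.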

  45. Sampling distribution of the difference between sample proportions • Suppose we take independent samples of sizes $n_1$ and $n_2$ from two populations. Let $p_1$ and $p_2$ be the proportions of items in each population that possess a certain characteristic, and let $q_1 = 1-p_1$, $q_2 = 1-p_2$. If $n_1p_1 > 5$, $n_1q_1 > 5$, $n_2p_2 > 5$, and $n_2q_2 > 5$, then the random variable $(\hat{p}_1 - \hat{p}_2)$ is approximately normally distributed with mean $p_1 - p_2$ and variance $\frac{p_1q_1}{n_1} + \frac{p_2q_2}{n_2}$.

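  46. Example • Suppose a marketing firm wants to know how popular a certain TV program is among high-income and low-income people. Suppose 40% of high-income people like the program, and 50% of low-income people like it. The firm draws a sample of 100 from the high-income population and a sample of 200 from the low-income population. What is the probability that the two sample proportions differ by less than .05?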
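
A sketch of one way to compute this, reading "differ by less than .05" as an absolute difference below .05, which is an interpretive assumption. The difference of sample proportions is treated as normal with mean $p_1 - p_2$ and variance $p_1q_1/n_1 + p_2q_2/n_2$.

```python
from math import sqrt
from statistics import NormalDist

p1, n1 = 0.40, 100   # high-income population and sample size
p2, n2 = 0.50, 200   # low-income population and sample size

mean = p1 - p2
sd = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = NormalDist(mean, sd)   # approximate distribution of p1_hat - p2_hat

# Assumed reading of the question: P(|p1_hat - p2_hat| < 0.05).
prob = diff.cdf(0.05) - diff.cdf(-0.05)
print(f"sd = {sd:.4f}, P(|difference| < 0.05) = {prob:.4f}")
```

Under this reading the probability comes out a little below 0.20. A one-sided reading of the question would give a different number.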

  48. Confidence intervals for the difference of two population proportions • Let $\hat{p}_1$ denote the observed proportion of successes in a random sample of $n_1$ observations from a population with proportion $p_1$ of successes, and let $\hat{p}_2$ denote the observed proportion of successes in an independent random sample of $n_2$ observations from a population with proportion $p_2$ of successes. A $100(1-\alpha)\%$ confidence interval for $(p_1 - p_2)$ is given by $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$ • This result holds provided $n_1p_1 \ge 5$, $n_1q_1 \ge 5$, $n_2p_2 \ge 5$, and $n_2q_2 \ge 5$.
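
A minimal sketch of an interval for the difference of two proportions, assuming the usual form with the estimated standard error of the difference. The function name is illustrative and the counts (45 of 100 and 96 of 200) are hypothetical, chosen only to show the call.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(x1, n1, x2, n2, conf=0.95):
    """Large-sample CI for p1 - p2 from two independent samples."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    d = p1_hat - p2_hat
    return d - z * se, d + z * se

# Hypothetical counts, not taken from the slides.
low, high = diff_proportion_ci(45, 100, 96, 200)
print(f"95% CI for p1 - p2: ({low:.4f}, {high:.4f})")
```

With these made-up counts the interval straddles 0, so the data would not rule out equal proportions.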

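  49. Tests concerning differences of proportions • To test whether the difference between two population proportions equals a specified value (or whether the two are equal), let the proportion in population 1 be $p_1$ and the proportion in population 2 be $p_2$: • $H_0: p_1 - p_2 = D_0$ • Draw samples of sizes $n_1$ and $n_2$ from the two populations and compute the sample proportions $\hat{p}_1$ and $\hat{p}_2$.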

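  50. Tests concerning differences of proportions • If the null hypothesis $H_0: p_1 - p_2 = D_0$ is true, and $n_1p_1 \ge 5$, $n_1q_1 \ge 5$, $n_2p_2 \ge 5$, $n_2q_2 \ge 5$, then $(\hat{p}_1 - \hat{p}_2)$ is approximately normal with mean $D_0$ and variance $\frac{p_1q_1}{n_1} + \frac{p_2q_2}{n_2}$. • Usually we want to test the null hypothesis $H_0: p_1 - p_2 = 0$, that is, $H_0: p_1 = p_2$.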
