Welcome to BUAD 310. Instructor: Kam Hamidieh Lecture 13, Monday March 5, 2014. Agenda & Announcement. Today: Chapter 15, Confidence intervals Reading for Chapters 13, 14, and 15: 13: Read sections 13.1 & 13.2, 13.3 & 13.4 not required but interesting stuff
Welcome to BUAD 310 Instructor: Kam Hamidieh Lecture 13, Monday March 5, 2014
Agenda & Announcement • Today: • Chapter 15, Confidence intervals • Reading for Chapters 13, 14, and 15: • 13: Read sections 13.1 & 13.2, 13.3 & 13.4 not required but interesting stuff • 14: Read 14.1 carefully but ignore sample size calculations based on kurtosis, Skip the rest of the chapter • 15: Read all sections but I won’t emphasize section 15.4 BUAD 310 - Kam Hamidieh
From Last Time • Population: the population mean μ is unknown. • Take a random sample and compute the sample mean x̄. • Now x̄ is our estimate of μ.
THE CLT The Central Limit Theorem: Suppose we draw a random sample of size n from a population with mean μ and standard deviation σ. When n is large, the sampling distribution of the sample mean x̄ is approximately normal, with mean μ and standard deviation σ/sqrt(n). The quantity SE(x̄) = σ/sqrt(n) is the standard error.
Confidence Intervals - Example • Launching a new “affinity” credit card: contemplating sending 100,000 pre-approved applications to alumni of a large university. • Should you send out 100,000 cards now? • Not a good idea! Too expensive. • Send out a few and see how profitable the cards are.
Our Scenario • Population: alumni who accept the card; μ = mean monthly balance of those who accept. • Random sample (RS): the 140 who accepted. • Compute the sample mean x̄. • Now x̄ is our estimate of μ.
Example Continued • n = 140 out of 1,000 took the card. • After three months we have: • Mean monthly balance: $1,990 • SD: s = $2,800 • [Figure: histogram of the data] • Why consider the mean?
Example Continued • Our point estimate of the population mean is $1,990. • Questions: • Are you ready to make your decision to go with the launch? • What would happen if you did another survey? • How close are we to μ?
General Comments about CIs • A confidence interval is a range of plausible values for a parameter of interest. • It is meant to account for the influence of sampling variation. • Confidence intervals have the form: point estimate ± margin of error. • A confidence interval has two main parts: • An interval of the form (a, b), with a < b. • A confidence level such as 95%, 99%, or 90%. • The general notation is (1-α)100%, so for a 95% CI α = 5%, for a 99% CI α = 1%, etc.
CLT • The CLT tells us that for large n, x̄ is approximately normal with mean μ and standard deviation σ/sqrt(n). • The above statement is true regardless of the population shape. • Now we can make probability statements!
Confidence Intervals [Figure: normal curve with the region from μ - 2σ to μ + 2σ labeled “95% Confidence Interval”]
Confidence Intervals • One issue: we don’t know σ! • For now only, let’s assume that σ = s! • The standard error is SE(x̄) = 2800/sqrt(140) ≈ 237. • Now we have: [1990 – 2(237), 1990 + 2(237)] ≈ $[1520, 2460] • We say: We are 95% confident that the (population) mean monthly balance of those who will accept the card is between $1,520 and $2,460.
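The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: it assumes, as the slide does, that s can stand in for σ, it uses the rough multiplier 2 for 95% confidence, and the variable names are illustrative.

```python
import math

# Values from the credit card example
n = 140          # alumni who accepted the card
xbar = 1990.0    # sample mean monthly balance, in dollars
s = 2800.0       # sample SD, used here in place of the unknown sigma

# Standard error of the sample mean
se = s / math.sqrt(n)

# Rough 95% interval: point estimate +/- 2 standard errors
lower = xbar - 2 * se
upper = xbar + 2 * se

print(f"SE = {se:.0f}")
print(f"95% CI = [{lower:.0f}, {upper:.0f}]")
```

Without intermediate rounding the endpoints come out near $1,517 and $2,463, which the slide rounds to $[1520, 2460].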
From Last Time • We are 95% confident that the (population) mean monthly balance of those who will accept the card is between $1,520 and $2,460. • What does “95% confident” mean? “95% confident” is not the same as “95% probability”!
CI Interpretation • For this particular interval of (1520, 2460), there are only two possibilities: • μ is in (1520, 2460) • μ is not in (1520, 2460) • You have no idea whether your interval captured μ! • But the statement of 95% confidence says: “We arrived at these numbers by a method that gives correct results (captures the population mean) 95% of the time.”
Simulation • Take a look at: http://onlinestatbook.com/stat_sim/conf_interval/index.html
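The same idea can be tried offline with a small simulation. The sketch below is not the linked applet; it assumes a normal population with made-up values (mean 100, SD 15, samples of size 50) and counts how often the interval x̄ ± 2·s/sqrt(n) captures the true mean.

```python
import math
import random
import statistics

def coverage(mu=100.0, sigma=15.0, n=50, trials=2000, seed=0):
    """Fraction of intervals xbar +/- 2*s/sqrt(n) that contain mu.

    mu, sigma, n, and trials are illustrative choices, not course data.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        s = statistics.stdev(sample)          # sample SD (divides by n - 1)
        half_width = 2 * s / math.sqrt(n)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

rate = coverage()
print(f"Share of intervals that captured mu: {rate:.3f}")
```

Any single interval either contains μ or it does not; the share printed here describes the method over many repeated samples and should land close to 95%.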
The Unreasonable Assumption • Recall that we made an unreasonable assumption: we know the population standard deviation σ. • We can estimate σ by the sample standard deviation s. • Recall that s = sqrt( Σ(x_i - x̄)² / (n - 1) ). • Like the sample mean, the sample standard deviation is a random variable.
Unreasonable Assumption • With the assumption, we get: (x̄ - μ)/(σ/sqrt(n)) ~ N(0,1), approximately, for large n. • Once we replace σ by s, we have: (x̄ - μ)/(s/sqrt(n)) ~ t-distribution with n - 1 degrees of freedom.
The t distribution • Like the normal distribution, the t distribution is bell shaped, but it has heavier tails. • The t distributions become wider for smaller sample sizes, reflecting the lack of precision in estimating σ from s. • Theoretically, as n gets bigger, the t distribution looks more and more like the standard normal distribution.
The Brewmaster! • The fellow to the right is William Gosset. • He is the discoverer of the t distribution. • He came across it while working as a chemist at a Guinness brewery! • He had to publish his research under the pseudonym Student.
Getting Values for T-Distribution • A nice site to get probabilities for the t-distribution: http://www.stat.tamu.edu/~west/applets/tdemo.html • Or use StatCrunch.
Example What should the t values be when the confidence level is 95% and n = 16? How about 99% and n = 16? (Use a new Table!)
T-Table (Used mainly for CI) [Table image not reproduced]
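If no table or applet is at hand, t values can also be approximated numerically. The sketch below is one possible approach, not the course's method: it integrates the t density with Simpson's rule and finds the cutoff by bisection, using only the Python standard library. The function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, via Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(conf_level, n):
    """t value leaving (1 - conf_level)/2 in each tail, with df = n - 1."""
    df = n - 1
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):               # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t95 = t_critical(0.95, 16)
t99 = t_critical(0.99, 16)
print(f"95%, n = 16 (df = 15): t = {t95:.3f}")
print(f"99%, n = 16 (df = 15): t = {t99:.3f}")
```

The results should agree with the df = 15 row of a standard t-table, about 2.131 for 95% and 2.947 for 99%.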
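The General Form of CI for μ • Choose a random sample (RS) of size n from a population having unknown mean μ. • The 100(1-α)% confidence interval for μ is x̄ ± t_{α/2, n-1} · s/sqrt(n). • t_{α/2, n-1} corresponds to the picture: it is the t value with area α/2 to its right under the t distribution with n - 1 degrees of freedom. We get this value from software or a table. • This interval is approximately correct for large n.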
Some Comments • Notation: standard errors: SE(x̄) = σ/sqrt(n); with s in place of σ, the estimated standard error is s/sqrt(n). • Complication: • The CLT holds regardless of the distribution of the original population. • Issue: theoretically, for (x̄ - μ)/(s/sqrt(n)) to follow a t distribution, the data must come from a normal population. However, it has been shown empirically that CI results are robust when n is large. • How large is large?
Guidelines for Sample Sizes (Exploring the Practice of Statistics by Moore, McCabe & Craig, 2014) • Robust method = the method still works when some assumptions are violated. • Check your data via histogram to see if they could come from a normal population. • The confidence interval should still be useful for non-normal data when 15 ≤ n < 40 unless you have extreme outliers. (Check via boxplots). • When n ≥ 40, the confidence interval can be used even with skewed distributions.
Example Suppose we survey a random sample of 30 people, record their incomes, and get an average of $50,000 with an SD of $10,000. Let μ = population mean income. 1. Find the 95% confidence interval for μ and interpret the interval. 2. Find the 99% confidence interval for μ. 3. Refer to the confidence interval you got from part 1. Does the confidence interval imply that 95% of the people have incomes in the range of the interval you computed? 4. What happens to the width of your confidence interval as the confidence level is increased (while holding all else fixed)?
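A minimal sketch of the computation for this example follows. It applies the form x̄ ± t · s/sqrt(n) with df = 29; the two t values are assumptions read from a standard t-table rather than numbers given on the slide.

```python
import math

# Values from the income example
n = 30
xbar = 50_000.0
s = 10_000.0

se = s / math.sqrt(n)   # estimated standard error of the sample mean

# t values for df = n - 1 = 29, taken from a standard t-table (assumed)
t_values = {0.95: 2.045, 0.99: 2.756}

intervals = {}
for level, t in t_values.items():
    margin = t * se
    intervals[level] = (xbar - margin, xbar + margin)
    print(f"{level:.0%} CI: [{xbar - margin:,.0f}, {xbar + margin:,.0f}]")
```

With everything else held fixed, the 99% interval comes out wider than the 95% interval, because the larger t value increases the margin of error.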
In Class Exercise 1 A market researcher asks 28 randomly chosen consumers to rate a product from 1 to 10. She gets answers with a mean of 7.2 and a standard deviation of 2. • Define your population parameter of interest. • What is the 95% confidence interval for the parameter of interest? • Interpretation?
Inference for Population Proportion • Our goal will be to estimate p, the unknown population proportion. • We will estimate p with p̂, the sample proportion. • Do not confuse p with µ.
Some Examples • President’s approval: p = population proportion of people approving of the president’s work. (Do you approve? Y/N) • Graduation rates: p = population proportion of people graduating from a college. (Did you graduate? Y/N) • Health insurance coverage rates: p = population proportion of adults with comprehensive health insurance. (Do you have health insurance? Y/N)
In Class Exercise 2 A study of religious practices among college students interviewed a sample of 127 students. 107 of the students said that they prayed at least once a week. We are interested in the proportion of college students who pray at least once a week. • What is your parameter of interest here? • Can you give an estimate of your parameter of interest?
Sample Proportion as a RV • p̂ is a random variable. • Good news! When n is large, p̂ is approximately normal with mean p and standard deviation sqrt(p(1 - p)/n).
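CI for Population Proportion p • The (1-α)100% confidence interval for the population proportion p using the sample proportion p̂ is p̂ ± z_{α/2} · sqrt(p̂(1 - p̂)/n). • z_{α/2} is the z value that corresponds to the picture: the point with area α/2 to its right under the standard normal curve.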
Comment on Notation • Standard errors: SE(p̂) = sqrt(p(1 - p)/n); with p̂ in place of p, the estimated standard error is sqrt(p̂(1 - p̂)/n).
Back to the Credit Card Example • Population: alumni of the school; p = proportion who will accept the card. • Random sample (RS): 1,000 pre-approved cards sent. • Compute the sample proportion p̂. • Now p̂ is our estimate of p.
Example • Note that the sample sizes are large enough: n·p̂ = (1000)(0.14) = 140 > 10 and n(1 - p̂) = (1000)(0.86) = 860 > 10. • 95% CI: 0.14 ± 1.96 · sqrt((0.14)(0.86)/1000) ≈ 0.14 ± 0.0215 = [0.1185, 0.1615] • Interpretation: We are 95% confident that the proportion of the alumni who will accept the card is between 11.85% and 16.15%.
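The interval for the credit card example can be reproduced with the Python standard library. This is a sketch of the large-sample formula p̂ ± z · sqrt(p̂(1 - p̂)/n); the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values from the credit card example
n = 1000          # pre-approved cards sent
accepted = 140    # alumni who accepted
p_hat = accepted / n

# Large-sample check: both counts should exceed 10
assert n * p_hat > 10 and n * (1 - p_hat) > 10

z = NormalDist().inv_cdf(0.975)              # z value for 95% confidence, about 1.96
se = math.sqrt(p_hat * (1 - p_hat) / n)      # estimated standard error of p_hat

lower = p_hat - z * se
upper = p_hat + z * se
print(f"95% CI for p: [{lower:.4f}, {upper:.4f}]")
```

The endpoints match the 11.85% and 16.15% quoted in the interpretation.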
Polling • How does Pew Research get its +/- 3% or something? http://www.pewresearch.org/ (http://www.pewinternet.org/files/2014/02/PIP_25th-anniversary-of-the-Web_022714_pdf.pdf) • How does the Gallup poll get its +/- 3% or something? http://www.gallup.com/home.aspx (http://www.gallup.com/poll/151550/Gallup-Daily-Economic-Confidence-Index.aspx)
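One way to see where a figure like ±3% can come from is to compute the margin of error z · sqrt(p(1 - p)/n) for a typical poll. The sketch below assumes a simple random sample of 1,000 respondents and the worst case p = 0.5; both are illustrative assumptions, not numbers taken from the Pew or Gallup reports, and real polls may adjust for weighting and survey design.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, conf_level=0.95):
    """Margin of error for a sample proportion under simple random sampling."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll of 1,000 people, worst case p = 0.5
moe = margin_of_error(1000)
print(f"Margin of error: +/- {moe:.1%}")
```

Under these assumptions the margin of error is roughly 3 percentage points.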
In Class Exercise 3 A survey of 13,819 college students collected information on drinking behavior and alcohol-related problems. “Frequent binge drinking” was defined as binge drinking three or more times in the past two weeks. 3,140 students were classified as frequent binge drinkers. • What is an estimate of the population proportion of college students, p, who are frequent binge drinkers? • Create a 99% confidence interval for p. • Interpretation?
Next Time • Chapter 16: Statistical Tests