Review # 1 Chapter 4 Chapter 6 Chapter 7 Chapter 8
Chapter 4: Descriptive Statistics Numerical Methods
• Problem 1 (Calculations) The ages of employees of a fast-food outlet are as follows: 19, 19, 65, 20, 21, 18, 20.
a) Compute the mean, the median, and the mode of the ages.
Mean = (19+19+…+20)/7 = 182/7 = 26
Median = the center of the sorted series {18, 19, 19, 20, 20, 21, 65} = 20
Mode = the value(s) with the largest frequency = 19 and 20
b) Assume the oldest employee (age 65) retires, and recompute the three measures.
Mean = 117/6 = 19.5
Median = (19+20)/2 = 19.5
Mode = 19 and 20 (no change)
The mean is sensitive to extreme values; the median and the mode are less sensitive.
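The calculations in Problem 1 can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper name `central_location` is an illustrative choice, not part of the original slides.

```python
from statistics import mean, median, multimode

ages = [19, 19, 65, 20, 21, 18, 20]

def central_location(values):
    """Return (mean, median, sorted list of modes) for a list of numbers."""
    return mean(values), median(values), sorted(multimode(values))

# Part a: all seven employees.
before = central_location(ages)

# Part b: drop the oldest employee (the single extreme value, 65).
ages_after = sorted(ages)[:-1]
after = central_location(ages_after)

print("all employees:   ", before)
print("after retirement:", after)
```

Comparing the two tuples shows the mean moving a long way while the median barely changes and the modes stay the same.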
Measures of Central location (Excel, interpretation)
• Problem 2 (Excel, interpretation)
• The summer incomes of a sample of 125 second-year business students are stored in Prob 2.
• Calculate the mean and the median.
• What do the two measures of central location tell you about the incomes?
• Which measure should be used to summarize the data?
Ch. 4: Measures of Central location Problem 2 - solution • The distribution is reasonably symmetrical, but a few low incomes have pulled the mean below the median, resulting in a distribution slightly skewed to the left. • Either measure could be used, but the median is better because it is not affected by a few low incomes.
Measures of Central location • Problem 3 (Excel, interpretation) The owner of a hardware store that sells electrical wires by the meter is considering selling the wire in pre-cut lengths to save on labor cost. A sample of wire sold over the course of 1 week was recorded (Prob3.xls). A) Compute the mean, median and mode B) What is the weakness of each measure in providing useful info? C) How might the owner decide on the lengths to pre-cut?
Measures of Central location A) The mean resides to the right of the median. The distribution of lengths is somewhat asymmetrical, skewed to the right (there must be some long wires sold that affect the mean value).
Measures of Central location B) The mean is unduly influenced by extreme observations. The median doesn’t indicate what lengths are most preferred. The mode doesn’t consider any desired lengths other than the one most frequently purchased.
Ch. 4: Measures of Central location C) You may draw the cumulative distribution and trim the tails.
[Figure: cumulative distribution of the wire lengths sold; marked values 3, 4, 10, 11]
Measures of Variability
• Problem 4 Calculate the mean, variance and standard deviation of the following set of numbers, treating them as
• a sample
• a population
• The set is: 14, 7, 8, 11, 5
The mean (both x̄ and μ) = (14+7+…+5)/5 = 45/5 = 9
The sample variance = s² = [(14−9)² + (7−9)² + … + (5−9)²]/(5−1) = 50/4 = 12.5
The population variance = σ² = [(14−9)² + (7−9)² + … + (5−9)²]/5 = 50/5 = 10
For the standard deviation, take the square root of the variance: s = √12.5 ≈ 3.54 and σ = √10 ≈ 3.16.
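The sample and population versions differ only in the divisor (n − 1 versus n). A minimal sketch with the standard `statistics` module, assuming the five numbers from Problem 4:

```python
from statistics import mean, variance, pvariance, stdev, pstdev

data = [14, 7, 8, 11, 5]

data_mean = mean(data)        # same value whether sample or population
sample_var = variance(data)   # divides the squared deviations by n - 1
pop_var = pvariance(data)     # divides the squared deviations by n

print("mean:", data_mean)
print("sample variance:", sample_var, "sample std dev:", round(stdev(data), 2))
print("population variance:", pop_var, "population std dev:", round(pstdev(data), 2))
```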
The variance • Problem 5 (The variance, Excel) • The number of customers entering a bank each hour for the last 100 days was recorded (Problem 5). • For each hour determine the mean and standard deviation. • What do these statistics tell you?
The Variance • Problem 5 – solution
• The noon hour (12–1) is the busiest, followed by the 2–3 P.M. and 10–11 A.M. periods.
• The variances during the noon hour and between 10 and 11 A.M. are the largest, which makes it difficult to predict the number of customers entering the bank in those hours.
• Staff lunch breaks and coffee breaks should be scheduled with this in mind.
Comment: All the samples can be analyzed in a single run with Excel > Descriptive statistics.
The Variance
• Problem 5 (interpretation, empirical rule, Chebyshev)
• The mean and standard deviation of the grades of 500 students who took an economics exam were 69 and 7, respectively.
a) What are the numerical endpoints of the intervals (x̄−s, x̄+s), (x̄−2s, x̄+2s), (x̄−3s, x̄+3s)?
b) If the grades have a mound-shaped distribution, approximately how many students received a grade in each of the three intervals specified above?
c) If the grades do not have a mound-shaped distribution, at least how many students received grades in the interval (x̄−3s, x̄+3s)?
The Variance • Problem 5 – solution
a) (x̄−s, x̄+s) = (69−7, 69+7) = (62, 76); (x̄−2s, x̄+2s) = (55, 83); (x̄−3s, x̄+3s) = (48, 90).
b) For a mound-shaped distribution the Empirical Rule applies. Thus,
• Approximately (.68)(500) = 340 grades are in (62, 76)
• Approximately (.95)(500) = 475 grades are in (55, 83)
• Virtually all of the grades are in (48, 90)
Chebyshev's Theorem
• If the distribution is not mound shaped we need to use Chebyshev's Theorem: at least 1 − 1/k² of the observations lie within ±ks of the mean.
• k = 1: 1 − 1/1² = 0. The theorem gives no information about the interval ±s around the mean (none or more of the observations may fall in it).
• k = 2: 1 − 1/2² = 3/4. At least ¾ of the observations can be found within ±2s around the mean.
• k = 3: 1 − 1/3² = 8/9. At least 8/9 of the observations can be found within ±3s around the mean.
c) If the distribution is not mound shaped, at least (8/9)(500) = 444.4, i.e. at least 445 grades, are within (48, 90).
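The interval endpoints and the two kinds of counts (approximate under the Empirical Rule, guaranteed minimum under Chebyshev) can be tabulated as follows. This is a sketch only; the variable names are illustrative, and exact fractions are used so that the rounding up to whole students is explicit.

```python
import math
from fractions import Fraction

n_students, mean_grade, sd = 500, 69, 7

# Endpoints of (mean - k*s, mean + k*s) for k = 1, 2, 3.
intervals = {k: (mean_grade - k * sd, mean_grade + k * sd) for k in (1, 2, 3)}

# Empirical Rule (mound-shaped data only): about 68% within 1s, about 95% within 2s.
empirical_share = {1: Fraction(68, 100), 2: Fraction(95, 100)}
empirical_counts = {k: int(share * n_students) for k, share in empirical_share.items()}

# Chebyshev (any distribution): at least 1 - 1/k^2 of the observations within k*s.
# A count of students must be whole, so the lower bound is rounded up.
chebyshev_min = {k: math.ceil((1 - Fraction(1, k * k)) * n_students) for k in (2, 3)}

for k in (1, 2, 3):
    print(k, intervals[k], empirical_counts.get(k), chebyshev_min.get(k))
```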
Chapter 6 (Probability)
• Probability is a numerical measure of the likelihood that a random event occurs: 0 ≤ P(A) ≤ 1.
Relationships between events
• Union: event A or event B occurred (at least one of them took place).
• Intersection: event A and event B occurred (both events took place).
• Complement: if event A did not occur, then the event called "not A" (Ā) occurred.
Chapter 6 • Problem 6
• A firm classifies its customers' accounts in two ways: by balance and by whether the account is overdue.

Account balance    Overdue    Not overdue
Under $100           .08         .42
$100 - $500          .08         .22
Over $500            .04         .16

• Define the following events:
• A: An account is under $100
• B: An account is overdue
• An account is selected at random. Find the following probabilities:
P(under $100) = P(A) = .08 + .42 = .50 (a marginal probability)
P(under $100 and overdue) = P(A and B) = .08 (a joint probability)
Chapter 6 • Problem 6, continued (Problem 6.11; relationships: and, or, conditional)
• Same accounts table and events as above (A: an account is under $100; B: an account is overdue).
More events and their probabilities:
P(under $100 or overdue) = P(A or B) = P(A) + P(B) − P(A and B) = [.08+.42] + [.08+.08+.04] − .08 = .62
Simply adding P(A) and P(B) gives .70, which counts the "under $100 and overdue" cell (.08) twice.
P(not overdue) = P(not B) = .42 + .22 + .16 = .80. Or, P(not B) = 1 − P(B) = 1 − (.08+.08+.04) = 1 − .20 = .80.
Chapter 6 • Problem 7 (conditional probability)
• Same accounts table and events as in Problem 6 (A: an account is under $100; B: an account is overdue).
• If the account selected is overdue, what is the probability that its balance is under $100? That is, P(A|B) = ?
P(A|B) = P(A and B)/P(B) = .08/(.08+.08+.04) = .08/.20 = .40
Note: P(A) = .50 but P(A|B) = .40, so knowing that the account is overdue changes the probability that its balance is under $100 (A and B are not independent).
Chapter 6 • Problem 7, continued
• Same accounts table as in Problem 6.
• If the account selected is overdue, what is the probability that its balance is $500 or less?
For any two events C and D, P(C|D) = P(C and D)/P(D). Here C: the balance is $500 or less; D: the account is overdue.
P($500 or less | overdue) = P($500 or less and overdue)/P(overdue) = (.08+.08)/(.08+.08+.04) = .16/.20 = .80
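All of the probabilities in Problems 6 and 7 come from summing cells of the same joint table, which the sketch below makes explicit. The dictionary keys and the helper `prob` are illustrative names chosen for this example; exact fractions avoid floating-point noise.

```python
from fractions import Fraction

# Joint probabilities from the accounts table: (balance class, status) -> probability.
joint = {
    ("under_100", "overdue"): Fraction("0.08"),
    ("under_100", "not_overdue"): Fraction("0.42"),
    ("100_to_500", "overdue"): Fraction("0.08"),
    ("100_to_500", "not_overdue"): Fraction("0.22"),
    ("over_500", "overdue"): Fraction("0.04"),
    ("over_500", "not_overdue"): Fraction("0.16"),
}

def prob(event):
    """Sum the joint probabilities of every cell for which event(balance, status) is true."""
    return sum((p for cell, p in joint.items() if event(*cell)), Fraction(0))

p_a = prob(lambda bal, st: bal == "under_100")                            # marginal P(A)
p_b = prob(lambda bal, st: st == "overdue")                               # marginal P(B)
p_a_and_b = prob(lambda bal, st: bal == "under_100" and st == "overdue")  # joint
p_a_or_b = prob(lambda bal, st: bal == "under_100" or st == "overdue")    # union
p_not_b = 1 - p_b                                                         # complement
p_a_given_b = p_a_and_b / p_b                                             # conditional
p_le_500_given_b = prob(lambda bal, st: bal != "over_500" and st == "overdue") / p_b

for name, value in [("P(A)", p_a), ("P(B)", p_b), ("P(A and B)", p_a_and_b),
                    ("P(A or B)", p_a_or_b), ("P(not B)", p_not_b),
                    ("P(A|B)", p_a_given_b), ("P(<=500|B)", p_le_500_given_b)]:
    print(name, float(value))
```

Summing the union cell by cell gives the same result as the addition rule P(A) + P(B) − P(A and B), because each cell is counted exactly once.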
Chapter 6 • Problem 8 (Multiplication rule)
• A sporting goods store estimates that 20% of the students at a nearby university ski downhill, and 15% ski cross-country. Of those who ski downhill, 40% also ski cross-country.
• What percentage of the students ski both downhill and cross-country?
Define events: A: a student skis downhill; B: a student skis cross-country.
Given probabilities: P(A) = .2; P(B) = .15; P(B|A) = .4
Calculate P(A and B) = P(B|A)P(A) = (.4)(.2) = .08, i.e. 8% of the students.
Chapter 6 • Problem 9 (Addition rule)
• A sporting goods store estimates that 20% of the students at a nearby university ski downhill, and 15% ski cross-country. Of those who ski downhill, 40% also ski cross-country.
• What percentage of the students do not ski at all?
With A: a student skis downhill and B: a student skis cross-country, P(A and B) = (.4)(.2) = .08.
P(A or B) = P(A) + P(B) − P(A and B) = (.2) + (.15) − (.08) = .27
Therefore P(not A and not B) = 1 − P(A or B) = 1 − .27 = .73, i.e. 73% of the students.
Review problems: 54 - 56, p.180
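The multiplication and addition rules used in the two skiing problems can be chained in a few lines. A minimal sketch, with variable names chosen here for illustration:

```python
from fractions import Fraction

p_downhill = Fraction(20, 100)             # P(A)
p_cross = Fraction(15, 100)                # P(B)
p_cross_given_downhill = Fraction(40, 100) # P(B|A)

# Multiplication rule: P(A and B) = P(B|A) * P(A)
p_both = p_cross_given_downhill * p_downhill

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p_downhill + p_cross - p_both

# Complement: students who do neither kind of skiing.
p_neither = 1 - p_either

print(float(p_both), float(p_either), float(p_neither))
```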
Chapter 6 • Problem 10 (Independent events, multiplication rule) • Approx. 3 out of every 4 Americans received a refund from the IRS in 1995. If 3 individuals are selected at random find the probabilities of the following events: • All three received a refund • None received a refund • At least one received a refund • Exactly one received a refund
Chapter 6 • Problem 10 – solution
Let A be the event: individual 1 received a refund. Define B and C similarly for individuals 2 and 3. Then P(A) = P(B) = P(C) = 3/4, and the three events are independent.
• P(All three received a refund) = P(A and B and C) = P(A)P(B)P(C) = (3/4)³ = 27/64
• P(None received a refund) = P(not A and not B and not C) = P(not A)P(not B)P(not C) = (1/4)³ = 1/64
• P(At least one received a refund) = 1 − P(none received a refund) = 1 − (1/4)³ = 63/64
• P(Exactly one received a refund) = P(A and not B and not C) + P(not A and B and not C) + P(not A and not B and C) = (3/4)(1/4)(1/4) + (1/4)(3/4)(1/4) + (1/4)(1/4)(3/4) = 3(3/4)(1/4)² = 9/64
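One way to double-check the four answers is to enumerate all 2³ refund/no-refund outcomes and add up the probabilities of the outcomes in each event. This sketch assumes independence between the three individuals, as the solution does; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

p_refund = Fraction(3, 4)

def outcome_prob(outcome):
    """Probability of one (True/False, True/False, True/False) refund pattern."""
    result = Fraction(1)
    for got_refund in outcome:
        result *= p_refund if got_refund else 1 - p_refund
    return result

outcomes = list(product([True, False], repeat=3))

def prob(event):
    """Total probability of the outcomes for which event(number of refunds) is true."""
    return sum((outcome_prob(o) for o in outcomes if event(sum(o))), Fraction(0))

p_all = prob(lambda k: k == 3)
p_none = prob(lambda k: k == 0)
p_at_least_one = prob(lambda k: k >= 1)
p_exactly_one = prob(lambda k: k == 1)

print(p_all, p_none, p_at_least_one, p_exactly_one)
```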
Chapter 7 • Problem 11 (discrete random variable, expected value, variance)
• You and a friend have contributed equally to a portfolio of $500. The annual income (X) has the following distribution:

x       500    1,000    2,000
P(x)    .5     .3       .2

• Determine the expected value and variance of the annual income earned on this portfolio.
• Determine the expected net annual profit to you, and its variance.
• What is the expected profit and variance to you for the next two years?
Chapter 7 • Problem 11 – solution
• E(X) = (500)(.5) + (1000)(.3) + (2000)(.2) = $950
V(X) = (500−950)²(.5) + (1000−950)²(.3) + (2000−950)²(.2) = 322,500 ($²)
• You receive half of the income and invested $250, so your annual profit is X/2 − 250.
E(Ann. profit) = E(X/2 − 250) = E(X/2) − E(250) = ½E(X) − 250 = (½)(950) − 250 = $225
V(Ann. profit) = V(X/2 − 250) = V(X/2) + V(250) = V(X/2) + 0 = (½)²V(X) = ¼(322,500) = 80,625
Chapter 7 • Problem 11 – solution continued
• If Xi is the income for year i, the income for the next two years is X1 + X2. Your profit is therefore (½)(X1 + X2) − 250.
• E(2 years Profit) = E(½X1 + ½X2 − 250) = ½E(X1) + ½E(X2) − 250 = [assuming the income distribution does not change between the two years] = (½)(950) + (½)(950) − 250 = $700.
• V(2 years Profit) = V(½X1 + ½X2 − 250) = [assuming the income distribution does not change between the two years, and the incomes in the two years are independent random variables] = (½)²V(X1) + (½)²V(X2) = ¼(322,500) + ¼(322,500) = 161,250.
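The expected values and variances in Problem 11 follow from the definitions E(X) = Σ x·P(x) and V(X) = Σ (x − E(X))²·P(x), plus the rules E(aX + b) = aE(X) + b and V(aX + b) = a²V(X). A sketch of the arithmetic, assuming (as the solution does) identical and independent income distributions in the two years:

```python
from fractions import Fraction

incomes = [500, 1000, 2000]
probs = [Fraction(5, 10), Fraction(3, 10), Fraction(2, 10)]

# Expected value and variance of the annual portfolio income X.
expected = sum(x * p for x, p in zip(incomes, probs))
variance = sum((x - expected) ** 2 * p for x, p in zip(incomes, probs))

# Your share: half of the income, minus your $250 contribution.
share, stake = Fraction(1, 2), 250
profit_mean = share * expected - stake
profit_var = share ** 2 * variance          # a constant shift adds no variance

# Two years: (1/2)(X1 + X2) - 250 with X1, X2 independent and identically distributed.
two_year_mean = share * expected + share * expected - stake
two_year_var = share ** 2 * variance + share ** 2 * variance

print(expected, variance, profit_mean, profit_var, two_year_mean, two_year_var)
```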
Chapter 7 • Example 12 (The binomial distribution) • A survey reported that 20% of elementary school teachers use the Web. Fifteen teachers are selected at random. • Answer the following questions.
Chapter 7 • Solution: Let us analyze this experiment first.
• There are n = 15 independent trials (one per teacher).
• Each trial has two possible outcomes (the teacher uses the Web or does not).
• The probability of success in each trial is p = .20, which does not change from trial to trial.
• Therefore, this is a binomial experiment.
• Define: X = the number of teachers who use the Web. X is binomial with parameters n = 15 and p = .2.
Chapter 7
• P(No teacher uses the Web) = P(X = 0) = (.8)¹⁵ ≈ .035
• P(One teacher uses the Web) = P(X = 1) = 15(.2)(.8)¹⁴ ≈ .132
• P(# of Web users does not exceed 8) = P(X ≤ 8) ≈ .999 <binomial table>
• P(More than 2 Web users) = P(X ≥ 3) = P(X=3) + P(X=4) + … + P(X=15) <Let us use the binomial table> = 1 − P(X ≤ 2) = 1 − .398 = .602
Chapter 7
• The expected number of teachers using the Web = E(X) = np = 15(.2) = 3 users. The variance of the number of Web users = V(X) = np(1−p) = 15(.2)(.8) = 2.4 users². Standard deviation = √V(X) = √2.4 ≈ 1.55 users.
• P(Less than 8 are Web users, given that more than 2 are users) = P(X ≤ 7 | X ≥ 3) = P(X ≤ 7 and X ≥ 3)/P(X ≥ 3) = [P(X=3) + … + P(X=7)]/P(X ≥ 3) <Let us use the table>
P(X=3) + … + P(X=7) = P(X ≤ 7) − P(X ≤ 2) = .996 − .398 = .598
P(X ≥ 3) = 1 − P(X ≤ 2) = 1 − .398 = .602
P(X ≤ 7 | X ≥ 3) = .598/.602 = .993
Chapter 7 • Solution continued
• Repeat this problem assuming 50 teachers were sampled. Since there are no tables available for this number of repeated trials (n = 50), we'll use Excel.
• P(X = 0) = (.8)⁵⁰
• P(X = 1) = 50(.2)(.8)⁴⁹
• P(X ≤ 4) = 0.018496 <Go to Excel> Type: =BINOMDIST(4,50,.2,TRUE)
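Where no binomial table is available, the probabilities can also be computed directly from the binomial formula P(X = k) = C(n, k) pᵏ (1−p)ⁿ⁻ᵏ. The sketch below uses only the standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative, and the second argument order differs from Excel's BINOMDIST.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k): sum of the pmf from 0 through k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

p = 0.2

# n = 15 teachers
p_none = binom_pmf(0, 15, p)
p_more_than_2 = 1 - binom_cdf(2, 15, p)
p_3_to_7 = binom_cdf(7, 15, p) - binom_cdf(2, 15, p)
p_le7_given_ge3 = p_3_to_7 / p_more_than_2

# n = 50 teachers (the case with no printed table)
p_le4_n50 = binom_cdf(4, 50, p)

print(round(p_none, 4), round(p_more_than_2, 3),
      round(p_le7_given_ge3, 3), round(p_le4_n50, 6))
```

Small differences in the last digit relative to table-based answers are expected, since the table entries are already rounded to three decimals.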
Chapter 8 The normal distribution Problem 1 (calculating normal probabilities)
• Find the following probabilities using the normal table.
P(Z ≥ 1.7) = 1 − P(Z < 1.7) = 1 − .9554 = .0446
Chapter 8 The normal distribution Problem 1
• Find the following probabilities using the normal table.
P(Z ≥ −.95) = P(Z ≤ .95) = .5 + P(0 < Z < .95) = .5 + .3289 = .8289
Chapter 8 The normal distribution Problem 1 (calculating normal probabilities)
• Find the following probabilities using the normal table.
P(−1.14 ≤ Z ≤ 1.55) = P(Z < 1.55) − P(Z < −1.14) = .9394 − .1271 = .8123
Chapter 8 The normal distribution Problem 1 (calculating normal probabilities)
• Find the following probabilities using the normal table.
P(−2.97 ≤ Z ≤ −1.38) = P(Z < −1.38) − P(Z < −2.97) = .0838 − .0015 = .0823
By symmetry this also equals P(0 < Z < 2.97) − P(0 < Z < 1.38) = .4985 − .4162 = .0823.
With Excel type: =normsdist(-1.38)-normsdist(-2.97)
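The four standard normal probabilities in Problem 1 can be reproduced without a table. This is a sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), whose `cdf` method plays the role of the cumulative normal table.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_ge_170 = 1 - z.cdf(1.7)                 # P(Z >= 1.7)
p_ge_neg095 = 1 - z.cdf(-0.95)            # P(Z >= -.95)
p_between = z.cdf(1.55) - z.cdf(-1.14)    # P(-1.14 <= Z <= 1.55)
p_left_tail = z.cdf(-1.38) - z.cdf(-2.97) # P(-2.97 <= Z <= -1.38)

for label, value in [("P(Z >= 1.7)", p_ge_170),
                     ("P(Z >= -.95)", p_ge_neg095),
                     ("P(-1.14 <= Z <= 1.55)", p_between),
                     ("P(-2.97 <= Z <= -1.38)", p_left_tail)]:
    print(label, round(value, 4))
```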
Chapter 8 The normal distribution Problem 2 (application)
• Mensa is an organization whose members possess IQs in the top 2% of the population. IQ is normally distributed with a mean of 100 and a standard deviation of 16.
• Questions
• What is the probability that a randomly selected person has an IQ of 140 or more?
• What minimum IQ qualifies a person to be admitted to Mensa?
• What is the probability that a randomly selected person from among Mensa's members has an IQ of more than 140?
Chapter 8 The normal distribution
• Question 1: What is the probability that a randomly selected person has an IQ of 140 or more?
• Answer: Define X as the IQ level of a person.
P(X > 140) = P(Z > (140 − 100)/16) = P(Z > 2.5) = .5 − .4938 = .0062
With Excel type: =1-normdist(140,100,16,True)
Chapter 8 The normal distribution
• Question 2: What minimum IQ qualifies a person to be admitted to Mensa?
• Answer: Define X as the IQ level of a person. For a Mensa member P(X > X0) = .02, that is, P(Z > (X0 − 100)/16) = .02. If we define Z0 = (X0 − 100)/16, then P(Z > Z0) = .02. Let us first find Z0 and then X0.
P(0 < Z < Z0) = .5 − .02 = .48, and the normal table gives Z0 = 2.055.
Finally, determine X0 from 2.055 = (X0 − 100)/16: X0 = 100 + 2.055(16) = 132.88.
Chapter 8 The normal distribution
• Question 3: What is the probability that a randomly selected person from among the members of Mensa has an IQ of more than 140?
• Answer: Define X as the IQ level of a person.
• P(X > 140 | X > 132.88) = P(X > 140 and X > 132.88)/P(X > 132.88) = P(X > 140)/P(X > 132.88) = P(Z > 2.5)/.02 = .0062/.02 = .31
• For comparison, we have seen that P(X > 140) = P(Z > (140 − 100)/16) = P(Z > 2.5) = .5 − .4938 = .0062.
• No surprise. Given that a person belongs to Mensa, the probability that his/her IQ exceeds 140 is much larger than that of a person from the general population.
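The three Mensa answers can be checked with `statistics.NormalDist`, whose `inv_cdf` method does the reverse table lookup needed for the cutoff. A sketch under the problem's stated assumptions (mean 100, standard deviation 16, top 2%); because it does not round Z0 to table precision, the cutoff comes out slightly below the table-based 132.88.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)

# Question 1: P(X > 140)
p_over_140 = 1 - iq.cdf(140)

# Question 2: the IQ that cuts off the top 2% (the 98th percentile)
cutoff = iq.inv_cdf(0.98)

# Question 3: P(X > 140 | X > cutoff). Since 140 is above the cutoff,
# "X > 140 and X > cutoff" is simply "X > 140".
p_over_140_given_mensa = p_over_140 / (1 - iq.cdf(cutoff))

print(round(p_over_140, 4), round(cutoff, 2), round(p_over_140_given_mensa, 2))
```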