Statistics & Data Analysis Course Number B01.1305 Course Section 31 Meeting Time Wednesday 6-8:50 pm Midterm Review
Midterm Format • Open book and open notes • No solution guides or other resources are permitted • A scientific calculator will be required • All questions will be short answer • Entire class period is available for exam
Exam Coverage • Chapter 1 • Understand reasons for statistics • Chapter 2 • Distinguish between qualitative and quantitative variables • Describe and interpret plots of data • Understand and calculate measures of center • Understand and calculate measures of variation
Exam Coverage • Chapter 3 • Understand different sources of probabilities • Understand and use basic principles of probability • Addition • Complements • Multiplication • Calculate conditional and unconditional probabilities • Understand, use and determine statistical independence • Be able to construct and interpret probability tables and trees • Chapter 4 • Understand probability distributions • Calculate the expected value and standard deviation of a probability distribution
Exam Coverage • Chapter 5: Some Special Probability Distributions • Calculate probability of an event using • Counting methods • Binomial distribution • Normal distribution • Chapter 6: Random Samples and Sampling Distributions • Understand and identify sources of sample bias • Understand difference between the distribution of a summary statistic and distribution of a population • Identify the sampling distribution of the sample mean • Understand the use of the Central Limit Theorem • Interpret a normal probability plot
Exam Coverage • Chapter 7: Point and Interval Estimation • Understand unbiased and efficient estimators • Calculate and interpret confidence intervals • For population mean with standard deviation known • For population proportion • For population mean with standard deviation unknown • Determine sample sizes for a given confidence level and tolerance width • Understand t-distribution • Understand key assumptions underlying confidence interval methods
Practice Problems with Answers in Book • 2.26 • 3.35 • 3.36 • 3.46 • 3.47 • 3.48 • 3.53 • 3.54 • 3.55 • 3.59 • 3.60 • 3.63 • 3.64 • 3.65 • 3.66 • 3.67 • 3.68 • 4.35 • 4.36 • 5.37 • 5.38 • 5.40 • 5.41 • 6.29 • 6.35 • 6.36 • 6.37 • 7.41 • 7.42 • 7.47 • 7.48 • 7.58 • 7.59 • 7.60 • 7.76 • 7.77
Interpretation Review • Mode: value or category with the highest frequency in the data • Median: middle value when the data are arranged from lowest to highest • Mean: sum of measurements divided by the number of measurements • Variance: average of the squared deviations from the mean • Empirical Rule: for roughly bell-shaped data, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean • IQR: 75th percentile – 25th percentile • Random Variable: quantitative result from an experiment that is subject to random variability • Expected Value: probability-weighted average of possible values • Permutations: number of sequences of r symbols taken k at a time • Combinations: number of subsets of r symbols taken k at a time • Central Limit Theorem: for any population, the sampling distribution of the sample mean is approximately normal if the sample size is sufficiently large • Interval estimate: states the range within which a population parameter probably lies • 95% Confidence interval: about 95% of similarly constructed intervals will contain the parameter being estimated
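The definitions above can be checked on a small data set. The following is a minimal sketch using only the Python standard library; the data values and the choice of r = 5, k = 2 are hypothetical, and the quartiles use the library's default ("exclusive") percentile convention, which may differ slightly from the textbook's.

```python
import math
import statistics

# Hypothetical sample, used only to illustrate the definitions.
data = [2, 3, 3, 5, 7, 8, 12]

mode = statistics.mode(data)              # most frequent value
median = statistics.median(data)          # middle value of the sorted data
mean = statistics.mean(data)              # sum divided by count
sample_var = statistics.variance(data)    # squared deviations, divided by n - 1
sample_sd = statistics.stdev(data)        # square root of the sample variance

# Quartiles with the library's default convention (an assumption here).
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Counting: r = 5 symbols taken k = 2 at a time.
n_permutations = math.perm(5, 2)          # ordered sequences
n_combinations = math.comb(5, 2)          # unordered subsets

print(mode, median, round(mean, 3), round(sample_var, 3), iqr)
print(n_permutations, n_combinations)
```

Note that `statistics.variance` divides by n - 1 (the sample variance); `statistics.pvariance` divides by n.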
Question #1 • Fortune magazine publishes a list of the world's billionaires each year. The 1992 list includes 233 individuals. Describe this distribution of wealth. Why do you think the distribution is the way it is (Hint: is this a representative sample)?
Question #2 • As a marketing consultant, you observed 50 consecutive shoppers at a grocery store, and recorded how much money each shopper spent in the store. • (a) Create and interpret a histogram of these data. • (b) Create and interpret a stem-and-leaf plot of these data. • (c) Create and interpret a boxplot of these data. • (d) Provide your client with an executive summary of your analysis.
Question #3 • A narcotics enforcement unit works with customs officers at an airport that serves international travelers on a route that has plausible links to the drug trade. This enforcement unit has developed a smuggler profile that it uses to initiate full searches of people who meet the profile. These profiles typically require meeting a number of conditions such as (a) male under 40, (b) traveling alone, (c) loose clothing, and so on. • Fully 100% of the travelers who meet the profile were searched, and 10% of those who did not meet the profile were searched. After collecting considerable data, these figures resulted: • Percentage of people who meet the profile: 4% • Percentage of people who meet the profile and then are found to have illegal drugs: 35% • Percentage of people who do not meet the profile and then are found to have illegal drugs: 3% • (a) Based on these figures, what percentage of travelers on this particular route is carrying illegal drugs? • (b) What percentage of the drug-carrying travelers will be captured by this procedure? Assume that all drug carriers who are searched will be captured. • (c) Given that a traveler is carrying illegal drugs (whether captured or not), what is the probability that this person will meet the profile?
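One way to set up Question #3 is with the law of total probability and Bayes' rule. The sketch below rests on two assumptions that the slide does not state outright: the 35% and 3% figures are read as conditional probabilities of carrying drugs given profile status, and whether a non-profile traveler is searched is treated as independent of whether that traveler carries drugs. All variable names are illustrative.

```python
# Figures from the question (interpretation as conditionals is an assumption).
p_profile = 0.04
p_drugs_given_profile = 0.35
p_drugs_given_no_profile = 0.03
p_search_given_profile = 1.00
p_search_given_no_profile = 0.10

# Joint probabilities of carrying drugs with and without meeting the profile.
p_drugs_and_profile = p_profile * p_drugs_given_profile
p_drugs_and_no_profile = (1 - p_profile) * p_drugs_given_no_profile

# (a) Total probability of carrying drugs.
p_drugs = p_drugs_and_profile + p_drugs_and_no_profile

# (b) Share of carriers who are searched (and therefore captured).
p_captured = (p_drugs_and_profile * p_search_given_profile
              + p_drugs_and_no_profile * p_search_given_no_profile)
share_captured = p_captured / p_drugs

# (c) Bayes' rule: P(profile | drugs).
p_profile_given_drugs = p_drugs_and_profile / p_drugs

print(f"(a) P(drugs)           = {p_drugs:.4f}")
print(f"(b) share captured     = {share_captured:.4f}")
print(f"(c) P(profile | drugs) = {p_profile_given_drugs:.4f}")
```

A different reading of the 35% and 3% figures (for example, as joint probabilities) would change every answer, so it is worth stating the interpretation explicitly on the exam.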
Question #4 • A restaurant has collected data on its customers’ orders and has estimated probabilities about what happens after the main course. It was found that 20% of the customers had dessert only, 40% had coffee only, and 30% had both dessert and coffee. • (a) Draw a probability tree for this situation. • (b) Find the probability of the event “had coffee.” • (c) Find the probability of the event “did not have dessert.” • (d) What percentage of customers will have “neither coffee nor dessert”? • (e) What percentage of customers will have “coffee OR dessert”? • (f) Are the events “had coffee” and “had dessert” mutually exclusive? How do you know? • (g) Given that a customer had coffee, what is the probability that the same customer had dessert? • (h) Are “had dessert” and “had coffee” independent events? How do you know? • (i) Find the conditional probability of having dessert GIVEN that the customer did not have coffee. • (j) Find the conditional probability of having dessert GIVEN that the customer did have coffee. • (k) Based on your analyses above, who is more likely to order dessert, a customer who orders coffee, or one who does not?
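The numeric parts of Question #4 follow from the four cells of a two-way probability table. This is a minimal sketch of that bookkeeping; the variable names are illustrative, and the "neither" cell is inferred from the requirement that the four probabilities sum to 1.

```python
import math

# Cells of the probability table, taken from the question.
p_dessert_only = 0.20
p_coffee_only = 0.40
p_both = 0.30
p_neither = 1 - (p_dessert_only + p_coffee_only + p_both)  # remaining cell

# Marginal probabilities.
p_coffee = p_coffee_only + p_both
p_dessert = p_dessert_only + p_both
p_no_dessert = 1 - p_dessert
p_coffee_or_dessert = 1 - p_neither

# Mutually exclusive events cannot occur together.
mutually_exclusive = math.isclose(p_both, 0.0, abs_tol=1e-12)

# Independent events satisfy P(A and B) = P(A) * P(B).
independent = math.isclose(p_both, p_coffee * p_dessert)

# Conditional probabilities of dessert.
p_dessert_given_coffee = p_both / p_coffee
p_dessert_given_no_coffee = p_dessert_only / (1 - p_coffee)

print(f"P(coffee) = {p_coffee:.2f}, P(no dessert) = {p_no_dessert:.2f}")
print(f"P(neither) = {p_neither:.2f}, P(coffee or dessert) = {p_coffee_or_dessert:.2f}")
print(f"mutually exclusive: {mutually_exclusive}, independent: {independent}")
print(f"P(dessert | coffee) = {p_dessert_given_coffee:.3f}")
print(f"P(dessert | no coffee) = {p_dessert_given_no_coffee:.3f}")
```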
Question #5 • Acorn is the acronym for Association of Community Organizations for Reform Now. • These data were presented by Acorn to a Joint Congressional Hearing on discrimination in lending. Acorn concluded, "Banks generally have exhibited a pervasive pattern of lending practices that have the effect, intended or not, of racial discrimination. Wide disparities in rejection rates for minority and white applicants, even in comparable income groups, were found in all SMA's, and at nearly every institution studied." • The data provided are as follows: • Data: bankdata.txt • Number of cases: 20 • Variable Names: • ·Name of bank • ·MIN = refusal rate for minority applicants • ·WHITE = refusal rate for white applicants • ·HIMIN = refusal rate for high income minority applicants • ·HIWHITE = refusal rate for high income white applicants • Using the data provided and the methods learned in class, write a short argument in support of or disputing Acorn’s claim that banks have exhibited racial discrimination. Use both graphics and text to help make your case.
Question #6 • Research on insider traders who were arrested revealed that 38% of them committed some other white-collar crime. • What is the probability that of the last 100 arrested insider traders, 30 committed another crime?
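Question #6 can be modeled with the binomial distribution, assuming the 100 arrests are independent and each has the same 38% chance of involving another crime, and reading "30 committed another crime" as exactly 30. The helper `binomial_pmf` below is a hypothetical name for a direct implementation of the binomial formula.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n_traders = 100
p_other_crime = 0.38

# Probability that exactly 30 of the 100 committed another crime.
p_exactly_30 = binomial_pmf(30, n_traders, p_other_crime)

# For comparison: probability that at most 30 did.
p_at_most_30 = sum(binomial_pmf(k, n_traders, p_other_crime) for k in range(31))

print(f"P(X = 30)  = {p_exactly_30:.4f}")
print(f"P(X <= 30) = {p_at_most_30:.4f}")
```

With a scientific calculator only, a normal approximation with mean np = 38 and standard deviation sqrt(np(1 - p)) is a common alternative to the exact sum.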
Question #8 • Identify a situation relating to your work or business interests in which statistical sampling might be (or has been) helpful. • (a) Describe the population and indicate how a sample could be chosen. • (b) Identify a population parameter of interest and indicate how a sample statistic could shed light on this unknown. • (c) Explain the concept of the sampling distribution of this statistic for your particular example.
Question #10 • A city decides to determine the mean expenditures per tourist per visit. A random sample of 100 finds that the average expenditure is $800. The standard deviation of expenditures for all tourists is $120. • A) What is the standard deviation of the mean, given that the standard deviation of the whole population is $120 and the number of people sampled is 100? • B) What is a 95% confidence interval for the value of the expenditures per tourist? Provide an interpretation. • C) If the city wants to determine the average expenditure within plus or minus $20, how many people does it need to sample?
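The three parts of Question #10 use the known-sigma formulas: standard error sigma / sqrt(n), interval x-bar ± z * SE, and sample size n = (z * sigma / E)^2 rounded up. The sketch below applies them with the figures from the question; the variable names are illustrative, and the z value comes from the standard library's normal distribution rather than a table.

```python
import math
from statistics import NormalDist

sample_mean = 800.0   # average expenditure in the sample ($)
sigma = 120.0         # known population standard deviation ($)
n = 100               # sample size
confidence = 0.95

# A) Standard deviation of the sample mean (standard error).
standard_error = sigma / math.sqrt(n)

# B) z-based confidence interval for the population mean.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
margin = z * standard_error
ci_lower, ci_upper = sample_mean - margin, sample_mean + margin

# C) Sample size for a tolerance of plus or minus $20, rounded up.
tolerance = 20.0
required_n = math.ceil((z * sigma / tolerance) ** 2)

print(f"A) standard error = {standard_error:.2f}")
print(f"B) 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
print(f"C) required sample size = {required_n}")
```

The interval in part B is for the mean expenditure per tourist, not for the spending of any individual tourist.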
Question #11 • In border towns such as Detroit and Buffalo, Canadian coins frequently end up in business cash registers. Canadian denominations are identical to U.S. denominations, and the coins are virtually identical in size, color, and weight. At present, the exchange rate favors the U.S., and banks encourage their customers to sort out the Canadian coins. • A Buffalo bank has been monitoring the deposits of one of its large customers, a supermarket. The bank has recorded on 45 days the face value of Canadian coins per $100 deposited. For these 45 days, the average amount was $3.46, with a standard deviation of $0.52. Give a 95% confidence interval for the population mean.
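Because the standard deviation in Question #11 comes from the sample, a t-based interval with n - 1 = 44 degrees of freedom is the usual choice. The sketch below hard-codes the critical value t ≈ 2.0154 as read from a t-table (an assumption made here to stay within the standard library; a statistics package could compute it instead).

```python
import math

sample_mean = 3.46   # average Canadian coin value per $100 deposited ($)
sample_sd = 0.52     # sample standard deviation ($)
n = 45               # number of days observed

# Two-sided 95% critical value for 44 degrees of freedom,
# taken from a t-table (assumed value, not computed here).
t_critical = 2.0154

standard_error = sample_sd / math.sqrt(n)
margin = t_critical * standard_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"standard error = {standard_error:.4f}")
print(f"95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

With n = 45 the t value is only slightly larger than the normal value of 1.96, so a z-based interval would be nearly the same width.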