Statistics Trivial Pursuit (Sort of) For Review (Math 17)
Colors and Categories • Blue – Basic Graphs and Descriptive Statistics • Pink – Assumptions (cumulative) • Yellow – Statistical Theory and History • Brown – Interpretations • Green – Last 1/3 Inference • Orange – Other Hypothesis Testing Related
Blue 1 • What are the descriptive statistics that are sensitive to outliers?
Blue 2 • Provide the name and primary purpose of this graph.
Blue 3 • Provide a basic description of the distribution of this variable from its graph (remember there are 3 things to describe).
Blue 4 • What are the descriptive statistics used in the creation of a boxplot?
Blue 5 • Name the rule used to compute outliers, and describe how to apply it.
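For reference, the rule usually taught for this is the 1.5 × IQR rule: values more than 1.5 IQRs below Q1 or above Q3 are flagged. A minimal sketch follows, assuming made-up data and the "inclusive" quartile convention from Python's standard library. Textbooks and software compute quartiles in slightly different ways, so the fences can shift a little.

```python
from statistics import quantiles


def iqr_outliers(data):
    """Flag values more than 1.5 * IQR below Q1 or above Q3."""
    # quantiles() sorts the data itself; n=4 returns Q1, median, Q3.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return low_fence, high_fence, outliers


sample = [2, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data, not from the slides
low_fence, high_fence, outliers = iqr_outliers(sample)
print(low_fence, high_fence, outliers)  # -0.5 13.5 [30]
```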
Blue 6 • Name graphs that are appropriate to display categorical variables, and state whether or not you should discuss the shape of distributions based on those graphs.
Blue 7 • Compare/contrast these 2 distributions based on the plot.
Blue 8 • A sample of measurements in feet has a mean of 29.2 feet and a standard deviation of 3.4 feet. Interpret the standard deviation.
Blue 9 • This plot is part of the preliminary analysis for ….
Blue 10 • If the distribution of a particular variable had a high outlier and that outlier were removed, which descriptive statistics are likely (or certain) to change noticeably?
Pink 1 • What is the assumption that all chi-square tests have in common?
Pink 2 • What is the assumption related to sample sizes for a 2 sample z-test?
Pink 3 • What is the assumption related to sample size when constructing a confidence interval for p?
Pink 4 • What are the specifics of the nearly normal condition for a paired t-test?
Pink 5 • What are the specifics of the nearly normal condition for ANOVA?
Pink 6 • What are the specifics of the 2 assumptions in regression related to error terms?
Pink 7 • You are told that the randomization and independence condition is met for a sample of high school students who were asked how much money they received for their most recent birthday. Describe what the randomization and independence assumption means in this context.
Pink 8 • What are some example tests where assumptions related to normality are NOT required?
Pink 9 • What are the specifics of the nearly normal condition for a 2-sample t-test?
Pink 10 • What is the assumption that all tests/CIs have in common but which (since it is common to all) Prof. Wagaman doesn’t require that you write down when you list assumptions?
Yellow 1 • What is a sampling distribution for a statistic? (conceptually)
Yellow 2 • (Fill in at least 3 of the blanks for credit) The t distribution was discovered by ___________ who published under the pseudonym ____________. He discovered the t distribution while working for _____________ in Ireland. Specifically he was working in the field of ______________ (2 words, but one blank) and was primarily responsible for checking out _________, one of their many products.
Yellow 3 • What does the Central Limit Theorem say?
Yellow 4 • How are z-scores computed, and what are they useful for? (variety of answers)
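As a reminder of the computation, a z-score is the observation minus the mean, divided by the standard deviation. The sketch below uses illustrative numbers (an observation of 35 with mean 29.2 and standard deviation 3.4), which are assumptions and not part of this question.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Illustrative values only (assumed for this sketch).
z = z_score(35.0, 29.2, 3.4)
print(round(z, 2))  # about 1.71 standard deviations above the mean
```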
Yellow 5 • When sampling distributions have standard deviations that involve unknown parameters, and we plug in estimates for those parameters, we obtain what value(s)?
Yellow 6 • Suppose 2 random variables X and Y are independent. X has mean 6 and standard deviation 3. Y has mean 14 and standard deviation 4. • What are the values of the mean and standard deviation of X+Y?
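A quick check of the arithmetic, as a sketch: means add, and for independent variables the variances (not the standard deviations) add. The variable names are chosen for illustration.

```python
import math

mean_x, sd_x = 6, 3
mean_y, sd_y = 14, 4

# Means always add.
mean_sum = mean_x + mean_y

# Variances add only because X and Y are independent; take the square root at the end.
sd_sum = math.sqrt(sd_x**2 + sd_y**2)

print(mean_sum, sd_sum)  # 20 5.0
```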
Yellow 7 • What are the differences between a chi-square test of homogeneity and a chi-square test of independence?
Yellow 8 • What are the three types of bias in sampling?
Yellow 9 • If you are designing an experiment and you have 3 different drugs you want to try, and you want to try them at 2 different doses each (1 pill or 2 pills daily), and you want to include (a) placebo group(s), how many treatments are there in your experiment?
Yellow 10 • Name and describe two different sampling techniques.
Brown 1 • Running a hypothesis test for slope equal to 0 or not, you obtain a t-test statistic value of -2.14. Interpret this test statistic.
Brown 2 • A linear regression results in an R-squared value of 0.81. Assuming linear regression was appropriate, interpret this R-squared in terms of general X and Y variables.
Brown 3 • A random sample of n=16 observations yields an s=24 (sample standard deviation). What is the numerical value of the standard error of the sample mean? Also, interpret this value.
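The numerical part can be sketched as below, using the usual formula for the standard error of a sample mean, s divided by the square root of n. The interpretation is still left to you.

```python
import math

n = 16  # sample size from the question
s = 24  # sample standard deviation from the question

# Standard error of the sample mean: s / sqrt(n).
standard_error = s / math.sqrt(n)

print(standard_error)  # 6.0
```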
Brown 4 • Describe what is wrong with the statement: • “A p-value is the probability that the null hypothesis is true.”
Brown 5 • A 95% confidence interval for the mean weight of a new dog breed is (25.2, 34.6) pounds. Interpret the confidence interval given here.
Brown 6 • A regression results in an s_e value of 3.46. The y-axis goes from 36 to 109. What does the s_e value represent, and what does it tell you about how well the regression does?
Brown 7 • A p-value for an ANOVA testing for equality of 5 means with an F of 24.56 is .0359. Interpret this p-value.
Brown 8 • A 95% confidence interval for the mean weight of a new dog breed is (25.2, 34.6) pounds. Interpret the confidence level used here.
Brown 9 • A conclusion in a t-test of mu=150 vs. mu>150 is given as: Our evidence is not inconsistent with our null hypothesis. How should this conclusion be changed to be correct?
Brown 10 • A p-value for a two-sided two sample z-test is .1470 based on a Z of 1.45. Interpret this p-value.
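To see where a two-sided p-value like this comes from, here is a small sketch using the standard normal distribution in Python's standard library. It doubles the upper-tail area beyond |Z|, and it only reproduces the number; the interpretation is the actual question.

```python
from statistics import NormalDist

z = 1.45  # test statistic from the question

# Two-sided p-value: twice the area in the tail beyond |z| under the standard normal.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(p_value, 3))  # 0.147
```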
Green 1 • Which set(s) of graphs indicate it would NOT be appropriate to perform an ANOVA? Explain.
Green 2 • You want to know if the distribution of class year among Reunion workers is equally split among first-years, sophomores, and juniors. What test is appropriate? • (Note: I am assuming that seniors can’t get hired to work Reunion. If they can, change this to equally split among all four class years.)
Green 3 • An ANOVA where the null hypothesis is rejected results in the following multiple comparisons:
      Estimate        lwr        upr
2-1   4.146737  -2.737867  11.031342
3-1  -3.742933 -10.627537   3.141671
3-2  -7.889670 -14.774274  -1.005066
Summarize what these multiple comparisons show.
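One way to read output like this is to check which intervals for a difference in means exclude 0. A minimal sketch, with the three intervals copied from the question and the dictionary layout assumed for illustration:

```python
# Pair -> (estimate, lower bound, upper bound), copied from the output above.
comparisons = {
    "2-1": (4.146737, -2.737867, 11.031342),
    "3-1": (-3.742933, -10.627537, 3.141671),
    "3-2": (-7.889670, -14.774274, -1.005066),
}

# An interval that excludes 0 suggests those two group means differ.
differing_pairs = [
    pair for pair, (_estimate, lwr, upr) in comparisons.items()
    if lwr > 0 or upr < 0
]

print(differing_pairs)  # ['3-2']
```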
Green 4 • If you wanted to know whether or not there is a significant association between heart rate and weight in rats, what statistical procedure would you perform?
Green 5 • You want to compare the means of 4 groups. Describe why you would want to do an ANOVA rather than 6 t-tests to compare all pairs of means.
Green 6 • You want to know if there is an association between t-shirt size (S,M,L,etc.) and class year at Amherst. What is the appropriate statistical procedure to perform?
Green 7 • You want to know if a higher proportion of underclassmen have corrective lenses compared to upperclassmen. Explain why there is no appropriate chi-square test for this situation. What analysis could you run?
Green 8 • A balanced ANOVA is an ANOVA where….