BU255: Statistics Exam-AID By: Ryan Pink Some images used from course slides
Agenda • Chapter 2-8 • Go through them all • Show you the formulas • Use examples for each • Answer any questions you have • Leave you with a sick package • Then, tell your friends to come support!
Chapter 2 • What is statistics?: • A way of getting information from data • Is the science of estimating info about a POP based on analysis from a SAMPLE. • Population vs Sample • POP: complete set • SAM: subset of the POP • We make estimates or inferences about the POP from the sample data.
Chapter 2 • Parameter and Statistics • PARA: describes the population, e.g. pop. mean (μ) or pop. variance (σ²) • STAT: describes a sample; it is an estimate of the population parameter, e.g. sample mean (x̄) or sample variance (s²)
Chapter 2 • Descriptive Statistics: Uses data collected on a group to describe or reach conclusions on that same group. • Inferential Statistics: Uses data collected on a sample to describe or reach conclusions on the population that the sample represents. • Types of Data: • Nominal • Ordinal • Interval
Chapter 2 • NOMINAL: can only be used to classify or categorize • Frequency: how many times did it occur? • Relative Frequency: what percentage of the time did it occur? • Only Pie Graphs and Bar graphs
Chapter 2 • ORDINAL: can be used to rank or order objects • Nominal and Ordinal level data are referred to as nonmetric or qualitative data
Chapter 2 • INTERVAL: distances b/w numbers have meaning • Can draw Histograms, to get probability, proportions. • e.g. average daily temperature or change in stock price • Skewness: a distribution lacks symmetry • [Figure: histogram shapes: bimodal, negatively skewed, positively skewed]
Chapter 2 • INTERVAL: • Relationships between two interval variables: • SCATTER DIAGRAM • We are interested in 1) Linearity and 2) Direction
Chapter 4 • Measure of Central Location • Mean, Median, Mode • Measure of Variability • Range, Standard Deviation, Variance, Coefficient of Variation. • Measure of Linear Relationship • Covariance, Correlation, Coefficient of determination, Least Squares Line
Chapter 4 • Measure of Central Location • Arithmetic Mean • Only for Interval data • Simple Average • 1, 1, 1, 4, 4, 7, 7, 10, 30 • Sum = 65, n = 9. Mean = 65/9 = 7.22 • Median • Value that falls in the middle of the set • 1, 1, 1, 4, 4, 7, 7, 10, 30. • Median = 4 • Mode • Most frequent number • 1 was present three times. • NOTATION: N = number in POP • n = number in SAM • μ = mean of POP • x̄ = mean of SAM
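A quick way to check the three measures of central location above is Python's standard `statistics` module. This is only a sketch on the nine values from the slide; the variable names are made up for illustration.

```python
from statistics import mean, median, mode

# The nine values from the slide
data = [1, 1, 1, 4, 4, 7, 7, 10, 30]

avg = mean(data)     # sum / n = 65 / 9
mid = median(data)   # middle (5th) value of the sorted list
most = mode(data)    # most frequent value

print(f"mean = {avg:.2f}, median = {mid}, mode = {most}")
```

Note how the single large value (30) pulls the mean well above the median.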
Chapter 4 • Geometric Mean (diff from Arithmetic) • If you invested in 2006 in RIM ($70), you doubled in 2007 (to $140) and lost your shirt in 2008 (to $50) (more accurately, you lost 64% in 2008 from your 2007 level). • Arithmetic mean = [1.00 + (-.64)] / 2 = 18% • BUT WRONG! (because you went from $70 down to $50..) • Geometric mean: Rg = [(1 + R1)(1 + R2)]^(1/2) - 1 • R1 = 100% (OR 1) • R2 = -64% (OR -.64) • Rg = [(1 + 1)(1 - .64)]^(1/2) - 1 = √.72 - 1 = -.15 = -15% • (your annual return is a loss of 15% - DON’T MESS THIS UP!)
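The RIM example can be checked numerically. The sketch below uses the two returns quoted on the slide (+100% and -64%) and compares the arithmetic and geometric means; the variable names are illustrative only.

```python
import math

# Yearly returns from the slide: +100% in 2007, -64% in 2008
returns = [1.00, -0.64]

# Arithmetic mean: simple average of the returns (misleading here)
arithmetic = sum(returns) / len(returns)

# Geometric mean: n-th root of the product of growth factors (1 + R), minus 1
growth = math.prod(1 + r for r in returns)
geometric = growth ** (1 / len(returns)) - 1

print(f"arithmetic = {arithmetic:.2%}, geometric = {geometric:.2%}")
```

The geometric mean comes out at a loss of roughly 15% per year, which matches going from $70 down to about $50 over two years.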
Chapter 4 • Measures of Variability • Measures spread • Range: difference between largest and smallest, but doesn’t tell you anything about the points in between. • Calculating variability by the sum of deviations does not work: with a mean of 10 and points 0, 10 and 20, the deviations (0-10) + (10-10) + (20-10) sum to 0; with a mean of 10 and points 9, 10, 11 the data is MUCH tighter, but the deviations still sum to 0. So square the deviations first. • VARIANCE FOR SAMPLE: s² = Σ(xi - x̄)² / (n - 1) • VARIANCE FOR POP: σ² = Σ(xi - μ)² / N
Chapter 4 • Standard Deviation • Square root of the variance • Used to compare variability in several pop’s and to make statements about the general shape of a dist. • EMPIRICAL RULE: 1 stdev encompasses about 68% of points • 2 stdev’s about 95% and 3 about 99.7% • CHEBYSHEFF’s THEOREM: k stdev’s (k > 1) encompass at least 1 - 1/k² of points (so for k = 2: 1 - (1/2)² = .75 or 75%) • DIFFERENCE: Empirical Rule is about NORMAL distributions, if NOT NORMAL (or if you don’t know), use Chebysheff to be safe!
Chapter 4 • Example: • The midterm average of those who attended an SOS session is 80 with a standard deviation of 5 marks. If the dist is normal, what range would include 95% of all marks? • Empirical, 2 stdev’s, so 70 - 90 • What range would include at least 88.9% of marks if the dist was not normal? • Chebysheff: 2 stdev’s is 75%, 3 is 88.9% (try it!) • SO a range of 65 – 95 would include at least 88.9% of marks.
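The "try it!" above can be done in a few lines. This sketch assumes the standard form of Chebysheff's bound, 1 - 1/k², and uses the mean (80) and stdev (5) from the example; the function names are hypothetical helpers, not anything from the course.

```python
def chebyshev_bound(k):
    """Minimum share of points within k stdevs of the mean (any distribution)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def interval(mean, stdev, k):
    """Range covering k stdevs on each side of the mean."""
    return (mean - k * stdev, mean + k * stdev)


mean_mark, stdev_mark = 80, 5  # values from the SOS example

print(interval(mean_mark, stdev_mark, 2))   # empirical rule: about 95% if normal
print(round(chebyshev_bound(2), 3))         # at least 75% for any distribution
print(round(chebyshev_bound(3), 3), interval(mean_mark, stdev_mark, 3))
```

Chebysheff gives a guaranteed minimum, so the same 2-stdev range that holds about 95% of a normal distribution is only promised to hold 75% in general.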
Chapter 4 • Measure of Linear Relationship: • Three ways to infer strength and direction • Covariance • Coefficient of Correlation • Coefficient of Determination
Chapter 4 • Covariance • If sets are positively related, then positive. • If sets are negatively related, then negative. • If no real relationship, then around 0 • Sample: sxy = Σ(xi - x̄)(yi - ȳ) / (n - 1) • Population: σxy = Σ(xi - μx)(yi - μy) / N
Chapter 4 • Coefficient of Correlation • Covariance says ‘what is the relationship?’ (+ or –) • Coeff. of Corr. says ‘how strong is that relationship?’ Is it really closely linked (close to 1 or -1) or is it a weak relationship (around 0)? • r = sxy / (sx × sy)
Chapter 4 • Coefficient of Correlation: • If -1, 0, or 1 you can definitely indicate the relationship between the two (perfectly +ve etc) • But for all the others between, you don’t know the exact amount that they are affected by each other • Coefficient of Determination! • Measures the amount of variation in the dependent variable that is explained by the variation in the independent variable. • Denoted by R², so just square the coefficient of correlation.
EXAMPLE • How much of Obama’s change in popularity is directly attributed to the length of SNL skits of Palin? (assume normal) • END GOAL: NEED Coefficient of Determination!! • To get that: need Coefficient of Correlation • To get that: need Covariance and both standard deviations • To get that: need variance
Chapter 4 • Length mean = 35 / 5 = 7 mins • Obama mean = 220 / 5 = 44% of votes • Covariance (sample): sxy = [(-4)(-9) + (-1)(-2) + (-3)(-6) + (1)(6) + (7)(11)] / (n - 1) = 139 / 4 = 34.75 • Length variance: sx² = [(-4)² + (-1)² + (-3)² + (1)² + (7)²] / (n - 1) = 76 / 4 = 19 • Obama variance: sy² = [(-9)² + (-2)² + (-6)² + (6)² + (11)²] / (n - 1) = 278 / 4 = 69.5
Chapter 4 • sxy = 34.75 • sx = √sx² = √19 = 4.36 (length stdev) • sy = √69.5 = 8.34 (Obama stdev) • r = 34.75 / (4.36 × 8.34) = 0.956 (between -1 and 1) • We know that there is a strong relationship, but since it is not exactly 1, how much of the variation is explained by the length of Palin’s skits? Coefficient of Determination! • R² = 0.956² = .91 • About 91% of the variation in Obama’s % is explained by the length of Palin’s skits. • BOTTOM LINE: See if you can get Palin’s skits extended by any means necessary!!
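The whole chain (variance, covariance, correlation, determination) can be reproduced in code. The slides only show the means and the deviations, so the raw values below are an assumption: they are reconstructed as mean + deviation (skit lengths 3, 6, 4, 8, 14 and support 35, 42, 38, 50, 55). Because nothing is rounded along the way, r comes out near 0.956, which can differ slightly from a hand calculation that rounds the standard deviations first.

```python
import math

# ASSUMPTION: raw data rebuilt from the slide's means (7 and 44) plus deviations
length = [3, 6, 4, 8, 14]        # skit length in minutes
support = [35, 42, 38, 50, 55]   # Obama's % of votes

n = len(length)
x_bar = sum(length) / n
y_bar = sum(support) / n

# Sample covariance and sample variances (divide by n - 1)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(length, support)) / (n - 1)
s_x2 = sum((x - x_bar) ** 2 for x in length) / (n - 1)
s_y2 = sum((y - y_bar) ** 2 for y in support) / (n - 1)

# Coefficient of correlation and coefficient of determination
r = s_xy / (math.sqrt(s_x2) * math.sqrt(s_y2))
r_squared = r ** 2

print(f"s_xy = {s_xy}, s_x^2 = {s_x2}, s_y^2 = {s_y2}")
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```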
Chapter 4 • Least Squares Method • The objective of the scatter diagram is to measure the strength and direction of the linear relationship • Both can be more easily judged by drawing a straight line through the data. • How to draw that line? LSM! • This line has the smallest sum of squared distances to all the points on the plot.
Chapter 4 • LSM: it creates a line ŷ = b0 + b1x, with slope b1 = sxy / sx² and intercept b0 = ȳ - b1x̄. • You calculate b1, then sub in the mean values of x and y to solve for b0, and then write out the line. • OBAMA EXAMPLE: • b1 = 34.75 / 19 = 1.83 (divide by the variance, not the stdev) • b0 = 44 - (1.83)(7) = 31.2 • FINAL LINE: ŷ = 31.2 + 1.83x • At 0 mins, he has 31.2%, and with every extra minute of her skit, Obama gains 1.83 percentage points of support.
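The least squares line for the Obama example can be sketched directly from the summary numbers already computed on the slides (covariance 34.75, length variance 19, means 7 and 44). The `predict` helper is a made-up name for illustration.

```python
# Summary statistics taken from the Obama example
s_xy = 34.75             # sample covariance
s_x2 = 19.0              # sample variance of skit length (variance, not stdev)
x_bar, y_bar = 7.0, 44.0

# Slope and intercept of the least squares line
b1 = s_xy / s_x2
b0 = y_bar - b1 * x_bar


def predict(minutes):
    """Predicted % of votes for a skit of the given length."""
    return b0 + b1 * minutes


print(f"y-hat = {b0:.1f} + {b1:.2f}x")
print(f"predicted support for a 10 minute skit: {predict(10):.1f}%")
```

A useful sanity check is that the line always passes through the point of means: plugging in 7 minutes gives back 44%.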
CHAPTER 5 • 1. Data Collection • 1a. Published data • 1b. Observational and Experimental data • 1c. Surveys • 1d. Sampling • 2. Sampling Methods • 2a. Non-probability sampling • 2b. Probability Sampling • 3. Errors • 3a. Sampling Errors • 3b. Non-sampling Errors
Chapter 5 • Reliability and accuracy depend on the method of collection, and affect the validity of the results. • Three most popular sources: • Published data (revenue can) • PRIMARY = done yourself! • SECONDARY = taking from another source • Observational studies • Uncontrolled recording of results • Experimental studies • Recording of results while controlling factors
Chapter 5 • Survey • Solicit info from people • Personal / Phone / self-administered • Sampling • Why Sampling: • Lower Cost • Impossible population size • Possible destructive nature of the sampling process • Probability Sampling: 100% random selection • Non-probability sampling: selecting on researcher's judgment that they are representative.
Chapter 5 • Sampling: • Three Different Types: • Simple Random Sampling: Assign numbers, generate random numbers and sample! • Stratified Random Sampling: classify the pop into strata and then select randomly within each (age, education, race, province..) • Can get info about the whole pop, about relationships between strata and within each stratum! • Cluster Sampling: if you can’t get a full pop list, or they are hard to question, then take a cluster (GTA, or people on facebook) and sample them • Issue: may increase sampling error due to similarities in cluster!
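The first two sampling methods above can be sketched with Python's standard `random` module. Everything about the data here is hypothetical: a made-up list of 30 numbered students tagged with a province as the stratum, and helper names invented for this example.

```python
import random


def simple_random_sample(population, n, rng):
    """Pick n units at random, without replacement, from the whole population."""
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Split the population into strata, then sample randomly within each one."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


rng = random.Random(255)  # fixed seed so the sketch is repeatable

# HYPOTHETICAL population: (student number, province)
students = [(i, "Quebec" if i % 3 == 0 else "Ontario") for i in range(1, 31)]

srs = simple_random_sample(students, 6, rng)
strat = stratified_sample(students, lambda s: s[1], 3, rng)

print(srs)
print(strat)
```

The stratified version guarantees every province shows up in the sample, which a simple random sample of the same size does not.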
Chapter 5 • Sampling Errors: • When the distribution of the sample is not the same as the population (means or stdev are different) • INCREASE SAMPLE SIZE to minimize this error!! • Non-sampling Error: • Mistakes made in data acquisition. • Inc sample size does NOT fix this. • 3 types: • Error in Data Acquisition • Non-Response Errors • Selection Bias
Chapter 6 • Introduction to Probability • Assigning Probabilities • Basic Relationships of Events • Joint, Marginal, Conditional Probability • Rules
Probability • Assigning Probabilities • Classical: assume equally likely and independent. • Rolling dice (1/6 chance) • Relative Frequency: assigning probabilities on experimental or historic data. • Forecasting based on previous demand. If you sold 1 computer 20% of all working days, use that going forward. • Subjective: assign on assignor’s judgment • When historic measures aren’t good enough, often used in conjunction with benchmarks. (WEATHER FORECASTING!) • Theoretical: use known probabilities. • Based on a calculated probability (like arrivals at Tim Horton’s in queueing theory)
Events • 4 different types of events: • Complement of an Event • Union of Two Events • Intersection of Two Events • Mutually Exclusive Events
Chapter 6 • Joint Probability • Intersection of two events. • P(A and B) • Question: Odds you passed and you came to an SOS session? P(Pass and SOS)
Marginal Probability • The summation of a particular event • Add up each row and column (make new r/c) • P(A1) = P(A1 and B1) + P(A1 and B2) • Question: Probability that you will pass the exam?
Conditional Probability • The probability of an event GIVEN another event • P(A | B) = P(A and B) / P(B) • Question: Probability that you passed given you came to an SOS session? • P(passed | attended SOS) = P(passed and attended) / P(attended) • .40 / .45 = 88.9%
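Joint, marginal and conditional probability all come from the same table, which a short sketch can make concrete. The slides quote P(pass and SOS) = .40, P(SOS) = .45 and P(pass) = 85%; the other three cells below are an assumption, filled in so that the table is consistent with those figures.

```python
# Joint probability table: (exam result, SOS attendance) -> probability.
# ASSUMPTION: only the .40 cell is given; the rest are derived so that
# P(SOS) = .45 and P(pass) = .85 as quoted in the slides.
joint = {
    ("pass", "sos"): 0.40,
    ("pass", "no_sos"): 0.45,
    ("fail", "sos"): 0.05,
    ("fail", "no_sos"): 0.10,
}


def marginal(table, index, value):
    """Add up every cell whose row (index 0) or column (index 1) matches value."""
    return sum(p for key, p in table.items() if key[index] == value)


def conditional(table, result, attendance):
    """P(result | attendance) = P(result and attendance) / P(attendance)."""
    return table[(result, attendance)] / marginal(table, 1, attendance)


p_pass = marginal(joint, 0, "pass")
p_sos = marginal(joint, 1, "sos")
p_pass_given_sos = conditional(joint, "pass", "sos")

print(f"P(pass) = {p_pass:.2f}, P(SOS) = {p_sos:.2f}")
print(f"P(pass | SOS) = {p_pass_given_sos:.1%}")
```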
RULES • No empty set • The probability of A is 1 minus the probability of its complement: P(A) = 1 - P(Aᶜ) • Union is all of A + all of B, subtract what they have in common (don’t double count!): P(A or B) = P(A) + P(B) - P(A and B) • If A and B are mutually exclusive (no touching of circles) then it is just P(A or B) = P(A) + P(B) • If A is a subset of B, then P(A) ≤ P(B).
RULES • Independent Events • Events A and B are independent if P(A|B) = P(A) • Example: there is a 30% chance that it is going to rain on your exam day. • Question: Probability that you passed given that it rained? • Passing and rain are independent (rain does not affect your mark), so P(passed | rained) = P(passed) = 85%
Bayes Theorem • Start with your initial or prior probabilities. • You get new info. • So now with new info, you calculate revised or posterior probabilities • This process is Bayes Theorem
Bayes Theorem • Bayes’ theorem is applicable when the events for which we want to compute posterior probabilities are mutually exclusive and their union is the entire sample space • Conditional Probability: P(Ai | B) = P(Ai) × P(B | Ai) / P(B) • KEY DIFFERENCE: on the bottom you are now adding up all the partitions that contain B, since you have them all split up: P(B) = P(A1) × P(B | A1) + P(A2) × P(B | A2) + … + P(Ak) × P(B | Ak)
Bayes Theorem • Example: • Two printer cartridge companies, Alamo and Jersey. • Alamo makes 65% of the cartridges • Jersey makes 35%. • Alamo has a defective rate of 8% • Jersey has a defective rate of 12% • a) A customer purchases a cartridge. What is the probability that Alamo made it? • The cartridge is tested, and it is defective (new info). • b) What is the probability that Alamo made the cartridge? • c) What is the probability that Jersey made the cartridge?
ANSWER • The knowledge of the producer breakdown is the prior probability: • Alamo = 65% P(E1) • Jersey = 35% P(E2) • We know the conditional probabilities of the defective rates: • Alamo = 8% P(D|E1) • Jersey = 12% P(D|E2)
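ANSWER 1: TABLE
Event | Prior P(Ei) | Conditional P(D|Ei) | Joint P(Ei and D) | Revised P(Ei|D)
Alamo | .65 | .08 | .052 | .052 / .094 = .553
Jersey | .35 | .12 | .042 | .042 / .094 = .447
Total | 1.00 | | P(D) = .094 | 1.000
• Joint (.052): odds of getting an Alamo cartridge that is defective if you bought one at Futureshop at random. • Revised: given that you got a defective cartridge, since there is a 9.4% chance of getting a defective one, and 5.2 points of that 9.4% are Alamo’s, you have a 55.3% chance of it being Alamo’s!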
ANSWER 2: TREE
Alamo (.65)
  Defective (.08): .65 × .08 = .052
  Acceptable (.92): .65 × .92 = .598
Jersey (.35)
  Defective (.12): .35 × .12 = .042
  Acceptable (.88): .35 × .88 = .308
P(Defective) = .052 + .042 = .094
Revised Probability: Alamo = .052 / .094 = .553
Revised Probability: Jersey = .042 / .094 = .447
ANSWER 3: FORMULA • Chance a defective cartridge will be an Alamo: • P(Alamo | D) = P(Alamo) × P(D | Alamo) / [P(Alamo) × P(D | Alamo) + P(Jersey) × P(D | Jersey)] • Top: probability of an Alamo (.65) × probability of defective given an Alamo (.08) • Bottom: the summation of all the cartridge types × their defective probability (in total, how many defective ones are there?) = .094 • P(Alamo | D) = (.65 × .08) / [(.65 × .08) + (.35 × .12)] • P(Alamo | D) = .052 / .094 = 55.3%
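The cartridge example can also be checked in code. This sketch multiplies each prior by its defective rate, adds the results to get P(D), and divides to get the revised probabilities; the `posterior` function name is invented for this example.

```python
# Prior probabilities and defective rates from the cartridge example
priors = {"Alamo": 0.65, "Jersey": 0.35}
defect_rates = {"Alamo": 0.08, "Jersey": 0.12}


def posterior(priors, likelihoods):
    """Bayes' theorem over a set of mutually exclusive, exhaustive events."""
    # Joint probability of each producer AND a defective cartridge
    joint = {name: priors[name] * likelihoods[name] for name in priors}
    # Total probability of a defective cartridge (the bottom of the formula)
    total = sum(joint.values())
    revised = {name: p / total for name, p in joint.items()}
    return revised, total


revised, p_defective = posterior(priors, defect_rates)

print(f"P(D) = {p_defective:.3f}")
for name, p in revised.items():
    print(f"P({name} | D) = {p:.3f}")
```

The revised probabilities always sum to 1, which is a quick check that every partition was included on the bottom.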
CHAPTER 7 1. Random Variables and Probability • Distributions: Introduction 2. Discrete Probability Distributions • A. Introduction • B. Mean and Variance • C. Laws of Mean and Variance 3. Bivariate Distributions • A. Introduction, Marginal probability distribution • B. Mean, Variance, covariance, coefficient of correlation • C. Conditional probability, independence • D. Laws of summation
Random Variable • Random variable definition: a variable that contains the outcomes of a chance experiment. • Two types: • Discrete Random Variable • Countable number of values (students in a class) • Continuous Random Variable • Takes on an uncountable number of possible outcomes • Time in 100m sprint (could be 9.5s, or 9.51s, or 9.519s…)
Discrete Prob. Distributions • Table / Graph that lists all the outcomes and their probabilities = Discrete Prob. Dist. • You can calculate the prob of a certain outcome • P(x) • RULES: • P(x) MUST be between 0 and 1 • Sum of all P(xi) = 1
Continuous Prob. Distribution • This represents a population (since infinite amount of outcomes), and need to calculate parameters to depict distribution: • Need Pop mean and Pop variance. • Population Mean (using discrete variables to determine parameter about pop): μ = E(X) = Σ x × P(x) • Population Variance (using discrete variables to determine parameter about pop): σ² = V(X) = Σ (x - μ)² × P(x) OR σ² = Σ x² × P(x) - μ²
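As a sketch of how the population mean and variance are computed from a discrete probability distribution, the code below uses a made-up distribution (number of computers sold per day, with hypothetical probabilities). It checks the two rules for P(x) first, then computes the variance both ways to show the "OR" shortcut gives the same answer.

```python
# HYPOTHETICAL distribution: computers sold per day -> probability
dist = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.3}

# Rules for a discrete distribution: each P(x) in [0, 1], and they sum to 1
assert all(0 <= p <= 1 for p in dist.values())
assert abs(sum(dist.values()) - 1.0) < 1e-9

# Population mean: sum of x * P(x)
mu = sum(x * p for x, p in dist.items())

# Population variance, definition: sum of (x - mu)^2 * P(x)
var_definition = sum((x - mu) ** 2 * p for x, p in dist.items())

# Population variance, shortcut: sum of x^2 * P(x), minus mu^2
var_shortcut = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2

print(f"mean = {mu:.2f}")
print(f"variance = {var_definition:.2f} (definition), {var_shortcut:.2f} (shortcut)")
```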