Psyc 235: Introduction to Statistics • DON’T FORGET TO SIGN IN FOR CREDIT! • http://www.psych.uiuc.edu/~jrfinley/p235/
Independent vs. Dependent Events • Independent Events: unrelated events that intersect at chance levels given relative probabilities of each event • Dependent Events: events that are related in some way • So... how to tell if two events are independent or dependent? • Look at the INTERSECTION: P(AB) • if P(AB) = P(A)*P(B) --> independent • if P(AB) ≠ P(A)*P(B) --> dependent
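The intersection test can be tried on a small sample space. The sketch below is only an illustration: the two-dice setup and the event names (`first_is_six`, `second_is_even`, `sum_is_twelve`) are assumptions made for the example, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a true/false test on an outcome)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(a, b):
    """True when P(AB) equals P(A)*P(B), i.e. the events are independent."""
    p_ab = prob(lambda o: a(o) and b(o))
    return p_ab == prob(a) * prob(b)

# Hypothetical events used only for this illustration.
first_is_six = lambda o: o[0] == 6
second_is_even = lambda o: o[1] % 2 == 0
sum_is_twelve = lambda o: o[0] + o[1] == 12

print(independent(first_is_six, second_is_even))  # events on separate dice
print(independent(first_is_six, sum_is_twelve))   # events that share information
```

Exact fractions are used so that the equality check is not thrown off by floating-point rounding.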
Random Variables • Random Variable: • variable that takes on a particular numerical value based on outcome of a random experiment • Random Experiment(aka Random Phenomenon): • trial that will result in one of several possible outcomes • can’t predict outcome of any specific trial • can predict pattern in the LONG RUN
Random Variables • Example: • Random Experiment: • flip a coin 3 times • Random Variable: • # of heads
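As a rough sketch of this example, the code below lists every outcome of the random experiment (3 coin flips) and maps each one to the value of the random variable (# of heads). It assumes a fair coin, so all 8 outcomes are equally likely; the variable names are made up for the illustration.

```python
from collections import Counter
from itertools import product

# Random experiment: flip a coin 3 times (8 equally likely outcomes, assuming a fair coin).
outcomes = list(product("HT", repeat=3))

# Random variable: the number of heads in each outcome.
heads_per_outcome = [outcome.count("H") for outcome in outcomes]

# How many outcomes lead to each value of the random variable.
value_counts = Counter(heads_per_outcome)

for value in sorted(value_counts):
    print(value, "heads:", value_counts[value], "of", len(outcomes), "outcomes")
```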
Random Variables • Discrete vs Continuous • countable (often finite) vs uncountably infinite # possible values • Scales of Measurement • Categorical/Nominal • Ordinal • Interval • Ratio
Data World vs. Theory World • Theory World: Idealization of reality (idealization of what you might expect from a simple experiment) • Theoretical probability distribution • POPULATION • parameter: a number that describes the population. fixed but usually unknown • Data World: data that results from an actual simple experiment • Frequency distribution • SAMPLE • statistic: a number that describes the sample (ex: mean, standard deviation, sum, ...)
So far... • Graphing & summarizing sample distributions (DESCRIPTIVE) • Counting Rules • Probability • Random Variables • one more key concept is needed to start doing INFERENTIAL statistics: SAMPLING DISTRIBUTION
Binomial Situation • Bernoulli Trial • a random experiment having exactly two possible outcomes, generically called “Success” and “Failure” • probability of “Success” = p • probability of “Failure” = q = (1-p) • Examples: • Robot Factory: “Success” = Good Robot, p = .75 • Coin toss: “Success” = Heads, p = .5
Binomial Situation • Binomial Situation: • n: # of Bernoulli trials • trials are independent • p (probability of “success”) remains constant across trials • Binomial Random Variable: • X = # of the n trials that are “successes”
Binomial Situation: collect data! • Population: Outcomes of all possible coin tosses (for a fair coin) • Bernoulli Trial: one coin toss • Success = Heads, p = .5 • Let’s do 10 tosses: n = 10 (sample size) • Binomial Random Variable: X = # of the 10 tosses that come up heads (aka Sample Statistic) • Sample: X = ....
Binomial Distribution: p = .5, n = 10 • This is the SAMPLING DISTRIBUTION of X!
Sampling Distribution • Sampling Distribution: Distribution of values that your sample statistic would take on, if you kept taking samples of the same size, from the same population, FOREVER (infinitely many times). • Note: this is a THEORETICAL PROBABILITY DISTRIBUTION
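Nobody can sample FOREVER, but a simulation with many repeated samples gives a feel for the idea. This is a minimal sketch assuming the coin-toss setup (n = 10, p = .5); the function names, the number of repetitions, and the seed are arbitrary choices for the illustration.

```python
import random
from collections import Counter

def count_heads(n, p, rng):
    """One sample: run n Bernoulli trials and return X = number of successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

def simulate_sampling_distribution(num_samples=10_000, n=10, p=0.5, seed=235):
    """Approximate the sampling distribution of X by taking many samples of size n."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = Counter(count_heads(n, p, rng) for _ in range(num_samples))
    # Proportion of samples that produced each possible value of X.
    return {x: counts[x] / num_samples for x in range(n + 1)}

dist = simulate_sampling_distribution()
for x, proportion in dist.items():
    print(x, round(proportion, 3), "#" * round(proportion * 100))
```

The simulated proportions are DATA WORLD results; as the number of repeated samples grows, they settle toward the theoretical probability distribution.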
Sampling Distribution • Binomial Situation: collect data! • Population: Outcomes of all possible coin tosses (for a fair coin) • Bernoulli Trial: one coin toss • Success = Heads, p = .5 • Let’s do 10 tosses: n = 10 (sample size) • Binomial Random Variable: X = # of the 10 tosses that come up heads (aka Sample Statistic) • Sample: X = 3
Binomial Formula • $P(X = k) = \binom{n}{k} p^k q^{n-k}$ • X: Binomial Random Variable • k: specific # of successes you could get • n - k: specific # of failures • p: probability of success • q: probability of failure • $\binom{n}{k}$: combination called the Binomial Coefficient
Sampling Distribution • Population: Outcomes of all possible coin tosses (for a fair coin) • Binomial Formula with p = .5, n = 10 • p(X=3) = $\binom{10}{3}(.5)^3(.5)^7 = 120/1024 \approx .117$ • Hmm... what if we had gotten X=0?... pretty unlikely outcome... fair coin? • Remember this idea....
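The binomial formula, P(X = k) = (binomial coefficient) × p^k × q^(n-k), is easy to evaluate directly. The sketch below assumes the fair-coin setup (n = 10, p = .5); `binomial_pmf` is a name chosen for this illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): comb(n, k) * p^k * q^(n - k)."""
    q = 1 - p  # probability of failure
    return comb(n, k) * p**k * q**(n - k)

# Fair coin, 10 tosses: probability of exactly 3 heads, and of 0 heads.
p_three = binomial_pmf(3, n=10, p=0.5)
p_zero = binomial_pmf(0, n=10, p=0.5)

print("P(X=3) =", round(p_three, 4))
print("P(X=0) =", round(p_zero, 4))
```

Under a fair coin, X = 0 has probability 1/1024, which is why getting no heads at all would make you wonder about the coin.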
More on the Binomial Distribution • X ~ B(n,p): n and p are the parameters for the sampling distribution of X • Expectation: $np$ • Variance: $npq$ • Std. Dev.: $\sqrt{npq}$ • Ex: # heads in 5 tosses of a coin: X ~ B(5, 1/2) • Expectation = 2.5 • Variance = 1.25 • Std. Dev. ≈ 1.12
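The numbers in the 5-toss example follow from the standard formulas for a binomial random variable: expectation np, variance npq, and standard deviation sqrt(npq). A small sketch (the function name `binomial_summary` is made up for the illustration):

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (expectation, variance, standard deviation) for X ~ B(n, p)."""
    q = 1 - p
    expectation = n * p
    variance = n * p * q
    return expectation, variance, sqrt(variance)

# Number of heads in 5 tosses of a fair coin: X ~ B(5, 1/2).
expectation, variance, std_dev = binomial_summary(5, 0.5)
print(expectation, variance, round(std_dev, 2))
```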
Let’s see some moreBinomial Distributions • What happens if we try doing a different # of trials (n) ? • That is, try a different sample size...
Whoa. • Anyone else notice those DISCRETE distributions starting to look smoother as sample size (n) increased? • Let’s look at a few more binomial distributions, this time with a different probability of success...
Binomial Robot Factory • 2 possible outcomes: Good Robot (90%), Bad Robot (10%) • You’d like to know about how many BAD robots you’re likely to get before placing an order... • p = .10 (... “success”) • n = 5, 10, 20, 50, 100
Normal Approximation of the Binomial • If n is large, then X ~ B(n,p) {Binomial Distribution} can be approximated by a NORMAL DISTRIBUTION with parameters: • mean: $\mu = np$ • standard deviation: $\sigma = \sqrt{npq}$
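One way to see the approximation at work is to compare an exact binomial probability with the matching Normal one, using mean np and standard deviation sqrt(npq). The sketch below uses the robot factory value p = .10. Note one assumption that is not from the slides: it adds 0.5 to k (a continuity correction), a common adjustment when a smooth curve stands in for a discrete distribution.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p), summing the binomial formula."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) with a Normal distribution (mean np, SD sqrt(npq))."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # k + 0.5 is a continuity correction: an added assumption, not from the slides.
    return NormalDist(mu, sigma).cdf(k + 0.5)

p = 0.10  # probability of a BAD robot
for n in (5, 10, 20, 50, 100):
    k = round(n * p)  # compare the two methods near the expected count
    exact = binomial_cdf(k, n, p)
    approx = normal_approx_cdf(k, n, p)
    print(n, k, round(exact, 3), round(approx, 3))
```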
Normal Distributions • (aka “Bell Curve”) • Probability Distributions of a Continuous Random Variable • (smooth curve!) • Class of distributions, all with the same overall shape • Any specific Normal Distribution is characterized by two parameters: • mean: $\mu$ • standard deviation: $\sigma$
[Figure: Normal Distributions with different means; Normal Distributions with different standard deviations]
Standardizing • “Standardizing” a distribution of values results in re-labeling & stretching/squishing the x-axis • useful: gets rid of units, puts all distributions on same scale for comparison • HOWTO: • simply convert every value to a Z SCORE: $z = \frac{x - \mu}{\sigma}$
Standardizing • Z score: $z = \frac{x - \mu}{\sigma}$ • Conceptual meaning: • how many standard deviations from the mean a given score is (in a given distribution) • Any distribution can be standardized • Especially useful for Normal Distributions...
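Standardizing can be sketched in a few lines: subtract the mean from each value and divide by the standard deviation. The scores below are hypothetical, and the sketch assumes the list is the whole distribution (so it uses the population standard deviation, `pstdev`).

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert every value to a z score: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; assumes values are the whole distribution
    return [(x - mu) / sigma for x in values]

scores = [62, 70, 75, 81, 92]  # hypothetical exam scores, for illustration only
z = z_scores(scores)

for score, z_value in zip(scores, z):
    print(score, round(z_value, 2))
```

After standardizing, the values have mean 0 and standard deviation 1, whatever units the original scores were in.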
Standard Normal Distribution • has mean: $\mu = 0$ • has standard deviation: $\sigma = 1$ • ANY Normal Distribution can be converted to the Standard Normal Distribution...
[Figure: Standard Normal Distribution]
Normal Distributions & Probability • Probability = area under the curve • intervals • cumulative probability • [draw on board] • For the Standard Normal Distribution: • These areas have already been calculated for us (by someone else)
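The pre-calculated areas are usually looked up in a table, but they can also be computed. This sketch uses `NormalDist` from Python's standard `statistics` module as a stand-in for the table; the helper name `area_between` and the cutoffs are choices made for the illustration.

```python
from statistics import NormalDist

# Standard Normal Distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

def area_between(a, b):
    """Probability that Z falls in the interval from a to b (area under the curve)."""
    return Z.cdf(b) - Z.cdf(a)

# Cumulative probability: area to the left of a z score.
print("P(Z < 1.96)   =", round(Z.cdf(1.96), 4))

# Interval probability: area within 1 standard deviation of the mean.
print("P(-1 < Z < 1) =", round(area_between(-1, 1), 4))
```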
Standard Normal Distribution So, if this were a Sampling Distribution, ...
Next Time • More different types of distributions • Binomial, Normal • t, Chi-square • F • And then... how will we use these to do inference? • Remember: biggest new idea today was: • SAMPLING DISTRIBUTION