Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data
Agenda • Wrap-up of experimental methods • Intro to probability • Examining data through univariate statistics
Generalizability (external validity) in Experiments • Threats to external validity always involve an interaction of the treatment group with some other factor. • Threats usually fall into 3 types: • Setting • Population • History
Three threats to generalizability in experiments • Setting • Physical and social context of the experiment • Population • Is there something specific about the sample that interacts with the treatment? • History • Is there something about the time that interacts with the treatment?
Why Generalizability is not always a problem • Experiments often are trying to isolate specific causes and effects in controlled settings. Thus, they may not even be claiming to be generalizable to specific settings. • Experimental findings can provide theoretical basis for real-world tests. • It is often a balancing act for research: true causation versus large-scale associational and comparative testing.
Considerations before using experiments • Cost and Effort • Is the effort worth it to test the concepts you are interested in? • Manipulation and Control • Will you actually be able to manipulate the key concept(s)? • Importance of Generalizability • Are you testing theory, or trying to establish a real-world test?
Probability • Are the things that we observe different from what would be expected by chance? • Coin Example
Probability Concepts • Basic Concepts in Probability • Basic Probability Rules • Special Types of Probability • Joint Probabilities • Probabilities of Unions of Events • Conditional Probabilities
Basic Concepts in Elementary Probability • Random Selection • Every possibility has an equal chance of being chosen. • Independence • The probability of an outcome on one trial does not depend on the outcome of any other trial. • Elementary Event • A single possible outcome of a probability experiment • E.g., heads on one coin toss • Sample Space • The complete set of elementary events • E.g., {heads, tails} for one coin toss
Mutually exclusive, exhaustive, events • Mutually exclusive events • Two or more events that cannot occur at the same time. • Exhaustive events • A set of events that accounts for all of the elementary events in the sample space.
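To make these definitions concrete, here is a minimal sketch that enumerates the sample space for two coin tosses and checks whether a set of events is mutually exclusive and exhaustive. The three events (`no_heads`, `one_head`, `two_heads`) are hypothetical examples chosen for illustration, not taken from the lecture.

```python
from itertools import product

# Sample space for two coin tosses: each elementary event is one ordered pair.
sample_space = set(product("HT", repeat=2))

# Hypothetical events, each defined as a subset of the sample space.
no_heads = {e for e in sample_space if e.count("H") == 0}
one_head = {e for e in sample_space if e.count("H") == 1}
two_heads = {e for e in sample_space if e.count("H") == 2}
events = [no_heads, one_head, two_heads]

# Mutually exclusive: no two events share an elementary event.
mutually_exclusive = all(
    a.isdisjoint(b) for i, a in enumerate(events) for b in events[i + 1:]
)

# Exhaustive: together the events cover the whole sample space.
exhaustive = set().union(*events) == sample_space

print(sorted(sample_space))
print(mutually_exclusive, exhaustive)
```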
Basic rules of probability • Multiplication Rule • For independent events, we can multiply the probabilities together to get the probability of all of the events occurring. • Example: Probability of rolling a die twice and getting a 6 on both rolls. • But what happens if the events are not independent? • Example: probability of selecting a club from a deck of cards, then selecting another club (without replacement)?
Multiplication Rule • When we want the probability that two or more events all occur, and the events are independent, the special rule of multiplication gives the joint probability: P(X and Y) = P(X) × P(Y) • When the events are dependent, the general rule of multiplication gives the joint probability: P(X and Y) = P(X) × P(Y|X)
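Both forms of the rule can be worked through with the two examples from the previous slide. This is a small sketch using exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

# Special rule (independent events): a 6 on each of two die rolls.
p_six = Fraction(1, 6)
p_two_sixes = p_six * p_six                 # P(X and Y) = P(X) * P(Y)

# General rule (dependent events): two clubs drawn without replacement.
p_first_club = Fraction(13, 52)             # P(X)
p_second_given_first = Fraction(12, 51)     # P(Y|X): one club and one card are gone
p_two_clubs = p_first_club * p_second_given_first

print(p_two_sixes)   # 1/36
print(p_two_clubs)   # 1/17
```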
Basic rules of probability (continued) • The addition rule • For mutually exclusive events, we can add the probabilities to get the probability of either event occurring. • Example: Rolling a die and getting a 4 or a 6. • Again, what happens if the events are not mutually exclusive?
Addition Rule • When we want the probability that at least one of two events occurs, and the events are mutually exclusive, then: P(X or Y) = P(X) + P(Y) • When the events are not mutually exclusive, then: P(X or Y) = P(X) + P(Y) - P(X and Y) • For example, what is the probability that a card chosen at random from a deck of cards will be either a king or a heart? P(King or Heart) = P(X or Y) = 4/52 + 13/52 - 1/52 = 16/52 ≈ 30.77%
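The king-or-heart example can be checked by enumerating a standard 52-card deck and comparing the addition rule against a direct count of the union. This is a sketch; the helper `prob` is a hypothetical name introduced here.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))          # 52 equally likely cards

kings = {card for card in deck if card[0] == "K"}
hearts = {card for card in deck if card[1] == "hearts"}

def prob(event):
    """Probability of an event under random selection from the deck."""
    return Fraction(len(event), len(deck))

# Addition rule for events that are not mutually exclusive.
p_by_rule = prob(kings) + prob(hearts) - prob(kings & hearts)

# Direct count of the union, for comparison.
p_by_count = prob(kings | hearts)

print(p_by_rule, p_by_count, round(float(p_by_rule), 4))   # 4/13 4/13 0.3077
```

Subtracting P(X and Y) removes the king of hearts, which would otherwise be counted twice.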
Special Types of Probability • Joint Probabilities • Probabilities of Unions of Events • Conditional Probabilities
Joint Probabilities • Probability of obtaining a particular combination of events. • E.g., probability of flipping a coin twice and getting heads both times. Just use the multiplication rule! • P(A and B) = n(A and B) / n(S), where n counts elementary events and S is the sample space • What about non-independent events? • E.g., probability of a given respondent in the class survey being female and having downloaded music before. • P(A and B) = P(A|B) × P(B) • (9/11) × (.579) = .474
Union Probabilities • The union of two events consists of all the elementary events belonging to either of them. • Examples: • Probability of flipping a coin and it being heads or tails (mutually exclusive union). • Events that are not mutually exclusive: probability of being female or having downloaded music before. • P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2) • (.579) + (.842) - (.474) = .947
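The class-survey arithmetic for the joint and union probabilities can be reproduced directly from the figures on the slides. This sketch uses only those figures; the variable names are illustrative, and the underlying survey counts are not assumed.

```python
# Figures as given on the slides.
p_female = 0.579                    # P(B): respondent is female
p_download_given_female = 9 / 11    # P(A|B): downloaded music, given female
p_download = 0.842                  # P(A): respondent has downloaded music

# Joint probability via the general multiplication rule.
p_joint = p_download_given_female * p_female

# Union probability via the addition rule for non-mutually-exclusive events.
p_union = p_female + p_download - p_joint

print(round(p_joint, 3))   # 0.474
print(round(p_union, 3))   # 0.947
```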
Conditional Probability • Probability of an event occurring given that another event has occurred. • Written P(A|B) = P(A and B) / P(B), provided P(B) > 0. • Example: 3 Doors Problem
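Assuming the "3 Doors Problem" refers to the classic Monty Hall puzzle (a prize behind one of three doors, and a host who always opens an empty door the contestant did not pick), a short simulation shows how the host's action changes the probabilities. The function `play`, the seed, and the trial count are all choices made for this sketch.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

rng = random.Random(0)      # fixed seed so the run is repeatable
trials = 20_000
switch_rate = sum(play(True, rng) for _ in range(trials)) / trials
stay_rate = sum(play(False, rng) for _ in range(trials)) / trials

print(f"win rate when switching: {switch_rate:.3f}")
print(f"win rate when staying:   {stay_rate:.3f}")
```

Under these assumptions the switching win rate should settle near 2/3 and the staying win rate near 1/3.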
Probability and Statistics • Statistics deals with what we observe and how it compares to what might be expected by chance. • A set of probabilities corresponding to each possible value of some variable, X, creates a probability distribution. • Common examples include the normal (Gaussian), Poisson, exponential, and binomial distributions.
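As one illustration of a probability distribution, the sketch below lists the probability of each possible number of heads in 10 tosses of a fair coin, which follows a binomial distribution. The choice of 10 tosses and the function name `binomial_pmf` are assumptions made for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5      # 10 tosses of a fair coin (illustrative values)
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

for k, prob in distribution.items():
    print(f"{k:2d} heads: {prob:.4f}")

# The probabilities over all possible values of X sum to 1.
print(sum(distribution.values()))
```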
For now, we will just deal with describing or characterizing the distribution of a single variable
Describing Simple Distributions of Data • Central Tendency • Some way of “typifying” a distribution of values, scores, etc. • Mean (sum of scores divided by number of scores) • Median (middle score, as found by rank) • Mode (most common value from set of values) • In a normal distribution, all 3 measures are equal. • Example: Class stats knowledge
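The three measures can be computed with Python's standard `statistics` module. The scores below are made-up values for illustration, not the class data mentioned on the slide.

```python
import statistics

# Hypothetical scores (not the class data from the lecture).
scores = [1, 2, 2, 3, 4, 7, 9]

mean_score = statistics.mean(scores)       # sum of scores / number of scores
median_score = statistics.median(scores)   # middle score after ranking
mode_score = statistics.mode(scores)       # most common value

print(mean_score, median_score, mode_score)   # 4 3 2
```

In this skewed set the three measures differ, with the mean pulled toward the high scores, unlike the normal-distribution case noted above.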
Dispersion • Range • Difference between the highest value and the lowest value. • Standard Deviation • A statistic that describes how tightly the values are clustered around the mean. • Variance • A measure of how much spread a distribution has. • Computed as the average squared deviation of each value from the mean
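A minimal sketch of the three dispersion measures, again on made-up scores. It assumes the "average squared deviation" definition above, meaning division by n, which corresponds to `statistics.pvariance` and `statistics.pstdev`. The sample versions, `statistics.variance` and `statistics.stdev`, divide by n - 1 instead.

```python
import statistics

# Hypothetical scores (not from the lecture).
scores = [1, 2, 2, 3, 4, 7, 9]

data_range = max(scores) - min(scores)     # highest value minus lowest value

# Variance computed by hand as the average squared deviation from the mean.
mean_score = sum(scores) / len(scores)
variance_by_hand = sum((x - mean_score) ** 2 for x in scores) / len(scores)

# The same quantities from the standard library (population versions).
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

print(data_range)                      # 8
print(round(variance_by_hand, 4))      # 7.4286
print(round(std_dev, 4))
```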
Properties of Standard Deviation • Variance is just the square of the S.D. • If a constant is added to all scores, it has no impact on the S.D. • If all scores are multiplied by a constant, the dispersion changes: the S.D. is multiplied by the absolute value of the constant, and the variance by its square. • Notation: S = standard deviation; X = individual score; M = mean of all scores; n = sample size (number of scores)
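These properties can be checked numerically. The sketch assumes the population form of the standard deviation, S = sqrt(sum((X - M)^2) / n), which matches the symbols listed above and the "average squared deviation" definition of variance. Some texts divide by n - 1 instead, and the properties hold either way. The scores and constants are illustrative.

```python
import math
import statistics

# Hypothetical scores (not from the lecture).
scores = [1, 2, 2, 3, 4, 7, 9]
s = statistics.pstdev(scores)

# Adding a constant shifts every score but leaves the spread unchanged.
shifted = [x + 10 for x in scores]
s_shifted = statistics.pstdev(shifted)

# Multiplying by a constant scales the S.D. by |constant|
# and the variance by constant squared.
scaled = [x * 3 for x in scores]
s_scaled = statistics.pstdev(scaled)
var_scaled = statistics.pvariance(scaled)

print(math.isclose(s_shifted, s))                                   # True
print(math.isclose(s_scaled, 3 * s))                                # True
print(math.isclose(var_scaled, 9 * statistics.pvariance(scores)))   # True
```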
Common Data Representations • Histograms • Simple graphs of the frequency of groups of scores. • Stem-and-Leaf Displays • Another way of displaying dispersion, particularly useful when you do not have large amounts of data. • Box Plots • Yet another way of displaying dispersion. Boxes show 75th and 25th percentile range, line within box shows median, and “whiskers” show the range of values (min and max)
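For small data sets, a stem-and-leaf display is easy to produce as plain text. The sketch below assumes non-negative integer scores, with the tens digit as the stem and the ones digit as the leaf. The function name `stem_and_leaf` and the scores are hypothetical. It also prints the minimum, median, and maximum that a simple box plot would mark.

```python
import statistics
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display (stem = tens, leaf = ones)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:2d} | {leaves}")
    return lines

# Hypothetical scores (not from the lecture).
scores = [62, 67, 71, 74, 74, 78, 83, 85, 85, 89, 91, 96]

lines = stem_and_leaf(scores)
print("\n".join(lines))

# Values a simple box plot would mark with its whiskers and median line.
summary = (min(scores), statistics.median(scores), max(scores))
print(summary)   # (62, 80.5, 96)
```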