Hypothesis Testing CJ 526
Probability • Review • P = number of times an event can occur / total number of possible events • Bounding rule of probability • Minimum value is 0 • Maximum value is 1
Probability • Probability of an event NOT occurring is the complement of the event • Probability of an illness = .2 • Probability that the illness will not occur = 1 - probability of the event, or 1 - .2 = .8 • Odds of an event are the ratio of the probability that it occurs to the probability that it does not • Odds of illness = .2/.8, or 1 to 4; odds of not getting ill are 4 to 1
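The complement and odds calculations on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck; the function names `complement` and `odds` are made up for the example, and exact fractions are used so the 1-to-4 ratio comes out cleanly.

```python
from fractions import Fraction

def complement(p):
    """Probability that an event does NOT occur."""
    return 1 - p

def odds(p):
    """Odds in favor of an event: P(event) / P(not event)."""
    return p / complement(p)

p_ill = Fraction(2, 10)            # probability of illness = .2 (from the slide)
p_not_ill = complement(p_ill)      # 1 - .2 = .8
odds_ill = odds(p_ill)             # .2 / .8, i.e. 1 to 4
odds_not_ill = odds(p_not_ill)     # .8 / .2, i.e. 4 to 1

print(p_not_ill, odds_ill, odds_not_ill)
```

Note that odds and probability are different scales: a probability of .2 corresponds to odds of 1 to 4 (.25), not .2.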
Addition rule of p • What is the probability of either one event OR another occurring? • If the events are mutually exclusive, simply add the probabilities (Venn diagram) • What is the p of having a boy or a girl? • P = .5 + .5 = 1
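As a small sketch of the addition rule, assuming the two events really are mutually exclusive (the function name `p_either` is hypothetical, chosen for this example):

```python
from fractions import Fraction

def p_either(p_a, p_b):
    """P(A or B) when A and B are mutually exclusive."""
    total = p_a + p_b
    if total > 1:
        # Mutually exclusive events can never have probabilities summing past 1.
        raise ValueError("these events cannot be mutually exclusive")
    return total

p_boy = Fraction(1, 2)    # .5, from the slide
p_girl = Fraction(1, 2)   # .5, from the slide
p_boy_or_girl = p_either(p_boy, p_girl)

print(p_boy_or_girl)
```

The simple sum only applies to mutually exclusive events; if the events can occur together, adding the two probabilities counts the overlap twice.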
Multiplication rule • What is the probability of A and B occurring? • If the events are independent of one another, their probabilities can be multiplied • What is the p of having both schizophrenia and epilepsy?
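The slide gives no probabilities for schizophrenia or epilepsy, so the sketch below swaps in a different example: two births, each with a .5 chance of a boy, assumed to be independent. The function name `p_both` is hypothetical.

```python
from fractions import Fraction

def p_both(p_a, p_b):
    """P(A and B) when A and B are independent."""
    return p_a * p_b

p_boy = Fraction(1, 2)                 # .5 per birth (assumed independent births)
p_two_boys = p_both(p_boy, p_boy)      # .5 * .5

print(p_two_boys)
```

The product only gives the joint probability when the events are independent; if one condition makes the other more or less likely, the simple product is wrong.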
Probability distributions • A probability distribution is theoretical—we expect it based on the laws of probability • That is different from an empirical distribution—one which we actually observe
Normal probability distribution • Probability distribution for continuous events • Probability of an event occurring is higher in the center of the curve • Declines for events at each of the two ends (tails) of the distribution • Neither of the tails touches the x axis (infinity)
Normal distribution • Theoretical probability distribution • Unimodal, symmetrical, bell-shaped curve • Symmetrical: draw a line down the center, left and right halves would be mirror images • Can be expressed as a mathematical formula (p. 220)
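The slide cites the formula by textbook page without reproducing it. For reference, the standard form of the normal density, with mean \mu and standard deviation \sigma (symbols chosen here, not taken from the deck), is:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```

The curve is symmetric about \mu because x enters only through (x - \mu)^2, and it never reaches zero, which is why neither tail touches the x axis.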
Normal distribution • Family of normal distributions • Dependent on mean and SD • (Illustrate) • More spread out: larger SD • Narrower: smaller SD
Variations • Skewness • Skewed to the right or the left, as opposed to symmetry • Kurtosis: degree of “peakedness” or “flatness”
Area under the normal curve • Remember that for any continuous distribution there is a mean and SD • Example: Mean = 10 and SD = 2 • If the distribution is not skewed, the majority (2/3) of scores will be from 8 to 12 • 8 and 12 are each one SD from the mean • See p. 225
Area under the normal curve • If a distribution is normal, we can express each score in standard deviation units, as a z score • A z score = (a score – the mean)/SD • If we convert all our raw scores to z scores, then we get what is called the standard normal distribution • It STANDARDIZES our scores
Standard normal distribution • Then distributions of different measures can be compared against one another • The standard normal distribution has a mean of 0 and an SD of one • If you use the formula for z scores, all the scores can be converted • If a distribution has a mean of 10, the z score for 10 will be (10-10)/SD = 0
Standard normal distribution • If a distribution has a mean of 10 and an SD of 2, the z score for 12 would be z = (12-10)/2 = 1 • The z score for 8 would be z = (8-10)/2 = -1 • The negative and positive signs have meaning: a + sign means the score is above the mean
Standard normal distribution • A minus sign means the score is below the mean • The z score also tells us about magnitude: the larger the absolute value of the z score, the further the score is from the mean; the smaller the absolute value, the closer to the mean
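The z-score conversions worked through on these slides can be reproduced with a short function. This is a minimal sketch; the name `z_score` is chosen for the example.

```python
def z_score(score, mean, sd):
    """Convert a raw score to a z score: (score - mean) / SD."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (score - mean) / sd

mean, sd = 10, 2                  # example distribution from the slides
z_12 = z_score(12, mean, sd)      # one SD above the mean
z_8 = z_score(8, mean, sd)        # one SD below the mean
z_10 = z_score(10, mean, sd)      # exactly at the mean

print(z_12, z_8, z_10)            # 1.0 -1.0 0.0
```

The sign gives the direction from the mean and the absolute value gives the distance in SD units, so z = -1 and z = 1 are equally far from the mean.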
Standard normal distribution • We can also make statements about where an individual score is in relation to the rest of the distribution • .3413 (or 34.13%) of scores will fall between the mean and 1 SD • .3413 (or 34.13%) of scores will fall between the mean and – 1 SD
Standard normal distribution • .6826 (or 68.26%) of scores will be between -1 and +1 SD on a normal distribution • Thus, when we see a mean and SD, if the measure is normally distributed, about 2/3 of the scores will fall between the mean – the SD and the mean + the SD
Standard normal distribution • 50% of the scores will be above the mean • 50% of the scores will be below the mean • .1359 (13.59%) will fall between -1 and -2 SD, and the same proportion between +1 and +2 SD • .0215 (2.15%) will fall between -2 and -3 SD, and the same proportion between +2 and +3 SD • See p. 223, illustrate
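The areas quoted on these slides come from a printed z table. As a cross-check, they can be computed from the cumulative distribution function of the standard normal, here using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `area_between` is made up for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # standard normal: mean 0, SD 1

def area_between(z_low, z_high):
    """Proportion of scores with z between z_low and z_high."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

mean_to_1 = area_between(0, 1)       # mean to +1 SD
within_1 = area_between(-1, 1)       # -1 SD to +1 SD
one_to_two = area_between(1, 2)      # +1 SD to +2 SD
two_to_three = area_between(2, 3)    # +2 SD to +3 SD
above_mean = 1 - std_normal.cdf(0)   # everything above the mean

for label, area in [("mean to +1 SD", mean_to_1),
                    ("-1 to +1 SD", within_1),
                    ("+1 to +2 SD", one_to_two),
                    ("+2 to +3 SD", two_to_three),
                    ("above the mean", above_mean)]:
    print(f"{label}: {area:.4f}")
```

The computed values agree with the slides to about three decimals. Small differences in the fourth decimal (.6827 rather than .6826, and .0214 rather than .0215) arise because table entries are rounded before they are added or subtracted.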
Standardized normal distribution • Tells us about any normal distribution • Example of IQ scores, mean = 100, SD = 15 • About 2/3 between 85 and 115 • Fewer (about 13.5% each) between 115 and 130, and between 70 and 85 • About 2% each between 130 and 145, and between 55 and 70
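The IQ example can be checked two ways: directly on a normal distribution with mean 100 and SD 15, or by converting the cut points to z scores and using the standard normal. The sketch below, which assumes IQ is exactly normally distributed, shows that both routes give the same proportions. The function names are hypothetical.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)    # IQ example from the slide
std_normal = NormalDist()            # defaults to mean 0, SD 1

def proportion_between(dist, low, high):
    """Proportion of scores between low and high, computed on dist itself."""
    return dist.cdf(high) - dist.cdf(low)

def proportion_between_via_z(dist, low, high):
    """Same proportion, found by converting the cut points to z scores first."""
    z_low = (low - dist.mean) / dist.stdev
    z_high = (high - dist.mean) / dist.stdev
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

for low, high in [(85, 115), (115, 130), (130, 145)]:
    direct = proportion_between(iq, low, high)
    via_z = proportion_between_via_z(iq, low, high)
    print(f"IQ {low} to {high}: direct {direct:.4f}, via z scores {via_z:.4f}")
```

The same two functions would work for the SAT example (mean 500, SD 100) by constructing `NormalDist(mu=500, sigma=100)` instead.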
Standardized normal • SAT scores, mean = 500, SD = 100 • Illustrate • Use of z table, p. 724 • Reading the table
Utility of the normal distribution • Use of the normal distribution underlies many statistical tests • Many variables are not normally distributed • However, the normal distribution is useful anyway because of the Central Limit Theorem
Sampling distributions • To understand the Central Limit Theorem, need to understand sampling distributions • Say we draw many samples, and calculate a statistic for each sample, such as a mean • When we draw the samples, the mean will not be the same each time—there will be variation
Sampling distributions • If you were to obtain some measure on several samples of patients with the same disorder, there would be variation in the mean of the measure for each sample. • There is an actual mean for the entire population of patients that have the disorder, but that is not known, because we don’t have measures for the whole population
Sampling distributions • However, we could obtain means based on a large number of samples • Central limit theorem: if an infinite number of random samples of size n are drawn from a population, the sampling distribution of the sample means will itself approach being normally distributed as n increases (even if the measure is not itself normally distributed)
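A small simulation can make the Central Limit Theorem concrete. The sketch below is an illustration under stated assumptions, not part of the original deck: the population is a hypothetical, strongly skewed exponential distribution (mean 1, SD 1), the sample size is n = 50, and 2,000 samples stand in for the theorem's "infinite number" of samples. The function names and the seed are arbitrary.

```python
import math
import random
import statistics

def sample_means(draw, n, num_samples):
    """Draw num_samples samples of size n and return each sample's mean."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(num_samples)]

def skewness(values):
    """Standardized third moment: near 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)

rng = random.Random(526)   # fixed seed so the run is repeatable

def draw():
    # Hypothetical skewed population: exponential with mean 1 and SD 1.
    return rng.expovariate(1.0)

n = 50
raw_scores = [draw() for _ in range(20_000)]
means = sample_means(draw, n, num_samples=2_000)

print("skewness of raw scores:  ", round(skewness(raw_scores), 2))
print("skewness of sample means:", round(skewness(means), 2))
print("mean of sample means:    ", round(statistics.fmean(means), 3))
print("SD of sample means:      ", round(statistics.stdev(means), 3))
print("population SD / sqrt(n): ", round(1 / math.sqrt(n), 3))
```

The raw scores are heavily skewed, while the sample means are much closer to symmetric, center on the population mean, and have an SD close to the population SD divided by the square root of n. Rerunning with a smaller n should leave more skew in the sample means.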
Number of subjects • With sample sizes greater than 100, the Central Limit Theorem can be used • If the measure is not terribly skewed, then samples could be around 50 • With sample sizes of less than 50, the central limit theorem probably should not be used. • Application of the central limit theorem (ex)