460 likes | 642 Views
Statistics and Quantitative Analysis U4320. Segment 4: Statistics and Quantitative Analysis Prof. Sharyn O’Halloran. Probability Distributions. A. Distributions: How do simple probability tables relate to distributions? 1. What is the Probability of getting a head? ( 1 coin toss).
E N D
Statistics and Quantitative Analysis U4320 Segment 4: Statistics and Quantitative Analysis Prof. Sharyn O’Halloran
Probability Distributions • A. Distributions: How do simple probability tables relate to distributions? • 1. What is the Probability of getting a head? ( 1 coin toss)
Probability Distributions(cont.) • 2. Now say we flip the coin twice. The picture now looks like:
Probability Distributions(cont.) • As number of coin tosses increases, the distribution looks like a bell-shaped curve:
Probability Distributions(cont.) • 3. General: Normal Distribution • Probability distributions are idealized bar graphs or histograms. As we get more and more tosses, the probability of any one observation falls to zero. • Thus, the final result is a bell-shaped curve
Probability Distributions(cont.) • B. Properties of a Normal Distribution • 1. Formulas: Mean and Variance
Probability Distributions(cont.) • 2. Note Difference with the Book • The population mean is written as: • Variance as: • Example: Two tosses of a coin
Probability Distributions(cont.) • 2. Note Difference with the Book (cont.) • So the average, or expected, number of heads in two tosses of a coin is: 0*1/4 + 1*1/2 + 2*1/4 = 1.
Probability Distributions(cont.) • 3. Expected Value
Probability Distributions(cont.) • C. Standard Normal Distribution • 1.Definition: a normal distribution with mean 0 and standard deviation 1. • Z-values - Are points on the x-axis that show how many standard deviations that point is away from the mean m.
Probability Distributions(cont.) • C. Standard Normal Distribution • 2. Characteristics • symmetric • Unimodal. • continuous distributions • 3. Example: Height of people are normally distributed with mean 5'7" • What is the proportion of people taller than 5'7"?
Probability Distributions(cont.) • D. How to Calculate Z-scores • Definition: Z-value is the number of standard deviations away from the mean • Definition: Z-tables give the probability (score) of observing a particular z-value.
Probability Distributions(cont.) • 1. What is the area under the curve that is greater than 1 ? Prob (Z>1) • The entry in the table is 0.159, which is the total area to the right of 1.
Probability Distributions(cont.) • 2. What is the area to the right of 1.64? Prob (Z > 1.64) • The table gives 0.051, or about 5%.
Probability Distributions(cont.) • 3. What is the area to the left of -1.64? Prob (Z < -1.64)
Probability Distributions(cont.) • 4. What is the probability that an observation lies between 0 and 1? Prob (0 < Z < 1)
Probability Distributions(cont.) • 5. How would you figure out the area between 1 and 1.5 on the graph? Prob (1 < Z < 1.5)
Probability Distributions(cont.) • 6. What is the area between -1 and 2? Prob (-1 < Z < 2) • P (-1<Z<0) = .341 • P (0<z<2) = .50-.023 =.477 0.477 + 0.341 = .818
Probability Distributions(cont.) • 7. What is the area between -2 and 2? Prob(-2<Z<2) 1- Prob (Z< -2) - Prob (Z>2) = 1 - .023 - .023 = .954
Probability Distributions(cont.) • E. Standardization • 1. Standard Normal Distribution -- is a very special case where the mean of distribution equals 0 and the standard deviation equals 1.
Probability Distributions(cont.) • 2. Case 1: Standard deviation differs from 1 • For a normal distribution with mean 0 and some standard deviation , you can convert any point x to the standard normal distribution by changing it to x/.
Probability Distributions(cont.) • 3. Case 2: Mean differs from 0 • So starting with any normal distribution with mean and standard deviation 1, you can convert to a standard normal by taking x- and using this as your Z-value. • Now, what would be the area under the graph between 50 and 51? • Prob (0<Z<1) = .341
Probability Distributions(cont.) • 4. General Case: Mean not equal to 0 and SD not equal to 1 • Say you have a normal distribution with mean & standard deviation . You can convert any point x in that distribution to the same point in the standard normal by computing • This is called standardization. • The Z-value corresponds to x. • The Z-table lets you look up the Z-Score of any number.
Probability Distributions(cont.) • 5. Trout Example: • a. The lengths of trout caught in a lake are normally distributed with mean 9.5" and standard deviation 1.4". There is a law that you can't keep any fish below 12". What percent of the trout is this? (Can keep above 12) • Step 1: Standardize • Find the Z-score of 12: Prob (x>12) • Step 2: Find z-score • Find Prob (Z>1.79) • Look up 1.79 in your table; only .037, or about 4% of the fish could be kept. 12 - 9.5 Z = --------- = 1.79. 1.4
Probability Distributions(cont.) • 5. Trout Example (cont.): • b. Now they're thinking of changing the standard to 10" instead of 12". What proportion of fish could be kept under the new limit? • Standardize Prob (x>10) • Step 2: Find z-score Prob (Z>.36) • In your tables, this gives .359, or almost 36% of the fish could be kept under the new law. 10 - 9.5 Z = --------- = 0.36. 1.4
Joint Distributions • A. Probability Tables • 1. Example: Toss a coin 3 times. How many heads and how many runs do we observe? • Def: A run is a sequence of one or more of the same event in a row • Possible outcomes
Joint Distributions (cont.) • 2. Joint Distribution Table
Joint Distributions (cont.) • 3. Definition: The joint probability of x and y is the probability that both x and y occur. p(x,y) = Pr(X and Y) p(0, 1) = 1/8, p(1, 2) = 1/4, and p(3, 3) = 0.
Joint Distributions (cont.) • B. Marginal Probabilities • 1. Def: Marginal probability is the sum of the rows and columns. The overall probability of an event occurring. • So the probability that there are just 1 head is the prob of 1 head and 1 runs + 1 head and 2 runs + 1 head and 3 runs = 0 + 1/4 + 1/8 = 3/8
Joint Distributions (cont.) • C. Independence • A and B are independent if P(A|B) = P(A).
Joint Distributions (cont.) • Are the # of heads and the # of runs independent?
Correlation and Covariance • A. Covariance • 1.Definition of Covariance • Which is defined as the expected value of the product of the differences from the means.
Correlation and Covariance • 2.Graph
Correlation and Covariance • B. Correlation • 1.Definition of Correlation
Correlation and Covariance • 2. Characteristics of Correlation
Correlation and Covariance • 2. Characteristics of Correlation (cont.)
Correlation and Covariance • 2. Characteristics of Correlation (cont.)
Correlation and Covariance • 2. Characteristics of Correlation (cont.)Why?
Sample Homework GET /FILE 'gss91.sys'. The SPSS/PC+ system file is read from file gss91.sys The SPSS/PC+ system file contains 1517 cases, each consisting of 203 variables (including system variables). 203 variables will be used in this session. ------------------------------- COMPUTE AFFAIRS = XMARSEX. RECODE AFFAIRS (0,5,8,9 = SYSMIS) (1,2 = 0) (3,4 = 1). VALUE LABELS AFFAIRS 0 'BAD' 1 'OK'. The raw data or transformation pass is proceeding 1517 cases are written to the compressed active file. ***** Memory allows a total of 10345 Values, accumulated across all Variables. There also may be up to 1293 Value Labels for each Variable.