Chris Morgan, MATH G160
csmorgan@purdue.edu
April 11, 2012
Lecture 29
Chapter 2.3 & 2.4: Stem and Leaf Plots and Cross Tabulation
Stem and Leaf Plots
• Gives a quick picture of the shape of the distribution
• Shows the rank order and the distribution simultaneously
• Includes actual numerical values
• Works best for small numbers of observations where all observations are greater than zero
Making a Stem and Leaf Plot
• Sort the data from smallest to largest (and trim the data if necessary)
• For each number, take the last digit as its “leaf” and the leading digits as its “stem” (e.g., for the number 24, the 2 is the stem and the 4 is the leaf)
• Separate the stems and leaves into two columns, and format the leaves so that they are left-aligned
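The steps above can be sketched in a few lines of Python. This is only an illustration, assuming non-negative whole numbers and a non-empty data set; the function name `stem_and_leaf` and the sample values are hypothetical and are not the data from the slides. Keeping stems that have no leaves is a design choice made here so that gaps in the data remain visible.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem and leaf plot for non-negative integers."""
    # Step 1: sort the data from smallest to largest.
    ordered = sorted(values)

    # Step 2: the last digit is the leaf, the leading digits are the stem.
    leaves_by_stem = defaultdict(list)
    for value in ordered:
        stem, leaf = divmod(value, 10)
        leaves_by_stem[stem].append(leaf)

    # Step 3: one row per stem, with the leaves left-aligned after a separator.
    # Stems with no observations are kept (an assumption of this sketch).
    rows = []
    for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}")
    return rows


# Hypothetical data, not the values shown on the slides.
sample = [24, 7, 31, 12, 18, 25, 12, 30, 9, 21]
plot_rows = stem_and_leaf(sample)
print("\n".join(plot_rows))
```

Because the leaves are built from the sorted data, each row reads in rank order, which is what lets the plot show the ordering and the shape of the distribution at the same time.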
Making a Stem and Leaf Plot Number of times per day my mom will yell at me over Christmas break:
Histogram or Stem and Leaf Plot?
Histogram
• Quantitative variables
• Good for big data sets, especially if technology is available
• Uses a box to represent each data point
• Popular method of conveying information; will be used often in this course
Stem and Leaf Plot
• Quantitative variables
• Good for small data sets and convenient for back-of-the-envelope calculations; rarely found in scientific or lay publications
• Uses a digit to represent each data point
• Seen as elementary; will not be used often in this course
How to analyze relationships between types of data?
What type(s) of variables we have will determine the method we use to compare the data.
Types of Variables               Method
Categorical vs. Categorical      Cross Tabulation
Categorical vs. Quantitative     ANOVA
Quantitative vs. Quantitative    Regression
Conditional Probability with Two Way Tables
• Two way tables make it easy to compute conditional probability!
• P(Row A | Column B) = Cell(A,B) / Column B Total
• Similarly, P(Column B | Row A) = Cell(A,B) / Row A Total
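As a rough sketch of the two formulas above, a two way table can be stored as a dictionary of rows, with the totals computed on demand. The table, its counts, and the helper names below are all hypothetical and chosen only for illustration.

```python
# Hypothetical two way table: outer keys are rows, inner keys are columns.
# These counts are made up and do not come from the slides.
table = {
    "Row A": {"Column A": 12, "Column B": 8},
    "Row B": {"Column A": 18, "Column B": 2},
}


def row_total(table, row):
    """Sum of every cell in the given row."""
    return sum(table[row].values())


def column_total(table, column):
    """Sum of every cell in the given column."""
    return sum(cells[column] for cells in table.values())


def p_row_given_column(table, row, column):
    """P(row | column) = Cell(row, column) / column total."""
    return table[row][column] / column_total(table, column)


def p_column_given_row(table, row, column):
    """P(column | row) = Cell(row, column) / row total."""
    return table[row][column] / row_total(table, row)


print(p_row_given_column(table, "Row A", "Column B"))  # 8 / 10
print(p_column_given_row(table, "Row A", "Column B"))  # 8 / 20
```

The same cell count appears in both numerators; only the denominator changes, depending on which variable is being conditioned on.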
Cross Tabulation
• What more can I do with cross tabulation?
• Joint Probability
• Marginal Probability
• Conditional Probability
Joint Probability
Probability (A and B) = cell count of A and B / grand total
P(Red and Peanut) = 3 / 65 ≈ 0.05
P(Blue and Plain) = 9 / 65 ≈ 0.14
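The joint probability arithmetic can be checked with a short Python sketch. It uses only the counts quoted on this slide (3, 9, and the grand total of 65); the function name `joint_probability` is a hypothetical helper.

```python
GRAND_TOTAL = 65  # grand total quoted on the slide


def joint_probability(cell_count, grand_total=GRAND_TOTAL):
    """P(A and B) = cell count of A and B / grand total."""
    return cell_count / grand_total


p_red_and_peanut = joint_probability(3)
p_blue_and_plain = joint_probability(9)

print(round(p_red_and_peanut, 2))
print(round(p_blue_and_plain, 2))
```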
Marginal Probability
Probability [column A] = total of column A / grand total
P(Orange) = 10 / 65 ≈ 0.15
Probability [row B] = total of row B / grand total
P(Plain) = 40 / 65 ≈ 0.62
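A minimal sketch of the marginal calculation, using only totals quoted in these slides: 10 for orange, 40 for plain, and 25 for peanut (the peanut total comes from the conditional probability slide). The plain and peanut totals add up to the grand total of 65, which suggests they are the only two rows, so their marginal probabilities should sum to 1. The helper name `marginal_probability` is hypothetical.

```python
GRAND_TOTAL = 65  # grand total quoted on the slides


def marginal_probability(total, grand_total=GRAND_TOTAL):
    """Row or column total divided by the grand total."""
    return total / grand_total


p_orange = marginal_probability(10)  # column total
p_plain = marginal_probability(40)   # row total
p_peanut = marginal_probability(25)  # row total

print(round(p_orange, 2))
print(round(p_plain, 2))

# Sanity check: the row marginals cover every M&M, so they sum to 1
# (up to floating point rounding).
print(abs(p_plain + p_peanut - 1.0) < 1e-12)
```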
Conditional Probability
Conditional probability is the probability that event A occurs given that event B has already occurred. For instance, if I observe a yellow M&M, what is the probability that it is plain? Or, if it is peanut, what is the probability that it’s red?
Probability [A | B] = cell count of A and B / total count of B
P(Plain | Yellow) = 6 / 11 ≈ 0.55
P(Red | Peanut) = 3 / 25 = 0.12
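The sketch below reproduces the two conditional probabilities from the counts on this slide and also checks that dividing a joint probability by a marginal probability gives the same answer, since the grand total of 65 cancels. Exact fractions are used so the comparison does not depend on rounding; the function name `conditional_probability` is a hypothetical helper.

```python
from fractions import Fraction

GRAND_TOTAL = 65  # grand total quoted on the slides


def conditional_probability(cell_count, condition_total):
    """P(A | B) = cell count of A and B / total count of B."""
    return Fraction(cell_count, condition_total)


p_plain_given_yellow = conditional_probability(6, 11)
p_red_given_peanut = conditional_probability(3, 25)

# Same result via joint / marginal: (6/65) / (11/65) = 6/11.
p_plain_and_yellow = Fraction(6, GRAND_TOTAL)
p_yellow = Fraction(11, GRAND_TOTAL)
assert p_plain_and_yellow / p_yellow == p_plain_given_yellow

print(round(float(p_plain_given_yellow), 2))
print(round(float(p_red_given_peanut), 2))
```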