Discover how proportions and probabilities are interconnected, from survey data to contingency tables. Learn about conditional and marginal probabilities. Explore the area of histograms in data analysis.
Proportions and probabilities 15 out of 81 students surveyed were smokers, so the proportion of those who smoke is 15/81 ≈ 0.185. Suppose each student’s name is placed on a card. I randomly select a card, note whether the person is a smoker, and then return the card and shuffle the deck. After repeating this a sufficiently large number of times, the long-run proportion of smokers is what we call the probability that a randomly selected student will be a smoker. Connection: The probability of randomly selecting an individual with a certain characteristic is equal to the proportion of the population that has the characteristic.
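The card-drawing experiment above can be sketched as a small simulation. This is only an illustration: the function name `long_run_proportion`, the number of draws, and the random seed are assumptions, not part of the slides.

```python
import random

def long_run_proportion(n_smokers=15, n_students=81, n_draws=100_000, seed=0):
    """Draw one card at a time, with replacement, and return the share of smokers."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    # One card per student: True marks a smoker, False a non-smoker.
    deck = [True] * n_smokers + [False] * (n_students - n_smokers)
    # rng.choice picks a card and leaves the deck unchanged (sampling with replacement).
    smokers_drawn = sum(rng.choice(deck) for _ in range(n_draws))
    return smokers_drawn / n_draws

population_proportion = 15 / 81
estimate = long_run_proportion()
print(f"population proportion: {population_proportion:.3f}")
print(f"long-run proportion:   {estimate:.3f}")
```

With many draws the long-run proportion should settle close to 15/81, which is the connection the slide states.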
Contingency Table A table that summarizes data for two categorical variables is called a contingency table.
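As a rough sketch of how such a table is tallied, the snippet below counts pairs of categories. The eight observations and the variable names (spam status, type of number in the message) are hypothetical and are not the data behind the slides.

```python
from collections import Counter

# Hypothetical observations: (spam status, kind of number appearing in the message).
observations = [
    ("spam", "none"), ("spam", "small"), ("not spam", "small"), ("not spam", "big"),
    ("not spam", "small"), ("spam", "none"), ("not spam", "none"), ("not spam", "small"),
]

rows = ["spam", "not spam"]
cols = ["none", "small", "big"]

# Count each (row category, column category) pair; missing pairs count as 0.
pair_counts = Counter(observations)
table = {r: {c: pair_counts[(r, c)] for c in cols} for r in rows}

# Margins of the table: one total per row and one per column.
row_totals = {r: sum(table[r].values()) for r in rows}
col_totals = {c: sum(table[r][c] for r in rows) for c in cols}

for r in rows:
    print(r, table[r], "row total:", row_totals[r])
print("column totals:", col_totals)
```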
Row and column proportions Row proportions are computed using row totals, and column proportions using column totals.
Row and column proportions
• Conditional probabilities
  • P(None | Spam) = 0.406
  • P(None | Not Spam) = 0.113
• Marginal probability
  • P(None) = 0.140
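The sketch below shows how probabilities of this kind come out of row totals. The cell counts are an assumption: the table itself is not part of the text, so these counts were chosen only because they reproduce the three quoted values.

```python
# Assumed counts (not given in the text), laid out with spam status as rows.
counts = {
    "spam":     {"none": 149, "small": 168,  "big": 50},
    "not spam": {"none": 400, "small": 2659, "big": 495},
}

row_totals = {row: sum(cells.values()) for row, cells in counts.items()}
grand_total = sum(row_totals.values())

# Conditional probabilities: divide a cell by the total of its own row.
p_none_given_spam = counts["spam"]["none"] / row_totals["spam"]
p_none_given_not_spam = counts["not spam"]["none"] / row_totals["not spam"]

# Marginal probability: divide the column total by the grand total.
p_none = sum(cells["none"] for cells in counts.values()) / grand_total

print(f"P(None | Spam)     = {p_none_given_spam:.3f}")
print(f"P(None | Not Spam) = {p_none_given_not_spam:.3f}")
print(f"P(None)            = {p_none:.3f}")
```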
Row and column proportions
• Conditional probabilities
  • P(Spam | None) = 0.271
  • P(Spam | Small) = 0.059
  • P(Spam | Big) = 0.092
• Marginal probability
  • P(Spam) = 0.094
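One way to see how the two sets of conditional probabilities fit together is to compute the share of messages that are both spam and contain no number in two ways, once from each conditioning direction. The check below uses only the rounded values quoted on the slides; the variable names are illustrative, and the small gap between the two results is what rounding to three decimals would be expected to produce.

```python
# Values quoted on the slides (rounded to three decimals).
p_none_given_spam = 0.406
p_spam = 0.094
p_spam_given_none = 0.271
p_none = 0.140

# P(Spam and None) from the first conditioning direction: P(None | Spam) * P(Spam).
joint_via_spam = p_none_given_spam * p_spam
# P(Spam and None) from the other direction: P(Spam | None) * P(None).
joint_via_none = p_spam_given_none * p_none

print(f"P(None | Spam) * P(Spam) = {joint_via_spam:.3f}")
print(f"P(Spam | None) * P(None) = {joint_via_none:.3f}")
```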
Area of histograms
• 10% of the data (2 out of 20) is in the 240 to 280 range.
• 10% of the total area of the histogram is in the 240 to 280 range.
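The match between share of data and share of area can be checked numerically. In the sketch below the 20 data values and the bin edges are hypothetical (the slide's data set is not in the text), chosen only so that 2 values fall between 240 and 280. It also assumes equal-width bins with bar height equal to the count.

```python
# Hypothetical data: 20 values, two of which lie in the 240 to 280 bin.
data = [85, 95, 110, 125, 130, 140, 150, 155, 165, 170,
        175, 180, 190, 195, 205, 215, 225, 235, 250, 270]
edges = [80, 120, 160, 200, 240, 280]  # assumed equal-width bins

# Count the values in each bin [left, right).
counts = [sum(left <= x < right for x in data)
          for left, right in zip(edges[:-1], edges[1:])]

# Area of each bar = height (count) * width of the bin.
areas = [count * (right - left)
         for count, left, right in zip(counts, edges[:-1], edges[1:])]

data_fraction = counts[-1] / len(data)   # share of data in 240 to 280
area_fraction = areas[-1] / sum(areas)   # share of area in 240 to 280

print(f"share of data: {data_fraction:.2f}")
print(f"share of area: {area_fraction:.2f}")
```

If the bins had unequal widths, the two shares would agree only when bar heights are densities (count divided by bin width) rather than raw counts.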