Chapter 1 Picturing Distributions with Graphs 6/9/2014 Chapter 1 1
What is Statistics?
Statistics is the science that involves the extraction of information from data. It is a mistake to think of statistics as merely mathematical computations!
Data Structure
• Individuals (observations) ≡ the individual units (e.g., people, institutions) upon which measurements are made
  • rows in the data table
• Variables ≡ characteristics that are measured
  • columns in the data table
Types of Variables
• Categorical variables: named (“nominal”) categories
  • summarized with counts (frequencies) and percentages
• Quantitative variables: numerical scales
  • support arithmetic operations such as means and standard deviations
Types of Variables
• Willet et al. (1995). Weight, weight change, and coronary heart disease in women. JAMA, 273(6).
• Objective: to determine the effect of weight gain on coronary heart disease (CHD) risk in women
• Unit of observation: women 30 to 55 years of age, initially free of CHD
• Explanatory variable: body mass index (BMI) at age 18
• Follow-up for 14 years (cohort study)
• Response variable: fatal and nonfatal CHD
Types of Variables: Willet et al. (1995)
• Explanatory variable: BMI at age 18 = weight / height² (QUANTITATIVE)
• Response variable: CHD occurrence, yes or no (CATEGORICAL)
Distributions
• Distributions tell us how often a variable takes on its various values (!)
• Picture distributions with graphs
  • Categorical variables: pie charts, bar graphs
  • Quantitative variables: stemplots, histograms (next chapter: boxplots)
Types of Solid Waste (Categorical)
• Bar charts: bars do not touch (compare with histograms)
• Pie charts: use Excel or an applet
  • Percentages must add to 100%
Body Weight (Quantitative)
• n = 53 students
Body Weight (Quantitative): Histogram
• Create a class-interval frequency table
  • Use approximately 4 to 12 non-overlapping class intervals
  • Tally frequencies and proportions
• Draw the histogram (frequencies and/or proportions) and label the axes
[Figure: histogram of body weight; horizontal axis: Weight (pounds), 100 to 280 in steps of 20; vertical axis: Number of students]
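The class-interval table behind a histogram like this one can be sketched in Python. This is an illustration under stated assumptions, not the procedure used to produce the slide: the 53 weights are read back from the stemplot that appears later in this chapter, the nine classes of width 20 pounds starting at 100 are chosen to match the axis ticks, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Body weights (pounds), n = 53, read back from the chapter's stemplot
weights = [
    100, 101, 106, 106, 110, 110, 119,
    120, 120, 123, 124, 125, 127, 128,
    130, 130, 133, 135, 139, 140, 148,
    150, 150, 152, 155, 157, 165, 165, 165,
    170, 170, 170, 172, 175, 175,
    180, 180, 180, 180, 185, 185, 185, 186, 187,
    192, 194, 195, 203, 210, 212, 215, 220, 260,
]

def frequency_table(values, start, width, n_classes):
    """Tally values into non-overlapping classes [lo, lo + width)."""
    counts = Counter((v - start) // width for v in values)
    n = len(values)
    rows = []
    for k in range(n_classes):
        lo = start + k * width
        freq = counts.get(k, 0)
        rows.append((lo, lo + width, freq, freq / n))  # frequency and proportion
    return rows

# Assumed classes: 100 to <120, 120 to <140, ..., 260 to <280
table = frequency_table(weights, start=100, width=20, n_classes=9)
for lo, hi, freq, prop in table:
    # The row of asterisks is a crude sideways histogram
    print(f"{lo:3d} to <{hi:3d}  {freq:2d}  {prop:6.1%}  {'*' * freq}")
```

Each class includes its lower bound and excludes its upper bound, so no weight is counted twice.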
Body Weight: Stem-and-Leaf
• Separate each value into a stem value (the first one or two significant digits) and a leaf value (the next significant digit)
Body Weight: Stem-and-Leaf
• Draw the “stem”: the values 10 through 26, one per row
• Label the stem and include the “axis multiplier”: Pounds (×10)
• Write leaf values next to the stem, e.g., 192 → stem 19, leaf 2; 152 → stem 15, leaf 2; 135 → stem 13, leaf 5
Body Weight: Stem-and-Leaf
• Plot one leaf per data point
• Then sort the leaves in rank order
10|0166
11|009
12|0034578
13|00359
14|08
15|00257
16|555
17|000255
18|000055567
19|245
20|3
21|025
22|0
23|
24|
25|
26|0
(×10) Pounds
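The construction steps can be sketched in Python. This is only an illustration, not the method used to draw the slide: `stemplot` is a hypothetical helper that assumes whole-number data and a ×10 axis multiplier (stem = tens and above, leaf = ones digit), and the weights are the 53 values shown in the plot.

```python
weights = [
    100, 101, 106, 106, 110, 110, 119,
    120, 120, 123, 124, 125, 127, 128,
    130, 130, 133, 135, 139, 140, 148,
    150, 150, 152, 155, 157, 165, 165, 165,
    170, 170, 170, 172, 175, 175,
    180, 180, 180, 180, 185, 185, 185, 186, 187,
    192, 194, 195, 203, 210, 212, 215, 220, 260,
]

def stemplot(values):
    """Return stemplot rows for whole numbers with a x10 axis multiplier."""
    rows = {}
    for v in sorted(values):            # sorting puts leaves in rank order
        stem, leaf = divmod(v, 10)      # 192 -> stem 19, leaf 2
        rows.setdefault(stem, []).append(str(leaf))
    # Keep empty stems so that gaps and possible outliers stay visible
    return [f"{stem}|{''.join(rows.get(stem, []))}"
            for stem in range(min(rows), max(rows) + 1)]

lines = stemplot(weights)
print("\n".join(lines))
print("(x10) Pounds")
```

Printing every stem from the smallest to the largest, including the empty ones, is what lets the isolated value at 260 stand apart from the rest.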
Each Stemplot Will Differ!
• Key: select the correct axis multiplier before breaking values into stem and leaf values
• Think of the stemplot as a histogram with 4 to 12 stem “bins”
Second Example (n = 8)
• Data (average coliform count): 1.47, 2.06, 2.36, 3.43, 3.74, 3.78, 3.94, 4.42
• Stem = ones-place values
• Leaves = tenths-place values
• Truncate the extra digit (e.g., 1.47 → 1.4); the book rounds, but we shall truncate
• Do not plot the decimal point
1|4
2|03
3|4779
4|4
(×1) Coliforms
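A minimal sketch of the truncation step for the coliform data, assuming non-negative readings with two decimal places. The readings are passed as strings and converted with `Decimal` so that binary floating-point error cannot shift a tenths digit; `truncated_stemplot` is a hypothetical helper name.

```python
from decimal import Decimal

coliforms = ["1.47", "2.06", "2.36", "3.43", "3.74", "3.78", "3.94", "4.42"]

def truncated_stemplot(readings):
    """Stem = ones place, leaf = tenths place, extra digits truncated."""
    rows = {}
    for text in readings:
        tenths = int(Decimal(text) * 10)   # int() truncates: 14.70 -> 14
        stem, leaf = divmod(tenths, 10)    # 14 -> stem 1, leaf 4
        rows.setdefault(stem, []).append(leaf)
    lines = []
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in sorted(rows.get(stem, [])))
        lines.append(f"{stem}|{leaves}")
    return lines

lines = truncated_stemplot(coliforms)
print("\n".join(lines))
print("(x1) Coliforms")
```

Rounding instead of truncating would plot 1.47 as leaf 5 rather than leaf 4, which is why the choice needs to be stated.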
Third Example (n = 25)
• Data & stemplot
1|4789
2|223466789
3|000123445678
(×1)
• Too squished! Split the stem values
  • First “1” on the stem holds leaves 0 to 4
  • Second “1” on the stem holds leaves 5 to 9
  • Etc.
1|4
1|789
2|2234
2|66789
3|00012344
3|5678
(×1)
• Notice the shape (negative skew)
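Splitting stems can be sketched as follows. The raw data for this example are not listed on the slide, so the values below are reconstructed from the plotted stems and leaves and stored in tenths (14 stands for 1.4); `split_stemplot` is a hypothetical helper name.

```python
# 25 values reconstructed from the stemplot, in tenths (assumption: 14 means 1.4)
tenths = [14, 17, 18, 19,
          22, 22, 23, 24, 26, 26, 27, 28, 29,
          30, 30, 30, 31, 32, 33, 34, 34, 35, 36, 37, 38]

def split_stemplot(values):
    """Two rows per stem: leaves 0 to 4 first, then leaves 5 to 9."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1       # which copy of the stem gets the leaf
        rows.setdefault((stem, half), []).append(str(leaf))
    first_stem = min(rows)[0]
    last_stem = max(rows)[0]
    lines = []
    for stem in range(first_stem, last_stem + 1):
        for half in (0, 1):
            lines.append(f"{stem}|{''.join(rows.get((stem, half), []))}")
    return lines

lines = split_stemplot(tenths)
print("\n".join(lines))
print("(x1)")
```

With six rows instead of three, the longer rows at the high end and the thin tail toward low values are easier to see.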
Interpreting Stemplots
• Shape: Symmetrical? Mound(s)? Tails?
• Central location (the book uses the midpoint)
• Spread (for now use the range; better methods next week)
• Outliers: values that fall outside the regular pattern
Interpreting Stemplots
10|0166
11|009
12|0034578
13|00359
14|08
15|00257
16|555
17|000255
18|000055567
19|245
20|3
21|025
22|0
23|
24|
25|
26|0
(×10)
• Shape: tail toward larger numbers → positive skew
• Center: n = 53, so the midpoint is the (53 + 1)/2 = 27th ranked value, 165 pounds (the 26th ranked value is 157)
• Spread: 100 to 260
• Outlier: “260” seems out there
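The center and spread can be checked by reading the values back off the plot. The sketch below assumes the usual (n + 1)/2 rule for locating the midpoint of an odd-sized sample; `read_stemplot` is a hypothetical helper, and its `stem_unit` parameter plays the role of the ×10 axis multiplier.

```python
from statistics import median

plot = [
    "10|0166", "11|009", "12|0034578", "13|00359", "14|08", "15|00257",
    "16|555", "17|000255", "18|000055567", "19|245", "20|3", "21|025",
    "22|0", "23|", "24|", "25|", "26|0",
]

def read_stemplot(rows, stem_unit=10):
    """Rebuild the data: value = stem * stem_unit + leaf."""
    values = []
    for row in rows:
        stem, leaves = row.split("|")
        values.extend(int(stem) * stem_unit + int(digit) for digit in leaves)
    return sorted(values)

weights = read_stemplot(plot)
n = len(weights)
midpoint_rank = (n + 1) / 2          # rank of the middle value when n is odd
center = median(weights)             # the value at that rank
spread = (min(weights), max(weights))
print(n, midpoint_rank, center, spread)
```

Empty stems such as `23|` contribute no values, so they do not affect the count.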
Back-to-Back Stemplots
Women |   | Men
-------------------
      | 0 | 9
    8 | 1 |
      | 2 |
      | 3 |
minutes (×100)
• Data = Exercise 1.38 on page 35
• I’ve plotted the first value for women (180 minutes) and the first value for men (90 minutes)
Interpreting Histograms
Example: 7th Grader Vocabulary Score
• Shape: symmetrical
• Center: around 7
• Spread: from 2 to 12
• Outlier: 12(?)