
Chapter 1


aleshanee

Presentation Transcript


  1. Chapter 1 Picturing Distributions with Graphs 6/9/2014 Chapter 1 1

  2. What is Statistics? Statistics is the science that involves the extraction of information from data. It is a mistake to think of statistics as merely mathematical computations!

  3. Data Structure
     • Individuals = observations ≡ individual units (e.g., people, institutions) upon which measurements are made
       • rows in the data table
     • Variables ≡ characteristics that are measured
       • columns in the data table

  4. Types of Variables
     • Categorical variables: named (“nominal”) categories
       • summarized with counts (frequencies) and percentages
     • Quantitative variables: numerical scales
       • support arithmetic operations such as means and standard deviations
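
A minimal Python sketch of the distinction on this slide: a categorical variable is summarized with counts and percentages, while a quantitative variable supports arithmetic summaries. The sample values and the names `chd` and `bmi` are hypothetical, invented for illustration only.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical data, invented for illustration (not taken from any study).
chd = ["no", "no", "yes", "no", "yes", "no", "no", "no"]   # categorical
bmi = [19.5, 21.0, 24.3, 20.1, 27.8, 22.4, 18.9, 23.0]     # quantitative

# Categorical variable: counts (frequencies) and percentages.
counts = Counter(chd)
percentages = {level: 100 * k / len(chd) for level, k in counts.items()}

# Quantitative variable: means and standard deviations are meaningful.
bmi_mean = mean(bmi)
bmi_sd = stdev(bmi)  # sample standard deviation

print(dict(counts), percentages)
print(round(bmi_mean, 3), round(bmi_sd, 3))
```

Averaging the "yes"/"no" labels would be meaningless, which is why the two kinds of variable get different summaries.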

  5. Types of Variables
     • Willet et al. (1995). Weight, weight change, and coronary heart disease in women. JAMA, 273(6).
     • Objective: to determine the effect of weight gain on coronary heart disease (CHD) risk in women
     • Unit of observation: women 30 to 55 years of age, initially free of CHD
     • Explanatory variable: body mass index (BMI) at age 18
     • Follow-up for 14 years (cohort study)
     • Response variable: fatal and nonfatal CHD

  6. Types of Variables: Willet et al. (1995)
     • Explanatory variable: BMI at age 18 = Weight / Height² (QUANTITATIVE)
     • Response variable: CHD occurrence, yes or no (CATEGORICAL)
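
The BMI formula on this slide can be sketched in Python as follows. The slide gives only Weight / Height²; the units (kilograms and metres, the conventional ones for BMI) and the function name `bmi` are assumptions made here for illustration.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index = weight / height squared (assumed units: kg and m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2

# Hypothetical person: 60 kg and 1.65 m tall.
value = bmi(60, 1.65)
print(round(value, 1))  # a number on a numerical scale, hence quantitative
```

The result is a number on a numerical scale, which is what makes BMI a quantitative variable; CHD occurrence (yes or no) has no such scale.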

  7. Distributions
     • Distributions tell us how often a variable takes on various values (!)
     • Picture distributions with graphs
       • Categorical variables: pie charts, bar graphs
       • Quantitative variables: stemplots, histograms (next chapter: boxplots)

  8. Types of Solid Waste (Categorical)

  9. Types of Solid Waste (Categorical)
     • Bar charts: bars do not touch (compare histograms)
     • Pie charts: use Excel or the applet
     • Percentages must add to 100%

  10. Body Weight (Quantitative): n = 53 students

  11. Body Weight (Quantitative): Histogram
      • Create a class-interval frequency table
      • Use approximately 4 to 12 non-overlapping class intervals
      • Tally frequencies and proportions
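
The tallying step can be sketched in Python. The 53 weights are read back from the body-weight stemplot that appears later in this transcript (slide 15); the class width of 20 pounds starting at 100 is an assumption chosen to match the axis ticks listed on slide 12.

```python
# Leaves per stem, copied from the body-weight stemplot (axis multiplier x10).
STEMPLOT = {
    10: "0166", 11: "009", 12: "0034578", 13: "00359", 14: "08",
    15: "00257", 16: "555", 17: "000255", 18: "000055567", 19: "245",
    20: "3", 21: "025", 22: "0", 23: "", 24: "", 25: "", 26: "0",
}
weights = [stem * 10 + int(leaf)
           for stem, leaves in STEMPLOT.items() for leaf in leaves]
n = len(weights)

WIDTH = 20  # assumed class width in pounds
table = []
for lower in range(100, 280, WIDTH):
    # Non-overlapping intervals: lower bound included, upper bound excluded.
    count = sum(lower <= w < lower + WIDTH for w in weights)
    table.append((lower, lower + WIDTH, count, count / n))

for lower, upper, count, proportion in table:
    print(f"{lower}-{upper - 1} lb  {count:2d}  {proportion:.3f}")
```

This choice gives 9 class intervals, inside the suggested range of 4 to 12.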

  12. Body Weight (Quantitative): Histogram
      • Draw histogram (frequencies and/or proportions) and label axes
      [Figure: histogram. Horizontal axis: Weight (pounds), 100 to 280 in steps of 20. Vertical axis: Number of students.]

  13. Body Weight: Stem-and-Leaf. Separate each value into a stem value (first one or two significant digits) and a leaf value (next significant digit)

  14. Body Weight: Stem-and-Leaf
      • Draw “stem” (10 through 26)
      • Label stem: Pounds (×10)
      • Include “axis multiplier”
      • Write leaf values next to stem, e.g., 192 → 19|2, 152 → 15|2, 135 → 13|5

  15. Body Weight: Stem-and-Leaf
      • Plot one leaf per data point
      • Then sort leaves in rank order
        10|0166
        11|009
        12|0034578
        13|00359
        14|08
        15|00257
        16|555
        17|000255
        18|000055567
        19|245
        20|3
        21|025
        22|0
        23|
        24|
        25|
        26|0
        Pounds (×10)
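
The construction on slides 13 to 15 can be sketched in Python. The weights are the 53 values encoded in the stemplot above; the function name `stemplot` is a hypothetical helper, and the sketch assumes whole-number values with a ×10 axis multiplier.

```python
from collections import defaultdict

# Body weights in pounds, read back from the stemplot above (n = 53).
weights = [
    100, 101, 106, 106, 110, 110, 119,
    120, 120, 123, 124, 125, 127, 128, 130, 130, 133, 135, 139,
    140, 148, 150, 150, 152, 155, 157, 165, 165, 165,
    170, 170, 170, 172, 175, 175,
    180, 180, 180, 180, 185, 185, 185, 186, 187, 192, 194, 195,
    203, 210, 212, 215, 220, 260,
]

def stemplot(values):
    """Return stemplot lines for whole numbers (axis multiplier x10)."""
    leaves = defaultdict(list)
    for v in sorted(values):            # sorting puts leaves in rank order
        stem, leaf = divmod(v, 10)      # stem = tens and above, leaf = ones
        leaves[stem].append(str(leaf))
    stems = range(min(leaves), max(leaves) + 1)   # keep empty stems too
    return [f"{s}|{''.join(leaves[s])}" for s in stems]

lines = stemplot(weights)
print("\n".join(lines))
print("Pounds (x10)")
```

Empty stems (23 to 25) are kept on purpose, so that the gap before 260 stays visible.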

  16. Each Stemplot will differ!
      • Key: select the correct axis multiplier before breaking values into stem and leaf values
      • Think of the stemplot as a histogram with 4 to 12 stem “bins”

  17. Second Example (n = 8)
      • Data (average coliform count): 1.47, 2.06, 2.36, 3.43, 3.74, 3.78, 3.94, 4.42
      • Stem = ones-place values
      • Leaves = tenths-place values
      • Truncate the extra digit (e.g., 1.47 → 1.4). The book rounds, but we shall truncate.
      • Do not plot the decimal point
        1|4
        2|03
        3|4779
        4|4
        Coliforms (×1)
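
A minimal Python sketch of the truncation rule, using the eight coliform counts from this slide. Values are entered as strings and handled with `decimal.Decimal` so that truncation is not disturbed by binary floating-point error; the function name `truncated_stemplot` is a hypothetical helper, and the sketch assumes positive values.

```python
from decimal import Decimal, ROUND_DOWN

data = ["1.47", "2.06", "2.36", "3.43", "3.74", "3.78", "3.94", "4.42"]

def truncated_stemplot(values):
    """Stem = ones place, leaf = tenths place, extra digits truncated."""
    rows = {}
    for v in values:
        # Truncate (do not round) to one decimal place: 1.47 -> 1.4
        t = Decimal(v).quantize(Decimal("0.1"), rounding=ROUND_DOWN)
        stem = int(t)              # ones place
        leaf = int(t * 10) % 10    # tenths place
        rows.setdefault(stem, []).append(leaf)
    stems = range(min(rows), max(rows) + 1)
    return [f"{s}|{''.join(str(d) for d in sorted(rows.get(s, [])))}"
            for s in stems]

lines = truncated_stemplot(data)
print("\n".join(lines))
print("Coliforms (x1)")
```

With rounding instead of truncation, 3.78 would land on leaf 8 rather than 7, so the choice affects the plot.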

  18. Third Example (n = 25)
      • Data & stemplot:
        1|4789
        2|223466789
        3|000123445678
        (×1)
      • Too squished!
      • Split stem values
        • First “1” on stem holds leaves 0 to 4
        • Second “1” on stem holds leaves 5 to 9
        • Etc.
        1|4
        1|789
        2|2234
        2|66789
        3|00012344
        3|5678
        (×1)
      • Notice shape (negative skew)
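
Splitting stems can be sketched in Python as below. The 25 values are read back from the stemplot on this slide and stored as whole tenths (1.4 is stored as 14) to keep the arithmetic exact; that storage choice and the function name `split_stemplot` are assumptions made for illustration.

```python
# Values from the slide's stemplot, stored in tenths (1.4 -> 14).
tenths = [
    14, 17, 18, 19,
    22, 22, 23, 24, 26, 26, 27, 28, 29,
    30, 30, 30, 31, 32, 33, 34, 34, 35, 36, 37, 38,
]

def split_stemplot(values):
    """Each stem appears twice: leaves 0-4 first, then leaves 5-9."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(str(leaf))
    lines = []
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for half in (0, 1):
            lines.append(f"{stem}|{''.join(rows.get((stem, half), []))}")
    return lines

lines = split_stemplot(tenths)
print("\n".join(lines))
print("(x1)")
```

Doubling the number of rows from 3 to 6 brings the plot into the 4 to 12 bin range and makes the tail toward smaller values easier to see.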

  19. Interpreting Stemplots
      • Shape: Symmetrical? Mound(s)? Tails?
      • Central location (the book uses the midpoint)
      • Spread (for now use range; better methods next week)
      • Outliers: fall outside regular pattern

  20. Interpreting Stemplots
        10|0166
        11|009
        12|0034578
        13|00359
        14|08
        15|00257
        16|555
        17|000255
        18|000055567
        19|245
        20|3
        21|025
        22|0
        23|
        24|
        25|
        26|0
        (×10)
      • Shape: tail toward larger numbers → positive skew
      • Center: n = 53 → use the (53 + 1)/2 = 27th ranked value, which is 165 (the 26th is 157)
      • Spread: 100 to 260
      • Outlier: “260” seems out there
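
The center and spread can be checked with a short Python sketch. For an odd number of observations the median sits at rank (n + 1)/2, which is rank 27 when n = 53. The weights are decoded from the stemplot on this slide; the variable names are hypothetical.

```python
from statistics import median

# Leaves per stem, copied from the stemplot on this slide (multiplier x10).
STEMPLOT = {
    10: "0166", 11: "009", 12: "0034578", 13: "00359", 14: "08",
    15: "00257", 16: "555", 17: "000255", 18: "000055567", 19: "245",
    20: "3", 21: "025", 22: "0", 23: "", 24: "", 25: "", 26: "0",
}
weights = sorted(stem * 10 + int(leaf)
                 for stem, leaves in STEMPLOT.items() for leaf in leaves)

n = len(weights)
median_rank = (n + 1) // 2            # middle rank for odd n
center = median(weights)              # value at that rank
spread = (weights[0], weights[-1])    # range: smallest to largest

print(n, median_rank, center, spread)
print("26th and 27th values:", weights[25], weights[26])
```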

  21. Back-to-Back Stemplots
        Women |   | Men
        ------+---+-----
              | 0 | 9
            8 | 1 |
              | 2 |
              | 3 |
        minutes (×100)
      • Data = Exercise 1.38 on page 35
      • I’ve plotted the first value for women (180 minutes) and the first value for men (90 minutes)

  22. Interpreting Histograms. Example: 7th Grader Vocabulary Score
      • Shape: symmetrical
      • Center: around 7
      • Spread: from 2 to 12
      • Outlier: 12(?)
