Chapter 1

thuy

Presentation Transcript


  1. Chapter 1 Why Statistics?

  2. Learning can result from: • Critical thinking • Asking an authority • Religious experience However, collecting DATA is the surest way to learn about the world

  3. Data in the Sciences are messy • At first glance, data often look like an incoherent jumble of numbers • How do we make sense of data? Statistical procedures are tools for learning about the world by Learning from Data.

  4. Real Data! • To help you understand the power and usefulness of statistics, we will explore two real and interesting data sets • “The Smoking Study” • “The Maternity Study”

  5. The Smoking Study • From the University of Wisconsin Center for Tobacco Research and Intervention • 608 participants provided data on smoking, addiction, withdrawal, and how best to quit smoking • The full data set is provided on the CD; a description of the data collected is provided in the appendices of the book

  6. The Maternity Study • From the Wisconsin Maternity Leave and Health Project • 244 families provided data on marital satisfaction, child-rearing styles, and other household events • The full data set is provided on the CD; a description of the data collected is provided in the appendices of the book

  7. Variability • Why are data messy? • Consider a concrete example: Depression scores (“CESD”) for participants in the Smoking Study • Some participants (each has a different ID number) have CESD scores of 0, while others have scores of 2, 11 or 7, or some other value • These data are messy in that the scores are different from one another • Variability is the statistical term for the degree to which scores (such as the depression scores) differ from one another.

  8. Sources of Variability • It is easy to see that depression scores are variable, but why? • Individual differences • Some people are more depressed than others • Some people have difficulty reading and understanding the questions on the test • Some people answer the questions more honestly than others • Procedure • Differences in the ways the data were collected • Conditions or Treatments • The conditions that are imposed on the participants of the study

  9. Populations and Samples • Statistical Population – a collection or set of measurements of a variable that share some common characteristic • Sample – a subset of measurements from a population • Random sample – a sample selected such that every score in the population has an equal chance of being included
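
The definition of a random sample above can be sketched in Python. This is only an illustration: the population values are made up (they are not data from either study), and the helper name `draw_random_sample` is hypothetical.

```python
import random

# Hypothetical population of 20 measurements (made up for illustration).
population = [3, 7, 0, 11, 2, 5, 9, 4, 6, 1, 8, 12, 0, 2, 7, 3, 10, 5, 4, 6]


def draw_random_sample(scores, size, seed=None):
    """Draw `size` measurements without replacement.

    random.Random.sample gives every measurement in `scores`
    the same chance of being included.
    """
    rng = random.Random(seed)
    return rng.sample(scores, size)


# Fixing the seed only makes the illustration repeatable.
sample = draw_random_sample(population, 5, seed=1)
print(sample)
```

The sample is a subset of the population's measurements; drawing again with a different seed will generally give a different subset.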

  10. Chapter 2 Frequency Distributions and Percentiles

  11. Variability (revisited) • Collecting Data means measuring a variable • Those measurements differ (vary) from one another • One way to organize and summarize a set of measurements is to construct a frequency distribution • These methods can be applied to both populations and samples
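
As a rough sketch of what constructing a frequency distribution involves, the Python below counts how often each value occurs. The measurements are invented for illustration and are not the YRSMK values; `frequency_distribution` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical measurements (made up; not the YRSMK data).
years_smoking = [5, 12, 5, 20, 12, 5, 30, 20, 12, 5]


def frequency_distribution(measurements):
    """Return (value, frequency, relative frequency) rows, sorted by value."""
    counts = Counter(measurements)
    n = len(measurements)
    return [(value, counts[value], counts[value] / n) for value in sorted(counts)]


table = frequency_distribution(years_smoking)
print("value  freq  rel.freq")
for value, freq, rel in table:
    print(f"{value:>5} {freq:>5} {rel:>9.2f}")
```

The same function works whether the list holds a whole population or a sample, since it only counts the measurements it is given.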

  12. Example YRSMK – Number of Years Smoking Daily From the First 60 Participants in the Smoking Study

  13. Example YRSMK – Number of Years Smoking Daily From the First 60 Participants in the Smoking Study

  14. A Better Summary? YRSMK – Number of Years Smoking Daily From the First 60 Participants in the Smoking Study

  15. Graphing Distributions

  16. Percentiles • We have been focusing on distributions rather than individual scores • Sometimes, individual scores are of great importance • Computing percentiles when n = 608: • The 50th percentile is the “middle” score. It is the 304th sorted score (608 × 0.50 = 304). • The 32nd percentile is at position 608 × 0.32 = 194.56, which rounds to the 195th sorted score.
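
The position arithmetic on this slide can be sketched in Python. One assumption to flag: the slide shows only 194.56 becoming 195, which fits both rounding up and rounding to nearest; the sketch rounds up. The function names are hypothetical.

```python
def percentile_position(n, percentile):
    """1-based position of the sorted score at a whole-number percentile.

    Assumes the fractional position n * percentile / 100 is rounded UP.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    return -(-n * percentile // 100)  # ceiling division


def percentile_score(scores, percentile):
    """Score found at that position once the scores are sorted."""
    ordered = sorted(scores)
    return ordered[percentile_position(len(ordered), percentile) - 1]


print(percentile_position(608, 50))  # 304
print(percentile_position(608, 32))  # 195
```

`percentile_score` needs the actual data, so it would be applied to the sorted variable from the data set rather than to the sample size alone.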

  17. Percentile Rank • The percentile rank of a score is the percent (the proportion times 100) of the measurements in the distribution below that score value • Computing percentile rank for YRSMK: • Sort the variable; call the result YRSMK_sorted • The percentile rank of 9 is 50/608 = 0.082 (about 8.2%), so 9 is at about the 8th percentile • The percentile rank of 21 is 246/608 = 0.405 (about 40.5%), so 21 is at about the 40th percentile
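
A minimal Python sketch of the percentile-rank definition follows. The small score list is made up for illustration (it is not YRSMK), and `percentile_rank` is a hypothetical name; the last two lines simply redo the slide's own divisions.

```python
def percentile_rank(scores, value):
    """Percent of the measurements that fall strictly below `value`."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)


# Hypothetical data (made up; not the YRSMK variable).
scores = [2, 4, 4, 7, 9, 9, 12, 15, 21, 30]
print(percentile_rank(scores, 9))  # 4 of the 10 scores are below 9 -> 40.0

# The slide's arithmetic: 50 and 246 of the 608 scores fall below 9 and 21.
print(round(100 * 50 / 608, 1))   # 8.2
print(round(100 * 246 / 608, 1))  # 40.5
```

Counting scores strictly below the value follows the wording of the definition; other conventions (for example, counting half of the tied scores) exist and give slightly different ranks.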

  18. Graphing Distributions • Graphing distributions is a very valuable tool for highlighting features of the data • Shape • Range • Central Tendency • Variability

  19. Shape • We classify the shape of distributions in three ways: • Symmetry – is one half a mirror image of the other half? • Skew – are there high/low frequencies of low/high scores? • Modality – how many humps or modes?

  20. Symmetry • Is one half of the distribution a mirror image of the other (along a vertical axis)? • Three examples of symmetrical distributions:

  21. Skew • Negative – low frequencies of low values and high frequencies of high values • Positive – high frequencies of low values and low frequencies of high values
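
The slides define skew by the look of the distribution. One common numerical stand-in, which is not taken from the slides, is the sign of the third standardized moment. The sketch below uses that substitute on made-up data; `skew_direction` is a hypothetical name.

```python
from statistics import mean, pstdev


def skew_direction(scores, tolerance=1e-9):
    """Classify skew by the sign of the third standardized moment.

    This moment-based measure is a standard substitute for judging
    skew by eye; it is an assumption here, not the slides' method.
    """
    m = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        return "none"  # every score is identical
    g1 = sum((x - m) ** 3 for x in scores) / (len(scores) * sd ** 3)
    if g1 > tolerance:
        return "positive"
    if g1 < -tolerance:
        return "negative"
    return "none"


# Made-up examples: many low values with a few high ones, and the reverse.
print(skew_direction([1, 1, 1, 2, 2, 3, 9]))  # positive
print(skew_direction([1, 7, 8, 8, 9, 9, 9]))  # negative
print(skew_direction([1, 2, 3, 4, 5]))        # none
```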

  22. Modality • How many humps (or modes)? Unimodal Bimodal

  23. Characterizing Shape Asymmetric Negatively Skewed Bimodal Asymmetric Positively Skewed Unimodal

  24. Central Tendency and Variability • In addition to shape, distributions differ in terms of: • Central Tendency – scores near the center of the distribution; where the scores “tend” to be • Variability – the degree to which scores differ from one another; the “spread” of the scores
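
To make the two ideas concrete, here is a small Python sketch using the standard `statistics` module. The slide names no specific measures at this point, so the mean, median, range, and standard deviation are assumptions chosen for illustration, and both data sets are made up.

```python
from statistics import mean, median, pstdev


def summarize(scores):
    """Simple measures of central tendency and of variability."""
    return {
        "mean": mean(scores),                 # central tendency
        "median": median(scores),             # central tendency
        "range": max(scores) - min(scores),   # variability
        "sd": pstdev(scores),                 # variability (population SD)
    }


# Two made-up distributions with the same center but different spread.
narrow = [4, 5, 5, 6, 5]
wide = [1, 3, 5, 7, 9]

print(summarize(narrow))
print(summarize(wide))
```

Both lists center on 5, but the scores in `wide` differ from one another far more than those in `narrow`, which is the kind of contrast the comparison slides ask about.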

  25. Comparing Distributions • It is very useful to be able to compare and contrast distributions (name their similarities and differences) • Distributions can differ in terms of shape, central tendency, and variability

  26. Comparing Distributions How do these distributions differ?
