
APSTAT PART ONE Exploring and Understanding Data


plato

Presentation Transcript


  1. There are three kinds of lies - lies, damned lies and statistics. ~Benjamin Disraeli, commonly misattributed to Mark Twain • APSTAT PART ONE Exploring and Understanding Data

  2. What is Statistics? Chapters 1-3

  3. What is Stat? • Book Says: • A way of reasoning • Collection of tools and methods • Helps us understand the world • Statistics is about variation

  4. Stat Basics • Individuals • Object described by a set of data • People (#1), cars, animals, groups… • Variables • Categorical (Qualitative)– Usually involves words • Examples: sex, advisor, social security #... • Quantitative – Involve #’s • Examples: age, height, income, test score…

  5. Displaying Categorical Data • Frequency tables:

  6. Displaying Categorical Data • Relative Frequency tables: • Just roll up the %’s

  7. Displaying Categorical Data • Contingency Table • Two Way table Age at first “Real Kiss” (ahhhhhhhhhhhh…)

  8. Marginal Distribution Age at first “Real Kiss” (ahhhhhhhhhhhh…) • Conditional Distribution: • % of males whose first kiss came when they were 10-14 • % of 20-24 year old first kissers who were male
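The two conditional percentages on this slide condition in opposite directions, which is easy to mix up. The sketch below uses made-up counts (the slide's actual table is not in the transcript, so every number in `counts` is hypothetical) to show a marginal distribution and both conditional percentages.

```python
# Hypothetical two-way table: rows are sex, columns are age at first kiss.
# These counts are invented for illustration only; they are NOT the slide's data.
counts = {
    "male":   {"under 10": 2, "10-14": 12, "15-19": 8, "20-24": 3},
    "female": {"under 10": 1, "10-14": 15, "15-19": 9, "20-24": 2},
}

def row_total(table, row):
    """Total count for one row (one sex)."""
    return sum(table[row].values())

def col_total(table, col):
    """Total count for one column (one age group)."""
    return sum(r[col] for r in table.values())

grand_total = sum(row_total(counts, sex) for sex in counts)

# Marginal distribution of sex: row totals divided by the grand total.
marginal_sex = {sex: row_total(counts, sex) / grand_total for sex in counts}

# Conditional on the ROW: of the males, what share had a first kiss at 10-14?
pct_males_10_14 = counts["male"]["10-14"] / row_total(counts, "male")

# Conditional on the COLUMN: of the 20-24 first kissers, what share were male?
pct_20_24_male = counts["male"]["20-24"] / col_total(counts, "20-24")

print(marginal_sex)
print(f"{pct_males_10_14:.0%} of males, {pct_20_24_male:.0%} of 20-24 group")
```

The only difference between the two conditional percentages is the denominator: a row total in the first case, a column total in the second.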

  9. The Rest of Chapters 1-3 • Displaying the data • Pie Charts • Bar Charts • Blah Blah Blah…. • Simpson’s Paradox – AP MC • Being Skeptical – Important for real life • 5 W’s + 1H • Ex: 4 out of 5 dentists…. • Displaying data • Lies, Damned Lies, and Statistics

  10. Showing Off Your Data Chapters 4-5

  11. Histograms • Remember bar graphs? Same, but different. • Think of sorting boxes… • Same size boxes • ON TI-83 • Enter Data into L1 (STAT>EDIT) • Go to STAT PLOT (2ND Y=) • Change Options • Go to ZOOM, Choose ZoomStat • OR Go to WINDOW, Change Options, then Go to GRAPH

  12. Histograms • Make a histogram of the following data: • Age of Teachers At WPS: 25, 34, 37, 42, 51, 43, 49, 35, 37, 65
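As a rough check on the hand-drawn histogram, the "sorting boxes" idea can be sketched in a few lines. The bin width of 10 years is an assumption (the slide does not specify one), and `bin_counts` is a hypothetical helper name.

```python
from collections import Counter

ages = [25, 34, 37, 42, 51, 43, 49, 35, 37, 65]
bin_width = 10  # assumed; the slide leaves the box size up to you

def bin_counts(data, width):
    """Sort each value into a same-size box and count the values per box."""
    counts = Counter((x // width) * width for x in data)
    lo, hi = min(counts), max(counts)
    # Include empty boxes between the lowest and highest so gaps stay visible.
    return {start: counts.get(start, 0) for start in range(lo, hi + width, width)}

histogram = bin_counts(ages, bin_width)
for start, count in histogram.items():
    print(f"{start}-{start + bin_width - 1}: {'#' * count}")
```

A different bin width changes the picture, which is why the TI-83 lets you adjust it under WINDOW.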

  13. Outliers • An observation that is outside the pattern • For example, ages in this classroom 16, 17, 16, 17, 18, 17, 17, 16, 18, 36 • Formula to determine (l8r, sk8r) • For now “potential” or “possible” outlier

  14. Describing a distribution • Center • Mean - Average • Median - Middle • Shape • Symmetric • Skewed • Uniform • Bell Shaped • Bi- or Multi-modal • Spread • Standard Deviation • Range • IQR • Weird-ness • Outliers • Gaps

  15. Stemplots • Basic • Split Stems • Back-To-Back

  16. Basic Stemplot • Boys Weight in class (pounds) • Stems: 10 11 12 13 14 15 16 17 18 • Leaves: 3 4 6 9 9 0 2 5 7 8 8 0 0 1 3 4 4 5 8 9 1 9 • KEY: 10 | 8 = 108 pounds

  17. Split Stem Stemplot • Boys Weight in class (pounds) • Split stems: 14 14 15 15 16 16 17 17 18 • Leaves: 3 4 6 9 9 0 2 5 7 8 8 0 0 1 3 4 4 5 8 9 1 9 • KEY: 10 | 8 = 108 pounds

  18. Back to Back Stemplot • Girls vs. Boys Weight in class (pounds) • Stems: 10 11 12 13 14 15 16 17 18 • Girls leaves: 8 9 3 8 7 7 3 9 4 0 2 1 • Boys leaves: 3 4 6 9 9 0 2 5 7 8 8 0 0 1 3 4 4 5 8 9 1 9 • KEY: 10 | 8 or 8 | 10 = 108 pounds

  19. Mean • Average! Add ‘em up and divide by n • Sample Mean denoted as x̄ (x-bar) • Not Resistant to extreme values • e.g. Ages in Mrs. Smith’s Kindergarten Class • 4,5,4,4,4,5,5,4,4,4,5,5,4,4,5,39

  20. Median • Middle! Line ‘em up (in order) and find the middle. If two share it, find their mean. • Resistant to extreme values • e.g. Ages in Mrs. Smith’s Kindergarten Class • 4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,39
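To see resistance in numbers, here is a small sketch that computes both measures for the kindergarten ages, with and without the 39. Treating the 39 as a single extreme value to drop is an assumption made for the comparison.

```python
from statistics import mean, median

ages = [4, 5, 4, 4, 4, 5, 5, 4, 4, 4, 5, 5, 4, 4, 5, 39]

# With the extreme value included.
mean_all = mean(ages)      # 105 / 16
median_all = median(ages)  # middle two of 16 sorted values are both 4

# Same class with the extreme value removed (assumption: 39 is the one to drop).
without_39 = [a for a in ages if a != 39]
mean_trimmed = mean(without_39)      # 66 / 15
median_trimmed = median(without_39)

print(f"mean:   {mean_all:.4f} -> {mean_trimmed:.4f}")
print(f"median: {median_all} -> {median_trimmed}")
```

The mean drops from about 6.56 to 4.4 when the 39 is removed, while the median stays at 4 either way.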

  21. Quartiles • Median cuts data in half, Quartiles cut the Halves in Half! Recall Teacher Ages: 25, 34, 35, 37, 37, 42, 43, 49, 51, 65 • 1st Quartile Q1 = 35 • Median = 39.5 • 3rd Quartile Q3 = 49

  22. 5-Number Summary • Low-Q1-Median-Q3-High • Shows Spread of Data Recall Teacher Ages: 25, 34, 35, 37, 37, 42, 43, 49, 51, 65 • 5-Number Summary: 25 35 39.5 49 65
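The summary above can be reproduced with a short sketch that follows the "cut the halves in half" rule. Leaving the middle value out of both halves when n is odd is an assumption here; it is one common convention, and other software may use a different quartile rule.

```python
from statistics import median

def five_number_summary(data):
    """Return (low, Q1, median, Q3, high) using median-of-halves quartiles."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + (n % 2):]  # skip the middle value when n is odd
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

teacher_ages = [25, 34, 35, 37, 37, 42, 43, 49, 51, 65]
summary = five_number_summary(teacher_ages)
print(summary)
```

For the teacher ages this gives 25, 35, 39.5, 49, 65, matching the slide.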

  23. Boxplot • Graphical Representation of 5-Number Summary • Shows Shape, Spread, and Center • Always draw to scale: 25 35 39.5 49 65

  24. Outliers • First off, IQR – InterQuartile Range • Distance between Quartiles… Recall Teacher Ages: 25, 34, 35, 37, 37, 42, 43, 49, 51, 65 • IQR is 49 - 35 = 14 • Outlier is anything more than 1.5 times IQR below Q1 or above Q3 • Sooo…. 1.5 × 14 = 21, so an outlier would have to be more than 21 below 35 or more than 21 above 49… Below 14 or above 70. Nothing in our data is an outlier!
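The 1.5 × IQR rule can be sketched as below and applied to both the teacher ages and the classroom ages from the earlier Outliers slide. The helper names `quartiles` and `find_outliers` are hypothetical, and the quartiles use the median-of-halves convention, which is an assumption.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half + len(xs) % 2:]  # skip the middle value when n is odd
    return median(lower), median(upper)

def find_outliers(data, k=1.5):
    """Return (low fence, high fence, values outside the fences)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    flagged = [x for x in data if x < low_fence or x > high_fence]
    return low_fence, high_fence, flagged

teacher_ages = [25, 34, 35, 37, 37, 42, 43, 49, 51, 65]
classroom_ages = [16, 17, 16, 17, 18, 17, 17, 16, 18, 36]

teacher_result = find_outliers(teacher_ages)
classroom_result = find_outliers(classroom_ages)
print(teacher_result)
print(classroom_result)
```

The teacher ages produce fences at 14 and 70 with nothing flagged. For the classroom ages, Q1 = 16 and Q3 = 18, so the fences are 13 and 21 and the 36 is flagged.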

  25. Boxplot Using TI-83 • Enter Teacher Ages into L1 (clear old stuff first): 25, 34, 35, 37, 37, 42, 43, 49, 51, 65 • ON TI-83 • Go to STAT PLOT (2ND Y=) • Change Options • Go to ZOOM, Choose ZoomStat • OR Go to WINDOW, Change Options, then Go to GRAPH

  26. Variance & Standard Deviation • Variance – s² • Average of Squared distances from mean (divide by n - 1) • In example 26/5 = 5.2 • Standard Deviation – s • Square Root of Variance • In example, about 2.28 • Standard Deviation • Measure of Spread • Use with Mean • Non-Resistant • On TI-83 Now….. STAT > CALC > 1-Var Stats • In example, Mean = 6
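The slide's example data set did not survive in the transcript, so the sketch below uses a hypothetical six-value data set chosen only because it has mean 6 and squared distances summing to 26, which reproduces the slide's 26/5 = 5.2 and s of about 2.28. It shows the sample formulas step by step and checks them against Python's `statistics` module.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical data (NOT the slide's original values): mean 6, n = 6.
data = [3, 4, 6, 6, 8, 9]

x_bar = mean(data)
squared_distances = [(x - x_bar) ** 2 for x in data]
sum_sq = sum(squared_distances)   # 9 + 4 + 0 + 0 + 4 + 9

s_squared = sum_sq / (len(data) - 1)  # sample variance divides by n - 1
s = sqrt(s_squared)                   # standard deviation is its square root

# The library functions use the same n - 1 definition.
assert abs(s_squared - variance(data)) < 1e-9
assert abs(s - stdev(data)) < 1e-9

print(f"mean = {x_bar}, s^2 = {s_squared:.1f}, s = {s:.2f}")
```

Dividing by n - 1 rather than n is what makes this the sample variance, which is the Sx value 1-Var Stats reports on the TI-83.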

  27. It’s Normal to Deviate Chapter 6 – The Normal Model

  28. Density Curve • Area under a density curve is always 1 • Symmetric density curve: Mean, Median and Mode are all at the center

  29. Density Curve Continued • Density curves are often skewed • Skewed to the Left (tail trails to the left) • Skewed to the Right (tail trails to the right) • Recall Median is “resistant” while Mean is not [Figure labels: Mean, Median, Mode]

  30. Histograms • Median is “equal areas” point (50% of Population on each side) • Mean is “balance point” – “think Physics”

  31. Normal Distributions (bell shaped) • Center is mean μ (population mean) • Spread is Standard Deviation σ (population standard deviation) • To find σ, look for inflection points: the curve is Concave Down between μ - σ and μ + σ and Concave Up outside them

  32. 68 – 95 – 99.7 Rule • Also called EMPIRICAL RULE • Probability = 68% within 1σ • Probability = 95% within 2σ • Probability = 99.7% within 3σ • Raw-Score (X): μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, μ + 3σ • z-Score (z): -3, -2, -1, 0, 1, 2, 3

  33. Percentiles (and quartiles) • Think standardized tests or class rankings • Percent of observations to the LEFT of an observation • Quartiles: • First is at 25th percentile • Median is at 50th percentile • Third is at 75th percentile

  34. Z-SCORE • Number of Standard Deviations (σ) away from the Mean (μ) • Raw-Score (X): μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, μ + 3σ • z-Score (z): -3, -2, -1, 0, 1, 2, 3

  35. Z-SCORE Continued • Example: You have an IQ of 148. The IQ test you took has a distribution N(105, 20). What is your Z-Score? What does this mean? • z = (X - μ) / σ • μ = population mean, σ = population standard deviation, X = Raw-Score, z = z-Score • Normal Distribution Notation N(μ, σ)
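The IQ example can be worked as a short sketch. The function name `z_score` is hypothetical; the numbers come from the slide.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that raw score x lies from the mean."""
    return (x - mu) / sigma

# IQ of 148 on a test distributed N(105, 20).
iq_z = z_score(148, mu=105, sigma=20)
print(f"z = {iq_z:.2f}")
```

The result is (148 - 105) / 20 = 2.15, meaning the score sits 2.15 standard deviations above the mean.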

  36. Using Tables • Ex. – Your IQ Z-SCORE was 2.15. What does it mean now?

  37. Using Tables • Ex. – If someone’s IQ was at the 10th percentile, what would their Z-SCORE be?
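As a cross-check on the table lookups, Python's standard library `statistics.NormalDist` (Python 3.8+) can stand in for the printed z-table. This is a sketch of the same two lookups, not a replacement for showing table work on the exam.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Table lookup: z-score to percentile (area to the LEFT of z = 2.15).
percentile_of_iq = standard_normal.cdf(2.15)

# Reverse lookup: percentile to z-score (10th percentile).
z_at_10th = standard_normal.inv_cdf(0.10)

print(f"z = 2.15 is at about the {percentile_of_iq:.2%} mark")
print(f"10th percentile is at about z = {z_at_10th:.2f}")
```

A z-score of 2.15 lands at roughly the 98th percentile, and the 10th percentile corresponds to a z-score of about -1.28.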

  38. Using TI-83 • Normalcdf(Xlower, Xupper, μ, σ): use to convert Raw-Score directly to probability. • Normalcdf(Zlower, Zupper): use to convert z-Score to probability ***For Graphics use ShadeNorm (GTANG notes)

  39. Using TI-83 • Test Empirical Rule (68-95-99.7) • Find Normalcdf(-1,1), Normalcdf(-2,2), Normalcdf(-3,3) • Ex. What percent of IQ Scores would fall between 100 and 110 Using N(105, 20)? What percent would be above 150? • Normalcdf(100,110,105,20) • Normalcdf(150,1000000000,105,20)
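For checking calculator answers away from the TI-83, the same calculations can be sketched in Python. The `normalcdf` function below is a hypothetical helper written to mirror the calculator's argument order; it is built on the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under N(mu, sigma) between lower and upper, like the TI-83 command."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Test the Empirical Rule on the standard normal curve.
within_1 = normalcdf(-1, 1)
within_2 = normalcdf(-2, 2)
within_3 = normalcdf(-3, 3)

# IQ questions using N(105, 20).
between_100_110 = normalcdf(100, 110, 105, 20)
above_150 = normalcdf(150, 1_000_000_000, 105, 20)  # huge upper bound, as on the slide

print(f"{within_1:.4f} {within_2:.4f} {within_3:.4f}")
print(f"between 100 and 110: {between_100_110:.4f}")
print(f"above 150: {above_150:.4f}")
```

The three Empirical Rule areas come out near 0.68, 0.95 and 0.997. About 20% of scores fall between 100 and 110, and a little over 1% fall above 150.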

  40. Normality • Just check Box and Whisker plot or Histogram on TI-83 • ALWAYS do this if raw data is given • Sketch result and comment on it!
