
What is Statistics?

Statistics is the science of collecting, analyzing, and drawing conclusions from data. Descriptive statistics: organizing and summarizing. Inferential statistics: generalizing from a sample to the population from which it was selected. The presentation then covers describing data.



Presentation Transcript


  1. What is Statistics?
     • Statistics is the science of collecting, analyzing, and drawing conclusions from data
     • Descriptive Statistics: organizing and summarizing
     • Inferential Statistics: generalizing from a sample to the population from which it was selected

  2. Describing Data • What kind of data is there? • How can it be graphed for visual comparison? • How can it be described verbally? • How can it be analyzed numerically?

  3. Data--Types of Variables

  4. Types of Quantitative (Numerical) Data
     • Discrete: takes on only certain values. Examples: number of siblings, number of pockets in a pair of jeans, number of free throws made in a season, …
     • Continuous: takes on any of an infinite number of values. Examples: time, weight, height, … Because of the limits of measurement accuracy, we often round to the nearest second, ounce, inch, …

  5. Describing Univariate Data
     The distribution of a variable tells us what values the variable takes, how often it takes those values, and shows the pattern of variation.
     • Categorical: bar graph, segmented bar graph, pie chart
     • Quantitative: dotplot, stemplot (stem & leaf), histogram (frequency distribution), ogive (cumulative relative frequency plot), boxplot

  6. Bar, Segmented Bar, & Pie Charts

  7. What shape do you most identify with? • Square - committed, hard worker, most organized, independent worker, loyal, not sloppy, data collector • Circle – people person, team player, great communicator, gossiper, best listener, brings harmony • Squiggle – creative, experimental, innovative, gets bored quickly, conceptual, cannot stand rules, high energy • Triangle – loves recognition, leader, fast thinker, manager, strategic planner, list maker, confident • Rectangle – confused, in transition, most unpredictable, open to new learning, in a state of growing

  8. What’s misleading about this graph?

  9. How is this graph misleading? Source: Marist Institute for Public Opinion

  10. Describing Data using Summary Features of Quantitative Variables
      • Center: location in the middle of all the data
      • Unusual features: outliers, gaps, clusters
      • Spread: measure of variability, range
      • Shape: distribution pattern (symmetric, skewed, uniform, bimodal, etc.)
      Always CUSS in context!

  11. Dotplot for Univariate Quantitative Data

  12. 1.11 Stemplot Answer
      Stemplots: keep stems and leaves in order. Leave the stem blank if there is no leaf. Split stems if there are too few stems.
      Key: 1 | 9 represents $19 spent at the store.
      (a) Stemplot
          0 | 399
          1 | 1345677889
          2 | 000123455668888
          3 | 25699
          4 | 1345579
          5 | 0359
          6 | 1
          7 | 0
          8 | 366
          9 | 3
      (b) Split-stem plot
          0 | 3
          0 | 99
          1 | 134
          1 | 5677889
          2 | 0001234
          2 | 55668888
          3 | 2
          3 | 5699
          4 | 134
          4 | 5579
          5 | 03
          5 | 59
          6 | 1
          6 |
          7 | 0
          7 |
          8 | 3
          8 | 66
          9 | 3
      (c) The distribution is skewed to the right. The spread is approximately $90 ($3 to $93). The center of the distribution is at approximately $28. There are several moderate outliers visible in the split-stem plot; specifically, the five amounts of $70 or more. While most shoppers spent small to moderate amounts of money around $30, a “cluster” of shoppers spent larger amounts ranging from $70 to $93.
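
The stemplot rules on this slide (leaves in order, blank stems kept, optional split stems) can be sketched in a few lines of Python. This is only an illustration: the function name `stemplot` and its `split` parameter are made up here, and the `amounts` list is simply the 50 values read back off the stemplot above, assuming whole-dollar amounts.

```python
from collections import defaultdict

# Dollar amounts read back off the stemplot on slide 12 (assumed whole dollars).
amounts = [3, 9, 9, 11, 13, 14, 15, 16, 17, 17, 18, 18, 19,
           20, 20, 20, 21, 22, 23, 24, 25, 25, 26, 26, 28, 28, 28, 28,
           32, 35, 36, 39, 39, 41, 43, 44, 45, 45, 47, 49,
           50, 53, 55, 59, 61, 70, 83, 86, 86, 93]


def stemplot(values, split=False):
    """Return stemplot rows; the tens digit is the stem, the ones digit the leaf.

    Hypothetical helper for non-negative integers only.
    """
    leaves = defaultdict(list)
    for v in sorted(values):  # sorting keeps the leaves in order
        stem, leaf = divmod(v, 10)
        # With split stems, leaves 0-4 go on the first row and 5-9 on the second.
        half = 1 if (split and leaf >= 5) else 0
        leaves[(stem, half)].append(str(leaf))

    halves = (0, 1) if split else (0,)
    rows = []
    for stem in range(max(values) // 10 + 1):
        for half in halves:
            # A stem with no leaves is still printed, just left blank.
            rows.append(f"{stem} | {''.join(leaves.get((stem, half), []))}")
    return rows


for row in stemplot(amounts):
    print(row)
```

Calling `stemplot(amounts, split=True)` gives the split-stem version; unlike the slide, this sketch also prints an empty second row for the last stem.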

  13. Back-to-back Stemplot
      When comparing data, use comparative language! (higher, more than, etc.)
      Number of home runs in a season:
                Babe Ruth |   | Roger Maris
                          | 0 | 8
                          | 1 | 3, 4, 6
                     5, 2 | 2 | 3, 6, 8
                     5, 4 | 3 | 3, 9
      9, 7, 6, 6, 6, 1, 1 | 4 |
                  9, 4, 4 | 5 |
                        0 | 6 | 1

  14. Histogram of Discrete Data: Rolling a fair six-sided die 300 times
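
The slide's chart is not reproduced in this transcript, but the experiment it describes is easy to simulate. The sketch below is an assumption-laden stand-in, not the data behind the slide: `roll_counts` is a hypothetical helper, the seed is arbitrary, and the text histogram uses one star per five rolls.

```python
import random
from collections import Counter


def roll_counts(n_rolls=300, seed=0):
    """Simulate n_rolls of a fair six-sided die and count each face."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    counts = Counter(rolls)
    # Report every face, including any that happened never to come up.
    return {face: counts.get(face, 0) for face in range(1, 7)}


counts = roll_counts()
for face, count in counts.items():
    # One star per five rolls gives a rough text histogram.
    print(f"{face} | {'*' * (count // 5)} ({count})")
```

Because the variable is discrete, each bar sits on a single value (1 through 6), and with a fair die the six counts should each land somewhere near 50.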

  15. 1.14 Answer: Histogram of Continuous Data
      • The center is located at 350 ($350,000).
      • There appears to be one outlier of $1,103,000.
      • The distribution is skewed to the right with a peak in the $200,000s.
      • The spread is approximately $1,082,000 ($21,000 to $1,103,000).
      • Into which bars did the $200,000 and $300,000 salaries go?
      • Border values always go in the bar on the right!
      • (The first bar holds salaries of at least $0 and less than $100,000.)
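
The border-value rule can be stated as half-open bins: each bar covers its left edge but not its right edge. A minimal sketch, assuming salaries are recorded in thousands of dollars and bars are 100 wide starting at 0 (`bin_index` and `bar_label` are hypothetical names used only here):

```python
def bin_index(value, width=100, start=0):
    """Index of the bar holding value; bar k covers [start + k*width, start + (k+1)*width)."""
    return int((value - start) // width)


def bar_label(value, width=100, start=0):
    """Human-readable range of the bar that holds value."""
    low = start + bin_index(value, width, start) * width
    return f"{low} to <{low + width}"


print(bar_label(200))  # 200 to <300
print(bar_label(300))  # 300 to <400
print(bar_label(21))   # 0 to <100
```

So a salary of exactly $200,000 is counted in the $200,000s bar, not the $100,000s bar.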

  16. Histograms on the calculator
      • Enter the data into a list.
      • Turn StatPlot on and choose the histogram option. Set Xlist to the list that holds the data.
      • Choose 1 for Freq, or a second list if the data is stored in two lists (values in one, frequencies in the other).
      • Press Zoom 9:ZoomStat to fit the window to the data initially.
      • Check the window and set reasonable, “pretty” values of min and max for both x (values) and y (frequency count). Xscl sets the width of the bins; make this a “pretty” number also!
      • Then press graph to see the adjusted graph.
      • Press trace to see details of the graph.

  17. Histogram of People’s Weights

  18. Data from Histogram

  19. Ogive: Cumulative Relative Frequency Graph (vertical axis: cumulative relative frequency; horizontal axis: weight in pounds)
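
The heights plotted on an ogive are running totals of the histogram counts divided by the overall total. The weight data behind slides 17 to 19 is not in this transcript, so the bin edges and counts below are invented purely for illustration, and `cumulative_relative_frequency` is a hypothetical helper.

```python
from itertools import accumulate


def cumulative_relative_frequency(counts):
    """Running total of counts divided by the overall total, one value per bin."""
    total = sum(counts)
    return [running / total for running in accumulate(counts)]


# Hypothetical histogram of weights (NOT the data from the slides).
upper_edges = [120, 140, 160, 180, 200]  # upper edge of each bin, in pounds
counts = [2, 6, 8, 3, 1]                 # number of people in each bin

crf = cumulative_relative_frequency(counts)
for edge, value in zip(upper_edges, crf):
    # Each ogive point is plotted at the upper edge of its bin.
    print(f"below {edge} lb: {value:.2f}")
```

The values never decrease and the last one is always 1, which is why an ogive climbs from 0 to 1 and can be read for percentiles.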

  20. 5 Number Summary
      • Minimum
      • Q1 (lower quartile): the 25th percentile of the ordered data, or the median of the lower half of the ordered data
      • Median (Q2): the 50th percentile, or the middle number of the ordered data (average the two middle numbers if there is an even number of values)
      • Q3 (upper quartile): the 75th percentile of the ordered data, or the median of the upper half of the ordered data
      • Maximum
      Range = maximum – minimum
      IQR (interquartile range) = Q3 – Q1
      Outlier formula: any point that falls below Q1 – 1.5(IQR) or above Q3 + 1.5(IQR) is considered an outlier.
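
The "median of each half" definition above can be sketched as follows. This assumes the convention in which the overall median is left out of both halves when the number of values is odd; other software uses different quartile rules and may give slightly different answers. The function name and the sample list are made up for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # lower half; skips the median when n is odd
    upper = data[half + (n % 2):]  # upper half; same exclusion
    return min(data), median(lower), median(data), median(upper), max(data)


sample = [1, 3, 5, 7, 9, 11, 40]  # made-up values, not from the slides
low, q1, med, q3, high = five_number_summary(sample)
print(low, q1, med, q3, high)  # 1 3 7 11 40
print("Range:", high - low)    # Range: 39
print("IQR:", q3 - q1)         # IQR: 8
```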

  21. Boxplot – using the 5 # summary
      Salaries from 1.14: enter them in the calculator and press stat, calc, 1-var stats.
      Min 21, Q1 250, Median 350, Q3 543, Max 1103
      Check for outliers:
      • IQR = Q3 – Q1 = 543 – 250 = 293
      • Low boundary: Q1 – 1.5(IQR) = 250 – 1.5(293) = -189.5; no outliers on the low end, since no salaries are less than this
      • High boundary: Q3 + 1.5(IQR) = 543 + 1.5(293) = 982.5; one outlier on the high end (1103), since it is higher than 982.5
      The upper whisker ends at the maximum value that is not an outlier.
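
The outlier check on this slide is plain arithmetic on Q1 and Q3, so it is easy to verify. A small sketch using the slide's five-number summary (salaries in thousands of dollars); `outlier_fences` is a hypothetical helper name:

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (low, high) boundaries Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Five-number summary from the slide, in thousands of dollars.
summary = {"min": 21, "q1": 250, "median": 350, "q3": 543, "max": 1103}

low, high = outlier_fences(summary["q1"], summary["q3"])
print(low, high)  # -189.5 982.5

has_low_outlier = summary["min"] < low    # False: 21 is above the low boundary
has_high_outlier = summary["max"] > high  # True: 1103 is above 982.5
print(has_low_outlier, has_high_outlier)  # False True
```

Checking only the minimum and maximum shows whether each end has at least one outlier; finding every outlier requires the full list of salaries, which is not in this transcript.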

  22. Comparative Parallel Boxplots

  23. Scatterplot—Bivariate quantitative data
