Analyzing Measurement Data

Presentation Transcript


  1. Analyzing Measurement Data Analyzing Data

  2. Example
     Prediction: If the spring on the slingshot is pulled back 1 m, the softball will land 17 m downrange.
     To confirm the prediction, data is collected from 20 trials.

  3. Example
     • Most values fall between 14 and 20 m.
     • This data contains an outlier of 45.2 m.

  4. Represent the Data with a Histogram
     • First, determine an appropriate bin size.
     • The bin size [k] can be assigned directly or can be calculated from a suggested number of bins [h].
     • Let’s try the most commonly used formula first: [formula not captured in transcript] = 4.43 ≈ 5
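
The bin calculation on this slide can be sketched in Python. The slide's own formula is not preserved in the transcript, so this sketch assumes the square-root rule (h = √n, rounded up) and a bin size of k = (max − min) / h. Both are assumptions, and √20 ≈ 4.47 is close to but not identical to the 4.43 shown on the slide. The 20 distances are hypothetical values chosen to resemble the description of the data (most between 14 and 20 m, one outlier at 45.2 m). They are not the actual trial data.

```python
import math

def suggest_bin_count(n_points):
    """Square-root rule (an assumption, not confirmed by the slide): h = ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n_points))

def bin_width(values, n_bins):
    """Bin size k = (max - min) / h (assumed formula)."""
    return (max(values) - min(values)) / n_bins

def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    low = min(values)
    width = bin_width(values, n_bins)
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the last bin, so clamp it.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical distances in metres (NOT the slide's data).
distances = [14.2, 14.9, 15.3, 15.8, 16.1, 16.4, 16.7, 16.9, 17.1, 17.4,
             17.4, 17.6, 17.9, 18.2, 18.5, 18.8, 19.1, 19.5, 19.9, 45.2]

h = suggest_bin_count(len(distances))
k = bin_width(distances, h)
print(h, round(k, 2), histogram_counts(distances, h))  # 5 6.2 [19, 0, 0, 0, 1]
```

With the outlier included, 19 of the 20 hypothetical values fall into a single bin, which is one reason a formula-based bin size may not give the most descriptive histogram.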

  5. Histogram - Example
     Is this the best way to represent this data?
     By changing our bin size, [k], we can improve the representation.

  6. Histogram - Example
     All three histograms represent the exact same data set, but the bin width and number of bins for the two shown above were selected manually.
     Which one is most descriptive?

  7. Dealing with outliers
     • Engineers must carefully consider any outliers when analyzing data.
     • It is up to the engineer to determine whether the outlier is a valid data point or if it is invalid and should be discarded.
     • Invalid data points can result from measurement errors or recording the data incorrectly.

  8. Characterizing the data
     • Statistics allows us to characterize the data numerically as well as graphically.
     • We characterize data in two ways:
       • Central Tendency
       • Variation

  9. Central Tendency (Expected Value)
     • Central tendency is a single value that best represents the data.
     • But which number do we choose?
       • Mean
       • Median
       • Mode
     • Note: For most engineering applications, mean and median are most relevant.

  10. Central Tendency - Mean
      Is the mean value a good depiction of the data?
      How does the outlier affect the mean?

  11. Central Tendency - Mean
      Problem: Outliers may decrease the usefulness of the mean as a central value.
      Observe how outliers can affect the mean for this simple data set:
      [Table: the data set without outliers; changing 3 to -112 (outlier: -112); changing 44 to 212 (outlier: 212)]
      Solution: Look at the median.

  12. Central Tendency - Median
      n = 20 is an even number of data points, so we must take the average of the 2 middle values.
      In this case, the 2 middle values are both 17.4 m.
      Which value looks like a better representation of the data: the mean (18.47 m) or the median (17.4 m)? Why?
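
A minimal sketch of the even-count median rule described on this slide. The function name and the 20 distances are hypothetical. The values are not the actual trial data, only a made-up set whose two middle values are both 17.4 m.

```python
def median(values):
    """Return the middle value, or the average of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical distances in metres (NOT the slide's data); n = 20.
distances = [14.2, 14.9, 15.3, 15.8, 16.1, 16.4, 16.7, 16.9, 17.1, 17.4,
             17.4, 17.6, 17.9, 18.2, 18.5, 18.8, 19.1, 19.5, 19.9, 45.2]

mean = sum(distances) / len(distances)
print(median(distances))  # 17.4
print(mean > median(distances))  # True: the 45.2 m outlier pulls the mean upward
```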

  13. Central Tendency - Median
      Using the simple data set, observe how the median reduces the impact of outliers on the central tendency.
      Median = 21 in each case shown.
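
The comparison can be sketched with Python's standard `statistics` module. The slide's simple data set is not reproduced in the transcript, so the five values below are hypothetical: they were chosen only so that the set contains the 3 and the 44 that the slides replace and has a median of 21.

```python
import statistics

# Hypothetical stand-in for the slide's simple data set (an assumption).
without_outliers = [3, 12, 21, 30, 44]
low_outlier = [-112, 12, 21, 30, 44]    # changing 3 to -112
high_outlier = [3, 12, 21, 30, 212]     # changing 44 to 212

for label, data in [("without outliers", without_outliers),
                    ("outlier -112", low_outlier),
                    ("outlier 212", high_outlier)]:
    print(label, "mean =", statistics.mean(data), "median =", statistics.median(data))
```

For these assumed values the mean moves from 22 to -1 or 55.6, while the median stays at 21 in all three cases.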

  14. Central Tendency – Mean and Median
      Which value, the mean (18.47 m) or the median (17.4 m), is a better representation of the data?

  15. Characterizing the data
      • We can select a value of central tendency to represent the data, but is one number enough?
      • It is also important to know how much variation there is in the data set.
      • Variation refers to how the data is distributed around the central tendency value.

  16. Variation
      • As with central tendency, there are multiple ways to represent the variation of a set of data.
      • ± (“Plus, Minus”) gives the range of the values.
      • Standard Deviation provides a more sophisticated look at how the data is distributed around the central value.

  17. Variation - Standard Deviation
      Definition: how closely the values cluster around the mean; how much variation there is in the data.
      Equation: [not captured in transcript]
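
The slide's equation did not survive in the transcript. For reference, a commonly used form is the sample standard deviation, shown here as an assumption about what the slide intended. The symbols are introduced for illustration: x_i are the n measured values and x̄ is their mean.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

The population form divides by n instead of n − 1. Which of the two the original slide used cannot be determined from the transcript.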

  18. Standard Deviation Example
      [Worked example: mean and sum (∑) calculation; values not captured in transcript]
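
Because the slide's worked numbers are missing, here is a small stand-in example, a sketch under stated assumptions rather than the slide's own calculation. The data values and function names are hypothetical. The steps are: find the mean, square each deviation from it, sum the squares, divide, and take the square root.

```python
import math
import statistics

def sample_std(values):
    """Sample standard deviation: divide the summed squared deviations by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))

def population_std(values):
    """Population standard deviation: divide the summed squared deviations by n."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)

# Hypothetical data (NOT from the slide): mean = 5, summed squared deviations = 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(round(sample_std(data), 3))   # 2.138
print(population_std(data))         # 2.0
```

The hand-written functions agree with `statistics.stdev` and `statistics.pstdev` from the standard library, which are usually the better choice in practice.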

  19. Standard Deviation: Interpretation
      These curves describe the distribution of students’ exam grades. The average value is 83%.
      [Figure: Curve A and Curve B]
      Which class would you rather be in, A or B?

  20. Normal Distribution
      • Data that is normally distributed occurs with greatest frequency around the mean.
      • Normal distributions are also frequently referred to as Gaussian distributions or bell curves.
      [Figure: bell curve centered on the mean; x-axis: Bins (-5 to 5), y-axis: Frequency]

  21. Normal Distribution
      Mean = Median = Mode
      • About 68% of values fall within 1 SD of the mean.
      • About 95% of values fall within 2 SDs of the mean.
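
The 68% and 95% figures can be checked against the ideal normal curve using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `fraction_within` is made up for this sketch.

```python
from statistics import NormalDist

def fraction_within(k_sd, mean=0.0, sd=1.0):
    """Fraction of a normal distribution lying within k_sd standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k_sd * sd) - dist.cdf(mean - k_sd * sd)

print(round(fraction_within(1), 4))  # 0.6827
print(round(fraction_within(2), 4))  # 0.9545
```

The result does not depend on the particular mean or SD, so the same fractions apply to any normally distributed data set. Real measurements only approximate them.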

  22. Other Distributions
      • Skewed distributions
      • Uniform distribution
      • Multimodal distribution
      [Figures: one example plot per distribution type]

  23. What we’ve learned
      • This lecture has introduced some basic statistical tools that engineers use to analyze data.
      • Histograms are used to represent data graphically.
      • Engineers use both central tendency and variation to numerically describe data.
