Lecture 6 Sections 2.1 – 2.2

virginiau

Presentation Transcript


  1. Lecture 6 Sections 2.1 – 2.2
     Objectives:
     • Measure of Center
     • Measure of Center for Data
     • Measure of Center for Distributions
     • Measure of Variability for Data
     • Measure of Variability for Distributions
     • The Empirical Rule (Normal Distribution)

  2. The Sample Mean
     To describe a “typical” or “representative” observation, we will use the sample mean, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
     • Most frequently used measure of the center.
     • Sensitive to extreme observations (outliers).
     • Useful for estimating the center when the distribution is symmetric and free of outliers.

  3. Example
     Caustic stress corrosion cracking of iron and steel has been studied because of failures around rivets in steel boilers and failures of steam rotors. Consider the accompanying observations on crack length (μm) as a result of constant load stress corrosion tests on smooth bar tensile samples for a fixed length of time. The data are from the article “On the Role of Phosphorus in the Caustic Stress Corrosion Cracking of Low Alloy Steels”, Corrosion Science, 1989: 53-68.
     16.1  9.6 24.9 20.4 12.7 21.2 30.2
     25.8 18.5 10.3 25.3 14.0 27.1 45.0
     23.3 24.2 14.6  8.9 32.4 11.8 28.5
     a. Find the mean crack length.
     b. Replace 45.0 by 295.0 and then find the mean crack length.
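Parts (a) and (b) can be checked numerically. The following is a minimal Python sketch using only the standard library; the variable names are illustrative and not part of the lecture.

```python
from statistics import mean

# Crack lengths (μm) from the example
crack_lengths = [
    16.1, 9.6, 24.9, 20.4, 12.7, 21.2, 30.2,
    25.8, 18.5, 10.3, 25.3, 14.0, 27.1, 45.0,
    23.3, 24.2, 14.6, 8.9, 32.4, 11.8, 28.5,
]

# (a) Sample mean of the original 21 observations
mean_original = mean(crack_lengths)

# (b) Replace the largest value, 45.0, by 295.0 and recompute
with_outlier = [295.0 if x == 45.0 else x for x in crack_lengths]
mean_outlier = mean(with_outlier)

print(round(mean_original, 2))  # 21.18
print(round(mean_outlier, 2))   # 33.09
```

A single extreme value moves the mean from about 21.2 to about 33.1, which illustrates the sensitivity of the sample mean to outliers.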

  4. The Sample Median
     The median is the midpoint of the observations in the ordered list, so about 50% of the data fall below it and about 50% fall above it.
     • If n is odd, the median is the middle value in the ordered list, that is, the ((n+1)/2)th observation.
     • If n is even, there is no unique middle value, so the median is the average of the middle pair of values.
     • Much less sensitive to extreme observations (outliers) than the mean.
     • Useful for estimating the center when the distribution is skewed.

  5. Example
     Consider the following 5 observations:
     34 44 56 63 67
     Consider the following 10 observations:
     49.2 53.9 50.0 44.5 42.2 42.3 32.3 31.3 60.9 47.5
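The odd and even cases of the median rule can be tried on these two data sets. This is a short sketch using Python's standard `statistics.median`; the list names are illustrative.

```python
from statistics import median

five = [34, 44, 56, 63, 67]
ten = [49.2, 53.9, 50.0, 44.5, 42.2, 42.3, 32.3, 31.3, 60.9, 47.5]

# n = 5 is odd: the median is the 3rd value in the ordered list
median_five = median(five)

# n = 10 is even: the median is the average of the 5th and 6th ordered values
median_ten = median(ten)

print(sorted(ten))   # the middle pair is 44.5 and 47.5
print(median_five)   # 56
print(median_ten)    # 46.0
```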

  6. Trimmed Means
     The 100r% trimmed mean is the mean of the observations that remain after trimming the largest 100r% and the smallest 100r% of the n observations (that is, nr observations from each end), where r is a number between 0 and 0.5.
     Note: The trimmed mean is less sensitive to outliers than the mean but more sensitive than the median.
     Example. Consider the following 20 observations, each representing the lifetime (hr) of a certain type of incandescent lamp:
     612  623  666  744  883  898  964  970  983 1003
     1016 1022 1029 1058 1085 1088 1122 1135 1197 1201
     Find the 10% trimmed mean.
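A trimmed mean can be sketched in a few lines of Python. The helper `trimmed_mean` below is a hypothetical name introduced for illustration, and it assumes that r times n is a whole number, as it is here (0.10 × 20 = 2 observations trimmed from each end).

```python
def trimmed_mean(data, r):
    """Mean after dropping the r*n smallest and r*n largest observations.

    Assumes r*n is (close to) a whole number; other cases need a
    convention that the lecture does not specify.
    """
    n = len(data)
    k = round(r * n)               # number of observations trimmed per end
    kept = sorted(data)[k:n - k]   # ordered data without the k extremes on each side
    return sum(kept) / len(kept)

lifetimes = [
    612, 623, 666, 744, 883, 898, 964, 970, 983, 1003,
    1016, 1022, 1029, 1058, 1085, 1088, 1122, 1135, 1197, 1201,
]

tm10 = trimmed_mean(lifetimes, 0.10)
plain_mean = sum(lifetimes) / len(lifetimes)

print(tm10)        # 979.125
print(plain_mean)  # 964.95
```

Trimming removes 612, 623, 1197 and 1201, and the mean of the remaining 16 lifetimes is 979.125 hr.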

  7. Population Mean: Discrete Distributions
     Definition. The mean (or expected value) of a discrete variable x is given by $\mu = \sum x\,p(x)$, where the sum is over all possible values of x.
     Example. Plastic parts manufactured using an injection molding process may exhibit one or more defects, including sinks, scratches, black spots, and so on. Let x represent the number of defects on a single part, and suppose the distribution of x is as follows:
     x      0    1    2    3    4
     p(x)  .80  .14  .03  .02  .01
     1) If $x \sim B(n, \pi)$, then $\mu = n\pi$.
     2) If $x \sim \text{Poisson}(\lambda)$, then $\mu = \lambda$.
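The mean in the plastic part example is a probability-weighted sum of the possible values. A minimal Python sketch, with the distribution stored in a dictionary named `pmf` (an illustrative choice):

```python
# Distribution of the number of defects x on a single part
pmf = {0: 0.80, 1: 0.14, 2: 0.03, 3: 0.02, 4: 0.01}

# A valid distribution has probabilities that sum to 1
total_probability = sum(pmf.values())

# Mean (expected value): sum of x * p(x) over all possible values of x
mu = sum(x * p for x, p in pmf.items())

print(round(total_probability, 10))  # 1.0
print(round(mu, 2))                  # 0.3
```

On average a part has 0.30 defects, even though 0.30 is not itself a possible value of x.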

  8. Population Mean: Continuous Distributions
     Definition. The mean (or expected value) of a continuous variable x with density function f(x) is given by $\mu = \int_{-\infty}^{\infty} x\,f(x)\,dx$.
     Example. The amount of gravel (tons) sold by a particular construction supply company in a given week is a continuous variable x with density function f(x). Knowledge of the mean value of x will help the company decide on a price for the gravel. Find the mean of x.

  9. Means for Specific Distributions
     Continuous distributions:
     • If $x \sim N(\mu, \sigma^2)$, then the mean of x is $\mu$.
     • If x has an exponential distribution with rate parameter $\lambda$ (density $\lambda e^{-\lambda x}$ for $x \ge 0$), then the mean of x is $1/\lambda$.
     • If x has a lognormal distribution with parameters $\mu$ and $\sigma$, then the mean of x is $e^{\mu + \sigma^2/2}$.
     • The mean of a Weibull distribution is a somewhat complicated expression involving the parameters α and β.

  10. Population Median
      Median for a continuous distribution: the median of a continuous distribution divides the area under the density curve into two equal halves. Writing $\tilde{\mu}$ for the median, the defining condition is $\int_{-\infty}^{\tilde{\mu}} f(x)\,dx = 0.5$.
      Example. Find the median for the distribution of weekly gravel sales.

  11. Mean and Median
      For a symmetric distribution, the mean and the median are the same. The median is a measure of center that is resistant to skew and outliers; the mean is not, so in a skewed distribution the mean is typically pulled away from the median toward the longer tail.
      [Figure: positions of the mean and median for a symmetric distribution, a left-skewed distribution, and a right-skewed distribution]

  12. Measure of Variability for Data
      A measure of the center is not enough to describe a distribution well.
      Example. Suppose the heights of five starting basketball players on two men’s basketball teams are:
      Team I (inches):  72 73 76 76 78
      Team II (inches): 67 72 76 76 84
      Range: the simplest measure of variability.
      Range = the difference between the largest and the smallest sample values.
      The range depends on only the two most extreme observations and disregards the positions of the remaining (n-2) values.
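The two teams show why a measure of center alone is not enough. The sketch below, in plain Python with illustrative names, computes the mean, median, and range for each team.

```python
from statistics import mean, median

team_1 = [72, 73, 76, 76, 78]
team_2 = [67, 72, 76, 76, 84]

def sample_range(data):
    """Largest sample value minus smallest sample value."""
    return max(data) - min(data)

for name, heights in [("Team I", team_1), ("Team II", team_2)]:
    print(name, mean(heights), median(heights), sample_range(heights))
# Team I 75 76 6
# Team II 75 76 17
```

Both teams have mean 75 inches and median 76 inches, yet the ranges are 6 and 17 inches, so Team II's heights are far more spread out.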

  13. Sample Variance and Standard Deviation
      • The sample variance, $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$, measures roughly the average squared distance of the observations from the mean.
      • The sample standard deviation, denoted by s, is the square root of the variance.
      • The more spread out a distribution is around its mean, the larger its standard deviation.
      • The unit for s is the same as the unit for the data.
      • Both are sensitive to outliers.

  14. Example
      Strength is an important characteristic of materials used in prefabricated housing. Each of 11 prefabricated plate elements was subjected to a severe stress test, and the maximum width (mm) of the resulting cracks was recorded. The data are from the article “Prefabricated Ferrocement Ribbed Elements for Low-Cost Housing” (J. of Ferrocement, 1984: 347-364).
      .684 2.540 .924 3.130 1.038 .598 .483 3.520 1.285 2.650 1.497
      Find the sample variance and sample standard deviation.
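The crack-width calculation can be sketched in Python. The code computes the sample variance directly with divisor n - 1 and compares it with the standard library's `statistics.variance` and `statistics.stdev`, which use the same divisor; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

# Maximum crack widths (mm) for the 11 plate elements
widths = [0.684, 2.540, 0.924, 3.130, 1.038, 0.598,
          0.483, 3.520, 1.285, 2.650, 1.497]

n = len(widths)
x_bar = sum(widths) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1
s2 = sum((x - x_bar) ** 2 for x in widths) / (n - 1)
s = sqrt(s2)

print(round(x_bar, 3))  # 1.668
print(round(s2, 3))     # 1.194
print(round(s, 3))      # 1.093

# The library functions agree with the direct calculation
print(abs(s2 - variance(widths)) < 1e-9, abs(s - stdev(widths)) < 1e-9)
```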

  15. Measure of Variability for Distributions
      Population variance and standard deviation: discrete distributions
      Definition. The variance of a discrete variable x is given by $\sigma^2 = \sum (x - \mu)^2\,p(x)$, where the sum is over all possible values of x. The standard deviation is $\sigma$, the positive square root of the variance.
      Example. Revisit the plastic part example. Find the variance and standard deviation.
      Variances for specific discrete distributions:
      1) If $x \sim B(n, \pi)$, then $\sigma^2 = n\pi(1-\pi)$.
      2) If $x \sim \text{Poisson}(\lambda)$, then $\sigma^2 = \lambda$.
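For the plastic part example, the variance is the probability-weighted average of squared deviations from the mean. A minimal Python sketch, restating the distribution (x = 0, 1, 2, 3, 4 with probabilities .80, .14, .03, .02, .01) in a dictionary named `pmf`, an illustrative choice:

```python
from math import sqrt

# Distribution of the number of defects x on a single part
pmf = {0: 0.80, 1: 0.14, 2: 0.03, 3: 0.02, 4: 0.01}

# Mean: sum of x * p(x)
mu = sum(x * p for x, p in pmf.items())

# Variance: sum of (x - mu)^2 * p(x); standard deviation is its square root
sigma2 = sum((x - mu) ** 2 * p for x, p in pmf.items())
sigma = sqrt(sigma2)

print(round(mu, 2))      # 0.3
print(round(sigma2, 2))  # 0.51
print(round(sigma, 3))   # 0.714
```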

  16. Measure of Variability for Distributions
      Population variance and standard deviation: continuous distributions
      Definition. The variance of a continuous variable x with density function f(x) is given by $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$. The standard deviation is $\sigma$, the positive square root of the variance.
      Example. Revisit the gravel sales example. Find the variance and standard deviation of x.
      Variances for specific continuous distributions:
      1) If $x \sim N(\mu, \sigma^2)$, then the variance of x is $\sigma^2$.
      2) If x has an exponential distribution with rate parameter $\lambda$ (density $\lambda e^{-\lambda x}$ for $x \ge 0$), then the variance of x is $1/\lambda^2$.
      3) If x has a lognormal distribution with parameters $\mu$ and $\sigma$, then the variance of x is $e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$.
      4) The variance of a Weibull distribution is even more complicated.
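The integral definitions of the mean and variance can be checked numerically. Since the gravel density is not reproduced in this transcript, the sketch below uses a stand-in: an exponential density with rate λ = 2 (an assumed value), for which the known results are mean 1/λ = 0.5 and variance 1/λ² = 0.25. The helper `integrate` is a hypothetical midpoint-rule integrator, not a library function, and the upper limit 20 is an assumption chosen so that the neglected tail is negligible.

```python
import math

def integrate(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

lam = 2.0      # assumed rate parameter, for illustration only
upper = 20.0   # the density beyond this point is negligible for lam = 2

def f(x):
    """Exponential density with rate lam, for x >= 0."""
    return lam * math.exp(-lam * x)

area = integrate(f, 0.0, upper)                                # should be close to 1
mu = integrate(lambda x: x * f(x), 0.0, upper)                 # mean
sigma2 = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, upper) # variance

print(round(area, 4), round(mu, 4), round(sigma2, 4))
```

The same three integrals apply to any density; only `f` and the integration limits change.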

  17. The Empirical Rule
      For any variable x whose distribution is well approximated by a normal curve:
      • Approximately 68% of the values are within 1 standard deviation of the mean.
      • Approximately 95% of the values are within 2 standard deviations of the mean.
      • Approximately 99.7% of the values are within 3 standard deviations of the mean.
      [Figure: standard normal curve, N(0,1)]

  18. Example
      Scores on an achievement test taken by all high school seniors in a certain state are known to have, approximately, a bell-shaped distribution with mean μ = 64 and standard deviation σ = 10.
      • About 68% of the scores lie in the interval μ ± σ, that is, between 54 and 74.
      • About 95% of the scores are between μ - 2σ = 44 and μ + 2σ = 84.
      • Almost all (about 99.7%) of the scores are between μ - 3σ = 34 and μ + 3σ = 94.
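The three intervals in this example follow from μ ± kσ for k = 1, 2, 3. A small Python sketch; the function name `empirical_rule_intervals` is hypothetical and introduced only for illustration.

```python
def empirical_rule_intervals(mu, sigma):
    """Map each approximate coverage (%) to the interval mu ± k*sigma."""
    coverage = {1: 68, 2: 95, 3: 99.7}
    return {pct: (mu - k * sigma, mu + k * sigma) for k, pct in coverage.items()}

intervals = empirical_rule_intervals(64, 10)

for pct, (low, high) in intervals.items():
    print(f"about {pct}% of scores lie between {low} and {high}")
# about 68% of scores lie between 54 and 74
# about 95% of scores lie between 44 and 84
# about 99.7% of scores lie between 34 and 94
```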

  19. Example The time to complete a standardized exam is approximately bell shaped with a mean of 70 minutes and a standard deviation of 10 minutes. Using the empirical rule, what percent of students will complete the exam in under an hour?
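One way to reason about this question: 60 minutes is one standard deviation below the mean of 70. About 68% of times fall within one standard deviation, so about 32% fall outside, and by symmetry half of that, about 16%, fall below 60. The Python sketch below does this arithmetic and, as an optional comparison, evaluates the exact normal probability with the standard library's `statistics.NormalDist`, assuming the distribution is exactly normal.

```python
from statistics import NormalDist

mu, sigma = 70, 10
cutoff = 60   # "under an hour", in minutes

# Number of standard deviations between the cutoff and the mean
k = (mu - cutoff) / sigma        # 1.0

# Empirical rule: about 68% within 1 sd, so the two tails share the rest equally
within_one_sd = 0.68
below_cutoff = (1 - within_one_sd) / 2

# Exact value under an exactly normal model, for comparison
exact = NormalDist(mu, sigma).cdf(cutoff)

print(round(below_cutoff, 2))  # 0.16
print(round(exact, 4))         # 0.1587
```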
