1 / 28

Chapter 6. Descriptive Statistics

Chapter 6. Descriptive Statistics. 6.1 Experimentation 6.2 Data Presentation 6.3 Sample Statistics 6.4 Examples. Data: a mixture of nature and noise. Is the noise manageable? The noise is desired to be represented by a probability distribution. Statistical inference:

oblevins
Download Presentation

Chapter 6. Descriptive Statistics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 6. Descriptive Statistics 6.1 Experimentation 6.2 Data Presentation 6.3 Sample Statistics 6.4 Examples

  2. Data: a mixture of nature and noise. • Is the noise manageable? • The noise is desired to be represented by a probability distribution. • Statistical inference: • The science of deducing properties of an underlying probability distribution from data • Can we have information on the underlying probability distribution? • The information is given in the form of (functions of) data.

  3. Figure 6.1 The relationship between probability theory and statistical inference

  4. 6.1 Experimentation6.1.1 Samples • Population: the set of all the possible observations available from a particular probability distribution. • Sample: a subset of a population. • Random sample: a sample where the elements are chosen at random from the population • A sample is desired to be representative of the population. • Types of observations: numerical and nominal x

  5. 6.1.2 Examples • Example 1: Machine breakdowns • Suppose that an engineer in charge of the maintenance of a machine keeps records on the breakdown causes over a period of a year. • Suppose that 46 breakdowns were observed by the engineer (see Figure 6.2). • What is the population from which this sample is drawn? • Factors to consider to check the representative of data: • Quality of operators • Working load on the machine • Particularity of data observation (e.g., more rainy days than other years)

  6. Figure 6.2 Data set of machine breakdowns

  7. Example 2: Defective computer chips • The chip boxes are selected at random from ….. • Points to check on data: • What is the data type? • Are the data representative? • How the randomness of data realized? • Statistical problem: • What is the population from which the data are sampled?

  8. Figure 6.4Data set of defective computer chips

  9. 6.2 Data presentation 6.2.1 Bar and Pareto charts 6.2.2 Pie charts 6.2.3 Histograms 6.2.4 Outliers • An outlier is an observation which is not from the distribution from which the main body of the sample is collected.

  10. Figure 6.7 Bar chart of machine breakdowns data set

  11. Figure 6.9 Pareto chart of customer complaints for Internet company

  12. Figure 6.12Pie chart for machine breakdowns data set

  13. Figure 6.14Histogram of computer chips data set

  14. Figure 6.16 Histograms of metal cylinder diameter data set with different bandwidths

  15. Figure 6.18 A histogram with positive skewness

  16. Figure 6.19 A histogram with negative skewness

  17. Figure 6.21 Histogram of a data set with a possible outlier

  18. 6.3 Sample statistics 6.3.1 Sample mean 6.3.2 Sample median 6.3.3 Sample trimmed mean 6.3.4 Sample mode 6.3.5 Sample variance 6.3.6 pth Sample quantiles 6.3.7 Boxplots

  19. Cf. Chebyshev’s inequality: Let Then, In general, Cf. Theorem: the weak law of large numbers Let be a sequence of i.i.d. random variables, each having mean and variance Then, for any

  20. (proof)

  21. Figure 6.22 Illustrative data set

  22. Figure 6.23Relationship between the samplemean, median, and trimmed meanfor positively and negativelyskeweddata sets

  23. Figure 6.20 A histogram for a bimodal distribution

  24. Figure 6.24 Boxplot of a data set

  25. Figure 6.30 Rolling mill process

  26. Figure 6.31 % scrap data set from rolling mill process

  27. Figure 6.32 Histogram of rolling mill scrap data set

  28. Figure 6.33 Boxplot and summary statistics for rolling mill scrap data set

More Related