Chapter 1 Data Presentation
• Statistics and Data
• Measurement Levels
• Summarizing Data
• Symmetry and Skewness
Statistics and Data
• Statistics – (1) a collection of techniques used in analyzing data; (2) the numbers produced in the analysis (e.g., an average)
• Data – a collection of measurements made on a number of subjects
• Subjects – the units from which information is drawn; also called experimental units
Data are usually stored in a row-and-column display called a spreadsheet (see page 2 of the textbook). Each row represents a subject, and each column represents a measured variable.
Measurement Levels of Data
Types of Data
A. Categorical Data – variables whose values are categories
i. Nominal – the possible values are just names of categories, with no apparent ordering among them
examples: Gender, Major, College
ii. Ordinal – the possible values have an obvious ordering
examples: Year level (Freshman, Sophomore, …), Military rank
Measurement Levels of Data
B. Numerical Data – variables whose values are numbers
i. Interval – differences between values are meaningful but ratios are not, because zero does not mean absence of the quantity
examples: Temperature, IQ (60°F is 30 degrees warmer than 30°F, but it is not twice as warm)
ii. Ratio – ratios between values are meaningful
examples: Age, Height, Number of classes taken this semester
a. Discrete – the result of a counting process
example: number of classes being taken, number of students in a class
b. Continuous – the result of a measuring process
example: height, age, weight, velocity
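To see why ratios are not meaningful on an interval scale, the sketch below re-expresses the 60°F vs 30°F comparison in Celsius. The helper name `fahrenheit_to_celsius` and the age values are illustrative assumptions, not part of the slides.

```python
def fahrenheit_to_celsius(f):
    """Standard Fahrenheit-to-Celsius conversion."""
    return (f - 32) * 5 / 9

# Interval scale: the ratio depends on the unit, so it carries no meaning.
ratio_f = 60 / 30
ratio_c = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(30)
print(ratio_f)  # 2.0 in Fahrenheit
print(ratio_c)  # about -14 in Celsius, for the same two temperatures

# Ratio scale (hypothetical ages): the ratio survives a change of unit.
ratio_years = 40 / 20
ratio_months = (40 * 12) / (20 * 12)
print(ratio_years, ratio_months)  # 2.0 2.0
```

The difference between the two temperatures is meaningful in either unit, but "twice as warm" is not, which is what separates interval data from ratio data.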
Summarizing Data
Summarizing Categorical Data
• Relative Frequency Table - lists the frequency and the relative frequency (share of all observations) of each category of a categorical variable
• Bar Chart - a plot of the relative frequency table; the order of the categories is arbitrary
• Pie Chart - also a plot of the relative frequency table, drawn as slices of a circle
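A relative frequency table for a categorical variable can be sketched as follows. The list of majors is made-up data for illustration, and `relative_frequency_table` is a hypothetical helper name.

```python
from collections import Counter

def relative_frequency_table(values):
    """Map each category to (frequency, relative frequency)."""
    counts = Counter(values)
    n = len(values)
    return {category: (count, count / n) for category, count in counts.items()}

# Hypothetical nominal data: declared majors of eight students.
majors = ["Math", "Biology", "Math", "History",
          "Biology", "Math", "Math", "History"]

table = relative_frequency_table(majors)
for category, (freq, rel) in table.items():
    print(f"{category:<10}{freq:>5}{rel:>8.3f}")
```

The relative frequencies always sum to 1, so a bar chart or pie chart built from them shows each category's share of the whole.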
Bar Chart of the Relative Frequency Table
(Two charts on the slide: one using frequency, one using relative frequency.)
Summarizing Numerical Data
1. Stem and Leaf Plot
2. Relative Frequency Table and Histogram
- same concept as for categorical data
- first determine the number of classes and the class width
- for example, from MIN, MAX, and the number of classes: width = (MAX - MIN)/(classes - 1)
- the class intervals should be mutually exclusive
- the histogram is just the graphical presentation of the relative frequency table
3. Box-and-Whisker Plot
- a graphical picture of the distribution of the quarters of the data
- useful for comparing the distributions of two or more variables
• Minimum
• Q1 (first quartile) – the upper boundary of the first quarter
• Median – divides the data into lower and upper halves
• Q3 (third quartile) – the upper boundary of the third quarter
• Maximum
4. Dotplot
- similar to the histogram but used for moderately large data sets
- can also be used to study outliers in the data
Stem-and-leaf Display
Summer 2 Quiz Data: 8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47
Stemplot of Summer 2 Quiz
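One common way to build a stemplot for these quiz scores is to use the tens digit as the stem and the units digit as the leaf. The sketch below assumes that layout, which may differ in detail from the plot on the original slide; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return stemplot lines, tens digit as stem and units digit as leaf."""
    stems = defaultdict(list)
    for value in sorted(data):
        stems[value // 10].append(value % 10)
    lines = []
    # Walk every stem from smallest to largest so empty stems still appear.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

quiz = [8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47]
plot_lines = stem_and_leaf(quiz)
print("\n".join(plot_lines))
# 0 | 8
# 1 | 1 3 9
# 2 | 1 3 5 5 5 8
# 3 | 1 5 9
# 4 | 7
```

Turned on its side, the display has the same outline as a histogram with classes of width 10, while still showing every individual score.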
Relative Frequency Table and Histogram
Summer 2 Quiz Data: 8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47
For example, suppose 4 classes are desired. MIN = 8, MAX = 47
Class width = (47 - 8)/(4 - 1) = 39/3 = 13
Note: each interval includes its right endpoint but not its left endpoint.
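The class-width arithmetic above can be turned into a relative frequency table as sketched below. The table from the slide is not reproduced here, so the class boundaries are an assumption: the upper endpoints are taken to be MIN, MIN + width, and so on up to MAX, which is one reading consistent with dividing by (classes - 1) and with right-closed intervals. `class_table` is a hypothetical helper name.

```python
def class_table(data, num_classes):
    """Build (lower, upper, frequency, relative frequency) rows.

    Assumption: the upper class endpoints are MIN, MIN + width, ..., MAX,
    and each class is the interval (lower, upper].
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / (num_classes - 1)
    rows = []
    for k in range(num_classes):
        upper = lo + k * width
        lower = upper - width
        freq = sum(1 for x in data if lower < x <= upper)
        rows.append((lower, upper, freq, freq / len(data)))
    return rows

quiz = [8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47]
rows = class_table(quiz, 4)
for lower, upper, freq, rel in rows:
    print(f"({lower:g}, {upper:g}]  {freq:>2}  {rel:.3f}")
```

Under this assumption the four classes have frequencies 1, 4, 6, and 3, which add up to the 14 scores; a histogram would draw one bar per class with those heights.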
Boxplot or Box-and-Whisker Plot
Summer 2 Quiz Data: 8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47
• Minimum = 8
• Q1 (first quartile) = 19
• Median = 25
• Q3 (third quartile) = 31
• Maximum = 47
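The five values above can be reproduced by taking Q1 and Q3 as the medians of the lower and upper halves of the sorted data. The slides do not say which quartile rule is used, so this rule is an assumption that happens to match the listed numbers; statistical software may use other conventions and return slightly different quartiles.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    values = sorted(data)
    n = len(values)
    lower_half = values[: n // 2]        # excludes the middle value when n is odd
    upper_half = values[(n + 1) // 2 :]
    return (values[0], median(lower_half), median(values),
            median(upper_half), values[-1])

quiz = [8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47]
summary = five_number_summary(quiz)
print(summary)  # (8, 19, 25.0, 31, 47)
```

The box of the plot spans Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.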
Symmetry and Skewness
Examining symmetry and skewness reveals the shape of the data.
• If the left tail is longer than the right tail, the data are left-skewed.
• If the right tail is longer than the left tail, the data are right-skewed.
• If the two tails are about the same length, the data are symmetric.
A stem-and-leaf display, a histogram, or a boxplot can be used to examine symmetry and skewness.
Symmetry and Skewness
The left tail is longer than the right tail, hence the data are left-skewed.
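The tail comparison can be sketched numerically by measuring each tail from the median, as one would read it off a boxplot. This is a rough heuristic, not a method given in the slides: the function name `tail_shape`, the 10% tolerance for "about the same", and the sample data sets are all assumptions made for illustration.

```python
from statistics import median

def tail_shape(data, tolerance=0.1):
    """Classify shape by comparing the distances from the median to each extreme.

    Assumption: tails whose lengths differ by at most `tolerance` times
    the range are treated as "about the same".
    """
    lo, hi, mid = min(data), max(data), median(data)
    left_tail, right_tail = mid - lo, hi - mid
    spread = hi - lo
    if spread == 0 or abs(left_tail - right_tail) <= tolerance * spread:
        return "symmetric"
    return "left-skewed" if left_tail > right_tail else "right-skewed"

# Hypothetical data sets, one for each shape.
print(tail_shape([2, 9, 14, 17, 18, 19, 20, 20, 21]))  # left-skewed
print(tail_shape([1, 2, 2, 3, 4, 5, 8, 13, 20]))       # right-skewed
print(tail_shape([1, 2, 3, 4, 5]))                     # symmetric
```

A plot remains the better tool for judging shape, since a single extreme value can dominate this kind of comparison.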