Chapter 1: Data Presentation
• Statistics and Data
• Measurement Levels
• Summarizing Data
• Symmetry and Skewness
Statistics and Data
• Statistics – (1) a collection of techniques used in analyzing data; (2) the numbers produced in the analysis (e.g., an average)
• Data – a collection of measurements made on a number of subjects
• Subjects – the units from which the information is drawn; also called experimental units
Data are usually stored in a row-and-column display called a spreadsheet (example from page 2 of the textbook). Each row represents a subject, and each column represents a measured variable.
Measurement Levels of Data
Types of Data
• Categorical Data – produced by variables whose values are categories
    • Nominal – the possible values are just names of categories, with no apparent ordering among them. Examples: Gender, Major, College
    • Ordinal – the possible values have an obvious ordering. Examples: Year level (Freshman, Sophomore, …), Military rank
• Numerical Data – produced by variables whose values are numbers
    • Interval – differences between values are meaningful, but ratios are not; zero does not mean absence of the quantity. Examples: Temperature, IQ. Between 60 °F and 30 °F there is a 30-degree difference, but 60 °F is not twice as warm as 30 °F.
    • Ratio – ratios between values are meaningful. Examples: Age, Height, Number of classes taken this semester
Ratio data can be further divided into two types:
• Discrete – the result of a counting process. Examples: number of classes being taken, number of students in a class
• Continuous – the result of a measuring process. Examples: height, age, weight, velocity
Summarizing Data
Summarizing Categorical Data
• Relative Frequency Table – lists the frequency (or relative frequency) of each category of a categorical variable
• Bar Chart – a plot of the relative frequency table; the order of the categories is arbitrary
• Pie Chart – also a plot of the relative frequency table, but drawn as a circle divided into slices
Bar Chart of the Relative Frequency Table
[Figure: two bar charts of the same table, one using frequency and one using relative frequency]
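As a rough illustration of how a relative frequency table for a categorical variable is built, here is a minimal Python sketch. The function name `relative_frequencies` and the sample list of majors are hypothetical, made up for this example; they are not from the textbook.

```python
from collections import Counter

def relative_frequencies(categories):
    """Map each category to (frequency, relative frequency)."""
    counts = Counter(categories)
    total = len(categories)
    return {cat: (n, n / total) for cat, n in counts.items()}

# Hypothetical sample of majors, made up for illustration only.
majors = ["Math", "Biology", "Math", "History", "Biology", "Math", "Art", "Math"]

table = relative_frequencies(majors)
for cat, (n, rel) in table.items():
    print(f"{cat:<8} {n:>3} {rel:.3f}")
```

Plotting the frequency column or the relative frequency column gives bar charts with the same shape; only the vertical scale changes.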
Summarizing Numerical Data
• Stem-and-Leaf Plot
• Relative Frequency Table and Histogram
    • same concept as for categorical data
    • first determine the number of classes and the class width
    • for example, from MIN, MAX, and the number of classes: width = (MAX − MIN) / (classes − 1)
    • the class intervals should be mutually exclusive
    • the histogram is just the graphical presentation of the relative frequency table
• Box-and-Whisker Plot
    • a graphical picture of how the quarters of the data are distributed
    • useful for comparing the distributions of two or more variables
    • built from five values:
        • Minimum
        • Q1 (first quartile) – the upper boundary of the first quarter
        • Median – divides the data into lower and upper halves
        • Q3 (third quartile) – the upper boundary of the third quarter
        • Maximum
• Dotplot
    • similar to the histogram but used for moderately large data sets
    • can also be used to study outliers in the data
Stem-and-Leaf Display
Summer 2 Quiz Data: 8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47
[Figure: stemplot of Summer 2 Quiz]
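A stemplot of the quiz data can be generated with a short Python sketch. It assumes the common convention for two-digit data (stem = tens digit, leaf = units digit); the function name `stem_and_leaf` is hypothetical, and the slide's own plot may be laid out differently.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem-and-leaf rows for non-negative integers.

    Assumption: stem = tens digit, leaf = units digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # Include stems with no leaves so gaps in the data stay visible.
    stems = range(min(leaves), max(leaves) + 1)
    return [
        f"{stem} | {' '.join(str(d) for d in leaves[stem])}".rstrip()
        for stem in stems
    ]

quiz = [8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47]
for row in stem_and_leaf(quiz):
    print(row)
# 0 | 8
# 1 | 1 3 9
# 2 | 1 3 5 5 5 8
# 3 | 1 5 9
# 4 | 7
```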
Relative Frequency Table and Histogram
Summer 2 Quiz Data: 8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47
For example, suppose 4 classes are desired. MIN = 8, MAX = 47
Class width = (47 − 8) / (4 − 1) = 39 / 3 = 13
Note: each interval includes its right endpoint but not its left endpoint.
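The class-width calculation and the resulting table can be sketched in Python. The slide does not state where the first class begins, so this sketch assumes the first class ends at MIN (which lets MIN fall in a right-closed interval), giving (−5, 8], (8, 21], (21, 34], (34, 47]. That placement and the function name `relative_frequency_table` are assumptions, not taken from the textbook.

```python
def relative_frequency_table(values, n_classes):
    """Return (class width, rows), each row = (left, right, count, rel. freq.).

    Width follows the slide: (MAX - MIN) / (classes - 1).
    Assumption: the first class ends at MIN, and every class is
    open on the left and closed on the right, i.e. (left, right].
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / (n_classes - 1)
    edges = [lo + (i - 1) * width for i in range(n_classes + 1)]
    edges[-1] = hi  # guard against floating-point drift at the top edge
    rows = []
    for left, right in zip(edges, edges[1:]):
        count = sum(1 for v in values if left < v <= right)
        rows.append((left, right, count, count / len(values)))
    return width, rows

quiz = [8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47]
width, rows = relative_frequency_table(quiz, 4)
print("class width:", width)
for left, right, count, rel in rows:
    print(f"({left:g}, {right:g}]  {count:>2}  {rel:.3f}")
```

Under this assumption the class counts are 1, 4, 6, and 3; a histogram is simply a bar for each row, with height equal to the relative frequency.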
Boxplot or Box-and-Whisker Plot
Summer 2 Quiz Data: 8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47
• Minimum = 8
• Q1 (first quartile) = 19
• Median = 25
• Q3 (third quartile) = 31
• Maximum = 47
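The five values above can be reproduced with a small Python sketch. It assumes the quartiles are the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd); other quartile conventions exist and can give slightly different answers. The function name `five_number_summary` is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper
    halves; the middle value is excluded when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]

quiz = [8, 11, 13, 19, 21, 23, 25, 25, 25, 28, 31, 35, 39, 47]
print(five_number_summary(quiz))
```

For the quiz data this yields 8, 19, 25, 31, and 47, matching the values on the slide.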
Symmetry and Skewness
Examining symmetry and skewness reveals the shape of the data.
• If the left tail is longer than the right tail, the data are left-skewed.
• If the right tail is longer than the left tail, the data are right-skewed.
• If the left tail is almost the same length as the right tail, the data are symmetric.
The stem-and-leaf display, histogram, and boxplot can all be used to examine symmetry and skewness.
The left tail is longer than the right tail, so the data are left-skewed.
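The tail comparison can be roughed out numerically in Python. This is only a crude heuristic, not a method from the textbook: it takes the "tails" to be the distances from the median to the minimum and to the maximum, and it treats them as "almost the same" when they differ by no more than a fraction `tolerance` of the range. The function name `tail_shape`, the `tolerance` parameter, and the sample lists are all hypothetical.

```python
from statistics import median

def tail_shape(values, tolerance=0.1):
    """Classify data as left-skewed, right-skewed, or symmetric.

    Assumption: tail length = distance from the median to each extreme;
    tails within `tolerance` * range of each other count as equal.
    """
    lo, hi, med = min(values), max(values), median(values)
    left, right = med - lo, hi - med
    spread = hi - lo
    if spread == 0 or abs(left - right) <= tolerance * spread:
        return "symmetric"
    return "left-skewed" if left > right else "right-skewed"

# Made-up data sets, for illustration only.
print(tail_shape([1, 2, 3, 4, 5]))   # symmetric
print(tail_shape([1, 8, 9, 10]))     # left-skewed
print(tail_shape([1, 2, 3, 10]))     # right-skewed
```

A plot is still the better tool here, because a single extreme value can dominate this comparison.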