What is Exploratory Data Analysis? • An Approach/Philosophy for data analysis • Employs a variety of techniques (mostly graphical)…we will look at 3 of these: • scatter plot • stem and leaf • boxplot (box and whisker)
Basic Idea of EDA • Model = Smooth + Rough • Visual techniques can often tease more “smooth” out of the rough
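One way to make the smooth/rough split concrete is to run a simple smoother over a series and treat whatever is left over as the rough. The sketch below assumes a running median of 3 as the smoother, which is a common EDA choice but not one the slides name. The function name `running_median3` and the sample values are illustrative only.

```python
from statistics import median

def running_median3(values):
    """Smooth a sequence with a running median of 3 (endpoints copied as-is)."""
    values = list(values)
    if len(values) < 3:
        return values
    smooth = [values[0]]
    for i in range(1, len(values) - 1):
        # Median of each point and its two neighbours resists isolated spikes.
        smooth.append(median(values[i - 1:i + 2]))
    smooth.append(values[-1])
    return smooth

# Illustrative values only (not from the slides); 20 is an isolated spike.
data = [4, 5, 20, 6, 7, 8]
smooth = running_median3(data)
rough = [d - s for d, s in zip(data, smooth)]  # what the smooth leaves behind

print("smooth:", smooth)
print("rough: ", rough)
```

The spike ends up almost entirely in the rough, which is the point of looking at the two parts separately.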
Classic vs Exploratory • Classical sequence: Problem > Data > Model > Analysis > Conclusions • Exploratory: Problem > Data > Analysis > Model > Conclusions
Data Treatment • Classical uses: • mean and standard deviation (point estimates) • Pearson r as a measure of linear association (r squared is the proportion of variance explained) • Exploratory uses: • 5-Number Summary: Min, Q1, Median, Q3, Max • visual summaries that show all (or most) of the data: • scatterplot • stem and leaf • boxplot (box and whisker), which needs the 5-Number Summary
5-Number summary • Arrange data in ascending order • Find Q1 = the point below which 1/4 of the data lies • Find Median = the point below which 1/2 of the data lies • Find Q3 = the point below which 3/4 of the data lies • Find Max score and Min score
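The steps above can be sketched in a few lines of Python. Quartile conventions differ between textbooks and software. This sketch assumes the "median of each half" rule, leaving the middle value out of both halves when the count is odd. The function name `five_number_summary` and the sample scores are illustrative, not from the slides.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)            # arrange the data in order
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]              # values below the median position
    upper = data[half + n % 2:]      # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Illustrative scores only (not from the slides).
scores = [7, 1, 9, 3, 5, 11, 13]
print(five_number_summary(scores))   # (1, 3, 7, 11, 13)
```

Other quartile rules, such as interpolating between neighbouring values, can give slightly different Q1 and Q3 for the same data.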
Try it with Kings and Queens • First make a stem and leaf • Then find 5-Number summary • Then create a box plot
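A stem-and-leaf display can also be built programmatically to check a hand-drawn one. The sketch below assumes non-negative whole numbers, with the tens as the stem and the units as the leaf. The Kings and Queens data are not reproduced in this transcript, so the values shown are placeholders, and `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines with stem = tens digit(s) and leaf = units digit."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v), 10)
        stems[stem].append(leaf)
    lines = []
    # Walk every stem from smallest to largest so gaps stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Placeholder values only; substitute the class data set here.
sample = [12, 15, 21, 47, 40]
for line in stem_and_leaf(sample):
    print(line)
```

Keeping empty stems in the display preserves the shape of the distribution, which is what the plot is for.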
A correlation measure - Pearson r • Measures the strength and direction of the linear relationship between two ratio/interval variables • r is normalized to lie between -1 and +1, where zero means there is no linear relationship • The magnitude of r indicates the strength; the sign + or - indicates the direction of the relationship • + means that as one variable goes up in size, so does the other • - means that as one variable increases in size, the other decreases
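As a rough illustration, Pearson r can be computed directly from its definition: the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and all data values below are illustrative, not from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Illustrative values only.
x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # both rise together: r close to +1
print(pearson_r(x, [10, 8, 6, 4, 2]))   # one rises, one falls: r close to -1
print(pearson_r(x, [4, 1, 0, 1, 4]))    # U-shaped pattern: r equals 0
```

The last case is worth noting: y depends exactly on x, yet r is zero because the pattern is curved rather than linear, which is one reason a scatter plot is worth drawing before trusting r.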