1. What is Exploratory Data Analysis?
An approach/philosophy for data analysis
Employs a variety of techniques (mostly graphical). We will look at 3 of these:
scatter plot
stem and leaf
boxplot (box and whisker)
2. Basic Idea of EDA
Data = Smooth + Rough (the smooth is the fitted pattern or model; the rough is what is left over)
Visual techniques can often tease more “smooth” out of the rough
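A minimal sketch of the smooth/rough split, assuming a running median of three as the smoother. That choice is only one illustrative option and is not stated on the slide; the function name and the sample values are hypothetical. Each observation is split into a smooth part and a rough part (the residual), so that smooth + rough gives back the original value.

```python
from statistics import median

def smooth_and_rough(values):
    """Split a list of numbers into a smooth part and a rough part.

    Assumption: the smooth is a running median of three neighbours,
    with the two end points copied unchanged.
    """
    smooth = list(values)
    for i in range(1, len(values) - 1):
        smooth[i] = median(values[i - 1:i + 2])
    # The rough is whatever the smooth does not account for.
    rough = [v - s for v, s in zip(values, smooth)]
    return smooth, rough

# Hypothetical sequence with one unusual value (10).
smooth, rough = smooth_and_rough([2, 3, 10, 5, 6])
print(smooth)
print(rough)
```

The unusual value shows up as a large entry in the rough, while the smooth keeps the overall pattern.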
3. Classic vs Exploratory
Classical sequence: Problem > Data > Model > Analysis > Conclusions
Exploratory: Problem > Data > Analysis > Model > Conclusions
4. Data Treatment
Classical uses
mean and standard deviation = point estimates
Pearson r = measure of linear association (r squared is the proportion of variance explained)
Exploratory uses
5-Number Summary: Min, Q1, Median, Q3, Max
visual summaries that show all (or most) of the data:
scatterplot
stem and leaf
boxplot (box and whisker), which needs the 5-Number Summary
5. 5-Number Summary
Arrange data in ascending order
Find Q1: 1/4 of the data lies below this point
Find Median: 1/2 of the data lies below this point
Find Q3: 3/4 of the data lies below this point
Find Max score and Min score
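The steps above can be sketched in Python. Quartile conventions differ between textbooks and software; this sketch assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. The function name and the sample scores are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a non-empty list of numbers."""
    data = sorted(values)  # ascending order
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    half = n // 2
    lower = data[:half]
    # For an odd count, skip the middle value itself.
    upper = data[half + 1:] if n % 2 else data[half:]
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]

# Hypothetical scores, for illustration only.
print(five_number_summary([7, 1, 9, 3, 5, 11, 13]))
```

Other quartile rules (for example, interpolation between neighbouring values) can give slightly different Q1 and Q3 for the same data.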
6. Try it with Kings and Queens
First make a stem and leaf
Then find 5-Number summary
Then create a box plot
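The Kings and Queens data set is not included in this transcript, so the sketch below uses made-up values purely to show the mechanics of the first step. It assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem-and-leaf display lines for non-negative integers.

    Assumption: stem = tens, leaf = units. Empty stems are kept so
    gaps in the data stay visible.
    """
    if not values:
        return []
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    lines = []
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Made-up values, not the actual Kings and Queens data.
for line in stem_and_leaf([13, 21, 25, 35, 38, 9, 56]):
    print(line)
```

Because the leaves are sorted within each stem, the minimum, median, and maximum can be read off the display by counting in from either end.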
7. A correlation measure - Pearson r
Measures the strength and direction of the linear relationship between two ratio/interval variables
r is normalized to lie between -1 and +1, where zero means there is no linear relationship
The sign + or - indicates the direction of the relationship
+ means that as one variable goes up in size, so does the other
- means that as one variable increases in size, the other decreases
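As a rough illustration of how r is computed, the sketch below uses the standard definition: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name and the sample values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical data: y rises with x in the first pair, falls in the second.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3], [3, 2, 1]))
```

A value near zero only rules out a linear relationship; a scatter plot is still worth checking for curved patterns.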