What is Exploratory Data Analysis?

aleda

Presentation Transcript


  1. What is Exploratory Data Analysis?
     • An approach/philosophy for data analysis
     • Employs a variety of techniques (mostly graphical); we will look at 3 of these:
       • scatter plot
       • stem and leaf
       • boxplot (box and whisker)

  2. Basic Idea of EDA
     • Model = Smooth + Rough (smooth = the underlying pattern; rough = the residuals left over)
     • Visual techniques can often tease more “smooth” out of the rough
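
A minimal sketch of the smooth/rough split, assuming a running median of three as the smoother (one common choice; the slides do not name one). The function name and the data values are made up for illustration.

```python
from statistics import median

def running_median3(values):
    """Smooth a sequence with a running median of three.

    The two endpoints are copied unchanged (an assumption of this sketch).
    """
    smooth = list(values)
    for i in range(1, len(values) - 1):
        # Median of each point and its two neighbours in the ORIGINAL data.
        smooth[i] = median(values[i - 1:i + 2])
    return smooth

data = [3, 4, 9, 5, 6, 7, 2, 8, 9]  # hypothetical values, not from the slides
smooth = running_median3(data)
rough = [d - s for d, s in zip(data, smooth)]  # rough = data - smooth

print("smooth:", smooth)
print("rough: ", rough)
```

By construction, adding each smooth value to its rough value gives back the original data point, and the two unusual points (9 and 2) show up as the largest entries in the rough.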

  3. Classical vs Exploratory
     • Classical sequence: Problem > Data > Model > Analysis > Conclusions
     • Exploratory sequence: Problem > Data > Analysis > Model > Conclusions

  4. Data Treatment
     • Classical uses
       • Mean and standard deviation (point estimates)
       • Pearson r as a measure of association (r² gives the proportion of variance explained)
     • Exploratory uses
       • 5-Number Summary: Min, Q1, Median, Q3, Max
       • Visual summaries that show all (or most) of the data
         • scatterplot
         • stem and leaf
         • boxplot (box and whisker), which needs the 5-Number Summary

  5. 5-Number Summary
     • Arrange the data in ascending order
     • Find Q1: 1/4 of the data lies below this point
     • Find the Median: 1/2 of the data lies below this point
     • Find Q3: 3/4 of the data lies below this point
     • Find the Max score and the Min score
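
The steps above can be sketched as follows. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" rule, leaving the middle value out of both halves when the count is odd. The function name and the scores are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves, with the middle value excluded from both when n is odd.
    """
    data = sorted(values)          # ascending order
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [7, 1, 5, 3, 9, 11, 13]   # made-up scores, not from the slides
print(five_number_summary(scores))  # (1, 3, 7, 11, 13)
```

Other conventions (for example, interpolating between neighbouring values) can give slightly different Q1 and Q3 on small datasets.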

  6. Try it with Kings and Queens
     • First make a stem and leaf
     • Then find the 5-Number Summary
     • Then create a box plot
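
The Kings and Queens dataset is not reproduced in this transcript, so the sketch below builds a stem and leaf from placeholder values only. It assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem-and-leaf rows as strings (stem = tens, leaf = units).

    Assumes non-negative integers. Empty stems are kept so that gaps
    in the data remain visible.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    lines = []
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

values = [13, 35, 22, 10, 56, 35, 9, 50]  # placeholder data, not the slide's dataset
for line in stem_and_leaf(values):
    print(line)
```

Because the leaves within each row are sorted, the minimum, median, and maximum needed for the 5-Number Summary can be read by counting along the display.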

  7. A correlation measure: Pearson r
     • Measures the strength and direction of the linear relationship between two ratio/interval variables
     • Strength is normalized to lie between -1 and +1, where zero means there is no linear relationship
     • The sign (+ or -) indicates the direction of the relationship
       • + means that as one variable increases in size, so does the other
       • - means that as one variable increases in size, the other decreases
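
As a rough illustration, Pearson r can be computed directly from its definition: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name and the sample values are assumptions for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either variable is constant,
    since r is undefined in that case.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical values chosen to show the three cases on the slide.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))          # perfectly increasing: close to +1
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))          # perfectly decreasing: close to -1
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))  # y = x^2: r is 0
```

The last case is worth noting: y depends entirely on x, yet r is zero, because the relationship is not linear. A scatter plot reveals this pattern where the single number does not.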
