
Exploratory Data Analysis; coined by Tukey 1977


berg

Presentation Transcript


  1. Exploratory Data Analysis; coined by Tukey 1977
     - Illuminate underlying pattern in noisy data
     - Predecessor to formal analysis
     - May lead to different analysis than originally planned
     Data visualization (the first thing you do with your data!)

  2. Important functions of exploratory data visualization
     • Spot outliers
     • Discriminate clusters
     • Check distributional and other assumptions
     • Examine relationships
     • Compare mean differences
     • Observe a time-based process
     http://seamonkey.ed.asu.edu/~alex/teaching/WBI/EDA.html

  3. Univariate data (one variable); frequency distributions
     Distributions of height, biomass, etc. are often used to describe populations.
     • How are the data distributed (including summary/descriptive statistics)?
     • Are the data normal? (Required to meet the assumptions of many statistical techniques; more later.)
     • If not normal, can they be transformed?
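
The three questions on this slide can be sketched in a few lines of Python. This is only an illustration: the `biomass` values are hypothetical (not from the presentation), the `skewness` helper is our own, and a log10 transform is just one common choice for right-skewed data. A skewness near 0 suggests a roughly symmetric sample; it is a quick screen, not a formal normality test.

```python
import math
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical right-skewed biomass values (illustrative only).
biomass = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

# Summary/descriptive statistics for the raw data.
summary = {
    "n": len(biomass),
    "mean": statistics.fmean(biomass),
    "median": statistics.median(biomass),
    "sd": statistics.stdev(biomass),  # sample standard deviation
}

# One candidate transformation: log10 (requires strictly positive values).
log_biomass = [math.log10(v) for v in biomass]

print(summary)
print("skewness, raw:  ", round(skewness(biomass), 2))
print("skewness, log10:", round(skewness(log_biomass), 2))
```

A mean well above the median, as here, is a quick hint of right skew; the transformed values should show a smaller skewness.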

  4. Histograms
     • Raw data hidden
     • Division into categories arbitrary
     • Excel, many programs
     Identify skew, non-normality
     Identify outliers
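
To make "division into categories arbitrary" concrete, here is a minimal sketch that bins the same data with three different bin widths. It borrows the quiz scores from the stem-leaf slide; the `histogram` helper and the chosen widths are our own assumptions, and bins with no observations are simply omitted.

```python
from collections import Counter

def histogram(values, bin_width, start):
    """Count values per bin [start + k*width, start + (k+1)*width)."""
    counts = Counter((v - start) // bin_width for v in values)
    return {start + k * bin_width: counts[k] for k in sorted(counts)}

# Quiz scores from the stem-leaf slide.
scores = [20, 20, 21, 25, 29, 32, 36, 37, 38, 41, 44, 46, 50, 53, 58]

for width in (5, 10, 20):
    print(f"bin width {width}")
    for lower, count in histogram(scores, width, start=20).items():
        # One '#' per observation; the individual scores are no longer visible.
        print(f"  {lower:>2}-{lower + width - 1:<2} {'#' * count}")
```

The shape of the picture changes with the bin width even though the data do not, which is why it is worth trying more than one.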

  5. Stem-leaf plots
     Quiz scores: 20 20 21 25 29 32 36 37 38 41 44 46 50 53 58
     - show original data
     - division to categories arbitrary
     - easier to order data first
     - a histogram on its side (sort of)

     Stem | Leaves
       2  | 0 0 1 5 9
       3  | 2 6 7 8
       4  | 1 4 6
       5  | 0 3 8
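
The stem-leaf plot above can be rebuilt from the quiz scores with a short sketch. The `stem_and_leaf` function and its `stem_width` parameter are our own names; the approach assumes whole-number data with tens as stems and units as leaves.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_width=10):
    """Group sorted values by stem (value // stem_width); leaves are the remainders."""
    stems = defaultdict(list)
    for v in sorted(values):  # ordering the data first keeps the leaves sorted
        stem, leaf = divmod(v, stem_width)
        stems[stem].append(leaf)
    return dict(stems)

# Quiz scores from the slide.
scores = [20, 20, 21, 25, 29, 32, 36, 37, 38, 41, 44, 46, 50, 53, 58]

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

Unlike a histogram, every original score can be read back from the output.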

  6. Box (box-whisker) plots
     • Calculate the median, draw a horizontal line
     • Draw a box with ends at the quartiles Q1 (25%) and Q3 (75%)
     • Extend the "whiskers" to the farthest points that are not outliers
     • Outliers are points more than 3/2 times the interquartile range (Q3-Q1) below Q1 or above Q3
     • Draw a dot for every outlier
     Can be done for a single distribution or to compare several
     http://mathworld.wolfram.com/Box-and-WhiskerPlot.html
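
The box-plot recipe above can be sketched numerically as follows. Two assumptions to note: quartile conventions differ between programs, and this sketch uses the "inclusive" method of Python's `statistics.quantiles`; the extra score of 95 is hypothetical, added only so that there is an outlier to flag. The `box_stats` helper is our own.

```python
import statistics

def box_stats(values):
    """Median, quartiles, whisker ends, and outliers using the 1.5 x IQR rule."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other software may give slightly different cuts.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": inside[0],    # farthest points that are not outliers
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Quiz scores from the stem-leaf slide.
scores = [20, 20, 21, 25, 29, 32, 36, 37, 38, 41, 44, 46, 50, 53, 58]

print(box_stats(scores))           # no point falls outside the fences
print(box_stats(scores + [95]))    # hypothetical extra score is flagged
```

Calling `box_stats` once per group gives the numbers needed to compare several distributions side by side.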

  7. Normal probability plots will be covered later

  8. Bivariate (2 variable) data
     • Relationship between the 2 variables
     • Are there outliers?
     • Examined by scatterplots
     [Scatterplot panels: negative, none]

  9. [Scatterplot panel: non-linear]
     Graphing helps you see relationships. Formal analysis is guided by a priori knowledge that one variable causes change in the other (more later).
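
One way to see why graphing matters for non-linear relationships: a summary number such as the Pearson correlation can be zero even when y depends entirely on x. The sketch below uses made-up data and our own `pearson_r` helper; it is an illustration, not part of the original slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data (not from the slides).
x = list(range(-5, 6))
y_negative = [-2 * v + 1 for v in x]   # straight line with negative slope
y_curved = [v * v for v in x]          # perfect but non-linear relationship

print("negative linear:", pearson_r(x, y_negative))
print("non-linear:     ", pearson_r(x, y_curved))
```

A scatterplot of `y_curved` against `x` shows the U shape immediately, while the correlation alone would suggest no relationship.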

  10. Classified Data: often result from an ecological experiment
     • Bar chart
     • Shows means and variance
     • "Shows" treatment differences & magnitude
     [Figure: bar chart of Epilithon NPP (mg O2/m2/hr) for high light and low light treatments; bars show mean ± one S.E.]
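
The bar heights and error bars in such a chart come from a mean and a standard error per treatment. A minimal sketch, assuming S.E. = sample standard deviation / sqrt(n); the replicate values below are hypothetical and are not the data behind the original figure, and `mean_se` is our own helper.

```python
import math
import statistics

def mean_se(values):
    """Return (mean, standard error), with S.E. = sample s.d. / sqrt(n)."""
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical replicate NPP measurements (mg O2/m2/hr) per treatment.
npp = {
    "high light": [9.0, 12.0, 10.0, 13.0],
    "low light": [-6.0, -3.0, -5.0, -2.0],
}

for treatment, values in npp.items():
    mean, se = mean_se(values)
    print(f"{treatment}: mean = {mean:.2f}, S.E. = {se:.2f} (n = {len(values)})")
```

Whatever the error bars represent (S.E., S.D., or a confidence interval) should be stated on the graph, as the original figure does.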

  11. List things that are wrong with this graph.
     [Figure: bar chart of Epilithon NPP; y-axis from -10 to 15]

  12. Graphing Exercise
     • Obtain a dataset, preferably your own or a colleague's, but it can be anything
     • Choose a graphing style that best illustrates the "message" of your data
     • Use Excel or another program to make a graph
     • Print on an overhead to show in class
