Chapter Sixteen EXPLORING, DISPLAYING, AND EXAMINING DATA
Types of Data Analysis • Exploratory data analysis: • finding clues and evidence • emphasis on visual representations and graphical techniques • guides the choice of analysis or a revision of the planned analysis
Types of Data Analysis • Confirmatory data analysis: • evaluating the strength of evidence • closer to classical statistical inference in its use of significance and confidence • may use information from a closely related data set, or validate findings by gathering and analyzing new data
Techniques to Display and Examine Distributions • Frequency Table • Visual Displays • Histograms • Stem-and-leaf display • Box-plot • Crosstabulation of Variables
Techniques to Display and Examine Distributions • Histograms • Display all intervals in a distribution, even those without observed values • Allow the shape of the distribution to be examined for skewness, kurtosis, and the modal pattern • Stem-and-leaf displays • Reveal the distribution of values within each interval and preserve their rank order
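To make the stem-and-leaf idea concrete, here is a minimal sketch in Python. It assumes non-negative two-digit integers, and the function name `stem_and_leaf` and the sample data are hypothetical, chosen only for illustration. Empty stems are kept so that, as with a histogram, intervals without observed values stay visible.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return rows of (stem, [leaves]) for non-negative integers.

    The stem is the tens part and the leaf is the units digit.
    """
    rows = defaultdict(list)
    for v in sorted(values):          # sorting preserves rank order within a stem
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    lo, hi = min(rows), max(rows)
    # Keep empty stems so gaps in the distribution remain visible
    return [(s, rows.get(s, [])) for s in range(lo, hi + 1)]

data = [12, 15, 21, 23, 23, 38, 41, 45, 47, 62]   # illustrative values only
display = stem_and_leaf(data)
for stem, leaves in display:
    print(f"{stem} | {''.join(map(str, leaves))}")
```

Each printed row shows the individual values inside an interval, which a histogram bar would hide.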
Techniques to Display and Examine Distributions (cont.) • Box-plot (box-and-whisker plot) • Rectangular box encompasses the middle 50% of the data values • Edges of the box are called “hinges” • Center line through the width of the box marks the median • Whiskers extend from the right and left hinges to the largest and smallest values
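The five values a box-plot draws (smallest value, lower hinge, median, upper hinge, largest value) can be computed directly. The sketch below assumes Tukey's convention for hinges, where the median is included in both halves when the sample size is odd; other quartile conventions give slightly different hinges. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max)."""
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2               # odd n: the median belongs to both halves
    lower, upper = xs[:half], xs[n - half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

data = [2, 4, 4, 5, 7, 8, 9, 11, 12]  # illustrative values only
summary = five_number_summary(data)
print(summary)
```

The box spans the two hinges, the center line sits at the median, and in this simple form the whiskers run out to the minimum and maximum.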
Techniques to Display and Examine Distributions (cont.) • Transformation of variables • To improve interpretation and compatibility with other data sets • To enhance symmetry and stabilize spread • To improve linear relationships between and among variables
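As one illustration of a transformation that enhances symmetry, the sketch below applies a base-2 logarithm to right-skewed data and compares a moment-based skewness coefficient before and after. The choice of a log transform, the helper `skewness`, and the data are assumptions made for the example; the appropriate transformation depends on the data at hand.

```python
import math

def skewness(xs):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

raw = [1, 2, 4, 8, 16, 32, 64]            # long right tail (illustrative)
logged = [math.log2(x) for x in raw]      # evenly spaced after the transform

print("skewness before:", skewness(raw))
print("skewness after: ", skewness(logged))
```

The raw values are strongly right-skewed, while the transformed values are symmetric around their mean.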
Control Charts • Display sequential measurements of a process together with a center line and upper and lower control limits • Types for variables data (ratio or interval) • X-bar charts (sample mean) • R-charts (sample range) • s-charts (sample standard deviation) • Pareto Diagrams (bar chart whose percentages sum to 100 percent)
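A center line and control limits for X-bar and R charts can be sketched as follows. The example assumes subgroups of size 5 and uses the commonly tabulated control-chart constants for that size (A2 = 0.577, D3 = 0, D4 = 2.114); the function name and the measurements are hypothetical.

```python
# Tabulated constants for subgroups of size n = 5 (assumed here)
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """Return (lower limit, center line, upper limit) for X-bar and R charts."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)      # center line of the X-bar chart
    r_bar = sum(ranges) / len(ranges)         # center line of the R chart
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "r": (D3 * r_bar, r_bar, D4 * r_bar),
    }

subgroups = [                                 # illustrative measurements only
    [10, 12, 11, 9, 13],
    [11, 10, 12, 12, 10],
    [9, 11, 10, 13, 12],
    [12, 11, 10, 11, 11],
]
limits = xbar_r_limits(subgroups)
print(limits)
```

A subgroup mean or range falling outside its limits would be flagged as a signal that the process may be out of control.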
Geographic Information Systems • Systems of hardware, software, and procedures that capture, store, manipulate, integrate, and display spatially referenced data (e.g., MapQuest) • Integrating information from various sources • Capturing data • Projection and restructuring • Modeling (e.g., of the best route)
Crosstabulation • A technique for comparing two classification variables • Cells • Marginals • Contingency tables • A technique to test whether there is a difference between observed frequencies and expected frequencies
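The pieces named above (cells, marginals, and the comparison of observed with expected frequencies) can be sketched in a few lines. The example assumes Pearson's chi-square statistic as the measure of that difference; the function names, variable labels, and counts are hypothetical.

```python
from collections import Counter

def crosstab(pairs):
    """Build a contingency table from (row_label, col_label) pairs."""
    cells = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = [[cells[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

def chi_square(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]          # row marginals
    col_totals = [sum(col) for col in zip(*table)]    # column marginals
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Illustrative responses: (gender, purchased)
pairs = ([("f", "yes")] * 30 + [("f", "no")] * 20 +
         [("m", "yes")] * 20 + [("m", "no")] * 30)
rows, cols, table = crosstab(pairs)
stat = chi_square(table)
print(rows, cols, table, stat)
```

Under independence every expected cell count here is 25, so the statistic reflects how far the observed cells depart from that.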
Other Table-based Analysis • Automatic Interaction Detection (AID) • Sequential partitioning procedure that uses a dependent variable and a set of predictors • Searches among up to 300 variables for the best single division of data into subsets according to each predictor variable • Chooses one division approach • Splits the sample using chi-square tests to create multi-way splits
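A single partitioning step of this kind can be sketched as below. This is a deliberately simplified illustration, not the procedure described above: it makes only binary splits (one category versus the rest) and scores them by the between-group sum of squares of a numeric dependent variable instead of a chi-square test. The function names, field names, and records are hypothetical.

```python
def between_ss(left, right):
    """Between-group sum of squares for a two-way split of the target."""
    n_l, n_r = len(left), len(right)
    mean_l, mean_r = sum(left) / n_l, sum(right) / n_r
    grand = (sum(left) + sum(right)) / (n_l + n_r)
    return n_l * (mean_l - grand) ** 2 + n_r * (mean_r - grand) ** 2

def best_split(records, target, predictors):
    """Search every predictor for the single best 'level vs. rest' split."""
    best = None
    for p in predictors:
        for level in sorted({r[p] for r in records}):
            left = [r[target] for r in records if r[p] == level]
            right = [r[target] for r in records if r[p] != level]
            if not left or not right:
                continue                      # not a real partition
            score = between_ss(left, right)
            if best is None or score > best[0]:
                best = (score, p, level)
    return best

records = [                                   # illustrative data only
    {"region": "north", "owner": "yes", "spend": 90},
    {"region": "north", "owner": "no", "spend": 40},
    {"region": "south", "owner": "yes", "spend": 95},
    {"region": "south", "owner": "no", "spend": 45},
    {"region": "north", "owner": "yes", "spend": 85},
    {"region": "south", "owner": "no", "spend": 35},
]
best = best_split(records, "spend", ["region", "owner"])
print(best)
```

Applying `best_split` again to each resulting subset would give the sequential partitioning; stopping rules are omitted from this sketch.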