Explore different types of data analysis methods such as exploratory and confirmatory analysis. Learn how to display and examine distributions using techniques like histograms, box plots, and transformations for improved interpretation. Discover the use of control charts and geographic information systems in data analysis.
Types of Data Analysis • Exploratory data analysis: • finding clues and evidence • emphasis on visual representations and graphical techniques • guides the choice of analysis or a revision of the planned analysis
Types of Data Analysis • Confirmatory data analysis: • evaluating the strength of evidence • closer to classical statistical inference in its use of significance and confidence • may use information from a closely related data set, or validate findings by gathering and analyzing new data
Techniques to Display and Examine Distributions • Frequency Table • Visual Displays • Histograms • Stem-and-leaf display • Box-plot • Crosstabulation of Variables
Techniques to Display and Examine Distributions • Histograms • Display all intervals in a distribution, even those without observed values • Used to examine the shape of the distribution for skewness, kurtosis, and the modal pattern • Stem-and-leaf displays • Reveal the distribution of values within each interval and preserve their rank order
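As a rough illustration of the two displays above, the sketch below counts values per interval (keeping empty intervals) and builds a stem-and-leaf display. The data values, the interval width of 10, and the function names `histogram` and `stem_and_leaf` are assumptions made for this example, and the stem-and-leaf helper assumes non-negative integers.

```python
def histogram(values, low, width, n_bins):
    """Count values per interval, keeping intervals with no observations."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - low) // width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def stem_and_leaf(values):
    """Group integer values by tens digit (stem); leaves stay in rank order."""
    stems = {}
    for v in sorted(values):
        stems.setdefault(v // 10, []).append(v % 10)
    return stems


# Hypothetical data, chosen so that one interval (30-39) is empty.
data = [12, 15, 21, 22, 22, 27, 41, 43, 48]

counts = histogram(data, low=10, width=10, n_bins=4)
for i, count in enumerate(counts):
    start = 10 + 10 * i
    print(f"{start}-{start + 9}: {'*' * count}")

stems = stem_and_leaf(data)
# Print every stem between the smallest and largest, including empty ones.
for stem in range(min(stems), max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```

The histogram shows only how many values fall in each interval, while the stem-and-leaf display also shows the individual values inside it.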
Techniques to Display and Examine Distributions (cont.) • Box plot (box-and-whisker plot) • Rectangular box encompasses the middle 50% of the data values • Edges of the box are called “hinges” • Center line across the width of the box marks the median • Whiskers extend from the right and left hinges to the largest and smallest values
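The numbers behind a box plot can be sketched as a five-number summary. This example assumes the hinges can be approximated by the quartiles returned by Python's `statistics.quantiles` with `method="inclusive"`; other hinge definitions may give slightly different values. The data and the function name `five_number_summary` are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return the values a box plot draws: whisker ends, hinges, median."""
    ordered = sorted(values)
    # Quartiles stand in for the hinges here (an assumption of this sketch).
    lower_hinge, median, upper_hinge = statistics.quantiles(
        ordered, n=4, method="inclusive"
    )
    return {
        "min": ordered[0],           # end of the lower whisker
        "lower_hinge": lower_hinge,  # one edge of the box
        "median": median,            # center line of the box
        "upper_hinge": upper_hinge,  # other edge of the box
        "max": ordered[-1],          # end of the upper whisker
    }


# Hypothetical data with one large value stretching the upper whisker.
data = [2, 4, 4, 5, 7, 8, 9, 11, 30]
summary = five_number_summary(data)
print(summary)
```

Here the whiskers run to the smallest and largest values, as the slide describes. Many plotting packages instead stop the whiskers at 1.5 times the box length and plot more distant points individually.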
Techniques to Display and Examine Distributions (cont.) • Transformation of variables • To improve interpretation and compatibility with other data sets • To enhance symmetry and stabilize spread • To improve linear relationships between and among variables
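To illustrate how a transformation can enhance symmetry, the sketch below compares the skewness of a right-skewed variable before and after a logarithmic transformation. The data are hypothetical, and the log transform is only one possible choice; `skewness` is a helper written for this example using the moment-based definition.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed data: each value doubles the previous one.
raw = [1, 2, 4, 8, 16, 32, 64]
logged = [math.log2(v) for v in raw]

print(f"skewness before: {skewness(raw):.3f}")
print(f"skewness after:  {skewness(logged):.3f}")
```

The raw values have a long right tail (positive skewness), while the logged values are evenly spaced, so their skewness is zero.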
Control Charts • Display sequential measurements of a process together with a center line and control limits (upper and lower) • Types for variables data (ratio or interval) • X-bar charts • R-charts • s-charts (sample standard deviation) • Pareto diagrams (bar charts whose percentages sum to 100 percent)
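A minimal sketch of how the center line and control limits might be computed for X-bar and R charts. It assumes subgroups of five measurements and uses the standard tabulated constants for that subgroup size (A2 = 0.577, D3 = 0, D4 = 2.114). The measurements and the function name `xbar_r_limits` are hypothetical.

```python
import statistics

# Standard control chart constants for subgroups of size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (lower limit, center line, upper limit) for X-bar and R charts."""
    means = [statistics.mean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = statistics.mean(means)   # center line of the X-bar chart
    mean_range = statistics.mean(ranges)  # center line of the R chart
    return {
        "xbar": (grand_mean - A2 * mean_range,
                 grand_mean,
                 grand_mean + A2 * mean_range),
        "r": (D3 * mean_range, mean_range, D4 * mean_range),
    }


# Hypothetical sequential measurements, five per subgroup.
subgroups = [
    [10, 12, 11, 9, 13],
    [11, 10, 12, 12, 10],
    [9, 11, 10, 13, 12],
]
limits = xbar_r_limits(subgroups)
for chart, (lcl, center, ucl) in limits.items():
    print(f"{chart}: LCL={lcl:.3f} center={center:.3f} UCL={ucl:.3f}")
```

A subgroup whose mean or range falls outside these limits would be a signal to investigate the process. In practice the limits are estimated from many more subgroups than the three used here.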
Geographic Information Systems • Systems of hardware, software, and procedures that capture, store, manipulate, integrate, and display spatially referenced data (e.g., MapQuest) • Integrating information from various sources • Capturing data • Projection and restructuring • Modeling (of the best route)
Crosstabulation • A technique for comparing two classification variables • Cells • Marginals • Contingency tables • Chi-square test: a technique to test whether observed frequencies differ from expected frequencies
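The pieces of a crosstabulation can be sketched in a few lines: cell counts, marginal totals, and a chi-square statistic comparing observed cell frequencies with those expected if the two variables were unrelated. The variables, category labels, and counts below are hypothetical, as are the function names.

```python
from collections import Counter


def crosstab(rows, cols):
    """Return cell counts and row/column marginals for two variables."""
    cells = Counter(zip(rows, cols))
    row_marginals = Counter(rows)
    col_marginals = Counter(cols)
    return cells, row_marginals, col_marginals


def chi_square(cells, row_marginals, col_marginals):
    """Sum of (observed - expected)^2 / expected over all cells."""
    n = sum(row_marginals.values())
    stat = 0.0
    for r, r_total in row_marginals.items():
        for c, c_total in col_marginals.items():
            expected = r_total * c_total / n  # expected count under independence
            observed = cells.get((r, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical data: group membership (A/B) against a yes/no answer.
group = ["A"] * 40 + ["B"] * 40
answer = ["yes"] * 30 + ["no"] * 10 + ["yes"] * 10 + ["no"] * 30

cells, row_marginals, col_marginals = crosstab(group, answer)
stat = chi_square(cells, row_marginals, col_marginals)
print(dict(cells))
print(f"chi-square = {stat:.1f}")
```

For a 2 x 2 table there is one degree of freedom, and the critical value at the 0.05 level is 3.841, so a statistic well above that suggests the observed frequencies differ from the expected ones.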
Other Table-based Analysis • Automatic Interaction Detection (AID) • Sequential partitioning procedure that uses a dependent variable and a set of predictors • Searches among up to 300 variables for the best single division of the data into subsets according to each predictor variable • Chooses one division approach • Splits the sample using chi-square tests to create multi-way splits
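One partitioning step of this kind might be sketched as follows: score each predictor by the chi-square statistic of its crosstabulation with the dependent variable, choose the highest-scoring predictor, and split the sample by its categories. This is a simplified illustration, not the full procedure: it performs a single step, does no category merging or significance testing, and all records, variable names, and function names are hypothetical.

```python
from collections import Counter, defaultdict


def chi_square(xs, ys):
    """Chi-square statistic for the crosstabulation of two variables."""
    n = len(xs)
    cells = Counter(zip(xs, ys))
    x_totals, y_totals = Counter(xs), Counter(ys)
    stat = 0.0
    for x, x_total in x_totals.items():
        for y, y_total in y_totals.items():
            expected = x_total * y_total / n
            stat += (cells.get((x, y), 0) - expected) ** 2 / expected
    return stat


def best_split(records, dependent, predictors):
    """Pick the predictor most associated with the dependent variable
    and split the records by that predictor's categories."""
    ys = [r[dependent] for r in records]
    scores = {p: chi_square([r[p] for r in records], ys) for p in predictors}
    best = max(scores, key=scores.get)
    subsets = defaultdict(list)
    for r in records:
        subsets[r[best]].append(r)  # one subset per category (multi-way)
    return best, dict(subsets), scores


# Hypothetical sample: purchase behavior depends on region, not gender.
records = []
for region, n_buyers in [("north", 9), ("south", 5), ("west", 1)]:
    for i in range(10):
        records.append({
            "region": region,
            "gender": "f" if i % 2 == 0 else "m",
            "buys": "yes" if i < n_buyers else "no",
        })

best, subsets, scores = best_split(records, "buys", ["region", "gender"])
print(best, {k: len(v) for k, v in subsets.items()})
```

A sequential procedure would then repeat the same search within each resulting subset until no worthwhile split remains.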