Exploratory Data Analysis for single variables
Module B2 Session 12
Learning Objectives
Students should be able to:
• Explain the importance of exploring data at the start of the analysis
• Use two new tools for exploration: dot plots and stem and leaf plots
  • Construct simple and jittered dot plots
  • Draw a stem and leaf plot
• Use training resources more effectively
  • CAST as a training resource
  • Excel as a training and analysis tool, with charts and graphs
• Explain the difference between exploratory and presentation graphs
Stages in processing the data
• Entering and checking the data
• Organising the data for analysis
• Exploring the data
• Analysis
• Reporting
The middle three stages are iterative and can be repeated:
• Some exploration can take place before the organising
• Exploration continues through the analysis
In this session
• Two new tools are introduced: dot plots and stem and leaf plots
• They are used with numeric data
• So far we have concentrated on categorical data; now we start to redress the balance
In the next session
• We apply these tools
Jittered dot plots in CAST and Excel
[Figure: jittered dot plots of the rainfall data; panels: CAST, Excel]
Rainfall data: 608, 746, 767, ….. 1395, 1425, 1482
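A jittered dot plot can be thought of as one point per observation: the data value gives the horizontal position and a random offset gives the vertical position, so that close values do not sit on top of each other. The following is a minimal sketch in Python, not the method used by CAST or Excel. The function name `jittered_points` and its `height` and `seed` parameters are hypothetical, and only the six rainfall values shown on the slide are used.

```python
import random

# Only the six rainfall values shown on the slide; the elided middle
# of the series is not reconstructed here.
rainfall = [608, 746, 767, 1395, 1425, 1482]

def jittered_points(values, height=1.0, seed=0):
    """Pair each value with a random vertical offset between 0 and height."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [(v, rng.uniform(0.0, height)) for v in values]

points = jittered_points(rainfall)
for x, y in points:
    # x is the data value, y is only there to spread the dots out
    print(f"{x:5d}  {y:.2f}")
```

The vertical coordinate carries no information about the data; only the horizontal position should be read.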
Stem and leaf plots – survey yields
• Stem: tens digit
• Leaves: units digit
• Digits after the decimal point: truncated
[Figure: stem and leaf plots of the yields; panels: Single stem, Split stem]
Yields: 19.1, 24.3, 24.7,….. 59.3, 61.4, 62.1
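The rules on this slide (stem = tens digit, leaf = units digit, decimal part truncated) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CAST or Excel procedure: the function name `stem_and_leaf` is hypothetical, values are assumed to be non-negative and below 100, and the split-stem version assumes the common convention of putting leaves 0 to 4 on the first row and 5 to 9 on the second. Only the six yields shown on the slide are used.

```python
from collections import defaultdict

# Only the six yields shown on the slide; the elided values are not filled in.
yields = [19.1, 24.3, 24.7, 59.3, 61.4, 62.1]

def stem_and_leaf(values, split=False):
    """Return the rows of a stem and leaf plot as strings.

    Stem = tens digit, leaf = units digit, decimal part truncated.
    With split=True each stem has two rows: leaves 0-4, then leaves 5-9.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        whole = int(v)                      # truncate, do not round
        stem, leaf = divmod(whole, 10)
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(str(leaf))
    stems = range(min(s for s, _ in rows), max(s for s, _ in rows) + 1)
    halves = (0, 1) if split else (0,)
    # Empty stems are kept so that gaps in the data stay visible
    return [f"{s} | {''.join(rows.get((s, h), []))}".rstrip()
            for s in stems for h in halves]

single = stem_and_leaf(yields)
split_rows = stem_and_leaf(yields, split=True)
print("\n".join(single))
```

Note that 24.3 and 24.7 both become the leaf 4 on stem 2, because the decimal part is dropped rather than rounded.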
Exploratory and presentation graphs
Dot plots and stem and leaf plots:
• show all the data
• help with exploration, to look for oddities and to prepare for the analysis
• are for data exploration, so the graphs have to be effective, not “pretty”
Bar charts and pie charts:
• show summaries
• present results to others, in reports and presentations
• are for presentation
Practical
• Activity 2 – uses CAST for dot plots
• Activity 3 – uses Excel to produce dot plots
• Activity 4 – uses both for stem and leaf plots
The practical also prepares for the future:
• Efficient use of CAST
• Effective use of Excel
Learning to use resources fully
• CAST is a new type of resource
• You may have to take some time to learn how to use it fully, so that you gain the maximum from it
• When you do, it can help during this training and also afterwards, because it supports self-study
• It is part of an effort to change learning towards a voyage of discovery
Using CAST fully?
• You also saw this earlier, in the tutorial introduction
• Follow the instructions to take advantage of the dynamic elements
• Also think about why the action is useful
Did you puzzle, or just click?
• Did you follow these instructions, to scan down the list and look for the pattern?
• Or did you take the easy way out and just click?
Interact and read the text as well
• Instructions
• Instructions and statistics
• Important points are in white
Using Excel effectively
• Dot plots are not on Excel’s menus, nor in Excel’s help
• But you decided to do dot plots in Excel!
• You therefore need to understand them better, so that you can construct them yourself
• This understanding is good anyway, and helps with effective data analysis
• It is an example of you controlling the software and not being limited by it
• That applies to all software
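Constructing a dot plot yourself mostly means working out a pair of coordinates for each observation. As a minimal sketch, written in Python rather than Excel because the slides do not show the worksheet steps, a simple (unjittered) dot plot can stack repeated values by counting how often each value has already appeared. The function name `stacked_points` is hypothetical, and the data are the six yields shown on the stem and leaf slide, truncated to whole numbers.

```python
from collections import Counter

# The six yields shown on the stem and leaf slide, truncated to whole numbers.
# The elided middle of the series is not reconstructed.
yields = [19, 24, 24, 59, 61, 62]

def stacked_points(values):
    """Return (value, height) pairs; repeats of a value stack upwards."""
    seen = Counter()
    points = []
    for v in values:
        seen[v] += 1                 # 1 for the first occurrence, 2 for the second, ...
        points.append((v, seen[v]))
    return points

stacked = stacked_points(yields)
print(stacked)
```

The resulting pairs could then be plotted as a scatter of height against value in whatever software is to hand.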
Jittered dot plots in CAST and Excel
[Figure: jittered dot plots; panels: CAST, Excel]
Why are the vertical heights different in the two cases? Do you ALL know?
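One plausible answer, assuming both tools generate the jitter randomly, is that each plot draws its own set of random heights while the horizontal positions stay fixed by the data. The short Python sketch below illustrates that idea only; the two seeds are hypothetical stand-ins for two independent draws and say nothing about how CAST or Excel generate their random numbers.

```python
import random

# The six rainfall values shown on the earlier slide
values = [608, 746, 767, 1395, 1425, 1482]

def jitter_heights(n, seed):
    """Draw n random heights between 0 and 1 from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

heights_a = jitter_heights(len(values), seed=1)  # stands in for one plot
heights_b = jitter_heights(len(values), seed=2)  # stands in for the other

# Same horizontal positions, different vertical positions
plot_a = list(zip(values, heights_a))
plot_b = list(zip(values, heights_b))
print([x for x, _ in plot_a] == [x for x, _ in plot_b])
print(heights_a == heights_b)
```

Under this assumption the difference in heights is harmless: the vertical axis of a jittered dot plot has no meaning.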
Excel for analysis and training
• Excel is not designed as a training resource, unlike CAST, which is designed only for that
• Excel is there to support data organisation and analysis
• But here we have also used it for training, with dot plots and stem and leaf plots
• Neither of these is in the Excel menus
Summary
• Dot plots and stem and leaf plots are simple tools for looking at the actual data in a concise way
• It is important to look at the data itself before starting on the actual analysis, so that any patterns or oddities can be identified and the necessary steps taken to deal with them
• With large sets of data, computers are needed to do the exploration
• However, the importance of this work should be stressed right at the data entry stage, and it could even become part of the data checking procedures
The next session will extend and apply the tools from this session to real data