Data Analysis. Using Descriptive Statistics ED 690 Minjuan Wang . A Few Online Tools. Seeing Statistics Interactive Statistics Tools Against all odds Inside statistics http://www.learner.org/resources/series65.html#jump1 Video on Demand. Today We’ll Watch.
Data Analysis Using Descriptive Statistics ED 690 Minjuan Wang
A Few Online Tools • Seeing Statistics • Interactive Statistics Tools • Against all odds • Inside statistics • http://www.learner.org/resources/series65.html#jump1 • Video on Demand
Today We’ll Watch • http://learner.org/resources/series65.html • Picturing Distributions • Describing Distributions • Normal Distributions
How to know what to use? • Statistical Procedures Applied are determined by: • Research/Evaluation Questions • Research/Evaluation Design • Types of Measurements • Nominal, Ordinal, Interval or Ratio
Analyzing Quantitative Data • Measures of Central Tendency • Measures of Variability • Measure of relative standing • Measures of Relationship • Refer to your Statistical Family Tree
Measures of Central Tendency • Convenient way to describe a set of numbers with a single number • Three common types: • Mean • Median • Mode
Measure of Variability • Variability: • reflects how the scores differ from one another. • a measure of how far scores fall from the mean. • Central Tendency without any measures of variability? • can be misleading
Measure of Variability • Range • Variance • Standard Deviation • the most common and useful measure of variability • the average distance of each score from the mean • Candy bar example • Frequency distribution • Distribution Curve • Normal distribution • Skewed distribution • Negatively skewed; positively skewed
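A minimal sketch of the three measures listed above, using Python's standard `statistics` module. The scores are hypothetical, invented for illustration; they are not the candy bar data from the slide. The population formulas (`pvariance`, `pstdev`) are assumed here. Strictly, the standard deviation is the square root of the average squared distance from the mean, so the mean absolute deviation is shown alongside it for comparison.

```python
import statistics

# Hypothetical scores, invented for illustration (not from the slides).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)

# Range: distance between the highest and lowest score.
score_range = max(scores) - min(scores)

# Population variance: the average squared deviation from the mean.
variance = statistics.pvariance(scores)

# Standard deviation: the square root of the variance.
std_dev = statistics.pstdev(scores)

# Mean absolute deviation: the literal "average distance from the mean".
mean_abs_dev = sum(abs(s - mean) for s in scores) / len(scores)

print(f"mean={mean}, range={score_range}, variance={variance}")
print(f"SD={std_dev}, mean absolute deviation={mean_abs_dev}")
```

For these scores the SD (2.0) is a little larger than the mean absolute deviation (1.5), because squaring gives extreme scores more weight.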
When to Use Graphs • To illustrate relative amounts • To specify the subject • To answer specific questions
Bar Graphs • Quantitative and Rank-Order Data • Show achievement of objectives • Frequency histogram
Frequency Distribution & Histogram • Frequency Distribution • “a set of scores arranged in order of magnitude along the x-axis and the frequency of each score is represented along the y-axis” • Frequency Histogram • similar to bar graphs but has no spaces between the bars
Example of company • 25 employees • 1 Owner at 450K • 1 VP at 150K • 2 Directors at 100K • 1 Manager at 57K • 3 Department heads at 50K • 4 Section Chiefs at 37K • 1 Maintenance at 30K • 12 Shift workers at 20K
Company data average • Mean = 57 K • Median = 30 K • Mode = 20 K • What is the average wage?
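The three averages for the company example can be checked with a short sketch. The salary figures (in thousands of dollars) come from the example slide; the variable names are illustrative.

```python
import statistics

# Salaries in thousands of dollars, one entry per employee,
# built from the head counts on the company example slide.
salaries = (
    [450] * 1    # owner
    + [150] * 1  # VP
    + [100] * 2  # directors
    + [57] * 1   # manager
    + [50] * 3   # department heads
    + [37] * 4   # section chiefs
    + [30] * 1   # maintenance
    + [20] * 12  # shift workers
)

mean = statistics.mean(salaries)      # sum of all salaries / number of employees
median = statistics.median(salaries)  # middle value of the sorted list
mode = statistics.mode(salaries)      # most frequent value

print(f"n={len(salaries)}, mean={mean}K, median={median}K, mode={mode}K")
```

The owner's salary pulls the mean well above the median, which is why the three averages give such different answers for the same data.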
Draw Frequency Distribution • Group data into intervals (5 to 10) • Define the size of the interval widths based on understandable units • Range/intervals • Make sure the intervals do not overlap • Work in teams to draw a distribution of the Salkind book data or the salary data • Handout: project data & results on screen • The result sheet
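One way to sketch the grouping steps above for the salary data. The interval width of 50K is an assumption chosen to give 10 non-overlapping intervals; other widths would work equally well.

```python
from collections import Counter

# Salaries in thousands of dollars, from the company example.
salaries = [450] + [150] + [100] * 2 + [57] + [50] * 3 + [37] * 4 + [30] + [20] * 12

WIDTH = 50  # assumed interval width, in thousands of dollars

# Assign each salary to the lower bound of its interval:
# 0-49, 50-99, 100-149, ... so no salary can land in two intervals.
counts = Counter((s // WIDTH) * WIDTH for s in salaries)

# Print one row per interval, including empty ones, as a text histogram.
highest_lower_bound = (max(salaries) // WIDTH) * WIDTH
for lower in range(0, highest_lower_bound + WIDTH, WIDTH):
    n = counts.get(lower, 0)
    print(f"{lower:>3}-{lower + WIDTH - 1:<3}K | {'#' * n} ({n})")
```

Most employees fall in the lowest interval, with a long, nearly empty stretch up to the owner's salary.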
Frequency distribution--Normal Curve (Figure 12.2, p. 445) • Many statistics assume the normal, bell-shaped curve distribution for scores. • 50% of scores above the mean; 50% below • Normal curve for population (height, weight, IQ scores) • Mean = median = mode • Mean to +1 SD: 34.13% of the scores • Mean to -1 SD: 34.13% of the scores • Mean +/- 3 SD: more than 99% of the scores
Skewed Distribution • Non-symmetrical distribution • Mean, median, mode not the same • Negatively skewed (Figure 12.3, p. 447) • extreme scores at the lower end • Mean < median < mode • Most did well, a few poorly • Positively skewed • extreme scores at the higher end • Mean > median > mode • Most did poorly, a few well • Colorado Mountain: Ski to the right -> skew to the right • The further apart the mean and median, the more the distribution is skewed.
Describing-Variability • Standard Deviation [or dispersion] (average distance from the mean) • Within 1 SD of the mean: about 34% of scores on each side (68% in total) • Within 2 SD: about 47.7% on each side (95.4% in total) • Within 3 SD: about 49.9% on each side (99.7% in total) • SD chart by Kathleen Barlo • EET article on SD • URL: http://coe.sdsu.edu/eet/Articles/standarddev/index.htm
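The percentages on this slide are rounded. A short sketch using `statistics.NormalDist` from the Python standard library shows where they come from, assuming a perfectly normal distribution; the helper name `area_one_side` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_one_side(k):
    """Fraction of scores between the mean and k SDs above it."""
    return standard_normal.cdf(k) - 0.5

for k in (1, 2, 3):
    one_side = area_one_side(k)
    # The curve is symmetric, so the same fraction lies below the mean.
    print(f"{k} SD: {one_side:.2%} on each side, {2 * one_side:.2%} in total")
```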
Using SD in Prescribing Cereal • As a practicing nutritionist, Dr. Green frequently hears patient questions like "Which cereals fit within my diet in terms of calories and fat grams?" • Dr. Green uses descriptive statistics to give advice. • Launch Cereal data from Data->Load data->sample data (StatCrunch) • Draw frequency histogram • Fruit Loops calories: 2 SD above the mean • Give it to someone who is trying to lose 10 lbs?
StatCrunch Demo http://focus.sdsu.edu/statcrunch4.0/
Mini-Data Activity • Use the Culture Data posted on the weekly page: • http://edweb.sdsu.edu/courses/ed690new/week8new.htm • Run the basic descriptive analysis • Instructions: see “Guide for Analysis” worksheet on file Culture_Minida_SCrunchSee
Predicting Height: Normal Distribution
                 Mean Height   SD
Monday           67.9”         3.56”
Tuesday*         68.0”         3.6”
Both Sections    68.0”         3.5”
From this sample of 30 adults, the average height is 68.0 inches (5 feet 8 inches). 99% of all adults fall between the heights of ??? and ???
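One way to approach the height question, sketched under two assumptions that the slide does not state: adult heights are normally distributed, and the combined-section sample mean (68.0) and SD (3.5) can stand in for the population values. With only 30 adults in the sample, the result is a rough estimate at best.

```python
from statistics import NormalDist

# Assumed population model, using the "Both Sections" mean and SD in inches.
heights = NormalDist(mu=68.0, sigma=3.5)

# Middle 99% of the distribution: cut off 0.5% in each tail.
low = heights.inv_cdf(0.005)
high = heights.inv_cdf(0.995)
print(f"Middle 99%: {low:.1f} to {high:.1f} inches")

# Rule-of-thumb alternative from the normal curve slide:
# mean +/- 3 SD covers more than 99% of scores.
print(f"Mean +/- 3 SD: {68.0 - 3 * 3.5:.1f} to {68.0 + 3 * 3.5:.1f} inches")
```

The two ranges differ slightly because exactly 99% of a normal distribution lies within about 2.58 SD of the mean, while 3 SD covers about 99.7%.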
*Describing-Relative standing • Percentile • z score • based on sd • Score of 0 is mean. • Score of 1 is 1 sd above mean (about the 84th percentile) • T score • 10z + 50 • Quartile • Divided into 4 groups • Stanine: Divided into 9 groups Questions 1-5 on Page 451
Measure of relative standing • Z-score • One type of standard score • Compares scores from different tests • Convert scores to z scores, average them -> final index of average performance • z = (Raw Score (X) - Mean) / SD • Z score of mean = 0 • Percentiles • The percentage of scores that fall at or below a given score • Outliers • Example • GRE
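A minimal sketch of the standard-score conversions, including the T score formula (10z + 50) from the previous slide. The test mean of 500 and SD of 100 are hypothetical values chosen for illustration, and the percentile conversion assumes the scores are normally distributed.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z score so the mean is 50 and one SD is 10 points."""
    return 10 * z + 50

def percentile(z):
    """Percentage of scores at or below z, assuming a normal distribution."""
    return NormalDist().cdf(z) * 100

# Hypothetical test: mean 500, SD 100, raw score 600.
z = z_score(raw=600, mean=500, sd=100)
print(f"z = {z}, T = {t_score(z)}, percentile = {percentile(z):.1f}")
```

Because z scores share a common scale, scores from tests with different means and SDs can be converted this way and then compared or averaged.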
Describing-Relationships • How variables are related--need at least 2 variables • Spearman rho • correlates data that are ranked • Pearson r • correlates data that are interval or ratio • How does foot size correlate to GRE scores? • Coefficients range from -1 to +1 • More in “Correlational Research”
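Both coefficients can be sketched in plain Python. The data below are hypothetical, and Spearman rho is computed here as Pearson r applied to the ranks, which is one common way to define it. The function names are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_deviation / (spread_x * spread_y)

def ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson r = {pearson_r(x, y):.3f}, Spearman rho = {spearman_rho(x, y):.3f}")
```

Here rho is 1 because the rank order matches perfectly, while r is slightly below 1 because the relationship is curved.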
Optional: Z-Distribution (Histogram) & Hypothesis Testing • One type of frequency histogram • Z-distribution (normal) lies at the heart of inferential statistics
Optional--Do Copper Bracelets reduce arthritic pain? • Take all patients’ scores (numbers of pain complaints) in the experimental and control groups and convert them to a single z score • If the z score of the treatment group falls within -2 to +2 SD, conclusion? • Nay! Since 95.44% of all z scores should fall within this range by chance anyway • If the z is > 2.00 or < -2.00, conclusion? • Yea! P (probability of chance) = 4.56%
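The decision rule above can be sketched as follows, assuming the z score follows a standard normal distribution when only chance is at work. The function names and the sample z values are hypothetical. The exact chance probability beyond +/-2 is about 4.55%; the slide's 4.56% comes from rounded table values.

```python
from statistics import NormalDist

def chance_probability(z):
    """Two-tailed probability of a z score at least this extreme by chance alone."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def differs_from_chance(z, cutoff=2.0):
    """Apply the slide's rule: only z scores beyond +/- cutoff count as an effect."""
    return abs(z) > cutoff

# Hypothetical z scores for a treatment group.
for z in (1.2, -2.4):
    verdict = "Yea" if differs_from_chance(z) else "Nay"
    print(f"z = {z:+.1f}: {verdict} (chance probability {chance_probability(z):.2%})")

print(f"Probability beyond +/-2.00: {chance_probability(2.0):.2%}")
```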