
Basic Statistics


bikita

Presentation Transcript


  1. Basic Statistics Measures of Variability

  2. Measures of Variability: The Range, Deviation Score, The Standard Deviation, The Variance

  3. STRUCTURE OF STATISTICS Continuing with numerical approaches. STATISTICS: DESCRIPTIVE (TABULAR, GRAPHICAL, NUMERICAL); INFERENTIAL (CONFIDENCE INTERVALS, TESTS OF HYPOTHESIS).

  4. STRUCTURE OF STATISTICS: NUMERICAL DESCRIPTIVE MEASURES DESCRIPTIVE (TABULAR, GRAPHICAL, NUMERICAL); NUMERICAL (CENTRAL TENDENCY, VARIABILITY, SYMMETRY).

  5. STRUCTURE OF STATISTICS: NUMERICAL DESCRIPTIVE MEASURES NUMERICAL (CENTRAL TENDENCY, VARIABILITY, SYMMETRY); VARIABILITY (RANGE, VARIANCE, STANDARD DEVIATION).

  6. We need the variability of IQs in the class! You are an elementary school teacher who has been assigned a class of fifth graders whose mean IQ is 115. Because children with an IQ of 115 can handle more complex, abstract material, you plan many sophisticated projects for the year. Do you think your projects will succeed? [Figure: IQ distribution of the general population, axis marked 55, 70, 85, 100, 115, 130, 145, with 115 highlighted and labeled 85%.]

  7. Is the average salary enough? We need the variability! Having graduated from college, you are considering two offers of employment, one in sales and the other in management. The pay is about the same for both. After checking out the statistics for salespersons and managers at the library, you find that those who have been working for 5 years in each type of job also have similar averages. Can you conclude that the pay for the two occupations is equal? [Figure: sales and management salaries compared around $20,000, with one occupation ranging from much less to much more.]

  8. Central Tendency measures: a group of scores (IQ of 100 students) is summarized by a single score (Mean IQ=118).

  9. Central Tendency Measures Which group is more homogeneous? Measures of Central Tendency do not tell you the differences that exist among the scores.

  10. Same Mean, Different Variability So What? How many are out here? [Figure: distributions 1 to 4 sharing the same Central Tendency (60) but differing in spread.]

  11. 1. The Range = the difference between the largest (Xmax) and the smallest (Xmin) scores. Scores: 25, 21, 22, 23, 28, 26, 24, 29. In order: 21 22 23 24 25 26 28 29. Range = 29 - 21 = 8. A large range means there is a lot of variability in the data.

  12. Drawbacks of The Range Scores: 10, 28, 26, 27, 29. In order: 10 26 27 28 29, so Range = 29 - 10 = 19. Without the 10, Range = 29 - 26 = 3. The Range depends on only the two extreme scores.
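
The range calculations on slides 11 and 12 can be checked with a few lines of Python. This is only a minimal sketch: the function name `value_range` and the variable names are illustrative choices, not part of the presentation.

```python
def value_range(scores):
    """Return the range: the largest score minus the smallest score."""
    return max(scores) - min(scores)

# Data taken from slides 11 and 12.
slide_11 = [25, 21, 22, 23, 28, 26, 24, 29]
with_outlier = [10, 28, 26, 27, 29]
without_outlier = [28, 26, 27, 29]  # same scores with the 10 removed

print(value_range(slide_11))         # 8
print(value_range(with_outlier))     # 19
print(value_range(without_outlier))  # 3
```

Removing the single score of 10 shrinks the range from 19 to 3, which is the sensitivity to extreme scores that the slide points out.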

  13. Range and Extreme Observations Because the range is determined by just two scores in the group, it ignores the spread of all scores except the largest and smallest. One aberrant score or outlier can greatly increase the range.

  14. Range and Measurement Scales Before you determine the Range, all scores must be arranged in order. Country Code (1=American, 2=Asian, 3=Mexican), SES (1=Upper, 2=Middle, 3=Lower), F, and Age are each coded 1, 2, 3, so each gives Range = 3 - 1 = 2.

  15. 3. The Variance: Differences among Scores [Figure: two scattered sets of individual scores, such as 78, 49, 41, 88, 35, and 66.]

  16. Total Variability = Sum of Individual Variability How can you determine the variability of each individual in a group? The amount of individual difference depends entirely on the comparison criterion. [Figure: scattered scores 72, 22, 22, 55, 70, 3, 12, 67.]

  17. Can you figure out how much each score differs from the other scores? Can you figure out the total amount of difference among the scores?

  18. You need a Common Criterion for computing Total Variability. Scores: 46, 48, 47, 53, 49, 51, 50, 45, 52. Reference score: the Mean Score, 49.

  19. You need a Common Criterion for computing Total Variability. Deviation Scores from the reference score (49): 46 → -3, 48 → -1, 47 → -2, 53 → +4, 49 → 0, 51 → +2, 50 → +1, 45 → -4, 52 → +3. A deviation score tells you how much a particular score deviates, or differs, from the mean.

  20. DEVIATION SCORE = (Xi - Mean) A score a great distance from the mean will have a large deviation score. [Figure: scores A to F plotted at varying distances from the Mean.]

  21. Sum of Deviation Scores Total amount of variability?! Conceptually: the sum of all distance values! Mathematically: no way!

  22. The idea makes sense… but if you compute the sum of the deviation scores, it always equals zero! Sum of Deviation scores = (-4) + (-3) + (-2) + (-1) + 0 + (1) + (2) + (3) + (4) = 0

  23. The Sum of Absolute Deviation Scores Sum of absolute deviation scores = (4 + 3 + 2 + 1 + 0 + 1 + 2 + 3 + 4) = 20. The sum of absolute deviations is rarely used as a measure of variability because the process of taking absolute values does not provide meaningful information for inferential statistics.
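
The deviation-score arithmetic from the preceding slides can be reproduced as a short sketch. The nine scores are the ones shown with the mean of 49; the variable names are illustrative assumptions.

```python
# The nine scores shown on the deviation-score slides.
scores = [45, 46, 47, 48, 49, 50, 51, 52, 53]

mean = sum(scores) / len(scores)          # reference score
deviations = [x - mean for x in scores]   # (Xi - Mean) for each score

sum_of_deviations = sum(deviations)                    # positives and negatives cancel
sum_of_abs_deviations = sum(abs(d) for d in deviations)

print(mean)                   # 49.0
print(deviations)             # [-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
print(sum_of_deviations)      # 0.0
print(sum_of_abs_deviations)  # 20.0
```

The signed deviations cancel to zero for any data set, because the mean is the balance point of the scores, so the plain sum cannot serve as a measure of total variability.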

  24. Sum of Squares of deviation scores “SS” Conceptually And Mathematically

  25. Sum of Squares of Deviation Scores, SS Instead of working with the absolute values of deviation scores, it is preferable to (1) square each deviation score and (2) sum them to obtain a quantity known as the Sum of Squares. $SS = (-4)^2 + (-3)^2 + (-2)^2 + (-1)^2 + 0^2 + 1^2 + 2^2 + 3^2 + 4^2 = 16 + 9 + 4 + 1 + 0 + 1 + 4 + 9 + 16 = 60$. In general, $SS = \sum_{i=1}^{N} (X_i - \text{Mean})^2$.
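
As a quick check of the Sum of Squares, the same nine scores (45 through 53, mean 49) can be run through a small helper. The function name `sum_of_squares` is a hypothetical choice for this sketch.

```python
def sum_of_squares(scores):
    """Sum of squared deviation scores: sum of (Xi - Mean)^2."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

scores = [45, 46, 47, 48, 49, 50, 51, 52, 53]
ss = sum_of_squares(scores)
print(ss)  # 60.0
```

Squaring removes the signs, so the deviations no longer cancel, and larger deviations contribute disproportionately more to the total.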

  26. So! Group of scores “A”: SS(A)=30. Group of scores “B”: SS(B)=40. Can you say that the variability of the data in Group B is greater than that of the data in Group A?

  27. What happens to SS when we look at some data? Group A: 3, 4 (Mean = 3.5). Group B: 3, 4, 3, 4 (Mean = 3.5). Group A: $SS = (3 - 3.5)^2 + (4 - 3.5)^2 = .50$. Group B: $SS = (3 - 3.5)^2 + (4 - 3.5)^2 + (3 - 3.5)^2 + (4 - 3.5)^2 = 1.00$.

  28. $SS = \sum_{i=1}^{N} (X_i - \text{Mean})^2$ SS tends to increase as the number of data points (N) increases. SS is not appropriate for comparing variability among groups having unequal sample sizes. How can you overcome this limitation of SS?

  29. If SS is divided by N, the resulting value will be the Mean of the Squared Deviation Scores (Mean Square): the VARIANCE.

  30. Group A: 3, 4 (Mean = 3.5). Group B: 3, 4, 3, 4 (Mean = 3.5). Group A: $V = [(3 - 3.5)^2 + (4 - 3.5)^2] / 2 = .50/2 = .25$. Group B: $V = [(3 - 3.5)^2 + (4 - 3.5)^2 + (3 - 3.5)^2 + (4 - 3.5)^2] / 4 = 1.00/4 = .25$.
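
The Group A and Group B comparison can be sketched in Python to show that SS grows with the number of scores while SS divided by N does not. The helper names below are illustrative assumptions.

```python
def sum_of_squares(scores):
    """Sum of squared deviations from the group mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def variance_n(scores):
    """SS divided by N (the mean square)."""
    return sum_of_squares(scores) / len(scores)

group_a = [3, 4]
group_b = [3, 4, 3, 4]  # the same two values, repeated

print(sum_of_squares(group_a), sum_of_squares(group_b))  # 0.5 1.0
print(variance_n(group_a), variance_n(group_b))          # 0.25 0.25
```

Both groups have the same spread, so their variances agree at .25 even though Group B has twice the SS.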

  31. Variance Population Variance Sample Variance

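  32. POPULATION VARIANCE $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$, where $\sigma^2$ (sigma square) is the population variance, $X_i$ is an individual value, $\mu$ is the population mean, and $N$ is the population size.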

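  33. SAMPLE VARIANCE $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$, where $S^2$ is the sample variance, $X_i$ is an individual value, $\bar{X}$ is the sample mean, and $n - 1$ (sample size minus 1) is the degrees of freedom. The sample variance ($S^2$) is used to estimate the population variance ($\sigma^2$).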
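
The two variance definitions differ only in the divisor. The sketch below computes both by hand and compares them with `pvariance` and `variance` from Python's standard `statistics` module. The data are the nine scores used earlier in the deck, treated first as a whole population and then as a sample purely for illustration.

```python
import math
import statistics

scores = [45, 46, 47, 48, 49, 50, 51, 52, 53]

def population_variance(values):
    """SS divided by N: treats the values as the whole population."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """SS divided by n - 1: treats the values as a sample."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

print(population_variance(scores))  # 60 / 9, about 6.67
print(sample_variance(scores))      # 60 / 8 = 7.5

# The standard library agrees with the hand-written versions.
assert math.isclose(population_variance(scores), statistics.pvariance(scores))
assert math.isclose(sample_variance(scores), statistics.variance(scores))
```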

  34. Why n-1 instead of n?

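  35. [Figure: a sample of size n is drawn from a population with $\mu = 100$; because of sampling error, the sample mean may fall below or above 100 (? < 100 < ?).]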

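  36. Ideally, a sample variance would be based on $\sum (X - \mu)^2$. This is impossible since $\mu$ is not known if one has only a sample of n cases. $\mu$ is substituted by $\bar{X}$. The sum of the squared deviations is smaller from $\bar{X}$ than from any other value. Hence, in a sample, the value of $\sum (X - \bar{X})^2 / n$ would be less than $\sum (X - \mu)^2 / n$ (equal only if $\bar{X}$ happens to equal $\mu$): $\frac{\sum (X - \mu)^2}{n} \geq \frac{\sum (X - \bar{X})^2}{n}$.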

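  37. One could correct for this bias by dividing by a factor somewhat less than n: $\frac{\sum (X - \bar{X})^2}{n - 1} > \frac{\sum (X - \bar{X})^2}{n}$, which moves the estimate toward the ideal sample variance.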
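
The bias argument can be illustrated without random numbers by listing every possible sample. The sketch below assumes a tiny hypothetical population of three values (2, 4, 6) and draws all samples of size 2 with replacement; neither the population nor the sample size comes from the slides.

```python
from itertools import product

# Hypothetical population, chosen only to keep the enumeration small.
population = [2, 4, 6]
mu = sum(population) / len(population)
pop_variance = sum((x - mu) ** 2 for x in population) / len(population)

n = 2
samples = list(product(population, repeat=n))  # all 9 samples, with replacement

def sum_of_squares(sample):
    """Squared deviations from the sample's own mean, not from mu."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample)

# Average each estimator over every possible sample.
avg_divide_by_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print(pop_variance)             # about 2.667
print(avg_divide_by_n)          # about 1.333, too small on average
print(avg_divide_by_n_minus_1)  # about 2.667, matches the population variance
```

Averaged over all samples, dividing by n underestimates the population variance, while dividing by n - 1 recovers it in this setting.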

  38. Degrees of freedom Sample of n = 5. If we know that the mean is equal to 5, and the first 4 scores add to 18, then the last score MUST equal 7, because we know that $\sum X$ must equal 25. Only n-1 scores are free to change.
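
The degrees-of-freedom example amounts to one line of arithmetic. In the sketch below, the four "free" scores (3, 4, 5, 6) are hypothetical values picked only because they add to 18, as the slide requires.

```python
n = 5
mean = 5
required_total = n * mean        # the five scores must add to 25

# Hypothetical first four scores; any four values summing to 18 would do.
first_four = [3, 4, 5, 6]

# Once the mean is fixed, the last score is no longer free to vary.
last_score = required_total - sum(first_four)
print(last_score)  # 7

full_sample = first_four + [last_score]
print(sum(full_sample) / n)  # 5.0
```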

  39. 4. Standard Deviation (SD): the positive square root of the variance. Population: $\sigma = \sqrt{\sigma^2}$. Sample: $S = \sqrt{S^2}$.
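
A short sketch of the standard deviation as the positive square root of the variance, checked against `pstdev` and `stdev` from Python's standard `statistics` module. The nine scores are reused from earlier slides for illustration.

```python
import math
import statistics

scores = [45, 46, 47, 48, 49, 50, 51, 52, 53]

mean = sum(scores) / len(scores)
ss = sum((x - mean) ** 2 for x in scores)   # sum of squares, 60.0

population_sd = math.sqrt(ss / len(scores))        # sqrt of SS / N
sample_sd = math.sqrt(ss / (len(scores) - 1))      # sqrt of SS / (n - 1)

print(round(population_sd, 3))  # 2.582
print(round(sample_sd, 3))      # 2.739

# Cross-check with the standard library.
assert math.isclose(population_sd, statistics.pstdev(scores))
assert math.isclose(sample_sd, statistics.stdev(scores))
```

Unlike the variance, the standard deviation is expressed in the same units as the original scores.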

  40. The Standard Deviation and the Mean with Normal Distribution [Figure: normal curve centered on $\mu$.]

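  41. Normal Distribution: Relationship between $\mu$ and $\sigma$ [Figure: normal curve with axis marks at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - 1\sigma$, $\mu$, $\mu + 1\sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$.]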

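  42. Normal Distribution: Relationship between $\bar{X}$ and $S$ [Figure: normal curve with axis marks from -3S to +3S; about 68% of scores fall within ±1S, 95% within ±2S, and 99.7% within ±3S.]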

  43. EMPIRICAL RULE • For any symmetrical, bell-shaped distribution, approximately 68% of the observations will lie within ±1 standard deviation of the mean; approximately 95% within ±2 standard deviations of the mean; and approximately 99.7% within ±3 standard deviations of the mean.

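  44. You can approximately reproduce your data! If a set of data has a Mean = 50 and SD = 10, then about 68% of the scores fall between 40 and 60, about 95% between 30 and 70, and about 99.7% between 20 and 80. [Figure: axis marked 20, 30, 40, 50, 60, 70, 80.]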
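
The intervals for Mean = 50 and SD = 10 can be computed directly. The sketch below additionally assumes the data are exactly normally distributed and uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later) to get the share of scores in each interval; real data will only match these figures approximately.

```python
from statistics import NormalDist

mean, sd = 50, 10
dist = NormalDist(mu=mean, sigma=sd)  # assumption: an exactly normal distribution

def share_within(k):
    """Return the interval mean ± k*SD and the share of scores inside it."""
    low, high = mean - k * sd, mean + k * sd
    return low, high, dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low, high, share = share_within(k)
    print(f"within {k} SD: {low} to {high}, about {share:.1%}")
# within 1 SD: 40 to 60, about 68.3%
# within 2 SD: 30 to 70, about 95.4%
# within 3 SD: 20 to 80, about 99.7%
```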
