Statistics
Descriptive Statistics
• Organize & summarize data (ex: central tendency & variability)
Scales of Measurement
Nominal
• Categories for classifying
• Least informative scale
• Ex: divide class based on eye color
Ordinal
• Order or relative position of items according to some criterion
• Tells order but nothing about distance between items
• Ex: Horse race
Scales of Measurement
Interval
• Scale with equal distances between points but without a true zero
• Ex: Thermometer
Ratio
• Scale with equal distances between points and with a true zero
• Ex: measuring snowfall
Histogram & Frequency Polygon
• X axis: possible scores
• Y axis: frequency
Normal Curve of Distribution
• Bell-shaped curve
• Absolutely symmetrical
• Central tendency: the mean, median, and mode all fall at the same point, the center of the curve
Central Tendency
Let’s look at the salaries of the employees at Dunder Mifflin Paper in Scranton:
• Mean, median, and mode.
• Watch out for extreme scores or outliers.
$25,000 - Pam
$25,000 - Kevin
$25,000 - Angela
$100,000 - Andy
$100,000 - Dwight
$200,000 - Jim
$300,000 - Michael
The median salary looks good at $100,000. The mean salary also looks good at about $110,700. But the mode salary is only $25,000. Maybe not the best place to work. Then again, living in Scranton is kind of cheap.
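The three measures for the salary list above can be checked with a short Python sketch. It uses only the standard-library `statistics` module; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median, mode

# Salaries from the slide, in dollars.
salaries = {
    "Pam": 25_000, "Kevin": 25_000, "Angela": 25_000,
    "Andy": 100_000, "Dwight": 100_000,
    "Jim": 200_000, "Michael": 300_000,
}
values = list(salaries.values())

salary_mean = mean(values)      # sum of the scores divided by how many there are
salary_median = median(values)  # middle score once the list is sorted
salary_mode = mode(values)      # most frequently occurring score

print(f"mean   = {salary_mean:,.0f}")
print(f"median = {salary_median:,.0f}")
print(f"mode   = {salary_mode:,.0f}")
```

The two highest salaries pull the mean above the median, which is why the slide warns about extreme scores.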
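Skewed Distributions
• Positively skewed: scores pile up at the low end, with a tail toward the high scores
• Negatively skewed: scores pile up at the high end, with a tail toward the low scores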
Bimodal Distribution
• Each hump indicates a mode; the mean and the median may be the same.
• Ex: Survey of salaries: might find that the two most-checked boxes are $25,000-$35,000 AND $50,000-$60,000
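A bimodal result like the salary survey can be detected by asking for every most-frequent value instead of just one. The sketch below uses `statistics.multimode` (Python 3.8+). The response counts and the two extra salary brackets are made up for illustration.

```python
from statistics import multimode

# Hypothetical survey responses: each entry is the salary box one person checked.
responses = (
    ["$25,000-$35,000"] * 9
    + ["$35,000-$50,000"] * 3
    + ["$50,000-$60,000"] * 9
    + ["$60,000+"] * 2
)

# multimode returns all values tied for the highest frequency,
# in the order they first appear in the data.
modes = multimode(responses)
print(modes)
```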
Variability
• How much the scores in a set tend to vary from each other and depart from the mean
• Ex: golf scores of an erratic golfer vs. a consistent golfer
Standard Deviation
• Statistical measure of variability in a group of scores
• A single number that tells how the scores in a frequency distribution are dispersed around the mean
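To make the idea concrete, here is a minimal Python sketch that computes a standard deviation step by step for the two golfers mentioned under Variability. The golf scores are invented for illustration, and the function uses the population formula (dividing by N); a sample standard deviation would divide by N - 1 instead.

```python
from math import sqrt
from statistics import pstdev

def standard_deviation(scores):
    """Population standard deviation: root of the mean squared deviation."""
    m = sum(scores) / len(scores)                   # 1. find the mean
    squared_devs = [(x - m) ** 2 for x in scores]   # 2. square each deviation from the mean
    variance = sum(squared_devs) / len(scores)      # 3. average the squared deviations
    return sqrt(variance)                           # 4. take the square root

# Hypothetical golf scores; both golfers average 72.
consistent = [72, 73, 72, 71, 72]
erratic = [62, 82, 70, 74, 72]

sd_consistent = standard_deviation(consistent)
sd_erratic = standard_deviation(erratic)

print(f"consistent golfer: SD = {sd_consistent:.2f}")
print(f"erratic golfer:    SD = {sd_erratic:.2f}")
```

Both golfers have the same mean, so only the standard deviation shows that one of them is far less predictable.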
Standard Deviation
[Figure: numeric example; values not recoverable from the slide]
Correlation DOES NOT IMPLY CAUSATION!
Correlation
• Two variables are related to each other; the relationship by itself does not show causation
• The strength of the correlation is expressed by a statistic called the correlation coefficient (-1.00 to +1.00)
• Positive: indicates the two variables go in the same direction
• Ex: High school GPA & college GPA
Correlation
Positive
• Two variables that go in the same direction
• Ex: High school GPA & college GPA
Negative
• Two variables that go in opposite directions
• Ex: Absences & exam scores
Strength of the Correlation (r)
• Correlation coefficient: numerical index of the degree of relationship between two variables, or the strength of the relationship
• Coefficient near zero = no relationship between the variables (one variable shows no consistent relationship to the other)
• Perfect correlation of +/- 1.00 is rarely ever seen
• Positive or negative ONLY indicates the direction, NOT the strength
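A correlation coefficient can be computed by hand in a few lines. The sketch below implements the Pearson r, the usual statistic behind r, in plain Python. The absence and exam-score data are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

# Hypothetical data: more absences tend to go with lower exam scores.
absences = [0, 1, 2, 4, 6, 8]
exam_scores = [95, 90, 88, 75, 70, 60]

r = pearson_r(absences, exam_scores)
print(f"r = {r:+.2f}")  # the sign gives the direction, the size gives the strength
```

A coefficient of -.80 is just as strong as one of +.80; only the direction differs.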
Coefficient of Determination
• Index of a correlation’s predictive power
• Percentage of variation in one variable that can be predicted based on the other variable
• To get this number, multiply the correlation coefficient by itself
• Ex: A correlation of .70 yields a coefficient of determination of .49 (.70 x .70 = .49), indicating that variable X can account for 49% of the variation in variable Y
• Coefficient of determination goes up as the strength of a correlation increases (B.11)
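The squaring step is easy to tabulate. This short Python sketch applies it to a few correlation values, including the .70 example above; the function name and the other values are chosen only for illustration.

```python
def coefficient_of_determination(r):
    """Square the correlation coefficient to get the share of variation accounted for."""
    return r * r

for r in (0.10, 0.40, 0.70, -0.70, 0.90):
    r_squared = coefficient_of_determination(r)
    print(f"r = {r:+.2f}  ->  r squared = {r_squared:.2f}")
```

Because the coefficient is squared, a negative correlation predicts just as well as a positive one of the same size.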
Inferential Statistics
• The purpose is to discover whether the findings can be applied to the larger population from which the sample was collected.
• Conventional cutoff for statistical significance: a p-value of .05 or less.
• That is, results this strong would be expected by chance alone no more than 5% of the time.
Null Hypothesis
• The null hypothesis assumes there is no true relationship between the variables.
• Is the observed correlation large enough to support our hypothesis, or might a correlation of this size have occurred by chance?
• Do our results allow us to REJECT the null hypothesis?
Statistical Significance
• It is said to exist when the probability that the observed findings are due to chance is very low, usually less than 5 chances in 100 (p-value of .05 or less)
• When we reject our null hypothesis, we conclude that our results are statistically significant.
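One way to see what "due to chance" means is a permutation test, sketched below in Python. This is an illustrative technique, not necessarily the test the slides have in mind. It shuffles one variable many times to mimic the null hypothesis of no relationship, then counts how often a shuffled correlation is at least as strong as the observed one. The data, the function names, and the number of shuffles are all assumptions made for the example.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

def permutation_p_value(xs, ys, trials=5000, seed=0):
    """Estimate how often chance alone gives a correlation this strong."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_strong = 0
    for _ in range(trials):
        rng.shuffle(shuffled)          # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_strong += 1
    return at_least_as_strong / trials

# Hypothetical data: absences and exam scores for eight students.
absences = [0, 1, 2, 3, 4, 6, 8, 10]
exam_scores = [96, 91, 89, 85, 78, 74, 65, 58]

p_value = permutation_p_value(absences, exam_scores)
print(f"estimated p-value = {p_value:.4f}")
print("reject the null hypothesis" if p_value <= 0.05 else "fail to reject the null hypothesis")
```

If the estimated p-value is .05 or less, a correlation this strong would rarely arise from shuffled, unrelated data, so the null hypothesis is rejected.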
Type I vs. Type II Error
• Type I error: we say the IV (independent variable) had an effect, but it didn’t
• False alarm
• Type II error: we don’t believe the IV had an effect, but it really does
• Which is worse?