EDUC 502: Introduction to Statistics

EDUC 502: Introduction to Statistics Lesson 3: variability 1/24/12

Review from last class • Percentile • The percentage of scores at or below a given score • To find it, count the scores at or below the score of interest, divide by the total number of scores, and multiply by 100 • Quartiles • Not actually covered last week, but important to know • The quartiles split the distribution into four parts, each containing 25% of the scores • Draw on board with a distribution
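The counting rule for a percentile can be sketched in a few lines of Python. This is only an illustration: the scores are made up and the function name `percentile_rank` is hypothetical, not something from the lesson.

```python
def percentile_rank(scores, score):
    """Percentage of scores at or below `score` (the counting rule above)."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Made-up scores, for illustration only.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# 4 of the 10 scores are at or below 70.
print(percentile_rank(scores, 70))  # 40.0
```

Software packages use slightly different conventions for percentiles and quartiles, so their results may differ a little from this simple count.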

What is variability? • Variability is how much scores fluctuate in the sample or population • There are a variety of ways of looking at the variability

Range • = Maximum score – Minimum score • Gives a very rough measure of variability: it is based on only two data points, so a single outlier can make it misleading
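A small sketch of how one outlier can dominate the range. The scores and the helper name `score_range` are made up for illustration.

```python
def score_range(values):
    """Range = maximum score minus minimum score."""
    return max(values) - min(values)

# Made-up scores that cluster tightly between 70 and 80.
scores = [70, 72, 75, 78, 80]
# The same scores plus a single very low outlier.
with_outlier = scores + [20]

print(score_range(scores))        # 10
print(score_range(with_outlier))  # 60
```

Five of the six scores are unchanged, yet the range is six times larger.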

Variability around the mean To get a better description of how much scores vary from one another, we calculate how much they vary around the mean. First, we need to understand that each score is a certain distance from the mean, $x - \bar{x}$. We can't simply average these distances, because the positive and negative distances cancel out and their sum always equals zero.
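The cancellation is easy to check numerically. The sketch below uses made-up scores; any data set would give a sum of zero, up to floating-point rounding.

```python
# Made-up scores, for illustration only.
scores = [2, 4, 6, 8, 10]

mean = sum(scores) / len(scores)

# Signed distance of each score from the mean: x - mean.
deviations = [x - mean for x in scores]

print(mean)             # 6.0
print(deviations)       # [-4.0, -2.0, 0.0, 2.0, 4.0]
print(sum(deviations))  # 0.0
```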

Variance • One way to solve the problem of the distances canceling each other out is to square each distance from the mean • Thus we get $s^2 = \sum (x - \bar{x})^2 / n$ • This still isn't perfect, though, because squaring puts the result in squared units of the original scores, which makes it hard to interpret as a typical distance from the mean

Standard Deviation • Because we have squared the differences, we now take the square root to bring the measure of variability back to the original units of the scores • $s = \sqrt{\sum (x - \bar{x})^2 / n}$
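A minimal sketch of the variance and standard deviation formulas above, dividing by n as on these slides. The scores are made up, and the function names are hypothetical.

```python
import math

def variance_n(values):
    """Sum of squared distances from the mean, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def sd_n(values):
    """Square root of the variance, back in the original units."""
    return math.sqrt(variance_n(values))

# Made-up scores: mean is 6, squared distances are 16, 4, 0, 4, 16.
scores = [2, 4, 6, 8, 10]

print(variance_n(scores))        # 8.0
print(round(sd_n(scores), 3))    # 2.828
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev`, which also divide by n and should agree with these hand-rolled versions.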

Population Variance and SD • The formulas are essentially the same, but we use different symbols to indicate when we are talking about samples vs populations • Sample = Latin alphabet • Population = Greek alphabet • Variance • $\sigma_x^2 = \sum (X - \mu)^2 / N$ • SD • $\sigma_x = \sqrt{\sum (X - \mu)^2 / N}$ • The problem is we very rarely have data from the population, so the question is, are our sample measures good enough to use to estimate the population? • NO!

Estimating population variance and SD • The sample statistics are biased estimates of the population values: on average they underestimate the population variance and SD • So we need an unbiased (or at least less biased) estimator • To get one, we simply divide by n – 1 instead of just n • n – 1 is the degrees of freedom • Degrees of Freedom (df) • The number of values in a calculation that are free to vary • If we have a sample of 50 and we know the mean, then only 49 values can vary. Once we know 49 of them, we know what the last value is. • There are proofs to show that this really is less biased, but we won't go over them
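In place of a proof, a small simulation can illustrate the bias. The sketch below is an assumption-laden example, not part of the lesson: it draws many samples of size 5 from a normal population whose variance is known to be 1, then averages the variance computed with n and with n – 1. The parameter name `ddof` is a hypothetical label for how much is subtracted from n in the divisor.

```python
import random

def variance(values, ddof):
    """Sum of squared distances from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - ddof)

rng = random.Random(0)   # fixed seed so the run is repeatable
n, trials = 5, 20000     # small samples make the bias easy to see

biased_total = 0.0
unbiased_total = 0.0
for _ in range(trials):
    # Population: normal with mean 0 and SD 1, so the true variance is 1.
    sample = [rng.gauss(0, 1) for _ in range(n)]
    biased_total += variance(sample, ddof=0)    # divide by n
    unbiased_total += variance(sample, ddof=1)  # divide by n - 1

mean_biased = biased_total / trials
mean_unbiased = unbiased_total / trials

print("average variance dividing by n:    ", round(mean_biased, 2))
print("average variance dividing by n - 1:", round(mean_unbiased, 2))
```

With n = 5, dividing by n should average out near (n – 1)/n = 0.8 of the true variance, while dividing by n – 1 should average out near 1. The gap shrinks as the sample size grows.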

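SD in relation to the normal curve • Because we assume the distribution is normal, we know what percentage of scores falls within a given number of SDs of the mean • Within 1 SD: about 34% on each side of the mean (about 68% in total) • Within 2 SD: about 47.5% on each side of the mean (about 95% in total)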
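These percentages can be checked against the normal curve itself. The sketch below uses the standard relationship between the normal distribution and the error function, `math.erf`; the function name `share_between_mean_and` is hypothetical.

```python
import math

def share_between_mean_and(k):
    """Proportion of a normal distribution between the mean and k SDs above it."""
    return 0.5 * math.erf(k / math.sqrt(2))

for k in (1, 2):
    one_side = share_between_mean_and(k)
    # Percentage on one side of the mean, then on both sides combined.
    print(k, round(100 * one_side, 1), round(200 * one_side, 1))
# 1 34.1 68.3
# 2 47.7 95.4
```

The 47.5% figure is the usual rule-of-thumb value that goes with "95% within 2 SD"; the exact share at 2 SD is slightly higher, and exactly 95% falls within about 1.96 SD.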
