Overview • Central Limit Theorem • The Normal Distribution • The Standardised Normal Distribution • Z Scores • Estimation • Confidence Intervals
Central Limit Theorem • The sampling distribution of the mean approaches a normal distribution as the sample size increases. • For sufficiently large samples (as a rule of thumb, N > 30) the sampling distribution of the mean will be approximately normal. • The sampling distribution of the mean will always be normal if the underlying population is also normal, irrespective of sample size. • For small samples (N < 30) taken from a normally distributed population, we know the form of the sampling distribution of the mean.
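The behaviour described on this slide can be checked with a small simulation. The sketch below is only an illustration: the exponential population, the sample size of 40, and the helper name `sample_means` are assumptions chosen for the example, not part of the original slides.

```python
import random
import statistics

def sample_means(draw, n, trials, seed=0):
    """Return `trials` sample means, each computed from `n` draws of `draw(rng)`."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]

# Hypothetical skewed population: exponential with mean 1 and SD 1.
means = sample_means(lambda rng: rng.expovariate(1.0), n=40, trials=5000)

print(statistics.fmean(means))  # close to the population mean, 1.0
print(statistics.stdev(means))  # close to 1 / sqrt(40), roughly 0.158
```

Even though the population is strongly skewed, the sample means cluster symmetrically around the population mean, with a spread that shrinks as N grows.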
The Normal Distribution • The normal distribution is a specific distribution with a particular shape • It is mathematically defined by the expression $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$ • This defines the probability density of a score x with respect to two parameters, the mean $\mu$ and the variance $\sigma^2$ of a population of scores
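As a rough check on the density formula, it can be written out directly and compared with the standard library. The function name `normal_pdf` and the parameter values (mean 100, SD 15) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and SD sigma, evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Illustrative parameters: mean 100, standard deviation 15.
density = normal_pdf(100, 100, 15)
reference = NormalDist(100, 15).pdf(100)
print(density, reference)  # the two values should agree
```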
Standardised Normal Distribution • We can define a normal distribution of standardised scores in the following way: $f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ • where $z = \frac{x - \mu}{\sigma}$ • z is known as a standardised score
Standardised Normal Distribution • Using calculus (specifically integration) we can calculate the area under a standardised normal distribution • This tells us the proportion of scores that fall within a particular region of the distribution. • We can use tables of standardised scores to estimate the probability of a score falling within a particular range in any set of normally distributed data
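The integration that lies behind the printed tables can be approximated numerically. This is a minimal sketch using the trapezoidal rule, which is one possible method and not necessarily how any particular table was produced; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def standard_normal_density(z):
    """Density of the standardised normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def area_between(a, b, steps=10_000):
    """Trapezoidal estimate of the area under the standard normal curve on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (standard_normal_density(a) + standard_normal_density(b))
    total += sum(standard_normal_density(a + i * h) for i in range(1, steps))
    return total * h

numeric = area_between(0.0, 1.0)  # area between the mean and z = 1
exact = NormalDist().cdf(1.0) - NormalDist().cdf(0.0)
print(round(numeric, 5), round(exact, 5))
```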
Using Z Scores • To use the standardised normal distribution we must adopt the basic assumption that the population of scores is normally distributed • What proportion of IQ scores are greater than 125? • IQ scores are normally distributed with a mean of 100 and a standard deviation of 15
Using Z Scores • We have to calculate the z score associated with an IQ of 125. • To do this we subtract the mean from the score and divide by the standard deviation • We obtain: $z = \frac{125 - 100}{15} = 1.67$
Using Z Scores • Now we look at the tables to find what proportion of scores lie beyond the z score of 1.67. • Looking at the table entry for z = 1.67 we find that 0.04745 of the area of the curve lies beyond the z value of 1.67. • If we multiply this by 100 we obtain the percentage of scores that lie beyond this value. • In other words 100 x 0.04745 = 4.745% of the population have an IQ greater than 125
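The table lookup for this example can be reproduced in code. The sketch below assumes Python's `statistics.NormalDist` as a stand-in for the printed table and rounds z to two decimals as a table would; the helper name `z_score` is hypothetical. The result may differ from a printed table in the last decimal place.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standardised score: distance of x from the mean in SD units."""
    return (x - mean) / sd

z = round(z_score(125, 100, 15), 2)  # rounded to two decimals, as in a printed table
tail = 1.0 - NormalDist().cdf(z)     # proportion of the curve beyond z
print(z, round(tail, 4))
```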
Using Z Scores • What proportion of IQ scores are less than 60? • We have to calculate the z score associated with an IQ of 60. • First subtract the mean from the score and divide by the standard deviation • We obtain: $z = \frac{60 - 100}{15} = -2.67$
Using Z Scores • Now we look at the tables to find what proportion of scores lie beyond the z score of -2.67 • Because the normal distribution is symmetric, the area below z = -2.67 equals the area beyond z = 2.67, so we use the table entry for z = 2.67. • Looking at the table entry for z = 2.67 we find that 0.00378 of the area of the curve lies beyond the z value of 2.67. • If we multiply this by 100 we obtain the percentage of scores that lie beyond this value. • In other words 100 x 0.00378 = 0.378% of the population have an IQ of less than 60
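The use of the positive table entry for a negative z score rests on the symmetry of the curve. A small sketch, again assuming `statistics.NormalDist` in place of the printed table, shows that the two tail areas coincide:

```python
from statistics import NormalDist

std_normal = NormalDist()
z = round((60 - 100) / 15, 2)          # z score for an IQ of 60, rounded to two decimals

lower_tail = std_normal.cdf(z)         # area below the negative z score
upper_tail = 1.0 - std_normal.cdf(-z)  # area beyond the matching positive z score
print(round(lower_tail, 4), round(upper_tail, 4))
```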
Using Z Scores • What proportion of scores lie between 85 and 115 on the IQ scale? • To do this we subtract the mean from the score and divide by the standard deviation for both points • We get: $z_1 = \frac{85 - 100}{15} = -1.00$ and $z_2 = \frac{115 - 100}{15} = 1.00$
Using Z Scores • Now we look at the tables to find what proportion of scores lie between the mean of the distribution and -1.00 and 1.00 respectively. • Looking at the table entry for z = -1.00 we find that 0.34134 of the area of the curve lies between the mean and the z score -1.00. • Looking at the table entry for z = 1.00 we find that 0.34134 of the area of the curve lies between the mean and the z score 1.00. • The total proportion is the sum of the two values, i.e. 0.34134 + 0.34134 = 0.68268 • In other words 100 x 0.68268 = 68.268% of the population have an IQ of between 85 and 115.
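The same proportion can be obtained without standardising by hand, by working on the IQ scale directly. This is a sketch assuming `statistics.NormalDist`; the result may differ from the table-based sum in the last decimal place because the table entries are rounded.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

half = NormalDist().cdf(1.0) - 0.5   # area between the mean and z = 1.00
between = iq.cdf(115) - iq.cdf(85)   # area between IQ 85 and IQ 115
print(round(half, 5), round(between, 5))
```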
Estimation • Most of the time we do not know the population parameters • We would like to be able to make a judgement about the population parameters • In parametric statistics we can make "best guess" judgements about the parameters of populations • These "best guesses" are known as estimates
Point & Interval Estimation • There are two kinds of estimates, point and interval. • With point estimates we attempt to assign a particular value to a population parameter such as the mean or the variance • With interval estimates we try to construct a range in which the population parameter might fall and to which we can attach a probability
Point Estimates • For an estimate to be considered a good estimate it must be unbiased, sufficient and consistent. • Unbiased • The mean of the sampling distribution is equal to the population parameter being estimated. • Sufficient • The statistic on which the estimate is based uses all the information in the sample. • Consistent • Based on a statistic whose accuracy increases as sample size increases.
Measures of Centre • All measures of centre i.e. mean, mode and median are unbiased measures of their respective parameters. • The mean, however, is the only one of the sample statistics which is both sufficient and consistent.
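Measures of Spread • Both the sample variance and the sample standard deviation are biased estimates of the population parameters $\sigma^2$ and $\sigma$ • The mean of the sampling distribution of the variance is too small as an estimate of the population variance by a factor of: $\frac{N-1}{N}$ • so that $\frac{N}{N-1} s^2 = \frac{\sum (x - \bar{x})^2}{N-1}$, where $s^2$ is the sample variance computed with divisor N, • is an unbiased estimate of $\sigma^2$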
Measures of Spread • To distinguish the sample variance from the sample-based estimate of the population variance we will refer to the sample-based estimate as: $\hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{N-1}$ • Similarly the sample-based estimate of the population standard deviation is referred to as: $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$
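The relationship between the sample variance (divisor N) and the unbiased estimate (divisor N - 1) can be sketched as follows. The scores are hypothetical data invented for the example, and the function names are illustrative; the standard library's `pvariance` and `variance` use the same two divisors.

```python
import statistics

def biased_variance(xs):
    """Sample variance with divisor N."""
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def unbiased_variance(xs):
    """Estimate of the population variance: rescale the sample variance by N / (N - 1)."""
    n = len(xs)
    return biased_variance(xs) * n / (n - 1)

scores = [96, 104, 111, 89, 100, 118, 93, 107]  # hypothetical IQ scores
print(biased_variance(scores), statistics.pvariance(scores))
print(unbiased_variance(scores), statistics.variance(scores))
```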
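Standard Error of the Mean • The population standard error of the mean is defined as: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$ • The sample-based estimate of the population standard error of the mean is defined as: $\hat{\sigma}_{\bar{x}} = \frac{\hat{\sigma}}{\sqrt{N}}$, where $\hat{\sigma}$ is the sample-based estimate of the population standard deviation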
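Both versions of the standard error divide a standard deviation by the square root of N. A minimal sketch, assuming a known population SD of 15 in the first case and a hypothetical set of scores in the second:

```python
import math
import statistics

def standard_error(sd, n):
    """Standard error of the mean for a standard deviation sd and sample size n."""
    return sd / math.sqrt(n)

# Population SD known (assumed 15, sample of 25): 15 / sqrt(25).
known = standard_error(15, 25)

# Population SD unknown: estimate it from hypothetical scores (divisor N - 1).
scores = [96, 104, 111, 89, 100, 118, 93, 107]
estimated = standard_error(statistics.stdev(scores), len(scores))
print(known, estimated)
```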
Interval Estimates • Interval estimates are calculated on the basis of three factors: • A point estimate for the parameter • A measure of spread in the population • A probability value
Confidence Intervals • Suppose that we have tested the IQ of a number of subjects in an experiment. • We are going to calculate the 95% confidence interval for the population mean, µ • First we have to compute the population standard error of the mean: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$ • The standard error of the mean tells us how much we can expect sample means to fluctuate around the population mean.
Confidence Intervals • Assuming the population is normally distributed means that the sampling distribution of the mean is also normally distributed (central limit theorem). • Now define an area of the sampling distribution that should contain the middle 95% of the possible sample means. • In order to do this we must use the standardised formula as applied to the sampling distribution: $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$, where $\bar{x}$ is the sample mean and $\sigma_{\bar{x}}$ is the standard error of the mean
Confidence Intervals • If we look at the tables of z values we find that the z scores enclosing the central 95% of the distribution are ±1.96. • The upper limit of our range of values for the population mean is: $\bar{x} + 1.96\,\sigma_{\bar{x}}$ • The lower limit of our range of values for the population mean is: $\bar{x} - 1.96\,\sigma_{\bar{x}}$, where $\bar{x}$ is the sample mean and $\sigma_{\bar{x}}$ is the standard error of the mean
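Confidence Intervals • In general terms, the formula for the 95% interval for µ is: $\bar{x} - 1.96\,\sigma_{\bar{x}} \le \mu \le \bar{x} + 1.96\,\sigma_{\bar{x}}$ • For any level of confidence, ranging from 0 to 99.999% we have: $\bar{x} - z\,\sigma_{\bar{x}} \le \mu \le \bar{x} + z\,\sigma_{\bar{x}}$, where z is the table value that encloses the chosen central proportion of the distribution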
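The general interval can be sketched as a short function. The figures used here (25 subjects, a sample mean IQ of 104, a population SD of 15) are hypothetical and only illustrate the calculation; the function name `confidence_interval` is likewise an assumption, and `NormalDist().inv_cdf` stands in for the table of z values.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sd, n, level=0.95):
    """z-based interval for the population mean; sd is the population SD."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 when level is 0.95
    margin = z * sd / math.sqrt(n)               # z times the standard error
    return sample_mean - margin, sample_mean + margin

# Hypothetical study: 25 subjects, sample mean 104, population SD 15.
low, high = confidence_interval(104, 15, 25)
print(round(low, 2), round(high, 2))
```

Raising `level` widens the interval, since a larger central proportion of the sampling distribution needs a larger z value.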