Psych 5510/6510

Psych 5510/6510 Chapter Five Simple Models: Statistical Inferences about Parameter Values Spring, 2008

Contents • Understanding what we are doing • Computing PRE and estimating η² • Determining whether the PRE is statistically significant (i.e. whether the additional parameter of Model A is ‘worthwhile’). • Confidence interval of the parameter. • PRE as effect size. • Power

1) Understanding What We Are Doing

Example The mean score on a math test for the last several years among third graders at some school has been 65. A third grade teacher tries a new teaching method, and the 15 students in her class earn a mean score of 78.1 on the test. The principal would like to determine whether the population represented by those 15 students (i.e. the population of students who may be taught with the new teaching method) has a mean that is different from 65.

Single-group t test (from last semester) The hypotheses (two-tailed) for the population of students taught with the new method: H0: μ = 65 HA: μ ≠ 65 Draw the sampling distribution of the mean assuming H0 is true. Set up rejection regions: d.f. = N-1 = 14, tcritical = ±2.145 (for a two-tailed test).

t test (continued) Y=72, 52, 93, 86, 96, 46, 55, 74, 129, 61, 57, 115, 79, 89, 68 Reject H0
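The computation behind this slide did not survive in the transcript. As a minimal sketch, assuming the usual single-group formula t = (sample mean − 65) / (s / √N), the test can be reproduced from the listed scores. The variable names are illustrative, and the critical value is the one quoted on the previous slide.

```python
import math

# Scores of the 15 students, as listed on the slide
scores = [72, 52, 93, 86, 96, 46, 55, 74, 129, 61, 57, 115, 79, 89, 68]
mu_0 = 65            # value of mu under H0
t_critical = 2.145   # two-tailed, alpha = .05, d.f. = 14 (from the slide)

n = len(scores)
mean = sum(scores) / n
# Estimate of the population variance: SS / (N - 1)
variance = sum((y - mean) ** 2 for y in scores) / (n - 1)
# Estimated standard error of the mean
std_error = math.sqrt(variance / n)
t_obtained = (mean - mu_0) / std_error

# Two-tailed decision rule
reject_h0 = abs(t_obtained) > t_critical
print(f"mean = {mean:.1f}, t({n - 1}) = {t_obtained:.2f}, reject H0: {reject_h0}")
```

The obtained t falls just beyond the critical value, which agrees with the slide's decision to reject H0.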

Doing the same thing with our new model comparison approach Advantages • Uses an approach that generalizes to other experimental designs. • Gives us an estimate of the size of the effect. • Impresses your friends

Quick Review Yi = Ŷi + ei • Simplest model, no parameters: Ŷi = B0 where B0 equals some constant; this is not considered to be a parameter in this approach as it is not estimated from the current data. • Next simplest model, one parameter that makes a non-conditional prediction (makes the same prediction for everyone): Ŷi = β0 where β0 equals μ; the estimate of μ will come from the sample.

Our Models MODEL C: (Compact model) Ŷi = B0 where B0 = 65, PC=0 MODEL A: (Augmented model) Ŷi = β0 where β0 = μ, PA=1

Hypotheses Model C: Ŷi = B0 where B0 = 65 Model A: Ŷi = β0 where β0 = μ We will start off with a two-tailed test. There are several equivalent ways of expressing our hypotheses. We could use: H0: β0 = B0 HA: β0 ≠ B0 In the model comparison approach it is always the case that if H0 is true then Model A would be the same as Model C.

Hypotheses Model C: Ŷi = B0 where B0 = 65 Model A: Ŷi = β0 where β0 = μ H0: β0 = B0 HA: β0 ≠ B0 Given that B0 = 65 and β0 = μ we could also state the hypotheses as follows (which is how they appear in the t test for a single group mean): H0: μ = 65 HA: μ ≠ 65

Hypotheses Model C: Ŷi = B0 where B0 = 65 Model A: Ŷi = β0 where β0 = μ Remember that η² represents the actual reduction in error from moving from Model C to Model A in the population from which we are sampling. One way of expressing our hypotheses that works for every use of the model comparison approach is: H0: η²=0 HA: η²>0

Hypotheses This gives us at least three equivalent ways of expressing the hypotheses of this experiment. H0: β0 = B0 HA: β0 ≠ B0 H0: μ = 65 HA: μ ≠ 65 H0: η²=0 HA: η²>0

What We Are Doing Model C: Ŷi = B0, where B0 = 65 Model A: Ŷi = β0, where β0 = μ If we find it is ‘worthwhile’ to go to Model A then we would be saying that it is better to use the mean of the population than it is to use ‘65’ as our model. This implies that the mean of the population must not be 65 (which is what we are trying to determine).

How We Are Doing It Model C: Ŷi = B0, where B0 = 65 Model A: Ŷi = β0, where β0 = μ To determine whether it is worthwhile to move to Model A we need to examine the error that results from applying each model to our sample. For Model A, however, we do not know the actual value of μ, so we will estimate the value of μ using the data from our sample. Model A: Ŷi = b0 where b0 = est. μ = mean of our sample = 78.1

Q: Does Model A reduce error enough to be worthwhile? This will be determined by whether or not the PRE is statistically significant.

Why SSE(A) is Always ≤ SSE(C) Look at the formula for SSE(A); remember from last semester that the sample mean will always lead to the smallest possible SS. If the constant proposed in Model C equals the sample mean then SSE(C) would equal SSE(A), otherwise SSE(A) will be smaller than SSE(C).
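The formula this slide points to is not in the transcript. A short derivation, using only the definitions of the two error sums with Ȳ as the sample mean, shows where the inequality comes from:

```latex
\begin{aligned}
SSE(C) &= \sum_{i=1}^{N} (Y_i - B_0)^2
        = \sum_{i=1}^{N} \bigl[(Y_i - \bar{Y}) + (\bar{Y} - B_0)\bigr]^2 \\
       &= \sum_{i=1}^{N} (Y_i - \bar{Y})^2 + N(\bar{Y} - B_0)^2
        = SSE(A) + N(\bar{Y} - B_0)^2
\end{aligned}
```

The cross term drops out because deviations from the sample mean sum to zero. Since N(Ȳ − B0)² cannot be negative, SSE(A) can never exceed SSE(C), and the two are equal only when B0 is exactly the sample mean.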

Continuing This Thought Model C: Ŷi = B0, where B0 = 65 Model A: Ŷi = β0, where β0 = μ But we actually test: Model A: Ŷi = b0 where b0 = est. μ = the sample mean Our hypotheses are H0: β0 = B0 or μ = 65 HA: β0 ≠ B0 or μ ≠ 65 It might be that H0 is true and that μ = 65. Even so, if the sample mean doesn’t equal 65 due to chance then SSE(A) will still be less than SSE(C) and thus PRE will be greater than 0. This is one reason why we need to use significance testing to determine whether the PRE is sufficiently greater than zero to reject H0.

2) Computing PRE and Estimating η²

Definitional Formulas for SS
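The formulas from this slide are missing from the transcript. The following is a reconstruction based on how the neighbouring slides describe them, with Ŷ_iC and Ŷ_iA (notation assumed here) standing for the predictions of Model C and Model A for subject i:

```latex
\begin{aligned}
SSE(C) &= \sum_{i=1}^{N} (Y_i - \hat{Y}_{iC})^2 \\
SSE(A) &= \sum_{i=1}^{N} (Y_i - \hat{Y}_{iA})^2 \\
SSR &= SSE(C) - SSE(A) = \sum_{i=1}^{N} (\hat{Y}_{iC} - \hat{Y}_{iA})^2
\end{aligned}
```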

Definitional Formulas (cont.) You can get SPSS to do these formulas but it is tedious. You need to have SPSS create a new variable that equals the actual Y scores minus the predicted scores, then have SPSS create a new variable that equals the squared values of the previous variable, and then finally have SPSS give you the sum of that variable.

Definitional Formulas (cont.) While there is no reason to use the right-most part of this formula it does shed light on SSR. For each subject you subtract what Model A predicts their score will be from what Model C predicts their score will be, square those, and add them up. From this we can see that the more different Model A is from Model C the more Model A reduces error (which is what SSR measures).
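The three steps described for SPSS (residual, squared residual, sum) are easy to mirror in a few lines of code. This is only an illustrative sketch in Python rather than SPSS, and the function and variable names are made up for the example. Because it works from the raw scores, its results may differ slightly from the figures on the later slides, which are built from rounded summary values.

```python
# Scores of the 15 students, as listed on the t test slide
scores = [72, 52, 93, 86, 96, 46, 55, 74, 129, 61, 57, 115, 79, 89, 68]
n = len(scores)

pred_c = [65] * n                  # Model C predicts 65 for everyone
pred_a = [sum(scores) / n] * n     # Model A predicts the sample mean

def sum_squared_error(ys, preds):
    """Definitional SSE: residuals, then squares, then their sum."""
    residuals = [y - p for y, p in zip(ys, preds)]   # step 1: Y minus predicted
    squared = [e ** 2 for e in residuals]            # step 2: square each residual
    return sum(squared)                              # step 3: add them up

sse_c = sum_squared_error(scores, pred_c)
sse_a = sum_squared_error(scores, pred_a)
ssr = sse_c - sse_a

# The 'right-most' form of SSR: squared differences between the two predictions
ssr_from_predictions = sum((c - a) ** 2 for c, a in zip(pred_c, pred_a))

print(f"SSE(C) = {sse_c:.2f}, SSE(A) = {sse_a:.2f}, SSR = {ssr:.2f}")
```

Both routes to SSR give the same number, which is the point the slide makes: the further Model A's predictions sit from Model C's, the more error Model A removes.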

Computational Formulas While SPSS will often give us what we need this semester, it does not directly provide the values we need for this particular use of the model comparison approach (performing the equivalent of the t test for a single group mean). SPSS can still be used, however, to do most of the number crunching for us.

Computing SSE(A) SSE(A) in this case is what we simply called the ‘SS’ last semester, the sum of the squared deviations from the mean. SPSS won’t give us the SS of a variable but it will give us the ‘variance’ of the variable (actually this is the estimate of the population variance based upon the sample).

Computing SSE(A) If we ask SPSS to find the ‘variance’ of variable Y (this is available through the ‘Descriptive Statistics’ item in the ‘Analyze Data’ menu) we find the variance of the Y scores equals 558.38. As N=15, we find:
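The result of this step is not in the transcript. Since the estimated population variance is SS/(N − 1), multiplying it back by N − 1 recovers the SS. Writing s² for the variance reported by SPSS (a symbol assumed here):

```latex
SSE(A) = s^2 (N - 1) = 558.38 \times 14 \approx 7817.3
```

The later slides carry this value as 7817.28, presumably from a less heavily rounded variance.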

Computing SSR SSE(C) is not so easy to compute but we can get there by first computing SSR, which is easy to compute. In this context (doing the equivalent of a t test for a single group) the formula for SSR reduces to:

Computing SSR In our example SSR equals:
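The reduced formula and its result are missing from the transcript. When Model C predicts a constant B0 and Model A predicts the sample mean Ȳ for every subject, the difference between the two predictions is the same for all N subjects, so the sum of squared differences collapses to a single term:

```latex
SSR = N (B_0 - \bar{Y})^2 = 15\,(65 - 78.1)^2 = 15 \times 171.61 = 2574.15 \approx 2574.2
```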

Computing SSE(C) Now that we have SSE(A) and SSR finding SSE(C) is easy. Since SSR=SSE(C)-SSE(A), then: SSE(C) = SSE(A) + SSR SSE(C) = 7817.28 + 2574.2 = 10391.48

Computing PRE Model A led to about a 25% reduction in error compared to Model C.
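The computation itself is not in the transcript. Using the sums of squares from the preceding slide, the proportional reduction in error works out as:

```latex
PRE = \frac{SSE(C) - SSE(A)}{SSE(C)} = \frac{SSR}{SSE(C)} = \frac{2574.2}{10391.48} \approx 0.248
```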

Estimating η² PRE measures how much error was reduced by Model A in our sample. PRE is a biased estimate of how much Model A would reduce error if applied to our population (η²). PRE tends to be greater than η². The following formula gives us an unbiased estimate of η² based upon our sample.

Unbiased Estimate of η² Note the last piece in the adjustment. The adjustment becomes bigger when PA is a lot larger than PC (i.e. when Model A adds lots of extra parameters), and the adjustment also becomes bigger when PA and PC approach n (approach the maximum number of parameters we can have). And finally, note that when PA=n, we divide by zero (which is undefined), which makes sense as PRE will always equal 1 if n=PA, so there is no way to estimate the true value of η².
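The formula itself did not survive in the transcript. The sketch below assumes the adjustment 1 − (1 − PRE)(n − PC)/(n − PA), which matches every property the slide describes (it grows as PA exceeds PC, grows as PA and PC approach n, and is undefined when PA = n). The function name is hypothetical.

```python
def unbiased_eta_squared(pre, n, pa, pc):
    """Adjusted estimate of eta squared (assumed form of the slide's formula).

    pre: proportional reduction in error observed in the sample
    n:   number of observations
    pa:  number of parameters in Model A
    pc:  number of parameters in Model C
    """
    if n <= pa:
        # With PA = n the adjustment divides by zero, so no estimate exists
        raise ValueError("estimate is undefined when PA is not less than n")
    adjustment = (n - pc) / (n - pa)
    return 1 - (1 - pre) * adjustment

# Values from the running example: PRE = .248, n = 15, PA = 1, PC = 0
estimate = unbiased_eta_squared(pre=0.248, n=15, pa=1, pc=0)
print(f"unbiased estimate of eta squared = {estimate:.3f}")
```

Under that assumption the sample PRE of about .25 shrinks to an estimate of roughly .19 for the population.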

Our Computations So Far

3) Determining whether the PRE is statistically significant (i.e. whether the parameter of Model A is ‘worthwhile’).

Worthwhileness Model C: Ŷi = B0 where B0 = 65 Model A: Ŷi = β0 where β0 = μ Moving from Model C to Model A reduced error by about 25%. If that reduction is statistically significant then we will reject Model C in favor of Model A, saying that the parameter of Model A is ‘worthwhile’ to add to our model, that a model which uses the mean of the population is better than a model that uses a value of 65, which would imply that the mean of the population must not be 65.

Testing the Statistical Significance of the PRE Three equivalent methods: • See if the PREobtained exceeds the PREcritical. This is the most direct way given the model comparison approach. • Change the PRE into a value of Fobtained (using what I call the ‘PRE to F Method’), and see if it exceeds Fcritical. This approach is the most conceptually clear of the three. • Change the results into Mean Squares and from that compute a value of Fobtained (using what I call the ‘Mean Square to F Method’), and see if it exceeds Fcritical. This approach best fits what is available in SPSS.

Hypotheses (again) Model C: Ŷi = B0 where B0 = 65 Model A: Ŷi = β0 where β0 = μ We have various ways of expressing the hypotheses of this experiment. H0: β0 = B0 or μ = 65 or η²=0 HA: β0 ≠ B0 or μ ≠ 65 or η²>0 To understand our first approach we will focus on η².

A) Comparing PREobt to PREc H0: η²=0 (There is no real reduction in error from Model A; the extra parameter of Model A is not worth incorporating into our model) HA: η²>0 (There is a real reduction in error from Model A; the extra parameter of Model A is worth incorporating into our model) So we need to look at the PRE from the sample to see if it is large enough to conclude that η² (i.e. the PRE in the population) is actually greater than zero. But we know that PRE probably won’t exactly equal η² and in fact is usually greater than η², so we need to find the value of PRE that has only a 5% chance of being equaled or exceeded if H0 is true.

PREcritical Values See the handout on PRE critical values. PA = the number of parameters in MODEL A = 1 PC = the number of parameters in MODEL C = 0 N = the number of observations = 15 PREcritical = .247 If H0 is true (η²=0) then there is a 95% chance that PREobt will be between 0 and .247, and only a 5% chance that PREobt will be .247 or above. If PREobt ≥ PREcritical then reject H0 (p<.05). PREobt = .248, so we reject H0 and conclude that it is worthwhile to move to Model A, which in this context means it is worth estimating μ, for it’s not what was proposed by Model C (i.e. μ is not 65).
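Because PRE and F are tied together one-to-one, a PRE critical value can also be recovered from an F critical value instead of the handout. The sketch below inverts the PRE-to-F relationship, taking the tabled Fcritical of 4.60 for 1 and 14 d.f. (the value quoted later in these slides) as input. The function name is hypothetical.

```python
def pre_critical(f_critical, n, pa, pc):
    """Convert an F critical value into the matching PRE critical value.

    Solves F = [PRE / (PA - PC)] / [(1 - PRE) / (n - PA)] for PRE.
    """
    df_numerator = pa - pc
    df_denominator = n - pa
    return (f_critical * df_numerator) / (f_critical * df_numerator + df_denominator)

# Running example: Fcritical = 4.60 with d.f. = 1 and 14
pre_crit = pre_critical(f_critical=4.60, n=15, pa=1, pc=0)
pre_obtained = 0.248

# Decision rule from the slide: reject H0 when PREobt >= PREcritical
reject_h0 = pre_obtained >= pre_crit
print(f"PREcritical = {pre_crit:.3f}, reject H0: {reject_h0}")
```

The converted value rounds to the .247 given in the handout, and the obtained PRE of .248 just clears it.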

Using the PRE Tool We could, instead, use the ‘PRE Tool’. Plugging in the values for PC, PA, N, and PRE, we find that the p value for a PRE of 0.248 is p=0.0496, which is less than our significance level of .05, so we reject H0.

PRE and F* For the second, equivalent approach to testing the statistical significance of the PRE we will translate the PRE value into an F value. Note: the book uses ‘F*’ to represent Fobtained. F* and PRE are related, to know one is to know the other (and to love one is to love the other).

B) PRE to F* Approach First, calculate the PRE per parameters added.

Second, calculate the ‘remaining proportion of error per parameters remaining’. (1-PRE): remaining proportion of error (after incorporating MODEL A). (N-PA): maximum number of possible parameters that could be added after incorporating MODEL A. Remaining proportion of error per parameters remaining:

F* If adding parameters to MODEL A only helped as much as we would expect due to chance, then the numerator will equal the denominator and F* will approximately equal one*. If the parameters helped more than chance then the numerator will be greater than the denominator and F* will be greater than one. * (actually the expected value of F is (n-PA)/(n-PA-2), which is approximately 1)

Our Example Obtain Fcritical from the F critical table, or from the ‘F Tool’, with: d.f.numerator=(PA-PC)=1, and d.f.denominator=(n-PA)=14 Fcritical=4.60, F*=4.62, so reject H0 (p<.05). In the ‘F Tool’ p=0.0496, the same p value as we found for the PRE.
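The two steps of the PRE to F* approach (PRE per parameter added, divided by the remaining proportion of error per parameter remaining) can be written out as a small function. This is a sketch with a hypothetical function name, using the rounded PRE of .248 from the slides.

```python
def pre_to_f(pre, n, pa, pc):
    """Turn a PRE into F* using the two steps described on the slides."""
    # Step 1: PRE per parameter added in moving from Model C to Model A
    pre_per_parameter = pre / (pa - pc)
    # Step 2: remaining proportion of error per parameter that could still be added
    remaining_per_parameter = (1 - pre) / (n - pa)
    return pre_per_parameter / remaining_per_parameter

f_critical = 4.60   # from the F table, d.f. = 1 and 14
f_star = pre_to_f(pre=0.248, n=15, pa=1, pc=0)

print(f"F* = {f_star:.2f}, reject H0: {f_star > f_critical}")
```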

C) Mean Squares to F* Approach Note: MSresidual is also known as MSerror.

ANOVA summary table The exact p value is available through the F Tool; with a table you could just say p<.05.
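Neither the mean square formulas nor the summary table survived in the transcript. The sketch below assumes the usual definitions, MSregression = SSR/(PA − PC) and MSresidual = SSE(A)/(n − PA), and rebuilds the rows of the table from the sums of squares computed earlier in the slides. The row labels are assumptions, not necessarily the ones the original table used.

```python
# Sums of squares and parameter counts from the running example
ssr, sse_a, sse_c = 2574.2, 7817.28, 10391.48
n, pa, pc = 15, 1, 0

df_regression = pa - pc   # parameters added by Model A
df_residual = n - pa      # parameters that could still be added
df_total = n - pc

ms_regression = ssr / df_regression
ms_residual = sse_a / df_residual      # a.k.a. MSerror
f_star = ms_regression / ms_residual

# Rows of the summary table: source, SS, d.f., MS
rows = [
    ("Regression (SSR)", ssr, df_regression, ms_regression),
    ("Residual (SSE(A))", sse_a, df_residual, ms_residual),
    ("Total (SSE(C))", sse_c, df_total, None),
]
for source, ss, df, ms in rows:
    ms_text = f"{ms:10.2f}" if ms is not None else " " * 10
    print(f"{source:<18}{ss:>10.2f}{df:>4}{ms_text}")
print(f"F* = {f_star:.2f}")
```

With these sums of squares F* comes out near 4.61. The 4.62 reported on the slides follows from rounding PRE to .248 before converting it to F*. Either value exceeds the Fcritical of 4.60.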

Summary All three approaches (looking up the PREcritical value, turning the PRE into an F* value, or computing MS’s) lead to exactly the same conclusion to reject H0 with p=.0496. Thus your decision would be to ‘reject H0’. If there are no confounding variables then you can conclude that the extra parameter of Model A is worthwhile, and in turn that the population of students taught using the new method has a different mean than those taught using the old method (i.e. that μ ≠ 65).

If p had been greater than .05, you would have decided to ‘not reject H0’, and said that you were unable to show that the extra parameter of Model A was worthwhile, that you were unable to show that the mean of the population of students taught using the new method differs from 65. Remember that you should not infer that H0 is true, only that you failed to reject it. You have not proven that the parameters of Model A are worthless, you have only failed to prove they are worthwhile. On the other hand, showing that you have a great deal of power in the experiment might allow you to infer that H0 is true.

Performing a One-Tail Test The approach we have used so far is for a two-tailed hypothesis: H0: β0 = B0 or μ = 65 HA: β0 ≠ B0 or μ ≠ 65 We can, with a little extra work, also test a one-tailed hypothesis:

One-Tailed Hypotheses Remember that HA states the theory we are trying to prove. Thus if the theory predicts that math scores will increase, then: H0: β0 ≤ B0 or μ ≤ 65 HA: β0 > B0 or μ > 65 If the theory predicts that math scores will decrease, then: H0: β0 ≥ B0 or μ ≥ 65 HA: β0 < B0 or μ < 65
