Factors Affecting Test Results: Cultural Considerations • Class 4 • February 9, 2004
Factors That Affect Test Results • What factors may have influenced Malcolm’s test results? • Factors Related to the Test • Factors Related to the Examiner • Factors Related to the Client (Malcolm) • Factors Related to the Testing Situation • Other Factors
“Psychological tests are tools…. Any tool can be an instrument of good or harm, depending on how it is used.” • Anastasi & Urbina (1997)
Test Bias • Group differences do occur on some tests • In almost every case, differences within groups are much larger than differences between groups • A test can only be said to be biased if some systematic source of error makes the test results unreliable or invalid for a particular group • Example: Imagine we are trying to use the SATs to predict college GPA. The SATs could be said to be biased if they predicted GPA differently for two groups • Two types of predictive bias: intercept bias and slope bias • Both types indicate that the test functions differently for the different groups
Intercept Bias • [Figure: GPA plotted against SAT scores]
Slope Bias • [Figure: GPA plotted against SAT scores]
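The two kinds of predictive bias can be made concrete by fitting a separate regression line (GPA predicted from SAT score) for each group and comparing the fitted lines. The sketch below is only an illustration: the data are made up and noise-free, the names `fit_line` and `compare_fits` are hypothetical, and comparing coefficients against a small tolerance stands in for the statistical test that real, noisy data would need.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def compare_fits(fit_1, fit_2, tol=1e-6):
    """Label how two (intercept, slope) pairs differ.

    Simplification for noise-free toy data: real data would call for
    a significance test rather than a fixed tolerance.
    """
    (a1, b1), (a2, b2) = fit_1, fit_2
    if abs(b1 - b2) > tol:
        return "slope bias"
    if abs(a1 - a2) > tol:
        return "intercept bias"
    return "no predictive bias"


# Made-up data: the same SAT scores, with the GPA each group earned.
sat = [400, 500, 600, 700]
gpa_group_a = [2.0, 2.5, 3.0, 3.5]     # reference group
gpa_group_b = [2.5, 3.0, 3.5, 4.0]     # same slope, line shifted upward
gpa_group_c = [2.0, 2.25, 2.5, 2.75]   # flatter line than group A

fit_a = fit_line(sat, gpa_group_a)
fit_b = fit_line(sat, gpa_group_b)
fit_c = fit_line(sat, gpa_group_c)

print(compare_fits(fit_a, fit_b))  # lines are parallel but offset
print(compare_fits(fit_a, fit_c))  # lines have different steepness
```

In this toy data, one common line would under-predict Group B's GPA at every SAT score (intercept bias), while for Group C the size of the prediction error would depend on the SAT score (slope bias).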
Lack of Test Bias • Group differences on tests, however, may reflect actual differences between groups • For example, a test that found higher rates of depression or eating disorders in women compared to men would (probably) not be considered biased • A test that found differences between students who studied and students who did not would not be biased • To return to our SAT/GPA example, the SATs would not be considered biased if Group A and Group B differed in SAT scores, as long as those score differences accurately predicted GPA for both groups
Test Fairness • A test can be used in a fair or unfair manner • Using a test in an unfair manner involves using the test in a way that discriminates against a particular group (e.g., denying them access to services) • Unfair use of a test does not necessarily imply that the test is biased • A test that reflects actual group differences and is not biased may still be used unfairly • For example, imagine a clinician only diagnoses people with Major Depression if their BDI score is two or more SD above the mean. A man with a BDI score much higher than most men may still not score two SD above the overall mean, because men have lower depression scores than women. This would be an unfair use of a non-biased test • Unfair uses of tests often have to do with cut-offs
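The cut-off example above comes down to simple arithmetic on standard scores. The sketch below works through it with invented numbers: the means, standard deviations, and the man's score are assumptions chosen only to illustrate the point, not actual BDI norms, and `z_score` and `meets_cutoff` are hypothetical helper names.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above the mean."""
    return (score - mean) / sd


def meets_cutoff(score, mean, sd, cutoff_sd=2.0):
    """True if the score is at least cutoff_sd SDs above the given mean."""
    return z_score(score, mean, sd) >= cutoff_sd


# Invented norms for illustration only (not real BDI statistics).
OVERALL_MEAN, OVERALL_SD = 10.0, 8.0   # men and women combined
MEN_MEAN, MEN_SD = 8.0, 7.0            # men only: lower mean

mans_score = 24.0

# Against the overall norms he falls short of the 2 SD cut-off...
print(z_score(mans_score, OVERALL_MEAN, OVERALL_SD))   # 1.75
print(meets_cutoff(mans_score, OVERALL_MEAN, OVERALL_SD))

# ...even though he is more than 2 SD above the mean for men.
print(meets_cutoff(mans_score, MEN_MEAN, MEN_SD))
```

The test itself measures the same thing in both comparisons; only the reference point for the cut-off changes, which is why this is a question of fair use rather than test bias.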
Cultural Factors • Why might differences between groups exist? • Tests that have been normed on a particular culture may not be appropriate for another culture for many reasons: • Language differences • Differences in understanding of or familiarity with the construct being tested • Differences in base rates or tolerance of a phenomenon • “Test savvy” - people who are used to being tested in a particular way perform better on those types of tests • Examiner characteristics - examinees perform better when the examiner is of the same ethnicity and gender as they are • Expectancies - examinees perform worse when they are “expected” to achieve low scores
Activation of Stereotypes • Stereotypes exist regarding test performance of one group relative to another • E.g., women are worse at math than men, Asian-Americans are better at math than Caucasians • When groups are tested, differences between them are small or nonexistent, and differences within groups are much bigger • However, when examinees are “reminded” of the stereotype, group differences become much bigger, due to the poor performance of the group “expected” to do worse • E.g., women perform significantly worse as a group on tests of math when the stereotype is activated
Social Desirability • People often rate themselves in a way that they believe casts a favorable light on them • They may try to appear more “normal”/less pathological • They may exaggerate the extent to which they possess certain desirable traits, or may claim to perform behavior consistent with a desirable trait more often than they do • This can also occur in reverse • If seeming “abnormal” is desirable for a particular person, that person may rate themselves as more abnormal (example: number of sexual partners) • This may include conscious and unconscious attempts to rate themselves favorably • This occurs even on (some argue especially on) anonymous surveys
“Above Average” Effect • Related to social desirability • Though people are fairly accurate at rating how often or how intensely they perform a specific behavior, people are less accurate on ambiguously defined traits (e.g., “intelligence”, “niceness”), especially if the traits are desirable • People tend to remember times when they behaved in the desirable way and dismiss times when they did not • Examples: • 93% of people rate themselves as happier than average • 89% of people rate themselves as smarter than average • When asked who would get into heaven, 87% of people chose themselves, followed by Mother Teresa (79%)
Faking “Good” or “Bad” • Trying to appear less “abnormal” on a test than one actually is, is referred to as minimizing or faking good. Trying to appear more pathological is referred to as malingering or faking bad • Many tests, though by no means the majority, now contain validity scales • For example, to assess faking good, tests ask questions which sound good but indicate the person is trying to appear more virtuous than they are (e.g., I always read every editorial in the newspaper) - called a lie scale • To assess faking bad, tests include questions which sound pathological but which are infrequently endorsed by anyone (e.g., I have never had any hair anywhere on my body) - called an infrequency scale
Flynn Effect • After tests have been in use for a while, scores tend to go up across the board (IQ scores, psychopathology scores) • Also, once tests are re-normed, test scores drop • Magnitude is small (e.g., ~3 points on IQ tests; 1-5 points on clinical measures depending on the measure) but can make a difference • What are some ways in which the Flynn Effect might help (or hurt) someone’s chances of obtaining services?
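One way to reason about the question above is to compare the same performance scored against outdated norms and against updated norms. The sketch below is a hypothetical illustration: the cut-off of 70, the eligibility rule, the example score of 72, and the helper names are all assumptions, and the 3-point shift is simply the approximate IQ figure quoted on this slide.

```python
def renormed_score(score_on_old_norms, drift_points=3.0):
    """Approximate score on updated norms.

    Assumes scores on the outdated norms have drifted upward by
    drift_points, so the same performance scores lower after re-norming.
    """
    return score_on_old_norms - drift_points


def eligible_for_services(score, cutoff=70.0):
    """Hypothetical rule: services are offered for scores below the cut-off."""
    return score < cutoff


old_score = 72.0                       # made-up score on outdated norms
new_score = renormed_score(old_score)  # same performance, updated norms

print(old_score, eligible_for_services(old_score))
print(new_score, eligible_for_services(new_score))
```

Under these assumptions the same person is ineligible on the old norms and eligible on the new ones, so a shift of only a few points matters most for people who score close to a cut-off.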
Is This You? • You have a need for other people to like and admire you, and yet you tend to be critical of yourself. While you have some personality weaknesses you are generally able to compensate for them. You have considerable unused capacity that you have not turned to your advantage. Disciplined and self-controlled on the outside, you tend to be worrisome and insecure on the inside. At times you have serious doubts as to whether you have made the right decision or done the right thing. You prefer a certain amount of change and variety and become dissatisfied when hemmed in by restrictions and limitations. You also pride yourself as an independent thinker; and do not accept others' statements without satisfactory proof. But you have found it unwise to be too frank in revealing yourself to others. At times you are extroverted, affable, and sociable, while at other times you are introverted, wary, and reserved. Some of your aspirations tend to be rather unrealistic.
Barnum Effect • An effect that is especially important for personality tests is the Barnum Effect (named for P. T. Barnum’s ability to sell anything) • Also called the Forer Effect, this refers to accepting broad, vague statements as being uniquely true for one person • E.g., “You can be highly emotional under very stressful circumstances”, “You enjoy being with other people, but there are also times when you’d rather be alone” • These statements apply to almost everyone • Many personality inventories (and psychological reports) contain these types of statements, which are then interpreted as uniquely true for that person
Applications • Sondra, an African-American woman, is referred for evaluation to the Schizophrenia Treatment Center. Ten years ago, the STC developed their own test of schizophrenia (STCTS; scored via T scores), which correlates with lifetime psychiatric hospitalizations at .90. Sondra is evaluated by Dr. White, who determines that she scores a 69 on the STCTS (just below the cut-off of 70). Dr. White diagnoses Sondra with “prodromal schizophrenia”. What factors that may affect test results are involved in this case?
Applications • Jeannie’s mother is very concerned that her daughter is depressed. When Jeannie’s mother completes a depression rating scale regarding her daughter, Jeannie scores in the moderately depressed range. However, when Jeannie completes a self-report children’s depression scale, she scores in the normal range, and in fact does not endorse a single symptom of depression. What might be going on?
Applications • Group differences emerge on all IQ tests, such that the mean for most minority groups is 0.5 SD below the mean for Caucasians (though the mean for Asian-Americans is 0.3 SD above the mean for Caucasians) • What might account for these group differences? • Is this an issue of test bias, test fairness, both, or neither? • What further information do we need to be able to answer these questions?