
PSY 6430, Unit 7 Survey of tests


nharper

Presentation Transcript


  1. Schedule • Monday and Wednesday: Lecture • Monday, 4/10: Exam • PSY 6430, Unit 7 Survey of tests

  2. SO2: The most important source for tests • Mental Measurements Yearbook (1938) • Now in its 20th edition, updated in 2016 • You can access it online for free through our library; the web site address is in the study objectives • Szostek & Hobson (2011) • I included this article so you can get an idea of how important this resource is from a legal perspective. Courts have acknowledged it as the “bible of testing” and the “authoritative source on testing” • I also included it because it uses the Myers-Briggs Inventory as an example of an extremely popular, yet really bad selection instrument (Just a brief mention of the MMY – you should always check the reviews for any off-the-shelf test an organization is planning on using – not going over the other study objectives on the article; and yes, it’s a sad thing the author spelled it wrong and the editors did not catch it. I am spelling it correctly)

  3. SO4: Myers-Briggs Type Indicator • MBTI • Personality inventory that classifies individuals into one of 16 types of personalities, using major categories of extraversion/introversion, intuition/sensing, feeling/thinking, perception/judgment • 89 of Fortune 100 companies have used it for selection and/or promotion • Over 2 million were administered last year • Just about every executive in business knows his/her MB scores/indicators • It has no validity or reliability!! • In terms of reliability, if you retake the inventory after 5 weeks, there’s a 50% chance that you will fall into a different personality category (and, remember, validity is bounded by reliability) • And, here is the clincher – it is based on Jungian psychology. No psychologist worth his/her salt takes Jungian psychology seriously anymore!! (how many of you have heard of this? Magical mystery tour)
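
The point that validity is bounded by reliability can be illustrated with a short sketch. It assumes the standard classical test theory result that an observed validity coefficient cannot exceed the square root of the product of the test's reliability and the criterion's reliability. The function name and the reliability values below are hypothetical and chosen only for illustration; they are not MBTI data.

```python
import math


def max_validity(test_reliability: float, criterion_reliability: float = 1.0) -> float:
    """Upper bound on an observed validity coefficient (classical test theory).

    The bound is sqrt(r_xx * r_yy), where r_xx is the reliability of the test
    and r_yy is the reliability of the criterion measure.
    """
    for r in (test_reliability, criterion_reliability):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliability must be between 0 and 1")
    return math.sqrt(test_reliability * criterion_reliability)


# Hypothetical test reliabilities, assuming a perfectly reliable criterion.
for r_xx in (0.90, 0.70, 0.50):
    print(f"test reliability {r_xx:.2f} -> validity can be at most {max_validity(r_xx):.2f}")
```

In practice the criterion (for example, supervisor ratings) is also less than perfectly reliable, which lowers the ceiling further.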

  4. SO4: Myers-Briggs Type Indicator • This is a perfect example of companies using selection procedures that are bunk, yet are very, very popular. • Just because something is popular and used by Fortune 100 companies doesn’t mean it has any professional merit whatsoever. Interestingly, the owners and creators of the MBTI state that “It is unethical to use the MBTI tool for hiring….The MBTI tool can’t tell you who to hire, but it can help you work with your team so that everyone gives his or her best performance.” https://www.cpp.com/pdfs/MBTI_myths_infographic.pdf

  5. SO8 Intro: Achievement vs. Aptitude Tests: An Arbitrary Distinction • For decades, tests have been classified as either achievement tests or aptitude tests • Definitions of “achievement” and “aptitude” • Achievement: The act of accomplishing or finishing something successfully, especially by means of skill, practice, or perseverance • Aptitude: A natural or acquired talent or ability or inclination; quickness in learning and understanding - intelligence • The distinction represents the mind-body dualism typical of traditional testing (in this material, GFB argue that the terms “achievement test” and “aptitude test” are inappropriate and should be replaced with the term “ability test” - and I agree with them; excellent material)

  6. SO8A, NFE: Typical distinction between achievement and aptitude tests • Achievement tests (supposedly) measure • What a person learned as a result of a specific structured educational/training experience/course • Scores are interpreted to be a measure of how much an individual knows as a result of the education or training • English grammar, math, science, social studies, etc. • These are the types of tests used in grade school and high school to measure student learning/proficiency • In Michigan, MEAP tests: Michigan Educational Assessment Program

  7. SO8A, NFE: Typical distinction between achievement and aptitude tests • Aptitude tests (supposedly) measure • Accumulation of learning from a number of diverse and usually informal learning experiences • Although not emphasized by GFB, there is a genetic implication • You have artistic ability or you don’t • You have mechanical aptitude or you don’t • Women don’t have an aptitude for math • Men don’t have good spatial aptitude, thus they can’t find their way around the mall • Said to measure potential to learn, or the potential to develop new skills and acquire new knowledge • If you don’t have the aptitude you can’t be a good artist, mechanic, mathematician • Intelligence tests, SATs, GREs, Artistic Aptitude • These are the tests that you are told you can’t study for (hogwash - most people don’t say that anymore) (Olympic athletes and musicians have “natural” ability – then we learn the parent was an Olympic athlete or musician – both parents were musicians…)

  8. SO8B&C, FE: Why is the distinction arbitrary? • All tests measure what a person has learned up to the time he or she takes the test and that is the only thing a test can measure • They cannot and do not measure innate or unlearned potential (even if that existed) • Thus, the distinction between achievement tests and aptitude tests is arbitrary and • We should use the term “ability tests” for both types of tests • Ability in the sense of competence or proficiency, regardless of how you have acquired the ability/skill

  9. SO8D: Tests can still be used to predict how well someone will perform. Why? • Tests can and do measure the prerequisites that are necessary for further learning in a specified area, and thus can predict future learning/performance • If students do not do well in PSY 3600, Concepts and Principles of Behavior Analysis, they cannot do well in PSY 4600, Survey of Behavior Analysis Research, thus a student’s grade in PSY 3600 can predict his or her performance in PSY 4600 • You can’t balance an equation in chemistry unless you know algebra, thus a test of algebra can predict performance in a chemistry class (not in text, but important to understand)

  10. SO9: Intro, Mental Ability and Cognitive Ability Tests = Intelligence Tests • Mental ability tests were at the center of early critical Supreme Court decisions regarding unfair discrimination • Thus, many companies stopped using them • However, there is a lot of research in selection that indicates that mental ability tests are related to performance in almost all jobs • Validity correlations are often quite high, and higher than those for other tests • Many companies are now using them again • Remember, however, if you use one of these, you must conduct an empirical validity study (or use validity generalization - risky) (as a behavior analyst, I still have trouble with the term “mental” ability since it still implies mind-body dualism; I’m more comfortable, but not completely, with “cognitive” ability; I haven’t been able to come up with anything different, and I certainly like those terms better than “intelligence” tests)

  11. SO9: Why is it that all mental ability tests are not interchangeable? • A rose is not a rose is not a rose • A mental ability test is not a mental ability test is not a mental ability test • Mental ability tests measure a collection of abilities - a learned repertoire that typically includes: • Verbal, math, memory, and reasoning abilities • 14 different abilities are often measured in some combination by mental ability tests (next slide) • Different mental ability tests often measure a different set of these abilities • Thus a person may score differently on different tests of mental ability

  12. (NFE) Abilities Measured by Various Mental Ability Tests • Memory span • Numerical fluency • Verbal comprehension • Conceptual classification • Semantic relations • General reasoning • Conceptual foresight • Figure classification • Spatial orientation • Visualization • Intuitive Reasoning • Ordering • Figure identification • Logical evaluation and deduction (that is why if you use the PAQ you must take great care in selecting tests that are similar to the GATB tests that are recommended)

  13. SO11, NFE: Why should these tests be called “mental ability” rather than “intelligence” or “I.Q.” tests? • The term mental ability makes it explicit that these tests measure various cognitive abilities of the applicant (and not some innate, unlearned, hypothetical construct called “intelligence”) • These cognitive abilities are most directly identified by what is measured (some combination of the 14 abilities listed earlier) and from the content of the items themselves • They should be thought of the same way the other abilities discussed in the book are thought of • e.g., mechanical ability, clerical ability • In other words, the authors are resisting the traditional view that there is something called “intelligence”

  14. (NFE) Popular mental ability tests • The authors describe the Wonderlic Personnel Test which is probably the most popular • Given to all players at the NFL Scouting Combine and scores are reported to NFL teams before the annual draft • For a moment, look at items in the text that are similar to the ones on the Wonderlic Personnel Test

  15. Examples of items similar to those on the Wonderlic 1. Which of the following months has 30 days? (a) February (b) June (c) August (d) December 2. Alone is the opposite of: (a) happy (b) together (c) single (d) joyful 3. Which is the next number in this series: 1, 4, 16, 4, 16, 64, 16, 64, 256, (a) 4 (b) 16 (c) 64 (d) 1024 (Two slides - Note: all six items are different types of items: general knowledge, opposites - verbal comprehension and vocabulary, numerical reasoning and ordering)

  16. Example items similar to those on the Wonderlic 4. Twilight is to dawn as autumn is to: (a) winter (b) spring (c) hot (d) cold 5. If Bob can outrun Junior by 2 feet in every 5 yards of a race, how much ahead will Bob be at 45 yards? (a) 5 yards (b) 6 yards (c) 10 feet (d) 90 feet 6. The two words relevant and immaterial mean: (a) the same (b) the opposite (c) neither same nor opposite (again, notice the type of questions: semantic or verbal reasoning, numerical fluency/reasoning, verbal comprehension - opposites)
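
Item 5 above is a numerical reasoning item, and the arithmetic behind it can be written out as a small sketch. The function name and the assumption that the lead accumulates at a constant rate over the race are mine, used only for illustration.

```python
FEET_PER_YARD = 3


def lead_in_feet(race_yards: float, gain_feet: float = 2, per_yards: float = 5) -> float:
    """Lead built up over a race, assuming a constant gain per segment."""
    segments = race_yards / per_yards  # number of 5-yard segments in the race
    return segments * gain_feet        # 2 feet gained in each segment


lead = lead_in_feet(45)
print(f"{lead:.0f} feet = {lead / FEET_PER_YARD:.0f} yards")  # prints "18 feet = 6 yards"
```

The trap in the item is the unit change: the gain is stated in feet, while most of the answer options are stated in yards.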

  17. SO12: Validity of mental ability tests • What have the validity studies uniformly concluded? Mental ability tests are among the most valid of all selection instruments (work samples are the only tests that seem to be as valid, and recent data suggest they have just as much adverse impact; next slide on validity of mental ability tests as well)

  18. SO13: Back to validity generalization • The validity correlations for both mental ability tests and other types of tests are highly stable across organizations (and, next slide)

  19. SO14: Validity of mental ability tests • Differences in the actual tasks that a person performs as part of a job have very little effect on the magnitude of the validity coefficients for mental ability tests • In other words, mental ability tests are valid predictors of performance for a wide variety of jobs

  20. SO15: Uniform Guidelines vs. validity generalization data • The data from the more recent meta-analyses conflict with the Uniform Guidelines • The Uniform Guidelines are based on situational specificity of validity: that is, that local validity studies are required • The meta-analysis studies indicate that is not correct, supporting validity generalization • Following from that, the requirement for local validity studies is not appropriate • However, courts follow the Uniform Guidelines and past court decisions, thus • It is still legally risky to use validity generalization, particularly given the language in CRA of 1991 • Uniform Guidelines need to be updated (a very important point: major collision of legal vs. professional)

  21. A problem with mental ability tests • Mental ability tests have repeatedly been shown to have adverse impact on protected classes, particularly blacks and Hispanics • This led to the notion that these types of tests might have differential validity - next

  22. SO16: Differential Validity • 16A: What is meant by differential validity? • Notion/hypothesis that tests are less valid for minority groups than for non-minorities • That is, a test may be significantly more valid for whites than for blacks • Term is related to test bias regarding ability tests, particularly mental ability tests • This claim is made over and over again with respect to SATs and GREs - that those tests are more predictive of the performance of white students than they are of the performance of minority students (extremely important; and mentioned often in selection as well as admissions to colleges and universities, and is still very controversial)

  23. SO16B: Differential Validity, the argument • The argument is that the content of ability tests is based on content/items related to the white middle-class (e.g., vocabulary and grammar), and thus the scores of the minorities are lower than what they should be

  24. SO16C: Differential Validity • The data are very clear about this issue: Differential validity does not exist • That is, tests are equally valid for whites and other ethnic/racial groups • It makes sense • Verbal comprehension skills are verbal comprehension skills • Verbal reasoning skills are verbal reasoning skills • Math skills are math skills, etc. • Thus if any of these skills are required by the job, they should be “equally required” by whites and members of other ethnic/racial groups

  25. SO17: Cognitive ability tests - Differences among demographic groups • Meta-analyses have been consistent – there are significant differences in mean test scores among racial/ethnic groups • Ranking: Asians, whites, Hispanics, blacks

  26. (NFE) Cognitive ability tests - Differences among demographic groups • Cognitive ability tests have a high correlation with job performance and academic performance • They have a disproportionate impact on Hispanics and blacks • Often result in adverse impact as legally defined when used for selection (important, difficult issue arises)
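
As a rough illustration of what "adverse impact as legally defined" usually means in practice, the sketch below applies the four-fifths (80%) rule of thumb from the Uniform Guidelines: a group's selection rate is compared with the rate of the group that has the highest rate. The slide does not spell this rule out, so treat it as background. The group labels and applicant counts are hypothetical.

```python
def selection_rate(hired: int, applicants: int) -> float:
    """Proportion of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return hired / applicants


def impact_ratios(groups: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Each group's selection rate divided by the highest group's rate."""
    rates = {name: selection_rate(h, a) for name, (h, a) in groups.items()}
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no one was selected in any group")
    return {name: rate / highest for name, rate in rates.items()}


# Hypothetical counts: (number hired, number of applicants).
groups = {"group_a": (48, 80), "group_b": (12, 40)}
for name, ratio in impact_ratios(groups).items():
    flag = "below four-fifths" if ratio < 0.8 else "at or above four-fifths"
    print(f"{name}: impact ratio {ratio:.2f} ({flag})")
```

A ratio below 0.8 is generally treated as evidence of adverse impact; as the next slide stresses, it does not by itself show that unfair discrimination occurred.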

  27. SO18: Adverse impact and cognitive ability tests • Remember, however, that adverse impact does not mean that unfair discrimination has occurred; if the tests are job related, then fair discrimination has occurred • SO18: Three things that make a defense against adverse impact likely: • Their overall validity – they are among the most valid and least expensive tests • Differential validity does not exist • Adverse impact cannot be overcome by using any other measure

  28. SO18, NFE: Inappropriate conclusions from mean differences on test scores • It is not appropriate to conclude from these studies that differences are due to genetic differences • Studies do not address the reasons (the authors want to caution anyone against drawing general conclusions as to why differences exist; particular concern about race-based genetic arguments as advanced in The Bell Curve, published a number of years ago, which re-opened the debate about race-based genetic intelligence.)

  29. SO21: Two factors that should be taken into account when deciding whether to use cognitive ability tests • Cognitive ability tests are among the most valid tests for a large number of jobs (and some selection specialists would say for all jobs) • Evidence also indicates that adverse impact is highly likely with these tests (skipping to SO21; cont. on next slide)

  30. NFE: Cognitive Ability Tests • Because they are so valid, some selection specialists believe cognitive ability tests should be used extensively in selection • Some, however, have expressed deep reservation about using them because of the social implications of the disqualification of larger proportions of minorities (very nice discussion of this in text; directly quoting GFB here; cont. on next slide)

  31. NFE: Cognitive Ability Tests • The decision should reflect the values/goals of the organization • If the goal is to maximize individual performance with minimal cost, cognitive ability tests will do this • If the organization has multiple goals of sustaining high performance while maintaining a broad representation of minorities, then it would be better to limit the use of cognitive ability tests and use other, generally more expensive and almost equally valid instruments • biodata inventories (I don’t like these as you will see next unit) • structured interviews • assessment centers *The authors include work samples in their list but later in this chapter present recent data that indicate work samples appear to have as much adverse impact as cognitive ability tests. (that’s the rub - the expense of those other instruments)

  32. SO19: Diversity and use of cognitive ability tests • If an organization has diversity as a selection goal and wants to use cognitive ability tests because of their validity and the fact that other options are much more expensive, what is the main/best option? Vigorous recruitment of minority applicants (now back to SO19: remember race norming is not legal; often a problem because selection specialists are typically not the ones who are responsible for recruitment – selection specialists really need to work with the HR staff)

  33. SO22: (NFE) Popular mechanical ability, clerical, and physical ability tests • The authors describe several very popular tests • Refer to this material if you are ever looking for tests in these categories • I am not going to have you learn anything specific about these tests

  34. Intro Personality Tests • The data and information on personality tests are difficult • For many years, companies used personality tests that were developed by clinical psychologists, and some of those tests are still popular and being used by organizations • One is the California Psychological Inventory • Have not had good validity historically • In prior editions of the book, GFB advised against their use • They remain cautious in this one, but “cautiously optimistic”

  35. Intro Personality Tests • There is some good work going on right now; however, the field is in a bit of flux • Intuitively we know that “personality” influences how effective a person is at work, we just haven’t tapped into what the relevant KSAs really are, or what the relevant clusters of behaviors are • Even with the recent work, validity coefficients tend to be low, but they do appear to add independent predictive power (above and beyond cognitive ability tests and other types of ability tests)
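
The idea of adding independent predictive power (incremental validity) can be sketched with the standard formula for the squared multiple correlation of two predictors. The three correlations below are hypothetical values chosen for illustration, not figures from the text: a cognitive ability test correlating .50 with performance, a personality scale correlating .20, and the two predictors uncorrelated with each other.

```python
def multiple_r_squared(r_y1: float, r_y2: float, r_12: float) -> float:
    """Squared multiple correlation of a criterion with two predictors.

    r_y1, r_y2: validity of each predictor against the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


# Hypothetical correlations, for illustration only.
cognitive_validity = 0.50
personality_validity = 0.20
predictor_overlap = 0.0

alone = cognitive_validity**2
combined = multiple_r_squared(cognitive_validity, personality_validity, predictor_overlap)
print(f"variance explained by the cognitive test alone: {alone:.2f}")
print(f"variance explained by both predictors:          {combined:.2f}")
print(f"incremental contribution of personality:        {combined - alone:.2f}")
```

Under these assumptions, even a low validity coefficient adds to prediction because the personality scale overlaps little with the cognitive test; the more the two predictors correlate, the smaller that gain becomes.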

  36. SO24A: Personality Tests • There is some agreement in the field that personality characteristics can be grouped into five broad dimensions called the Five-Factor Model or Big Five • Conscientiousness • Being responsible, organized, dependable, planful, willing to achieve, and persevering • Emotional stability (only one described in negative terms) • Being emotional, tense, insecure, nervous, excitable, apprehensive, and easily upset • Agreeableness (relevant for team work) • Being courteous, flexible, trusting, good natured, cooperative, forgiving, softhearted, and tolerant • Extraversion • Being sociable, gregarious, assertive, talkative, and active • Openness to experience (also called intellect or culture) • Being imaginative, cultured, curious, intelligent, artistically sensitive, original, and broad minded

  37. SO24B: Personality Tests • Good news: to date there has been little or no adverse impact (a) across racial and ethnic groups and (b) between males and females

  38. SO25: Traits as predictors • Two traits have been shown to be universal predictors, that is, valid across jobs • Conscientiousness • Emotional stability • The other three were found to be valid for only a few jobs or specific criteria • Extraversion (managers and training criteria) • Agreeableness (team work) • Openness to experience (training criteria)

  39. SO26: Personality Tests • If you do use a personality test, you must use a criterion-related validity study to support it because personality traits cannot be directly observed • Concurrent validity • Predictive validity • Validity generalization (in other words, you cannot use content validity; there are also some legal issues to be aware of)

  40. SO27A: The first of two thorny legal issues with personality tests • If a test can be and is used to diagnose mental and psychiatric disorders, then it will probably be considered a medical examination under ADA and can only be administered post-offer (dealt with this in U3 as well) • If it deals with other personality traits (the Big 5, for example) then it probably will not be considered a medical examination, although I don’t know how courts would/will handle “emotional stability” as it relates to ADA • Nonetheless, my strong advice to you is to treat every personality test as a medical examination until things are clarified more by the courts • Which means you should only administer personality tests post-offer and keep the results in a file that is separate from the personnel file (include material in italics – this is important)

  41. SO27A: This slide, NFE: First of two thorny legal issues with personality tests • Clarifying court case, 2005, Seventh Circuit Court • MMPI is a medical examination and thus illegal for pre-offer use in selection (certainly that was expected) • Psychological tests that measure personal traits such as honesty, integrity, preferences and habits do not constitute medical examinations

  42. SO27B: Second of two thorny legal issues with personality tests • Right to privacy: Although a right to privacy is not explicitly guaranteed under the U.S. Constitution, individuals are protected from unreasonable intrusions and surveillance • Personality tests, by their nature, reveal an individual’s thoughts and feelings • Several states have laws that explicitly guarantee a right to privacy • To date, litigation has occurred about questions relating to religious views and sexual inclinations, orientation, and identification

  43. SO27B: Right to privacy (this slide, NFE) • Soroka v. Dayton Hudson (1991) • California Court of Appeal stopped Dayton Hudson’s Target stores from requiring applicants for store security positions to take a personality test that contained questions about sexual practices and religious beliefs • The court also stated that employers must restrict psychological testing to job-related questions • The ruling was later dismissed because the parties reached a court-approved settlement • Dayton Hudson agreed to stop using the personality test • Divided $1.3 million among the estimated 2,500 members of the plaintiff class who had taken the test (last slide on this)

  44. Intro, Performance or work sample tests • Performance or work sample tests are excellent and I highly recommend their use when you can do them • Typing test • Having candidates write a computer program to solve a specific problem • Role playing a sales situation with an applicant for a sales position • Having mechanics troubleshoot a problem with an engine • You are getting an actual sample of behavior under controlled testing conditions (which permits you to easily compare performance across applicants) (this slide NFE)

  45. Performance or work sample tests • From a technical perspective, they have high validity • They reduce two limitations of other selection procedures, and both are related to verbal behavior • Most selection procedures rely heavily on verbal behavior • Written answers to questions (ability tests) • Oral descriptions of abilities/skills (interviews, training and evaluation assessments) (This slide NFE)

  46. SO28 (NFE): The two limitations that are reduced • Willful distortion and faking (people want to look good) • This varies depending on the selection procedure • Reports about past experiences (interviews, T&Es) where the information is difficult to confirm - most susceptible • Personality and honesty inventories, next susceptible • Ability tests, least susceptible

  47. SO28 (NFE): The two limitations that are reduced • Relationship between verbal behavior and actual behavior is not perfect (as we behavior analysts well know) • Much of our behavior is contingency-shaped, not rule-governed • This is particularly a problem for exemplary performers who are not verbally fluent • Automobile mechanics • Plumbers • Machine operators • It can also be a problem for employees who are exemplary performers but can’t describe what makes them exemplary performers – sales representatives

  48. SO29: Three limitations of work samples • Difficulty of accurately simulating job tasks that are representative of the job • Applicants must already have the KSAs being tested – they cannot cover specialized things that must be learned on the job • General sales skills OK, but questions that deal with specific company-related products and pricing will not be • Very costly to develop and often to administer (many must be done one-on-one)

  49. SO29 NFE: Example of a bad, yet common, work sampling test: Stress interviews • Many consulting firms use stress interviews • Stress interviews: Interviewer creates a stressful situation, often by asking many questions rapidly, not allowing much time for the applicant to respond, interrupting the applicant frequently, acting in a semi-hostile manner, or in a cool aloof manner • Why bad? • Even if the job is one of high work demands that produce stress, rarely is the situation staged in the interview representative of the actual work demands that produce the stress • In very few jobs is the stress related to a semi-hostile or cool/aloof stranger rapidly firing questions • The behavior of the applicant doesn’t readily generalize to the job and thus should not be used as a predictor (maybe OK for a press secretary for a politician)

  50. SO30: Performance tests vs. cognitive ability tests, validity, adverse impact, and cost • Validity • They both have high validity: they are two of the most valid types of selection instruments • Adverse impact • Equal adverse impact • Cost • Performance tests cost much more to develop and administer
