Using a Logic-Based Measurement Approach to Measure Cognitive Ability

2. 2 Logic-Based Measurement. Mid-1980s: Colberg and colleagues codified a taxonomy of logic rules that could be used to generate reasoning test items. 1990: A logic-based reasoning (LBR) test was used in the selection process for 90+ professional and administrative occupations in the U.S. federal government. Today: LBR tests are used by several U.S. federal agencies (e.g., CBP, FBI) and at least one Fortune 500 company. There are several conference papers, but not much in peer-reviewed journals. It is a promising approach that could be used more widely.

3. 3 LBR Item. Passages are modeled after job materials. Materials often incorporate logical reasoning (e.g., if the suspect flees, then the officer shall ...). For each passage, the test-taker determines which of several possible conclusions logically follow.

4. 4 LBR Alternate Testing Format. Fewer passages and more items make more efficient use of testing time, because less time is spent reading passages and more time is spent answering questions. Items are more grammatically complex.
Conclusion 1: According to the results of the study, offenders who are neither fined nor imprisoned are certain to become repeat offenders. Correct answer: INDETERMINABLE. Explanation: The facts do not say anything about the behavior of offenders who are neither fined nor imprisoned. They might become repeat offenders or they might not.
Conclusion 2: All study participants who repeated their crimes during the five years that followed the study had been convicted of hacking into corporate financial networks. Correct answer: TRUE. Explanation: The conclusion focuses only on participants in the study, all of whom had been convicted for hacking into corporate financial networks. Therefore, all of the repeat offenders mentioned in this fact set had been convicted of hacking into corporate financial networks.
Conclusion 3: In the context of computer-related crimes, research has demonstrated that whether one becomes a repeat offender is determined entirely by the severity of punishment. Correct answer: FALSE. Explanation: The facts state that severity of punishment is the strongest predictor of whether or not a person repeats a computer-related crime, but do not state that severity of punishment is the only predictor. Therefore, whether or not a person becomes a repeat offender is not necessarily determined entirely by severity of punishment. This statement must be false.

5. 5 Simpler Format & Reasoning (Siena Reasoning Test). Low reading load; items are similar to ones that have been used in published tests of reasoning skills. If an item is not explicitly derived from formal rules of logic, it is technically not a logic-based measurement (LBM) item. The underlying reasoning skills are arguably the same.
Conclusion 1: Correct answer: TRUE. Explanation: The first statement says that a GATH is heavier than a SHET.
Conclusion 2: Correct answer: FALSE. Explanation: According to the second statement, a SHET is heavier than a COUCH. According to the first statement, a GATH is heavier than a SHET. Therefore, a GATH must be heavier than a COUCH.
Conclusion 3: Correct answer: FALSE. Explanation: The answer is obtained by using your real-world knowledge. From your real-world knowledge, you know that a COUCH is typically heavier than a LAMP.
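
The chain of reasoning in the GATH/SHET/COUCH explanations can be sketched as a small transitive-closure check. This is only an illustration, not how the test is scored: the function names are hypothetical, the queried conclusions are made up because the transcript does not include the conclusion statements, and the sketch assumes "heavier than" is a strict ordering (if A is heavier than B, B is not heavier than A).

```python
def transitive_closure(pairs):
    """Return every (heavier, lighter) pair implied by chaining the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # a > b and b > d together imply a > d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def evaluate(conclusion, statements):
    """Classify a hypothetical conclusion 'X is heavier than Y' against the statements."""
    heavier, lighter = conclusion
    closure = transitive_closure(statements)
    if (heavier, lighter) in closure:
        return "TRUE"
    if (lighter, heavier) in closure:
        # Assumes a strict ordering, so the reverse claim cannot also hold.
        return "FALSE"
    return "NOT DERIVABLE FROM STATEMENTS"


# The two statements quoted in the slide's explanations.
statements = [("GATH", "SHET"), ("SHET", "COUCH")]

print(evaluate(("GATH", "SHET"), statements))   # TRUE
print(evaluate(("COUCH", "GATH"), statements))  # FALSE
print(evaluate(("COUCH", "LAMP"), statements))  # NOT DERIVABLE FROM STATEMENTS
```

The third query shows why the slide's Conclusion 3 needs real-world knowledge: nothing in the two statements mentions a LAMP, so the statements alone cannot settle it.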

6. 6 What's So Great about LBR Tests? Items replicate logical thought processes required to perform job duties. Items involve application of reasoning skills. No subject knowledge is required. No training in formal rules of logic is required. Keyed answers can't be disputed.

7. 7 What's So Great about LBR Tests? (cont.) Good internal consistency reliability (~.80). Strong criterion-related validity, higher than for many other measures of cognitive ability: ~.60* for training performance, job knowledge, and work samples; ~.30* for supervisor ratings of job performance. Smaller subgroup effect sizes than for some traditional cognitive ability measures (Colberg, personal communication).
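
For readers unfamiliar with internal consistency, here is a minimal sketch of one common coefficient for right/wrong items, KR-20. The slides do not say which coefficient produced the ~.80 figure, so the choice of KR-20, the function name, and the toy response data are all assumptions made for illustration.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for dichotomous items.

    responses: list of per-person lists of 0/1 item scores (hypothetical data).
    Uses the population variance of total scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(person) for person in responses]
    total_variance = pvariance(totals)

    # Sum of p * q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(person[i] for person in responses) / n_people
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Toy data: 4 test-takers, 3 items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))  # 0.75
```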

8. 8 Cautionary Notes. LBR tests often have a high reading load, so it is important to show that the job requires a lot of reading. Native speakers earned higher scores than non-native speakers in one large-scale study (still valid for all). Persons from some cultures scored higher than persons from other cultures (still valid for all): the Confucian Asian and Nordic/Germanic GLOBE clusters scored highest, Anglo and Eastern Europe in the middle, and Sub-Saharan Africa lowest. GLOBE Study: House et al., a study of leadership in 62 countries, organized into 9 cultural clusters.

9. 9 Criterion-Related Validation Studies. U.S. federal agency jobs: Special Agent (Study 1) and Intelligence Analyst (Study 2). Design/sample: concurrent (incumbents), N > 400 per study. Criterion measures: Job Knowledge Test (JKT) and Overall Performance Ratings Composite (supervisor). Each job requires application of reasoning skills as a critical and core part of the job.

10. 10 Criterion-Related Validation Studies (cont.). Predictor measures developed by HumRRO specifically for this client: LBR Test, Situational Judgment Test (SJT), and Biodata/P-E Fit Scales. Vendor-provided instrument: Siena Reasoning Test (SRT).

11. 11 Predictor Psychometric Information. The SJT involves some level of cognitive ability, so I'm including it here for reference purposes, but don't intend to spend time talking about it. Each study included a biodata instrument too, but we aren't showing those results here.

12. 12 Focus on corrected results because those are what was reported earlier for the meta-analysis, and also because they reflect operational validity.
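
The slides do not say which corrections were applied to the observed validities. As a rough sketch only, two corrections often used when estimating operational validity are the correction for criterion unreliability and the correction for direct range restriction on the predictor (Thorndike Case II). The function names and the example numbers below are assumptions for illustration, not values from these studies.

```python
import math


def correct_for_criterion_unreliability(r_xy, r_yy):
    """Disattenuate an observed validity for unreliability in the criterion only.

    r_xy: observed predictor-criterion correlation.
    r_yy: reliability of the criterion measure (hypothetical input).
    """
    return r_xy / math.sqrt(r_yy)


def correct_for_range_restriction(r_xy, u):
    """Thorndike Case II correction for direct range restriction.

    u: ratio of unrestricted to restricted predictor standard deviation
       (u > 1 when the sample was screened on the predictor).
    """
    return (r_xy * u) / math.sqrt(1 + r_xy ** 2 * (u ** 2 - 1))


# Illustrative numbers only.
observed = 0.30
print(round(correct_for_range_restriction(observed, 1.5), 3))          # 0.427
print(round(correct_for_criterion_unreliability(observed, 0.64), 3))   # 0.375
```

With u = 1 (no restriction) or perfect criterion reliability, each function returns the observed correlation unchanged, which is a quick sanity check on the formulas.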

13. 13 Subgroup Effect Sizes. One take-away: no measure can guarantee that it will minimize mean differences for every subgroup. The SRT may be more of a measure of fluid intelligence (it has figural reasoning content), which may partially explain the large effect size for age subgroups.
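
Subgroup effect sizes of this kind are typically reported as standardized mean differences. The sketch below assumes Cohen's d with a pooled standard deviation; the slides do not state the exact formula used, and the function name and scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


# Toy test scores for two subgroups.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```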

14. 14 How Do Results Compare with Prior LBR Research? Validity: Study 1 lower; Study 2 higher. Effect sizes: Study 1 consistent; Study 2 larger for some. The Study 1 sample was screened on cognitive ability, situational judgment, biodata, interview, and written assessment. The Study 2 sample was screened mostly by interview plus evaluations of prior experience. Study 1: mean % correct for LBR = 56% (SD = 18 percentage points). Study 2: mean % correct for LBR = 60% (SD = 17 percentage points).

15. 15 Difference in LBR Performance Across Studies. Job differences? The Analyst job revolves around reading vast amounts of material and using it to reach logical conclusions (validity). Criterion differences? Supervisors may have placed relatively more weight on applied reasoning skills in evaluations of Analysts than of Agents (validity for predicting performance). Sample differences? The Study 1 sample was more directly screened on cognitive ability, so there was more range restriction (validity). The Study 2 sample had greater racial and gender diversity. Both samples had the same percentage of non-native English-speaking incumbents (~7%) and the same tenure range among incumbents.

16. 16 Difference in LBR Performance Across Studies (cont.). Test difficulty differences? No appreciable difference in mean performance (confounded with sample characteristics). Test format differences? The Study 2 LBR format is more grammatically complex, which may increase race subgroup differences and may also increase validity.

17. 17 LBR Test vs. Siena Reasoning Test. Validity: LBR higher in direct comparison. Effect size: LBR larger for some; smaller for others. Difficulty: LBR more difficult, at least as configured in these studies. Incremental validity when adding SRT to LBR Test: ΔR² = .025 for JKT; ΔR² = .012 for overall performance. LBR 1 (Agent): C-AA = 0.79; C-H = 0.66; C-A = 0.28; M-F = 0.22. LBR 1 (Agent): 55% mean correct; SD = 18 percentage points. Stage 1 Composite Score = standardized, unit-weighted sum of LBR, SRT, and biodata/P-E fit scales.
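
Two of the quantities on this slide can be sketched in a few lines. The first function computes the gain in R² from adding a second predictor to a first one, using the standard two-predictor formula based on correlations; the second builds a standardized, unit-weighted composite. The function names and all numbers are hypothetical, and the slides do not describe how the reported ΔR² values were actually estimated.

```python
from statistics import mean, stdev


def delta_r_squared(r_y1, r_y2, r_12):
    """Increase in R^2 when predictor 2 is added to predictor 1.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_both - r_y1 ** 2


def z_scores(values):
    """Standardize raw scores to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def unit_weighted_composite(*score_columns):
    """Sum of z-scores across measures, one composite value per person."""
    standardized = [z_scores(column) for column in score_columns]
    return [sum(person) for person in zip(*standardized)]


# Illustrative values only (uncorrelated predictors make the arithmetic easy).
print(round(delta_r_squared(0.5, 0.3, 0.0), 4))           # 0.09
print(unit_weighted_composite([1, 2, 3], [10, 20, 30]))   # [-2.0, 0.0, 2.0]
```

Unit weighting means each standardized measure contributes equally to the composite, regardless of its original score scale.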

18. 18 Summary Points. LBR tests are a reliable, valid way to measure applied reasoning skills; are as valid as or more valid than other cognitive ability measures; show the same or smaller subgroup differences compared with other cognitive ability measures; and are best for jobs with a high verbal/reading load that require a high level of reading and reasoning skills. LBR and SRT are viable choices for measuring cognitive ability for employee selection.

19. 19 References
Colberg, M. (1984). Towards a taxonomy of verbal tests based on logic. Educational and Psychological Measurement, 44, 113-120.
Colberg, M. (1985). Logic-based measurement of verbal reasoning: A key to increased validity and economy. Personnel Psychology, 38, 347-359.
Goldstein, H.W., Scherbaum, C.A., & Yusko, K.P. (2009). Revisiting g: Intelligence, adverse impact, and personnel selection. In J. Outtz (Ed.), Adverse Impact: Implications for Organizational Staffing and High Stakes Selection. Routledge Academic: New York, NY.
Hayes, T.L. (Chair) (2002). The validity of logic-based measurement for selection and promotion decisions. Symposium conducted at the 17th annual conference of the Society for Industrial and Organizational Psychology, Toronto, Canada. Includes these papers:
Harris, P.A., Callen, N.F., & Busciglio, H. Transportability of the logic-based measurement approach for law enforcement selection with the U.S. Customs Service.
Hayes, T.L., & Reilly, S.M. The criterion-related validity of logic-based measurement tests: The SIOP Conference paper.
Leaman, J.A., & Gast, I.F. Content validation of a logic-based assessment of thinking skills.
Simpson, R.W., & Nester, M.A. The construct and content validity of logic-based tests of reasoning for personnel selection.
Simpson, R., Nester, M.A., & Palmer, E. (2007). The validity of logic-based tests. Presented at the annual conference of the International Public Management Association-Assessment Council, St. Louis, MO.
Tsacoumis, S., Putka, D.J., & Colberg, M. (2007). A cross-cultural look at items of logic-based reasoning. In A.S. Boyce & R.E. Gibby (Chairs), Global cognitive ability testing: Psychometric issues and applicant reactions. Symposium conducted at the 22nd annual conference of the Society for Industrial and Organizational Psychology, New York, NY.
