2. 2 Logic-Based Measurement Mid-1980s: Colberg & colleagues codified a taxonomy of logic rules that could be used to generate reasoning test items
1990: LBR test used in selection process for 90+ professional and administrative occupations in U.S. federal govt
Today:
Several U.S. federal agencies (e.g., CBP, FBI)
At least one Fortune 500 company
Several conference papers but not much in peer-reviewed journals
Promising approach -- could be used more widely
3. 3 LBR Item Passages modeled after job materials
Materials often incorporate logical reasoning (e.g., if the suspect flees, then the officer shall ...)
For each passage, the test-taker determines which of several possible conclusions logically follow
4. 4 LBR Alternate Testing Format Fewer passages, more items, more efficient use of testing time because less time spent reading passages and more time spent answering questions.
Items are more grammatically complex
Conclusion 1: According to the results of the study, offenders who are neither fined nor imprisoned are certain to become repeat offenders.
Correct answer: INDETERMINABLE
Explanation: The facts do not say anything about the behavior of offenders who are neither fined nor imprisoned. They might become repeat offenders or they might not.
Conclusion 2: All study participants who repeated their crimes during the five years that followed the study had been convicted of hacking into corporate financial networks.
Correct answer: TRUE
Explanation: The conclusion focuses only on participants in the study, all of whom had been convicted for hacking into corporate financial networks. Therefore, all of the repeat offenders mentioned in this fact set had been convicted of hacking into corporate financial networks.
Conclusion 3: In the context of computer-related crimes, research has demonstrated that whether one becomes a repeat offender is determined entirely by the severity of punishment.
Correct answer: FALSE
Explanation: The facts state that severity of punishment is the strongest predictor of whether or not a person repeats a computer-related crime, but do not state that severity of punishment is the only predictor. Therefore, whether or not a person becomes a repeat offender is not necessarily determined entirely by severity of punishment. This statement must be false.
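The TRUE / FALSE / INDETERMINABLE key used above can be illustrated with a small sketch. It is not the scoring procedure of any actual LBR test; it assumes a made-up fact set ("if the alarm sounds, then the guard locks the door") expressed as simple true/false propositions, and the function name `classify_conclusion` is hypothetical. A conclusion is TRUE if it holds in every situation consistent with the facts, FALSE if it holds in none, and INDETERMINABLE otherwise.

```python
from itertools import product


def classify_conclusion(variables, premises, conclusion):
    """Classify a conclusion as TRUE, FALSE, or INDETERMINABLE.

    Every assignment of truth values that satisfies all premises is
    enumerated. The conclusion is TRUE if it holds in all of them,
    FALSE if it holds in none of them, and INDETERMINABLE otherwise.
    """
    outcomes = set()
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(premise(world) for premise in premises):
            outcomes.add(conclusion(world))
    if not outcomes:
        raise ValueError("premises contradict each other")
    if outcomes == {True}:
        return "TRUE"
    if outcomes == {False}:
        return "FALSE"
    return "INDETERMINABLE"


# Hypothetical fact set (not taken from a real test item):
# "If the alarm sounds, then the guard locks the door."
variables = ["alarm", "locked"]
rule = lambda w: (not w["alarm"]) or w["locked"]
alarm_sounded = lambda w: w["alarm"]

# Rule plus the fact that the alarm sounded: the door must be locked.
follows = classify_conclusion(variables, [rule, alarm_sounded], lambda w: w["locked"])
# Same facts, opposite conclusion: it cannot hold.
contradicted = classify_conclusion(variables, [rule, alarm_sounded], lambda w: not w["locked"])
# Rule alone: nothing is known about whether the alarm sounded.
unknown = classify_conclusion(variables, [rule], lambda w: w["locked"])

print(follows, contradicted, unknown)
```

With the rule alone, the door might or might not be locked, which is the same situation as Conclusion 1: the facts are silent, so the answer is INDETERMINABLE.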
5. 5 Simpler Format & Reasoning (Siena Reasoning Test) Low reading load; items are similar to ones that have been used in published tests of reasoning skills. If an item is not explicitly derived from formal rules of logic, then it is technically not an LBM item, but the underlying reasoning skills are arguably the same.
Conclusion 1. Correct answer: TRUE
Explanation: The first statement says that a GATH is heavier than a SHET.
Conclusion 2. Correct answer: FALSE
Explanation: According to the second statement, a SHET is heavier than a COUCH. According to the first statement, a GATH is heavier than a SHET. Therefore, a GATH must be heavier than a COUCH.
Conclusion 3. Correct answer: FALSE
Explanation: The answer is obtained by using your real-world knowledge. From your real-world knowledge, you know that a COUCH is typically heavier than a LAMP.
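The chain of "heavier than" statements in this item can be sketched as a transitive-closure check. This is only an illustration: the wording of the three conclusions is not shown on the slide, so the claims tested below are assumed, and the function names and the INDETERMINABLE label are hypothetical rather than part of the Siena Reasoning Test.

```python
def implied_pairs(statements):
    """Return every (heavier, lighter) pair implied by transitivity."""
    closure = set(statements)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a > b and b > d together imply a > d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def classify_claim(statements, heavier, lighter):
    """Classify the claim '<heavier> is heavier than <lighter>'."""
    closure = implied_pairs(statements)
    if (heavier, lighter) in closure:
        return "TRUE"
    if (lighter, heavier) in closure:
        return "FALSE"
    return "INDETERMINABLE"


# The two statements described in the explanations above.
stated = [("GATH", "SHET"), ("SHET", "COUCH")]

claim_1 = classify_claim(stated, "GATH", "SHET")    # assumed wording of Conclusion 1
claim_2 = classify_claim(stated, "COUCH", "GATH")   # assumed wording of Conclusion 2
claim_3 = classify_claim(stated, "LAMP", "COUCH")   # assumed wording of Conclusion 3

# Conclusion 3 relies on real-world knowledge, added here as an extra fact.
with_knowledge = stated + [("COUCH", "LAMP")]
claim_3_informed = classify_claim(with_knowledge, "LAMP", "COUCH")

print(claim_1, claim_2, claim_3, claim_3_informed)
```

From the two stated facts alone the LAMP claim cannot be decided; it becomes FALSE only once the real-world fact about couches and lamps is added, which matches the explanation for Conclusion 3.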
6. 6 What's So Great about LBR Tests? Items replicate logical thought processes required to perform job duties
Items involve application of reasoning skills
No subject knowledge required
No training in formal rules of logic required
Keyed answers can't be disputed
7. 7 What's So Great about LBR Tests? (cont.) Good internal consistency reliability (~.80)
Strong criterion-related validity, higher than for many other measures of cognitive ability
~.60* for training perf, job knowledge, work sample
~.30* for supervisor ratings of job perf
Smaller subgroup effect sizes than for some traditional cognitive ability measures (personal communication, Colberg)
8. 8 Cautionary Notes LBR tests often have a high reading load, so it is important to show that the job requires a lot of reading
Native speakers earned higher scores than non-native in one large-scale study (still valid for all)
Persons from some cultures scored higher than persons from other cultures (still valid for all)
Confucian Asian and Nordic/Germanic GLOBE clusters highest, Anglo and Eastern Europe middle; Sub-Saharan Africa lowest
GLOBE Study: House et al. study of leadership in 62 countries, organized into 9 cultural clusters
9. 9 Criterion-Related Validation Studies U.S. Federal Agency
Jobs:
Special Agent (Study 1)
Intelligence Analyst (Study 2)
Design/Sample
Concurrent (incumbents)
N > 400 per study
Criterion measures
Job Knowledge Test (JKT)
Overall Performance Ratings Composite (supervisor)
Each job requires application of reasoning skills as a critical and core part of the job
10. 10 Criterion-Related Validation Studies (cont.) Predictor measures
Developed by HumRRO specifically for this client
LBR Test
Situational Judgment Test (SJT)
Biodata/P-E Fit Scales
Vendor-provided instrument
Siena Reasoning Test
11. 11 Predictor Psychometric Information SJT involves some level of cognitive ability, so I'm including it here for reference purposes, but don't intend to spend time talking about it.
Each study included a biodata instrument too, but we aren't showing those results here
12. 12 Focus on corrected results because those are what's reported earlier for the meta-analysis, and also because they reflect operational validity
13. 13 Subgroup Effect Sizes One take-away: No measure can guarantee that it will minimize mean differences for every subgroup.
SRT may be more of a measure of fluid intelligence (has figural reasoning content), which may partially explain the large effect size for age subgroups.
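For readers less familiar with subgroup effect sizes, here is a minimal sketch of a standardized mean difference (Cohen's d with a pooled standard deviation). The slides do not say which effect-size formula was used, so this formula is an assumption, and the scores below are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups (pooled SD)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented test scores for two subgroups (not data from either study).
scores_a = [2, 4, 6]
scores_b = [1, 3, 5]

d = cohens_d(scores_a, scores_b)
print(d)
```

A positive d means the first group scored higher on average; the value is expressed in pooled standard deviation units, which is what makes effect sizes comparable across tests with different score scales.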
14. 14 How do Results Compare with Prior LBR Research?
Validity:
Study 1 lower
Study 2 higher
Effect Sizes:
Study 1 consistent
Study 2 larger for some
Study 1 sample screened on cognitive ability, situational judgment, biodata, interview, written assessment.
Study 2 sample screened, mostly, by interview plus evals of prior experience
Study 1: Mean % correct for LBR = 56% (SD = 18 percentage points)
Study 2: Mean % correct for LBR = 60% (SD = 17 percentage points)
15. 15 Difference in LBR Performance Across Studies Job differences?
Analyst job revolves around reading vast amounts of material and using it to reach logical conclusions (validity)
Criterion differences?
Supervisors may have placed relatively more weight on applied reasoning skills in evaluations of Analysts than of Agents (validity for predicting perf)
Sample differences?
Study 1 sample more directly screened on cognitive ability, more range restriction (validity)
Study 2 sample had greater racial and gender diversity
Same % non-native English speaking incumbents (~7%)
Same tenure range among incumbents
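The range-restriction point above can be made concrete with a short sketch. The slides do not state which correction was applied in these studies; the code assumes the widely used correction for direct range restriction (Thorndike's Case II), and the correlation and SD values are hypothetical.

```python
from math import sqrt


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the unrestricted correlation (Thorndike Case II).

    r_restricted    : correlation observed in the screened sample
    sd_unrestricted : predictor SD in the applicant population
    sd_restricted   : predictor SD in the screened sample
    """
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / sqrt(1 + r_restricted**2 * (u**2 - 1))


# Hypothetical values: observed r = .30, and screening shrank the
# predictor SD from 1.5 to 1.0.
observed = 0.30
corrected = correct_for_range_restriction(observed, 1.5, 1.0)
unchanged = correct_for_range_restriction(observed, 1.0, 1.0)

print(round(corrected, 3), unchanged)
```

The more the sample was screened on the predictor (the larger the SD ratio), the more the observed validity understates the validity in the applicant pool, which is one reason a heavily screened sample can show lower uncorrected validity.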
16. 16 Difference in LBR Performance Across Studies (cont.) Test difficulty differences?
No appreciable difference in mean performance (confounded with sample characteristics)
Test format differences?
Study 2 LBR format is more grammatically complex
May increase race subgroup differences
May also increase validity
17. 17 LBR Test vs Siena Reasoning Test Validity: LBR higher in direct comparison
Effect Size: LBR larger for some; smaller for others
Difficulty: LBR more difficult, at least as configured in these studies
Incremental Validity when adding SRT to LBR Test:
ΔR² = .025 for JKT
ΔR² = .012 for overall perf
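As a reminder of what the incremental validity figures mean, here is a minimal sketch of ΔR² for the two-predictor case, computed from correlations. It uses the standard formula for R² with two predictors; the correlations below are hypothetical and are not the values from either study.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R-squared when the criterion is regressed on two predictors.

    r_y1, r_y2 : correlation of each predictor with the criterion
    r_12       : correlation between the two predictors
    """
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


def incremental_r_squared(r_y1, r_y2, r_12):
    """Gain in R-squared from adding predictor 2 to predictor 1 alone."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1**2


# Hypothetical correlations: predictor 1 (e.g., LBR) with criterion = .50,
# predictor 2 (e.g., SRT) with criterion = .40, predictors with each other = .50.
delta = incremental_r_squared(r_y1=0.5, r_y2=0.4, r_12=0.5)
print(round(delta, 3))
```

The gain shrinks as the two predictors overlap more, so a small ΔR² for a second reasoning test is expected when both tests measure much of the same ability.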
LBR 1 (Agent): C-AA = 0.79; C-H = 0.66; C-A = 0.28; M-F = 0.22
LBR 1 (Agent): 55% mean correct; SD = 18 percentage points
Stage 1 Composite Score = standardized, unit-weighted sum of LBR, SRT, and biodata/P-E fit scales
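The Stage 1 composite described in the note above can be sketched as follows. This is an illustration under assumptions: each scale is converted to z-scores using the sample standard deviation, the biodata/P-E fit scales are treated as a single score, and all numbers are invented.

```python
from statistics import mean, stdev


def standardize(scores):
    """Convert raw scores to z-scores (mean 0, SD 1 within this sample)."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]


def unit_weighted_composite(*scales):
    """Sum the z-scores across scales, giving every scale a weight of 1."""
    z_scales = [standardize(scale) for scale in scales]
    return [sum(person) for person in zip(*z_scales)]


# Invented scores for three candidates on three scales.
lbr = [10, 20, 30]
srt = [5, 7, 9]
biodata = [3, 2, 1]

composite = unit_weighted_composite(lbr, srt, biodata)
print(composite)
```

Standardizing first keeps a scale with a wide raw-score range from dominating the sum; unit weighting then treats each scale as equally important.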
18. 18 Summary Points LBR tests
Reliable, valid way to measure applied reasoning skills
As valid as, or more valid than, other cognitive ability measures
Subgroup differences the same as or smaller than those for other cognitive ability measures
Best for jobs with a high verbal/reading load and that require a high level of reading and reasoning skills
LBR and SRT are viable choices for measuring cognitive ability for employee selection
19. 19 References Colberg, M. (1984). Towards a taxonomy of verbal tests based on logic. Educational and Psychological Measurement, 44, 113-120.
Colberg, M. (1985). Logic-based measurement of verbal reasoning: A key to increased validity and economy. Personnel Psychology, 38, 347-359.
Goldstein, H.W., Scherbaum, C.A., & Yusko, K.P. (2009). Revisiting g: Intelligence, adverse impact, and personnel selection. In J. Outtz (Ed.) Adverse Impact: Implications for Organizational Staffing and High Stakes Selection. Routledge Academic: New York, NY.
Hayes, T.L. (Chair) (2002). The validity of logic-based measurement for selection and promotion decisions. Symposium conducted at the 17th annual conference of the Society for Industrial and Organizational Psychology, Toronto, Canada. Includes these papers:
Harris, P.A., Callen, N.F., Busciglio, H. Transportability of the logic-based measurement approach for law enforcement selection with the U.S. Customs Service.
Hayes, T. L. & Reilly, S. M. The criterion-related validity of logic-based measurement tests: The SIOP Conference paper.
Leaman, J.A., & Gast, I.F. Content validation of a logic-based assessment of thinking skills.
Simpson, R.W., & Nester, M.A. The construct and content validity of logic-based tests of reasoning for personnel selection.
Simpson, R., Nester, M.A., & Palmer, E. (2007). The validity of logic-based tests. Presented at the annual conference of the International Public Management Association-Assessment Council, St. Louis, MO.
Tsacoumis, S., Putka, D.J., & Colberg, M. (2007). A cross-cultural look at items of logic-based reasoning. In A.S. Boyce & R.E. Gibby (Chairs), Global cognitive ability testing: Psychometric issues and applicant reactions. Symposium conducted at the 22nd annual conference of the Society for Industrial and Organizational Psychology, New York, NY.