1. BA 525 Decision Behavior. John W. Payne. Fall, 2011.
2. Outline Judgment under uncertainty: The Lens Model a.k.a. “Policy Capturing” or “Social Judgment Theory” (Lusk & Hammond, 1991).
The basic task, model, and questions
Judgments with uncertain (probabilistic) cues
The “Lens” Model
How well can we model the decisions of an individual (expert) judge using a simple linear model of judgment?
What does the model of the judge tell us about the use of information by a judge?
How much consistency in judgment is there within and between judges?
What is the relationship between the accuracy of the judge and the accuracy of the model of the judge?
Research methods of Policy Capturing
Results from Policy Capturing.
Use of Models
Individual Differences
3. Social Judgment Theory Derived from the ideas of E. Brunswik.
Use of multiple ambiguous (uncertain) cues in perception and thought.
People are adapted to function in environments characterized by “incompleteness” of information.
Accuracy, not rationality, is the central concern.
Questions of ‘what’ are much more important in psychology than questions of ‘how’ (Brunswik, 1936/1951). Compare to Simon (1978).
Extension by Hammond (1955) of Brunswik’s ideas to judgment, e.g., clinical diagnosis.
4. Capturing Judgment People have to function in environments characterized by “incompleteness” of information. Consequently, people have to use multiple ambiguous (uncertain) cues in perception and thought.
Judgment analysis typically involves:
A judge makes a series of judgments based on multiple cues
A model of the judge is built
Sometimes a model of the environment is also built
Comparison is made between the model of the judge, the actual results, and the model of the environment.
Comparisons among different judges (clusters of judges)
The model of the judge is then sometimes used in place of the judge (bootstrapping); a minimal sketch of this workflow follows below.
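A minimal sketch of the policy-capturing step in Python. Everything here (cue profiles, the judge's weights, the noise level) is a simulated assumption for illustration, not data from the slides:

```python
# Policy capturing: fit a linear model of a judge from cue-judgment pairs.
# All data here are simulated for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_cues = 60, 4
cues = rng.normal(size=(n_cases, n_cues))        # hypothetical cue profiles

# A hypothetical judge: stable cue weights plus random inconsistency.
judge_weights = np.array([0.5, 0.3, 0.15, 0.05])
judgments = cues @ judge_weights + rng.normal(scale=0.5, size=n_cases)

# "Capture" the policy with ordinary least squares (intercept included).
X = np.column_stack([np.ones(n_cases), cues])
beta, *_ = np.linalg.lstsq(X, judgments, rcond=None)
predicted = X @ beta

# Rs: how well the linear model reproduces the judge's own judgments.
r_s = np.corrcoef(predicted, judgments)[0, 1]
print("Captured cue weights:", np.round(beta[1:], 2))
print("Rs (linear predictability of the judge):", round(r_s, 2))
```

The recovered weights describe the judge's policy; comparing them to an analogous regression of the criterion on the same cues gives the model of the environment.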
6. A Model of Judgment: The Lens Model Framework
7. Lens Model Equation
$R_a = G R_e R_s + C \sqrt{1 - R_e^2}\,\sqrt{1 - R_s^2}$
Where Ra is the achievement coefficient (the correlation between the judge's judgments and the criterion values).
G is the correlation between the linear components of the criterion and judgment variables, that is, the correlation between the linear regression model of the environment and the linear regression model of the judge.
C is the correlation between the nonlinear components of the criterion and judgment.
Assuming C = 0, G indicates the extent to which the systematic (linear) component of the judge's responses is related to the systematic component of the environment (knowledge).
Note that even if G = 1, achievement is limited by the consistency with which knowledge is executed, Rs, and by the predictability of the environment, Re. That is, how reliable is the judge, and how predictable is the environment?
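A hedged numerical illustration of these quantities (the cues, weights, and noise levels below are assumptions chosen for demonstration). Because the predictions come from least-squares fits, the decomposition holds as an identity:

```python
# Estimate the lens model components Ra, Re, Rs, G, and C from simulated data
# and check the decomposition Ra = G*Re*Rs + C*sqrt(1-Re^2)*sqrt(1-Rs^2).
import numpy as np

def ols_predictions(X, y):
    """OLS predictions of y from the cue matrix X (intercept included)."""
    Xi = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    return Xi @ beta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                                    # cues
criterion = X @ np.array([0.6, 0.3, 0.1]) + rng.normal(scale=0.6, size=200)
judgment = X @ np.array([0.5, 0.4, 0.1]) + rng.normal(scale=0.8, size=200)

yhat_e = ols_predictions(X, criterion)   # linear model of the environment
yhat_s = ols_predictions(X, judgment)    # linear model of the judge

r = lambda a, b: np.corrcoef(a, b)[0, 1]
Ra = r(judgment, criterion)              # achievement
Re = r(yhat_e, criterion)                # predictability of the environment
Rs = r(yhat_s, judgment)                 # consistency of the judge
G = r(yhat_e, yhat_s)                    # knowledge (linear components)
C = r(criterion - yhat_e, judgment - yhat_s)   # nonlinear/residual components

rhs = G * Re * Rs + C * np.sqrt(1 - Re**2) * np.sqrt(1 - Rs**2)
print(round(Ra, 4), round(rhs, 4))       # the two sides agree for OLS fits
```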
8. Examples of Policy Capturing Stimuli & Tasks
Priority setting for surgery
Microburst forecasting
Stock valuation
Drug Benefit Plans
Retail store location
Graduate student selection
11. Some Results from SJT Research - 1 (see Cooksey, 1996, Judgment Analysis: Theory, Methods, and Applications) The ability of a simple linear model to capture judgments is surprisingly good.
Judges lack self-insight into cue usage. For example, judges believe that they base their judgments on far more cues than appear necessary to capture their judgments; experts are worse.
Inconsistency within judges – the same judge will judge the same case differently.
Large individual differences – this is true even for experts.
12. Consistency of Intuitive Judgments: Medical Diagnoses
13. Judgment Analysis of Surgeons' Prioritizations of Patients for Elective Surgery (MacCormick & Parry, 2006, MDM). Access to elective general surgery in New Zealand is governed by clinicians' judgment of priority using a visual analog scale (VAS). Our objective was to describe this judgment in terms of previously elicited cues.
Methods. We asked 60 general surgeons in New Zealand to assess patient vignettes using 8 VAS scales to determine priority. Judgment analysis was used to determine agreement between surgeons. Cluster analysis was performed to identify groups of surgeons who used different cues.
Results. The 8-scale VAS showed good predictability in assigning a priority score (R2 = 0.66). However, agreement between surgeons was poor (ra = 0.48). The poor agreement was mostly due to poor consensus (G) between surgeons in how they weighted criteria. Using cluster analysis, we classified the surgeons into 2 groups: 1 took more account of quality of life and diagnosis, whereas the other placed more weight on the influence of treatment.
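A minimal sketch of the cluster-analysis step. The weight matrix below is invented for illustration; the surgeons' actual captured cue weights are not reproduced here:

```python
# Group judges by the similarity of their captured (standardized) cue weights.
import numpy as np
from sklearn.cluster import KMeans

# Rows = judges; columns = hypothetical weights on
# [quality of life, diagnosis, influence of treatment].
weights = np.array([
    [0.55, 0.35, 0.10],
    [0.50, 0.40, 0.10],
    [0.15, 0.15, 0.70],
    [0.20, 0.10, 0.70],
])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(weights)
print("Cluster assignment per judge:", labels)   # e.g., [0 0 1 1]
```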
18. Additional Results from SJT Research - 2 In multiple-cue learning tasks, learning from outcome feedback tends to be slow and less than “inspiring.” The evidence is that people often bring to learning tasks a largely predetermined set of hypotheses, which they proceed to test in a more or less fixed order.
See Karelaia & Hogarth (2008, Psychological Bulletin) for a recent analysis of lens model studies.
19. It often pays to replace the judge with his or her model
20. Typical Results from Bootstrapping Studies
21. Issues Why does “bootstrapping” work?
Relate to the power of checklists, to be discussed later.
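A hedged simulation of the standard answer: the regression model keeps the judge's policy but strips out trial-to-trial inconsistency, so it can outpredict the judge. The weights and noise levels are assumptions:

```python
# Judgmental bootstrapping: the judge's own linear model often beats the judge.
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = rng.normal(size=(n, 3))                                     # cues
criterion = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=0.7, size=n)

# The judge has roughly the right weights but is inconsistent (extra noise).
judge = X @ np.array([0.4, 0.35, 0.25]) + rng.normal(scale=1.0, size=n)

# Fit a model OF THE JUDGE and use its noise-free predictions instead.
Xi = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xi, judge, rcond=None)
model_of_judge = Xi @ beta

r = lambda a, b: np.corrcoef(a, b)[0, 1]
print("Judge vs. criterion:", round(r(judge, criterion), 2))
print("Model of judge vs. criterion:", round(r(model_of_judge, criterion), 2))
```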
22. Why do people resist decision models? Numerous studies have shown that judgments are generally better when made with a formula (model) than by combining information in the head.
Numerous studies have also shown that commonly used methods like unstructured interviews are poor procedures for selection decisions.
Why do we resist better methods, e.g., models, and continue to use poor methods, e.g., interviews?
A belief that the evidence does not apply to case at hand?
An inflated belief in quality of judgment?
Worry about the error possibilities in the execution of a model?
Difficulty in accepting some level of explicit error?
Impact of outcome feedback? Impact of incentives?
Other reasons?
23. Example Use of Model Results Study: judge which of three players wins some award, e.g., the Most Valuable Player Award in baseball. Four pieces of information are provided, e.g., batting average, number of home runs hit, runs batted in, and the position the player's team finished in the standings.
Participants were given a simple tool: select the player whose team finished higher in the standings, and you will be correct 70% of the time.
Results: a perfect score was 19 correct choices.
Moderate knowledge people got about 60% correct.
High knowledge people got about 50% correct, although they were seriously overconfident judges.
Moderate knowledge people used the rule (model) more often.
In a second study:
People with incentives did worse than those without performance incentives.
People who got outcome feedback after each judgment did worse than those without such feedback.
Why?
Apparently, highly motivated judges were less tolerant of accepting any error (i.e., that the rule will be wrong 30% of the time).
A win-stay/lose-shift strategy is not a good one in a probabilistic environment.
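A small simulation of that point, under the assumptions of the example above: the rule is valid 70% of the time, and a guess among the three players is right one third of the time.

```python
# Consistently applying a 70%-valid rule beats win-stay/lose-shift,
# i.e., abandoning the rule whenever it produces an error.
import numpy as np

rng = np.random.default_rng(3)
trials = 10_000
rule_correct = rng.random(trials) < 0.70     # the rule is right on 70% of cases

# Strategy 1: always follow the rule.
always = rule_correct.mean()

# Strategy 2: win-stay/lose-shift between the rule and guessing
# among the three players (a guess is right 1/3 of the time).
use_rule = True
correct = np.empty(trials, dtype=bool)
for t in range(trials):
    correct[t] = rule_correct[t] if use_rule else rng.random() < 1 / 3
    if not correct[t]:
        use_rule = not use_rule              # shift after a loss; stay after a win

print(f"Always follow the rule: {always:.2f}")          # about 0.70
print(f"Win-stay/lose-shift:    {correct.mean():.2f}")  # noticeably lower (~0.59)
```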
24. Derogation of Physician Who Used a Diagnostic Aid Several studies show this result.
25. Checklists: A form of model and a POWERFUL Decision Aid
28. Individual Differences in Rational Thought - 1 Behavior = f (Task, Person): An old idea in psychology.
Distinction between performance errors and competence errors.
Performance error is the failure to apply a rule (strategy) that is part of a person’s competence because of a momentary (random) lapse (a mistake). Implies what correlation in errors across tasks?
Competence error due to lacking the right rule or applying a wrong rule.
Implies what correlation in errors across similar tasks?
29. Individual Differences in Rational Thought - 2 What is the distinction between cognitive capacities and thinking dispositions? Which is the more malleable (teachable)?
What are the implications of finding a positive correlation, versus no correlation, between measures of cognitive capacity and the responses suggested by a normative model?
30. Individual Differences in Rational Thought - 3 What was the method used by Stanovich and West?
General reasoning tasks like the Wason selection task and base-rate problems.
What was the general pattern of results?
Differences between within-subject and between-subject tests?
Many biases, but not all, are surprisingly independent of cognitive ability.
31. Individual Differences in Rational Thought - 4 Correlations with cognitive ability will be found only with problems of “intermediate” difficulty.
Need to have the correct rule and also to see the application of the rule (the “mindware” gap and “override detection”)
32. Cognitive Reflection (Frederick, 2005) Do people differ in their willingness to override an initial judgment with more thought (cognitive reflection)?
Note: this disposition is assumed to be different from the ability to do so (intelligence).
Basic tasks (the Cognitive Reflection Test), e.g., “A bat and ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?” Intuitive (quick) response? Correct response? (A worked solution appears after this list.)
Evidence that the “incorrect response” is an intuitive response?
Pattern of errors – certain types of errors, e.g., 10 cents
Process measures, e.g., verbal reports and scribbles in the margin suggest 10 cents is the first response.
People who missed the problem thought the problem was easier.
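For reference, a worked solution to the bat-and-ball problem (not spelled out on the slide): let the ball cost x dollars. Then x + (x + 1.00) = 1.10, so 2x = 0.10 and x = 0.05. The ball costs 5 cents; the intuitive answer of 10 cents would make the bat $1.10 and the total $1.20.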
Mean CRT scores across universities? 0.57 to 2.18 (out of a possible 3).
Relationships to time preferences (positive but mixed) and to risk preferences (suggestive but again mixed)
Sex differences? Men scored higher than women, and the relationships of CRT to risk attitudes and time preferences appear to differ by sex (CRT more closely linked for women).