Individual, Group, and Computer Strengths and Weaknesses Judgments and Decisions Psych 253
Individual Weaknesses • Limited perception: Accept the frame and context that are given • Limited attention: Insensitivity to relevant information and sensitivity to irrelevant information (order, phrasing, surrounding situation) • Limited memory: Short-term limit of 7 ± 2 items • Limited reasoning: People are inconsistent and invalid information processors.
Individual Strengths • Computer builders: We build the machines, not vice versa. • Theorists: We develop normative theories for decision-making. • Pattern recognizers: We can see and extract patterns (faces, chess experts, nurses, pilots, art experts).
Let’s consider a scene from Klein’s Sources of Power. • “It is a simple house fire in a one-story house in a residential neighborhood. The fire is in the back, in the kitchen area. The lieutenant leads his hose crew into the building, to the back, to spray water on the fire, but the fire just roars back at them. ‘Odd,’ he thinks. The water should have more of an impact. They try dousing it again and get the same results. They retreat a few steps to regroup. Then the lieutenant starts to feel as if something is not right. He doesn’t have any clues; he just doesn’t feel right about being in that house, so he orders his men out of the building, a perfectly standard building with nothing out of the ordinary. As soon as his men leave the building, the floor where they had been standing collapses. Had they still been inside, they would have plunged into the fire below.” • Source: Klein, Gary (1998), Sources of Power: How People Make Decisions, Cambridge, MA: MIT Press.
Many people claim to be experts. How do we know whether an "expert" is really an expert? Experts should be identified by comparing their predictions to a gold standard, such as survival (with respect to surgeons) and safety (with respect to air traffic controllers). But often there is no gold standard (e.g., wine connoisseurs, professors grading essays, eyewitnesses giving testimony, jurors determining guilt or innocence). Why were Klein’s experts so good? Many reasons, but one important point to notice: There were no “gold standards” in his real-world scenarios. Perhaps, if statistical models had been developed, the models would have outperformed the experts.
When gold standards are not available, experts should AT LEAST show: • Discrimination in judgments between similar, though not identical, stimuli • Consistency in judgments of the same stimuli on repeated occasions
In a study on good judgment, Swedish general practitioners judged the probability of heart failure for 45 cases based on real patients. Five were repeated (though the physicians were not told that). They are called A, B, C, D, and E. Assessments were made on a scale from “Totally Unlikely” to “Certain.” • Samples from three practitioners’ judgments are shown in the next slides. Source: Skånér, Y., Strender, L. E., & Bring, J. (1998), “How do GPs use clinical information in their judgements of heart failure? A Clinical Judgment Analysis Study,” Scandinavian Journal of Primary Health Care, 16, 95-100.
GP who is consistent but can’t discriminate. [Figure: judgments by patient case]
GP who discriminates but is inconsistent. [Figure: judgments by patient case]
GP who discriminates and is consistent. [Figure: judgments by patient case]
Of course, we don’t know if any of these doctors are correct ... but discriminability and consistency are necessary components of expertise. • What we really want is validity. But even without that information, this type of exercise can be used for training, evaluating, and enhancing performance in fields as diverse as medical diagnosis, auditing, personnel selection, figure skating, and air traffic control.
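The two criteria can be put into numbers. The following is a minimal sketch, not the analysis used by Skånér et al.: the ratings are invented for illustration, the function names `inconsistency` and `discrimination` are hypothetical, and the particular measures (mean absolute test-retest gap, and spread of ratings across cases) are one assumed choice among many.

```python
from statistics import mean, pstdev

def inconsistency(first, second):
    """Mean absolute gap between two ratings of the same cases (0 = perfectly consistent)."""
    return mean(abs(a - b) for a, b in zip(first, second))

def discrimination(first, second):
    """Average spread of ratings across cases on each occasion (0 = all cases rated alike)."""
    return mean([pstdev(first), pstdev(second)])

# Hypothetical ratings (0-100 scale) of five repeated cases A-E:
# first presentation, then second presentation. Not real data.
gps = {
    "consistent, no discrimination": ([50, 50, 52, 50, 48], [50, 51, 50, 50, 49]),
    "discriminates, inconsistent":   ([10, 80, 40, 90, 20], [60, 30, 85, 35, 70]),
    "discriminates and consistent":  ([10, 80, 40, 90, 20], [12, 78, 42, 88, 22]),
}

for label, (first, second) in gps.items():
    print(f"{label}: inconsistency={inconsistency(first, second):.1f}, "
          f"discrimination={discrimination(first, second):.1f}")
```

Only the third hypothetical GP combines a low inconsistency score with a high discrimination score. As the slide notes, neither number says anything about validity.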
Group Weaknesses • Suggestibility • Conformity • Obedience • Compliance
Effects of Suggestibility in Ambiguous Settings [Figure: judgments of Person 1, Person 2, and Person 3] Source: Sherif, Muzafer (1936), The Psychology of Social Norms, New York: Harper Collins.
Suggestibility and conformity to group pressure occurred even when people could easily judge the truth by themselves. [Figure: test line compared with lines A, B, and C] Source: Asch, S. (1956), “Studies of independence and conformity: A minority of one against a unanimous majority,” Psychological Monographs, 70, 9, Whole No. 416.
Milgram experiment: Under what conditions would ordinary people follow instructions and hurt others? [Figure: proportion complying, by voltage] Source: Milgram, Stanley (1974), Obedience to Authority: An Experimental View, New York: Harper and Row.
Irving Janis came up with the term groupthink when he read Arthur Schlesinger’s account of how Kennedy and his advisers blundered into the Bay of Pigs. Janis studied the process and found that the advisers fostered a sense that the plan had to succeed. To preserve the good group feeling, dissenting views were censored, especially after Kennedy voiced enthusiasm for the idea. Source: Janis, Irving (1982), Groupthink: Psychological Studies of Policy Decisions and Fiascoes, Boston: Houghton Mifflin. The recipe for groupthink is:
Members self-censor • Pressure is placed on those who dissent • Members feel invulnerable • Members stereotype others • Group is extremely cohesive • Group is insulated from others’ opinions • Group has a strong, directive leader
Group Strengths Groups tend to work better when members’ opinions are: • Independent (people’s opinions are not dependent on those around them) • Diverse (each person has some private information) • Decentralized (people can specialize and draw on local knowledge) • Aggregated via a reliable mechanism (turning private judgments into a collective decision) • Source: Surowiecki, J. (2004), The Wisdom of Crowds, New York: Doubleday.
Examples • Judging the weight of an ox • Locating the USS Scorpion • Google’s method for locating web pages • Playing the Iowa Electronic Market • Getting advice on Who Wants to Be a Millionaire?
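Why averaging helps in a case like the ox can be shown with a small simulation. This is a sketch under stated assumptions, not a reconstruction of any of the examples above: the true weight, the number of guessers, and the noise level are all invented, and the guesses are assumed to be independent and unbiased, which is exactly the condition the group-strengths list asks for. The function name `crowd_vs_individuals` is hypothetical.

```python
import random
from statistics import mean

def crowd_vs_individuals(truth, guesses):
    """Return (error of the averaged guess, average error of the individual guesses)."""
    crowd_error = abs(mean(guesses) - truth)
    typical_individual_error = mean(abs(g - truth) for g in guesses)
    return crowd_error, typical_individual_error

# Hypothetical setup: 800 independent guesses scattered around an assumed true weight.
rng = random.Random(0)   # fixed seed so the run is repeatable
true_weight = 1000       # illustrative value, not a historical figure
guesses = [rng.gauss(true_weight, 100) for _ in range(800)]

crowd_error, individual_error = crowd_vs_individuals(true_weight, guesses)
print(f"crowd error: {crowd_error:.1f}, typical individual error: {individual_error:.1f}")
```

The averaged guess can never be further from the truth than the average individual is, and with independent errors it is usually much closer. If the guesses share a common bias (for example, because people copy one another), the errors no longer cancel and the advantage shrinks.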
Companies can take advantage of group strengths. Companies now using prediction markets: Yahoo!, Eli Lilly, Google, Microsoft, HP, GE. • These markets predict whether customers will like new products and services, whether new drugs will gain FDA approval (Eli Lilly), when product launches will occur (Google), how often products will be used (Google), what sales growth will be, when a particular feature will work (Microsoft), when a project will be ready for testing, and the number of bugs that will be reported in a piece of software in a given period (Microsoft). • Prediction markets open to the public yield data on factors affecting business plans, including presidential elections, gas prices, real estate values, a film's performance at the box office, and even the probability of a flu pandemic.
Computer Weaknesses • Doing complex tasks in 3D • Putting information in context and taking unusual events into consideration (broken leg cue)
Computer Strengths • Remember things, keep track of things, combine the same information the same way on repeated occasions, make predictions (help us find predictable cues and predictable relationships) • Numerous studies have compared “experts” against prediction models. Simple models do better.
• Selecting applicants to universities, colleges, or professional schools • Making medical diagnoses (e.g., cancer) based on tests, interviews, and other available information • Identifying students who will later act violently in high schools and middle schools • Identifying who will default on a loan • Predicting which criminals will violate parole • Estimating survival times for patients with terminal illness • Forecasting the weather • Determining who is guilty and who is innocent
Meehl (1954) • Found 16 to 20 studies that compared clinical and statistical methods of decision making. We’ll refer to these as intuitive versus statistical methods of information aggregation. • In all but one, statistical methods did better at predicting actual behavior.
Sawyer (1966) • Recognized that the issue of measurement was also important. • Prediction refers to the way the data were combined (intuitively or statistically). • Measurement refers to the way the data were collected (intuitively or statistically). • Intuitive data come from unstructured interviews, whereas statistical data are test scores.
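Methods by type of data and prediction method • Intuitive data: pure intuition (intuitive prediction), regression on ratings (statistical prediction) • Statistical data: intuitive combo of test scores (intuitive prediction), pure statistical (statistical prediction) • Both types of data: intuitive composite (intuitive prediction), statistical composite (statistical prediction) • Both, sequential: intuitive synthesis (intuitive prediction), statistical synthesis (statistical prediction)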
Pure intuition. Predict behavior from an interview without tests or other objective information. • Regression on ratings. Rate the candidate on impressions and combine the ratings with regression. • Intuitive combo of test scores. Statistically collected data, intuitively combined.
Pure statistical. Statistically collected data, mechanically combined. Test scores used in a multiple regression to predict performance. • Intuitive composite. Both types of data, intuitively combined. Impressions from interviews and test scores combined intuitively • Statistical composite. Both modes of data combined with regression
Intuitive synthesis. Take a prediction produced by mechanical combination and treat it as a datum to be combined intuitively with other data • Statistical synthesis. Take a prediction produced by intuition and treat it as a datum to be combined statistically with other data.
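Results by type of data and prediction method • Intuitive data: 20% (intuitive prediction), 43% (statistical prediction) • Statistical data: 38% (intuitive prediction), 63% (statistical prediction) • Both types of data: 26% (intuitive prediction), 75% (statistical prediction) • Both, as prediction: 50% (intuitive prediction), 75% (statistical prediction)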
Why do linear models do better? • There are not too many crossover interactions in the world. • Monotonic relationships between predictors and the criterion are captured fairly well with linear models. • The weights assigned to predictor variables are not as important as their signs (Even nonoptimal regression methods outperform expert judgments). • People are unreliable, invalid, and distracted by “exceptions.” They are better at providing information that is then combined statistically.
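The point that signs matter more than weights can be illustrated with a unit-weighted model: standardize each predictor, attach only a plus or minus sign, and add. This is a minimal sketch of one kind of nonoptimal linear model, not a method taken from the studies cited here; the applicant data, predictor names, and the function `unit_weight_scores` are all hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw values to z-scores so predictors share a common scale."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def unit_weight_scores(predictors, signs):
    """Sum z-scored predictors, each weighted only by its sign (+1 or -1)."""
    columns = [standardize(col) for col in predictors]
    return [sum(sign * col[i] for sign, col in zip(signs, columns))
            for i in range(len(columns[0]))]

# Hypothetical applicants 0-3: test score and GPA (higher is better),
# absences (lower is better). Invented numbers for illustration only.
test_scores = [60, 75, 90, 55]
gpas        = [3.0, 3.4, 3.8, 2.6]
absences    = [4, 2, 1, 9]

scores = unit_weight_scores([test_scores, gpas, absences], signs=[+1, +1, -1])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)  # applicant indices, best first
```

No regression is fitted here. The only judgment the model needs from a person is the direction of each relationship, and it then applies that judgment the same way to every case.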
Simple models help people separate facts from values. • Police officers and minority communities in Denver, Colorado, had opposing views about which bullets police should use. The police wanted to switch from lightweight bullets to a new, hollow-tipped bullet that would more reliably disable suspects. Minority communities argued the new bullet would kill innocent bystanders. The issue was brought to the city council, where each side brought in experts to testify in its favor. • Source: Hammond, K., & Adelman, L. (1976), “Science, values, and human judgment,” Science, 194, 389-96.
[Diagram: bullet characteristics X1 to X5, including weight, muzzle velocity, and kinetic energy, feed into injury, stopping effectiveness, and threat to bystanders, which in turn feed into acceptability.]
Multiattribute Utility Approach • Each bullet (B1, B2, B3, B4) is scored on three attributes: injury, stopping effectiveness, and threat. • Acceptability = w1 × Injury + w2 × Stopping Effectiveness + w3 × Threat
One can separate facts from values and put the two types of information in the appropriate place in a linear model. • Ballistic experts determine how the muzzle velocity, mass, and kinetic energy influence injury potential, threat to bystanders, and stopping effectiveness. • Council members determine the importance of the attributes. • The bullet with the greatest multiattribute utility (MAU) was not the one the police had been using or the one they wanted. Nonetheless, the bullet with the greatest MAU posed no greater threat to bystanders and was nearly equal in stopping effectiveness. The process led to acceptance by all concerned.
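The separation of facts from values can be sketched in a few lines. Everything numeric below is hypothetical: the attribute scores stand in for the ballistic experts' judgments, the weights stand in for the council's values, and none of it comes from Hammond and Adelman's study. The weights for injury and threat are assumed to be negative because more of either makes a bullet less acceptable.

```python
def multiattribute_utility(scores, weights):
    """Weighted sum of attribute scores: one acceptability value per option."""
    return {
        option: sum(weights[attr] * value for attr, value in attrs.items())
        for option, attrs in scores.items()
    }

# Facts: hypothetical expert ratings of each bullet on a 0-10 scale.
scores = {
    "B1": {"injury": 3, "stopping": 4, "threat": 2},
    "B2": {"injury": 8, "stopping": 9, "threat": 7},
    "B3": {"injury": 4, "stopping": 8, "threat": 2},
    "B4": {"injury": 6, "stopping": 6, "threat": 5},
}

# Values: hypothetical importance weights (w1, w2, w3 in the formula).
weights = {"injury": -1, "stopping": 2, "threat": -2}

utilities = multiattribute_utility(scores, weights)
best = max(utilities, key=utilities.get)
print(utilities, best)
```

Changing the weights changes the values without touching the facts, and changing a score changes the facts without touching the values, so a disagreement can be traced to one table or the other.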
[Figure: percentage of favorable decisions by hour of day] Source: Danziger, Shai, Jonathan Levav, and Liora Avnaim-Pesso (2011), “Extraneous factors in judicial decisions,” PNAS.