The psychology of knights and knaves. Lance J. Rips, University of Chicago, 1989. Knights and Knaves.
Knights and Knaves (1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements: A: "B is a knave." B: "A and C are of the same type." What is C? (Smullyan, 1978, p. 22)
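As a sanity check on puzzle (1), the answer can be confirmed by brute force. This is only an illustrative sketch, not the natural-deduction procedure the talk describes; the encoding (True = knight) and the function name `consistent_worlds` are assumptions made for the example.

```python
from itertools import product

def consistent_worlds():
    """Enumerate knight/knave assignments consistent with puzzle (1)."""
    worlds = []
    for a, b, c in product([True, False], repeat=3):  # True = knight, False = knave
        a_says = not b       # A: "B is a knave"
        b_says = (a == c)    # B: "A and C are of the same type"
        # A statement is true exactly when its speaker is a knight.
        if a_says == a and b_says == b:
            worlds.append({"A": a, "B": b, "C": c})
    return worlds

worlds = consistent_worlds()
c_types = {"knight" if w["C"] else "knave" for w in worlds}
print(worlds)
print(c_types)
```

Two assignments survive (A knight with B knave, and A knave with B knight), and C is a knave in both.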
Protocol evidence • Subjects attempted to solve problems by considering specific assumptions • Worked forward from their assumptions • Subjects sometimes forgot assumptions
Computational model • Based on the idea that people deal with deduction problems by applying mental-deduction rules like those of formal natural deduction systems
Computational model • Subjects' performance on a deduction problem is predicted from the length of the required derivation and the availability of the rules • The shorter the derivation and the more available the rules, the faster and more accurate subjects should be
Computational model • knight(x): x is a knight • knave(x): x is a knave • says(x, p): person x uttered sentence p • Rule 1: says(x, p) and knight(x) entail p • Rule 2: says(x, p) and knave(x) entail NOT p • Rule 3: NOT knave(x) entails knight(x) • Rule 4: NOT knight(x) entails knave(x)
Computational model - PROLOG Program • Stores the logical form of the sentences in the problem and extracts the names of the individuals (A, B, and C) • Assumes the first-mentioned individual is a knight, knight(A) • Draws as many inferences as possible from this assumption • If it derives contradictory sentences (e.g., knight(B) and knave(B)), it abandons the assumption that the first-mentioned individual is a knight and continues with the assumption knave(A)
Computational model - PROLOG Program • Revises the rule ordering: rules that applied successfully are tried first on the next round • Continues until it has found all consistent sets of assumptions about the knight / knave status of each individual
Computational model - PROLOG Program • All rules operate forward • Assumes subjects' error rates and response times depend on the length of the derivations
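The assume-and-deduce loop described on these slides can be sketched roughly as follows. This is a simplified Python illustration, not the original PROLOG program: it uses only Rules 1-4, handles only atomic statements such as "B is a knave", tries assumptions about the first-mentioned individual only, and does not reorder rules. The tuple encoding, the function names, and the two-speaker puzzle are all hypothetical.

```python
def derive(says, assumption):
    """Work forward from one assumption using Rules 1-4.

    Returns (facts, steps), where steps counts each newly derived sentence.
    """
    facts, steps, i = [assumption], 0, 0
    while i < len(facts):
        fact = facts[i]
        i += 1
        if fact[0] == "knight":      # Rule 1: what a knight says is true
            inferred = [p for who, p in says if who == fact[1]]
        elif fact[0] == "knave":     # Rule 2: what a knave says is false
            inferred = [("not", p) for who, p in says if who == fact[1]]
        elif fact[1][0] == "knave":  # Rule 3: NOT knave(x) entails knight(x)
            inferred = [("knight", fact[1][1])]
        else:                        # Rule 4: NOT knight(x) entails knave(x)
            inferred = [("knave", fact[1][1])]
        for q in inferred:
            if q not in facts:
                facts.append(q)
                steps += 1
    return facts, steps

def contradictory(facts):
    """True if some individual is derived to be both a knight and a knave."""
    return any(("knave", f[1]) in facts for f in facts if f[0] == "knight")

def solve(says):
    """Try knight(first) and then knave(first); keep the consistent outcomes."""
    first = says[0][0]  # first-mentioned individual
    results = []
    for assumption in (("knight", first), ("knave", first)):
        facts, steps = derive(says, assumption)
        if not contradictory(facts):
            types = {f[1]: f[0] for f in facts if f[0] != "not"}
            results.append((types, steps))
    return results

# Hypothetical puzzle (not from the paper):
# A says "B is a knave"; B says "A is a knave".
says = [("A", ("knave", "B")), ("B", ("knave", "A"))]
print(solve(says))
```

The step counter stands in for the derivation length that the model uses to predict difficulty; a fuller version would also need the propositional rules and further assumptions about the remaining individuals.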
Experiment 1 • Rule 5 (AND Elimination): p AND q entails p, q • Rule 6 (Modus Ponens): IF p THEN q and p entail q • Rule 7 (DeMorgan-1): NOT (p OR q) entails NOT p AND NOT q • Rule 8 (DeMorgan-2): NOT (p AND q) entails NOT p OR NOT q
Experiment 1 • Rule 9 (Disjunctive Syllogism-1): p OR q and NOT p entail q; p OR q and NOT q entail p • Rule 10 (Disjunctive Syllogism-2): NOT p OR q and p entail q; p OR NOT q and q entail p • Rule 11 (Double Negation Elimination): NOT NOT p entails p
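Each of Rules 5-11 is a valid inference in classical propositional logic, which a small truth-table check can confirm. The sketch below is an illustration only and is separate from the model itself; the `entails` helper is hypothetical, and IF p THEN q is read as the material conditional (NOT p OR q), which is an assumption.

```python
from itertools import product

def entails(premises, conclusion):
    """True if the conclusion holds in every valuation of (p, q) satisfying all premises."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(prem(p, q) for prem in premises)
    )

checks = [
    ("Rule 5a", [lambda p, q: p and q], lambda p, q: p),
    ("Rule 5b", [lambda p, q: p and q], lambda p, q: q),
    ("Rule 6", [lambda p, q: (not p) or q, lambda p, q: p], lambda p, q: q),
    ("Rule 7", [lambda p, q: not (p or q)], lambda p, q: (not p) and (not q)),
    ("Rule 8", [lambda p, q: not (p and q)], lambda p, q: (not p) or (not q)),
    ("Rule 9a", [lambda p, q: p or q, lambda p, q: not p], lambda p, q: q),
    ("Rule 9b", [lambda p, q: p or q, lambda p, q: not q], lambda p, q: p),
    ("Rule 10a", [lambda p, q: (not p) or q, lambda p, q: p], lambda p, q: q),
    ("Rule 10b", [lambda p, q: p or (not q), lambda p, q: q], lambda p, q: p),
    ("Rule 11", [lambda p, q: not (not p)], lambda p, q: p),
]

results = {name: entails(prems, concl) for name, prems, concl in checks}
print(results)

# An invalid pattern for contrast: p OR q does not entail p.
invalid = entails([lambda p, q: p or q], lambda p, q: p)
print(invalid)
```

Under the material-conditional reading, Rule 6 and the first half of Rule 10 coincide semantically; the rules differ only in the surface form of the sentence they apply to.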
Experiment 1 - Method • Submitted puzzles to the PROLOG program and counted the number of inference steps it needed to solve them • 34 problems • Six problems had 2 speakers, 28 had 3 • 2-speaker problems had 3 or 4 clauses • 3-speaker problems had 4, 5, or 9 clauses
Experiment 1 - Method • 4-clause, 3-speaker problems: (2) A says, “C is a knave.” B says, “C is a knave.” C says, “A is a knight and B is a knave.” (3) A says, “B is a knight.” B says, “C is a knave or A is a knight.” C says, “A is a knight.”
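The intended answers to problems (2) and (3) can be checked with a small generic solver. This is a brute-force sketch for verification, not the step-counting program used in the experiment; the `solve` function and the lambda encoding of each statement (True = knight) are assumptions made for the example.

```python
from itertools import product

def solve(speakers, statements):
    """Return every assignment in which each statement is true iff its speaker is a knight."""
    solutions = []
    for values in product([True, False], repeat=len(speakers)):
        w = dict(zip(speakers, values))  # True = knight, False = knave
        if all(statements[s](w) == w[s] for s in speakers):
            solutions.append(w)
    return solutions

speakers = ["A", "B", "C"]

puzzle_2 = {
    "A": lambda w: not w["C"],             # "C is a knave."
    "B": lambda w: not w["C"],             # "C is a knave."
    "C": lambda w: w["A"] and not w["B"],  # "A is a knight and B is a knave."
}
puzzle_3 = {
    "A": lambda w: w["B"],                   # "B is a knight."
    "B": lambda w: (not w["C"]) or w["A"],   # "C is a knave or A is a knight."
    "C": lambda w: w["A"],                   # "A is a knight."
}

solution_2 = solve(speakers, puzzle_2)
solution_3 = solve(speakers, puzzle_3)
print(solution_2)
print(solution_3)
```

Each problem has a single consistent assignment: in (2), A and B are knights and C is a knave; in (3), all three are knights.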
Experiment 1 - Subjects • 34 subjects • 3 groups of 10 to 13 individuals • University of Arizona Undergraduates • English Speakers, no formal logic courses • 10 subjects stopped working on the problems after 15 minutes
Experiment 1 - Results and Discussion • No subject solved the most difficult problem; 35% solved the easiest • Subjects solved 24% of the problems predicted to be easier and 16% of those predicted to be more difficult • The program used a mean of 19.3 steps to solve the simpler problems and 24.2 steps on the more difficult problems • Core subjects solved 32% of the easier problems and 20% of the more difficult problems
Experiment 1 - Results and Discussion • Figure: percentage of correct solutions in Experiment 1 as a function of the number of inference steps used by the model
Experiment 1 - Results and Discussion • 3-speaker, 9-clause outlier: (4) A says, “We’re all knaves.” B says, “A, B, or C is a knight.” C says, “A, B, or C is a knave.”
Experiment 1 - Results and Discussion • The findings are consistent with the prediction that subjects score higher on puzzles requiring fewer inference steps
Experiment 1 - Results and Discussion • Binary connectives: says(A, ((knave(A) AND knave(B)) AND knave(C))) • N-ary connectives: AND(knave(A), knave(B), knave(C))
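The relation between the two representations can be illustrated by flattening nested binary connectives into a single n-ary one. This is a minimal sketch; the tuple encoding of sentences and the `flatten` function are assumptions for illustration, not part of the original program.

```python
def flatten(expr):
    """Collapse nested binary AND/OR tuples into a single n-ary connective."""
    if isinstance(expr, str):          # atomic sentence such as "knave(A)"
        return expr
    op, *args = expr
    args = [flatten(a) for a in args]
    if op not in ("AND", "OR"):        # e.g. NOT: keep the operator, flatten inside
        return (op, *args)
    merged = []
    for a in args:
        if isinstance(a, tuple) and a[0] == op:
            merged.extend(a[1:])       # absorb a nested connective of the same kind
        else:
            merged.append(a)
    return (op, *merged)

# A's statement in problem (4), written with binary connectives.
binary = ("AND", ("AND", "knave(A)", "knave(B)"), "knave(C)")
nary = flatten(binary)
print(nary)
```

With the binary form, AND Elimination has to be applied twice to reach all three conjuncts; with the n-ary form, one application would do, so the choice of representation changes the step count.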
Experiment 2 • Predict the amount of time subjects take to reach a correct solution based on the number of steps the model needs to find a correct answer.
Experiment 2 • Problems were simplified, because longer problems produced longer and more variable times • More difficult problems also resulted in fewer correct answers • Tighter control on the form of the problems • Eliminated irrelevant effects of problem wording and response
Experiment 2 • Modified the rules to allow the program to solve a wider variety of problems • Rules 9 and 10 (Disjunctive Syllogism) were extended to allow the program to infer p from any of the following: • OR(knight(x), p) and knave(x) • OR(knave(x), p) and knight(x) • OR(p, knight(x)) and knave(x) • OR(p, knave(x)) and knight(x)
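The four extended Disjunctive Syllogism patterns amount to one check: if x is known to be of one type, a disjunct claiming x is of the other type is ruled out, and the remaining disjunct follows. The sketch below illustrates this under an assumed tuple encoding; `extended_ds` and `OPPOSITE` are hypothetical names, not taken from the program.

```python
OPPOSITE = {"knight": "knave", "knave": "knight"}

def extended_ds(disjunction, fact):
    """Return p if `fact` rules out the other disjunct of OR(..., ...), else None."""
    op, left, right = disjunction
    kind, who = fact
    if op != "OR" or kind not in OPPOSITE:
        return None
    blocked = (OPPOSITE[kind], who)  # the disjunct that `fact` contradicts
    if left == blocked:
        return right
    if right == blocked:
        return left
    return None

# B's statement in problem (3): "C is a knave or A is a knight."
statement = ("OR", ("knave", "C"), ("knight", "A"))
print(extended_ds(statement, ("knight", "C")))
```

Without such a rule, the program would first have to convert knight(C) into NOT knave(C) before the ordinary Disjunctive Syllogism could apply.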
Experiment 2 - Method • Subjects viewed the problems on a monitor and responded using a response panel • The monitor gave subjects feedback about the accuracy of their answer and the amount of time taken
Experiment 2 - Method • Submitted problems to the natural-deduction program and chose 12 of the groups based on its output • The problems in each group had the same output but differed in the number of inference steps required to solve them • Column 1 (small-step): 13.1 steps • Column 2 (small-step): 13.0 steps • Column 3 (large-step): 16.4 steps
Experiment 2 - Method • The prediction is that the large-step problems within each row will result in longer response times and more errors
Experiment 2 - Subjects • 53 University of Chicago undergraduates • Native English speakers, no formal logic • $5 bonus, minus 10 cents for each trial on which they made an error • Discarded data from subjects who made errors on more than 40% of trials • 30 subjects remained after this exclusion
Experiment 2 - Results and Discussion • The problems with a larger number of predicted inference steps took subjects longer to solve • Subjects took 25.5 s and 23.9 s to solve the two types of small-step problems, but 29.5 s on the large-step problems
Experiment 2 - Results and Discussion • Error rates • 1st small-step: 15.8% • 2nd small-step: 9% • Large-step: 14.4%
Experiment 2 - Results and Discussion • Knight-knave problems took the longest to solve and produced the most errors • Knight-knight: 24.8 s, 14.4% errors • Knight-knave: 29.4 s, 17.5% errors • Knave-knight: 24.0 s, 8% errors • Knave-knave: 26.8 s, 12.2% errors • But there was only a small difference in the number of steps the program needed to solve them
Experiment 2 - Results and Discussion • The increased difficulty of knight-knave problems was attributed to the small-step items • Subjects incorrectly assume a character is lying when the character states “I am a knave…” • This would result in a knave(A)-knight(B) response
Experiment 2 - Results and Discussion • Effects of negatives • Subjects took longer to read and comprehend negative sentences • The model needs extra steps to transform these negatives into positives • Rule 3: NOT knave(x) to knight(x) • 23.4 s to solve problems with no negatives, with a 10.6% error rate • 27.2 s to solve problems with one negative, with a 13.9% error rate
General Discussion - Natural-deduction model • People carry out deduction tasks by constructing mental proofs • Represent information • Make further assumptions • Draw inferences • Reach conclusions on the basis of the derivation
General Discussion - Natural-deduction model • The knights-and-knaves problems extend the model beyond previous experiments, in which subjects judged the validity of arguments • The problems depend on logical properties but do not have a premise-conclusion format
General Discussion - Natural-deduction model • Protocol • Participants followed an assume-and-deduce strategy • Experiment 1 • The model predicted the probability of subjects solving a set of moderately complex and varied puzzles • Experiment 2 • Response times increased with the number of inference steps
General Discussion - Natural-deduction model • Limitations • A large minority found the simpler problems extremely difficult and performed below chance • Results were interpreted using only the natural-deduction framework
General Discussion • Subjects who did not complete the task • Large variation • Experiment 1 – some achieved 80% correct, other subjects missed all
General Discussion • Individual Differences • OR Introduction • Avoided problems dependent on OR Introduction • Lack of availability of Knight-knave rules • Subjects do not understand that what a knight says is true and what a knave says is false
General Discussion - Alternative Theories • Deduction by heuristic • For example, responding knave if a character says “I am a knave” and responding knight otherwise • This heuristic would yield 25% correct, versus the 87% obtained • No apparent “non-logical” shortcuts
General Discussion - Alternative Theories • Deduction by pragmatic schemas • Knights and knaves does not follow a real-world schema • Very few situations in which people always tell the truth or always lie • Schemas may help with the Wason selection task (permission / restrictions) • But there is no case for people using schemas on most deduction problems
General Discussion - Alternative Theories • Deduction by mental models • The subject surveys a model for a potential conclusion and, if one is found, attempts to find a counterexample by altering the model • If no counterexample is found, the subject adopts the initial conclusion as correct • If a counterexample is found, the conclusion is rejected and another conclusion is examined • This continues until an acceptable conclusion is found or it is decided that no conclusion is valid
General Discussion - Alternative Theories (1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements: A: "B is a knave." B: "A and C are of the same type." What is C?
General Discussion - Alternative Theories • Subjects use tokens for each character: knightA knaveB knaveC • Tentative conclusion: C is a knave; continue by searching for counterexamples
General Discussion - Alternative Theories • Alternative model: knaveA knightB knaveC • Since the conclusion holds in both models, C is a knave
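The evaluation step of the mental-models account, as applied to puzzle (1), can be sketched as a search for a counterexample among the available models. The sketch takes the two token models from the slides as given; how those models are constructed is left open, and the `evaluate` function and dictionary encoding are assumptions for illustration.

```python
def evaluate(models, person):
    """Return the type of `person` if no model contradicts it, else None."""
    first, *rest = models
    conclusion = first[person]           # tentative conclusion read off the first model
    for model in rest:
        if model[person] != conclusion:  # counterexample found: reject the conclusion
            return None
    return conclusion

# The two token models shown on the slides.
models = [
    {"A": "knight", "B": "knave", "C": "knave"},
    {"A": "knave", "B": "knight", "C": "knave"},
]

print(evaluate(models, "C"))
print(evaluate(models, "A"))
```

The conclusion about C survives both models, while any conclusion about A or B is overturned by the second model, so only C's type is determined.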
General Discussion - Alternative Theories • None of the think-aloud subjects mentioned tokens • This could reflect a difficulty in describing mental models • The theory does not account for the process that produces and evaluates the model
General Discussion - Alternative Theories • Mental-model accounts deny that the insights are due to mental inference rules or non-logical heuristics • What cognitive mechanism is responsible for these insights? • Models could be put together in a haphazard manner and checked for consistency • This fails to give a good account of the systematic protocols • It shifts the burden of explanation to the consistency checker