
Modeling transfer of learning in games of strategic interaction


Presentation Transcript


  1. Modeling transfer of learning in games of strategic interaction Ion Juvina & Christian Lebiere Department of Psychology Carnegie Mellon University ACT-R Workshop July 2012

  2. Outline
  • Background
  • Experiment
  • Cognitive model
  • Work in progress
  • Discussion

  3. Transfer of learning
  • Alfred Binet (1899): formal discipline
    • Exercise of mental faculties -> generalization
  • Thorndike (1903): identical element theory
    • Transfer of learning occurs only when identical elements of behavior are carried over from one task to another
  • Singley & Anderson (1989)
    • Surface vs. deep similarities
    • Common “cognitive units”

  4. Transfer in strategic interaction
  • Bipartisan cooperation in Congress
  • Golf -> bipartisanship?
  • Similarity? What is transferred?

  5. Prisoner’s Dilemma (PD)

  6. Chicken game (CG)

  7. PD & CG payoff matrices

  8. Similarities between PD & CG
  • Surface (near transfer)
    • 2x2 games
    • 2 symmetric and 2 asymmetric outcomes
    • [1,1] outcome is identical
  • Deep (far transfer)
    • Mixed motive
    • Non-zero sum
    • Mutual cooperation is superior to competition in the long term, though unstable (risky)

  9. Differences between PD & CG
  • Different equilibria:
    • Symmetric in PD: [-1,-1]
    • Asymmetric in CG: [-1,10] and [10,-1]
  • Different strategies to maximize joint payoff (Pareto-efficient outcome):
    • [1,1] in PD
    • Alternation of [-1,10] and [10,-1] in CG
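For concreteness, the payoff structure behind these two slides can be written out as matrices. The values stated on the slides ([1,1] in both games, the [-1,-1] PD equilibrium, and the asymmetric [-1,10] / [10,-1] CG equilibria) are used directly; the remaining cells (PD's sucker/temptation payoffs and CG's mutual-crash payoff) are assumed here purely for illustration, sketched in Python:

    # Payoff matrices indexed by (row move, column move); "C" = cooperate/swerve,
    # "D" = defect/dare. Entries marked "assumed" are not given on the slides.
    PD = {
        ("C", "C"): (1, 1),     # mutual cooperation: Pareto-efficient in PD
        ("C", "D"): (-10, 10),  # assumed sucker / temptation payoffs
        ("D", "C"): (10, -10),  # assumed
        ("D", "D"): (-1, -1),   # PD's symmetric equilibrium
    }
    CG = {
        ("C", "C"): (1, 1),     # identical to PD's [1,1] cell (surface similarity)
        ("C", "D"): (-1, 10),   # CG's two asymmetric equilibria
        ("D", "C"): (10, -1),
        ("D", "D"): (-10, -10), # assumed mutual-crash payoff
    }

    # Joint payoffs show why the Pareto-efficient strategies differ:
    print(sum(PD[("C", "C")]), sum(PD[("D", "C")]))  # 2 vs 0 -> settle on [1,1] in PD
    print(sum(CG[("C", "C")]), sum(CG[("D", "C")]))  # 2 vs 9 -> alternate [10,-1]/[-1,10] in CG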

  10. Questions / hypotheses
  • Similarities
    • Identical elements? Common cognitive units?
  • Transfer of learning
    • Is there any transfer?
    • Only in one direction, from low- to high-entropy games? (Bednar, Chen, Xiao Liu, & Page, in press)
    • Identical elements -> transfer in both directions
  • Mechanism of transfer
    • Reciprocal trust mitigates the risk associated with the long-term solution (Hardin, 2002)

  11. Participants and design
  • 480 participants (CMU students), forming 240 pairs
  • 2 within-subjects games: PD & CG
  • 4 between-subjects information conditions
    • No-info: 60 pairs
    • Min-info: 60 pairs
    • Mid-info: 60 pairs
    • Max-info: 60 pairs
  • 2 between-subjects order conditions in each information condition
    • PD-CG: 30 pairs
    • CG-PD: 30 pairs
  • 200 unnumbered rounds for each game

  12. Typical outcomes

  13. Pareto-optimal equilibria

  14. [1,1] increases with info

  15. Alternation increases with info

  16. PD – CG sequence

  17. CG – PD sequence

  18. PD before and after

  19. CG before and after

  20. Transfer from PD to CG
  • Increased [1,1] (surface transfer)
  • Increased alternation (deep transfer)

  21. Transfer from CG to PD
  • Increased [1,1] (surface + deep transfer)

  22. Divergent effects (PD -> CG)
  • Surface transfer: [1,1] in PD -> [1,1] in CG
  • Deep transfer: [1,1] in PD -> alternation of [10,-1] / [-1,10] in CG

  23. Convergent effects (CG -> PD)
  • Surface transfer: [1,1] in CG -> [1,1] in PD
  • Deep transfer: alternation of [10,-1] / [-1,10] in CG -> [1,1] in PD

  24. Reciprocation by info

  25. Payoff by info in PD and CG

  26. Summary results
  • Mutual cooperation increases with awareness of interdependence (info)
  • Transfer of learning
    • Better performance “after” than “before”
    • Combined effects of surface and deep similarities
      • CG -> PD: surface similarity facilitates transfer
      • PD -> CG: surface similarity interferes with transfer
    • Transfer occurs in both directions
  • Mechanism of generalization
    • Reciprocal trust?

  27. Cognitive model
  • Awareness of interdependence
    • Opponent modeling
  • Generality
    • Utility learning (reinforcement learning)
  • Transfer of learning
    • Surface transfer
    • Deep transfer

  28. Opponent modeling
  • Instance-based learning
    • Dynamic representation of the opponent
  • Sequence learning
    • Prediction of the opponent’s next move
  • Instance (snapshot of the current situation)
    • Previous moves and opponent’s current move
    • Contextualized expectations
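A minimal sketch, in Python, of the instance-based prediction idea described on this slide: each instance pairs a context (the previous moves of both players) with the opponent's move that followed, and matching instances are weighted by recency and frequency to predict the next move. The dictionary memory and the 1/(age+1) recency weighting are stand-ins for ACT-R's declarative memory and activation, not the model's actual mechanism.

    from collections import defaultdict, deque

    class OpponentModel:
        """Instance-based prediction of the opponent's next move (simplified)."""

        def __init__(self):
            # memory[context] holds the opponent moves observed after that
            # context, oldest first.
            self.memory = defaultdict(deque)

        def learn(self, context, opponent_move):
            # Store an instance: (previous moves of both players, opponent's move).
            self.memory[context].append(opponent_move)

        def predict(self, context, default="D"):
            # Predict the opponent's next move from matching instances, weighting
            # recent observations more heavily (a crude stand-in for ACT-R's
            # recency/frequency-driven retrieval).
            history = self.memory.get(context)
            if not history:
                return default
            scores = defaultdict(float)
            for age, move in enumerate(reversed(history)):
                scores[move] += 1.0 / (age + 1)
            return max(scores, key=scores.get)

    model = OpponentModel()
    model.learn(("C", "C"), "C")   # after mutual cooperation, opponent cooperated
    model.learn(("C", "C"), "C")
    model.learn(("C", "D"), "D")
    print(model.predict(("C", "C")))  # -> "C"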

  29. Utility learning
  • Reinforcement learning
  • Strategy: what move to make given
    • Expected move of the opponent
    • Context (previous moves)
  • Reward functions
    • Own payoff – opponent’s payoff
    • Opponent’s payoff
    • Joint payoff – opponent’s previous payoff

  30. Surface transfer
  • Declarative sub-symbolic learning
    • Retrieval of instances guided by recency and frequency
  • Strategy learning
    • A learned strategy continues to be used for a while until it is unlearned
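The “recency and frequency” retrieval mentioned here is ACT-R's standard base-level learning mechanism: an instance's activation is B_i = ln(sum_j t_j^(-d)), where t_j is the time since the j-th use of the instance and d is the decay rate (conventionally 0.5). A small illustrative Python version, not the model's code:

    import math

    def base_level_activation(use_times, now, decay=0.5):
        # ACT-R base-level learning: B_i = ln( sum_j (now - t_j) ** -decay ).
        # Frequently and recently used instances get higher activation, which is
        # how a strategy learned in the first game keeps being retrieved (and is
        # only gradually unlearned) in the second game.
        lags = [now - t for t in use_times if now > t]
        return math.log(sum(lag ** -decay for lag in lags))

    # Three recent uses beat a single old one:
    print(base_level_activation([10.0, 20.0, 28.0], now=30.0))  # higher
    print(base_level_activation([1.0], now=30.0))               # lower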

  31. Deep transfer
  • Trust learning / trust dynamics
  • Trust accumulator
    • Increases when the opponent makes cooperative (risky) moves
    • Decreases when the opponent makes competitive moves
  • Trust-invest accumulator
    • Increases with mutually destructive outcomes
    • Decreases with unreciprocated cooperation (risk taking)
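A sketch of the two accumulators as described on this slide, in Python. The slide specifies only the direction of each update; the unit-sized increments, zero starting values, and the move coding ("C" cooperative, "D" competitive) are assumptions made for illustration.

    class TrustState:
        # Trust dynamics underlying deep transfer (simplified sketch).

        def __init__(self):
            self.trust = 0.0         # trust accumulator
            self.trust_invest = 0.0  # willingness to invest in building trust

        def update(self, own_move, opp_move):
            # Trust accumulator tracks the opponent's cooperativeness.
            if opp_move == "C":          # opponent made a cooperative (risky) move
                self.trust += 1.0
            else:                        # opponent made a competitive move
                self.trust -= 1.0
            # Trust-invest accumulator tracks whether building trust seems worthwhile.
            if own_move == "D" and opp_move == "D":    # mutually destructive outcome
                self.trust_invest += 1.0
            elif own_move == "C" and opp_move == "D":  # unreciprocated cooperation
                self.trust_invest -= 1.0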

  32. Meta strategy
  • Determines which reward function to use
  • Trust accumulator <= 0
    • Reward = own payoff – opponent’s payoff
  • Trust-invest accumulator > 0
    • Reward = opponent’s payoff
  • Trust accumulator > 0
    • Reward = joint payoff – opponent’s previous payoff
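Putting slides 29, 31, and 32 together, the meta strategy can be sketched as a function from the trust state to one of the three reward signals fed to utility learning. The precedence when more than one condition holds (e.g., trust <= 0 but trust-invest > 0) is an assumption; the slide lists the three cases without ordering them.

    def select_reward(trust, trust_invest, own_payoff, opp_payoff, opp_previous_payoff):
        # Pick the reward signal used for utility (reinforcement) learning.
        if trust > 0:
            # Trusting: maximize the joint payoff, net of what the opponent
            # already received on the previous round.
            return (own_payoff + opp_payoff) - opp_previous_payoff
        if trust_invest > 0:
            # Not trusting yet, but willing to invest in building trust:
            # the opponent's payoff itself is the reward.
            return opp_payoff
        # Trust accumulator <= 0: competitive stance, maximize the payoff difference.
        return own_payoff - opp_payoff

    # Example: mutual cooperation in PD while trust is positive.
    print(select_reward(trust=2, trust_invest=0, own_payoff=1, opp_payoff=1,
                        opp_previous_payoff=1))  # -> 1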

  33. Model diagram
  [Diagram: the Environment exchanges moves with the ACT-R model. Declarative Memory holds instances (previous moves, opponent’s move) used to predict the opponent’s move; Procedural Memory holds rules that select the best response; a Trust module (trust accumulator, trust-invest accumulator), implemented as an ACT-R extension, shapes the reward. See HSCB 2011.]

  34. PD-CG

  35. CG-PD

  36. PD-CG surface transfer

  37. PD-CG deep transfer

  38. CG – PD surface + deep transfer

  39. Trust simulation

  40. Summary model results
  • Awareness of interdependence
    • Opponent modeling
  • Generality
    • Utility learning
  • Transfer of learning
    • Surface-level transfer: cognitive units
    • Deep-level transfer: trust

  41. In progress
  • Expand the model to account for all information conditions
  • Develop a more ecologically valid paradigm (IPD^3)
  • Model “affective” processes in ACT-R

  42. IPD^3

  43. General discussion
  • Transfer of learning is possible
    • Deep similarities: interpersonal level
  • IPD^3
    • To be used in behavioral experiments
    • Tool for learning strategic interaction skills

  44. Acknowledgments
  • Coty Gonzalez
  • Jolie Martin
  • Hau-Yu Wong
  • Muniba Saleem
  • This research is supported by the Defense Threat Reduction Agency (DTRA), grant number HDTRA1-09-1-0053, to Cleotilde Gonzalez and Christian Lebiere

  45. • Thank you for your attention! • Questions?
