Hierarchical Bayesian Models for Aggregating Retrieved Memories across Individuals

Mark Steyvers, Department of Cognitive Sciences, University of California, Irvine • Joint work with: Michael Lee, Brent Miller, Pernille Hemmer, Bill Batchelder, Paolo Napoletano

Ordering problem: what is the correct order of these Presidents? • George Washington • Thomas Jefferson • James Monroe • Andrew Jackson • John Adams [Figure: the five presidents to be placed along a time axis]

Goal: aggregating responses • Individual orderings: A D B C, D A B C, B A D C, A C B D, A B D C • Aggregation Algorithm produces the group answer: A B C D • Does the group answer equal the ground truth (A B C D)?

Bayesian Approach • Ground truth = latent common cause (A B C D) • Generative Model produces the individual orderings: A D B C, D A B C, B A D C, A C B D, A B D C

Important notes: • No communication between individuals • There is always a true answer (ground truth) • Aggregation algorithm never has access to ground truth • ground truth only used for evaluation

Matching problem: Van Gogh Rembrandt Monet Renoir C A D B

Wisdom of crowds phenomenon • Crowd estimate is often better than any individual in the crowd (Think of independent noise influencing each individual)

Examples of wisdom of crowds phenomenon • Galton’s Ox (1907): median of individual estimates comes close to true answer • Who Wants to Be a Millionaire?

Limitations of Current “Wisdom of Crowds” Research • Studies restricted to numeric or categorical judgments • simple averaging schemes: • Mode • Median • Mean • No treatment of individual differences • every “vote” is treated equally • downplayed role of expertise

Cultural Consensus Theory (CCT) • E.g., Romney, Batchelder, and Weller (1987) • Finds the “answer key” to multiple choice questions when ground truth is lost • Takes person and item differences into account • Informal version of CCT also developed for ranking data

Research Goals • Generalize “wisdom of crowds” effect to more complex data • Aggregation of permutations • Ranking data • Matching (assignment) data

Hierarchical Bayesian Models • Probability distributions over all permutations of items • with N items, there are N! permutations • e.g., when N=44, we have 44! > 10^53 permutations • Approximate inference methods: MCMC • Cognitively plausible generative processes • Treatment of individual differences
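A quick check of the size of the permutation space quoted above. This is only an illustrative sketch; the variable names are not from the talk.

```python
import math

# Number of distinct orderings of N items is N!
n_items = 44
n_orderings = math.factorial(n_items)

# Show the magnitude in scientific notation and compare with the slide's bound
print(f"{n_items}! is about {n_orderings:.3e}")
print("Exceeds 10^53:", n_orderings > 10**53)
```

Enumerating a space of this size is not feasible, which is why the slide turns to approximate inference (MCMC).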

Part I: Ordering Problems

Experiment 1 • Task: order all 44 US presidents • Methods • 26 participants (college undergraduates) • Names of presidents written on cards • Cards could be shuffled on large table

Measuring performance • Kendall’s Tau: the number of adjacent pair-wise swaps • Example: participant ordering 1 2 5 3 4, ground truth 1 2 3 4 5 • First swap: 1 2 5 3 4 to 1 2 3 5 4 (= 1) • Second swap: 1 2 3 5 4 to 1 2 3 4 5 (= 1+1 = 2)
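The swap count on this slide can be computed by counting item pairs whose relative order differs between the two lists, which equals the minimum number of adjacent pairwise swaps. A minimal sketch follows; the function name `kendall_tau_distance` is chosen here for illustration and is not from the talk.

```python
from itertools import combinations

def kendall_tau_distance(ordering, truth):
    """Count item pairs that appear in opposite relative order in the two lists.

    This count equals the minimum number of adjacent pairwise swaps
    needed to turn `ordering` into `truth`.
    """
    position = {item: idx for idx, item in enumerate(ordering)}
    discordant = 0
    for a, b in combinations(truth, 2):  # a precedes b in the ground truth
        if position[a] > position[b]:    # but b precedes a in the response
            discordant += 1
    return discordant

# The example from the slide: two adjacent swaps are needed
print(kendall_tau_distance([1, 2, 5, 3, 4], [1, 2, 3, 4, 5]))  # prints 2
```

A perfect ordering scores 0, and a fully reversed ordering of N items scores N(N-1)/2.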

Empirical Results [Figure: Kendall’s tau for individual participants, with random guessing marked for reference]

Many approaches for analyzing rank data… • Probabilistic models • Thurstone (1927) • Mallows (1957) • Plackett-Luce (1975) • Lebanon-Mao (2008) • Spectral methods • Diaconis (1989) • Heuristic methods from voting theory • Borda count • … however, many of these approaches were developed for preference rankings

Bayesian Thurstonian Approach • Each item (A, B, C) has a true coordinate on some dimension

Bayesian Thurstonian Approach • … but there is noise because of encoding and/or retrieval error

Bayesian Thurstonian Approach • Each person’s mental representation is based on (latent) samples of these distributions

Bayesian Thurstonian Approach • Person 1: observed ordering A < B < C • The observed ordering is based on the ordering of the samples

Bayesian Thurstonian Approach • Person 1: observed ordering A < B < C • Person 2: observed ordering A < C < B • People draw from distributions with common mean but different variances
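The generative story on these slides can be sketched as a small simulation: each person draws one noisy sample per item around a shared true coordinate and reports the items in the order of the samples. The sketch assumes Gaussian noise; the coordinates, noise levels, and the function name `simulate_ordering` are hypothetical values chosen for illustration, not the talk's.

```python
import random

def simulate_ordering(true_coords, sigma, rng):
    """Draw one noisy sample per item and return the items sorted by sample."""
    # Assumption: Gaussian noise with a common mean per item and a
    # person-specific standard deviation sigma
    samples = {item: rng.gauss(mu, sigma) for item, mu in true_coords.items()}
    return sorted(samples, key=samples.get)

rng = random.Random(0)                         # fixed seed for repeatability
true_coords = {"A": 1.0, "B": 2.0, "C": 3.0}   # hypothetical latent coordinates
sigmas = {"Person 1": 0.3, "Person 2": 1.5}    # hypothetical noise levels

for person, sigma in sigmas.items():
    print(person, simulate_ordering(true_coords, sigma, rng))
```

A person with a small sigma tends to reproduce the true order, while a person with a large sigma produces more transpositions, which is what lets the model treat sigma as a measure of individual ability.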

Graphical Model Notation • Plate: j = 1..3 • Shaded = observed • Not shaded = latent

Graphical Model of Bayesian Thurstonian Model • Latent ground truth • Individual ability • Mental representation • Observed ordering • Plate over j individuals

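Inference • Need the posterior distribution • Markov Chain Monte Carlo • Gibbs sampling on the mental representations • Metropolis-Hastings on the latent ground truth and the individual abilities • Draw 400 samples • Group ordering based on the average of the latent ground truth across samples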

Wisdom of Crowds effect • Model’s ordering is as good as the best individual [Figure: Kendall’s tau for the model and for individuals]

Inferred Distributions for 44 US Presidents (legend: median and minimum sigma) • George Washington (1), John Adams (2), Thomas Jefferson (3), James Madison (4), James Monroe (6), John Quincy Adams (5), Andrew Jackson (7), Martin Van Buren (8), William Henry Harrison (21), John Tyler (10), James Knox Polk (18), Zachary Taylor (16), Millard Fillmore (11), Franklin Pierce (19), James Buchanan (13), Abraham Lincoln (9), Andrew Johnson (12), Ulysses S. Grant (17), Rutherford B. Hayes (20), James Garfield (22), Chester Arthur (15), Grover Cleveland 1 (23), Benjamin Harrison (14), Grover Cleveland 2 (25), William McKinley (24), Theodore Roosevelt (29), William Howard Taft (27), Woodrow Wilson (30), Warren Harding (26), Calvin Coolidge (28), Herbert Hoover (31), Franklin D. Roosevelt (32), Harry S. Truman (33), Dwight Eisenhower (34), John F. Kennedy (37), Lyndon B. Johnson (36), Richard Nixon (39), Gerald Ford (35), James Carter (38), Ronald Reagan (40), George H.W. Bush (41), William Clinton (42), George W. Bush (43), Barack Obama (44)

Model is calibrated • Individuals with large sigma are far from the truth [Figure: Kendall’s tau plotted against sigma]

Alternative Models • Many heuristic methods from voting theory • E.g., Borda count method • Suppose we have 10 items • assign a count of 10 to first item, 9 for second item, etc • add counts over individuals • order items by the Borda count • i.e., rank by average rank across people
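The Borda count steps above can be sketched as follows. The function name `borda_order` and the alphabetical tie-break are assumptions made for this illustration; the five example responses are the individual orderings shown on the earlier "Goal: aggregating responses" slide.

```python
from collections import defaultdict

def borda_order(orderings):
    """Aggregate rankings by Borda count: first place earns N points, last earns 1."""
    n_items = len(orderings[0])
    counts = defaultdict(int)
    for ordering in orderings:
        for rank, item in enumerate(ordering):
            counts[item] += n_items - rank   # N for first item, N-1 for second, ...
    # Highest total first; ties broken by item name (an arbitrary choice)
    return sorted(counts, key=lambda item: (-counts[item], item))

responses = ["ADBC", "DABC", "BADC", "ACBD", "ABDC"]
print(borda_order([list(r) for r in responses]))  # prints ['A', 'B', 'D', 'C']
```

Because every person contributes the same number of points, this method weights all individuals equally, unlike the Thurstonian model, which infers a separate sigma per person.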

Model Comparison [Figure: Kendall’s tau by model]

Experiment 2 • 78 participants • 17 problems each with 10 items • Chronological Events • Physical Measures • Purely ordinal problems, e.g. • Ten Amendments • Ten Commandments

Ordering states west-east • Oregon (1), Utah (2), Nebraska (3), Iowa (4), Alabama (6), Ohio (5), Virginia (7), Delaware (8), Connecticut (9), Maine (10) [Figure: Kendall’s tau plotted against sigma, R=0.961]

Ordering Ten Amendments

Ordering Ten Commandments • Worship any other God (1), Make a graven image (7), Take the Lord's name in vain (2), Break the Sabbath (3), Dishonor your parents (4), Murder (6), Commit adultery (8), Steal (5), Bear false witness (9), Covet (10) [Figure: Kendall’s tau plotted against sigma, R=0.722]

Average results over 17 Problems [Figure: mean Kendall’s tau for the Thurstonian Model, Borda count, Mode, and Individuals; x-axis: individuals]

Effect of Group Composition • How many individuals do we need to average over?

Effect of Group Size: random groups [Figure: Kendall’s tau by group size]

Experts vs. Crowds • Can we find experts in the crowd? Can we form small groups of experts? • Approach • Form a group for some particular task • Select individuals with the smallest sigma (“experts”) based on previous tasks • Vary the number of previous tasks
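The selection step above can be sketched as picking the people with the smallest sigma estimated on previous tasks. Averaging sigma across previous tasks is an assumption of this sketch (the slide does not say how the estimates are combined), and the names `select_experts` and `sigma_history` and the numbers are hypothetical.

```python
from statistics import mean

def select_experts(sigma_history, group_size):
    """Return the `group_size` people with the smallest average sigma.

    `sigma_history` maps each person to a list of sigma estimates,
    one per previous task.
    """
    avg_sigma = {person: mean(values) for person, values in sigma_history.items()}
    # Smallest average sigma first; ties broken by person id
    ranked = sorted(avg_sigma, key=lambda person: (avg_sigma[person], person))
    return ranked[:group_size]

# Hypothetical sigma estimates from two previous tasks
sigma_history = {
    "p1": [0.4, 0.6],
    "p2": [1.8, 2.1],
    "p3": [0.9, 0.7],
    "p4": [0.3, 0.5],
}
print(select_experts(sigma_history, 2))  # prints ['p4', 'p1']
```

This selection needs no feedback about the true answers, since sigma is inferred by the model rather than computed from the ground truth.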

Group Composition based on prior performance [Figure: Kendall’s tau by group size (best individuals first); panels for number of previous tasks T = 0, T = 2, T = 8]

Methods for Selecting Experts • Endogenous: no feedback required • Exogenous: selecting people based on actual performance [Figure: Kendall’s tau for each selection method]

Model incorporating overall person ability • Overall ability (j individuals) • Task specific ability (j individuals, m tasks)

Average results over 17 Problems [Figure: mean Kendall’s tau, including the new model]

Part II: Ordering Problems in Episodic Memory

Another ordering problem: order A, B, C, D in time • Video: http://www.youtube.com/watch?v=29VGZtnCD30&feature=related

Experiment 3 • 26 participants • 6 videos • 3 videos with stereotyped event sequences (e.g., wedding) • 3 “unpredictable” videos (e.g., example video) • Extracted 10 stills for testing • Method • Study video • Followed by immediate ordering test of 10 items

Bayesian Thurstonian Model [Figure: example with Kendall’s tau = 3]

Two other examples [Figures: Kendall’s tau = 1 and Kendall’s tau = 0]

Overall Results [Figure: mean Kendall’s tau]

Part III: Matching Problems

Example Matching Problem (one-to-one)
