Explaining Human Multiple Object Tracking with Resource-Constrained Inference

Explaining human multiple object tracking as resource-constrained approximate inference in a dynamic probabilistic model
Ed Vul, Mike Frank, George Alvarez, Josh Tenenbaum
Support from: ONR-MURI (PI: Bavelier); NDSEG (Vul); NSF (Vul)

[Demo display: letter array X N L P F J D Q R G A U]

Multiple object tracking What limits performance?

What limits our performance? Juice? Slots?
[Plot: number tracked out of 16 total (Alvarez, Franconeri, & Cavanagh).]
(Sperling, 1960; Vogel & Machizawa, 2003)

Explanations of success and failure
Phenomena:
• Harder when objects are faster (Alvarez & Franconeri, 2007)
• Easier when they are further apart (Franconeri et al., 2008)
• We remember object velocity (Horowitz, 2008)
• But we don’t seem to use it (Keane & Pylyshyn, 2005)
• Well, maybe sometimes (Fencsik et al., 2005)
• Keeping track of features is hard (Pylyshyn, 2004)
• But we can track in color space (Blaser, Pylyshyn, & Holcombe, 2000)
• And unique colors help (Makovski & Jiang, 2009)
• We can only track a few objects (Pylyshyn & Storm, 1988; Intriligator et al., 2001)
• But we can track more if they are slower (Alvarez & Franconeri, 2007)
Explanations: Crowding? Spotlights? Inhibitory surrounds? FINSTs? Speed limits? Slots? Juice? Juice boxes?
Must specify performance of an unconstrained observer before postulating limitations.

Stark contrast with the real world

Puzzles
• What limits human performance in object tracking?
• Why the discrepancy between poor performance in the lab and robust behavior in the real world?

Our approach
• Look to engineering to see how people should track objects.
• Do people track objects this way?
• What resources would limit performance?
• How should these resources be allocated?
• Do people allocate resources in this manner?

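An ideal observer for MOT
S: states; m: observations; α: data association
[Graphical model: states S0, S1, …, St are linked by the dynamics; at each time step the measurements mt of the objects (labeled A, B, C, D) depend on the state St through the unknown data association αt.]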

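Dynamics and process uncertainty
[Graphical model: states S0, S1, …, St linked by the dynamics, with data associations α0 … αt and measurements m0 … mt.]
Parameterize the dynamics by inertia and by the spacing / velocity variance of the stationary distribution.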
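As an illustration of dynamics with inertia and a stationary distribution, here is a minimal sketch. The slide does not give the update equations, so the linear-Gaussian form below (velocity carried over by `inertia`, pulled back toward the origin by `spring`, plus Gaussian process noise) is an assumption, and every name and default value is hypothetical.

```python
import numpy as np

def transition_matrix(inertia, spring):
    """Transition matrix for the stacked state [x, v] (assumed form).

    v[t] = inertia * v[t-1] - spring * x[t-1] + noise
    x[t] = x[t-1] + v[t]
    """
    return np.array([[1.0 - spring, inertia],
                     [-spring, inertia]])

def simulate_dots(n_dots=6, n_steps=500, inertia=0.9, spring=0.01,
                  noise_sd=0.05, seed=0):
    """Simulate 1-D dot trajectories under the assumed dynamics."""
    rng = np.random.default_rng(seed)
    x = np.zeros((n_steps, n_dots))
    v = np.zeros((n_steps, n_dots))
    for t in range(1, n_steps):
        w = rng.normal(0.0, noise_sd, size=n_dots)  # process noise
        v[t] = inertia * v[t - 1] - spring * x[t - 1] + w
        x[t] = x[t - 1] + v[t]
    return x, v

x, v = simulate_dots()
A = transition_matrix(0.9, 0.01)
# A stationary distribution exists only if every eigenvalue of A
# lies strictly inside the unit circle.
spectral_radius = np.abs(np.linalg.eigvals(A)).max()
print("spectral radius:", spectral_radius)
print("empirical position / velocity std:", x[100:].std(), v[100:].std())
```

Under these assumptions, higher inertia makes velocity more predictable from one frame to the next, while the spring term and noise level together set how widely the dots spread and how fast they move in the long run.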

Measurement uncertainty
• Eccentricity scaling of position errors.
• Weber scaling of velocity errors.
• Unknown data associations.
[Graphical model: states S0 … St, data associations α0 … αt, measurements m0 … mt.]

An ideal observer for MOT
• Measurement: mt
• Assignment: P(αt | mt, S’t) (sampled by particle filter)
• Estimate: P(St | αt, mt, S’t) (Kalman filter)
• Prediction: P(S’t+1 | St, dynamics) (Kalman filter)
[Diagram: the four steps form a cycle over objects A, B, C, D.]
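The measurement, assignment, estimate, prediction cycle can be sketched for one time step as follows. This is a simplified illustration, not the authors' implementation: it assumes 1-D positions, random-walk dynamics, a single sampled assignment rather than a population of particles, and brute-force enumeration of permutations (feasible only for a handful of objects). All function and variable names are hypothetical.

```python
import itertools
import numpy as np

def predict(mean, var, process_var):
    """Prediction under random-walk dynamics: mean unchanged, variance grows."""
    return mean, var + process_var

def sample_assignment(pred_mean, pred_var, meas, meas_var, rng):
    """Sample which measurement belongs to which object.

    Each permutation p (object i gets measurement p[i]) is drawn with
    probability proportional to its Gaussian likelihood. The normalizing
    constants are identical for every permutation, so they are dropped.
    """
    perms = list(itertools.permutations(range(len(meas))))
    s = pred_var + meas_var  # innovation variance per object
    log_w = np.array([
        np.sum(-0.5 * (meas[list(p)] - pred_mean) ** 2 / s) for p in perms
    ])
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return np.array(perms[rng.choice(len(perms), p=w)])

def update(pred_mean, pred_var, assigned_meas, meas_var):
    """Kalman update of each object's estimate given its assigned measurement."""
    gain = pred_var / (pred_var + meas_var)
    mean = pred_mean + gain * (assigned_meas - pred_mean)
    var = (1.0 - gain) * pred_var
    return mean, var

rng = np.random.default_rng(0)
mean = np.array([0.0, 10.0, 20.0])   # current position estimates
var = np.full(3, 1.0)                # current uncertainty
meas = np.array([19.8, 0.3, 10.1])   # unlabeled measurements, shuffled order

pm, pv = predict(mean, var, process_var=0.5)
alpha = sample_assignment(pm, pv, meas, 0.25, rng)
mean, var = update(pm, pv, meas[alpha], 0.25)
print("assignment:", alpha)
print("updated means:", mean)
```

When objects are far apart relative to their uncertainty, one assignment dominates; as objects get closer or faster, several permutations become comparably likely and the sampled assignment can swap a target with a distractor.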

An ideal observer for MOT

Speed/space for people (Franconeri et al., 2008)

Speed/space for the ideal observer
The correspondence problem is harder as speed increases or spacing decreases.
[Panels: faster; closer.]

Speed/space for the ideal observer
[Plot: model thresholds. x-axis: spatial concentration (1/σx), higher = closer; y-axis: velocity variance (σv), higher = faster.]

Speed/space tradeoff experiment
Adjust the minimal spacing for a given fixed speed.
Track 3 of 6.

Speed/space tradeoff
[Plot: human thresholds and model thresholds. x-axis: spatial concentration (1/σx), higher = closer; y-axis: velocity variance (σv), higher = faster.]
People make similar tradeoffs between speed and space as the ideal observer.

Incorporating extra dynamic cues

Incorporating extra dynamic cues
[Plot: human thresholds and model thresholds for color-drift = 0.02π and color-drift = 0.2π. x-axis: spatial concentration (1/σx), higher = closer; y-axis: velocity variance (σv), higher = faster.]
People make similar tradeoffs between speed/space and color as the ideal observer.

Predictive use of inertia
[Plots: model thresholds and human thresholds for inertia = 0.7 and inertia = 0.9, with inertia stable vs. inertia unstable. x-axis: spatial concentration (1/σx), higher = closer; y-axis: velocity variance (σv), higher = faster.]
People use velocity when appropriate.

Ideal object tracking
Phenomena explained:
• Speed
• Spacing
• Intermittent use of inertia
• Use of additional cues
• Track duration
• Additional distracters
Not yet explained (???):
• We can only track a few objects (Pylyshyn & Storm, 1988; Intriligator et al., 2001)
• But we can track more if they are slower (Alvarez & Franconeri, 2007)
[Plot: number tracked out of 16 total.]

Resources in tracking objects
Limited ability to propagate state estimates (limited memory fidelity):
P(S’t+1 | St + wm, dynamics)
A resource that reduces noise in the propagation of state estimates:
Sw[i] = Sw / (1 + √ai)
[Graphical model: St and St+1 linked by the dynamics, with data associations αt, αt+1 and measurements mt, mt+1.]
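The formula Sw[i] = Sw / (1 + √ai) can be worked through numerically. The sketch below assumes that Sw is the baseline memory-noise level and ai is the share of the resource given to object i; the slide does not define these symbols explicitly, so that reading, and the function name, are assumptions.

```python
import math

def memory_noise(base_noise, allocations):
    """Per-object noise Sw[i] = Sw / (1 + sqrt(a_i)).

    base_noise:  Sw, assumed to be the noise level with no resource allocated.
    allocations: a_i for each object, assumed to be nonnegative shares.
    """
    return [base_noise / (1.0 + math.sqrt(a)) for a in allocations]

# One object gets everything vs. four objects sharing equally.
focused = memory_noise(1.0, [1.0, 0.0, 0.0, 0.0])
shared = memory_noise(1.0, [0.25, 0.25, 0.25, 0.25])
print("focused:", focused)
print("shared: ", shared)
```

Under this reading, an object with no allocation keeps the full baseline noise, an object with the whole resource has its noise halved, and spreading the resource over more targets leaves each one noisier, which is the direction needed for a speed-number tradeoff.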

Number of targets
[Plot: model thresholds and human thresholds (Alvarez & Franconeri, 2007). x-axis: number of targets (1 to 8) tracked out of 16 total; y-axis: velocity variance (σv), 0.1 to 0.3, higher = faster.]
Limited memory produces the speed-number tradeoff.

Strategic allocation of resources
• Allocating resources
• Planning by greedy 1-step look-ahead sampling.
• Make one object more demanding.
[Figure: model vs. human allocation across targets vs. distractors, position vs. velocity, and objects 1 … n.]

Contributions
• Ideal observer for human object tracking based on common engineering models…
• …accounts for effects of speed, space, inertia, time, additional cues, and distracters.
• A resource such as finite memory for state estimates accounts for target-number effects.
• Teaser about strategic allocation of resources.

[Diagram: resource allocation across targets vs. distractors, position vs. velocity, and objects 1 … n.]

How do people track objects?
• Harder to track when objects are faster (Alvarez & Franconeri, 2007)
• Easier when they are further apart (Franconeri et al., 2008)
• We remember object velocity (Horowitz, 2008)
• But we don’t seem to use it (Keane & Pylyshyn, 2005)
• Yet it helps us track (Fencsik et al., 2005)
• Keeping track of identity is hard (Pylyshyn, 2004)
• But we can track in color space (Blaser, Pylyshyn, & Holcombe, 2000)
• And unique colors help (Makovski & Jiang, 2009)
• We can only track a few objects (Pylyshyn & Storm, 1988; Intriligator et al., 2001)
• But we can track more if they are slower (Alvarez & Franconeri, 2007)

Inertia and trajectories
• Assumed “inertia” predicts future dot position.
[Panels: with inertia; without inertia.]

Inertia and trajectories
[Plots: model with inertia and model without inertia, compared with human settings for inertia = 0.7, 0.8, and 0.9; inertia varied across subjects and across trials. x-axis: position std. dev. (σx), higher = more space; y-axis: velocity std. dev. (σv), higher = faster.]
People use inertia / extrapolated trajectories to track like the ideal observer: only when it is stable and predictive.

Why only use inertia sometimes? • Underestimating inertia under uncertainty?

Allocating a finite resource
• What allocation policies do people consider?
• Just target vs. non-target?
• Or can people adapt more flexibly to task demands and allocate to individual items?
[Diagram: candidate allocation policies. Beta(T, D) over targets vs. distractors; Beta(p, v) over position vs. velocity; Dirichlet(Α) over objects 1 … n.]
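To make the contrast between the two policy families concrete, here is a minimal sketch that draws one allocation from each. It assumes that Beta(T, D) describes the share of the resource going to targets as a group and that the Dirichlet describes shares for individual objects; the slide only names the distributions, so this reading, the function names, and the parameter values are all hypothetical.

```python
import numpy as np

def target_vs_distractor(rng, n_targets, n_distractors, t=4.0, d=1.0):
    """Coarse policy: one Beta(t, d) draw splits the resource between the
    target group and the distractor group; each group shares equally."""
    share = rng.beta(t, d)
    return np.concatenate([
        np.full(n_targets, share / n_targets),
        np.full(n_distractors, (1.0 - share) / n_distractors),
    ])

def per_object(rng, n_objects, concentration=1.0):
    """Flexible policy: a Dirichlet draw gives every object its own share."""
    return rng.dirichlet(np.full(n_objects, concentration))

rng = np.random.default_rng(0)
coarse = target_vs_distractor(rng, n_targets=4, n_distractors=4)
flexible = per_object(rng, n_objects=8)
print("coarse:  ", np.round(coarse, 3))
print("flexible:", np.round(flexible, 3))
```

Under the coarse policy every target receives the same share, so one unusually demanding target cannot draw resources away from the others; under the per-object policy it can.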

How flexible is resource allocation?
• Track 4 of 8.
• One tracked object (and one distractor) is “Crazy Charlie”: Charlie moves faster or slower than the other targets.
• Measure performance on the other 3 targets.
• Key question: can we flexibly allocate to individual objects based on need?
• If so, accuracy on the other targets should decrease as Charlie’s speed increases.
• If we only allocate to targets vs. non-targets, Charlie’s speed should not matter.

Optimal Crazy Charlie performance
[Plot: x-axis: Log10(Charlie speed / other speed); y-axis: accuracy on non-Charlie targets.]

Crazy Charlie experimental results
• Speed of the “key” target alters performance on the other targets.
• Tracking one harder target “steals” resources from the others.
[Plot: conditions range from slow “Charlie” to fast “Charlie”.]

Multiple object tracking (Pylyshyn & Storm, 1988; and many others.)

Tracking in non-spatial dimensions
Are spatial and non-spatial features combined to track objects? (Blaser, Pylyshyn, & Holcombe, 2000)

Tracking in non-spatial dimensions

Tracking in non-spatial dimensions
Are spatial and non-spatial features combined to track objects? (Blaser, Pylyshyn, & Holcombe, 2000)
[Plot: human settings and model thresholds for color-drift = 0.02π and color-drift = 0.2π. x-axis: position std. dev. (σx), higher = more space; y-axis: velocity std. dev. (σv), higher = faster.]
People trade off spatial and non-spatial information when tracking.

An ideal observer for MOT
How should people track objects given the available information?
Given:
1) Starting state: α0, m0
2) Unlabeled measurements: m1, … mt
3) Model of the dynamics.
Find:
• Final labels: αt
• Final state: St
Invert to compute: P(St, αt | m0, … mt-1, α0)
[Graphical model: states S0, S1, …, St linked by the dynamics, with unknown data associations α0 … αt and measurements m0 … mt of objects A, B, C, D.]
