General architecture
CS 664, Session 19
Minimal Subscene
• Working definition: The smallest set of objects, actors and actions in a dynamic visual scene that are relevant to present behavior
For now we will assume:
• Bottom-up: objects/actors/actions must be visible
• Top-down: relevance to present behavior explicitly specified, e.g., by specifying a question or task
• Knowledge base: the system may supplement explicit knowledge with long-term acquired knowledge
Motivation: Humans
1) Free examination
2) Estimate material circumstances of family
3) Give ages of the people
4) Surmise what family has been doing before arrival of “unexpected visitor”
5) Remember clothes worn by the people
6) Remember position of people and objects
7) Estimate how long the “unexpected visitor” has been away from family
Yarbus, 1967
“Beobot”
Visual Attention: see http://iLab.usc.edu
Object Recognition: Riesenhuber & Poggio, Nat Neurosci, 1999 (MIT)
Action Recognition: Oztop & Arbib, 2001
Start:
• Issue question
• Parse question
• Extract keywords
• Expand to related concepts, using ontology/KB
• Fill initial “task list”
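The start-up steps above can be sketched as a small pipeline. This is only an illustration: the stop-word list, the toy ontology and the function names are assumptions made for the example, not part of the system described on the slides.

```python
import re

# Hypothetical stop words and toy ontology; neither comes from the slides.
STOP_WORDS = {"who", "what", "will", "is", "the", "a", "an", "to", "of", "in"}
ONTOLOGY = {
    "catch": ["hand", "ball", "throw"],
    "ball": ["sphere"],
}


def extract_keywords(question):
    """Parse the question into lower-case words and drop stop words."""
    words = re.findall(r"[a-z]+", question.lower())
    return [w for w in words if w not in STOP_WORDS]


def expand(keywords, ontology):
    """Add concepts directly related to each keyword (one hop only),
    keeping first-seen order and skipping duplicates."""
    expanded = []
    for keyword in keywords:
        for concept in [keyword] + ontology.get(keyword, []):
            if concept not in expanded:
                expanded.append(concept)
    return expanded


def initial_task_list(question, ontology=ONTOLOGY):
    """Issue question -> keywords -> related concepts -> initial task list."""
    return expand(extract_keywords(question), ontology)


print(initial_task_list("Who will catch the ball?"))
```

With this toy ontology the question yields the keywords "catch" and "ball", which are then expanded with their directly related concepts.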
Task list
Working list of currently relevant objects/actors/actions
• Initially empty
• Question/task specification provides initial filling-in
• As the scene is scanned and objects/actors/actions are recognized, contents of task list are updated
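A minimal sketch of such a working list follows. The slides do not say how the update is decided, so the rule used here (a newly recognized entity is added when a knowledge base links it to something already on the list) is an assumption, as are the class and method names.

```python
class TaskList:
    """Working list of currently relevant objects/actors/actions."""

    def __init__(self):
        self.items = set()  # initially empty

    def fill_from_task(self, concepts):
        """Initial filling-in from the question/task specification."""
        self.items.update(concepts)

    def update(self, recognized, related):
        """Called when an entity is recognized in the scene.

        Assumed rule: the entity becomes relevant if it is already listed,
        or if `related` (a hypothetical KB lookup table) links it to a
        listed item. Returns True when the entity is relevant.
        """
        if recognized in self.items:
            return True
        if self.items & set(related.get(recognized, ())):
            self.items.add(recognized)
            return True
        return False


task_list = TaskList()
task_list.fill_from_task(["catch", "ball"])
kb = {"glove": ["catch"], "tree": ["forest"]}  # hypothetical KB links
print(task_list.update("glove", kb), task_list.update("tree", kb))
```

Here "glove" joins the list because the toy KB relates it to "catch", while "tree" is ignored.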
“Where:” attention, saliency map and task map
• Input: video stream
• Low-level vision: massively parallel extraction of simple visual features from video input
• Saliency map: localizes conspicuous (potentially interesting) objects irrespective of why they are salient
• Task map: acts as a spatial filter on the saliency map; only locations in the current minimal subscene can easily pass through. Other locations need to be exceptionally salient to pass through.
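One simple way to picture the task map as a spatial filter is a per-location gain on the saliency map. The sketch below assumes a multiplicative gate with a gain of 1.0 inside the minimal subscene and a small hypothetical `outside_gain` elsewhere; the slides do not specify the actual filtering rule, and the maps here are tiny hand-made grids rather than real video features.

```python
def gate_saliency(saliency, task_map, outside_gain=0.2):
    """Scale each saliency value by 1.0 inside the minimal subscene
    (task_map is True) and by `outside_gain` elsewhere."""
    return [
        [s * (1.0 if inside else outside_gain) for s, inside in zip(s_row, t_row)]
        for s_row, t_row in zip(saliency, task_map)
    ]


def next_fixation(saliency, task_map, outside_gain=0.2):
    """Return the (row, col) of the strongest location after gating."""
    gated = gate_saliency(saliency, task_map, outside_gain)
    _, location = max(
        (value, (r, c))
        for r, row in enumerate(gated)
        for c, value in enumerate(row)
    )
    return location


task_map = [[True, False],
            [False, False]]

# A moderately salient task-relevant location beats a stronger distractor.
print(next_fixation([[0.5, 0.9], [0.3, 0.1]], task_map))

# An exceptionally salient location outside the subscene still passes through.
print(next_fixation([[0.1, 0.9], [0.3, 0.1]], task_map))
```

The second call shows the "exceptionally salient" case: the distractor wins only because its attenuated value still exceeds everything inside the subscene.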
“What” memory
• Relates concepts to visual properties
• Bridge between visual and semantic knowledge
General architecture
Examples / experiments
• Examine video clips
• For each scene, please write down:
• Most salient object
• Most salient action
• Minimal subscene
• Who is doing what to whom
Scene 001
Scene 001 – Attentional Trajectory
Scene 002
Scene 002 – Attentional Trajectory
Scene 003
Scene 003 – Attentional Trajectory
Scene 004
Scene 004 – Attentional Trajectory
Scene 005
Scene 005 – Attentional Trajectory
Scene 006
Scene 006 – Attentional Trajectory
Scene 007
Scene 007 – Attentional Trajectory
Scene 008
Scene 008 – Attentional Trajectory
Scene 009
Scene 009 – Attentional Trajectory
Scene 010
Scene 010 – Attentional Trajectory
Scene 011
Scene 011 – Attentional Trajectory
Scene 012
Scene 012 – Attentional Trajectory
Scene 013
Scene 013 – Attentional Trajectory
Scene 014
Scene 014 – Attentional Trajectory
Scene 015
Scene 015 – Attentional Trajectory
Scene 016
Scene 016 – Attentional Trajectory
Scene 017
Scene 017 – Attentional Trajectory