Understanding Minimal Subscene Selection in Visual Scenes

Minimal Subscene
• Working definition: The smallest set of objects, actors and actions in a dynamic visual scene that are relevant to present behavior.
For now we will assume:
• Bottom-up: objects/actors/actions must be visible
• Top-down: relevance to present behavior is explicitly specified, e.g., by specifying a question or task
• Knowledge base: the system may supplement explicit knowledge with long-term acquired knowledge
CS 664, Session 20
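The working definition and its three assumptions can be read as a filter over the entities in a scene. The following is only a minimal sketch, not the course's model: the `Entity` record, the `KNOWLEDGE_BASE` mapping, and the `minimal_subscene` function are hypothetical names. Relevance is reduced to simple label matching. Minimality is approximated by keeping nothing that is not relevant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    label: str     # e.g. "boy", "scooter", "riding"
    kind: str      # "object", "actor", or "action"
    visible: bool  # bottom-up assumption: must be visible


# Hypothetical long-term knowledge: task term -> associated labels.
KNOWLEDGE_BASE = {"scooter": {"boy", "riding"}}


def minimal_subscene(entities, task_terms, knowledge_base=KNOWLEDGE_BASE):
    """Keep visible entities whose label is relevant to the task."""
    # Top-down: relevance is explicitly given by the task terms.
    relevant = set(task_terms)
    # Knowledge base: supplement the explicit terms with learned associations.
    for term in task_terms:
        relevant |= knowledge_base.get(term, set())
    # Bottom-up: only visible entities can be selected.
    return [e for e in entities if e.visible and e.label in relevant]


scene = [
    Entity("boy", "actor", True),
    Entity("scooter", "object", True),
    Entity("riding", "action", True),
    Entity("tree", "object", True),   # visible but irrelevant
    Entity("dog", "actor", False),    # not visible
]
subscene = minimal_subscene(scene, {"scooter"})
print([e.label for e in subscene])  # ['boy', 'scooter', 'riding']
```

In this toy scene the task term "scooter" pulls in "boy" and "riding" through the assumed knowledge base, while the tree (irrelevant) and the dog (not visible) are left out.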

General architecture

Factors influencing selection of minimal subscene
At least include:
• Setting/gist/layout
• Bottom-up salience
• Cultural/learned
• Top-down

Factors influencing selection of minimal subscene (1)
• Setting/gist/layout: selected objects/actors/actions tended to be:
- at center of field of view
- in the foreground / occluding other objects/actors/actions
- followed by camera if camera moved
- present throughout video clip
- often getting closer / growing larger
• E.g., boy playing with scooter; bare-chested man standing & drinking
Caveats:
- lack of stereo increases foreground/background interference
- having everything in focus is unnatural
- ambiguity in selection of minimal subscene if actors pass by
- selected minimal subscene may disintegrate due to occlusions

Factors influencing selection of minimal subscene (2)
• Bottom-up salience: introspection as well as the model suggest that selected objects/actors/actions were fairly salient
• E.g., boy riding scooter; bare-chested man
• Note: motion cues were widely agreed to be the strongest
Caveats:
- low-quality video makes details difficult to perceive
- salient distracting actors may disengage attention from current minimal subscene

Factors influencing selection of minimal subscene (3)
• Cultural/learned: some actors/objects/actions may carry strong cultural meaning that makes them likely to belong to the minimal subscene
• E.g., finger pointing movement; facial expressions; alpha male
Caveats:
- culture-specific (? – not tested)
- experience-specific (? – not tested)

Factors influencing selection of minimal subscene (4)
• Top-down: behavioral priorities and personal preferences influence selection of components of minimal subscene
• E.g., nerd playing with electronic gadget; handsome man; pretty girl; groups more interesting than isolated people
Caveats:
- somewhat linked to cultural factors
- gender-specific differences
- most probably influenced by nature of task, but we have not explicitly tested for that

Nature of minimal subscene
• Exploration mode: initial selection seems guided by setting/gist/layout as well as salience
• E.g., focus on salient actors at center & foreground
• Analysis mode: once locked onto a minimal subscene, all background activity becomes distracting
• Disengagement: if background distractor strong enough, may break current minimal subscene and trigger analysis of another minimal subscene
• E.g., nice girl passing by
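The exploration / analysis / disengagement cycle described above can be sketched as a small state machine. This is an illustrative sketch under stated assumptions, not the architecture from the slides: the `Mode` enum, the `step` function, the per-frame salience dictionaries, and the `DISENGAGE_THRESHOLD` value are all hypothetical, and "strong enough" is reduced to a fixed salience threshold.

```python
from enum import Enum


class Mode(Enum):
    EXPLORATION = "exploration"
    ANALYSIS = "analysis"


# Hypothetical: how salient a background distractor must be to break the lock.
DISENGAGE_THRESHOLD = 0.8


def step(mode, current, candidates):
    """Advance one frame.

    candidates maps a candidate subscene name to a salience score in [0, 1].
    Returns the new (mode, current subscene) pair.
    """
    if not candidates:
        return mode, current
    if mode is Mode.EXPLORATION:
        # Exploration: lock onto the most salient candidate.
        best = max(candidates, key=candidates.get)
        return Mode.ANALYSIS, best
    # Analysis: everything except the current subscene is a distractor.
    distractors = {k: v for k, v in candidates.items() if k != current}
    if distractors:
        strongest = max(distractors, key=distractors.get)
        if distractors[strongest] >= DISENGAGE_THRESHOLD:
            # Disengagement: switch to analyzing the new subscene.
            return Mode.ANALYSIS, strongest
    return Mode.ANALYSIS, current


frames = [
    {"boy with scooter": 0.7, "man with dog": 0.4},
    {"boy with scooter": 0.7, "man with dog": 0.5},
    {"boy with scooter": 0.6, "girl passing by": 0.9},
]
mode, current = Mode.EXPLORATION, None
history = []
for frame in frames:
    mode, current = step(mode, current, frame)
    history.append(current)
print(history)
# ['boy with scooter', 'boy with scooter', 'girl passing by']
```

In the second frame the man with the dog stays below the assumed threshold and is ignored; in the third frame the passer-by exceeds it and breaks the current minimal subscene.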

Additional caveats
• If a minimal subscene is too boring, it will easily disintegrate
• E.g., second clip with boy & dad playing with scooter: “pffff, him again, I know what he will be doing, so let’s check out what else is happening”
• Cameraman may have strong influence on which minimal subscene is selected
• E.g., by determining centering, by following some actors (or not following them – which may be perceived as unnatural and be distracting)
• In extended video clips, several minimal subscenes may be selected in sequence
• E.g., first boy with scooter, then pretty girl, then man with dog, etc.

Can we deal with all that?
• Setting/gist/layout: in principle, yes – some limited models exist
• Bottom-up salience: should be fine based on previous modeling
• Cultural/learned: very difficult for a computer system! Cues often very subtle (e.g., facial expressions) or involve complex spatial transformations (e.g., pointing to a location in 3D space)
• Top-down: should be fine based on previous modeling work

General architecture
It is important to note that the general architecture seems to support all the functions just described.

More video clips?
• Multi-threaded events / interactions
• Influence of task
• Foreground/background ambiguities
• Cross-clip continuity
• Effects of scale
• Multiple simultaneous subscenes
• Etc…

Examples / experiments
• Examine video clips
• For each scene, please write down:
- Most salient object
- Most salient action
- Minimal subscene
- Who is doing what to whom

Scene 018

Scene 018 – Attentional Trajectory
