Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework
Li-Jia Li, Richard Socher, Li Fei-Fei
Example image: City Travel (class); Pagoda (segment); Sunrise, Sunshine, Sun (tags)
Related work on the three tasks (example image: Classification: City Travel; Segmentation: Pagoda; Annotation: Sunrise, Sunshine, Sun):
Weber et al 00, Fergus et al 03, Felzenszwalb et al 04, Fei-Fei et al 05, Sivic et al 05, Bosch et al 06, Oliva et al 01, Lazebnik et al 06, Duygulu et al 02, Shi et al 00, Blei et al 03, Sali et al 99, Winn et al 05, Kumar et al 05, Gupta et al 08, Barnard et al 03, Cao et al 07, Russell et al 06, Todorovic et al 06, Alipr (Li et al 03), Sudderth et al 05
Remark: Approaches in yellow will be used to compare with our model in later experiments.
Total Scene Understanding: the union (∪) of Classification, Segmentation and Annotation, covering the same prior work as the previous slide.
Classification, Annotation and Segmentation are mutually beneficial!
Example image
Classification: class: Polo
Annotation: Athlete, Horse, Grass, Trees, Sky, Saddle
Segmentation: Sky, Tree, Athlete, Horse, Grass
Pairs of tasks: Annotation and Classification; Classification and Segmentation; Segmentation and Annotation (example: Class: Polo; regions: Sky, Tree, Athlete, Horse, Grass)
Related Work: Tu et al 03, Heitz et al 08, Li & Fei-Fei 07
Outline
• Model
• Learning
• Recognition & Experiment
Tasks: Classification, Annotation, Segmentation
Model (example image, class: Polo; tags: Athlete, Horse, Grass, Trees, Sky, Saddle)
The model has a Visual Component and a Text Component and defines a joint distribution of its random variables.
• C: class, e.g. Polo
• O: object
• R: region features of NF types: Color, Location, Texture, Shape
• X: image patches, Ar per region
• Z: “Connector variable”
• S: “Switch variable”: Visible / Not visible
• T: text (tag)
• Plate labels: Ar, NF, Nr, Nt, D
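The text component can be read as a small generative procedure. The sketch below is one hedged reading of it, not the authors' code: for each tag, a connector z picks one region, a switch s decides whether that region's object is visible in the tag, and the tag t is drawn from an object-specific distribution if visible, or from a shared non-visual distribution otherwise. The function name, the uniform choice of z, and all probability tables are assumptions made for illustration.

```python
import numpy as np

def sample_tags(region_objects, p_visible, tag_given_object,
                tag_not_visible, n_tags, rng):
    """Toy sampler for tags (hypothetical; every table is an assumed input).

    region_objects   : object index of each region in one image
    p_visible        : p_visible[o] = assumed probability that object o
                       produces a "visible" tag
    tag_given_object : tag_given_object[o] = assumed tag distribution
                       for visible tags of object o
    tag_not_visible  : assumed tag distribution for non-visible tags
    """
    tags = []
    for _ in range(n_tags):
        # Connector variable: pick one region (uniform is an assumption).
        z = int(rng.integers(len(region_objects)))
        o = region_objects[z]
        # Switch variable: visible or not visible.
        s = bool(rng.random() < p_visible[o])
        # Tag: object-specific if visible, shared distribution otherwise.
        probs = tag_given_object[o] if s else tag_not_visible
        t = int(rng.choice(len(probs), p=probs))
        tags.append((z, s, t))
    return tags
```

Calling `sample_tags` with `rng = np.random.default_rng(0)` returns a list of `(z, s, t)` triples, one per tag, so each tag stays linked to the region that produced it.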
Outline
• Model
• Learning
• Recognition & Experiment
Learning
Exact inference is intractable!
(Diagram: relationship of the random variables C, S, O, T, X, R, Z)
Collapsed Gibbs Sampling (R. Neal, 2000)
• Top-down force
• Bottom-up force from visual information
• Bottom-up force from text information
(Diagram: relationship of the random variables C, S, O, T, X, R, Z)
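Collapsed Gibbs sampling resamples one hidden assignment at a time, given all the others, with the model parameters integrated out. A minimal sketch on a much simpler, LDA-style stand-in follows: each image is a list of discrete visual words, each word gets one hidden object label, and there is no class, tag, switch or connector variable. The function name, the symmetric Dirichlet hyperparameters `alpha` and `beta`, and the data layout are assumptions, not the model on the slides.

```python
import numpy as np

def collapsed_gibbs_objects(images, n_objects, n_words,
                            alpha=0.5, beta=0.1, n_sweeps=50, seed=0):
    """Collapsed Gibbs sampler for an LDA-style stand-in (illustrative only).

    images : list of lists of visual-word indices in [0, n_words)
    Returns the object labels plus the count tables kept by the sampler.
    """
    rng = np.random.default_rng(seed)
    # Random initial object label for every visual word.
    z = [rng.integers(n_objects, size=len(img)) for img in images]
    n_dk = np.zeros((len(images), n_objects))  # image x object counts
    n_kw = np.zeros((n_objects, n_words))      # object x word counts
    n_k = np.zeros(n_objects)                  # total words per object
    for d, img in enumerate(images):
        for i, w in enumerate(img):
            k = z[d][i]
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1
    for _ in range(n_sweeps):
        for d, img in enumerate(images):
            for i, w in enumerate(img):
                # Remove the current assignment from the counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Conditional given all other assignments:
                # image-level preference times word-level evidence.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + n_words * beta)
                k = rng.choice(n_objects, p=p / p.sum())
                # Add the new assignment back.
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw
```

In this stand-in the factor `n_dk[d] + alpha` plays the role of a top-down preference from the image as a whole, and `n_kw[:, w] + beta` plays the role of bottom-up evidence from the observed word; the slide's model adds a text-side force that this sketch leaves out.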
There is no object-text correspondence…
Scene/Event images from the Internet come with tags only: Athlete, Horse, Grass, Tree, Saddle
Our model builds the correspondence…
Scene/Event images from the Internet; tags: Athlete, Horse, Grass, Tree, Saddle
However, there is a big obstacle: many objects always co-occur, so it is unclear which tag belongs to which region.
Scene/Event images from the Internet; example tags: Athlete, Horse, Grass, Ball; Athlete, Horse, Grass, Trees, Sky, Saddle
One solution: some good initialization of O
Scene/Event images from the Internet; example tags: Athlete, Horse, Grass, Trees, Sky, Saddle
Initializing O: obtain Internet images for each O (object images, in addition to the scene/event images)
Initializing O: train an object detector for each O on the object images, using any object detection & segmentation algorithm
Initialize O in the scene images by the trained object detectors
• Object images: used to train one black-box object detection & segmentation model per object
• Black box shown on the slide: Cao & Fei-Fei, 2007 (variables θ, C, O, R, X; plate labels Ar, Nr)
• Event/Scene images: the detector outputs initialize O in Our Model
Auto-semi-supervised learning: small # of initialized images + large # of uninitialized images
Both sets of scene/event images are fed to Our Model.
Example tags: Athlete, Snow, Rock, Horse, Sky, Grass, Tree, Rope, Wind, Snowboard, Saddle
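One way to picture the mix of initialized and uninitialized images is as the starting state handed to the sampler. The sketch below assumes a hypothetical dictionary `detector_labels` that holds per-region object indices for the few images the detectors were run on; every other image gets random labels, which is an assumption here, since the slides do not say how uninitialized images start.

```python
import numpy as np

def initialize_objects(n_regions_per_image, detector_labels, n_objects, seed=0):
    """Build a starting object label for every region (illustrative only).

    n_regions_per_image : number of regions in each image
    detector_labels     : dict {image index: list of object indices}, a
                          hypothetical stand-in for detector output
    """
    rng = np.random.default_rng(seed)
    init = []
    for d, n_regions in enumerate(n_regions_per_image):
        if d in detector_labels:
            # Small set: keep the labels proposed by the detectors.
            labels = np.asarray(detector_labels[d], dtype=int)
            if len(labels) != n_regions:
                raise ValueError(f"image {d}: expected {n_regions} labels")
        else:
            # Large set: no detector output, start from random labels
            # (an assumption of this sketch).
            labels = rng.integers(n_objects, size=n_regions)
        init.append(labels)
    return init
```

For example, `initialize_objects([3, 2, 4], {0: [1, 0, 1]}, n_objects=3)` keeps the three given labels for image 0 and draws random labels for the other two images.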
Outline
• Model
• Learning: small # of automatically initialized images + large # of uninitialized images
• Recognition & Experiment
  • Dataset
  • Learned Model
  • Results
8 Event/Scene Classes: Badminton, Bocce, Croquet, Polo, Rock climbing, Rowing, Sailing, Snowboarding
Remark: Tags are not used during testing
Learned model: O
Learned model: R (examples: Athlete, Grass, Horse)
Learned model: S
Classification: 8-way classification: 54%
Annotation: compared with Alipr (Li et al 03) and Corr LDA (Blei et al 03)
Segmentation (results figure)
Effect of top-down class context: Model w/o top-down class (S, O, T, R, X, Z) vs. Full Model (C, S, O, T, R, X, Z); example object: Horse
Model, Learning, Recognition & Experiment
• Learning: small # of automatically initialized images + large # of uninitialized images
• Example outputs:
  • Class: Sailing (Athlete, Sailboat, Tree, Water, Sky, Wind)
  • Class: Rock climbing (Athlete, Mountain, Tree, Rock, Sky, Ascent)
  • Class: Snowboarding (Athlete, Snowboard, Tree, Snow, Sky, Powder)
Thanks to Prof. Silvio Savarese, Juan Carlos Niebles, Chong Wang, Barry Chai, Min Sun, Bangpeng Yao, Hao Su, Jia Deng, the anonymous reviewers, and you.