
Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework

Li-Jia Li, Richard Socher, Li Fei-Fei


Presentation Transcript


  1. Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework. Li-Jia Li, Richard Socher, Li Fei-Fei

  2. City Travel Pagoda Sunrise Sunshine Sun

  3. Weber et al 00 Fergus et al 03 Felzenszwalb et al 04 Classification City Travel Fei-Fei et al 05 Sivic et al 05 Bosch et al 06 Oliva et al 01 Lazebnik et al 06 Segmentation Pagoda Duygulu et al 02 Shi et al 00 Sunrise Sunshine Sun Felzenszwalb et al 04 Annotation Blei et al 03 Sali et al 99 Winn et al 05 Kumar et al 05 Gupta et al 08 Barnard et al 03 Remark: Approaches in yellow will be compared with our model in the later experiments. Cao et al 07 Russell et al 06 Todorovic et al 06 Alipr Li et al 03 Sudderth et al 05

  4. Weber et al 00 Fergus et al 03 Felzenszwalb et al 04 Classification City Travel Fei-Fei et al 05 Sivic et al 05 Bosch et al 06 Oliva et al 01 Lazebnik et al 06 Total Scene Understanding Segmentation U Pagoda Duygulu et al 02 Shi et al 00 Sunrise Sunshine Sun Felzenszwalb et al 04 Annotation Blei et al 03 Sali et al 99 Winn et al 05 Kumar et al 05 Gupta et al 08 Barnard et al 03 Cao et al 07 Russell et al 06 Todorovic et al 06 Alipr Li et al 03 Sudderth et al 05

  5. Application

  6. Classification Annotation Segmentation Mutually beneficial!

  7. Classification Annotation Segmentation class: Polo Athlete Horse Grass Trees Sky Saddle Horse Horse

  8. Classification Annotation Segmentation class: Polo Sky Tree Athlete Athlete Horse Grass Trees Sky Saddle Horse Horse Horse Horse Horse Horse Grass

  9. Classification Annotation Segmentation class: Polo Athlete Horse Grass Trees Sky Saddle Horse Horse Horse Horse Horse

  10. Annotation Classification Classification Segmentation Segmentation Annotation Sky Sky Tree Tree Athlete Athlete Horse Horse Horse Horse Horse Grass Horse Horse Horse Horse Horse Horse Horse Grass Class: Polo Class: Polo Related Work: Tu et al 03 Heitz et al 08 Li & Fei-Fei 07

  11. Outline Model Classification Learning Segmentation Annotation Recognition & Experiment

  12. C S Athlete Horse Grass Trees Sky Saddle O T X R Z Ar NF Nr Nt D

  13. class: Polo C Text Visual Athlete Horse Grass Trees Sky Saddle D Visual Component Joint distribution of random variable . Text Component

  14. class: Polo C Text Visual O D . Text Component

  15. class: Polo C Text Visual O R Color Location Texture Shape NF D . Text Component

  16. class: Polo C Text Visual O X R Ar NF D . Text Component

  17. class: Polo C Text Visual Athlete Horse Grass Trees Sky Saddle O X R Z Ar NF Nr Nt D “Connector variable” . Text Component

  18. class: Polo C “Switch variable” Text Visible Not visible Visual S Athlete Horse Grass Trees Sky Saddle Athlete Horse Grass Trees Sky Saddle Athlete O Horse Horse Horse Horse Horse X R Horse Z Ar NF Nr Nt D “Connector variable” .

  19. class: Polo C “Switch variable” Text Visible Not visible Visual S Athlete Horse Grass Trees Sky Saddle Horse O T X R Z Ar NF Nr Nt D “Connector variable” .
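The roles of the "switch variable" S and the "connector variable" Z on the model slides can be illustrated with a small ancestral-sampling sketch. This is only a toy reading of the diagram, not the authors' model: the probability tables, the non-visible tag list, the fixed `p_visible` parameter and all function names are assumptions introduced for illustration, and the visual features (X, R) are left out entirely.

```python
import random

# Class, objects and most tags are taken from the slides; every
# probability and the non-visible tag list are made-up toy values.
P_OBJECT_GIVEN_CLASS = {
    "polo": {"horse": 0.4, "athlete": 0.3, "grass": 0.2, "sky": 0.1},
}
VISIBLE_TAGS = {
    "horse": ["horse", "saddle"],
    "athlete": ["athlete"],
    "grass": ["grass"],
    "sky": ["sky"],
}
NON_VISIBLE_TAGS = ["wind", "travel"]  # hypothetical non-visual tags


def sample_categorical(dist, rng):
    """Draw one key from a {value: probability} dict."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_image(scene_class, n_regions, n_tags, p_visible, rng):
    """Toy ancestral sampling: class C -> objects O -> tags T via Z and S."""
    # One object label O per region, conditioned on the class C.
    objects = [
        sample_categorical(P_OBJECT_GIVEN_CLASS[scene_class], rng)
        for _ in range(n_regions)
    ]
    tags = []
    for _ in range(n_tags):
        z = rng.randrange(n_regions)        # connector Z: which region the tag links to
        if rng.random() < p_visible:        # switch S: visible
            tags.append((rng.choice(VISIBLE_TAGS[objects[z]]), "visible", z))
        else:                               # switch S: not visible
            tags.append((rng.choice(NON_VISIBLE_TAGS), "not visible", z))
    return objects, tags


if __name__ == "__main__":
    objs, tags = sample_image("polo", n_regions=5, n_tags=4,
                              p_visible=0.7, rng=random.Random(0))
    print(objs)
    print(tags)
```

In this sketch a "visible" tag is always drawn from the vocabulary of the object in the connected region, which is the object-text correspondence the later slides refer to.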

  20. Outline Model C Text Learning Visual S O T X R Z Recognition & Experiment Ar NF Nr Nt

  21. Learning C Exact Inference is Intractable ! Text Visual S O T Relationship of the random variables X R Z Ar NF Nr Nt

  22. Collapsed Gibbs Sampling (R. Neal, 2000) C Text Top-down force Visual Bottom-up force from visual information S O Bottom-up force from text information T Relationship of the random variables X R Z Ar NF Nr Nt
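As a generic illustration of collapsed Gibbs sampling, the sketch below runs it on a much simpler LDA-style topic model rather than on the model from the slides. The function name, the hyperparameters `alpha` and `beta`, and the toy data are assumptions; the point is only the pattern of removing one latent assignment from the counts, resampling it from a conditional in which the multinomial parameters have been integrated out, and adding it back.

```python
import random


def collapsed_gibbs(docs, n_topics, vocab_size, alpha, beta, n_iters, seed=0):
    """Collapsed Gibbs sampling for a toy LDA-style topic model.

    docs is a list of documents, each a list of integer word ids.
    Returns the final topic assignment of every word.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                 # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]   # n(k, w)
    topic_total = [0] * n_topics                               # n(k)
    assign = []

    # Random initialisation of the latent assignments.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional with the multinomial parameters integrated out.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights, k=1)[0]
                # Add the new assignment back into the counts.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return assign


if __name__ == "__main__":
    toy_docs = [[0, 0, 1, 1], [2, 3, 3, 2], [0, 1, 0]]
    print(collapsed_gibbs(toy_docs, n_topics=2, vocab_size=4,
                          alpha=0.1, beta=0.01, n_iters=50))
```

In the slides' setting the resampled variables would be the latent ones in the diagram, with the conditional combining the top-down and bottom-up terms; how those terms are defined is not given in this transcript.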

  23. There is no object-text correspondence… Scene/Event images from the Internet Athlete Horse Grass Tree Saddle

  24. Our model builds the correspondence… Scene/Event images from the Internet C S O T Athlete X R Horse Z Grass Ar NF Nr Tree Nt Saddle D

  25. However, a big obstacle is: many objects always co-occur together Scene/Event images from the Internet ? Athlete Horse Grass Ball ? ? Athlete Horse Grass Trees Sky Saddle

  26. One solution: some good initialization of O C Scene/Event images from the Internet S O T X R Z Nr NF Ar Nt Athlete Horse Grass Trees Sky Saddle Athlete Horse Grass

  27. Initializing O: obtain internet images for each O Scene/Event images from the Internet Object images

  28. Initializing O: train an object detector for each O Object images Event/Scene images Scene/Event images Any object detection & segmentation Algorithm C S O T X R Z Ar NF Nr Nt D

  29. Initialize O in the scene image by the trained object detectors Object images Event/Scene images Scene/Event images Any object detection & segmentation Algorithm Black box object detection & segmentation … Black box object detection & segmentation C S O T X R Z Ar NF Nr Nt D

  30. Initialize O in the scene image by the trained object detectors Object images Event/Scene images Cao & Fei-Fei, 2007 Scene/Event images θ C Black box object detection & segmentation Black box object detection & segmentation O R X Ar Nr … Black box object detection & segmentation C S O T X R Z Ar NF Nr Nt D Our Model

  31. Auto-semi-supervised learning: Small # of initialized images + Large # of uninitialized images Scene/Event images Small # of initialized images Our Model + C S O T Athlete Athlete Athlete Snow Rock Horse X R Z Sky Grass Grass Ar NF Nr Nt Tree Tree Tree D Rope Wind Snowboard Large # of uninitialized images Saddle Sky

  32. Outline Model Learning Small # of automatically initialized images C Text Visual S Large # of uninitialized images O T Athlete Athlete Athlete Recognition & Experiment Rock Snow Horse Sky Grass Grass X R Tree Tree Tree • Dataset • Learned Model • Results Rope Wind Snowboard Z Saddle Sky Ar NF Nr Nt

  33. 8 Event/Scene Classes Badminton Bocce Croquet Polo Remark: Tags are not used during testing

  34. 8 Event/Scene Classes Rock climbing Rowing Sailing Snowboarding

  35. Learned model: O C S O T X R Z Ar NF Nr Nt D

  36. Learned model: R Athlete C S Grass O T X R Z Horse Ar NF Nr Nt D

  37. Learned model: S C S O T X R Z Ar NF Nr Nt D

  38. Classification Annotation Segmentation 8-way classification: 54%

  39. Classification Annotation Segmentation Alipr: Li et al 03 Corr LDA: Blei et al 03

  40. Classification Annotation Segmentation

  41. Effect of top-down class context C Model w/o top-down class Full Model S S O O T T R R X X Z Z Horse

  42. Learning Model Small # of automatically initialized images Sky Rock Mountain C Text Sky Visual S Large # of uninitialized images Tree Athlete Recognition & Experiment Athlete Class: Sailing O T Class: Rock climbing Athlete Sailboat Tree Water Sky Wind Athlete Athlete Athlete Tree Rock Snow sailboat Horse Water Sky Grass Grass Athlete Mountain Tree Rock Sky Ascent X R Tree Tree Tree Rope Wind Snowboard Z Saddle Sky Class: Snowboarding Ar NF Athlete Nr Nt Snowboard Athlete Snowboard Tree Snow Sky Powder Tree Snow

  43. Thanks to Prof. Silvio Savarese, Juan Carlos Niebles, Chong Wang, Barry Chai, Min Sun, Bangpeng Yao, Hao Su, Jia Deng, and the anonymous reviewers. And thank you.
