1 / 26

From Large Scale Image Categorization to Entry-Level Categories

From Large Scale Image Categorization to Entry-Level Categories. Vicente Ordonez, Jia Deng, Yejin Choi, Alexander C. Berg, Tamara L. Berg. What would you call this?. Grampus griseus. Dolphin. What would you call this?. Object. Organism. Animal. Chordate. Vertebrate. Bird.

dyami
Download Presentation

From Large Scale Image Categorization to Entry-Level Categories

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. From Large Scale Image Categorization to Entry-Level Categories Vicente Ordonez, Jia Deng, YejinChoi, Alexander C. Berg, Tamara L. Berg

  2. What would you call this? Grampus griseus Dolphin

  3. What would you call this? Object Organism Animal Chordate Vertebrate Bird Aquatic bird Swan Whistling swan Cygnus Colombianus

  4. Naming Image Content Grampus griseus American black bear Grizzly bear King penguin Cormorant Homing pigeon Vision Naming Ball-peen hammer Grampus griseus Spigot Diskette, floppy Steel arch bridge Farmhouse Pick the Best Soapweed Dolphin Brazilian rosewood Bristlecone pine Cliffdiving What Should I Call It? Crabapple Input Image Thousands of Noisy Category Predictions (0.80) (0.83) (0.16) (0.25) (0.11) (0.56) (0.26) (0.06) (0.07) (0.06) (0.16) (0.03) (0.12) (0.13) (0.04) (0.19)

  5. Entry-Level Category The category that people are likely to name when presented with a depiction of an object. Rosch et al, 1976 Jolicoeur, Gluck & Kosslyn, 1984 Superordinates: animal, vertebrateEntry Level: birdSubordinates: Black-capped chickadee

  6. Entry-Level Category The category that people are likely to name when presented with a depiction of an object. Rosch et al, 1976 Jolicoeur, Gluck & Kosslyn, 1984 Superordinates: animal, birdEntry Level: penguinSubordinates: Chinstrap penguin

  7. Is this hard? Living thing wordnet hierarchy Plant, Flora Bird Angiosperm Seabird Bulbous Plant Flower Cormorant Penguin Narcissus King penguin Orchid Daffodil Daisy Frog Orchid

  8. How will we do it? Wordnet Google Web 1T Little girl and her dog in northern Thailand. They both seemed. Interior design of modern white and brown living room furniture hanging. Computer Vision The Egyptian cat statue by the floor clock and perpetual motion Linguistic resources Lots of text Imagenet SBU Captioned Dataset Man sits in a rusted car buried in the sand on Waitarere beach Our dog Zoe in her bed Emma in her hat looking super cute Lots of images with text Labeled Images

  9. Scaling Naming Tasks! 48 categories > 7000 categories

  10. 1. Goal: Category Translation Detailed Category Grampus griseus What should I Call It?(Entry-Level Category) What should I Call It?(Entry-Level Category) 2. Goal: Content Naming dolphin dolphin Input Image

  11. 1. Goal: Category Translation Detailed Category Grampus griseus What should I Call It?(Entry-Level Category) What should I Call It?(Entry-Level Category) 2. Goal: Content Naming dolphin dolphin Input Image

  12. Category Translation by Humans Friesian, Holstein, Holstein-Friesian • cowcattlepasturefence

  13. 1.1 Category Translation: Text-based wordnet hierarchy 656M Animal n-gramFrequency 15M 366M Mammal Bird Naturalness 0.9M Cetacean 128M Seabird Semantic Distance 55M Whale 1.2M 88M Cormorant Penguin 30M King penguin Sperm whale Dolphin 22M 6.4M Grampus griseus 0.08M

  14. 1.2 Category Translation: Image-based Friesian, Holstein, Holstein-Friesian • (1.9071) cow(1.1851) orange_tree(0.6136) stall(0.5630) mushroom(0.3825) pasture(0.3156) sheep(0.3321) black_bear(0.3015) puppy(0.2409) pedestrian_bridge(0.2353) nest Vision System

  15. Category Translation: Examples IMAGEBASED TEXTBASED HUMANS

  16. 1. Goal: Category Translation Detailed Category Grampus griseus What should I Call It?(Entry-Level Category) What should I Call It?(Entry-Level Category) 2. Goal: Content Naming dolphin dolphin Input Image

  17. Large Scale Categorization Grampus griseus American black bear Grizzly bear King penguin Cormorant Homing pigeon Ball-peen hammer Spigot Flat Classifiers Diskette, floppy Steel arch bridge Farmhouse Soapweed Brazilian rosewood Bristlecone pine Cliffdiving Crabapple Selective Search Windows. van De Sande et al. ICCV 2011 Local descriptors Coding (LLC),Wang et al. CVPR 2010 Spatial pooling (0.80) (0.41) (0.16) (0.25) (0.11) (0.56) (0.26) (0.06) (0.07) (0.06) (0.16) (0.03) (0.12) (0.13) (0.04) (0.19)

  18. 2.1 Propagated Visual Estimates (1.0) - 656M Animal (0.8) 15M 366M (0.2) Mammal Bird 0.9M 128M (0.8) (0.2) Cetacean Seabird Accuracy Naturalness (0.05) 55M 88M 1.2M Specificity (0.8) Whale (0.15) Penguin Cormorant 22M King penguin (0.15) 6.4M 30M (0.6) (0.2) Sperm whale Dolphin Grampus griseus (0.6) 0.08M Our work Deng et al. CVPR 2012

  19. 2.2 Supervised Learning (0.80) Grampus griseus training from weak annotations SBU Captioned Photo Dataset (0.41) American black bear 1 million captioned images! (0.16) Grizzly bear (0.25) King penguin (0.11) Cormorant Bear (0.56) Homing pigeon Dog (0.26) Ball-peen hammer Building (0.06) Spigot House (0.07) Diskette, floppy Bird (0.06) Steel arch bridge Penguin (0.16) Farmhouse (0.03) Tree Soapweed (0.12) Brazilian rosewood Palm tree (0.13) Bristlecone pine (0.04) Cliffdiving (0.19) Crabapple

  20. Extracting Meaning from Data Weights learned to recognize images with “tree” in caption snag shade tree bracket fungus, shelf fungus bristlecone pine, Rocky Mountain bristlecone pine, PinusaristataBrazilian rosewood, caviuna wood, jacaranda, Dalbergianigraredheaded woodpecker, redhead, Melanerpeserythrocephalusredbud, Cerciscanadensismangrove, Rhizophora manglechiton, coat-of-mail shell, sea cradle, polyplacophorecrab apple, crabapple papaya, papaia, pawpaw, papaya tree, melon tree, Carica papayafrogmouth Mammals Birds Instruments Structures Plants Other

  21. Extracting Meaning from Data Weights learned to recognize images with “water” in caption water dogsurfing, surfboarding, surfridingmanatee, Trichechusmanatuspuntdip, plunge cliff diving fly-fishing sockeye, sockeye salmon, red salmon, blueback salmon, Oncorhynchusnerkasea otter, EnhydralutrisAmerican coot, marsh hen, mud hen, water hen, Fulicaamericanaboobycanal boat, narrow boat, narrowboat Mammals Birds Instruments Structures Plants Other

  22. Results: Content Naming

  23. Results: Content Naming

  24. Evaluation: Content Naming

  25. Conclusions/Future Work • We explored different models for content naming in images. • Results can be used to improve the larger goal of generating human-like image descriptions. • Go beyond nouns and infer other type of abstractions on action and attribute words.

  26. Questions?

More Related