Enhancing Human-Machine Communication via Visual Attributes Devi Parikh Virginia Tech
Interacting with Vision Systems User Supervisor
Interacting with Vision Systems Mode of communication is important Semantic Gap
Interacting with Vision Systems • Necessary for communication • Language that humans understand (semantic) • Language that machines understand (visual) • Attributes • Example: furry, natural, chubby, shiny, etc. • Better features, deeper image understanding, etc. Farhadi et al., Kumar et al., Lampert et al., etc. • Human-machine communication
Role of the Human: User, Supervisor • Communicator: Human, Machine • Topics: Image Search; Instilling Domain Knowledge; Active and Interactive Learning; Characterizing Failure Modes; Interpretable Models; Reading Between the Lines • Human: “Polar bears are white and larger than rabbits.” “My missing brother is fuller-faced than this boy.” • Machine: “I think this is a polar bear because this is a white and furry animal.” “If the image is blurry or the face is not frontal, I may fail.”
Image Search Query: “black shoes” … Binary Relevance Feedback
Image Search Query: “black shoes” … “more formal than these” “shinier than these” …
Relative Attributes Linear ranking function: open Training Testing Openness [Parikh and Grauman, ICCV 2011]
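A minimal sketch of how a linear ranking function for one attribute (say, “open”) might be trained from ordered image pairs and then applied at test time. It assumes a scoring form of score = w · x and a pairwise hinge loss fit by plain subgradient steps; the helper names (`score`, `train_ranker`), the toy 2-D features, and the hyperparameters are illustrative assumptions, not the formulation of the cited paper.

```python
def score(w, x):
    """Linear ranking function: a larger value means a stronger attribute."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_ranker(pairs, dim, epochs=300, lr=0.1, margin=1.0):
    """Fit w so that score(w, a) exceeds score(w, b) for every (a, b) pair.

    pairs: list of (a, b) feature tuples where a shows MORE of the attribute.
    Assumption: pairwise hinge loss, subgradient updates, no regularizer.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for a, b in pairs:
            # Update only when the pair is not yet separated by the margin.
            if score(w, a) - score(w, b) < margin:
                w = [wi + lr * (ai - bi) for wi, ai, bi in zip(w, a, b)]
    return w


# Hypothetical toy features: feature 0 loosely tracks openness, feature 1 is noise.
coast, field, street, forest = (0.9, 0.3), (0.7, 0.4), (0.4, 0.2), (0.1, 0.5)
training_pairs = [(coast, field), (field, street), (street, forest)]
w_open = train_ranker(training_pairs, dim=2)

# Testing: order unseen images by predicted openness, most open first.
unseen = {"highway": (0.8, 0.5), "alley": (0.2, 0.4)}
ranked = sorted(unseen, key=lambda k: score(w_open, unseen[k]), reverse=True)
```

Training consumes only relative statements ("a is more open than b"), so no absolute attribute labels are needed; the learned scores are meaningful only as an ordering.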
Image Search • System has pre-trained relative attribute predictors • Relevance of image = # constraints satisfied … “shinier” “more formal”
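The relevance rule above (relevance = number of feedback constraints satisfied) can be sketched as follows. The attribute strengths, image names, and the `relevance` helper are hypothetical; in a real system the strengths would come from the pre-trained relative attribute predictors.

```python
def relevance(image_scores, constraints):
    """Count the feedback constraints that one image satisfies.

    image_scores: dict mapping attribute name -> predicted strength.
    constraints: list of (attribute, direction, reference_strength),
                 where direction is "more" or "less".
    """
    satisfied = 0
    for attribute, direction, reference in constraints:
        value = image_scores[attribute]
        if direction == "more" and value > reference:
            satisfied += 1
        elif direction == "less" and value < reference:
            satisfied += 1
    return satisfied


# Hypothetical predicted attribute strengths for three shoe images.
database = {
    "loafer": {"shiny": 0.8, "formal": 0.9},
    "sneaker": {"shiny": 0.3, "formal": 0.1},
    "boot": {"shiny": 0.7, "formal": 0.4},
}

# Feedback: "shinier than" a reference image scoring 0.5 on shiny,
# and "more formal than" a reference image scoring 0.6 on formal.
feedback = [("shiny", "more", 0.5), ("formal", "more", 0.6)]

ranking = sorted(
    database,
    key=lambda name: relevance(database[name], feedback),
    reverse=True,
)
```

Each new piece of feedback adds a constraint, so the top of the ranking narrows as the user keeps commenting.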
WhittleSearch [Figure: attribute axes “shiny” and “formal”; feedback “shinier”, “more formal”]
WhittleSearch [Kovashka, Parikh and Grauman, CVPR 2012] (Patent pending)
Whittle Search: Demo (Online) [Prepared by Naman Agrawal, Demo at CVPR 2013] (Patent pending)
Traditional Active Learning Is this a forest? No, this is not a forest.
Classifier Feedback I think this is a forest. What do you think? No, this is too open to be a forest. [Images more open than query] Ah! These images must not be forests either then. [Parkash and Parikh, ECCV 2012] …
Classifier Feedback I think this is a forest. What do you think? No, this is too open to be a forest. Pre-trained relative attributes [Images more open than query] Ah! These images must not be forests either then. …
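A minimal sketch of the propagation step with a pre-trained attribute: once the supervisor says the query is "too open to be a forest", every unlabeled image predicted to be more open than the query is also marked as not a forest. The image ids, openness scores, and the `propagate_negative` helper are hypothetical, and the rule is applied as a hard label here, which is a simplifying assumption.

```python
def propagate_negative(query_id, attribute_scores, labels):
    """Spread a "too much of this attribute" rejection to other images.

    attribute_scores: dict image id -> predicted attribute strength.
    labels: dict image id -> True (in the category) or False (not in it);
            images missing from the dict are unlabeled.
    Returns a new labels dict; existing labels are never overwritten.
    """
    threshold = attribute_scores[query_id]
    updated = dict(labels)
    updated[query_id] = False  # the supervisor rejected the query itself
    for image_id, strength in attribute_scores.items():
        if image_id not in updated and strength > threshold:
            updated[image_id] = False  # even more open, so not a forest either
    return updated


# Hypothetical openness predictions and one existing forest label.
openness = {"img_a": 0.2, "img_b": 0.6, "img_c": 0.75, "img_d": 0.9}
is_forest = {"img_a": True}

is_forest_after = propagate_negative("img_b", openness, is_forest)
```

One piece of feedback on a single query yields several negative labels at once.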
Classifier Feedback I think this is a forest. What do you think? No, this is too open to be a forest. Learn attributes on the fly [Images more open than query] Ah! These images must not be forests either then. …
Classifier Feedback I think this is a forest. What do you think? No, this is too open to be a forest. [images labeled as forest] Ah! These images must be less open than query …
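When the attribute is learned on the fly, the same feedback runs in the other direction: images already labeled as forest must be less open than the rejected query, which gives ordered pairs for training an attribute ranker. A minimal sketch, with hypothetical image ids and a hypothetical `ordering_constraints` helper:

```python
def ordering_constraints(query_id, labels, category):
    """Build (more, less) pairs from "too <attribute> to be <category>" feedback.

    labels: dict image id -> category name for the images labeled so far.
    Each returned pair states that the query shows more of the attribute
    than an image already labeled as `category`.
    """
    return [
        (query_id, image_id)
        for image_id, label in labels.items()
        if label == category and image_id != query_id
    ]


# Hypothetical labels collected so far.
labels = {"img_a": "forest", "img_e": "forest", "img_f": "coast"}

# Feedback on img_b: "this is too open to be a forest".
pairs = ordering_constraints("img_b", labels, "forest")
```

Pairs of this form are what a pairwise ranker consumes, so each round of feedback adds training data for the attribute as well as for the category.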
Classifier Feedback • Learning attributes on the fly • Start only with unlabeled images (+ a supervisor) • Categories and attributes learnt from scratch • Confidence in instances • Active learning for learning with attributes-based classifier feedback [Biswas and Parikh, CVPR 2013]
Classifier Feedback [Plot: accuracy vs. number of iterations; curves: Parkash and Parikh, ECCV 2012; Biswas and Parikh, CVPR 2013]
WhittleSearch Query: “black shoes” … “more formal than these” “shinier than these” …
Image Search [Parikh and Grauman, ICCV 2013]
Saying the Right Thing • Improved image search, description Not smiling Smiling more than [Sadovnik, Gallagher, Parikh and Chen, ICCV 2013]
Saliency of Attributes • Improved image search, zero-shot learning, description Scary, sharp teeth White, furry [Turakhia and Parikh, ICCV 2013]
Role of the Human: User, Supervisor • Communicator: Human, Machine • Topics: Image Search; Instilling Domain Knowledge; Active and Interactive Learning; Characterizing Failure Modes; Interpretable Models; Reading Between the Lines • Human: “Polar bears are white and larger than rabbits.” “My missing brother is fuller-faced than this boy.” • Machine: “I think this is a polar bear because this is a white and furry animal.” “If the image is blurry or the face is not frontal, I may fail.” • Integrating AI with today’s machine learning tools • Accessing user’s intentions for mental image search • Getting more from what the human says without added human effort • Enhanced human-machine communication via attributes for improved visual recognition • Trustworthy systems: key for effective human-machine teams • More usable computer vision systems even with their imperfections