
I know what you did last summer: object-level auto-annotation of holiday snaps



Presentation Transcript


  1. I know what you did last summer: object-level auto-annotation of holiday snaps Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool

  2. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  3. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  4. Introduction • Most photo organization tools allow tagging (labeling) with keywords • Tagging is a tedious process • Automated annotation removes this burden

  5. Auto-annotation steps • First step: build a database by large-scale crawling of community photo collections • Second step: recognize query images against this database

  6. Step detail • The crawling stage: • Creates a large database of object models, where each object is represented as a cluster of images (an object cluster) • Tells us what the cluster contains (labels, GPS location, related content) • The retrieval stage: • Consists of a large-scale retrieval system based on local image features • This stage is optimized

  7. Step detail (2) • The annotation stage: • Estimates the position of the object within the image (bounding box) • Annotates it with text, location, and related content from the database

  8. How the resulting method differs • Not a general annotation of images with words • Annotation happens at the object level and includes textual labels, related web-sites, and GPS location • The annotation of a query image happens within seconds • Example annotation: the building Taipei 101

  9. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  10. Automatic object mining • A geospatial grid is overlaid on the earth, and Flickr is queried at each cell's GPS location to retrieve geo-tagged photos
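The grid-based crawling above can be sketched as a function mapping a GPS coordinate to its grid cell; photos that land in the same cell are candidates for the same object cluster. This is a minimal illustration, and the cell size `cell_deg` is an assumed value, not the paper's actual grid resolution.

```python
def grid_cell(lat, lon, cell_deg=0.01):
    """Map a GPS coordinate to the (row, col) index of its geospatial
    grid cell. cell_deg is the cell side length in degrees (assumed)."""
    row = int((lat + 90.0) // cell_deg)   # latitude spans [-90, 90]
    col = int((lon + 180.0) // cell_deg)  # longitude spans [-180, 180]
    return row, col

# Photos whose coordinates fall in the same cell are grouped before
# visual clustering, e.g. two snaps of Taipei 101:
print(grid_cell(25.0330, 121.5654, cell_deg=1.0))
```

Each cell is then queried on Flickr for its geo-tagged photos, and visual clustering inside a cell yields the object clusters.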

  11. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  12. Scalable object cluster retrieval • Visual vocabulary technique: created by clustering the descriptor vectors of local visual features such as SIFT or SURF • Candidates are ranked using TF*IDF • RANSAC is used to estimate a homography between candidate and query image • A candidate is retained only when the number of inliers exceeds a given threshold

  13. TF*IDF • D : candidate document (candidate image), containing a set of visual words • v : a visual word (local feature) • df(v) : document frequency of visual word v • Note: we want to know which object is present in the query image, so we return a ranked list of object clusters instead of images
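The slide's scoring formula did not survive extraction, but the standard TF*IDF ranking over visual words it refers to can be sketched as follows. This is the textbook formulation consistent with the definitions above; the paper's exact weighting may differ slightly.

```python
import math
from collections import Counter

def tfidf_score(query_words, doc_words, df, n_docs):
    """Score a candidate document D (a database image, given as its
    list of visual words) against the query's visual words.

    df[v] is the document frequency of visual word v, and n_docs is
    the total number of documents in the database."""
    doc_tf = Counter(doc_words)           # term frequency within D
    score = 0.0
    for v in set(query_words) & set(doc_words):
        idf = math.log(n_docs / df[v])    # rare words weigh more
        score += doc_tf[v] * idf
    return score
```

Candidates are ranked by this score, and their votes are then aggregated per object cluster rather than per image.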

  14. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  15. Object knowledge from the wisdom of crowds • Database: • Not organized by individual images but by object clusters • The partly redundant information can be used to: • Obtain a better understanding of the object's appearance • Segment objects • Create more compact inverted indices

  16. Object-specific feature confidence score • From the pair-wise feature matches we can derive a score for each feature • Only features which match many of their counterparts in other images receive a high score • Since many of the photos are taken from varying viewpoints around the object, the background receives fewer matches

  17. Object-specific feature confidence score • f : a feature • i : an image • the set of inlying feature matches for image i • the number of images in the current object cluster o • two parameters, set to 1 and 1/3 • Note: the bounding box is drawn around all features with confidence higher than the threshold
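The confidence formula on the slide did not survive extraction, so the following is only an illustrative stand-in for the idea described on the previous slide: a feature's confidence grows with the number of images in the cluster where it has an inlying match. The function name and the simple fraction used here are assumptions, not the paper's exact scoring function.

```python
def feature_confidence(feature_id, inlier_matches, n_cluster_images):
    """Illustrative per-feature confidence within an object cluster.

    inlier_matches maps each other image i in the cluster to the set
    of feature ids that matched it as RANSAC inliers. Features that
    match inliers in many images (the object) score high; background
    features, seen from only one viewpoint, score low."""
    hits = sum(1 for inliers in inlier_matches.values()
               if feature_id in inliers)
    return hits / n_cluster_images

# Features whose confidence exceeds a threshold are kept, and the
# bounding box is drawn around them.
```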

  18. Object-specific feature confidence score

  19. Better indices through object-specific feature sampling • The estimated bounding boxes help to compact the inverted index of visual words • Object clusters whose photos were taken by a single user are removed
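The index-compaction step above amounts to indexing only the features that fall inside an image's estimated bounding box. A minimal sketch, assuming features are given as (x, y, visual_word) tuples and a box as (x1, y1, x2, y2):

```python
def sample_features(features, bbox):
    """Keep only the features inside the estimated bounding box, so
    background features never enter the inverted index.

    features: iterable of (x, y, visual_word) tuples (assumed layout)
    bbox:     (x1, y1, x2, y2) axis-aligned box"""
    x1, y1, x2, y2 = bbox
    return [f for f in features
            if x1 <= f[0] <= x2 and y1 <= f[1] <= y2]
```

Dropping background features shrinks the index and, per the experiments later in the deck, barely hurts retrieval precision.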

  20. Last step of the retrieval stage • Select the best object cluster as the final result • Simple voting: retrieved images vote for their parent clusters • Normalizing by cluster size is not feasible • Only the votes of the 5 images per cluster with the highest retrieval scores are counted
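The capped voting scheme above can be sketched as follows. This is an illustrative implementation that sums retrieval scores; whether the paper sums scores or counts votes is an assumption, but the 5-images-per-cluster cap matches the slide.

```python
from collections import defaultdict

def select_cluster(ranked_images, votes_per_cluster=5):
    """Pick the best object cluster from a retrieval list.

    ranked_images: list of (cluster_id, retrieval_score) pairs,
    highest-scoring first. Only the top votes_per_cluster images of
    each cluster may vote, so large clusters cannot win by size alone."""
    counted = defaultdict(int)
    score = defaultdict(float)
    for cluster_id, s in ranked_images:
        if counted[cluster_id] < votes_per_cluster:
            counted[cluster_id] += 1
            score[cluster_id] += s
    return max(score, key=score.get)
```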

  21. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  22. Object-level auto-annotation • Consists of two steps: • Bounding box estimation • Labelling • Bounding box estimation: • Estimated in the same way as for database images • The query image is matched against a number of images in the top-ranked cluster • Labelling: • Simply copy the information from the object cluster to serve as labels for the query image

  23. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  24. Experiments • Conducted on a large dataset collected from Flickr • Collected a challenging test set of 674 images from Picasa Web Albums • Estimated bounding boxes cover on average 52% of each image

  25. Efficiency and Precision of Recognition • Compared configurations (plot legend): • Baseline: TF*IDF ranking on a 500K visual vocabulary, as used in other work • Bounding-box features + no single-user clusters • All features + no single-user clusters • 66% random feature subset + no single-user clusters • 66% random feature subset

  26. [Results plot; reported figure: 67%]

  27. Annotation precision • Evaluate how well the system localizes bounding boxes by measuring the intersection-over-union (IOU) overlap between ground truth and hypothesis • Reported overlap: 76.1%
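The IOU measure used above is the standard one for axis-aligned boxes: intersection area divided by union area. A minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2); returns a value in [0, 1]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

An IOU of 1.0 means a perfect match between the hypothesis box and the ground truth; disjoint boxes score 0.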

  28. Results

  29. Outline • Introduction • Automatic object mining • Scalable object cluster retrieval • Object knowledge from the wisdom of crowds • Object-level auto-annotation • Experiments and Results • Conclusions

  30. Conclusions • Presented a full auto-annotation pipeline for holiday snaps • Object-level annotation with bounding boxes, relevant tags, Wikipedia articles, and GPS locations

  31. Thanks!!!!
