Tour the World: building a web-scale landmark recognition engine

ICCV 2009
Tour the World: building a web-scale landmark recognition engine
Yan-Tao Zheng (1), Ming Zhao (2), Yang Song (2), Hartwig Adam (2), Ulrich Buddemeier (2), Alessandro Bissacco (2), Fernando Brucher (2), Tat-Seng Chua (1), and Hartmut Neven (2)
(1) NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore
(2) Google Inc., U.S.A.

Outline
• Introduction
• Approach (Framework)
• Experiments
• Conclusion (Future Work)

Introduction
What is the motivation?
The vast amount of landmark multimedia data now available on the web

Introduction
• Applications:
• Provide clean landmark images for building virtual tourism of a large number of landmarks
• Facilitate both content understanding and geo-location detection of images and videos
• Provide tour guide recommendation and visualization

Introduction
• Issues that must be tackled:
• There is no readily available list of landmarks in the world
>> Explore two sources: (1) geographically calibrated images on photo sharing websites (2) travel guide articles from websites
• Even with such a list, it is still challenging to collect true landmark images
>> Download landmark images from two sources: (1) photo sharing websites (2) Google Image Search
• Efficiency is a challenge for a large-scale system
>> Accomplished by three means: (1) parallel computing (2) an efficient clustering algorithm (3) efficient image matching by k-d tree indexing
http://www.panoramio.com/

Approach: framework

Approach: framework
Learning landmarks from GPS-tagged photos
>> Perform agglomerative hierarchical clustering on the photos' GPS coordinates
>> Validation criterion: the number of unique authors of the photos in a cluster is larger than a threshold
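The two steps above can be sketched in a few lines. This is only a minimal illustration, not the authors' implementation: the haversine distance, the single-linkage merge rule, the `max_km` cutoff, the `min_authors` threshold, and the `gps`/`author` field names are all assumptions, since the slides give none of these details.

```python
import math
from itertools import combinations


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cluster_photos(photos, max_km=0.5):
    """Single-linkage agglomerative clustering; stops when no pair of
    clusters is closer than max_km (a hypothetical cutoff)."""
    clusters = [[i] for i in range(len(photos))]
    while True:
        best = None
        for ci, cj in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(haversine_km(photos[a]["gps"], photos[b]["gps"])
                    for a in clusters[ci] for b in clusters[cj])
            if d <= max_km and (best is None or d < best[0]):
                best = (d, ci, cj)
        if best is None:
            return clusters
        _, ci, cj = best  # ci < cj, so deleting cj leaves ci valid
        clusters[ci].extend(clusters[cj])
        del clusters[cj]


def landmark_clusters(photos, max_km=0.5, min_authors=2):
    """Keep clusters whose number of unique authors exceeds min_authors."""
    kept = []
    for members in cluster_photos(photos, max_km):
        authors = {photos[i]["author"] for i in members}
        if len(authors) > min_authors:
            kept.append(sorted(members))
    return kept
```

For example, three photos taken within a few metres of each other by three different authors survive validation, while a single isolated photo is discarded. The pairwise search here is quadratic per merge and is meant for reading, not for web-scale data.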

Approach: framework
Learning landmarks from travel guide articles
>> Using the hierarchy, city names can be extracted by country across six continents
>> Text that satisfies the following criteria is deemed to be a landmark candidate
http://wikitravel.org/en/Taipei
Set of images

Approach: framework
Learning landmarks from travel guide articles
>> Validating landmarks
(1) A candidate is filtered out if it is too long or most of its words are not capitalized
(2) The number of unique authors of the images in the cluster
>> which reflects the popular appeal of the landmark
Set of images
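Criterion (1) is a simple text filter and could look roughly like the sketch below. The function name and the `max_words` limit are hypothetical; the slides say only "too long" and give no number.

```python
def is_plausible_landmark_name(text, max_words=6):
    """Reject candidates that are too long or mostly not capitalized.

    max_words is an assumed limit, not a value from the slides.
    """
    words = text.split()
    if not words or len(words) > max_words:
        return False
    capitalized = sum(1 for w in words if w[0].isupper())
    not_capitalized = len(words) - capitalized
    # "Most words not capitalized" means more than half of them.
    return not_capitalized * 2 <= len(words)
```

Under these assumptions a name such as "National Palace Museum" passes, while a lower-case phrase or a full sentence is dropped before any images are fetched.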

Approach: discovering landmarks in the world
Most users are located in Europe and North America.

Approach: learning of landmark images
Object matching based on local features
• Detect interest points >> Laplacian-of-Gaussian (LoG) filters [11]
• Local descriptor >> SIFT [9]
• Reduce the feature dimensionality to 40 >> Principal Component Analysis (PCA) [2]
• Matched interest points of two images are verified >> affine transformation [9]

Approach: match score
• The match score is based on the probability that a match is a false positive
• This probability is computed using a cumulative binomial distribution
• It can be estimated by Bayes' Theorem [2]
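To make the cumulative binomial step concrete, here is a minimal sketch. It assumes that each of `n` candidate features matches by chance independently with probability `p`, so the probability of seeing at least `k` chance matches is the upper tail of a Binomial(n, p). The symbols `n`, `k`, `p` and the `1 - tail` score are illustrative assumptions; the slides do not show the exact formula or the Bayes step.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more accidental matches."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


def match_score(k, n, p):
    """Assumed score: high when k matches are unlikely to have arisen by chance."""
    return 1.0 - binomial_upper_tail(k, n, p)
```

With this convention, more matched features out of the same `n` push the tail probability down and the score up, which is consistent with accepting a match when the score exceeds a threshold.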

Approach: match region
Edges are classified into two types: match edge and region overlap edge
[Figure legend: match edge, region overlap edge]

Approach: graph clustering
The number of clusters is not known a priori
>> k-means is unsuitable
>> Exploit hierarchical agglomerative clustering [2]
The distance of region

Approach: cleaning visual model
• Photographic vs. non-photographic image classifier
• Based on the AdaBoost algorithm over low-level visual features: color histogram and Hough transform
• Adopt a multi-view face detector [15]

Approach: efficiency issues
• Efficiency is essential in two aspects: (1) landmark image mining (2) landmark recognition of query images
• Efficiency is achieved through three measures:
• Parallel computing to mine true landmark images
• Efficiency in hierarchical clustering
• Indexing local features for matching
• Use a k-d tree [1]: ~0.2 sec on a P4 computer
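The k-d tree indexing idea can be illustrated with a toy, pure-Python tree over labelled feature vectors. This is only a sketch of the data structure: it performs exact nearest-neighbour search on small inputs, whereas a production system over 40-dimensional descriptors would more likely use an approximate variant. The `(vector, label)` layout and function names are assumptions.

```python
def build_kdtree(points, depth=0):
    """Build a k-d tree from a list of (vector, label) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return (squared_distance, (vector, label)) of the exact nearest neighbour."""
    if node is None:
        return best
    vec, _ = node["point"]
    d = sq_dist(vec, query)
    if best is None or d < best[0]:
        best = (d, node["point"])
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best
```

The pruning test in `nearest` is what saves work compared with scanning every model feature: whole subtrees are skipped when they cannot contain a closer point.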

Experiments
174 landmarks are found to be common to both lists
>> A landmark is a perceptual and cognitive concept

Experiments
• Evaluation of landmark image mining
• 1000 visual clusters are randomly selected; 68 of them are found to be negative outliers (0.68%)
• The classifier is trained on ~5000 photographic and non-photographic images, while the face detector is based on [15]
• After cleaning, the outlier cluster rate drops to 0.37%

Experiments
Evaluation of landmark recognition
Positive testing: 728 images from 124 randomly selected landmarks
Negative testing: Caltech-256 [5], Pascal VOC 07 [3]

Experiments
Recognition: local feature matching of the query image against model images, using the nearest neighbor (NN) principle
A match is found when the match score is larger than the threshold
Recognition accuracy: 80.8% (fairly satisfactory)
Image content analysis and geo-location detection: 46.3% (moderately satisfactory)
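The nearest-neighbour decision with a rejection threshold might be sketched as follows. The input layout (a dict from landmark name to the match scores of the query against that landmark's model images) and the default threshold are hypothetical; how the scores themselves are computed is outside this snippet.

```python
def recognize(scores_by_landmark, threshold=0.5):
    """Return the landmark of the best-scoring model image, or None.

    A match counts only when its score is strictly larger than the
    threshold, so queries that match nothing well are rejected.
    """
    best_name, best_score = None, threshold
    for name, scores in scores_by_landmark.items():
        for score in scores:
            if score > best_score:
                best_name, best_score = name, score
    return best_name
```

Returning `None` corresponds to the negative-testing case, where a query image from a generic dataset should not be assigned to any landmark.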

Conclusion and future work
• Conclusion:
• Built a world-scale landmark recognition engine
• Utilizes ~21.4M images to build the landmark visual models
• Incorporates 5312 landmarks from 1259 cities in 144 countries
• Future work:
• Multi-lingual aspect of the landmark engine >> helps to discover more landmarks and collect more clean landmark images in their native languages on the Internet

Thank You!!
Related work
3D visualization of landmarks
http://www.cs.cornell.edu/~snavely/
