This paper introduces a method for navigating very large landscape images by leveraging saliency detection. It outlines a sliding-window saliency map approach, as well as techniques for information discovery and interactive refinement. The proposed method enhances visual exploration and scalability for large image datasets.
Saliency-Assisted Navigation of Very Large Landscape Images • Cheuk Yiu Ip • Amitabh Varshney
Very Large Landscape Images • Image Acquisition: • Gigapan • MS HD View, ToG 2007 • Image Stitching: • Kazhdan et al. ToG 2008, 2010 • Summa et al. ToG 2010 • Stitch images to create multi-gigapixel very large images • But WHERE should we start looking?
Visual Knowledge Discovery • Visual knowledge discovery • Identify what is interesting • Visualize them
Information Scalability • Design effective algorithms to process large images • The SMALL unique regions in the large images contain the MOST information • Identify informative regions from repetitive scene elements
Data Scalability • Very large images represent a large amount of data • 5 Gpix RGBA = 20 GB uncompressed • Multicore and manycore parallel processing • Requires efficient O(n) algorithms and out-of-core GPU methods
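The 20 GB figure above follows from simple arithmetic. A minimal sketch, assuming 8 bits per channel and decimal gigabytes (both assumptions; the slide only states "RGBA" and "uncompressed"). The function name is hypothetical.

```python
def uncompressed_bytes(pixels, channels=4, bytes_per_channel=1):
    """Raw storage for an image with no compression.

    Assumes 4 channels (RGBA) at 1 byte each unless told otherwise.
    """
    return pixels * channels * bytes_per_channel


# 5 gigapixels, RGBA, 8 bits per channel
size = uncompressed_bytes(5 * 10**9)
print(size / 10**9, "GB")  # 20.0 GB
```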
Overview • Sliding-Window Saliency Map • Detecting Anomalous Regions • Interactive Exploration
Traditional Multiscale Image Saliency • Detects “Pop-out” spots from the scene • Inspired by human visual system • Pre-attentive vision • Find multiscale contrasting regions • Intensity, Color Opponencies (I, RG, BY) • Convolve (I, RG, BY) with Difference of Gaussians (DoG) filter (σ is stdev) • Repeat on downsampled images for multiscales • Image Saliency • Itti et al. PAMI 1998 • Bruce et al. IJCV 2009 • Goferman et al. CVPR 2010 • These work on small images: very accurate but slow.
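The DoG filtering step can be illustrated on a single channel. This is a minimal pure-Python sketch, not the authors' implementation: it assumes a surround Gaussian with twice the center σ (the `ratio` parameter is an assumption), truncates kernels at 3σ, and clamps at the borders. The input would be one of the I, RG, or BY channels as a 2D list of floats; all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian weights, truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma):
    """Separable 2D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))


def dog_response(img, sigma, ratio=2.0):
    """Absolute center-surround difference (DoG) for one channel."""
    center = gaussian_blur(img, sigma)
    surround = gaussian_blur(img, sigma * ratio)
    return [[abs(c - s) for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(center, surround)]


# A single bright pixel on a dark background "pops out".
image = [[0.0] * 15 for _ in range(15)]
image[7][7] = 1.0
response = dog_response(image, sigma=1.0)
```

A uniform image gives a response of (numerically) zero everywhere, which is why repetitive flat areas do not register as salient at a given scale.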
Multiscale Aggregation • Works well on small images • If we have many more scales … • Large regions dominate small regions • Wait… we don’t want to miss the small regions • Traditional multiscale saliency is insufficient
Our Sliding-Window Aggregation • We see different things at different zoom levels • One saliency map per level • Only aggregate up to 4x • Use a sliding window across scales • Why 4x? • Eye resolution difference ~5x • [Figure: saliency maps aggregated over σ to 4σ, 4σ to 16σ, 16σ to 64σ, and all scales (σ to 256σ)]
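One way to picture the sliding window across scales is sketched below. It assumes scales spaced by factors of two and a pointwise sum as the aggregation; both are assumptions for illustration, since the slides only state that aggregation is limited to 4x. The function names are hypothetical.

```python
def scale_windows(scales, max_ratio=4.0):
    """For each starting scale, list the scales within max_ratio of it."""
    ordered = sorted(scales)
    return [[s for s in ordered if start <= s <= start * max_ratio]
            for start in ordered]


def aggregate(maps_by_scale, window):
    """Pointwise sum of same-sized per-scale maps for the scales in a window."""
    first = maps_by_scale[window[0]]
    total = [[0.0] * len(first[0]) for _ in first]
    for s in window:
        for y, row in enumerate(maps_by_scale[s]):
            for x, value in enumerate(row):
                total[y][x] += value
    return total


# Scales in multiples of sigma: 1, 2, 4, ..., 256
scales = [2 ** i for i in range(9)]
windows = scale_windows(scales)
print(windows[0])  # [1, 2, 4]
print(windows[2])  # [4, 8, 16]
```

Windows near the coarsest scale contain fewer levels in this sketch; a real viewer could either drop them or accept the shorter range.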
There are still too many regions… • 18,000+ regions in 1.3Gpix (5 hours if a user spends 1s on each) • Regions are enlarged for visibility • There are many contrasting repetitive elements
Information Discovery • Identify the informative regions from the salient regions • Compare regions to find the most different ones • Detect the anomalous regions and outliers • Visual Data Analysis • Mesh and Volume Saliency (Lee et al. ToG 2005, Kim et al. TVCG 2006) • Video Summarization (Daniel et al. Vis 2003) • Flow and Information Theory (Janicke et al. TVCG 2010) • Molecular Dynamics Layout (Patro et al. Biovis 2011)
Image Region Descriptors • Represent salient regions by histograms (rotational invariance) • Global colors (RGB, HSV, CIELAB): not discriminative • Local edges: too discriminative • Histograms of colors in 8x8 moving windows work well (MPEG-7 CSD) • Compare histograms p and q by the Euclidean distance
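The idea behind a color-structure histogram can be sketched as follows: for each quantized color, count the fraction of 8x8 window positions that contain it. This is a simplified illustration, not the MPEG-7 CSD itself (the standard specifies its own color space and quantization); the input is assumed to be already quantized into color indices, and the function names are hypothetical.

```python
import math


def color_structure_histogram(labels, num_colors, window=8):
    """Fraction of window positions in which each quantized color appears.

    labels: 2D list of color indices in [0, num_colors).
    """
    height, width = len(labels), len(labels[0])
    if height < window or width < window:
        raise ValueError("region is smaller than the moving window")
    hist = [0] * num_colors
    positions = 0
    for y in range(height - window + 1):
        for x in range(width - window + 1):
            present = {labels[yy][xx]
                       for yy in range(y, y + window)
                       for xx in range(x, x + window)}
            for color in present:
                hist[color] += 1
            positions += 1
    return [h / positions for h in hist]


def euclidean(p, q):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# A 9x9 region of color 0 with a single pixel of color 1 in one corner.
region = [[0] * 9 for _ in range(9)]
region[0][0] = 1
print(color_structure_histogram(region, num_colors=2))  # [1.0, 0.25]
```

Rotating the region by 90 degrees leaves the histogram unchanged, which matches the rotational invariance noted above.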
k-Nearest-Neighbors Anomaly Detection • Uniqueness, U(p), is the average distance of p to its k-Nearest-Neighbors. • Repeating regions have a low U(p) • Distinct regions have a high U(p) • Spatial data structures (kD-trees) accelerate the retrieval
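The uniqueness score U(p) can be written down directly from its definition. The sketch below uses a brute-force O(n²) neighbor search for clarity; it stands in for the kD-tree retrieval mentioned above and is not meant to scale to 18,000+ regions. The function name is hypothetical.

```python
import math


def uniqueness(descriptors, k):
    """U(p): average Euclidean distance from each descriptor to its k nearest neighbors."""
    if k < 1 or len(descriptors) < 2:
        raise ValueError("need k >= 1 and at least two descriptors")
    scores = []
    for i, p in enumerate(descriptors):
        distances = sorted(math.dist(p, q)
                           for j, q in enumerate(descriptors) if j != i)
        nearest = distances[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores


# Three similar descriptors and one outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
scores = uniqueness(points, k=2)
print(scores.index(max(scores)))  # 3
```

The three clustered descriptors get low scores because each has close neighbors, while the outlier's nearest neighbors are all far away.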
Where are they … ? • Top 3% (500) of the most distinct regions. • Most of the repeating regions are eliminated. • Can you see the remaining regions?
Visualizing the Detected Regions • Problem: Small regions of interest are NOT visible • Adaptively enlarge regions • Determine the scale and colors by the region’s rank of uniqueness • Increase when zooming out • Decrease when zooming in • (Formula in paper)
Automatic Exploration • Explore the regions in descending order of their uniqueness • k-NN anomaly detection step provides uniqueness ordering
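Given per-region uniqueness scores, the visiting order is a descending sort. A minimal sketch; the `keep_fraction` parameter is a hypothetical knob for keeping only the most distinct regions (for example 0.03 for a top-3% cut) and is not taken from the paper.

```python
def exploration_order(scores, keep_fraction=1.0):
    """Region indices sorted by descending uniqueness, truncated to a fraction."""
    if not scores:
        return []
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = max(1, int(len(order) * keep_fraction))
    return order[:keep]


scores = [0.2, 0.9, 0.5, 0.1]
print(exploration_order(scores))        # [1, 2, 0, 3]
print(exploration_order(scores, 0.5))   # [1, 2]
```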
Interactive Refinement • Locate similar undesired regions • Select a representative • Move the slider to adjust the coverage • Delete the selection • The spatial data structure indexes the regions and provides fast retrieval
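One refinement interaction can be sketched as a radius query in descriptor space: the slider sets a distance threshold around the selected representative, and every region within it is deleted. This is an assumed reading of "coverage", and the linear scan here stands in for the spatial index. The function name is hypothetical.

```python
import math


def refine(descriptors, representative, radius):
    """Indices of regions kept after deleting those within radius of the representative.

    The representative itself (distance 0) is always deleted.
    """
    target = descriptors[representative]
    return [i for i, d in enumerate(descriptors)
            if math.dist(d, target) > radius]


descriptors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
print(refine(descriptors, representative=0, radius=0.5))  # [2]
print(refine(descriptors, representative=0, radius=0.0))  # [1, 2]
```

Widening the radius removes more look-alike regions in a single interaction, which is how a few interactions can prune hundreds of regions.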
After User Refinement • The remaining 300 regions after 3 refinement interactions
Data Scalability • GPU Out-of-core saliency computation • Break the image into tiles • Parallel Gaussian filtering on GPU • Filter overlapping boundary tiles to maintain continuity • Saliency map storage • Fit and store ellipses of the salient regions • Do not store an extra image • Tiled Image Viewer • View dependent mipmap image tiles loading and prefetching for smooth pan and zoom
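Tiling with overlapping boundaries can be sketched as follows. Each tile is filtered over a padded rectangle and only its core is written back, so filter results stay continuous across tile seams. The halo width is an assumption (for example, about 3σ of the widest Gaussian); the slides do not give a value. The function name is hypothetical, and this shows only the bookkeeping, not the GPU filtering itself.

```python
def tile_bounds(width, height, tile, halo):
    """List of (core, padded) rectangles as (x0, y0, x1, y1), end-exclusive.

    core:   the area whose results are kept for this tile
    padded: the core grown by `halo` pixels, clipped to the image
    """
    tiles = []
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            padded = (max(x0 - halo, 0), max(y0 - halo, 0),
                      min(x1 + halo, width), min(y1 + halo, height))
            tiles.append(((x0, y0, x1, y1), padded))
    return tiles


for core, padded in tile_bounds(width=10, height=4, tile=4, halo=1):
    print(core, padded)
```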
Gigapan Community Tags Grimsel Pass Royal Gorge Bridge
Gigapan Community Tags Cacti Mount Whitney
Limitations • Buffelgrass after fire • The “Original” cactus • Tags with semantic information • Domain knowledge necessary • Why are they tagged ?
Performance • Each GPix takes 2.5 1 hours to preprocess (1 NVIDIA GeForce GTX 285 GPU and 1 CPU) • Each interaction takes 10 ms
Conclusions • A first step toward visual knowledge discovery in very large landscape images • Visual Scalability: Sliding-Window Saliency • Information Scalability: Anomaly Detection • Data Scalability: Parallel filtering, Saliency Storage • Interactive Navigation
Future work • There are a lot of very large images • Astronomy • Microscopy • Product inspection • Urban Scenes • Domain specific descriptors • Fast discovery of locally distinct regions. • Accurate Identification of globally unique regions.
Acknowledgements • National Science Foundation: CCF 05-41120, CMMI 08-35572, CNS 09-59979 • NVIDIA CUDA Center of Excellence Program • Derek Juba, Sujal Bista, Rob Patro, Icaro da Cunha, Yang Yang, Adil Yalcin, and the reviewers for improving this paper and presentation • The Vis paper award committees • Thank you!
Questions? • Please see our websites for the paper and video: • Cheuk Yiu Ip • www.cs.umd.edu/~ipcy/ • GVIL Research Highlights • www.cs.umd.edu/gvil/