370 likes | 547 Views
Presenter: Chen-Ping Yu, PhD Candidate Research Advisors: Dr. Dimitris Samaras (computer science) Dr. Greg Zelinsky (psychology) Department of Computer Science Stony Brook University February 5, 2014. Tech Talk @ shutterstock.
E N D
Presenter: Chen-Ping Yu, PhD Candidate Research Advisors: Dr. Dimitris Samaras (computer science) Dr. Greg Zelinsky (psychology) Department of Computer Science Stony Brook University February 5, 2014 Tech Talk @ shutterstock Modeling visual clutter using proto-objects
Visual search • Examples • Visual clutter • Models • Proto-objects • Parametric proto-object segmentation • Superpixels • Graph and clustering • Data • Experiment and results • Conclusion Agenda
Visual search • Ubiquitous, happens everyday. • Finding your car in a parking lot, finding you keys on a cluttered desk, etc. • Modeling visual search performance • Are we able to predict how easy/hard a search task is? • Helps in advertisement design, item placement (i.e. shelf organization for supermarkets, electronic stores). • Attributes that affect visual search performance • The similarity of the target to the distractor items (Wolfe, 1994, 1998). • The similarity of the distractors (Duncan & Humphreys, 1989). • Set size – the number of items in an image (Wolfe, 1998). Visual search
Target patch Visual search • Example: find the target patch in the query image
Source: M. Asher, D. Tolhurst, T. Troscianko and I. Gilchrist, “Regional effects of clutter on human target detection performance”, Journal of Vision, 2013 Visual search
Target patch Visual search • Another example
Target patch Visual search • Set size effect example
Source: M. Neider, and G. Zelinsky, “Cutting through the clutter: searching for targets in evolving complex scenes.” Journal of Vision, 2011. Visual search
Visual clutter • Visual clutter • In general, it is a ‘‘confused collection’’ or a ‘‘crowded disorderly state’’. • Alternatively, it is the state in which excess items, or their representation or organization, lead to a degradation of performance at some task (Rosenholtz et al. 2007).
Visual clutter • Set size effect • Set size: number of items/objects in an image • Visual search task performance degrades as more objects are added to the display, i.e. looking for a particular building in a rural setting vs in an urban setting (Neider et al. 2008, 2011). • Number of objects is proportional to level of clutter. • Set size in the real world • However, most “objects” in real world scene are not visually countable • grass, rocks, patches of textures, shadows, etc. • Alternative approach • Analysis in the feature space
Both contain 24 objects! What are objects in these scenes? What is the ranking of their clutterness? Visual clutter
Top row: input images Bottom row: edge density Clutter models Segmenting objects is difficult, therefore: • Edge density model (Mack et al. 2004) • Counts the pixels on a Canny edge detected image. (r = 0.83) • Result is very sensitive to Canny’s edge detection setting, i.e. smoothing, thresholding.
Left: input, Right: feature variance ellipses 25 weather and US map dataset Clutter models • Feature congestion model (Rosenholtz et al. 2007) • Compute the feature variances of: Color, Luminance, and Orientation • Build a 3D ellipse using the feature variances, and the volume of the ellipse is the clutter measure for that image. • State-of-the-art, widely being used as the comparison gold standard. (r = 0.75)
Clutter models • Power Law model (Bravo et al. 2008) • Using Felzenszwalb’s graph-based method to segment the input image, r = 0.62.
Proto-objects as color blobs * Left images: from Wischnewski et al. 2010; Right image: from Bravo et al. 2008 (24 objects) Proto-objects • Direct modeling of set size: proto-objects • Low-level information processed before the focus of attention, and then focus of attention acts as a ‘‘hand’’ that grabs the relating proto-objects together into forming a true stable object, and proto-object itself are groupings of similar low level features that are nearby by the visual neurons (Rensink 1997, 2000). • Directly related to set size. • Better representation of set size than “objects”.
Input image Superpixels Proto-objects • Our clutter model • Quantify set size, using # of proto-objects instead of objects • Segment proto-objects by performing superpixel clustering Proto-object segmentation
Image from left to right: input image, mean-shift, graph-based, turbopixel, normalized-cut. Superpixel segmentation • Superpixel segmentation • Over-segment an image into regions of similar pixels that are also boundary preserving. • As a pre-processing can reduce the need to find boundaries. • Can provide region statistics.
Input image Superpixels Superpixel Graph SLIC k = 1000 • Superpixelgraph • Neighboring superpixels are connected, into a graph structure Proto-object segmentation
Merge the connected clusters, represented as proto-objects Compute similarity threshold, remove edges that are higher than the threshold Within-cluster edge Between-cluster edge (identify, then remove) Proto-object segmentation 0.15 0.15 0.77 0.77 0.11 0.11 0.35 0.35 0.63 0.63 0.28 0.75 0.28 0.86 0.75 0.12 0.86 0.12 0.77 0.77 0.04 0.04 0.31 0.31 0.82 0.81 0.82 0.21 0.81 0.93 0.21 0.93 0.32 0.32 0.65 0.65 0.68 0.68 0.71 0.05 0.71 0.38 0.05 0.23 0.38 0.23 0.75 0.75
Weibull-Mixture Model (WMM): Similarity Threshold – the crossing point between the two components: Parametric proto-object segmentation Orientation Color Intensity
Parametric proto-object segmentation • Clutter model • Count the resulting # of proto-objects. • Divide the count by the initial # of superpixels, results in a scale-invariant normalized clutter measure. • The clutter measure is between 0 and 1, the larger the more cluttered.
Data • 90 images from the SUN dataset • 800x600 • Real world images • 6 groups with 15 images each (total = 90 images). • Group 1: 1~10 objects • Group 2: 11~20 objects • … • Group 6: 51~60 objects • Rated by 15 human subjects age from 18~30, from least to most clutter. • Avg correlation over all pairs of subjects: R = 0.6919 (p<0.001) • Using the median ranked position for each image as the ground truth.
Results • Results • Achieved R = 0.7557, p<0.001 against human rated ground truth ordering by clutter • 10-fold cross validation with avg test set correlation of R = 0.6808. **latest results:
Clutter measure: 0.1713 Clutter measure: 0.2612 Results
Clutter measure: 0.3725 Clutter measure: 0.5038 Results
Clutter measure: 0.6750 Results
Conclusion • Applications • Image-level feature for image retrieval. • Image-to-painting style transformation. • Advertisement, user interface, and item organization quantified analysis. • Next steps • Apply our clutter model to the target search task performances. • Explore more on proto-objects for automatic object formation and detection. • Eye-movement related projects.
Related papers • Chen-Ping Yu, Wen-Yu Hua, Dimitris Samaras, and Gregory Zelinsky, “Modeling clutter perception using parametric proto-object partitioning.” Advances in Neural Information Processing (NIPS), Lake Tahoe, USA, Dec 2013. • Chen-Ping Yu, Dimitris Samaras, and Gregory Zelinsky, “Modeling visual clutter perception using proto-object segmentation.”, Journal of Vision (to appear), 2014. • For more information, please visit my project webpage: http://mysbfiles.stonybrook.edu/~cheyu/projects/proto-objects.html • For full citation information of this presentation, please refer to the NIPS 2013 paper. conclusion