Image Retrieval Part II
Topics
• Applications of CBIR in digital libraries
• Human-controlled interactive CBIR
• Machine-controlled interactive CBIR
Query by Example
• Pick query examples and ask the system to retrieve “similar” images.
[Diagram: a query sample is submitted to the CBIR system (“get similar images”), which returns the retrieval results]
QBIC(TM) – IBM's Query By Image Content http://www.hermitagemuseum.org/fcgi-bin/db2www/qbicSearch.mac/qbic?selLang=English
NETRA @ UCSB http://nayana.ece.ucsb.edu/M7TextureDemo/Demo/client/M7TextureDemo.html
Medical Decision Support • Breast cancer is among the top killers of women in the developed world. • Early detection of malignancy can greatly reduce the risk of death. Mammogram
Summary of Fundamental CBIR
• CBIR using query by example
• CBIR algorithm:
  • First step --- image indexing
  • Second step --- content matching
  • Third step --- ranking and displaying
• Relevance feedback (RF)
Step I: Image Indexing
• Image content = { Color, Shape, Texture }
• Color: color histogram, color moments
• Shape: chain codes, Fourier descriptors
• Texture: Gabor wavelet features, co-occurrence matrix
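As a concrete illustration of the indexing step, the sketch below computes two of the color features named above: a quantized RGB color histogram and the first two color moments (mean and standard deviation per channel). It is a minimal sketch, assuming an image is available as a flat list of 8-bit RGB tuples; the function names are illustrative, not from any particular CBIR system.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize each RGB channel into bins_per_channel bins and count
    pixels per (r_bin, g_bin, b_bin) cell, normalized to sum to 1."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1
    n = len(pixels)
    return [h / n for h in hist]

def color_moments(pixels):
    """First two color moments (mean, standard deviation) per channel."""
    n = len(pixels)
    feats = []
    for c in range(3):
        vals = [p[c] for p in pixels]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / n
        feats += [mean, var ** 0.5]
    return feats
```

A real indexer would compute analogous vectors for shape (e.g., Fourier descriptors of contours) and texture (e.g., Gabor filter responses) and concatenate them.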
Step II: Content Matching
• Content similarity is measured by a distance function d(z, x_j), where z is the feature vector of the query image and x_j is the feature vector of the j-th image in the database.
• Many distance functions have been used:
  • Euclidean distance
  • l1-norm
  • cosine measure
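The three distance measures listed above can be sketched directly on plain feature vectors (lists of floats); this is a generic illustration, not tied to any specific system:

```python
import math

def euclidean(z, x):
    """Euclidean (l2) distance between query z and database vector x."""
    return math.sqrt(sum((zi - xi) ** 2 for zi, xi in zip(z, x)))

def l1_norm(z, x):
    """l1-norm (city-block) distance."""
    return sum(abs(zi - xi) for zi, xi in zip(z, x))

def cosine_distance(z, x):
    """1 minus the cosine of the angle between z and x:
    0 when the vectors point the same way, up to 2 when opposite."""
    dot = sum(zi * xi for zi, xi in zip(z, x))
    nz = math.sqrt(sum(zi * zi for zi in z))
    nx = math.sqrt(sum(xi * xi for xi in x))
    return 1.0 - dot / (nz * nx)
```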
Step III: Similarity Ranking
• Calculate d(z, x_j) for each image x_j in the database (assume J = 10 images)
• Sort by distance in increasing order, i.e., by similarity in decreasing order
• top 3 images: image6, image2, image9
• top 5 images: image6, image2, image9, image10, image7
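The ranking step reduces to sorting the database by distance to the query and taking the first k entries. A minimal sketch, with image ids and 1-D feature values chosen to mirror the slide's top-3/top-5 example:

```python
def rank(query, database, dist, k):
    """Return the ids of the k database images closest to the query.
    database: {image_id: feature_vector}; dist: distance function."""
    ranked = sorted(database, key=lambda iid: dist(query, database[iid]))
    return ranked[:k]
```

For example, with `db = {"image6": [0.5], "image2": [1.0], "image9": [2.0], "image10": [3.0], "image7": [4.0]}` and an absolute-difference distance, `rank([0.0], db, dist, 3)` yields image6, image2, image9 as in the slide.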
Problems with CBIR
In essence, retrieval is a pattern recognition problem with special characteristics:
• Huge volume of (visual) data.
• High dimensionality of the feature space.
• Query design: the gap between high-level concepts and low-level features.
• Linear matching criteria: a mismatch with popular models of human perception.
Example: Compressed Domain
• Visual databases in the compressed domain:
  • DCT: many current image/video coding standards: JPEG, MPEG-1/2, and H.261/H.263.
  • Wavelets/VQ: related to the new image coding standard JPEG2000.
• Significant gap between human visual perception and the information representation in the DCT/wavelet domain.
[Diagram: a JPEG/MPEG database (Image.jpg, video.mpg) whose feature vectors are DCT coefficients]
State-of-the-art • Human controlled interactive CBIR (HCI-CBIR) • Integrating human perception into content-based retrieval. • Machine controlled interactive CBIR (MCI-CBIR) • To reduce bandwidth requirement for browsing and searching over the Internet. • To minimize errors caused by excessive human involvement.
Scenario
• The machine provides initial retrieval results through query-by-keyword, sketch, or example;
• Iteratively:
  • the user judges whether, and to what degree, the current results are relevant to his/her request;
  • the machine learns and tries again.
Relevance Feedback
• The user gives feedback on the query results
• The system recalculates the feature weights and modifies the query
[Diagram: initial sample → query → 1st result → feedback → 2nd result → feedback → …]
Basic GUI for Relevance Feedback
[Screenshot: feedback controls with a checkbox and a slider per retrieved image]
[Screenshot: image-group result view and palette panel]
3D MARS
[Screenshot: initial display and retrieval result, with images laid out along texture, structure, and color axes]
Human-Controlled Interactive CBIR (HCI-CBIR) • An attractive solution to numerous applications • Main feature: an active role played by users to improve retrieval accuracy • State-of-the-art • query design considerations • linear criteria in similarity ranking
Effective Retrieval through User Interaction
• Current systems:
  • QBIC: interactive region segmentation (IBM).
  • FourEyes: including the human in the image annotation and retrieval loop (MIT Media Lab).
  • WebSEEk: dynamic feature vector recomputation based on the user’s feedback (Columbia University).
  • PicHunter: a Bayesian framework for content-based image retrieval (NEC Research Institute).
  • PicToSeek: (University of Amsterdam).
  • MARS: a relevance feedback architecture in image retrieval (UIUC).
A “New” Proposal for HCI-CBIR The framework: Relevance feedback The key features: • Modeling: mapping a high level concept to low level features • Matching: • capturing user’s perceptual subjectivity to modify the query using non-linear measurement • overcoming the difficulties faced by the traditional linear matching criteria
The Relevance Feedback Framework
• The goal: measure feature relevance to improve the performance of image matching in retrieval
• A supervised learning procedure based on regression
• For a given query z and a set of retrieved items {x_n}, categorize each x_n into two classes:
  • a relevant class (visually similar to z): x_m, m = 1, 2, …, M, and
  • an irrelevant class (not similar to z): x_q, q = 1, 2, …, Q
• Construct a new query based on the information in x_m and x_q
• Use the new query in the next round of retrieval
Query Modification Model 1
The new query is the centroid of the relevant items:
  z_1 = (1/M) Σ_{m=1}^{M} x_m
Query Modification Models 2 & 3 (Anti-Reinforcement Learning)
The new query starts from the center of the relevant items,
  x̄ = (1/M) Σ_{m=1}^{M} x_m,
and is shifted relative to N̄, the center of the non-relevant items (1-D example):
  z_2 = x̄ + α (z′ − N̄)  when x̄ < z′
  z_2 = x̄ + β (z′ − N̄)  when x̄ > z′
where z′ is the query at the previous iteration and α, β are small positive constants.
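The query-modification idea can be sketched in a few lines: Model 1 takes the centroid of the relevant items, while the anti-reinforcement variant additionally pushes the new query away from the centroid of the non-relevant items. The vectorized update below is an assumption (the slides give a per-dimension 1-D rule); `alpha` is a small positive constant as in the slides.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def model1(x_rel):
    """Model 1: the new query is the centroid of the relevant items."""
    return centroid(x_rel)

def anti_reinforce(x_rel, x_irr, alpha=0.1):
    """Anti-reinforcement sketch: start at the relevant centroid and
    move away from the non-relevant centroid by a small step alpha."""
    xb, nb = centroid(x_rel), centroid(x_irr)
    return [xi + alpha * (xi - ni) for xi, ni in zip(xb, nb)]
```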
Nonlinear Search Unit
• Non-linear (Gaussian) Search Unit (NSU)
  • x: the image feature vector
  • z: the adjustable query vector
  • σ_i: the tuning parameters (NSU widths)
• Small σ_i: a relevant feature (sensitive to change)
• Large σ_i: a non-relevant feature (insensitive to change)
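A minimal sketch of such a Gaussian search unit, assuming the similarity is a Gaussian of the width-weighted squared distance (the exact functional form in the slides is an image, so this form is an assumption): a small width σ_i amplifies differences in feature i, a large width suppresses them.

```python
import math

def nsu_similarity(x, z, sigma):
    """Gaussian search unit: similarity in (0, 1] between feature
    vector x and query z, with per-feature widths sigma."""
    d2 = sum(((xi - zi) / si) ** 2 for xi, zi, si in zip(x, z, sigma))
    return math.exp(-0.5 * d2)
```

With a huge width on one feature, a large difference there barely lowers the similarity, which is exactly how an irrelevant feature is switched off.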
Linear Search Unit (LSU) • To benchmark the performance of the NSU • To initiate the search • The parameters: exactly the same as in the NSU
Architecture for Interactive CBIR
[Diagram: feature extraction stores feature vectors in the feature database; a query image triggers the initial search when iteration n = 0; when n > 0, the user’s perceptual-similarity feedback drives weight-parameter updating of the similarity measure, and the top k retrieved images are returned for the next round of interactive searching.]
VQ Codewords as “Content Descriptors”
The usage of codewords reflects the content of the encoded input image.
[Diagram: image blocks are mapped through a codebook to code labels i = 1, 2, …, n]
Two-level WT/VQ Coding (1 bpp)
[Diagram: Mallat’s two-level wavelet decomposition; the subbands (HL2, HL1, VL2, VL1, DL2) are coded with a multiresolution codebook (CB1–CB5) at rates from 0 to 8 bpp, and a label histogram summarizes codeword usage.]
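The codewords-as-descriptors idea can be sketched as follows: each image block is assigned the label of its nearest codeword, and the normalized histogram of labels serves as the image signature. Codebook and blocks are plain vectors here; this is a generic illustration, not the multiresolution WT/VQ coder itself.

```python
def nearest_label(block, codebook):
    """Index of the codeword with minimum squared distance to the block."""
    best, best_d = 0, float("inf")
    for i, cw in enumerate(codebook):
        d = sum((b - c) ** 2 for b, c in zip(block, cw))
        if d < best_d:
            best, best_d = i, d
    return best

def label_histogram(blocks, codebook):
    """Normalized histogram of codeword usage: the content descriptor."""
    hist = [0.0] * len(codebook)
    for blk in blocks:
        hist[nearest_label(blk, codebook)] += 1
    return [h / len(blocks) for h in hist]
```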
Test Database 1: the Brodatz Database
• A texture image database provided by Manjunath, at http://vivaldi.ece.ucsb.edu/users/wei/codes.html
• 1,856 patterns in 116 different classes
• 16 similar patterns in each class
• Maintained as a single unclassified image database
Queries (the Brodatz Database) [116 different image classes]
Performance Comparison • Methods Compared • LSU2: linear search unit & query model 2 • NSU2: non-linear search unit & query model 2 • Interactive CBIR in MARS: Multimedia Analysis and Retrieval System (developed at UIUC)
Retrieval Results (the Brodatz Database)

Table 1. Average retrieval rate (%)

Method | t=0  | t=1  | t=2  | t=3
LSU2   | 73.7 | 83.0 | 85.1 | 85.9
NSU2   | 73.7 | 84.9 | 88.2 | 89.2
MARS   | 67.0 | 75.1 | 76.4 | 76.7

Note: the retrieval rate is defined as the average percentage of images belonging to the same class as the query among the top 16 matches.
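The metric in Table 1 can be sketched as a small function: for each query, count how many of the top-16 retrievals share the query's class, then average over all queries. The data structures (`retrieved` mapping a query id to its ranked result ids, `cls` mapping an image id to its class) are illustrative assumptions.

```python
def retrieval_rate(retrieved, cls, k=16):
    """Average percentage of same-class images among the top-k results."""
    total = 0.0
    for q, results in retrieved.items():
        hits = sum(1 for r in results[:k] if cls[r] == cls[q])
        total += hits / k
    return 100.0 * total / len(retrieved)
```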
iARM: Interactive-based Analysis and Retrieval of Multimedia On The Internet @iarm.ee.ryerson.ca:8000/corel
Strategy
• iARM implements interactive retrieval for a large image database, running on a J2EE web server.
• Interaction architecture:
  • based on non-linear relevance feedback via a multi-model SRBF network;
  • uses both positive and negative feedback.
• Properties: local and non-linear learning; fast and robust on small input data.
A Single-Pass Radial Basis Function (SRBF) Network
• Built from the positive examples.
• The SRBF network characterizes the query by multiple clusters, each of which is modeled by a p-dimensional Gaussian distribution (1), with parameters defined in (2).
SRBF Network (cont.)
• The weighted-Euclidean space (3), with weights defined in (4).
• A summation of the M Gaussian units (1) yields the similarity function (5) for the input vector.
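Since equations (1)–(5) were slide images, the sketch below only illustrates the stated structure: M Gaussian units in a weighted-Euclidean space, summed into a single similarity score. The specific centers, widths, and uniform default weights are assumptions, not the network's trained parameters.

```python
import math

def srbf_similarity(x, centers, sigmas, weights=None):
    """Sum of M Gaussian units: each cluster m has a center vector,
    per-dimension widths, and a mixing weight (uniform by default)."""
    M = len(centers)
    weights = weights or [1.0 / M] * M
    s = 0.0
    for c, sg, w in zip(centers, sigmas, weights):
        d2 = sum(((xi - ci) / si) ** 2 for xi, ci, si in zip(x, c, sg))
        s += w * math.exp(-0.5 * d2)
    return s
```

Modeling the query with several local Gaussians rather than one global unit lets the network cover a multi-modal notion of relevance, which matches the "multi-class approach" figure.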
[Figure: comparison of the multi-class and single-class approaches]
Negative Feedback Strategy
• Tune the decision boundary with negative samples:
• Anti-reinforced learning algorithm:
  • if condition (6) holds,
  • then apply update (7).
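Equations (6) and (7) were slide images, so the sketch below only illustrates the anti-reinforced idea: when a negative sample falls close enough to a cluster center that the model would rank it as similar, the nearest center is nudged away from it. The closeness threshold and step size `eta` are assumptions.

```python
def anti_reinforced_update(centers, x_neg, eta=0.1, threshold=1.0):
    """Push the center nearest to a negative sample away from it."""
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    m = min(range(len(centers)), key=lambda i: d2(centers[i], x_neg))
    if d2(centers[m], x_neg) < threshold ** 2:      # condition, cf. (6)
        centers[m] = [ci - eta * (xi - ci)          # move away, cf. (7)
                      for ci, xi in zip(centers[m], x_neg)]
    return centers
```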
Performance of iARM
• Using the Corel image collection, containing 40,000 real-life images (www.corel.com).
• A total of 400 queries were generated, and relevance judgments were based on the ground truth from Corel.
• Multiple descriptors: <shape, color, texture> = <Fourier descriptors; HSV color histogram and color moments; Gabor wavelet transform>
Result

Table 1: Average precision rate (%) over 400 queries, measured from the top 16 retrievals.

Method    | Non-interactive CBIR | iARM r(1) | iARM r(2) | iARM r(3) | iARM r(8)
Precision | 53%                  | 80.08%    | 85.99%    | 87.58%    | 89.00%
Example: looking for “model”.
0. Start by choosing the image at the bottom right corner as the query.
1. Result after the initial search; five relevant images are then marked as feedback.
2. Result after one relevance feedback: all the top sixteen are relevant.