Image Retrieval Part II
Topics
• Applications of CBIR in digital libraries
• Human-controlled interactive CBIR
• Machine-controlled interactive CBIR
Query by Example
• Pick query examples and ask the system to retrieve “similar” images.
[Diagram: a query sample is submitted to the CBIR system (“get similar images”), which returns the retrieval results]
QBIC(TM) – IBM's Query By Image Content http://www.hermitagemuseum.org/fcgi-bin/db2www/qbicSearch.mac/qbic?selLang=English
NETRA @ UCSB http://nayana.ece.ucsb.edu/M7TextureDemo/Demo/client/M7TextureDemo.html
Medical Decision Support • Breast cancer is among the top killers of women in the developed world. • Early detection of malignancy can greatly reduce the risk of death. Mammogram
Summary of Fundamental CBIR
• CBIR using query by example
• CBIR algorithm:
  • First step --- image indexing
  • Second step --- content matching
  • Third step --- ranking and displaying
• Relevance feedback (RF)
Step I: Image Indexing
• Image content = { Color, Shape, Texture }
• Color: color histogram, color moments
• Shape: chain codes, Fourier descriptors
• Texture: Gabor wavelet features, co-occurrence matrix
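As a concrete illustration of the indexing step, the sketch below computes two of the color features named above: a quantized RGB color histogram and the first two color moments (mean and standard deviation per channel). It is a minimal sketch, assuming an image is available as a flat list of 8-bit RGB tuples; the function names are illustrative, not from any particular CBIR system.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize each RGB channel into bins_per_channel bins and count
    pixels per (r_bin, g_bin, b_bin) cell, normalized to sum to 1."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1
    n = len(pixels)
    return [h / n for h in hist]

def color_moments(pixels):
    """First two color moments (mean, standard deviation) per channel."""
    n = len(pixels)
    feats = []
    for c in range(3):
        vals = [p[c] for p in pixels]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / n
        feats += [mean, var ** 0.5]
    return feats
```

A real indexer would compute analogous vectors for shape (e.g., Fourier descriptors of contours) and texture (e.g., Gabor filter responses) and concatenate them.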
Step II: Content Matching
• Content similarity is measured by a distance function d(z, x_j), where z is the feature vector of the query image and x_j is the feature vector of the j-th image in the database.
• Many distance functions have been used:
  • Euclidean distance
  • l1-norm
  • cosine measure
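The three distance measures listed above can be sketched directly on plain feature vectors (lists of floats); this is a generic illustration, not tied to any specific system:

```python
import math

def euclidean(z, x):
    """Euclidean (l2) distance between query z and database vector x."""
    return math.sqrt(sum((zi - xi) ** 2 for zi, xi in zip(z, x)))

def l1_norm(z, x):
    """l1-norm (city-block) distance."""
    return sum(abs(zi - xi) for zi, xi in zip(z, x))

def cosine_distance(z, x):
    """1 minus the cosine of the angle between z and x:
    0 when the vectors point the same way, up to 2 when opposite."""
    dot = sum(zi * xi for zi, xi in zip(z, x))
    nz = math.sqrt(sum(zi * zi for zi in z))
    nx = math.sqrt(sum(xi * xi for xi in x))
    return 1.0 - dot / (nz * nx)
```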
Step III: Similarity Ranking
• Calculate d(z, x_j) for each image x_j in the database (assume J = 10 images)
• Sort by distance in increasing order, i.e., by similarity in decreasing order
• top 3 images: image6, image2, image9
• top 5 images: image6, image2, image9, image10, image7
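The ranking step reduces to sorting the database by distance to the query and taking the first k entries. A minimal sketch, with image ids and 1-D feature values chosen to mirror the slide's top-3/top-5 example:

```python
def rank(query, database, dist, k):
    """Return the ids of the k database images closest to the query.
    database: {image_id: feature_vector}; dist: distance function."""
    ranked = sorted(database, key=lambda iid: dist(query, database[iid]))
    return ranked[:k]
```

For example, with `db = {"image6": [0.5], "image2": [1.0], "image9": [2.0], "image10": [3.0], "image7": [4.0]}` and an absolute-difference distance, `rank([0.0], db, dist, 3)` yields image6, image2, image9 as in the slide.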
Problems with CBIR
In essence, retrieval is a pattern recognition problem with special characteristics:
• Huge volume of (visual) data.
• High dimensionality of the feature space.
• Query design: the gap between high-level concepts and low-level features.
• Linear matching criteria: a mismatch with popular models of human perception.
Example: Compressed Domain
• Visual databases in the compressed domain:
  • DCT: many current image/video coding standards: JPEG, MPEG-1/2, and H.261/H.263.
  • Wavelets/VQ: related to the new image coding standard JPEG2000.
• Significant gap between human visual perception and the information representation in the DCT/wavelet domain.
[Diagram: a JPEG/MPEG database (Image.jpg, video.mpg) whose feature vectors are DCT coefficients]
State-of-the-art • Human controlled interactive CBIR (HCI-CBIR) • Integrating human perception into content-based retrieval. • Machine controlled interactive CBIR (MCI-CBIR) • To reduce bandwidth requirement for browsing and searching over the Internet. • To minimize errors caused by excessive human involvement.
Scenario
• The machine provides initial retrieval results through query-by-keyword, sketch, or example;
• Iteratively:
  • the user judges whether, and to what degree, the current results are relevant to his/her request;
  • the machine learns and tries again.
Relevance Feedback
• The user gives feedback on the query results
• The system recalculates the feature weights and modifies the query
[Diagram: initial sample → query → 1st result → feedback → 2nd result → feedback → …]
Basic GUI for Relevance Feedback
[Screenshot: feedback controls with a checkbox and a slider per retrieved image]
[Screenshot: image-group result view and palette panel]
3D MARS
[Screenshot: initial display and retrieval result, with images laid out along texture, structure, and color axes]
Human-Controlled Interactive CBIR (HCI-CBIR) • An attractive solution to numerous applications • Main feature: an active role played by users to improve retrieval accuracy • State-of-the-art • query design considerations • linear criteria in similarity ranking
Effective Retrieval through User Interaction
• Current systems:
  • QBIC: interactive region segmentation (IBM).
  • FourEyes: including the human in the image annotation and retrieval loop (MIT Media Lab).
  • WebSEEk: dynamic feature vector recomputation based on the user’s feedback (Columbia University).
  • PicHunter: a Bayesian framework for content-based image retrieval (NEC Research Institute).
  • PicToSeek: (University of Amsterdam).
  • MARS: a relevance feedback architecture in image retrieval (UIUC).
A “New” Proposal for HCI-CBIR The framework: Relevance feedback The key features: • Modeling: mapping a high level concept to low level features • Matching: • capturing user’s perceptual subjectivity to modify the query using non-linear measurement • overcoming the difficulties faced by the traditional linear matching criteria
The Relevance Feedback Framework
• The goal: measure feature relevance to improve the performance of image matching in retrieval
• A supervised learning procedure based on regression
• For a given query z and a set of retrieved items {x_n}, categorize each x_n into two classes:
  • a relevant class (visually similar to z): x_m, m = 1, 2, …, M, and
  • an irrelevant class (not similar to z): x_q, q = 1, 2, …, Q
• Construct a new query based on the information in x_m and x_q
• Use the new query in the next round of retrieval
Query Modification Model 1
The new query is the centroid of the relevant items:
  z_1 = (1/M) Σ_{m=1}^{M} x_m
Query Modification Models 2 & 3 (Anti-Reinforcement Learning)
The new query starts from the center of the relevant items,
  x̄ = (1/M) Σ_{m=1}^{M} x_m,
and is shifted relative to N̄, the center of the non-relevant items (1-D example):
  z_2 = x̄ + α (z′ − N̄)  when x̄ < z′
  z_2 = x̄ + β (z′ − N̄)  when x̄ > z′
where z′ is the query at the previous iteration and α, β are small positive constants.
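The query-modification idea can be sketched in a few lines: Model 1 takes the centroid of the relevant items, while the anti-reinforcement variant additionally pushes the new query away from the centroid of the non-relevant items. The vectorized update below is an assumption (the slides give a per-dimension 1-D rule); `alpha` is a small positive constant as in the slides.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def model1(x_rel):
    """Model 1: the new query is the centroid of the relevant items."""
    return centroid(x_rel)

def anti_reinforce(x_rel, x_irr, alpha=0.1):
    """Anti-reinforcement sketch: start at the relevant centroid and
    move away from the non-relevant centroid by a small step alpha."""
    xb, nb = centroid(x_rel), centroid(x_irr)
    return [xi + alpha * (xi - ni) for xi, ni in zip(xb, nb)]
```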
Nonlinear Search Unit
• Non-linear (Gaussian) Search Unit (NSU)
  • x: the image feature vector
  • z: the adjustable query vector
  • σ_i: the tuning parameters (NSU widths)
• Small σ_i: a relevant feature (sensitive to change)
• Large σ_i: a non-relevant feature (insensitive to change)
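A minimal sketch of such a Gaussian search unit, assuming the similarity is a Gaussian of the width-weighted squared distance (the exact functional form in the slides is an image, so this form is an assumption): a small width σ_i amplifies differences in feature i, a large width suppresses them.

```python
import math

def nsu_similarity(x, z, sigma):
    """Gaussian search unit: similarity in (0, 1] between feature
    vector x and query z, with per-feature widths sigma."""
    d2 = sum(((xi - zi) / si) ** 2 for xi, zi, si in zip(x, z, sigma))
    return math.exp(-0.5 * d2)
```

With a huge width on one feature, a large difference there barely lowers the similarity, which is exactly how an irrelevant feature is switched off.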
Linear Search Unit (LSU) • To benchmark the performance of the NSU • To initiate the search • The parameters: exactly the same as in the NSU
Architecture for Interactive CBIR
[Diagram: feature extraction stores feature vectors in the feature database; a query image triggers the initial search when iteration n = 0; when n > 0, the user’s perceptual-similarity feedback drives weight-parameter updating of the similarity measure, and the top k retrieved images are returned for the next round of interactive searching.]
VQ Codewords as “Content Descriptors”
The usage of codewords reflects the content of the encoded input image.
[Diagram: image blocks are mapped through a codebook to code labels i = 1, 2, …, n]
Two-level WT/VQ Coding (1 bpp)
[Diagram: Mallat’s two-level wavelet decomposition; the subbands (HL2, HL1, VL2, VL1, DL2) are coded with a multiresolution codebook (CB1–CB5) at rates from 0 to 8 bpp, and a label histogram summarizes codeword usage.]
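The codewords-as-descriptors idea can be sketched as follows: each image block is assigned the label of its nearest codeword, and the normalized histogram of labels serves as the image signature. Codebook and blocks are plain vectors here; this is a generic illustration, not the multiresolution WT/VQ coder itself.

```python
def nearest_label(block, codebook):
    """Index of the codeword with minimum squared distance to the block."""
    best, best_d = 0, float("inf")
    for i, cw in enumerate(codebook):
        d = sum((b - c) ** 2 for b, c in zip(block, cw))
        if d < best_d:
            best, best_d = i, d
    return best

def label_histogram(blocks, codebook):
    """Normalized histogram of codeword usage: the content descriptor."""
    hist = [0.0] * len(codebook)
    for blk in blocks:
        hist[nearest_label(blk, codebook)] += 1
    return [h / len(blocks) for h in hist]
```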
Test Database 1: the Brodatz Database
• A texture image database provided by Manjunath, at http://vivaldi.ece.ucsb.edu/users/wei/codes.html
• 1,856 patterns in 116 different classes
• 16 similar patterns in each class
• Maintained as a single unclassified image database
Queries (the Brodatz Database) [116 different image classes]
Performance Comparison • Methods Compared • LSU2: linear search unit & query model 2 • NSU2: non-linear search unit & query model 2 • Interactive CBIR in MARS: Multimedia Analysis and Retrieval System (developed at UIUC)
Retrieval Results (the Brodatz Database)

Table 1. Average retrieval rate (%)

Method | t=0  | t=1  | t=2  | t=3
LSU2   | 73.7 | 83.0 | 85.1 | 85.9
NSU2   | 73.7 | 84.9 | 88.2 | 89.2
MARS   | 67.0 | 75.1 | 76.4 | 76.7

Note: the retrieval rate is defined as the average percentage of images belonging to the same class as the query among the top 16 matches.
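The metric in Table 1 can be sketched as a small function: for each query, count how many of the top-16 retrievals share the query's class, then average over all queries. The data structures (`retrieved` mapping a query id to its ranked result ids, `cls` mapping an image id to its class) are illustrative assumptions.

```python
def retrieval_rate(retrieved, cls, k=16):
    """Average percentage of same-class images among the top-k results."""
    total = 0.0
    for q, results in retrieved.items():
        hits = sum(1 for r in results[:k] if cls[r] == cls[q])
        total += hits / k
    return 100.0 * total / len(retrieved)
```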
iARM: Interactive-based Analysis and Retrieval of Multimedia On The Internet @iarm.ee.ryerson.ca:8000/corel
Strategy
• iARM implements interactive retrieval for a large image database, running on a J2EE web server.
• Interaction architecture:
  • based on non-linear relevance feedback via a multi-model SRBF network;
  • uses both positive and negative feedback.
• Properties: local and non-linear learning; fast and robust on small input data.
A Single-Pass Radial Basis Function (SRBF) Network
• Built from the positive examples.
• The SRBF network characterizes the query by multiple clusters, each of which is modeled by a p-dimensional Gaussian distribution (1), with parameters defined in (2).
SRBF Network (cont.)
• The weighted-Euclidean space (3), with weights defined in (4).
• A summation of the M Gaussian units (1) yields the similarity function (5) for the input vector.
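Since equations (1)–(5) were slide images, the sketch below only illustrates the stated structure: M Gaussian units in a weighted-Euclidean space, summed into a single similarity score. The specific centers, widths, and uniform default weights are assumptions, not the network's trained parameters.

```python
import math

def srbf_similarity(x, centers, sigmas, weights=None):
    """Sum of M Gaussian units: each cluster m has a center vector,
    per-dimension widths, and a mixing weight (uniform by default)."""
    M = len(centers)
    weights = weights or [1.0 / M] * M
    s = 0.0
    for c, sg, w in zip(centers, sigmas, weights):
        d2 = sum(((xi - ci) / si) ** 2 for xi, ci, si in zip(x, c, sg))
        s += w * math.exp(-0.5 * d2)
    return s
```

Modeling the query with several local Gaussians rather than one global unit lets the network cover a multi-modal notion of relevance, which matches the "multi-class approach" figure.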
[Figure: comparison of the multi-class and single-class approaches]
Negative Feedback Strategy
• Tune the decision boundary with negative samples:
• Anti-reinforced learning algorithm:
  • if condition (6) holds,
  • then apply update (7).
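Equations (6) and (7) were slide images, so the sketch below only illustrates the anti-reinforced idea: when a negative sample falls close enough to a cluster center that the model would rank it as similar, the nearest center is nudged away from it. The closeness threshold and step size `eta` are assumptions.

```python
def anti_reinforced_update(centers, x_neg, eta=0.1, threshold=1.0):
    """Push the center nearest to a negative sample away from it."""
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    m = min(range(len(centers)), key=lambda i: d2(centers[i], x_neg))
    if d2(centers[m], x_neg) < threshold ** 2:      # condition, cf. (6)
        centers[m] = [ci - eta * (xi - ci)          # move away, cf. (7)
                      for ci, xi in zip(centers[m], x_neg)]
    return centers
```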
Performance of iARM
• Using the Corel image collection, containing 40,000 real-life images (www.corel.com).
• A total of 400 queries were generated, and relevance judgments were based on the ground truth from Corel.
• Multiple descriptors: <shape, color, texture> = <Fourier descriptors; HSV color histogram and color moments; Gabor wavelet transform>
Result

Table 1: Average precision rate (%) over 400 queries, measured from the top 16 retrievals.

Method    | Non-interactive CBIR | iARM r(1) | iARM r(2) | iARM r(3) | iARM r(8)
Precision | 53%                  | 80.08%    | 85.99%    | 87.58%    | 89.00%
Example: looking for “model”.
0. Start by choosing the image at the bottom right corner as the query.
1. Result after the initial search; five relevant images are then marked as feedback.
2. Result after one relevance feedback: all the top sixteen are relevant.