
Image Databases


  1. Image Databases
  • In a conventional relational database, the user types in a query and obtains an answer in response.
  • Image databases are different:
    • a police officer may issue the query: "Retrieve all pictures from the image database that are 'similar' to this person and give the identities of the people."
  • This query is fundamentally different from ordinary queries for two reasons:
    1. The query includes a picture as part of the query.
    2. The query asks about similar pictures and therefore uses a notion of "imprecise match".

  2. Raw Images
  • The content of an image consists of all "interesting" objects in that image.
  • Each object is characterized by:
    • a shape descriptor, which describes the shape/location of the region within which the object is located inside the image
    • a property descriptor, which describes the properties of individual pixels (e.g., RGB values of the pixel, RGB values aggregated over a group of pixels, grayscale levels)
  • A property consists of:
    • a property name, e.g., red, green, blue, texture
    • a property domain: the range of values that the property can assume, e.g., {0, 1, ..., 7}

  3. Images
  • Every image is associated with a pair of positive integers (m,n), called the grid resolution, which divides the image into m×n cells of equal size (the image grid).
  • Each cell consists of a collection of pixels.
  • A cell property is a triple (Name, Values, Method). Examples (see the sketch below):
    • (bwcolor, {b,w}, bwalgo), where the possible values are b (black) and w (white), and bwalgo is an algorithm that takes a cell as input and returns either black or white by somehow combining the black/white levels of the pixels in the cell
    • (graylevel, [0,1], grayalgo), where the possible values are real numbers in the interval [0,1]
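
A minimal Python sketch (an illustration, not part of the slides) of cell properties as (Name, Values, Method) triples. The names bwcolor and graylevel come from the slide; the concrete combining rules (mean intensity, an arbitrary 0.5 threshold) and the array layout are assumptions.

```python
import numpy as np

def bwalgo(cell: np.ndarray) -> str:
    """Combine the pixel intensities of a cell into a single b/w value
    (here by thresholding the mean intensity at 0.5, an assumed rule)."""
    return "b" if cell.mean() < 0.5 else "w"

def grayalgo(cell: np.ndarray) -> float:
    """Return a gray level in [0, 1] for the cell (here the mean intensity)."""
    return float(cell.mean())

# Cell properties as (Name, Values, Method) triples
properties = [
    ("bwcolor", {"b", "w"}, bwalgo),
    ("graylevel", (0.0, 1.0), grayalgo),   # (0.0, 1.0) stands for the interval [0, 1]
]

# Evaluate both properties on one 8x8 cell of random pixel intensities
cell = np.random.rand(8, 8)
for name, values, method in properties:
    print(name, "=", method(cell))
```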

  4. Image Database
  • An image database is a triple (GI, Prop, Rec), where:
    • GI is a set of gridded images (Image, m, n)
    • Prop is a set of cell properties
    • Rec is a mapping that associates with each image a set of rectangles denoting objects (in fact, the regions do not necessarily have to be rectangles)

  5. Problems with Image Databases
  • Images are often very large:
    • it is infeasible to explicitly store the properties on a pixel-by-pixel basis
    • this led to a family of image "compression" techniques, which attempt to compress the image into one containing fewer pixels
  • There is a need to determine the "features" of the image (compressed or raw):
    • this is done by "segmentation": breaking up the image into a set of homogeneous rectangular regions called segments
  • There is a need to support "match" operations that compare either a whole image or a segmented image against another.

  6. Image Compression
  • Lossy compression:
    • an image may contain details that the human eye cannot recognize
    • get rid of those details
  • DCT (Discrete Cosine Transform), DFT (Discrete Fourier Transform), DWT (Discrete Wavelet Transform):
    • convert images from the time (spatial) domain to the frequency domain
    • get rid of the frequencies which do not contain information
  • Transforms:
    • DCT and DFT are similar concepts: both map from the time domain to the frequency domain
    • given a signal of length n, these transforms return a sequence of n frequencies:
      Sample_1, Sample_2, ..., Sample_n transforms to Freq_1, Freq_2, ..., Freq_n
  • A small numerical sketch of DCT-based lossy compression follows below.
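
A minimal sketch of DCT-based lossy compression of a 1-D signal, assuming SciPy is available. The test signal, the orthonormal DCT, and the choice to keep only the 8 largest coefficients are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

signal = np.cos(np.linspace(0, 4 * np.pi, 64)) + 0.05 * np.random.randn(64)

# Forward DCT: n samples -> n frequency coefficients
coeffs = dct(signal, norm="ortho")

# "Compress" by zeroing all but the 8 largest-magnitude coefficients
# (the cutoff of 8 is an arbitrary choice for illustration)
keep = np.argsort(np.abs(coeffs))[-8:]
compressed = np.zeros_like(coeffs)
compressed[keep] = coeffs[keep]

# The inverse DCT reconstructs an approximation of the original signal
reconstructed = idct(compressed, norm="ortho")
print("max reconstruction error:", np.max(np.abs(signal - reconstructed)))
```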

  7. Why Do We Use the Transform?
  • Noise removal is easier in the frequency domain.
  • Various filters are easier to implement in the frequency domain.
  • Compression (the transform gathers similar values together).

  8. Desirable Properties of Transforms
  • DFT:
    • Invertibility: it is possible to get back the original image I from its DFT representation (useful for decompression).
      Note: practical implementations often combine the DFT with other, non-invertible operations and thus sacrifice invertibility.
    • Distance preservation: the DFT preserves Euclidean distance.
      This is important in image-matching applications, where we often use distance measures to represent similarity levels (see the sketch below).
  • DCT:
    • the DCT preserves all of the above
    • a given signal can be represented with fewer frequencies
  • DWT:
    • the DFT and DCT have no temporal locality: a change in one single part of the data changes all frequencies
    • wavelets introduce locality
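
A minimal numerical check (an illustration, not from the slides) of the distance-preservation property: with the orthonormal DFT, the Euclidean distance between two signals equals the distance between their transforms (Parseval's theorem).

```python
import numpy as np

x = np.random.rand(256)
y = np.random.rand(256)

# norm="ortho" makes the DFT an orthonormal (distance-preserving) transform
X = np.fft.fft(x, norm="ortho")
Y = np.fft.fft(y, norm="ortho")

d_time = np.linalg.norm(x - y)   # distance in the time (spatial) domain
d_freq = np.linalg.norm(X - Y)   # distance in the frequency domain
print(d_time, d_freq)            # the two values agree up to floating-point rounding
```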

  9. Distance preservation

  10. Distance preservation

  11. Fractal Compression
  • Transform-based approaches benefit from the difference in visual perception at different frequencies.
  • What else can we use for compression? Self-similarity.
  • We can find self-similarities in a given image and describe the image in terms of these similarities.

  12. Fractal Compression

  13. Image Processing: Segmentation
  • Segmentation is the process of taking an image as input and cutting it up into disjoint homogeneous regions.
  • Connected region: a region R is connected if its cells can be ordered as C1, ..., Cn such that the Euclidean distance between Ci and Ci+1 is 1 for all i < n.
  • Example (a connectivity check is sketched below):
    • R1, R2, and R3 are each connected
    • R1 ∪ R2 is connected
    • R2 ∪ R3 is connected
    • R1 ∪ R2 ∪ R3 is connected
    • R1 ∪ R3 is not connected, because the Euclidean distance between (2,3) and (3,4) is √2 > 1
  [Figure: a 4×4 grid, with rows and columns labeled 1-4, showing the cells of regions R1, R2, and R3.]
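
A minimal sketch of a connectivity check for a region given as a set of (row, col) cells. It reads the slide's chain-based definition as standard 4-neighbour connectivity (steps of Euclidean distance 1); the example coordinates are made up rather than taken from the slide's figure.

```python
from collections import deque

def is_connected(region: set[tuple[int, int]]) -> bool:
    """True if every cell of the region is reachable from any other
    through steps of Euclidean distance 1 (4-neighbour moves)."""
    if not region:
        return True
    start = next(iter(region))
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        # neighbours at Euclidean distance exactly 1
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in region and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == region

# Example (illustrative coordinates)
R1 = {(1, 1), (1, 2), (2, 2), (2, 3)}
R3 = {(3, 4), (4, 4)}
print(is_connected(R1))        # True
print(is_connected(R1 | R3))   # False: the two pieces are more than distance 1 apart
```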

  14. Measuring Homogeneity
  • A homogeneity predicate is a function H that takes any connected region as input and returns either true or false.
  • Example 1 (a sketch of this predicate follows below):
    • Suppose α is some real number between 0 and 1.
    • H^α_bw can be defined by: H^α_bw(R) is true if at least (100*α)% of the cells in R have the same color.

      Region   # of black cells   # of white cells
      R1       800                200
      R2       900                100
      R3       100                900

      Region   H^0.8_bw   H^0.89_bw   H^0.92_bw
      R1       true       false       false
      R2       true       true        false
      R3       true       true        false
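
A minimal sketch of the H^α_bw predicate, assuming a region is represented as a dict mapping cells to colors; the cell coordinates used to rebuild R1's 800/200 split are invented.

```python
from collections import Counter

def h_bw(region_colors: dict, alpha: float) -> bool:
    """True if at least (100*alpha)% of the cells share the same color."""
    counts = Counter(region_colors.values())
    largest = counts.most_common(1)[0][1]   # size of the largest color class
    return largest >= alpha * len(region_colors)

# R1 from the slide: 800 black cells and 200 white cells (coordinates are made up)
R1 = {(i, j): ("b" if i * 40 + j < 800 else "w") for i in range(25) for j in range(40)}
print(h_bw(R1, 0.8))    # True: 80% of the cells are black
print(h_bw(R1, 0.89))   # False
```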

  15. Measuring Homogeneity
  • Example 2 (sketched below):
    • Suppose each cell has a real value between 0 and 1; this value is its bw-level.
    • Suppose f assigns a value between 0 and 1 to each cell.
    • Assume δ is a noise factor and t is a threshold.
    • H_{δ,f,t}(R) is true if |{(x,y) ∈ R : |bwlevel(x,y) - f(x,y)| < δ}| / (m×n) > t
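
A minimal sketch of H_{δ,f,t}: the number of region cells whose bw-level is within δ of f(x,y), divided by the total number of grid cells m×n as on the slide, must exceed t. The choice of f (the mean bw-level) and the 2×2 example values are assumptions.

```python
def h_noise(bwlevel: dict, region: set, f, delta: float, t: float, m: int, n: int) -> bool:
    """True if the fraction of region cells within delta of f exceeds t."""
    close = sum(1 for (x, y) in region if abs(bwlevel[(x, y)] - f(x, y)) < delta)
    return close / (m * n) > t

# Example on a 2x2 grid (values are made up)
bw = {(1, 1): 0.10, (1, 2): 0.12, (2, 1): 0.11, (2, 2): 0.60}
region = set(bw)
mean = sum(bw.values()) / len(bw)
# True: 3 of the 4 cells are within 0.2 of the mean bw-level
print(h_noise(bw, region, lambda x, y: mean, delta=0.2, t=0.5, m=2, n=2))
```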

  16. Segmentation
  • Given an image I with m×n cells, a segmentation of I with respect to a homogeneity predicate H is a set of connected regions R1, ..., Rk such that:
    • Ri ∩ Rj = ∅ for all 1 ≤ i ≠ j ≤ k
    • I = R1 ∪ ... ∪ Rk
    • H(Ri) = true for all 1 ≤ i ≤ k
    • for all distinct i, j with 1 ≤ i, j ≤ k such that Ri ∪ Rj is a connected region, it is the case that H(Ri ∪ Rj) = false
  • A checker for these conditions is sketched below.
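
A minimal sketch of a checker for the four conditions above, with regions given as sets of (row, col) cells; the homogeneity predicate, the connectivity test, and the 2×2 example are toy assumptions.

```python
def is_valid_segmentation(regions, image_cells, H, connected):
    k = len(regions)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    disjoint = all(not (regions[i] & regions[j]) for i, j in pairs)       # Ri ∩ Rj = ∅
    covers = set().union(*regions) == image_cells                         # I = R1 ∪ ... ∪ Rk
    homogeneous = all(H(r) for r in regions)                              # H(Ri) = true
    maximal = all(not (connected(regions[i] | regions[j])                 # no mergeable pair
                       and H(regions[i] | regions[j])) for i, j in pairs)
    return disjoint and covers and homogeneous and maximal

# Toy example: a 2x2 image whose left and right columns have different gray levels
gray = {(1, 1): 0.1, (2, 1): 0.1, (1, 2): 0.9, (2, 2): 0.9}
H = lambda r: max(gray[c] for c in r) - min(gray[c] for c in r) <= 0.2    # assumed predicate
connected = lambda r: True                                                # simplified for 2x2
print(is_valid_segmentation([{(1, 1), (2, 1)}, {(1, 2), (2, 2)}], set(gray), H, connected))
```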

  17. An Example of Segmentation
  • Applying H_{dyn,0.03}(R) to the following 4×4 image yields the segmentation:
    • R1 = {(1,1),(1,2)}
    • R2 = {(1,3),(2,1),(2,2),(2,3)}
    • R3 = {(3,1),(3,2),(3,3),(4,1),(4,2)}
    • R4 = {(3,4),(4,3),(4,4)}
    • R5 = {(1,4),(2,4)}

      Row/Col   1      2      3      4
      1         0.1    0.25   0.5    0.5
      2         0.05   0.30   0.6    0.6
      3         0.35   0.30   0.55   0.8
      4         0.6    0.63   0.85   0.90

  18. Segmentation Algorithm
  • Split:
    • if the whole image is homogeneous, we are done
    • otherwise, split the image into two parts and recursively repeat this process until we find a set of regions R1, ..., Rn such that each region is homogeneous
  • Merge:
    • check whether any of the Ri's can be merged together
    • at the end of this step, we obtain a valid segmentation R1, ..., Rk
  • A sketch of this split-and-merge scheme follows below.
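
A minimal sketch of the split-and-merge scheme on a 2-D array of gray levels, using 0-based indexing. The homogeneity predicate (max-min spread within a tolerance), the split-along-the-longer-side rule, and the greedy merge order are assumptions, not the slides' exact algorithm.

```python
import numpy as np

def homogeneous(img, rows, cols, tol=0.1):
    """Assumed predicate: the region's max-min spread is within tol."""
    block = img[rows[0]:rows[1], cols[0]:cols[1]]
    return block.max() - block.min() <= tol

def split(img, rows, cols, tol=0.1):
    """Split: recursively halve a rectangle until every piece is homogeneous."""
    if homogeneous(img, rows, cols, tol):
        return [(rows, cols)]
    (r0, r1), (c0, c1) = rows, cols
    if r1 - r0 >= c1 - c0 and r1 - r0 > 1:        # split along the longer side
        mid = (r0 + r1) // 2
        return split(img, (r0, mid), cols, tol) + split(img, (mid, r1), cols, tol)
    if c1 - c0 > 1:
        mid = (c0 + c1) // 2
        return split(img, rows, (c0, mid), tol) + split(img, rows, (mid, c1), tol)
    return [(rows, cols)]                         # single cell

def merge(img, rects, tol=0.1):
    """Merge: greedily join adjacent regions whose union stays homogeneous."""
    regions = [{(r, c) for r in range(rs[0], rs[1]) for c in range(cs[0], cs[1])}
               for rs, cs in rects]
    merged = True
    while merged:
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                union = regions[i] | regions[j]
                vals = np.array([img[r, c] for r, c in union])
                adjacent = any(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
                               for a in regions[i] for b in regions[j])
                if adjacent and vals.max() - vals.min() <= tol:
                    regions[i], merged = union, True
                    del regions[j]
                    break
            if merged:
                break
    return regions

img = np.array([[0.1, 0.1, 0.8, 0.8],
                [0.1, 0.1, 0.8, 0.8],
                [0.5, 0.5, 0.8, 0.8],
                [0.5, 0.5, 0.8, 0.8]])
print(merge(img, split(img, (0, 4), (0, 4))))     # three homogeneous regions
```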

  19. Similarity Based Retrieval

  20. Similarity Based Retrieval

  21. Similarity Based Retrieval
  • The metric approach:
    • uses a distance measure d that can compare two images
    • the smaller the distance, the more similar the images are
    • i.e., given an input image I, find the "nearest neighbor" of I in the image archive
  • The transformation approach:
    • the metric approach assumes that the notion of similarity is fixed
    • the transformation approach instead computes the cost of transforming one image into another, based on user-specified cost functions that may vary from one query to another

  22. The Metric Approach
  • We define a distance function d on a k-dimensional space (k = n+2).
  • The distance function satisfies the following properties:
    • d(x,y) = d(y,x)
    • d(x,y) ≤ d(x,z) + d(z,y)
    • d(x,x) = 0
  • Example: let the image objects consist of 256×256 cells with 3 attributes (red, green, blue), each of which assumes a value from {0, ..., 7}.
    • d(o1,o2) = Σ_i Σ_j (diffr[i,j] + diffg[i,j] + diffb[i,j]), where
      diffr[i,j] = (o1[i,j].red - o2[i,j].red)²
      diffg[i,j] = (o1[i,j].green - o2[i,j].green)²
      diffb[i,j] = (o1[i,j].blue - o2[i,j].blue)²
    • Such computations can be cumbersome (65,536 terms are computed inside the sum); a vectorized sketch follows below.
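
A vectorized sketch of the cell-wise RGB distance above, assuming each object is stored as a 256×256×3 NumPy array of attribute values in {0, ..., 7}.

```python
import numpy as np

rng = np.random.default_rng(0)
o1 = rng.integers(0, 8, size=(256, 256, 3))   # axes: row, col, (red, green, blue)
o2 = rng.integers(0, 8, size=(256, 256, 3))

def d(o1: np.ndarray, o2: np.ndarray) -> int:
    # (o1[i,j].red - o2[i,j].red)^2 + ... summed over all i, j
    return int(((o1 - o2) ** 2).sum())

print(d(o1, o2))
```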

  23. The Metric Approach
  • How can this massive similarity computation be avoided? Through feature extraction!
  • Use a good feature extraction function fe to map objects into single points in an s-dimensional space, where s is typically quite small compared to n+2.
  • This leads to two reductions:
    • an object o is originally a set of points in an (n+2)-dimensional space; in contrast, fe(o) is a single point
    • fe(o) is a point in an s-dimensional space where s << n+2
  • The feature extraction mapping must preserve the distance relationships of the original space (a histogram-based sketch follows below).
  • (n+2)-dim space → s-dim space → indexing algorithm → index → object repository (the index could be a quadtree or an R-tree for s-dimensional data)
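
A minimal sketch of one possible feature extraction function fe: per-channel color histograms that map a 256×256×3 object to a single point in an s-dimensional space (here s = 24). The histogram choice is an assumption; the slides only require that fe roughly preserve distances.

```python
import numpy as np

def fe(obj: np.ndarray) -> np.ndarray:
    """Concatenate the normalized 8-bin histograms of the R, G, B channels."""
    feats = []
    for channel in range(3):
        hist, _ = np.histogram(obj[:, :, channel], bins=8, range=(0, 8))
        feats.append(hist / obj[:, :, channel].size)
    return np.concatenate(feats)     # a single point in R^24

rng = np.random.default_rng(1)
obj = rng.integers(0, 8, size=(256, 256, 3))
print(fe(obj).shape)                 # (24,)
```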

  24. Searching
  • Finding the best matches:
    • find the nearest neighbors of fe(o) in the tree using a nearest-neighbor search technique
  • Finding sufficiently similar objects:
    • execute a range query in the tree with center fe(o) and radius r
  • Both kinds of query are sketched below.
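
A minimal sketch of both search modes over the extracted feature points, using a k-d tree from SciPy as the index (the slides mention quadtrees and R-trees; a k-d tree is substituted here for brevity). The dimension s = 24, k = 5, and the radius r are arbitrary illustrative choices.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
archive = rng.random((1000, 24))          # fe(o) for 1000 archived images
tree = cKDTree(archive)                   # the s-dimensional index

query = rng.random(24)                    # fe(o) for the query image

# Finding the best matches: the 5 nearest neighbors of fe(o)
dist, idx = tree.query(query, k=5)
print(idx)

# Finding sufficiently similar objects: range query with center fe(o) and radius r
r = 1.5                                   # may return an empty list for random data
print(tree.query_ball_point(query, r))
```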

  25. The Transformation Approach
  • The main principle: the level of dissimilarity between o1 and o2 is proportional to the cost of transforming o1 into o2, or vice versa.
  • Transformation operators:
    • translation
    • rotation
    • scaling (uniform and non-uniform)
    • excision
  • A transformation TS of o into o' is a sequence of transformation operations (to1, to2, ..., tor) such that
    to1(o) = o1, to2(o1) = o2, ..., tor(o_{r-1}) = o'.
  • Cost of the transformation: cost(TS) = Σ_i cost(toi) (see the sketch below).
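
A minimal sketch of a transformation sequence and its cost: each operator carries a user-specified cost, and cost(TS) is their sum. The operators, the (x, y, width, height) object representation, and the cost values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TransformOp:
    name: str
    apply: Callable          # maps an object description to a new one
    cost: float              # user-specified cost of this operator

# An object described simply by (x, y, width, height), an assumed representation
o = (0, 0, 10, 10)

ts = [
    TransformOp("translate", lambda obj: (obj[0] + 5, obj[1] + 3, obj[2], obj[3]), cost=1.0),
    TransformOp("scale",     lambda obj: (obj[0], obj[1], obj[2] * 2, obj[3] * 2), cost=2.5),
]

# Apply to1, to2, ..., tor in sequence: to1(o) = o1, to2(o1) = o2, ...
result = o
for op in ts:
    result = op.apply(result)

cost_ts = sum(op.cost for op in ts)     # cost(TS) = sum of cost(to_i)
print(result, cost_ts)
```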

  26. Example

  27. Example

  28. Example

  29. Transformation vs. Metric
  • Advantages of the transformation model:
    • the user can set up his or her own notion of similarity by specifying certain transformation operators
    • the user may associate a cost function with each transformation operator
  • Advantages of the metric model:
    • by forcing the user to use only one similarity metric, the system can facilitate the indexing of data so as to optimize nearest-neighbor search
