Eyes detection in compressed domain using classification
Technical University of Cluj-Napoca, Faculty of Electronics, Telecommunications and Information Technology
Eng. Alexandru POPA, alexandru_popa@autenticmedia.com
Contents:
• Object detection in digital images
• The principle of image processing in the compressed domain
• The Discrete Cosine Transform (DCT)
• The spatial relationship of DCT coefficients between a block and its sub-blocks
• Object recognition using classification
• The linear discriminant classifier (LDA, Fisher classifier)
• Demo
• Results
• Conclusions
Object detection in digital images
• The approach consists of extracting features through image transforms, building a new feature space, and classifying the objects in that space.
• Feature extraction methods: DCT, Wavelet, Gabor.
• The DCT (Discrete Cosine Transform) generally gives good features for describing objects. It is the basis of the JPEG standard, and the properties of its coefficient blocks make them well suited to building feature spaces.
• The idea is to classify objects directly in the JPEG compressed domain.
The principle of image processing in the compressed domain
• Almost all image processing algorithms are defined at the pixel level; rewriting them for the compressed domain is not straightforward.
• Standard implementations decompress the image, apply the algorithm, and then recompress the image. The disadvantage is that these schemes are time-consuming.
• The goal is to rewrite these algorithms directly in the compressed domain in order to shorten the processing chain.
The Discrete Cosine Transform
• The formula for the DCT applied to an image: (1) (2)
• Properties:
  • Decorrelation: the principal advantage of transformed images is the low redundancy between neighbouring pixels. This yields uncorrelated coefficients that can be coded independently.
  • Energy compaction: the ability of the transform to pack the input data into as few coefficients as possible.
  • Separability: the 2D DCT can be computed in two steps by applying the 1D formula successively to the rows and the columns of an image.
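The separability and energy compaction properties can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: it assumes the orthonormal DCT-II (the slide's equations (1) and (2) may use a different scaling), and the helper names `dct_matrix`, `dct2` and `idct2` as well as the synthetic 8x8 ramp block are chosen here for the example.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix T, with T[k, i] = c(k) * cos((2i + 1) k pi / (2n))."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    t = np.cos((2 * i + 1) * k * np.pi / (2 * n)) * np.sqrt(2.0 / n)
    t[0, :] = np.sqrt(1.0 / n)        # the DC row uses the smaller scale factor
    return t

def dct2(block):
    """2D DCT by separability: transform every column, then every row."""
    t_rows = dct_matrix(block.shape[0])
    t_cols = dct_matrix(block.shape[1])
    return t_rows @ block @ t_cols.T

def idct2(coeffs):
    """Inverse 2D DCT; T is orthogonal, so its inverse is its transpose."""
    t_rows = dct_matrix(coeffs.shape[0])
    t_cols = dct_matrix(coeffs.shape[1])
    return t_rows.T @ coeffs @ t_cols

# Assumed test data: a smooth 8x8 luminance ramp (not taken from the slides).
x = np.arange(8, dtype=float)
block = 100.0 + 5.0 * x[:, None] + 3.0 * x[None, :]

coeffs = dct2(block)
energy = coeffs ** 2
# Share of the total energy held by the four lowest-frequency coefficients.
top_left_share = energy[:2, :2].sum() / energy.sum()
print(f"energy share of the top-left 2x2 coefficients: {top_left_share:.4f}")
```

For a smooth block like this one, almost all of the energy ends up in the low-frequency corner, which is why a handful of coefficients per block can serve as features.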
The spatial relationship of DCT coefficients between a block and its sub-blocks
• A new problem arises because different DCT block sizes have to be used to get the best performance in each application.
• 8x8 blocks are used in JPEG, 4x4 blocks in image indexing, and 16x16 macro-blocks in MPEG.
• To transfer DCT coefficients between blocks of different sizes, the existing approach has to decompress to pixel data in the spatial domain via the IDCT, re-divide the pixels into new blocks of the required size, and then apply the DCT again to produce the DCT coefficients.
• This approach is inefficient, since every change of block size costs a full inverse transform followed by a forward transform.
Bibliography: Jianmin Jiang and Guocan Feng, The Spatial Relationship of DCT Coefficients Between a Block and Its Sub-blocks
The spatial relationship of DCT coefficients between a block and its sub-blocks
• Transformation from 4 blocks of 2x2 pixels into one block of 4x4 pixels
[Figure: original image with a 4x4 block; the block of pixel luminance values; the DCT coefficients of the 4 blocks of 2x2 pixels; the matrix A*]
• Equation: (3)
The spatial relationship of DCT coefficients between a block and its sub-blocks
• Transformation from one 4x4 block into 4 blocks of 2x2 pixels
[Figure: original image with a 4x4 block; the block of pixel luminance values; the DCT coefficients of the 4x4 block; the inverse of the matrix A*]
• Equation: (4)
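The two directions (four 2x2 DCT sub-blocks to one 4x4 DCT block, and back) can be checked numerically without going through the pixel domain. The sketch below is an assumption-laden illustration: it uses the orthonormal DCT-II and builds a merge matrix, here called `a_star`, as the 4x4 DCT matrix multiplied by a block-diagonal matrix of inverse 2x2 DCTs. The slide's A* in equations (3) and (4) may be defined with a different scaling; `merge_matrix` and the random test block are hypothetical names and data introduced for this example.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    t = np.cos((2 * i + 1) * k * np.pi / (2 * n)) * np.sqrt(2.0 / n)
    t[0, :] = np.sqrt(1.0 / n)
    return t

def dct2(block):
    """2D DCT of a square block."""
    t = dct_matrix(block.shape[0])
    return t @ block @ t.T

def merge_matrix(n):
    """Matrix mapping four n x n DCT sub-blocks to one 2n x 2n DCT block (hypothetical helper)."""
    t_small = dct_matrix(n)
    zeros = np.zeros((n, n))
    # Block-diagonal inverse of the sub-block DCTs (inverse = transpose).
    undo_small = np.block([[t_small.T, zeros], [zeros, t_small.T]])
    return dct_matrix(2 * n) @ undo_small

# Assumed test data: a random 4x4 luminance block.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(4, 4)).astype(float)

# DCT coefficients of the four 2x2 sub-blocks, laid out in their spatial positions.
sub = np.block([[dct2(pixels[:2, :2]), dct2(pixels[:2, 2:])],
                [dct2(pixels[2:, :2]), dct2(pixels[2:, 2:])]])

a_star = merge_matrix(2)
merged = a_star @ sub @ a_star.T     # sub-block coefficients -> 4x4 block coefficients
split = a_star.T @ merged @ a_star   # 4x4 block coefficients -> sub-block coefficients

print(np.allclose(merged, dct2(pixels)), np.allclose(split, sub))
```

Under the orthonormal scaling assumed here, `a_star` is itself orthogonal, so the inverse needed for the split direction is just its transpose. Both directions are plain matrix products on coefficients, with no IDCT/DCT round trip through pixels.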
Object recognition using classification
• Geometric classifiers are classifiers that derive decision boundaries in the feature space.
• A classifier requires a set of training data (samples + labels).
• The training set must be large enough for the classifier to learn correctly and to generalize to unseen data.
• Data classification: an unknown sample is presented to the classifier, its position relative to the decision boundaries is computed, and a label is assigned accordingly.
The linear discriminant classifier (LDA, Fisher classifier)
• LDA (Linear Discriminant Analysis) with Fisher's classifier means finding a line in the feature space and projecting the training data onto that line. The data are then described by their projections.
• Considering a two-dimensional space, Fisher's criteria for selecting the parameters w and w0 are:
  • The optimal direction w is the direction of the line for which:
    1) the distance between the projections of the class centres onto w is maximal
    2) the variance of the projections within each class is minimal
  • The optimal value of w0 is the scalar that minimizes the classification error on the training set, where the error is measured with the label that the Fisher classifier assigns to the i-th sample.
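The two criteria for w and the error-minimizing choice of w0 can be sketched as follows. This is an illustration under stated assumptions, not the classifier used in the presentation: it uses the standard closed-form Fisher direction (inverse within-class scatter times the difference of class means), picks w0 by scanning thresholds between consecutive projected training samples, and runs on synthetic two-dimensional Gaussian data rather than DCT features. The names `fit_fisher` and `predict` are hypothetical.

```python
import numpy as np

def fit_fisher(x, y):
    """Fit a two-class Fisher discriminant. x: (n_samples, n_features), y: labels in {0, 1}."""
    x0, x1 = x[y == 0], x[y == 1]
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    s_w = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    # Direction that separates the projected means while keeping projected variance small.
    w = np.linalg.solve(s_w, m1 - m0)
    # Candidate thresholds: midpoints between consecutive sorted projections.
    proj = x @ w
    ordered = np.sort(proj)
    candidates = (ordered[:-1] + ordered[1:]) / 2.0
    errors = [np.mean((proj > t).astype(int) != y) for t in candidates]
    # The decision rule is w.x + w0 > 0, so w0 is minus the best threshold.
    w0 = -candidates[int(np.argmin(errors))]
    return w, w0

def predict(x, w, w0):
    """Assign label 1 to samples on the positive side of the decision boundary."""
    return (x @ w + w0 > 0).astype(int)

# Assumed synthetic training set: two Gaussian clouds in a 2D feature space.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(50, 2)),
               rng.normal([4.0, 4.0], 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

w, w0 = fit_fisher(x, y)
accuracy = np.mean(predict(x, w, w0) == y)
print("w =", w, "w0 =", w0, "training accuracy =", accuracy)
```

In the setting of the slides, each row of `x` would instead hold the DCT-based features of one image region, with the label marking eye versus non-eye.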
Results
[Figure: the image from which the training set was taken]
Conclusions
• The results show that implementing Fisher's classifier in the compressed domain was a sound choice: it gives good results in detecting eye regions.
• The approach is novel in the image processing field, because this algorithm had not previously been written for the compressed domain.
• Using the spatial relationship of DCT coefficients between a block and its sub-blocks makes it faster and computationally cheaper to obtain the coefficients of large blocks starting from small blocks.
• Other applications that can be derived:
  • gaze tracking/focusing
  • automatic systems for detecting driver vigilance
  • biometric applications: person identification using iris recognition
• ..., its structure as well as the training set matter a great deal