Modification of Correlation Kernels in SVM, KPCA and KCCA in Texture Classification
Yo Horikawa, Kagawa University, Japan
・Support vector machine (SVM)
・Kernel principal component analysis (kPCA)
・Kernel canonical correlation analysis (kCCA)
with modified versions of correlation kernels → invariant texture classification
The performance of the modified correlation kernels and of the three kernel methods is compared.
Support vector machine (SVM)
Sample data: $x_i$ ($1 \le i \le n$), each belonging to a class $c_i \in \{-1, 1\}$.
The SVM learns a discriminant function for test data $x$:
$d(x) = \mathrm{sgn}\bigl(\sum_{i=1}^{n'} \alpha_i c_i\, k(x, x_{s_i}) + b\bigr)$
where $\alpha_i$ and $b$ are obtained by solving a quadratic programming problem.
Kernel function: the inner product of nonlinear maps $\varphi(x)$: $k(x_i, x_j) = \varphi(x_i)\cdot\varphi(x_j)$.
Support vectors $x_{s_i}$ ($1 \le i \le n' \le n$): a subset of the sample data.
The feature extraction process is done implicitly in the SVM through the kernel function and the support vectors.
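A minimal sketch of this setup, assuming scikit-learn and a placeholder linear kernel where any kernel $k(x_i, x_j)$ could be plugged in:

```python
# Minimal sketch: SVM with a precomputed (user-defined) kernel matrix.
# Assumes scikit-learn; `toy_kernel` is a placeholder where any kernel
# k(xi, xj) -- e.g. a correlation kernel -- could be substituted.
import numpy as np
from sklearn.svm import SVC

def toy_kernel(xi, xj):
    return float(np.dot(xi, xj))                        # placeholder: linear kernel

def gram_matrix(A, B):
    # Gram matrix K[a, b] = k(A[a], B[b])
    return np.array([[toy_kernel(a, b) for b in B] for a in A])

rng = np.random.default_rng(0)
X_train = rng.standard_normal((20, 100))                # toy sample data
y_train = np.where(rng.standard_normal(20) > 0, 1, -1)  # classes in {-1, +1}
X_test = rng.standard_normal((5, 100))

svm = SVC(kernel="precomputed", C=100.0)                # soft margin C as in the experiments
svm.fit(gram_matrix(X_train, X_train), y_train)
pred = svm.predict(gram_matrix(X_test, X_train))        # test-vs-training kernel values
```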
Kernel principal component analysis (kPCA)
Principal components for the nonlinear map $\varphi(x_i)$ are obtained from the eigenproblem $\Phi v = \lambda v$, where $\Phi$ is the kernel matrix ($\Phi_{ij} = \varphi(x_i)\cdot\varphi(x_j) = k(x_i, x_j)$).
Let $v_r = (v_{r1}, \ldots, v_{rn})^T$ ($1 \le r \le R \le n$) be the eigenvectors in non-increasing order of the corresponding non-zero eigenvalues $\lambda_r$, normalized so that $\lambda_r\, v_r^T v_r = 1$.
The $r$th principal component $u_r$ of a new data point $x$ is obtained by
$u_r = \sum_{i=1}^{n} v_{ri}\, \varphi(x_i)\cdot\varphi(x) = \sum_{i=1}^{n} v_{ri}\, k(x_i, x)$
Classification methods, e.g., the nearest-neighbor method, can then be applied in the principal component space $(u_1, \ldots, u_R)$.
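A minimal kPCA sketch following the equations above (assumes NumPy; centering of the kernel matrix is omitted for brevity):

```python
# Minimal kPCA sketch: eigendecompose the kernel matrix, normalize the
# eigenvectors so that lambda_r * v_r.v_r = 1, then project a new sample.
import numpy as np

def kpca_fit(K, R):
    """K: n x n kernel matrix of the sample data; R: number of components kept
    (assumed not to exceed the number of non-zero eigenvalues)."""
    eigvals, eigvecs = np.linalg.eigh(K)        # ascending order
    order = np.argsort(eigvals)[::-1][:R]       # non-increasing order of eigenvalues
    lam, V = eigvals[order], eigvecs[:, order]
    V = V / np.sqrt(lam)                        # normalization: lambda_r * v_r.v_r = 1
    return V                                    # n x R matrix of coefficients v_ri

def kpca_project(V, k_new):
    """k_new: vector (k(x_1, x), ..., k(x_n, x)) for a new sample x;
    returns the principal components (u_1, ..., u_R)."""
    return V.T @ k_new
```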
Kernel canonical correlation analysis (kCCA)
Pairs of feature vectors of the sample objects: $(x_i, y_i)$ ($1 \le i \le n$).
kCCA finds projections (canonical variates) $(u, v)$ that maximize the correlation between $\varphi(x)$ and $\theta(y)$:
$(u, v) = (w_\varphi\cdot\varphi(x),\ w_\theta\cdot\theta(y))$, with $w_\varphi = \sum_{i=1}^{n} f_i\,\varphi(x_i)$ and $w_\theta = \sum_{i=1}^{n} g_i\,\theta(y_i)$,
where $f^T = (f_1, \ldots, f_n)$ and $g^T = (g_1, \ldots, g_n)$ are eigenvectors of a regularized generalized eigenvalue problem defined by the kernel matrices $\Phi_{ij} = \varphi(x_i)\cdot\varphi(x_j)$ and $\Theta_{ij} = \theta(y_i)\cdot\theta(y_j)$ and the $n \times n$ identity matrix $I$.
Application of kCCA to classification problems
Use an indicator vector as the second feature vector: $y = (y_1, \ldots, y_{n_c})$ corresponding to $x$, with $y_c = 1$ if $x$ belongs to class $c$ and $y_c = 0$ otherwise ($n_c$: the number of classes). No mapping $\theta$ is applied to $y$.
A total of $n_c - 1$ eigenvectors $f_r = (f_{r1}, \ldots, f_{rn})$ ($1 \le r \le n_c - 1$) corresponding to non-zero eigenvalues are obtained. The canonical variates $u_r$ ($1 \le r \le n_c - 1$) for a new object $(x, ?)$ are calculated by
$u_r = \sum_{i=1}^{n} f_{ri}\, \varphi(x_i)\cdot\varphi(x) = \sum_{i=1}^{n} f_{ri}\, k(x_i, x)$
Classification methods can then be applied in the canonical variate space $(u_1, \ldots, u_{n_c - 1})$.
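A sketch of kCCA-based classification with the indicator second view, assuming NumPy/SciPy and one commonly used regularized form of the generalized eigenvalue problem; the exact regularized form in the original work may differ:

```python
# Sketch of kCCA with an indicator matrix as the second view, for classification.
# The regularized formulation below is one common choice (Phi + gamma*I on each
# view), stated here as an assumption rather than the original formulation.
import numpy as np
from scipy.linalg import eigh

def kcca_indicator(K, labels, gamma_x=0.1, gamma_y=0.1):
    """K: n x n kernel matrix; labels: integer class labels 0..C-1.
    Returns the n x (C-1) coefficient matrix F for the canonical variates."""
    n = K.shape[0]
    C = int(labels.max()) + 1
    Y = np.eye(C)[labels]                     # indicator vectors, no mapping theta
    Theta = Y @ Y.T                           # linear kernel on the indicator view
    I = np.eye(n)
    # Generalized eigenproblem:
    #   [0, K*Theta; Theta*K, 0] w = rho * [(K+gx*I)^2, 0; 0, (Theta+gy*I)^2] w
    A = np.block([[np.zeros((n, n)), K @ Theta],
                  [Theta @ K, np.zeros((n, n))]])
    B = np.block([[(K + gamma_x * I) @ (K + gamma_x * I), np.zeros((n, n))],
                  [np.zeros((n, n)), (Theta + gamma_y * I) @ (Theta + gamma_y * I)]])
    rho, W = eigh(A, B)                       # eigenvalues in ascending order
    F = W[:n, np.argsort(rho)[::-1][:C - 1]]  # top C-1 directions of the x-view
    return F

def canonical_variates(F, k_new):
    """k_new: vector of k(x_i, x) for a new sample; returns (u_1, ..., u_{C-1})."""
    return F.T @ k_new
```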
Correlation kernel
The $k$th-order autocorrelation of data $x_i(t)$:
$r_{x_i}(t_1, t_2, \ldots, t_{k-1}) = \int x_i(t)\, x_i(t+t_1)\cdots x_i(t+t_{k-1})\, dt$
The inner product between $r_{x_i}$ and $r_{x_j}$ is calculated with the $k$th power of the (2nd-order) cross-correlation function:
$r_{x_i}\cdot r_{x_j} = \int \{cc_{x_i,x_j}(t_1)\}^k\, dt_1$, where $cc_{x_i,x_j}(t_1) = \int x_i(t)\, x_j(t+t_1)\, dt$
The explicit values of the autocorrelations never need to be computed, so high-order autocorrelations become tractable at practical computational cost.
・Linear correlation kernel: $K(x_i, x_j) = r_{x_i}\cdot r_{x_j}$
・Gaussian correlation kernel: $K(x_i, x_j) = \exp(-\mu\,|r_{x_i} - r_{x_j}|^2) = \exp(-\mu\,(r_{x_i}\cdot r_{x_i} + r_{x_j}\cdot r_{x_j} - 2\, r_{x_i}\cdot r_{x_j}))$
Calculation of correlation kernels $r_{x_i}\cdot r_{x_j}$ for 2-dimensional image data $x(l, m)$ ($1 \le l \le L$, $1 \le m \le M$):
・Calculate the cross-correlations between $x_i(l, m)$ and $x_j(l, m)$:
$cc_{x_i,x_j}(l_1, m_1) = \sum_{l=1}^{L-l_1}\sum_{m=1}^{M-m_1} x_i(l, m)\, x_j(l+l_1, m+m_1)\,/(LM)$  ($0 \le l_1 \le L_1-1$, $0 \le m_1 \le M_1-1$)
・Sum up the $k$th power of the cross-correlations:
$r_{x_i}\cdot r_{x_j} = \sum_{l_1=0}^{L_1-1}\sum_{m_1=0}^{M_1-1} \{cc_{x_i,x_j}(l_1, m_1)\}^k\,/(L_1 M_1)$
[Diagram: the $L\times M$ image $x_i(l, m)$ is correlated with shifted copies $x_j(l+l_1, m+m_1)$ over an $L_1\times M_1$ range of lags, and the $k$th powers of the results are summed.]
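A minimal sketch of this computation for a pair of images, assuming NumPy; the lag range and normalization follow the equations above:

```python
# Minimal sketch of the correlation kernel r_xi . r_xj for 2-D images.
import numpy as np

def cross_correlation(xi, xj, L1=10, M1=10):
    """cc(l1, m1) = sum_{l,m} xi(l, m) * xj(l + l1, m + m1) / (L * M)
    for lags 0 <= l1 <= L1 - 1, 0 <= m1 <= M1 - 1."""
    L, M = xi.shape
    cc = np.zeros((L1, M1))
    for l1 in range(L1):
        for m1 in range(M1):
            cc[l1, m1] = np.sum(xi[:L - l1, :M - m1] * xj[l1:, m1:]) / (L * M)
    return cc

def correlation_kernel(xi, xj, k=3, L1=10, M1=10):
    """r_xi . r_xj: sum of the k-th power of the cross-correlations."""
    cc = cross_correlation(xi, xj, L1, M1)
    return np.sum(cc ** k) / (L1 * M1)
```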
Problem of correlation kernels
As the order $k$ of the correlation kernel increases, generalization ability and robustness are lost:
$r_{x_i}\cdot r_{x_j} = \sum_{t_1}\{cc_{x_i,x_j}(t_1)\}^k \to \delta_{i,j}$  ($k \to \infty$)
so for test data $x$ ($\ne x_i$), $r_{x_i}\cdot r_x \to 0$.
In kCCA, $\Phi = I$ and $\Theta$ is a block matrix, with eigenvectors $f = (p_1, \ldots, p_1, p_2, \ldots, p_2, \ldots, p_C, \ldots, p_C)$ ($f_i = p_c$ if $x_i \in$ class $c$).
For sample data, the canonical variates lie on a line through the origin corresponding to their class: $u_{x_i} = (r_{x_i}\cdot r_{x_i})\, p_c$ with $p_c = (p_{c,1}, \ldots, p_{c,C-1})$, if $x_i \in$ class $c$.
For test data, $u_x \approx 0$.
Fig. A. Scatter diagrams of canonical variates $(u_1, u_2)$ and $(u_3, u_1)$ of Test 1 data of texture images in the Brodatz album in kCCA: (a) linear kernel, (b) Gaussian kernel, (c) 2nd-order, (d) 3rd-order, (e) 4th-order and (f) 10th-order correlation kernels. Plotted are squares (■) for D4, crosses (×) for D84, circles (●) for D5 and triangles (Δ) for D92. For the higher-order correlation kernels, most of the test data fall near $u \approx 0$.
Modification of correlation kernels
・The $k$th root of the $k$th-order correlation kernel in the limit $k \to \infty$ is related to the max norm, which is the limit of the Lp norm $\|x\|_p = \{\sum_i |x_i|^p\}^{1/p}$ as $p \to \infty$. The max norm corresponds to the peak response of a matched filter, which maximizes the SNR, and is therefore expected to be robust. The correlation kernel can thus be modified by taking its $k$th root, with its sign taken into account.
・A difference between the even- and odd-order correlations is that the odd-order autocorrelations are blind to sinusoidal signals and to random signals with symmetric distributions. This is because a sign change of the original data ($x \to -x$) changes the sign of the odd-order autocorrelations but not of the even-order ones. In the correlation kernel, this appears as the parity of the power of the cross-correlations, so the absolute values of the cross-correlations can be used instead.
Proposed modified correlation kernels
・Lp norm kernel (P): $\mathrm{sgn}\bigl(\sum_{l_1,m_1}\{cc_{x_i,x_j}(l_1, m_1)\}^k\bigr)\,\bigl|\sum_{l_1,m_1}\{cc_{x_i,x_j}(l_1, m_1)\}^k\bigr|^{1/k}$
・Absolute kernel (A): $\sum_{l_1,m_1} |cc_{x_i,x_j}(l_1, m_1)|^k$
・Absolute Lp norm kernel (AP): $\bigl|\sum_{l_1,m_1}\{cc_{x_i,x_j}(l_1, m_1)\}^k\bigr|^{1/k}$
・Absolute Lp norm absolute kernel (APA): $\bigl\{\sum_{l_1,m_1}|cc_{x_i,x_j}(l_1, m_1)|^k\bigr\}^{1/k}$
・Max norm kernel (Max): $\max_{l_1,m_1} cc_{x_i,x_j}(l_1, m_1)$
・Max norm absolute kernel (MaxA): $\max_{l_1,m_1} |cc_{x_i,x_j}(l_1, m_1)|$
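A sketch of these modified kernels, acting on the cross-correlation array $cc(l_1, m_1)$ from the previous sketch (assumes NumPy; $k$ is an integer order; normalization by $L_1 M_1$ is omitted, as in the definitions above):

```python
# Sketch of the modified correlation kernels P, A, AP, APA, Max and MaxA,
# each taking the array cc of cross-correlations over the lag range.
import numpy as np

def lp_norm_kernel(cc, k):            # P: signed k-th root of the correlation kernel
    s = np.sum(cc ** k)
    return np.sign(s) * np.abs(s) ** (1.0 / k)

def absolute_kernel(cc, k):           # A: absolute values of the cross-correlations
    return np.sum(np.abs(cc) ** k)

def abs_lp_norm_kernel(cc, k):        # AP
    return np.abs(np.sum(cc ** k)) ** (1.0 / k)

def abs_lp_norm_abs_kernel(cc, k):    # APA
    return np.sum(np.abs(cc) ** k) ** (1.0 / k)

def max_norm_kernel(cc):              # Max
    return np.max(cc)

def max_norm_abs_kernel(cc):          # MaxA
    return np.max(np.abs(cc))
```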
Classification experiment
4-class classification problems with SVM, kPCA and kCCA.
Original images: 512×512 pixels (256 gray levels) from the VisTex database and the Brodatz album.
Sample and test images: 50×50 pixels, extracted from the original images with random shifts and with scaling, rotation and Gaussian noise (100 images each).
Table 1. Sample and test sets. Fig. 1. Texture images.
Kernel functions $K(x_i, x_j)$
・Linear kernel: $x_i\cdot x_j$
・Gaussian kernel: $\exp(-\mu\,\|x_i - x_j\|^2)$
・Correlation kernels (C2–C10): $r_{x_i}\cdot r_{x_j}$
・Modified correlation kernels: P2–P10, A3–A7, AP3–AP7, APA3–APA7, Max, MaxA
Range of correlation lags: $L_1 = M_1 = 10$ (in 50×50 pixel images).
A simple nearest-neighbor classifier is used for classification in the principal component space $(u_1, \ldots, u_R)$ for kPCA and in the canonical variate space $(u_1, \ldots, u_{C-1})$ for kCCA.
Parameter values are chosen empirically (soft margin: C = 100; regularization: $\gamma_x = \gamma_y = 0.1$).
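A minimal sketch of this final nearest-neighbor stage in the projected space, assuming scikit-learn:

```python
# 1-nearest-neighbor classification in the kPCA / kCCA projected space.
from sklearn.neighbors import KNeighborsClassifier

def classify_in_projected_space(U_train, labels_train, U_test):
    """U_train, U_test: projected coordinates (principal components or
    canonical variates) of the sample and test images."""
    nn = KNeighborsClassifier(n_neighbors=1)
    nn.fit(U_train, labels_train)
    return nn.predict(U_test)
```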
Comparison of the performance
The correct classification rates (CCRs) of the plain correlation kernels (C2–C10) are low for odd orders and for higher orders.
With the modifications, the Lp norm kernels (P2–P10) and the absolute kernels (A3–A7) give high CCRs even for higher orders and for odd orders, respectively. Their combinations (AP3–AP7, APA3–APA7) and the max norm kernels (Max, MaxA) also show good performance.
Table 2. Highest correct classification rates.
Summary
Modified versions of the correlation kernels are proposed.
・Applying the Lp norm and the max norm → the poor generalization of the higher-order correlation kernels is improved.
・Using the absolute values of the cross-correlations → the inferior performance of the odd-order correlation kernels relative to the even-order ones, due to blindness to sinusoidal or symmetrically distributed signals, is also improved.
SVM, kPCA and kCCA with the modified correlation kernels show good performance in the texture classification experiments.