Significance Test for Feature Selection on Image Recognition
Qianren Xu, Mohamed Kamel, Magdy M. A. Salama
Outline • Introduction • Methodology • Experiment Results • Conclusion
Introduction
Problems with features:
• Very large number
• Irrelevance
• Noise
• Correlation
Problems with feature selection methods:
• Computational complexity
• Lack of optimality
Proposed Method
Criterion of feature selection: the significance of a feature is the product of two terms:

Significance of feature = Significant difference × Independence

• Significant difference: pattern separability on individual candidate features
• Independence: non-correlation between a candidate feature and the already-selected features
Measurement of Pattern Separability on Individual Features
Assuming the data are continuous and normally distributed, the significant difference compares the difference between classes (variance among classes) with the distribution variance (variance within classes):
• Two classes: t-test
• More than two classes: ANOVA
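A minimal sketch of this measure, assuming SciPy is available; using the absolute t statistic (two classes) and the F statistic (more classes) as the sd score is an illustrative choice, not necessarily the authors' exact formulation.

```python
import numpy as np
from scipy import stats

def significant_difference(feature, labels):
    """Score one feature's class separability: t-test for two classes,
    one-way ANOVA for more than two (difference between classes relative
    to the variance within classes)."""
    groups = [feature[labels == c] for c in np.unique(labels)]
    if len(groups) == 2:
        t, _ = stats.ttest_ind(groups[0], groups[1])
        return abs(t)              # |t| as the sd score (assumption)
    f, _ = stats.f_oneway(*groups)
    return f                       # F statistic as the sd score (assumption)
```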
Independence
Definition: the degree of un-correlation between two variables (features).
• Within a class: measured from the Pearson correlation coefficient r between the two features.
• Overall: the within-class independence combined over all classes.
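A sketch of this measure, under two assumptions the definition suggests but the slide does not spell out: within a class, independence is taken as 1 − |r|, and the overall value is the plain average over classes.

```python
import numpy as np

def independence(candidate, selected, labels):
    """Independence level between a candidate feature and one already-selected
    feature, computed per class from the Pearson correlation coefficient r."""
    ind_per_class = []
    for c in np.unique(labels):
        x = candidate[labels == c]
        y = selected[labels == c]
        r = np.corrcoef(x, y)[0, 1]         # Pearson r within this class
        ind_per_class.append(1.0 - abs(r))  # map r to independence (assumption)
    return float(np.mean(ind_per_class))    # combine over classes (assumption)
```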
Selecting Procedure
• MSDI: Maximum Significant Difference and Independence algorithm
• MIC: Monotonically Increasing Curve strategy
Maximum Significant Difference and Independence (MSDI) Algorithm
1. Compute the significant difference (sd) of every initial feature.
2. Select the feature with maximum sd as the first feature.
3. Compute the independence level (ind) between every candidate feature and the already-selected feature(s).
4. Select the feature with maximum feature significance (sf = sd × ind) as the new feature; repeat steps 3 and 4 until enough features are selected.
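Putting the pieces together, a greedy MSDI sketch using the significant_difference() and independence() helpers above. The slide leaves open how ind is aggregated over several already-selected features, so taking the minimum here is an assumption.

```python
import numpy as np

def msdi_select(X, y, n_select):
    """Greedy MSDI selection over a feature matrix X (samples x features),
    following steps 1-4 above."""
    n_features = X.shape[1]
    sd = np.array([significant_difference(X[:, j], y) for j in range(n_features)])
    selected = [int(np.argmax(sd))]          # step 2: max-sd feature first
    candidates = set(range(n_features)) - set(selected)
    while len(selected) < n_select and candidates:
        scores = {}
        for j in candidates:
            # independence against every already-selected feature; taking the
            # minimum is an assumption (the slide leaves the aggregation open)
            ind = min(independence(X[:, j], X[:, s], y) for s in selected)
            scores[j] = sd[j] * ind          # sf = sd x ind
        best = max(scores, key=scores.get)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Called as msdi_select(X, y, 15), this mirrors the setup of the complexity experiment reported below.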
Monotonically Increasing Curve (MIC) Strategy
Applied to the feature subset selected by MSDI:
1. Plot the performance curve (rate of recognition vs. number of features).
2. Delete the features that make no good contribution to the increase of the recognition rate.
3. Repeat until the curve is monotonically increasing.
[Figure: performance curve, rate of recognition vs. number of features]
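A sketch of the MIC strategy as a single greedy pass over the MSDI ranking; reading "delete features with no good contribution" as "keep a feature only if it raises the rate" is an assumption, and evaluate is a hypothetical callback that trains and tests a classifier on a feature subset and returns its recognition rate.

```python
def mic_prune(ranked_features, evaluate):
    """One greedy pass of the MIC strategy: keep a feature only if adding it
    raises the recognition rate, so the performance curve over the kept
    features is monotonically increasing."""
    kept = []
    best_rate = 0.0
    for f in ranked_features:
        rate = evaluate(kept + [f])
        if rate > best_rate:       # a "good" contribution: the curve rises
            kept.append(f)
            best_rate = rate
    return kept
```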
Example I: Handwritten Digit Recognition
• The 32-by-32 bitmaps are divided into 8 × 8 = 64 blocks.
• The pixels in each block are counted.
• Thus an 8 × 8 matrix is generated, that is, 64 features.
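A minimal sketch of this block-counting step, assuming the 64 blocks are non-overlapping 4 × 4 tiles (so a 32 × 32 bitmap yields an 8 × 8 grid of counts):

```python
import numpy as np

def block_count_features(bitmap):
    """Count the set pixels in each of the 8 x 8 = 64 non-overlapping
    4 x 4 blocks of a 32 x 32 binary bitmap, giving 64 features."""
    assert bitmap.shape == (32, 32)
    blocks = bitmap.reshape(8, 4, 8, 4)     # 8x8 grid of 4x4 blocks
    return blocks.sum(axis=(1, 3)).ravel()  # 64 counts, each in [0, 16]
```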
Comparison with MIFS
MSDI: Maximum Significant Difference and Independence; MIFS: Mutual Information Feature Selector
• Battiti's MIFS requires the parameter β to be determined (curves shown for β = 0.2, 0.4, 0.6, 0.8, 1.0).
[Figure: performance curves, rate of recognition vs. number of features, for MSDI, MIFS with β = 0.2 to 1.0, and random ranking]
Computational Complexity
Selecting 15 features from the 64-feature original set:
• MSDI: 24 seconds
• Battiti's MIFS: 1110 seconds (5 values of β searched in the range 0–1)
Example II: Handwritten Digit Recognition
The 649 features are distributed over the following six feature sets:
• 76 Fourier coefficients of the character shapes,
• 216 profile correlations,
• 64 Karhunen-Loève coefficients,
• 240 pixel averages in 2 × 3 windows,
• 47 Zernike moments,
• 6 morphological features.
Performance Curve: MSDI + MIC
MSDI: Maximum Significant Difference and Independence; MIC: Monotonically Increasing Curve
[Figure: performance curves, rate of recognition vs. number of features, for MSDI + MIC, MSDI alone, and random ranking]
Comparison with MIFS
MSDI: Maximum Significant Difference and Independence; MIFS: Mutual Information Feature Selector
• MSDI is much better when many features are selected.
• MIFS is better when only a few features are selected.
[Figure: performance curves, rate of recognition vs. number of features, for MSDI and MIFS with β = 0.2 and β = 0.5]
Conclusion
• STFS (Significance Test for Feature Selection) selects features by maximum significant difference and independence (MSDI); it aims to find the smallest possible feature subset that achieves the maximum recognition rate.
• Feature significance (the selection criterion) is estimated from statistical models that are optimal for the properties of the data.
• Advantages:
• Computationally efficient
• Optimality
Thanks for your time. Questions?