Machine Learning Feature Selection Feature Selection for Pattern Recognition • J.-S. Roger Jang ( 張智星 ) • CSIE Dept., National Taiwan University • ( 台灣大學 資訊工程系 ) • http://mirlab.org/jang • jang@mirlab.org
Feature Selection: Goal & Benefits • Feature selection • Also known as input selection • Goal • To select a subset of the original feature set that gives a better recognition rate • Benefits • Improve the recognition rate • Reduce the computation load • Explain relationships between features and classes
Exhaustive Search • Steps for direct exhaustive search • Use KNNC (k-nearest-neighbor classifier) as the classifier and LOO (leave-one-out) to estimate the recognition rate (RR) • Generate all combinations of features and evaluate them one by one • Select the feature combination that has the best RR • Drawback • d = 10 → 2^10 - 1 = 1023 models for evaluation → time consuming! • Advantage • The optimal feature set can be identified.
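The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: it assumes a 1-nearest-neighbor classifier with squared Euclidean distance, and the function names (`loo_recognition_rate`, `exhaustive_search`) and the toy data are hypothetical.

```python
from itertools import combinations

def loo_recognition_rate(samples, labels, feature_idx):
    """Leave-one-out recognition rate of a 1-nearest-neighbor classifier
    that only looks at the features listed in feature_idx."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(samples, labels)):
        best_dist, best_label = float("inf"), None
        for j, (xj, yj) in enumerate(zip(samples, labels)):
            if i == j:
                continue  # leave sample i out of the reference set
            dist = sum((xi[k] - xj[k]) ** 2 for k in feature_idx)
            if dist < best_dist:
                best_dist, best_label = dist, yj
        correct += (best_label == yi)
    return correct / len(samples)

def exhaustive_search(samples, labels):
    """Evaluate every non-empty feature subset.
    Returns (best_subset, best_rr, number_of_models_evaluated)."""
    d = len(samples[0])
    best_subset, best_rr, n_models = None, -1.0, 0
    for size in range(1, d + 1):
        for subset in combinations(range(d), size):
            n_models += 1
            rr = loo_recognition_rate(samples, labels, subset)
            if rr > best_rr:  # on ties, keep the subset found first
                best_subset, best_rr = subset, rr
    return best_subset, best_rr, n_models

# Hypothetical toy data: feature 0 separates the classes, features 1 and 2 do not.
X = [(0.0, 5.0, 1.0), (0.2, 1.0, 9.0), (0.1, 8.0, 4.0),
     (1.0, 5.1, 1.1), (1.2, 1.1, 9.2), (0.9, 8.2, 4.1)]
y = [0, 0, 0, 1, 1, 1]
subset, rr, n_models = exhaustive_search(X, y)
print(subset, rr, n_models)
```

With d = 3 features the loop evaluates 2^3 - 1 = 7 models; the same loop with d = 10 would evaluate 1023, which is where the cost comes from.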
Exhaustive Search • Direct exhaustive search • 1 input: x1, x2, x3, x4, x5 • 2 inputs: {x1, x2}, {x1, x3}, {x1, x4}, {x1, x5}, {x2, x3}, ... • 3 inputs: {x1, x2, x3}, {x1, x2, x4}, {x1, x2, x5}, {x1, x3, x4}, ... • 4 inputs: {x1, x2, x3, x4}, {x1, x2, x3, x5}, {x1, x2, x4, x5}, ... • ...
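As a worked count (added here for illustration), the number of candidate models at each level with k inputs is a binomial coefficient, and summing over all levels gives the total number of models that exhaustive search must evaluate:

```latex
\sum_{k=1}^{d} \binom{d}{k} = 2^{d} - 1,
\qquad
d = 5:\; 5 + 10 + 10 + 5 + 1 = 31,
\qquad
d = 10:\; 2^{10} - 1 = 1023 .
```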
Exhaustive Search • Characteristics of exhaustive search for feature selection • The process is time consuming, but the identified feature set is optimal. • It’s possible to use classifiers other than KNNC. • It’s possible to use performance indices other than the LOO recognition rate.
Heuristic Search • Heuristic search for input selection • One-pass ranking • Sequential forward selection • Generalized sequential forward selection • Sequential backward selection • Generalized sequential backward selection • ‘Add m, remove n’ selection • Generalized ‘add m, remove n’ selection
Sequential Forward Selection • Steps for sequential forward selection • Use KNNC as the classifier, LOO for RR estimate • Select the first feature that has the best RR. • Select the next feature (among all unselected features) that, together with the selected features, gives the best RR. • Repeat the previous step until all features are selected. • Advantage • If we have d features, we need to evaluate only d(d+1)/2 models → a lot more efficient. • Drawback • The selected features are not always optimal.
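A minimal sketch of the greedy loop described above. The scoring function is passed in, so any classifier and RR estimate can be plugged in; the `toy_score` function and its weights are purely hypothetical stand-ins for a real LOO recognition rate, and the function names are not from the original slides.

```python
def sequential_forward_selection(d, score):
    """Greedy SFS over feature indices 0..d-1.
    score(subset) returns the estimated RR for a tuple of feature indices.
    Returns (history, n_models): the (subset, rr) chosen at each step and
    the total number of models evaluated."""
    selected, remaining = [], list(range(d))
    history, n_models = [], 0
    while remaining:
        best_feat, best_rr = None, -1.0
        for f in remaining:
            n_models += 1
            rr = score(tuple(selected + [f]))
            if rr > best_rr:
                best_feat, best_rr = f, rr
        selected.append(best_feat)       # keep the best feature of this round
        remaining.remove(best_feat)
        history.append((tuple(selected), best_rr))
    return history, n_models

# Hypothetical scoring function: each feature adds some value, each input costs a bit.
WEIGHTS = [0.03, 0.30, 0.08, 0.20, 0.01]

def toy_score(subset):
    return 0.3 + sum(WEIGHTS[k] for k in subset) - 0.05 * len(subset)

history, n_models = sequential_forward_selection(5, toy_score)
best_subset, best_rr = max(history, key=lambda h: h[1])
print(n_models, best_subset)
```

For d = 5 the loop evaluates 5 + 4 + 3 + 2 + 1 = 15 = d(d+1)/2 models. Because each round only extends the previous choice, the result can miss a better subset that exhaustive search would find.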
Sequential Forward Selection • Sequential forward selection (SFS) • 1 input: x1, x2, x3, x4, x5 • 2 inputs: {x2, x1}, {x2, x3}, {x2, x4}, {x2, x5} • 3 inputs: {x2, x4, x1}, {x2, x4, x3}, {x2, x4, x5} • 4 inputs: {x2, x4, x3, x1}, {x2, x4, x3, x5} • ...
Example: Iris Dataset • Sequential forward selection • Exhaustive search
Example: Wine Dataset • SFS • SFS with input normalization • Results shown on the slide: 6 selected features, LOO RR = 97.8%; 3 selected features, LOO RR = 93.8% • If we use exhaustive search, we have 8 features with LOO RR = 99.4%
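The slide does not say which normalization is used. The sketch below assumes z-score normalization (zero mean, unit standard deviation per feature), a common choice for distance-based classifiers such as KNNC, since features with large numeric ranges otherwise dominate the distance. The function name and the data are hypothetical.

```python
from statistics import mean, pstdev

def zscore_normalize(samples):
    """Rescale each feature (column) to zero mean and unit standard deviation."""
    columns = list(zip(*samples))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # constant features are left unscaled
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas))
            for row in samples]

# Hypothetical two-feature data with very different numeric ranges.
X = [(0.1, 1000.0), (0.2, 1500.0), (0.9, 1100.0), (1.0, 1400.0)]
Xn = zscore_normalize(X)
print(Xn)
```

After normalization both features contribute on a comparable scale to the distance computation, so feature selection is no longer biased toward the feature with the largest raw range.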
Use of Input Selection • Common use of input selection • Increase the model complexity sequentially by adding more inputs • Select the model that has the best test RR • Typical curve of error vs. model complexity • Determine the model structure with the least test error • [Figure: error rate vs. model complexity (# of selected inputs), with training error and test error curves; the optimal structure is marked at the minimum of the test error]
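Picking the model structure with the least test error can be sketched as a simple argmin over the error curve. The error values below are made up for illustration only and are not results from the slides; the function name is hypothetical.

```python
def pick_optimal_complexity(test_errors):
    """Return the number of selected inputs (1-based) with the smallest test error."""
    best_index = min(range(len(test_errors)), key=lambda i: test_errors[i])
    return best_index + 1

# Hypothetical error rates for models with 1, 2, ..., 6 selected inputs.
training_error = [0.30, 0.20, 0.12, 0.08, 0.05, 0.03]  # keeps decreasing
test_error = [0.32, 0.24, 0.18, 0.17, 0.21, 0.26]      # decreases, then rises again
n_inputs = pick_optimal_complexity(test_error)
print(n_inputs)
```

In this made-up example the training error keeps falling as inputs are added, while the test error is lowest at 4 inputs, which is the structure one would select.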