
Presentation Transcript


  1. The University of British Columbia, Department of Electrical & Computer Engineering. Feature Selection and Weighting using Genetic Algorithm for Off-line Character Recognition Systems. Presented by Faten Hussein

  2. Outline • Introduction & Problem Definition • Motivation & Objectives • System Overview • Results • Conclusions

  3. Introduction Off-line Character Recognition System pipeline: text document → Scanning → Pre-Processing → Feature Extraction → Classification → Post-Processing → classified text. Applications: • Address readers • Bank cheque readers • Reading data entered in forms (tax forms) • Detecting forged signatures

  4. Introduction For a typical handwriting recognition task: • Characters (symbols) vary widely in shape and size • Different writers have different writing styles • The same person may write in different styles at different times • Thus, an effectively unlimited number of variations exists for a single character

  5. Introduction [Figure: variations in handwritten digits extracted from zip codes, annotated with the number of loops L and end points E, e.g. L=0, E=3; L=1, E=1; L=2, E=0] To overcome this diversity, a large number of features must be added. Examples of the features we used: moment invariants, number of loops, number of end points, centroid, area, circularity, and so on.

  6. Problem Dilemma Adding more features to a character recognition system: • Accommodates variations in symbols • Will hopefully increase classification accuracy But it also: • Increases problem size • Increases run time/memory for classification • Might add redundant/irrelevant features, which decrease accuracy • Is an ad-hoc process that depends on experience and trial and error

  7. Feature Selection Solution: Feature Selection. Definition: Select a relevant subset of features from a larger set of features while maintaining or enhancing accuracy. Advantages • Removes irrelevant and redundant features (e.g., a total of 40 features was reduced to 16; of the 7 Hu moments, only the first three were kept; area was removed as redundant given circularity) • Maintains/enhances classification accuracy (a 70% recognition rate using 40 features rose to 75% after FS using only 16 features) • Faster classification and lower memory requirements

  8. Feature Selection/Weighting • The process of assigning weights (binary or real-valued) to features needs a search algorithm to find the set of weights that yields the best classification accuracy (an optimization problem) • A genetic algorithm (GA) is a good search method for such optimization problems
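To make the distinction concrete: with binary weights, a weighted distance reduces to feature selection (a zero weight removes the feature entirely), while a fractional weight merely de-emphasizes a feature. A minimal Python sketch; the function name and example values are illustrative, not from the slides:

import numpy as np

def weighted_distance(x, y, w):
    # Feature weights scale each dimension's contribution to the distance;
    # w[i] = 0 removes feature i entirely (feature selection), while a
    # fractional w[i] de-emphasizes it (feature weighting).
    return np.sqrt(np.sum(w * (x - y) ** 2))

x = np.array([1.0, 5.0, 2.0])
y = np.array([2.0, 1.0, 4.0])
# Binary weights (selection): keep features 0 and 2, drop feature 1.
print(weighted_distance(x, y, np.array([1.0, 0.0, 1.0])))
# Real-valued weights (weighting): de-emphasize feature 1.
print(weighted_distance(x, y, np.array([1.0, 0.25, 1.0])))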

  9. Genetic Feature Selection/Weighting Why use a GA for FS/FW • It has been proven to be a powerful search method for the FS problem • It does not require derivative information or any extra knowledge; only the objective function (the classifier's error rate) is needed to evaluate the quality of a feature subset • It searches a population of solutions in parallel, so it can provide a number of potential solutions rather than only one • A GA is resistant to becoming trapped in local minima
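The slides do not specify the GA operators or parameters; the sketch below is one common configuration (tournament selection, single-point crossover, bit-flip mutation, elitism) over binary feature masks. The fitness callable stands in for the wrapper evaluation described on slide 11:

import random

def genetic_feature_selection(fitness, n_features, pop_size=20,
                              generations=30, p_mut=0.05):
    # Each chromosome is a binary mask: 1 = keep the feature, 0 = drop it.
    pop = [[random.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        next_pop = ranked[:2]  # elitism: carry the two best masks forward
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random candidates.
            p1 = max(random.sample(ranked, 3), key=fitness)
            p2 = max(random.sample(ranked, 3), key=fitness)
            cut = random.randrange(1, n_features)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [g ^ (random.random() < p_mut) for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Toy fitness: reward masks that match a known-good selection.
target = [1, 0, 1, 0, 1, 0]
best = genetic_feature_selection(
    lambda m: sum(a == b for a, b in zip(m, target)), n_features=6)
print(best)  # converges toward [1, 0, 1, 0, 1, 0]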

  10. Objectives & Motivations Build a genetic feature selection/weighting system, apply it to a character recognition problem, and investigate the following issues: • Study the effect of varying the weight values on the number of selected features (FS often eliminates more features than FW, but how much more?) • Compare the performance of genetic feature selection/weighting in the presence of irrelevant & redundant features (not studied before) • Compare the performance of genetic feature selection/weighting in regular cases (test the hypothesis that FW should give better, or at least the same, results as FS) • Evaluate the performance of the better method (GFS or GFW) in terms of optimality and time complexity (study the feasibility of genetic search with respect to optimality & time)

  11. Methodology • The recognition problem is to classify isolated handwritten digits • Used a k-nearest-neighbor classifier (k=1) • Used a genetic algorithm as the search method • Applied genetic feature selection and weighting in the wrapper approach (i.e., the fitness function is the classifier's error rate), as sketched below • Used two phases during the program run: a training/testing phase and a validation phase
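A minimal sketch of such a wrapper fitness function, using a numpy-only 1-NN; the synthetic data and all names here are illustrative, not the thesis's zip-code digit set:

import numpy as np

def knn_error_rate(mask, X_train, y_train, X_test, y_test):
    # Wrapper fitness: error rate of a 1-NN classifier that sees only
    # the features kept by the binary mask (or scaled by real weights).
    w = np.asarray(mask, dtype=float)
    errors = 0
    for x, label in zip(X_test, y_test):
        # Weighted squared Euclidean distances to every training sample.
        d = ((X_train - x) ** 2 * w).sum(axis=1)
        if y_train[np.argmin(d)] != label:
            errors += 1
    return errors / len(y_test)

# Illustrative data: feature 0 carries the class, features 1-4 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(int)
print(knn_error_rate([1, 0, 0, 0, 0], X[:80], y[:80], X[80:], y[80:]))

A GA wrapping this function would minimize the returned error rate (equivalently, maximize 1 - error).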

  12. System Overview Input (isolated handwritten digit images) → Pre-Processing Module → clean images → Feature Extraction Module → all N extracted features → Feature Selection/Weighting Module (GA), which exchanges candidate feature subsets and their assessments with the Feature Subset Evaluation Module (KNN classifier) → best feature subset (M < N). The run comprises a training/testing evaluation stage and a validation stage.

  13. Results (Comparison 1) Effect of varying weight values on the number of selected features • As the number of weight values increases, the probability of a feature having weight value 0 (POZ) decreases, so the number of eliminated features decreases • GFS eliminates more features (and thus selects fewer) than GFW because of its smaller number of weight values (only 0/1), and it does so without compromising classification accuracy
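One way to see the weight-count effect (an illustration assuming each gene is drawn uniformly from the admissible weight values, an assumption the slides do not state): with W weight values, POZ = 1/W, so binary FS zeroes far more features than multi-valued FW.

# POZ illustration, assuming each gene is drawn uniformly from the
# W admissible weight values (an assumption, not stated in the slides).
n_features = 40                      # total feature count from slide 7
for W in (2, 5, 11):                 # FS (0/1) and two assumed FW settings
    poz = 1 / W
    print(f"W={W:2d}: POZ={poz:.2f}, "
          f"expected zero-weight features = {n_features * poz:.1f}")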

  14. Results (Comparison 2) Performance of genetic feature selection/weighting in the presence of irrelevant features • The performance of the 1-NN classifier degrades rapidly as the number of irrelevant features increases • As the number of irrelevant features increases, FS outperforms all FW settings in both classification accuracy and elimination of features
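The degradation is easy to reproduce on synthetic data (an illustrative numpy sketch, not the thesis experiment): one informative feature plus a growing number of pure-noise features quickly swamps the unweighted 1-NN distance.

import numpy as np

rng = np.random.default_rng(0)

def nn_accuracy(X_train, y_train, X_test, y_test):
    # Plain 1-NN accuracy using all features, unweighted.
    correct = 0
    for x, label in zip(X_test, y_test):
        d = ((X_train - x) ** 2).sum(axis=1)
        correct += int(y_train[np.argmin(d)] == label)
    return correct / len(y_test)

# One informative feature plus a growing number of noise features.
n = 200
signal = rng.normal(size=(n, 1))
labels = (signal[:, 0] > 0).astype(int)
for n_noise in (0, 5, 20, 80):
    X = np.hstack([signal, rng.normal(size=(n, n_noise))])
    acc = nn_accuracy(X[:150], labels[:150], X[150:], labels[150:])
    print(f"{n_noise:3d} irrelevant features: accuracy = {acc:.2f}")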

  15. Results (Comparison 3) Performance of genetic feature selection/weighting in the presence of redundant features • The classification accuracy of 1-NN does not suffer much from added redundant features, but they do increase the problem size • As the number of redundant features increases, FS has slightly better classification accuracy than all FW settings, but significantly outperforms FW in elimination of features

  16. Results (Comparison 4) Performance of genetic feature selection/weighting in regular cases (not necessarily containing irrelevant/redundant features) • FW achieves better training accuracies than FS, but FS generalizes better (it has better accuracy on unseen validation samples) • FW over-fits the training samples

  17. Results (Evaluation 1) Convergence of GFS to an Optimal or Near-Optimal Set of Features • GFS was able to return optimal or near-optimal values (as found by exhaustive search) • The worst average value obtained by GFS was less than 1% away from the optimal value

  18. Results (Evaluation 2) Convergence of GFS to an Optimal or Near-Optimal Set of Features within an Acceptable Number of Generations

Number of Features | Best Exh. (opt. & near-opt.) | Exhaustive Run Time | Best GA | Average GA (5 runs) | Number of Generations | GA Run Time (single run)
8  | 74, 73.8   | 2 minutes  | 74   | 73.68 | 5  | 2 minutes
10 | 75.2, 75   | 13 minutes | 75.2 | 74.96 | 5  | 3 minutes
12 | 77.2, 77   | 47 minutes | 77   | 76.92 | 10 | 5 minutes
14 | 79, 78.8   | 3 hours    | 79   | 78.2  | 10 | 5.5 minutes
16 | 79.2, 79   | 6 hours    | 79.2 | 78.48 | 15 | 8 minutes
18 | 79.4, 79.2 | 1.5 days   | 79.4 | 78.92 | 20 | 11 minutes

The time needed for GFS is bounded by a lower linear-fit curve and an upper exponential-fit curve. Using GFS for highly dimensional problems needs parallel processing.
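The growing gap in the table follows from simple counting (illustrative arithmetic; the GA budget below is an assumption, not the thesis's exact setting): an exhaustive wrapper search must evaluate every non-empty feature subset, 2^N - 1 of them, while the GA evaluates roughly pop_size x generations candidates.

# Exhaustive wrapper search vs. GA evaluation counts -- illustrative
# arithmetic only; the GA budget is an assumed, not reported, setting.
pop_size, generations = 20, 20
for n in (8, 12, 18, 40):
    print(f"N={n:2d}: exhaustive subsets = {2**n - 1:>16,}, "
          f"GA evaluations ~ {pop_size * generations}")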

  19. Conclusions • GFS is superior to GFW in feature reduction, without compromising classification accuracy • In the presence of irrelevant features, GFS is better than GFW in both feature reduction and classification accuracy • In the presence of redundant features, GFS is also preferred over GFW due to its greater ability to reduce features • For regular databases, it is advisable to use at most 2 or 3 weight values to avoid over-fitting • GFS is a reliable method for finding optimal or near-optimal solutions, but needs parallel processing for large problem sizes

  20. Questions?
