Genetic Algorithms and Linear Discriminant Analysis based Dimensionality Reduction for Remotely Sensed Image Analysis

Minshan Cui, Saurabh Prasad, Majid Mahroogy, Lori Mann Bruce, James Aanstoos
Traditional Approaches (Stepwise Selection, Greedy Search, …)

Stepwise LDA (S-LDA, also referred to as DAFE):
• A preliminary forward selection and backward rejection pass discards less relevant features.
• A Linear Discriminant Analysis (LDA) projection is then applied to this reduced subset of features to further reduce the dimensionality of the feature space.

Drawbacks:
• In forward selection, features cannot be reevaluated once they become irrelevant after other features are added.
• In backward rejection, features cannot be reevaluated once they have been discarded.

A sketch of this two-stage pipeline follows below.
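The following is a minimal sketch of the S-LDA idea using scikit-learn. The estimator choice, scoring, and the number of features to select are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector

def stepwise_lda(X, y, n_selected=20):
    """Forward-select a feature subset, then project it with LDA."""
    # Stage 1: greedy forward selection (features added here are never
    # revisited -- the drawback noted above).
    selector = SequentialFeatureSelector(
        LinearDiscriminantAnalysis(),
        n_features_to_select=n_selected,
        direction="forward",
    )
    X_sel = selector.fit_transform(X, y)

    # Stage 2: LDA projection onto at most (n_classes - 1) dimensions.
    lda = LinearDiscriminantAnalysis()
    return lda.fit_transform(X_sel, y)
```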
Genetic Algorithm

Genetic algorithms are a class of optimization techniques that search for the global optimum of a fitness function. Each iteration typically involves four steps: evaluation, reproduction, recombination (crossover), and mutation.
Genetic Algorithm (selecting 4 bands out of 10)

Population → Fitness Function → Fitness Value → Rank → Reproduction (crossover and mutation) → Next Generation. This cycle repeats until one of the stopping criteria is met.
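Below is a minimal sketch of the loop pictured above, selecting 4 bands out of 10. The fitness function, rates, and population size are illustrative assumptions; the paper uses Bhattacharyya distance or Fisher's ratio as the fitness function.

```python
import random

N_BANDS, N_SELECT, POP_SIZE, N_GENERATIONS = 10, 4, 20, 50

def fitness(individual):
    # Placeholder score for a candidate band subset (higher is better);
    # replace with Bhattacharyya distance or Fisher's ratio.
    return sum(individual)

def random_individual():
    return random.sample(range(N_BANDS), N_SELECT)

def crossover(p1, p2):
    # Combine two parents while keeping exactly N_SELECT distinct bands.
    pool = list(set(p1) | set(p2))
    return random.sample(pool, N_SELECT)

def mutate(ind):
    # Swap one selected band for an unused one.
    out = ind[:]
    out[random.randrange(N_SELECT)] = random.choice(
        [b for b in range(N_BANDS) if b not in out])
    return out

population = [random_individual() for _ in range(POP_SIZE)]
for _ in range(N_GENERATIONS):        # stopping criterion: generation cap
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[:2]                # elite kids survive unchanged
    kids = [mutate(crossover(*random.sample(ranked[:10], 2)))
            for _ in range(POP_SIZE - 2)]
    population = elite + kids
best = max(population, key=fitness)
```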
Genetic Algorithm based Linear Discriminant Analysis

• Use a genetic algorithm, with Bhattacharyya distance (BD) or Fisher's ratio as the fitness function, to select the most relevant features in a dataset.
• Apply linear discriminant analysis to the selected features to further extract features.
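The following is a hedged sketch of the two candidate fitness functions for a pair of Gaussian classes; the paper does not spell out its multi-class averaging strategy here, so these are the standard pairwise definitions.

```python
import numpy as np

def bhattacharyya_distance(X1, X2):
    """BD between two classes, assuming Gaussian class distributions."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
    S = (S1 + S2) / 2.0
    diff = m2 - m1
    term1 = diff @ np.linalg.solve(S, diff) / 8.0
    # Use slogdet for numerical stability with high-dimensional data.
    term2 = 0.5 * (np.linalg.slogdet(S)[1]
                   - 0.5 * (np.linalg.slogdet(S1)[1]
                            + np.linalg.slogdet(S2)[1]))
    return term1 + term2

def fisher_ratio(x1, x2):
    """Fisher's ratio for a single feature and two classes."""
    return (x1.mean() - x2.mean()) ** 2 / (x1.var() + x2.var())
```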
Genetic Algorithm based Linear Discriminant Analysis

Original Features → Genetic Algorithm (fitness function: Bhattacharyya distance or Fisher's ratio) → Selected Features → Linear Discriminant Analysis → Extracted Features
Experimental Hyperspectral Dataset

Hyperspectral Imagery (HSI):
• Acquired with NASA's AVIRIS sensor.
• 145x145 pixels and 220 bands in the 400 to 2450 nm region of the visible and infrared spectrum.

[Figure: ground truth of the HSI data and example feature layers.]
Figure 1: A plot of reflectance versus wavelength for eight classes of spectral signatures from AVIRIS Indian Pines data.
Experimental Hyperspectral Dataset

[Figure: herbicide response imagery acquired 3 and 21 days after spray, for an untreated check and treatments of 0.01, 0.02, 0.03, 0.05, 0.11, 0.22, and 0.43 kg ae/ha.]
Experimental Synthetic Aperture Radar Dataset

Synthetic Aperture Radar (SAR):
• Acquired from NASA Jet Propulsion Laboratory's Unmanned Aerial Vehicle Synthetic Aperture Radar (UAVSAR).
• Two classes: healthy levees and levees with landslides on them.
• Feature layers extracted via gray-level co-occurrence matrices (GLCM).

[Figure: breached levee and ground truth of the SAR data.]
Table 1: Illustrating some salient characteristics of UAVSAR.
Experiments

HSI and SAR analysis using:
• LDA
• Stepwise LDA (S-LDA)
• GA-LDA-Fisher (Fisher's ratio as the fitness function in the GA)
• GA-LDA-BD (Bhattacharyya distance as the fitness function in the GA)

Performance measure: overall recognition accuracy.
Conclusions

• GA search is very effective at selecting the most pertinent features.
• Given a moderate feature-space dimensionality and sufficient training samples, LDA is a good projection-based dimensionality reduction strategy.
• As the number of features increases and the training-sample size decreases, methods such as GA-LDA can assist by providing a robust intermediate step that prunes away redundant and less useful features.
Thank You
Questions - Comments - Suggestions
Minshan Cui, minshan@gri.msstate.edu
How many elite, crossover, and mutate kids will be produced in the next generation?

Elite count: specifies the number of individuals that are guaranteed to survive to the next generation.
Crossover fraction: specifies the fraction of the next generation, other than elite children, that is produced by crossover.

Example: assume population size = 10, elite count = 2, and crossover fraction = 0.8.
nEliteKids = 2
nCrossoverKids = round(CrossoverFraction × (10 − nEliteKids)) = round(0.8 × (10 − 2)) = 6
nMutateKids = 10 − nEliteKids − nCrossoverKids = 10 − 2 − 6 = 2
How many parents does the GA need to produce the crossover and mutate kids?

Since 2 parents produce 1 crossover kid and 1 parent produces 1 mutate kid, the GA will need:
nParents = 2 × nCrossoverKids + nMutateKids = 2 × 6 + 2 = 14
That is, 12 parents to produce the 6 crossover kids and 2 parents to produce the 2 mutate kids. A sketch of this bookkeeping follows below.
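A minimal sketch of the reproduction bookkeeping above, following the MATLAB-style GA conventions (elite count, crossover fraction) used on these slides:

```python
def reproduction_counts(pop_size, n_elite, crossover_fraction):
    """Return (elite, crossover, mutate, parents-needed) counts."""
    n_crossover = round(crossover_fraction * (pop_size - n_elite))
    n_mutate = pop_size - n_elite - n_crossover
    # 2 parents per crossover kid, 1 parent per mutate kid.
    n_parents = 2 * n_crossover + n_mutate
    return n_elite, n_crossover, n_mutate, n_parents

print(reproduction_counts(10, 2, 0.8))  # -> (2, 6, 2, 14)
```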
How are individuals selected to be parents?

Feature number: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Fitness value: 3.68, 17.24, 21.46, 9.26, 7.59, 7.92, 104.22, 6.47, 13.25, 12.22

Lay the individuals out on a line, where the space each individual occupies is proportional to its fitness (or rank). Place 14 equally spaced arrows along this line; the individuals the arrows land on are selected as parents.
Selected parents = 1 1 2 3 4 4 4 5 6 7 8 8 9 10
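Here is a hedged sketch of the "equally spaced arrows" scheme above, commonly known as stochastic universal sampling; the random-offset handling is an assumption, and raw fitness is used rather than rank scaling.

```python
import random
from itertools import accumulate

def stochastic_universal_sampling(fitness_values, n_parents):
    """Pick n_parents indices with probability proportional to fitness."""
    total = sum(fitness_values)
    spacing = total / n_parents
    start = random.uniform(0, spacing)    # one random offset, then
    arrows = [start + i * spacing for i in range(n_parents)]
    cumulative = list(accumulate(fitness_values))
    parents, idx = [], 0
    for arrow in arrows:                  # walk arrows and boundaries
        while cumulative[idx] < arrow:
            idx += 1
        parents.append(idx + 1)           # 1-based feature numbers
    return parents

fitness = [3.68, 17.24, 21.46, 9.26, 7.59, 7.92, 104.22, 6.47, 13.25, 12.22]
print(stochastic_universal_sampling(fitness, 14))
```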
How are the crossover and mutate kids produced?

First, randomize the order of the selected parents:
parents = 3 4 4 2 1 5 10 4 1 8 6 9 6 8
The first 12 parents are paired to produce the 6 crossover kids:
parents = [3 4] [4 2] [1 5] [10 4] [1 8] [6 9]
The remaining 2 parents produce the 2 mutate kids:
parents = [6] [8]
Crossover

Single point: choose a crossover point; the kid takes its genes from the first parent up to that point and from the second parent after it.
Two parents (individuals 3 & 4):
Parent A: 193.2 19.736 215.74 129.85 55.08 142.92 183.95 11.855 98.849 155.64
Parent B: 103.73 15.757 142.21 95.922 2.9524 95.908 184.02 63.021 179.7 183.07
Crossover kid (point after gene 5): 193.2 19.736 215.74 129.85 55.08 95.908 184.02 63.021 179.7 183.07

Scattered: randomly generate a binary string; 1 means take the gene from the second parent, 0 means keep the gene from the first parent.
Binary string: 1 0 0 1 1 0 1 0 0 1
Crossover kid: 103.73 19.736 215.74 95.922 2.9524 142.92 184.02 11.855 98.849 183.07
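A minimal sketch of the two crossover schemes above, reusing the parent vectors from the slide:

```python
import random

def single_point_crossover(p1, p2, point=None):
    """Kid takes p1's genes up to `point`, p2's genes after it."""
    point = random.randrange(1, len(p1)) if point is None else point
    return p1[:point] + p2[point:]

def scattered_crossover(p1, p2, mask=None):
    """Kid takes p2's gene where mask is 1, p1's gene where mask is 0."""
    mask = [random.randint(0, 1) for _ in p1] if mask is None else mask
    return [b if m else a for a, b, m in zip(p1, p2, mask)]

parent_a = [193.2, 19.736, 215.74, 129.85, 55.08,
            142.92, 183.95, 11.855, 98.849, 155.64]
parent_b = [103.73, 15.757, 142.21, 95.922, 2.9524,
            95.908, 184.02, 63.021, 179.7, 183.07]
print(single_point_crossover(parent_a, parent_b, point=5))
print(scattered_crossover(parent_a, parent_b,
                          mask=[1, 0, 0, 1, 1, 0, 1, 0, 0, 1]))
```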
Mutation

Gaussian mutation: adds a random number drawn from a Gaussian distribution with mean 0 to each entry of the parent vector.
Parent: 103.17 192.15 51.61 210.8 78.211 188.76 177.62 125.36 198.09 140.78
Mutate kid: 169.97 211.92 4.7027 82.935 172.73 23.559 123.35 32.11 158.63 214.26

Uniform mutation: first, the algorithm selects a fraction of the vector entries of an individual for mutation, where each entry has probability Rate of being mutated. Second, it replaces each selected entry with a random number drawn uniformly from the range for that entry.
Parent: 103.17 192.15 51.61 210.8 78.211 188.76 177.62 125.36 198.09 140.78
Mutate kid: 103.17 213.45 51.61 210.8 45.231 188.76 177.62 97.56 198.09 140.78
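A minimal sketch of the two mutation schemes above; the Gaussian scale and the gene range are illustrative assumptions.

```python
import random

def gaussian_mutation(parent, scale=50.0):
    """Add zero-mean Gaussian noise to every entry."""
    return [g + random.gauss(0.0, scale) for g in parent]

def uniform_mutation(parent, rate=0.3, low=0.0, high=220.0):
    """Replace each entry, with probability `rate`, by a uniform draw."""
    return [random.uniform(low, high) if random.random() < rate else g
            for g in parent]

parent = [103.17, 192.15, 51.61, 210.8, 78.211,
          188.76, 177.62, 125.36, 198.09, 140.78]
print(gaussian_mutation(parent))
print(uniform_mutation(parent))
```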
Next generation

nextGeneration = [eliteKids, crossoverKids, mutateKids]
Stopping criteria

Generations: specifies the maximum number of iterations for the genetic algorithm to perform. The default is 100.
Time limit: specifies the maximum time in seconds the genetic algorithm runs before stopping.
Fitness limit: the algorithm stops if the best fitness value is less than or equal to the value of Fitness limit.
Stall generations: the algorithm stops if the weighted average change in the fitness function value over Stall generations is less than Function tolerance.
Stall time limit: the algorithm stops if there is no improvement in the best fitness value for an interval of time in seconds specified by Stall time limit.
Function tolerance: the algorithm runs until the cumulative change in the fitness function value over Stall generations is less than or equal to Function tolerance.

A sketch of such a check appears below.
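The following is a hedged sketch of how these criteria might be checked once per generation; the parameter names and defaults mirror the slide, not any specific library, and a simple change-over-window test stands in for the weighted-average stall test.

```python
import time

def should_stop(gen, start_time, best_history,
                max_generations=100, time_limit=300.0,
                fitness_limit=-float("inf"),
                stall_generations=50, function_tolerance=1e-6):
    """Return a reason string if any stopping criterion is met, else None."""
    if gen >= max_generations:
        return "generations"
    if time.time() - start_time >= time_limit:
        return "time limit"
    if best_history and best_history[-1] <= fitness_limit:
        return "fitness limit"
    if len(best_history) > stall_generations:
        window = best_history[-stall_generations:]
        if abs(window[0] - window[-1]) <= function_tolerance:
            return "stall generations / function tolerance"
    return None
```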