RBF TWO-STAGE LEARNING NETWORKS: EXPLOITATION OF SUPERVISED DATA IN THE SELECTION OF HIDDEN UNIT PARAMETERS An application to SAR data classification
Objectives Improve the performance of RBF two-stage learning networks by introducing parameter-adaptation criteria in which supervised data are exploited to adjust the hidden-unit parameters (traditionally determined by unsupervised learning techniques)
Topics of discussion • Properties of RBF networks vs MLPs • One- and two-stage RBF network learning strategies • Recent developments in RBF two-stage learning algorithms (Bruzzone, IEEE TGARS, 1999) • Classification examples
[Figure: MLP network (hard-threshold perceptron) and RBF network architectures]
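As a minimal illustration of the two architectures (our own sketch; all names and shapes are hypothetical, not from the slides), the forward passes below contrast the MLP's global units with the RBF network's localized Gaussian units. A differentiable sigmoid stands in for the hard-threshold perceptron:

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: global sigmoidal units; every input
    dimension influences every hidden unit."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # hidden activations
    return W2 @ h + b2                          # linear output layer

def rbf_forward(x, centers, sigmas, W):
    """RBF network: localized Gaussian units, linear output layer;
    only units whose centers lie near x respond appreciably."""
    sq = np.sum((centers - x) ** 2, axis=1)     # squared distances to centers
    phi = np.exp(-sq / (2.0 * sigmas ** 2))     # localized Gaussian responses
    return W @ phi
```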
MLP network • Advantages • distribution-free • importance-free (data fusion) • one-stage supervised learning algorithm (BP) • Disadvantages • slow to train • convergence to a local minimum • high output responses to input data that fall into regions of the input space where there are no training examples (extrapolation) • sensitive to outliers, which affect every free parameter • network topology is not data-driven (model selection)
RBF network • Advantages • simple two-layer architecture to solve complex tasks • two-stage hybrid (unsupervised 1st stage + supervised 2nd stage) learning scheme • fast to train • closed-form linear optimization of the output weights (see the sketch below) • localized BFs • low output responses to input data that fall into regions of the input space where there are no training examples • learning of each input sample affects only a specialized subset of the network parameters (modularization: outliers do not affect every free parameter) • easy interpretation of the processing units
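The closed-form linear optimization of the output weights can be sketched as an ordinary least-squares solve: once the hidden layer is fixed, the outputs are linear in the basis activations. The function name and the small ridge term are our own additions (reg=0 recovers the plain pseudo-inverse solution):

```python
import numpy as np

def fit_output_weights(Phi, T, reg=1e-8):
    """Closed-form least-squares fit of the RBF output weights.
    Phi: (n_samples, n_basis) hidden activations; T: (n_samples, n_outputs)
    target matrix. The tiny ridge term guards against the numerical
    instability of a bare pseudo-inverse."""
    A = Phi.T @ Phi + reg * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ T)        # (n_basis, n_outputs) weights
```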
RBF network • Disadvantages • the classification error strongly depends on the selection of the number, centers and widths of the BFs • multiplicity of one-stage (supervised [error-driven]) and two-stage (supervised, hybrid [error-driven + data-driven]) learning algorithms
RBF network learning strategies • One-stage supervised (error-driven) learning • GBF sum-of-squared error gradient descent (Bishop, 1995; see the sketch below) • Disadvantages • GBFs may not stay localized (nothing constrains the spread parameters, which can grow during training) • no effect on the positions (centers) of GBFs • no model selection • new types of BFs suitable for gradient descent learning (Karayiannis, IEEE TNN, 1998) • Disadvantages • no model selection • constructive learning (Fritzke, IEEE TNN, 1994) • Disadvantages • unlimited growing • unstable (small input variations cause large output changes)
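A hedged sketch of the sum-of-squared-error gradient descent and its localization problem (our own minimal version, adapting widths and output weights only, since the slide notes that this scheme does not move the centers):

```python
import numpy as np

def gbf_sse_step(X, T, C, s, W, lr=0.01):
    """One gradient-descent step on the sum-of-squared error of a Gaussian-BF
    network. X: (n,d) inputs, T: (n,c) targets, C: (m,d) fixed centers,
    s: (m,) widths, W: (c,m) output weights. Nothing constrains s, so the
    widths can grow until the GBFs lose locality -- the drawback above."""
    sq = np.sum((X[:, None, :] - C[None, :, :]) ** 2, axis=-1)  # (n,m)
    Phi = np.exp(-sq / (2.0 * s ** 2))            # Gaussian activations
    E = Phi @ W.T - T                             # output errors
    G = (E @ W) * Phi                             # error backpropagated to Phi
    W = W - lr * (E.T @ Phi)                      # output-weight update
    s = s - lr * np.sum(G * sq, axis=0) / s ** 3  # unconstrained width update
    return s, W
```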
RBF network learning strategies • Two-stage learning • hybrid learning (Moody and Darken, 1989) • first stage (hidden layer): data-driven • BF centers • clustering • BF spread parameters • p-nearest neighbor heuristic • second stage: error-driven • gradient descent • pseudo-inverse linear optimization (may be unstable) • majority voting
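The data-driven first stage can be sketched as follows (a hedged illustration: k-means is one common clustering choice, and n_basis and p are user-set values, not numbers from the slides). The second stage would then fit the output weights as in the earlier least-squares sketch:

```python
import numpy as np
from sklearn.cluster import KMeans

def hybrid_first_stage(X, n_basis, p=2):
    """Moody-and-Darken-style first stage: BF centers from unsupervised
    clustering, widths from the p-nearest-neighbor heuristic (mean distance
    from each center to its p nearest fellow centers)."""
    centers = KMeans(n_clusters=n_basis, n_init=10).fit(X).cluster_centers_
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                      # ignore self-distances
    sigmas = np.sort(d, axis=1)[:, :p].mean(axis=1)  # p-NN width heuristic
    return centers, sigmas
```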
RBF network learning strategies • Two-stage learning • hybrid learning (Moody and Darken, 1989) • Disadvantages • no model selection (the number of BFs must be fixed a priori) • mixed clusters: unsupervised learning does not reflect the local complexity of the classification problem at hand • increasing the number of BFs does not guarantee improved performance
RBF network learning strategies • Two-stage learning • constructive (Karayiannis, 1998): error-driven location of new BFs • Disadvantages • only one unit is inserted per two-stage growing cycle • supervised learning • first stage (hidden layer) • BF centers • Karayiannis, 1998: gradient descent on the “localized class-conditional activation variance” (LOCCAV) • Bruzzone, IEEE TGARS, 1999: class-conditional (constructive) clustering (sketched below) • BF spread parameters • Karayiannis: LOCCAV or “localized class-conditional quantization error” (LOCCEQ) • Bruzzone: class-conditional p-nearest neighbor heuristic
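Class-conditional clustering can be sketched as clustering each class's training samples separately, so that every BF center inherits a class label (which the width rule on the next slide requires). This static sketch is our own simplification: the per-class numbers of centers are fixed here, whereas Bruzzone's scheme grows them constructively:

```python
import numpy as np
from sklearn.cluster import KMeans

def class_conditional_centers(X, y, per_class):
    """Cluster each class separately; `per_class` maps a class label to its
    number of BF centers (a user choice in this simplified sketch).
    Returns all centers plus the class label attached to each one."""
    centers, labels = [], []
    for c in np.unique(y):
        k = int(per_class[c])
        km = KMeans(n_clusters=k, n_init=10).fit(X[y == c])
        centers.append(km.cluster_centers_)
        labels += [c] * k
    return np.vstack(centers), np.array(labels)
```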
RBF network learning strategies • Two-stage learning • supervised learning • first stage (hidden layer) • BF spread parameters • Bruzzone: class-conditional p-nearest neighbor heuristic. Given a BF center i belonging to a given class, if the 3 nearest BF centers h, k and m belong to the same class, then the less conservative choice σi = (d(i, h) + d(i, k)) / 2 is taken; otherwise the more conservative LOCCEQ choice is used (see the sketch below)
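A direct transcription of that rule (interpreting "the same class" as the class of center i; `locceq_widths` stands in for the conservative LOCCEQ values, computed elsewhere, since the slides do not reproduce the LOCCEQ formula):

```python
import numpy as np

def class_conditional_widths(centers, labels, locceq_widths):
    """Class-conditional p-nearest-neighbor width heuristic: widen a BF
    freely only when its 3 nearest centers share its class; otherwise
    fall back on the conservative LOCCEQ value."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                    # exclude the center itself
    sigmas = np.empty(len(centers))
    for i in range(len(centers)):
        h, k, m = np.argsort(d[i])[:3]             # three nearest BF centers
        if labels[h] == labels[k] == labels[m] == labels[i]:
            sigmas[i] = (d[i, h] + d[i, k]) / 2.0  # less conservative choice
        else:
            sigmas[i] = locceq_widths[i]           # conservative LOCCEQ fallback
    return sigmas
```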
Bruzzone’s RBF network two-stage learning strategy: SAR ERS-1/ERS-2 tandem pair data classification task (IGARSS 2000)
Conclusions Simple heuristic techniques that exploit supervised data in learning the hidden-unit parameters of an RBF two-stage learning network may lead to: • improved classification performance • enhanced performance stability with respect to changes in the number of hidden units • integration of class-conditional data with constructive unsupervised learning techniques to address the model selection issue