Transfer functions: hidden possibilities for better neural networks.

Presentation Transcript


  1. Transfer functions: hidden possibilities for better neural networks. Włodzisław Duch and Norbert Jankowski, Department of Computer Methods, Nicholas Copernicus University, Torun, Poland. http://www.phys.uni.torun.pl/kmk

  2. Why is this an important issue? MLPs are universal approximators - so is there no need for other TFs? Wrong bias => poor results, complex networks. Examples of 2-class problems: Class 1 inside a sphere, Class 2 outside. MLP: at least N+1 hyperplanes, O(N²) parameters. RBF: 1 Gaussian, O(N) parameters. Class 1 in the corner cut off by the (1,1,...,1) hyperplane, Class 2 outside. MLP: 1 hyperplane, O(N) parameters. RBF: many Gaussians, O(N²) parameters, poor approximation.
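
To make the parameter-count argument concrete, here is a minimal numerical sketch (not from the slides; the dimension, radius and thresholds below are arbitrary assumptions) comparing a single Gaussian node with a single sigmoidal (hyperplane) node on the "Class 1 inside a sphere" problem:

```python
# Single Gaussian node vs. single hyperplane node on the sphere problem.
# Hypothetical illustration; dimension, radius and weights are assumptions.
import numpy as np

N = 3
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(1000, N))
y = (np.linalg.norm(X, axis=1) < 0.8).astype(int)      # Class 1 = inside sphere

# Gaussian (RBF) node: center c and width b, O(N) parameters.
c, b = np.zeros(N), 0.8
g = np.exp(-np.sum((X - c) ** 2, axis=1) / b ** 2)
acc_rbf = np.mean((g > np.exp(-1.0)) == y)              # threshold at radius b

# Single sigmoidal node: scalar-product activation W.X + theta, one hyperplane;
# no single hyperplane can enclose a sphere.
W, theta = np.ones(N), 0.0
s = 1.0 / (1.0 + np.exp(-(X @ W + theta)))
acc_mlp = np.mean((s > 0.5) == y)

print(f"one Gaussian node: {acc_rbf:.2f}, one hyperplane node: {acc_mlp:.2f}")
```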

  3. Inspirations Logical rule: IF x1>0 & x2>0 THEN Class 1 ELSE Class 2 is not properly represented by either MLP or RBF! Result: decision trees and logical rules perform significantly better than MLPs on some datasets (cf. hypothyroid)! Speed of learning and network complexity depend on the TF. Fast learning requires flexible "brain modules" - TF. • Biological inspirations: sigmoidal neurons are a crude approximation at the basic level of neural tissue. • Interesting brain functions are carried out by interacting minicolumns, implementing complex functions. • Modular networks: networks of networks. • First step beyond single neurons: transfer functions providing flexible decision borders.

  4. Transfer functions Transfer function f(I(X)): vector activation I(X) and scalar output o(I). 1. Fan-in, scalar-product activation W·X, hyperplanes. 2. Distance functions as activations, for example Gaussian functions. 3. Mixed activation functions.
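
The formulas on this slide were shown as images in the original; the following is a minimal sketch (assumed notation and parameter names) of the three activation types combined with a scalar output function:

```python
# Sketch of the three activation types; names and the mixing rule are assumptions.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# 1. Fan-in: scalar-product activation I(X) = W.X + theta -> hyperplane border.
def fanin_node(X, W, theta):
    return sigmoid(X @ W + theta)

# 2. Distance activation D(X) = ||X - C|| with a Gaussian output -> spherical border.
def gaussian_node(X, C, b):
    D2 = np.sum((X - C) ** 2, axis=-1)
    return np.exp(-D2 / b ** 2)

# 3. Mixed activation: one possible combination of both kinds of activation.
def mixed_node(X, W, theta, C, b, alpha=0.5):
    I = X @ W + theta
    D2 = np.sum((X - C) ** 2, axis=-1)
    return sigmoid(alpha * I - (1.0 - alpha) * D2 / b ** 2)
```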

  5. Taxonomy - activation functions

  6. Taxonomy - output functions

  7. Taxonomy - transfer functions

  8. TF in Neural Networks Choices: • Homogeneous NN: select the best TF, try several types. Ex: RBF networks; SVM kernels (today 50=>80% change). • Heterogeneous NN: one network, several types of TF. Ex: Adaptive Subspace SOM (Kohonen 1995), linear subspaces. Projections on a space of basis functions. • Input enhancement: adding fi(X) to achieve separability. Ex: functional link networks (Pao 1989), tensor products of inputs; D-MLP model. Heterogeneous networks: 1. Start from a large network with different TFs, use regularization to prune. 2. Construct a network by adding nodes selected from a pool of candidates. 3. Use very flexible TFs, force them to specialize.
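
A minimal sketch of the input-enhancement idea in the functional-link spirit (the particular added features below, pairwise products and the squared norm, are assumptions for illustration):

```python
# Input enhancement: augment X with extra features f_i(X) so that a simple
# linear/sigmoidal node can separate classes that were not linearly separable.
import numpy as np
from itertools import combinations

def enhance_inputs(X):
    """Append tensor (pairwise) products of inputs and the squared norm ||X||^2."""
    cols = [X]
    cols += [(X[:, i] * X[:, j])[:, None] for i, j in combinations(range(X.shape[1]), 2)]
    cols += [np.sum(X ** 2, axis=1, keepdims=True)]
    return np.hstack(cols)

# With ||X||^2 as an explicit feature, the sphere problem from slide 2
# becomes linearly separable.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5, 3))
print(enhance_inputs(X).shape)   # (5, 7): 3 inputs + 3 products + ||X||^2
```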

  9. Most flexible TFs Conical functions: mixed activations. Lorentzian functions: mixed activations. Bicentral functions: separable functions.
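
The bicentral formula itself was an image in the original; a sketch of one commonly quoted separable form, built as a product of pairs of sigmoids per dimension (the parameterization below - centers t, widths b, slopes s - is an assumption):

```python
# Bicentral (separable) transfer function: product over dimensions of soft
# windows built from pairs of sigmoids. Parameter names are assumptions.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bicentral(X, t, b, s):
    left = sigmoid(s * (X - t + b))          # rising edge near t - b
    right = 1.0 - sigmoid(s * (X - t - b))   # falling edge near t + b
    return np.prod(left * right, axis=-1)    # separable: product over dimensions

# Each dimension contributes a soft window of half-width b_i around t_i,
# so a single node models an axis-aligned soft "box" with 3N parameters.
x = np.array([[0.0, 0.0], [2.0, 0.0]])
print(bicentral(x, t=np.zeros(2), b=np.ones(2), s=5.0 * np.ones(2)))
```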

  10. Bicentral + rotations 6N parameters, most general. Box in N-1 dimensions × a rotated window. Rotation matrix with band structure implements 2x2 rotations.
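
A sketch of one possible band-structured rotation (this particular parameterization - slopes on the diagonal, pairwise couplings on the first superdiagonal - is an assumption, not necessarily the exact matrix used on the slide):

```python
# "Rotation" matrix with band structure: each row mixes x_i with x_{i+1},
# i.e. a 2x2 rotation-like transform per pair of adjacent coordinates,
# adding only N-1 extra parameters per node. Parameterization is assumed.
import numpy as np

def banded_rotation(s, a):
    """s: N diagonal slopes, a: N-1 superdiagonal couplings -> N x N matrix."""
    R = np.diag(np.asarray(s, dtype=float))
    idx = np.arange(len(a))
    R[idx, idx + 1] = a
    return R

# The rotated bicentral node evaluates its windows on R @ x instead of x,
# so decision borders are no longer restricted to axis-aligned boxes.
R = banded_rotation(s=np.ones(4), a=0.3 * np.ones(3))
print(R)
```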

  11. Some properties of TFs For logistic functions: renormalization of a Gaussian gives a logistic function, with weights W_i = 4 D_i / b_i^2.
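
The full identity was shown as an image; a worked reconstruction, assuming the renormalization is taken over two Gaussians with centers t_i ± D_i and widths b_i, is:

```latex
% Reconstruction under the stated assumption (two Gaussians, centers t +/- D):
\[
  \frac{G(\mathbf{x};\,\mathbf{t}+\mathbf{D},\mathbf{b})}
       {G(\mathbf{x};\,\mathbf{t}+\mathbf{D},\mathbf{b})
        + G(\mathbf{x};\,\mathbf{t}-\mathbf{D},\mathbf{b})}
  \;=\;
  \frac{1}{1+\exp\!\big(-\sum_i \tfrac{4D_i}{b_i^2}\,(x_i - t_i)\big)}
  \;=\;
  \sigma\!\Big(\sum_i W_i\,(x_i - t_i)\Big),
  \qquad W_i = \frac{4D_i}{b_i^2},
\]
\[
  \text{where } G(\mathbf{x};\,\mathbf{c},\mathbf{b})
  = \exp\!\Big(-\sum_i (x_i - c_i)^2 / b_i^2\Big)
  \text{ and } \sigma \text{ is the logistic function.}
\]
```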

  12. Example of input transformation Minkowski's distance function; sigmoidal activation changed to a distance-based form; adding a single input renormalizing the vector.
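
The formulas were images in the original; below is a sketch of the renormalizing-input trick (the value of R and the test data are assumptions):

```python
# Append one extra component so every input vector has the same norm R; the
# scalar product then becomes a monotone function of Euclidean distance, so a
# sigmoidal node acquires localized, distance-based decision borders.
import numpy as np

def add_renormalizing_input(X, R):
    """Append x_{N+1} = sqrt(R^2 - ||x||^2); requires R >= max ||x||."""
    extra = np.sqrt(R ** 2 - np.sum(X ** 2, axis=1, keepdims=True))
    return np.hstack([X, extra])

X = np.array([[0.3, 0.4], [1.0, 0.0]])
Xr = add_renormalizing_input(X, R=2.0)
print(np.linalg.norm(Xr, axis=1))   # [2. 2.] -- every vector now has norm R

# With ||X'|| = R and ||W|| fixed, W.X' = (||W||^2 + R^2 - ||W - X'||^2) / 2,
# so sigmoid(W.X' + theta) depends on the input only through the distance ||W - X'||.
```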

  13. Conclusions Radial and sigmoidal functions are not the only choice. StatLog report: large differences between RBF and MLP results on many datasets. Better learning cannot repair the wrong bias of the model. Systematic investigation and a taxonomy of TFs is worthwhile. Networks should select/optimize their own functions. Open questions: What is the optimal balance between complex nodes and complex interactions (weights)? How to train heterogeneous networks? How to optimize nodes in constructive algorithms? Hierarchical, modular networks: nodes that are networks themselves.

  14. The End? Perhaps the beginning ...
