
ERROR ENTROPY, CORRENTROPY AND M-ESTIMATION


Presentation Transcript


  1. ERROR ENTROPY, CORRENTROPY AND M-ESTIMATION Weifeng Liu, P. P. Pokharel, J. C. Principe CNEL, University of Florida weifeng@cnel.ufl.edu Acknowledgment: This work was partially supported by NSF grant ECS-0300340 and ECS-0601271.

  2. Outline • Maximization of correntropy criterion (MCC) • Minimization of error entropy (MEE) • Relation between MEE and MCC • Minimization of error entropy with fiducial points • Experiments

  3. Supervised learning • Desired signal D • System output Y • Error signal E = D − Y

  4. Supervised learning • The goal in supervised training is to bring the system output ‘close’ to the desired signal. • The concept of ‘close’ implicitly or explicitly employs a distance function or similarity measure. • Equivalently, the goal is to minimize the error in some sense, for instance the mean squared error (MSE).

  5. Maximization of Correntropy Criterion • Correntropy between the desired signal and the system output, $V(D,Y)$, is estimated by • $\hat{V}(D,Y) = \frac{1}{N}\sum_{i=1}^{N}\kappa_\sigma(d_i - y_i)$ • where $\kappa_\sigma(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{x^2}{2\sigma^2}\right)$ is the Gaussian kernel with kernel size $\sigma$.
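A minimal numerical sketch of this estimator in Python (the Gaussian kernel form follows the slide; the variable names d, y, sigma are illustrative):

```python
import numpy as np

def gaussian_kernel(x, sigma):
    """Gaussian kernel k_sigma(x) = exp(-x^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def correntropy(d, y, sigma=1.0):
    """Sample estimate of correntropy V(D, Y): the mean kernel value of the errors."""
    e = np.asarray(d) - np.asarray(y)
    return np.mean(gaussian_kernel(e, sigma))

# Example: correntropy is large when the output tracks the desired signal closely.
d = np.sin(np.linspace(0, 2 * np.pi, 100))
y_good = d + 0.05 * np.random.randn(100)
y_bad = d + 1.0 * np.random.randn(100)
print(correntropy(d, y_good), correntropy(d, y_bad))  # the first value is larger
```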

  6. Correntropy induced metric • Define $\mathrm{CIM}(D,Y) = \left(\kappa_\sigma(0) - \hat{V}(D,Y)\right)^{1/2}$. • CIM satisfies the following properties of a metric: • Non-negativity • Identity of indiscernibles • Symmetry • Triangle inequality
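A sketch of the induced metric under the same assumptions (CIM computed as the square root of the kernel value at zero minus the correntropy estimate); the loop at the end anticipates the contour behavior discussed on the next slide:

```python
import numpy as np

def gaussian_kernel(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def cim(d, y, sigma=1.0):
    """Correntropy induced metric: sqrt(k_sigma(0) - V_hat(D, Y))."""
    e = np.asarray(d) - np.asarray(y)
    k0 = 1.0 / (sigma * np.sqrt(2 * np.pi))   # kernel value at zero
    v = np.mean(gaussian_kernel(e, sigma))
    return np.sqrt(k0 - v)

# Small errors: CIM grows roughly like the L2 norm; large errors: it saturates.
for scale in (0.1, 1.0, 10.0, 100.0):
    e = scale * np.ones(2)                    # a point in the 2D error space
    print(scale, cim(e, np.zeros(2)))
```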

  7. CIM contours • Contours of CIM(E,0) in the 2D sample space • Close to the origin: behaves like the L2 norm • Intermediate distances: behaves like the L1 norm • Far apart: saturates for large-valued error components (direction sensitive)

  8. MCC is minimization of CIM • Since $\mathrm{CIM}(D,Y)^2 = \kappa_\sigma(0) - \hat{V}(D,Y)$, maximizing the correntropy criterion is equivalent to minimizing the correntropy induced metric: $\max \hat{V}(D,Y) \iff \min \mathrm{CIM}(D,Y)$.

  9. MCC is M-estimation • Maximizing $\hat{V}(D,Y)$ is equivalent to the M-estimation problem $\min \sum_{i=1}^{N}\rho(e_i)$, where $\rho(e) = \kappa_\sigma(0) - \kappa_\sigma(e)$ is a bounded loss that limits the influence of outliers.
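A short illustration of this equivalence, assuming the Gaussian kernel above; it compares the bounded correntropy loss rho(e) with the unbounded squared-error loss:

```python
import numpy as np

def gaussian_kernel(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def correntropy_loss(e, sigma=1.0):
    """rho(e) = k_sigma(0) - k_sigma(e): bounded, so one outlier has limited influence."""
    k0 = 1.0 / (sigma * np.sqrt(2 * np.pi))
    return k0 - gaussian_kernel(e, sigma)

errors = np.array([0.1, 0.5, 1.0, 10.0, 100.0])
print(correntropy_loss(errors))   # saturates for large errors
print(errors**2)                  # the squared-error loss grows without bound
```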

  10. Minimization of Error Entropy • Renyi’s quadratic error entropy $H_2(E) = -\log \int p^2(e)\,de$ is estimated by $\hat{H}_2(E) = -\log \hat{V}(E)$ • Information Potential (IP): $\hat{V}(E) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\kappa_\sigma(e_i - e_j)$ • Minimizing the error entropy is equivalent to maximizing the IP.
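A sketch of the information potential and the entropy estimate built from it (Gaussian kernel assumed, kernel-size details simplified); the last two lines also show the shift-invariance that the following slides rely on:

```python
import numpy as np

def gaussian_kernel(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def information_potential(e, sigma=1.0):
    """IP(E) = (1/N^2) sum_i sum_j k_sigma(e_i - e_j) over all pairs of errors."""
    e = np.asarray(e)
    diffs = e[:, None] - e[None, :]
    return np.mean(gaussian_kernel(diffs, sigma))

def quadratic_renyi_entropy(e, sigma=1.0):
    """H2(E) estimate: minimizing it is the same as maximizing the information potential."""
    return -np.log(information_potential(e, sigma))

e = np.random.randn(200)
print(quadratic_renyi_entropy(e))
print(quadratic_renyi_entropy(e + 5.0))  # identical: the estimate is shift-invariant
```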

  11. Relation between MEE and MCC • Define • Construct

  12. Relation between MEE and MCC

  13. IP induced metric • Define IPM analogously to CIM, with the information potential in place of correntropy. • IPM is only a pseudo-metric: • NO identity of indiscernibles, because the IP is invariant to a constant shift of the errors.

  14. IPM contours • Contours of IPM(E,0) in the 2D sample space • A valley along e1 = e2: the metric is not sensitive to the error mean • Saturates for points far from the valley

  15. MEE and its equivalences • MEE: minimizing $\hat{H}_2(E)$ $\iff$ maximizing the information potential $\hat{V}(E)$ $\iff$ minimizing the IP induced metric IPM(E, 0).

  16. MEE is M-estimation • Assume the errors are i.i.d. with PDF $p(e)$; then $\hat{V}(E) \approx \frac{1}{N}\sum_{i=1}^{N}(\kappa_\sigma * p)(e_i)$, so MEE is M-estimation with loss $\rho(e) \propto -(\kappa_\sigma * p)(e)$ (up to an additive constant), a loss shaped by the error distribution itself.

  17. Nuisance of conventional MEE • The error entropy is shift-invariant, so the location of the error PDF has to be fixed separately. • Conventionally this is done by forcing the error mean to zero. • When the error PDF is non-symmetric or has heavy tails, the estimate of the error mean is unreliable. • Fixing the peak of the error PDF at the origin is clearly better than shifting the errors to zero mean.

  18. ERROR ENTROPY WITH FIDUCIAL POINTS • In supervised training we want most of the errors to equal zero, • i.e., we minimize the error entropy with respect to 0. • Denote by E the error vector and let e0 = 0 serve as a point of reference (a fiducial point) appended to the error samples.

  19. ERROR ENTROPY WITH FIDUCIAL POINTS • In general, we maximize the weighted combination $J_\lambda(E) = \lambda\,\frac{1}{N}\sum_{i=1}^{N}\kappa_\sigma(e_i) + (1-\lambda)\,\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\kappa_\sigma(e_i - e_j)$, i.e., a correntropy (MCC) term anchored at the origin plus an information potential (MEE) term.

  20. ERROR ENTROPY WITH FIDUCIAL POINTS • λ is a weighting constant between 0 and 1 • It controls how many fiducial points are placed at the origin • λ = 0 → MEE • λ = 1 → MCC • 0 < λ < 1 → Minimization of Error Entropy with Fiducial points (MEEF)
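A sketch of the combined cost, assuming the weighted form implied by the two limits above (a λ-weighted sum of the MCC term and the MEE/IP term); the exact normalization in the paper may differ:

```python
import numpy as np

def gaussian_kernel(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def meef_cost(e, lam=0.5, sigma=1.0):
    """Weighted MEEF objective to be maximized.

    lam = 1 keeps only the correntropy (MCC) term, which anchors the errors at zero;
    lam = 0 keeps only the information potential (MEE) term, which ignores the error mean.
    """
    e = np.asarray(e)
    mcc_term = np.mean(gaussian_kernel(e, sigma))      # errors vs. the fiducial point 0
    diffs = e[:, None] - e[None, :]
    mee_term = np.mean(gaussian_kernel(diffs, sigma))  # pairwise error interactions
    return lam * mcc_term + (1.0 - lam) * mee_term

e = np.random.randn(100) + 2.0  # concentrated errors, but not centered at zero
print(meef_cost(e, lam=0.0), meef_cost(e, lam=1.0))  # only the MCC term penalizes the offset
```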

  21. ERROR ENTROPY WITH FIDUCIAL POINTS • The MCC term locates the main peak of the error PDF and fixes it at the origin, even when the estimate of the error mean is not robust. • Unifying the two cost functions retains the merits of both: resistance to outliers and resilience to the choice of kernel size.

  22. Metric induced by MEEF • A well-defined metric (unlike IPM, it satisfies the identity of indiscernibles) • Direction sensitive: it favors errors with the same sign and penalizes errors with different signs

  23. Experiment 1: Robust regression • X: input variable • f: unknown function • N: noise • Y: observation, Y = f(X) + N • Noise PDF
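A hedged sketch of this kind of experiment (the linear f, the two-component impulsive noise, and the simple gradient fitting loop are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Y = f(X) + N: most samples get small Gaussian noise, a few get large outliers.
x = rng.uniform(-1, 1, 200)
outlier = rng.random(200) < 0.1
noise = np.where(outlier, rng.normal(8.0, 1.0, 200), rng.normal(0.0, 0.1, 200))
y = 2.0 * x + noise                      # assumed "unknown" function f(x) = 2x

def fit(weight_fn, steps=3000, lr=0.05):
    """Fit y ~ w*x + b by re-weighted gradient steps on the chosen criterion."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        e = y - (w * x + b)
        g = weight_fn(e) * e             # per-sample influence of the loss
        w += lr * np.mean(g * x)
        b += lr * np.mean(g)
    return w, b

sigma = 0.5
mse_w = lambda e: np.ones_like(e)                  # quadratic loss: every error counts fully
mcc_w = lambda e: np.exp(-e**2 / (2 * sigma**2))   # correntropy: outliers are down-weighted
print(fit(mse_w), fit(mcc_w))   # the MCC fit stays close to (2, 0); MSE is pulled by the outliers
```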

  24. Regression results

  25. Experiment 2: Chaotic signal prediction • Mackey-Glass chaotic time series with delay parameter τ = 30 • Time-delay neural network (TDNN): • 7 inputs • 14 hidden PEs with tanh nonlinearity • 1 linear output
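A sketch of this architecture in PyTorch; the layer sizes and the 7-tap delay embedding follow the slide, while the placeholder signal and everything else are assumptions (a real run would use the Mackey-Glass series with τ = 30 and train with the MEEF/MCC criterion rather than MSE):

```python
import torch
import torch.nn as nn

class TDNN(nn.Module):
    """Time-delay NN: 7 delayed inputs -> 14 tanh hidden PEs -> 1 linear output."""
    def __init__(self, n_taps=7, n_hidden=14):
        super().__init__()
        self.hidden = nn.Linear(n_taps, n_hidden)
        self.out = nn.Linear(n_hidden, 1)

    def forward(self, x):
        return self.out(torch.tanh(self.hidden(x)))

def delay_embed(series, n_taps=7):
    """Stack n_taps consecutive samples as one input vector; the next sample is the target."""
    x = torch.stack([series[i:i + n_taps] for i in range(len(series) - n_taps)])
    d = series[n_taps:].unsqueeze(1)
    return x, d

# Toy usage with a placeholder signal instead of the Mackey-Glass series.
series = torch.sin(torch.linspace(0, 20, 500))
x, d = delay_embed(series)
model = TDNN()
y = model(x)
# Training would maximize the correntropy / MEEF of the error e = d - y.
```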

  26. Training error PDF

  27. Conclusions • Establishes connections between MEE, distance functions and M-estimation • Theoretically explains the robustness of this family of cost functions • Unifies MEE and MCC in the framework of information theoretic models • Proposes a new cost function, minimization of error entropy with fiducial points (MEEF), which solves the problem of MEE being shift-invariant in an elegant and robust way.
