ERROR ENTROPY, CORRENTROPY AND M-ESTIMATION
Weifeng Liu, P. P. Pokharel, J. C. Principe
CNEL, University of Florida, weifeng@cnel.ufl.edu
Acknowledgment: This work was partially supported by NSF grants ECS-0300340 and ECS-0601271.
Outline • Maximization of correntropy criterion (MCC) • Minimization of error entropy (MEE) • Relation between MEE and MCC • Minimization of error entropy with fiducial points • Experiments
Supervised learning • Desired signal D • System output Y • Error signal E = D − Y
Supervised learning • The goal in supervised training is to bring the system output 'close' to the desired signal. • The concept of 'close' implicitly or explicitly employs a distance function or similarity measure. • Equivalently, the goal is to minimize the error in some sense, for instance the mean squared error, $\mathrm{MSE} = E[(D - Y)^2]$.
Maximization of Correntropy Criterion • Correntropy of the desired signal and the system output, V(D, Y), is estimated by $\hat V(D,Y) = \frac{1}{N}\sum_{i=1}^{N}\kappa_\sigma(d_i - y_i)$ • where $\kappa_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{x^2}{2\sigma^2}\right)$ is the Gaussian kernel with kernel size $\sigma$.
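As a concrete reference, here is a minimal Python sketch of this estimator; the function names gaussian_kernel and correntropy are illustrative, not from the slides.

```python
import numpy as np

def gaussian_kernel(x, sigma):
    """Gaussian kernel k_sigma(x) = exp(-x^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)."""
    return np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

def correntropy(d, y, sigma=1.0):
    """Sample estimate V(D, Y) = (1/N) sum_i k_sigma(d_i - y_i)."""
    e = np.asarray(d, dtype=float) - np.asarray(y, dtype=float)
    return np.mean(gaussian_kernel(e, sigma))
```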
Correntropy induced metric • Define $\mathrm{CIM}(D,Y) = \left(\kappa_\sigma(0) - \hat V(D,Y)\right)^{1/2}$ • CIM satisfies the following properties of a metric: • Non-negativity • Identity of indiscernibles • Symmetry • Triangle inequality
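A sketch of CIM built on the correntropy estimator above (the name cim is illustrative):

```python
def cim(d, y, sigma=1.0):
    """Correntropy induced metric CIM(D, Y) = sqrt(k_sigma(0) - V(D, Y))."""
    k0 = gaussian_kernel(0.0, sigma)
    return np.sqrt(k0 - correntropy(d, y, sigma))
```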
CIM contours • Contours of CIM(E, 0) in 2D sample space: • when the errors are close to the origin, CIM behaves like the L2 norm • at intermediate distances, like the L1 norm • when errors are far apart, it saturates for large-value elements (and becomes direction sensitive)
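A quick numerical illustration of these regimes, using the cim sketch above with sigma = 1:

```python
# Near the origin CIM grows roughly linearly with the error (L2-like):
print(cim([0.1], [0.0]), cim([0.2], [0.0]))    # ~0.045, ~0.089 (doubles)
# Far from the origin it saturates near sqrt(k(0)) ~ 0.632:
print(cim([10.0], [0.0]), cim([20.0], [0.0]))  # both ~0.632
```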
MCC is minimization of CIM • MCC: $\max \frac{1}{N}\sum_{i=1}^{N}\kappa_\sigma(e_i)$, which is equivalent to $\min \mathrm{CIM}(D,Y)$, since CIM is a monotonically decreasing function of $\hat V(D,Y)$.
MCC is M-estimation • MCC is equivalent to the M-estimation problem $\min \sum_{i=1}^{N}\rho(e_i)$ • where $\rho(e) = \frac{1}{\sqrt{2\pi}\,\sigma}\left(1 - \exp\!\left(-\frac{e^2}{2\sigma^2}\right)\right)$, a bounded, even loss that gives large errors (outliers) vanishing influence.
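As a concrete instance, here is a hedged sketch of MCC training for a linear model by stochastic gradient ascent on the correntropy of the error; the function name, learning rate, and epoch count are assumptions, not from the slides.

```python
def mcc_linear_fit(X, d, sigma=1.0, lr=0.1, epochs=100):
    """Fit weights w by stochastic gradient ascent on (1/N) sum_i k_sigma(e_i).

    Uses d/dw k_sigma(e) = k_sigma(e) * (e / sigma^2) * x, so samples with
    large errors receive exponentially small updates (outlier resistance)."""
    X = np.atleast_2d(X)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, d_i in zip(X, d):
            e = d_i - w @ x_i
            w += lr * gaussian_kernel(e, sigma) * (e / sigma**2) * x_i
    return w
```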
Minimization of Error Entropy • Renyi's quadratic error entropy $H_2(E) = -\log \int p_E^2(e)\,de$ is estimated by $\hat H_2(E) = -\log \hat V(E)$ • where $\hat V(E) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\kappa_{\sigma\sqrt{2}}(e_i - e_j)$ is the Information Potential (IP); minimizing the entropy is equivalent to maximizing the IP.
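A sketch of the IP estimator, reusing the Gaussian kernel defined above:

```python
def information_potential(e, sigma=1.0):
    """IP estimate V(E) = (1/N^2) sum_{i,j} k_{sigma*sqrt(2)}(e_i - e_j).

    Renyi's quadratic entropy estimate is H2(E) = -log V(E), so minimizing
    the entropy is the same as maximizing the IP."""
    e = np.asarray(e, dtype=float)
    diffs = e[:, None] - e[None, :]          # all pairwise differences
    return np.mean(gaussian_kernel(diffs, sigma * np.sqrt(2)))
```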
Relation between MEE and MCC • Define the $N^2$-dimensional pair vectors $E_a = (e_1, \dots, e_1, \dots, e_N, \dots, e_N)$ and $E_b = (e_1, \dots, e_N, \dots, e_1, \dots, e_N)$, which enumerate all index pairs $(i, j)$. • Construct the IP as a correntropy: $\hat V(E) = \frac{1}{N^2}\sum_{i,j}\kappa_{\sigma\sqrt2}(e_i - e_j)$ is exactly the correntropy estimate $\hat V(E_a, E_b)$, so MEE is MCC applied to all pairwise error differences.
IP induced metric • Define $\mathrm{IPM}(X,Y) = \left(\kappa_{\sigma\sqrt2}(0) - \hat V(X - Y)\right)^{1/2}$, where $\hat V$ is the IP of the difference vector. • IPM is a pseudo-metric: • NO identity of indiscernibles ($\mathrm{IPM}(X,Y) = 0$ does not imply $X = Y$).
IPM contours • Contours of IPM(E, 0) in 2D sample space: • a valley lies along e1 = e2, so IPM is not sensitive to the error mean • it saturates for points far from the valley
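A sketch of this pseudo-metric, under the assumption that IPM is built from the IP of the difference vector, analogously to CIM:

```python
def ipm(x, y, sigma=1.0):
    """Sketch: IPM(X, Y) = sqrt(k_{sigma*sqrt(2)}(0) - IP(X - Y))."""
    e = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    k0 = gaussian_kernel(0.0, sigma * np.sqrt(2))
    return np.sqrt(max(k0 - information_potential(e, sigma), 0.0))

# The valley along e1 = e2: any constant error vector has IPM 0 from the
# origin, which is exactly why the identity of indiscernibles fails.
print(ipm([3.0, 3.0], [0.0, 0.0]))  # 0.0
```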
MEE and its equivalences • MEE: $\min \hat H_2(E) \Leftrightarrow \max \hat V(E) \Leftrightarrow \min \mathrm{IPM}(E, 0)$
MEE is M-estimation • Assume the error PDF is estimated by Parzen windowing, $\hat p_E(e) = \frac{1}{N}\sum_{j=1}^{N}\kappa_{\sigma\sqrt2}(e - e_j)$, so that $\hat V(E) = \frac{1}{N}\sum_{i=1}^{N}\hat p_E(e_i)$; then MEE is the M-estimation problem $\min \sum_{i=1}^{N}\rho(e_i)$ with $\rho(e_i) = \kappa_{\sigma\sqrt2}(0) - \hat p_E(e_i)$.
Nuisance of conventional MEE • Since the error entropy is shift-invariant, the location of the error PDF must be determined separately. • Conventionally this is done by making the error mean equal to zero. • When the error PDF is non-symmetric or has heavy tails, the estimate of the error mean is not reliable. • Fixing the main peak of the error PDF at the origin is a better choice than shifting the error to have zero mean.
ERROR ENTROPY WITH FIDUCIAL POINTS • In supervised training we want most of the errors to equal zero, i.e., to minimize the error entropy with respect to 0. • Denote the augmented error vector $E_0 = [e_0, e_1, \dots, e_N]$ with $e_0 = 0$; E is the error vector and $e_0$ serves as a point of reference (a fiducial point at the origin).
ERROR ENTROPY WITH FIDUCIAL POINTS • In general, we have $J_\lambda(E) = \lambda\,\frac{1}{N}\sum_{i=1}^{N}\kappa_\sigma(e_i) + (1 - \lambda)\,\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\kappa_{\sigma\sqrt2}(e_i - e_j)$, to be maximized.
ERROR ENTROPY WITH FIDUCIAL POINTS • λ is a weighting constant between 0 and 1 • it controls how much weight the fiducial points at the origin receive • λ = 0: MEE • λ = 1: MCC • 0 < λ < 1: Minimization of Error Entropy with Fiducial points (MEEF)
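A hedged sketch of the combined cost, assuming the convex-combination form λ·(MCC term) + (1 − λ)·(IP term) with the kernel sizes used in the earlier definitions; the function name meef_cost is illustrative.

```python
def meef_cost(e, sigma=1.0, lam=0.5):
    """MEEF objective (to be maximized):
    J = lam * (1/N) sum_i k_sigma(e_i) + (1 - lam) * IP(E).
    lam = 1 recovers MCC, lam = 0 recovers MEE."""
    e = np.asarray(e, dtype=float)
    mcc_term = np.mean(gaussian_kernel(e, sigma))
    return lam * mcc_term + (1 - lam) * information_potential(e, sigma)
```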
ERROR ENTROPY WITH FIDUCIAL POINTS • The MCC term locates the main peak of the error PDF and fixes it at the origin, even in cases where the estimation of the error mean is not robust. • Unifying the two cost functions retains the merits of both: outlier resistance and resilience to the kernel size.
Metric induced by MEEF • A well-defined metric • direction sensitive • favors errors with the same sign • penalizes errors with different signs
Experiment 1: Robust regression • X: input variable • f: unknown function • N: noise • Y: observation, Y = f(X) + N • Noise PDF: non-Gaussian, containing outliers
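A hedged sketch of this setup; the exact f and noise PDF from the experiment are not in the slides, so a linear f and a two-component Gaussian mixture with 10% outliers are assumed here.

```python
rng = np.random.default_rng(0)
n_samples = 200
x = rng.uniform(-1.0, 1.0, n_samples)
f = lambda t: 2.0 * t                      # stand-in for the unknown f
is_outlier = rng.random(n_samples) < 0.1   # assumed 10% impulsive noise
noise = np.where(is_outlier,
                 rng.normal(4.0, 0.5, n_samples),   # outlier component
                 rng.normal(0.0, 0.1, n_samples))   # nominal component
y = f(x) + noise

# An MCC/MEEF fit largely ignores the outlier component, whereas an MSE
# fit would be pulled toward it.
w = mcc_linear_fit(x[:, None], y, sigma=1.0)
```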
Experiment 2: Chaotic signal prediction • Mackey-Glass chaotic time series with delay parameter τ = 30 • time delayed neural network (TDNN): • 7 inputs • 14 hidden PEs • tanh nonlinearity • 1 linear output
Conclusions • Established connections between MEE, distance functions, and M-estimation. • These connections theoretically explain the robustness of this family of cost functions. • Unified MEE and MCC in the framework of information theoretic models. • Proposed a new cost function, minimization of error entropy with fiducial points (MEEF), which solves the problem of MEE being shift-invariant in an elegant and robust way.