Delve into the convergence analysis, regularization, and learning models of Kernel LMS Algorithm in a Reproducing Kernel Hilbert Space. Understand the complexities, comparisons, and implications of this powerful computational approach.
LMS Algorithm in a Reproducing Kernel Hilbert Space
Weifeng Liu, P. P. Pokharel, J. C. Principe
Computational NeuroEngineering Laboratory, University of Florida
Acknowledgment: This work was partially supported by NSF grants ECS-0300340 and ECS-0601271.
Outline
• Introduction
• Least Mean Square algorithm (easy)
• Reproducing kernel Hilbert space (tricky)
• The convergence and regularization analysis (important)
• Learning from error models (interesting)
Introduction
• Puskal (2006): Kernel LMS
• Kivinen, Smola (2004): Online learning with kernels (more like leaky LMS)
• Moody, Platt (1990s): Resource-allocating networks (growing and pruning)
LMS (1960, Widrow and Hoff)
• Given a sequence of examples $\{(u(i), d(i))\}$ from $U \times \mathbb{R}$, where $U$ is a compact subset of $\mathbb{R}^L$.
• The model is assumed to be linear: $d(i) = w^{oT} u(i) + v(i)$, with $v(i)$ a noise term.
• The cost function: $J(w) = \sum_{i=1}^{N} |d(i) - w^T u(i)|^2$.
LMS
• The LMS algorithm:
$w(0) = 0$, $e(i) = d(i) - w(i-1)^T u(i)$, $w(i) = w(i-1) + \eta\, e(i)\, u(i)$.  (1)
• The weight after $n$ iterations:
$w(n) = \eta \sum_{i=1}^{n} e(i)\, u(i)$.  (2)
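A minimal NumPy sketch of the update (1) and the accumulated weight (2). The function name, array shapes, and the default step size are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def lms(U, d, eta=0.1):
    """Plain LMS sketch: U is an (N, L) array of inputs, d an (N,) array of targets."""
    N, L = U.shape
    w = np.zeros(L)                      # w(0) = 0
    errors = np.empty(N)
    for i in range(N):
        e = d[i] - w @ U[i]              # e(i) = d(i) - w(i-1)^T u(i)
        w = w + eta * e * U[i]           # w(i) = w(i-1) + eta e(i) u(i)
        errors[i] = e
    return w, errors
```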
Reproducing kernel Hilbert space
• A continuous, symmetric, positive-definite kernel $\kappa: U \times U \to \mathbb{R}$, a mapping $\Phi(u) = \kappa(u, \cdot)$, and an inner product $\langle \Phi(u), \Phi(u') \rangle_H = \kappa(u, u')$.
• $H$ is the closure of the span of all $\Phi(u)$.
• Reproducing property: $\langle f, \Phi(u) \rangle_H = f(u)$ for every $f \in H$.
• Kernel trick: $\langle \Phi(u), \Phi(u') \rangle_H = \kappa(u, u')$.
• The induced norm: $\|f\|_H = \sqrt{\langle f, f \rangle_H}$.
RKHS
• Kernel trick: an inner product in the feature space is exactly the similarity measure you need.
• Mercer's theorem: $\kappa(u, u') = \sum_{k} \mu_k\, \phi_k(u)\, \phi_k(u')$ with $\mu_k \ge 0$, so $\Phi(u) = [\sqrt{\mu_1}\,\phi_1(u), \sqrt{\mu_2}\,\phi_2(u), \dots]$ is an explicit (possibly infinite-dimensional) feature map.
Common kernels
• Gaussian kernel: $\kappa(u, u') = \exp(-a\|u - u'\|^2)$
• Polynomial kernel: $\kappa(u, u') = (u^T u' + 1)^p$
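Simple implementations of the two kernels; the parameter names a (Gaussian width) and p (polynomial degree) and their default values are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(u, v, a=1.0):
    # kappa(u, v) = exp(-a * ||u - v||^2)
    u, v = np.asarray(u, float), np.asarray(v, float)
    return np.exp(-a * np.sum((u - v) ** 2))

def polynomial_kernel(u, v, p=3):
    # kappa(u, v) = (u^T v + 1)^p
    return (np.dot(u, v) + 1.0) ** p
```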
Kernel LMS
• Transform the input $u_i$ to $\Phi(u_i)$; assume $\Phi(u_i) \in \mathbb{R}^M$ with $M$ very large.
• The model is assumed: $d(i) = \Omega^T \Phi(u_i) + v(i)$.
• The cost function: $J(\Omega) = \sum_{i=1}^{N} |d(i) - \Omega^T \Phi(u_i)|^2$.
Kernel LMS
• The KLMS algorithm:
$\Omega(0) = 0$, $e(i) = d(i) - \Omega(i-1)^T \Phi(u_i)$, $\Omega(i) = \Omega(i-1) + \eta\, e(i)\, \Phi(u_i)$.  (3)
• The weight after $n$ iterations:
$\Omega(n) = \eta \sum_{i=1}^{n} e(i)\, \Phi(u_i)$.  (4)
Kernel LMS
• By the kernel trick, the algorithm never needs $\Omega$ explicitly; the output for any input $u$ is
$\Omega(n)^T \Phi(u) = \eta \sum_{i=1}^{n} e(i)\, \kappa(u_i, u)$.  (5)
Kernel LMS
• After the learning, the input-output relation is
$f(u) = \eta \sum_{i=1}^{N} e(i)\, \kappa(u_i, u)$.  (6)
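A sketch of KLMS as described by (3)-(6): the algorithm stores the centers $u_i$ and the errors $e(i)$, and evaluates $f$ through the kernel rather than through $\Omega$. The function names and the step-size default are assumptions.

```python
def klms_train(U, d, kernel, eta=0.2):
    """Train KLMS: return the stored centers u(i) and errors e(i)."""
    centers, errs = [], []
    for u_i, d_i in zip(U, d):
        # prediction of the current network on the new input, as in (5)/(6)
        y = eta * sum(e * kernel(c, u_i) for c, e in zip(centers, errs))
        e = d_i - y                      # e(i) = d(i) - f_{i-1}(u(i))
        centers.append(u_i)
        errs.append(e)
    return centers, errs

def klms_predict(u, centers, errs, kernel, eta=0.2):
    # f(u) = eta * sum_i e(i) kappa(u_i, u), eq. (6)
    return eta * sum(e * kernel(c, u) for c, e in zip(centers, errs))
```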
KLMS vs. RBF
• KLMS: $f(u) = \eta \sum_{i=1}^{N} e(i)\, \kappa(u_i, u)$.  (7)
• RBF: $f(u) = \sum_{i=1}^{N} \alpha_i\, \kappa(u_i, u)$, where $\alpha$ satisfies $(G + \lambda I)\alpha = d$ and $G$ is the Gram matrix, $G(i,j) = \kappa(u_i, u_j)$.  (8)
• RBF needs regularization. Does KLMS need regularization?
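For comparison, a sketch of the regularized RBF solution in (8), which requires forming and inverting the N-by-N Gram matrix; function names and the default regularizer are assumed for illustration.

```python
import numpy as np

def rbf_fit(U, d, kernel, lam=1e-2):
    """Solve (G + lam*I) alpha = d, with G(i, j) = kappa(u_i, u_j)."""
    N = len(U)
    G = np.array([[kernel(U[i], U[j]) for j in range(N)] for i in range(N)])
    return np.linalg.solve(G + lam * np.eye(N), np.asarray(d, float))

def rbf_predict(u, U, alpha, kernel):
    # f(u) = sum_i alpha_i kappa(u_i, u), eq. (8)
    return sum(a * kernel(c, u) for a, c in zip(alpha, U))
```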
KLMS vs. LMS
• Kernel LMS is nothing but LMS in the feature space, a very high-dimensional reproducing kernel Hilbert space ($M > N$).
• The eigenvalue spread is awful; does it converge?
Example: Mackey-Glass (MG) signal prediction
• Time-embedding dimension: 10
• Learning rate: 0.2
• 500 training points
• 100 test points
• Additive Gaussian noise, variance 0.04
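A sketch of this experimental setup, reusing the KLMS and Gaussian-kernel sketches above. The file name for the Mackey-Glass series and the kernel width are hypothetical; only the embedding (10), step size (0.2), data split (500/100), and noise variance (0.04) come from the slide.

```python
import numpy as np

mg = np.load("mackey_glass.npy")                      # hypothetical path to an MG series
mg = mg + np.sqrt(0.04) * np.random.randn(len(mg))    # additive Gaussian noise, variance 0.04

L = 10                                                # time-embedding dimension
X = np.array([mg[i:i + L] for i in range(len(mg) - L)])
y = mg[L:]

X_train, y_train = X[:500], y[:500]                   # 500 training points
X_test, y_test = X[500:600], y[500:600]               # 100 test points

kernel = lambda x, z: gaussian_kernel(x, z, a=1.0)    # assumed kernel width
centers, errs = klms_train(X_train, y_train, kernel, eta=0.2)
y_hat = np.array([klms_predict(x, centers, errs, kernel, eta=0.2) for x in X_test])
print("test MSE:", np.mean((y_test - y_hat) ** 2))
```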
The asymptotic analysis of convergence: small-step-size theory
• Denote the correlation matrix of the transformed data $R_\Phi = \frac{1}{N}\sum_{i=1}^{N} \Phi(u_i)\,\Phi(u_i)^T$.
• The correlation matrix is singular, since the feature dimension $M$ exceeds the number of data points $N$.
• Assume the eigendecomposition $R_\Phi = P \Lambda P^T$ with eigenvalues $\varsigma_1, \dots, \varsigma_M$ (at most $N$ of them nonzero), and denote the optimal weight by $\Omega^o$.
The asymptotic analysis of convergence: small-step-size theory
• Denote the weight-error vector $\varepsilon(n) = \Omega(n) - \Omega^o$ and its projection onto the $k$-th eigenvector, $\varepsilon_n(k) = P_k^T\, \varepsilon(n)$.
• Under small-step-size theory we have $E[\varepsilon_n(k)] = (1 - \eta\,\varsigma_k)^n\, \varepsilon_0(k)$.
The weight stays at its initial value along the zero-eigenvalue directions
• If $\varsigma_k = 0$, we have $\varepsilon_n(k) = \varepsilon_0(k)$ for all $n$: the update never moves the weight along those directions.
The zero-eigenvalue directions do not affect the MSE
• Denote the excess mean-square error $J(n) = \sum_{k} \varsigma_k\, E[\varepsilon_n(k)^2]$; every term with $\varsigma_k = 0$ vanishes.
• It does not care about the null space! It only depends on the data subspace!
The minimum-norm initialization
• The initialization $\Omega(0) = 0$ gives the minimum-norm solution possible: the weight never moves along the null-space directions, so the final weight lies entirely in the span of the transformed data.
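A toy numerical check of the minimum-norm property: LMS-style updates started from zero never leave the span of the inputs, so in an underdetermined problem they approach the pseudoinverse (minimum-norm) solution. The problem sizes, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.standard_normal((5, 20))        # 5 samples in a 20-dimensional "feature space"
d = rng.standard_normal(5)

w = np.zeros(20)                        # zero initialization, as on the slide
for _ in range(2000):                   # repeated passes so the updates converge
    for u_i, d_i in zip(U, d):
        e = d_i - w @ u_i
        w = w + 0.04 * e * u_i

w_min = np.linalg.pinv(U) @ d           # minimum-norm least-squares solution
print(np.linalg.norm(w - w_min))        # should be close to zero
```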
Regularization technique
• Learning from finite data is ill-posed; a priori information, such as smoothness, is needed.
• The norm of the function, which indicates the 'slope' of the linear operator, is constrained.
• In statistical learning theory, the norm is associated with the confidence of uniform convergence bounds.
Regularized RBF
• The cost function: $J(f) = \sum_{i=1}^{N} |d(i) - f(u_i)|^2 + \lambda\, \|f\|_H^2$,
or equivalently, with $f(u) = \sum_i \alpha_i\, \kappa(u_i, u)$,
$J(\alpha) = \|d - G\alpha\|^2 + \lambda\, \alpha^T G \alpha$.
KLMS as a learning algorithm
• The model: $d(i) = f(u_i) + v(i)$ with $f \in H$.
• The following inequalities hold.
• The proof… (H∞ robustness + triangle inequality + matrix transformation + derivative + …)
The numerical analysis
• The solution of regularized RBF is $f(u) = \sum_{i=1}^{N} \alpha_i\, \kappa(u_i, u)$ with $\alpha = (G + \lambda I)^{-1} d$.
• The source of numerical ill-posedness is the inversion of the matrix $(G + \lambda I)$: small eigenvalues of $G$ amplify noise, and $\lambda$ controls that amplification.
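A small illustration of the conditioning issue behind this matrix inversion, reusing the Gaussian-kernel sketch above; the data, kernel width, and λ values are assumed. For a smooth kernel the eigenvalues of the Gram matrix decay quickly, so the condition number of G + λI drops sharply as λ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.standard_normal((200, 1))
G = np.array([[gaussian_kernel(x, z, a=0.5) for z in U] for x in U])

for lam in (0.0, 1e-3, 1e-1, 1.0):
    # condition number of the matrix that regularized RBF must invert
    print(lam, np.linalg.cond(G + lam * np.eye(len(U))))
```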
The numerical analysis
• The solution of KLMS is $f(u) = \eta \sum_{i=1}^{N} e(i)\, \kappa(u_i, u)$; no matrix inversion is required.
• By the inequalities above, the norm of the KLMS solution is bounded, so the algorithm is well-posed without an explicit regularization term.
Conclusion
• The LMS algorithm can be readily used in an RKHS to derive nonlinear algorithms.
• From the machine learning viewpoint, the LMS method is a simple tool for obtaining a regularized solution.
LMS learning model
• An event happens, and a decision is made.
• If the decision is correct, nothing happens.
• If an error is incurred, a correction is made to the original model.
• If we do things right, everything is fine and life goes on.
• If we do something wrong, lessons are drawn and our abilities are honed.
Would we over-learn?
• If we attempt to model the real world mathematically, what dimensionality is appropriate?
• Are we likely to over-learn?
• Are we using the LMS algorithm?
• Why is it good to remember the past?
• Why is it bad to be a perfectionist?
"If you shut your door to all errors, truth will be shut out."---Rabindranath Tagore