Manifold Estimation: Local (and Non-Local Parametrized) Tangent Learning Yavor Tchakalov Advanced Machine Learning Course by Dr. Jebara Fall 2006 Columbia University
Motivation • Importance of the amount of training data • Classical solutions • Regularizers • A priori knowledge embedding • Concept of tangent vectors • Compact representation of transformation invariance
Two classification approaches • Learning techniques: building a model • Adjust a number of parameters to compute the classification function • Memory-based techniques: • Training examples are stored in memory • A new pattern is compared against the stored prototypes • A label is produced
Naïve Approach • Combine a • training dataset representing the input space • simple distance measure: Euclidean distance (see the sketch below) • Result • prohibitively large prototype set • poor accuracy
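Concretely, the naïve memory-based classifier is nearest-prototype search under Euclidean distance. A minimal sketch in Python (the function name and array shapes are illustrative assumptions, not from the slides):

    import numpy as np

    def nearest_prototype_label(x, prototypes, labels):
        # Squared Euclidean distance from x to each stored prototype (one per row).
        d2 = np.sum((prototypes - x) ** 2, axis=1)
        # Return the label of the closest prototype.
        return labels[np.argmin(d2)]

Covering all transformed versions of every class with stored examples is exactly what makes the prototype set prohibitively large.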
Classical Solution • Feature extractor • Compute a representation that is minimally affected by certain transformations • Major bottleneck in classification • Invariant “true” distance measure • Deformable prototypes • Must know the allowed transformations • Deformation search is expensive / unreliable
Transformation Manifold • For instance: a simple 16x16 grayscale image is a point in 256-D space • Transformation Manifolds: • Dimensionality • Non-linearity (e.g. geometric transformations of a gray-level image) • Implications • Solution: approximate the manifold by a tangent hyperplane at the prototype • Tangent distance: truly invariant w.r.t. the transformations used to define the manifolds
Tangent hyperplane • The hyperplane is fully defined by the original prototype (α=0) and the first derivative of the transformation • Taylor’s expansion of the transformation around α=0
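Spelling this out in the standard tangent-distance notation (the symbol T_P is shorthand introduced here, not from the slides): write s(P, α) for the transformation of prototype P with parameter α, with s(P, 0) = P. The first-order Taylor expansion around α = 0 gives

    s(P, \alpha) \;\approx\; s(P, 0) + \alpha \, \frac{\partial s(P, \alpha)}{\partial \alpha} \Big|_{\alpha = 0} \;=\; P + \alpha \, T_P

so the tangent hyperplane at P is the set { P + α T_P : α ∈ R }, with one tangent vector per transformation parameter.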
Tangent Distance • Compute the minimum distance between the tangent hyperplanes that approximate the transformation manifolds (hence invariant to these transformations) • Three benefits: • Linear subspaces: simple analytical expressions can be computed and stored • The minimal distance is a simple least-squares problem • The distance is locally invariant, but not globally invariant
Implementation • Prototype approximation • Distance: a linear least-squares optimization problem (see the sketch below)
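A minimal sketch of the one-sided tangent distance, assuming flattened patterns and a matrix T whose columns are the precomputed tangent vectors at the prototype (the names are illustrative):

    import numpy as np

    def one_sided_tangent_distance(x, p, T):
        # Fit the transformation parameters a by linear least squares:
        # minimize || (x - p) - T a ||^2 over a.
        a, *_ = np.linalg.lstsq(T, x - p, rcond=None)
        # Distance from x to the tangent hyperplane { p + T a }.
        return np.linalg.norm((x - p) - T @ a)

The two-sided variant, which also lets the input pattern move along its own tangent hyperplane, stacks the two tangent matrices into one least-squares problem of the same form.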
Results • Implementation caveats • Pre-computation of tangent vectors • Smoothing (see the sketch below)
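One plausible reading of the smoothing caveat: tangent vectors for geometric transformations are derivatives of the image, and on discrete pixels those derivatives are only stable after Gaussian smoothing. A sketch for translation tangents (the sigma value and helper name are assumptions):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def translation_tangents(image, sigma=1.0):
        # Smooth first so that finite differences approximate true derivatives.
        smooth = gaussian_filter(image.astype(float), sigma)
        dy, dx = np.gradient(smooth)  # derivatives along rows / columns
        # Flattened tangent vectors for horizontal and vertical translation.
        return dx.ravel(), dy.ravel()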
Non-local Parameterized Tangent Learning • Problems with a large class of local manifold learning methods, i.e. those based on Nyström’s formula and on vector differences between neighbours • Instances of such problems • Non-local learning: minimize the relative projection error (see the sketch below) • Goal: transduction
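A sketch of the relative projection error criterion from Bengio and Monperrus's non-local tangent learning: a parameterized predictor (a neural network in their formulation) outputs a tangent basis at each point, and training minimizes how badly neighbour differences project onto that basis; the inner fit over combination weights is again linear least squares. Names and shapes below are assumptions:

    import numpy as np

    def relative_projection_error(tangent_basis, x, neighbors):
        # tangent_basis: (D, d) matrix whose columns span the predicted tangent plane at x.
        errs = []
        for xj in neighbors:
            delta = xj - x
            # Best reconstruction of the neighbour difference within the tangent plane.
            w, *_ = np.linalg.lstsq(tangent_basis, delta, rcond=None)
            errs.append(np.sum((tangent_basis @ w - delta) ** 2) / np.sum(delta ** 2))
        return float(np.mean(errs))

Because the predictor's parameters are shared across all points, the learned tangent planes extend to regions with few or no training examples, which is what makes transduction on unlabeled points possible.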