Sparse, Flexible and Efficient Modeling using L1-Regularization Saharon Rosset and Ji Zhu
Contents • Idea • Algorithm • Results
Introduction Setting: • Implicit dependency on training data • Linear model (→ use φ-functions) • Model:
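The model formula itself did not survive extraction. From the slide's own terms (a linear model over basis functions φ), it was presumably the standard basis expansion

$$f(x) = \sum_j w_j\,\phi_j(x)$$

with weights w learned from the training data.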
Introduction Problem: How to choose the regularization weight λ? Answer: Find ŵ(λ) for all λ ∈ [0, ∞) • Can this be done efficiently (time, memory)? • Yes, if we impose restrictions on ŵ(λ)
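Spelled out, the problem behind the whole talk (L is the empirical loss on the training set, J the penalty, matching the setup of the accompanying paper):

$$\hat{w}(\lambda) = \arg\min_{w}\; L(w) + \lambda\, J(w), \qquad \lambda \in [0, \infty)$$

so "finding ŵ(λ) for all λ" means tracing the entire solution path rather than solving one problem per candidate λ.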
Restrictions ŵ(λ) shall be piecewise linear • What impact on L(w) and J(w)? • Can we still solve real-world problems?
Restrictions ∂ŵ(λ)/∂λ must be piecewise constant • L(w) quadratic in w • J(w) linear in w
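One reasoning step the slide compresses (the symbols H, b, s are mine: Hessian and gradient offset of a quadratic loss, and the fixed gradient of a linear penalty on a segment with constant sign pattern): the stationarity condition

$$\nabla L(\hat w) + \lambda\, \nabla J(\hat w) = 0, \qquad \nabla L(w) = Hw + b, \quad \nabla J(w) = s$$

gives ŵ(λ) = −H⁻¹(b + λs) between events, i.e. a path that is linear in λ on every segment and therefore piecewise linear overall.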
Quadratic Loss Functions • square loss in regression • hinge loss for classification (→ SVM)
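In the usual notation (note the hinge loss is piecewise linear, and hence a degenerate piecewise quadratic):

$$L_{\mathrm{sq}}(y, f) = (y - f)^2, \qquad L_{\mathrm{hinge}}(y, f) = (1 - y\,f)_+$$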
Linear Penalty Functions • Sparseness property
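The penalty in question is the L1 norm, whose kink at zero is what forces exact zeros in ŵ(λ):

$$J(w) = \|w\|_1 = \sum_j |w_j|$$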
Bet on Sparseness • 50 samples with 300 independent Gaussian variables • 1st row: 3 non-zero variables • 2nd row: 30 non-zero variables • 3rd row: 300 non-zero variables (rows refer to the results figure, not reproduced here)
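A minimal Python sketch of this experiment, not the original code; the regularization strengths are arbitrary placeholders (in the figure they would be tuned), but it reproduces the qualitative "bet on sparseness": L1 wins when few coefficients matter, L2 when all of them do.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 50, 300                          # 50 samples, 300 Gaussian variables
X = rng.standard_normal((n, p))
X_test = rng.standard_normal((1000, p))

for k in (3, 30, 300):                  # non-zero coefficients per row of the figure
    w_true = np.zeros(p)
    w_true[:k] = rng.standard_normal(k)
    y = X @ w_true + rng.standard_normal(n)
    for model in (Lasso(alpha=0.1), Ridge(alpha=1.0)):   # alphas are placeholders
        mse = np.mean((model.fit(X, y).predict(X_test) - X_test @ w_true) ** 2)
        print(f"k={k:3d}  {type(model).__name__:5s}  test MSE = {mse:.2f}")
```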
"Linear Toolbox" a(r), b(r) and c(r): piecewise constant coefficients • Regression • Classification
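The formulas under the two headings were lost in extraction. A plausible reconstruction, consistent with "piecewise constant coefficients": write the loss as a piecewise quadratic in a residual r, where r differs between the two tasks,

$$L(r) = a(r)\, r^2 + b(r)\, r + c(r), \qquad r = y - f(x) \ \text{(regression)}, \qquad r = y\, f(x) \ \text{(classification)}$$

With a, b, c piecewise constant this template covers both the square loss (a ≡ 1) and hinge-type losses (a ≡ 0 on the margin-violating piece).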
Algorithm Initialization • start at t=0 → w=0 • determine set of non-zero components • starting direction
Algorithm Loop follow the direction until one of the following happens: • addition of a new component • vanishing of a non-zero component • hit of a “knot” (discontinuity of a(r), b(r), c(r))
Algorithm Loop • direction update • stopping criterion (a sketch of the complete loop follows below)
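A minimal, self-contained sketch of this loop for the concrete LASSO instance (square loss, L1 penalty), not the authors' code: with a purely quadratic loss, a(r), b(r), c(r) have no knots, so only the first two events occur. The function name and numerical tolerances are my own.

```python
import numpy as np

def lasso_path_homotopy(X, y, lam_min=1e-6, max_steps=500):
    """Trace the piecewise-linear solution path of
    min_w 0.5 * ||y - X w||^2 + lam * ||w||_1
    from lam_max (all-zero solution) down to lam_min."""
    n, p = X.shape
    w = np.zeros(p)
    c = X.T @ y                       # correlations with the current residual
    lam = float(np.max(np.abs(c)))    # smallest lam whose solution is w = 0
    active = [int(np.argmax(np.abs(c)))]
    path = [(lam, w.copy())]

    for _ in range(max_steps):
        A = np.array(active)
        s = np.sign(c[A])             # signs of the active correlations
        # Direction: change of active coefficients per unit decrease of lam.
        d = np.linalg.solve(X[:, A].T @ X[:, A], s)
        a = X.T @ (X[:, A] @ d)       # induced change rate of every correlation

        deltas, events = [lam - lam_min], [("end", -1)]
        for j in range(p):            # event 1: inactive |c_j| catches up with lam
            if j in active:
                continue
            for delta in ((lam - c[j]) / (1 - a[j]),
                          (lam + c[j]) / (1 + a[j])):
                if delta > 1e-12:
                    deltas.append(delta)
                    events.append(("add", j))
        for idx, j in enumerate(active):  # event 2: active coefficient hits zero
            if abs(d[idx]) > 1e-12 and -w[j] / d[idx] > 1e-12:
                deltas.append(-w[j] / d[idx])
                events.append(("drop", j))

        k = int(np.argmin(deltas))        # nearest event wins
        delta, (kind, j) = deltas[k], events[k]
        w[A] += delta * d                 # move along the linear segment
        c -= delta * a
        lam -= delta
        path.append((lam, w.copy()))
        if kind == "add":
            active.append(j)
        elif kind == "drop":
            active.remove(j)
            w[j] = 0.0
        else:
            break
    return path

# Example: a few breakpoints of the path on toy data.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 20))
y = X[:, 0] - 2 * X[:, 3] + 0.1 * rng.standard_normal(50)
for lam, w in lasso_path_homotopy(X, y)[:5]:
    print(f"lam = {lam:8.3f}   non-zeros = {np.count_nonzero(w)}")
```

Between events the coefficients and correlations both move linearly in λ, so each step only needs to find the nearest breakpoint in closed form; this is what makes computing the whole path as cheap as a handful of single-λ fits.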
NIPS Results General procedure • pre-selection (univariate t-statistic) • Algorithm loss function: Huberized hinge loss • Find best λ* based on validation dataset
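For completeness, the Huberized hinge loss as a function of the margin r = y·f(x), in the parameterization with a knot t < 1 (quadratic between the knot and the margin, linear below the knot, so it stays piecewise quadratic and differentiable):

$$L(r) = \begin{cases} 0, & r > 1 \\ (1 - r)^2, & t < r \le 1 \\ (1 - t)^2 + 2(1 - t)(t - r), & r \le t \end{cases}$$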
NIPS Results Dexter Dataset • m=300, n=20,000, pre-selection: n=1152 • linear pieces of ŵ(λ): 452 • Optimum at λ* (→ 120 non-zero components)
NIPS Results Not very happy with the results → work with the original variables → simple linear model → L1 regularization for feature selection
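A sketch of that recipe with today's tooling, not the authors' code; the data here is synthetic and the regularization strength C is a placeholder to be chosen on validation data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2000))           # stand-in for the original variables
y = np.where(X[:, 0] - X[:, 1] > 0, 1, -1)     # synthetic binary labels

# L1-penalized linear model: non-zero coefficients double as selected features.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(clf.coef_[0])
print(f"{selected.size} of {X.shape[1]} features kept")
```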
Conclusion • theory ↔ practice • limited to linear classifiers • other extensions: Regularization Path for the SVM (L2)