Learning with Structured Sparsity
Authors: Junzhou Huang, Tong Zhang, Dimitris Metaxas
Presented by Zhennan Yan
Introduction • Fixed set of p basis vectors x1, ..., xp, where xj ∈ R^n for each j; collect them as the matrix X = [x1, ..., xp]. • Given a random observation y ∈ R^n, which depends on an underlying coefficient vector β̄ ∈ R^p. • Assume the target coefficient β̄ is sparse. • Throughout the paper, assume X is fixed, and randomization is w.r.t. the noise in the observation y.
Introduction • Define the support of a vector β ∈ R^p as supp(β) = {j : βj ≠ 0} • So ||β||0 = |supp(β)| • A natural method for sparse learning is L0 regularization for desired sparsity s: β̂ = arg min Q̂(β) subject to ||β||0 ≤ s • Here, only consider the least squares loss Q̂(β) = ||Xβ − y||2^2
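The L0-constrained least squares problem above can be written down directly as an exhaustive search over supports. A minimal sketch (the function name and toy dimensions are mine, not from the paper) that makes the combinatorial cost concrete:

```python
import itertools

import numpy as np

def l0_least_squares(X, y, s):
    """Exhaustive solution of min ||X b - y||^2 s.t. ||b||_0 <= s.

    Only feasible for tiny p: it tries every support of size <= s."""
    n, p = X.shape
    best = (np.zeros(p), float(y @ y))            # (coefficients, loss)
    for k in range(1, s + 1):
        for support in itertools.combinations(range(p), k):
            cols = X[:, list(support)]
            # Least squares restricted to the chosen columns
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            loss = float(np.sum((cols @ coef - y) ** 2))
            if loss < best[1]:
                beta = np.zeros(p)
                beta[list(support)] = coef
                best = (beta, loss)
    return best
```

The number of candidate supports grows like C(p, s), which is why the next slide calls the problem NP-hard.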
Introduction • NP-hard! • Standard approaches: • Relaxation of L0 to L1 (Lasso) • Greedy algorithms (such as OMP) • In practical applications, one often knows a structure on β in addition to sparsity. • Group sparsity: variables in the same group tend to be zero or nonzero together • Tonal and transient structures: sparse decomposition of audio signals
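OMP is only named above; for reference, the generic textbook version of orthogonal matching pursuit (not the paper's own code) can be sketched as:

```python
import numpy as np

def omp(X, y, s):
    """Orthogonal Matching Pursuit: greedily pick the column most
    correlated with the residual, then refit by least squares."""
    n, p = X.shape
    support, residual = [], y.copy()
    for _ in range(s):
        correlations = np.abs(X.T @ residual)
        correlations[support] = -np.inf       # never re-pick a column
        support.append(int(np.argmax(correlations)))
        # Orthogonal projection step: refit on the whole support
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    beta = np.zeros(p)
    beta[support] = coef
    return beta
```

With orthonormal columns this recovers an s-sparse signal exactly; with correlated columns it can fail, which motivates the structured variant later in the deck.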
Structured Sparsity • Denote the index set of coefficients by I = {1, ..., p} • For any sparse subset F ⊆ I, assign a code length cl(F) such that Σ_F 2^(−cl(F)) ≤ 1 • The coding complexity of F is defined as: c(F) = |F| + cl(F)
Structured Sparsity • If a coefficient vector has a small coding complexity, it can be efficiently learned. • Why? • The number of bits to encode F is cl(F) • The number of bits to encode the nonzero coefficients in F is O(|F|), so the total description length is O(c(F))
General Coding Scheme • Block Coding: Consider a small number of base blocks B (each element of B is a subset of I); every subset F ⊆ I can be expressed as a union of blocks in B. • Define a code length cl0 on B, and set cl(F) = min { Σj cl0(bj) : F = b1 ∪ ... ∪ bk, bj ∈ B } • Where Σ_(b∈B) 2^(−cl0(b)) ≤ 1 • So cl(F) is small whenever F is the union of a few low-cost base blocks
General Coding Scheme • A structured greedy algorithm that takes advantage of block structure is efficient: • Instead of searching over all subsets of I up to a fixed coding complexity s (exponentially many), we greedily add blocks from B one at a time • B is supposed to contain only a manageable number of base blocks
General Coding Scheme • Standard Sparsity: B consists only of single-element sets, and each base block has coding length cl0 = log2 p. This uses k·log2 p bits to code each subset of cardinality k. • Group Sparsity: • Graph Sparsity:
General Coding Scheme • Standard Sparsity: • Group Sparsity: Consider m disjoint groups G1, ..., Gm partitioning I; let B1 contain the m groups, and B2 contain the p single-element blocks. Each element in B2 has cl0 of ∞, and each element in B1 has cl0 of log2 m: the scheme only looks for signals consisting of whole groups. • The resulting coding length is cl(F) = g·log2 m if F can be represented as a union of g disjoint groups. • Graph Sparsity:
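The gap between the two coding schemes is simple arithmetic. The sketch below plugs in hypothetical sizes (p, m, and the group size are illustrative choices of mine, not from the paper):

```python
import math

def standard_cl(k, p):
    """Standard sparsity: each of the k nonzeros costs log2(p) bits."""
    return k * math.log2(p)

def group_cl(g, m):
    """Group sparsity: a union of g whole groups costs log2(m) bits per
    group, independent of how many coefficients each group contains."""
    return g * math.log2(m)

# Hypothetical sizes: p = 512 coefficients in m = 64 groups of 8;
# the signal is a union of g = 2 groups, i.e. k = 16 nonzeros.
p, m, group_size, g = 512, 64, 8, 2
k = g * group_size
print(standard_cl(k, p))   # 16 * 9 = 144.0 bits
print(group_cl(g, m))      # 2 * 6 = 12.0 bits
```

For signals that really are unions of groups, the structured code is an order of magnitude cheaper, which is exactly what drives the improved sample-complexity bounds.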
General Coding Scheme • Standard Sparsity: • Group Sparsity: • Graph Sparsity: Generalization of group sparsity. Employs a directed graph structure G on I. Each element of I is a node of G, but G may contain additional nodes. • At each node v, we define a coding length clv(S) on subsets S of the neighborhood Nv of v, as well as clv(u) for any other single node u, such that Σ_(S⊆Nv) 2^(−clv(S)) + Σ_u 2^(−clv(u)) ≤ 1
General Coding Scheme • Example of Graph Sparsity: an image-processing problem • Each pixel has 4 adjacent pixels, so the number of subsets of its neighborhood is 2^4 = 16, with a coding length of log2 16 = 4 bits. Encode all other pixels using random jumping, with a coding length of about log2 p bits per jump. • If a connected region F is composed of g sub-regions, then the coding length is about g·log2 p + O(|F|) • While the standard sparse coding length is |F|·log2 p
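Plugging in the 48×48 image with k = 160 nonzero pixels in g = 4 connected regions (the sizes used in the deck's 2D experiment) shows the saving; the 4-bits-per-pixel constant below is the neighborhood cost from the example above:

```python
import math

def graph_cl(g, region_size, p, bits_per_pixel=4.0):
    """Connected-region coding: one random jump (~log2 p bits) per
    sub-region, then ~4 bits per pixel to walk its 4-neighborhood."""
    return g * math.log2(p) + region_size * bits_per_pixel

def standard_cl(k, p):
    """Unstructured coding: ~log2 p bits for every nonzero pixel."""
    return k * math.log2(p)

p = 48 * 48            # 2304 pixels
k, g = 160, 4
print(round(graph_cl(g, k, p)))     # ~685 bits
print(round(standard_cl(k, p)))     # ~1787 bits
```

The structured code pays log2 p only once per region instead of once per pixel, so the gap widens as regions grow.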
Algorithms for Structured Sparsity
Algorithms for Structured Sparsity • Extend the forward greedy algorithm by using the block structure, which is only used to limit the search space.
Algorithms for Structured Sparsity • Maximize the gain ratio: φ(b) = ( Q̂(β(k−1)) − Q̂(β(k)) ) / c(b) • Using least squares regression, the loss reduction from adding block b to the current support F is ||P(F∪b) y||2^2 − ||P(F) y||2^2 • Where P(F) = X_F (X_F^T X_F)^(−1) X_F^T is the projection matrix onto the subspace generated by the columns of X_F • Select the next block by b(k) = arg max over b ∈ B of φ(b)
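A minimal sketch of this structured greedy selection (my own simplification of StructOMP: it refits by least squares instead of maintaining projection matrices, and charges c = cl-increase + support-size increase; `blocks`, `block_costs`, and `budget` are hypothetical names):

```python
import numpy as np

def struct_omp(X, y, blocks, block_costs, budget):
    """Greedy block selection by gain ratio, under a complexity budget."""
    n, p = X.shape
    F, spent = set(), 0.0
    loss = float(y @ y)
    while True:
        best = None
        for block, cost in zip(blocks, block_costs):
            new_F = F | set(block)
            if new_F == F:
                continue                       # block adds nothing new
            extra = cost + (len(new_F) - len(F))
            if spent + extra > budget:
                continue                       # over the complexity budget
            cols = X[:, sorted(new_F)]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            new_loss = float(np.sum((cols @ coef - y) ** 2))
            gain = (loss - new_loss) / extra   # the gain ratio phi(b)
            if best is None or gain > best[0]:
                best = (gain, new_F, new_loss, extra)
        if best is None or best[0] <= 1e-12:   # no (numerically) useful block
            break
        _, F, loss, extra = best
        spent += extra
    beta = np.zeros(p)
    if F:
        idx = sorted(F)
        coef, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        beta[idx] = coef
    return beta, F
```

Dividing the loss reduction by the complexity charge is what makes cheap (well-structured) blocks preferable to equally predictive but expensive ones.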
Experiments-1D • 1D structured sparse signal with values ±1 • p = 512 • k = 32 • g = 2 • Zero-mean Gaussian noise is added to the measurements • n = 4k = 128 • Recovery results by Lasso, OMP, and StructOMP:
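The 1D setup can be reproduced approximately as follows; the run locations (50, 300) and the noise level 0.01 are assumptions, since the slide does not give them:

```python
import numpy as np

rng = np.random.default_rng(0)
p, k, g = 512, 32, 2      # dimensions from the slide
n = 4 * k                 # 128 Gaussian measurements

# k nonzeros with values +/-1, arranged as g contiguous runs.
beta = np.zeros(p)
run = k // g
for start in (50, 300):   # assumed, non-overlapping run positions
    beta[start:start + run] = rng.choice([-1.0, 1.0], size=run)

X = rng.standard_normal((n, p))
y = X @ beta + 0.01 * rng.standard_normal(n)   # assumed noise level
print(int(np.count_nonzero(beta)), y.shape)    # 32 (128,)
```

The two contiguous runs are exactly the kind of connected structure the graph-sparsity coding scheme rewards, which is why StructOMP recovers this signal from only 4k measurements.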
Experiments-2D • Generate a 2D structured-sparsity image by putting four letters at random locations • p = H×W = 48×48 • k = 160 • g = 4 • m = 4k = 640 (sample size) • For this strongly sparse signal, Lasso does better than OMP!
Experiments for sample size
Experiment on Tree-structured Sparsity • 2D wavelet coefficients • Weakly sparse signal
Experiments-Background Subtracted Images
Experiments for sample size