Regression Shrinkage for Sparse Projection Learning ------ Graduate Celebration Report Reporter: Zhihui Lai Supervised by Prof. Zhong Jin June 2011
Outline • A review • Recommendations • Regressions • Basic sparse learning methods • My work • Conclusions • Future work • Possible hot topics in the future • Some suggestions for younger researchers
A review: sparse subspace learning ------ reported in June 2009
Topics: • Fast algorithms • Sparse visual attention systems • Sparseness for one-class problems • Sparse representation and interpretation of gene data • Super-resolution images and dictionary learning • Feature extraction and classification
Contributors (as listed on the slide): Jieping Ye (2010); Cairong Zhao and I; Chunhou Zheng, Lei Zhang; Lei Zhang, Lili Wang, and Guangwei Gao; Jian Yang, Zhenghong Gu, and I
10 Recommended References (1) • P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman, Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intelligence 19 (7) (1997) 711–720. • X.F. He, S. Yan, Y. Hu, P. Niyogi, H.J. Zhang, Face recognition using Laplacianfaces, IEEE Trans. Pattern Anal. Mach. Intelligence 27 (3) (2005) 328–340, and their related papers. • 2DPCA, UDP (T-PAMI) • ULDA, OLDA (PR), NLDA • Graph embedding (T-PAMI)
10 Recommended References (2) • J. Wright, A.Y. Yang, ..., Yi Ma, "Robust face recognition via sparse representation," T-PAMI 2009, and its 20 related references! • B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression," Annals of Statistics, vol. 32, 2004, pp. 407-499. • R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society: Series B (Methodological), vol. 58, 1996, pp. 267-288. • H. Zou (Stanford), T. Hastie, and R. Tibshirani, "Sparse principal component analysis" (Technical Report), Statistics Department, Stanford University, 2004. • D. Cai, X. He, J. Han, "Spectral Regression: A Unified Approach for Sparse Subspace Learning," Proc. 2007 Int. Conf. on Data Mining (ICDM 07), Omaha, NE, Oct. 2007.
Background ------ sparseness is needed • One key drawback of PCA is its lack of sparseness. • Sparse representations are generally desirable. • They reduce computational cost and promote better generalization in learning algorithms. • In many applications, the coordinate axes involved in the factors have a direct physical interpretation. • In financial or biological applications, each axis might correspond to a specific asset or gene.
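A minimal numeric illustration of the point above, using scikit-learn's PCA and SparsePCA on synthetic data (the data and penalty weight are illustrative, not from the talk): ordinary PCA loadings are dense, while sparse loadings contain exact zeros, so each component involves only a few "assets" or "genes".

```python
import numpy as np
from sklearn.decomposition import PCA, SparsePCA

rng = np.random.RandomState(0)
X = rng.randn(100, 10)          # 100 samples, 10 variables (e.g. assets or genes)

pca = PCA(n_components=2).fit(X)
spca = SparsePCA(n_components=2, alpha=2.0, random_state=0).fit(X)

dense_zeros = int(np.sum(pca.components_ == 0))    # dense loadings: no exact zeros
sparse_zeros = int(np.sum(spca.components_ == 0))  # sparse loadings: many exact zeros
print(dense_zeros, sparse_zeros)
```

With a larger `alpha`, even more loadings are driven to zero, at the cost of explained variance.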
The methods for sparse solutions: CVX, L1-magic, L1_eq; SDP, QCQP; GPSR, SLEP; Lasso, GLasso, Elastic Net
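What all the solvers listed above compute, at different scales, is an L1-penalized fit. A hedged sketch with scikit-learn's `Lasso` on synthetic data (dimensions and penalty are illustrative) shows the defining effect: the L1 penalty drives irrelevant coefficients exactly to zero, while ordinary least squares keeps all of them nonzero.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.RandomState(0)
X = rng.randn(200, 20)
beta_true = np.zeros(20)
beta_true[:3] = [2.0, -1.5, 1.0]          # only 3 of 20 variables matter
y = X @ beta_true + 0.1 * rng.randn(200)  # sparse signal plus small noise

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

ols_nonzeros = int(np.sum(ols.coef_ != 0))      # all 20 coefficients survive
lasso_nonzeros = int(np.sum(lasso.coef_ != 0))  # only a few survive
print(ols_nonzeros, lasso_nonzeros)
```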
Regressions • Gaussian Process Regression • Support Vector Regression • Regression Trees • Nearest Neighbor Regression • OMP ------ Orthogonal Matching Pursuit UNSOLVED!!
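The OMP entry above refers to Orthogonal Matching Pursuit: greedily pick the dictionary atom most correlated with the current residual, then re-fit the selected atoms by least squares. A minimal sketch with scikit-learn (the dictionary and sparse signal are synthetic):

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.RandomState(1)
D = rng.randn(100, 50)                       # dictionary of 50 atoms
x = np.zeros(50)
x[[5, 17, 33]] = [1.5, -2.0, 0.8]            # 3-sparse coefficient vector
y = D @ x                                    # noiseless sparse combination

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3).fit(D, y)
support = np.flatnonzero(omp.coef_)          # indices of selected atoms
print(support)                               # recovers the true support
```

In the noiseless, well-conditioned case shown here, OMP recovers the true support exactly.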
Some useful journals • Comm. Pure and Applied Math. • SIAM Rev. • J. Am. Statistical Assoc. • IEEE Trans. Information Theory • Theoretical Computer Science • Foundations of Computational Math.
Basic projection theory and algorithms ------ PCA • Idea: minimize the reconstruction error while retaining maximal variance. • Geometric meaning: maximize the total scatter of the projected features.
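A minimal numeric check of the two equivalent views of PCA stated above: the leading eigenvector of the covariance matrix both maximizes projected variance and minimizes reconstruction error (synthetic anisotropic data).

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(500, 5) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])  # anisotropic data
Xc = X - X.mean(axis=0)                                      # center the data

C = Xc.T @ Xc / len(Xc)              # sample covariance matrix
vals, vecs = np.linalg.eigh(C)       # eigenvalues in ascending order
w = vecs[:, -1]                      # leading eigenvector = first PC

proj_var = np.var(Xc @ w)            # variance of the projected features
recon = np.outer(Xc @ w, w)          # rank-1 reconstruction from the projection
recon_err = np.mean(np.sum((Xc - recon) ** 2, axis=1))
print(proj_var, recon_err)           # max variance <-> min reconstruction error
```

The projected variance equals the top eigenvalue, and projecting on any other eigenvector gives both less variance and more reconstruction error.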
Basic projection theory and algorithms ------ SPCA (1) SVD decomposition • Idea: minimize the projection error between subspaces under the principle of rotational invariance. Geometric meaning: minimize, between the subspaces, the difference between the image of a pattern point and its pre-image.
Basic projection theory and algorithms ------ SPCA (2) Idea: minimize the projection error between sparse subspaces under the principle of rotational invariance. Geometric meaning: find a sparse linear transform that minimizes the difference between a pattern point's image in the sparse subspace and its image in the original subspace.
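A minimal sketch of the SPCA iteration of Zou, Hastie and Tibshirani (cited in the references above), under my reading of their alternating scheme: fix the orthogonal matrix A and solve an elastic-net regression for the sparse loadings B; fix B and update A from an SVD. The penalty weights and data are illustrative, and convergence settings are kept crude.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def spca(X, k, alpha=0.3, l1_ratio=0.9, n_iter=20):
    """Alternating SPCA sketch: elastic-net step for B, SVD step for A."""
    _, p = X.shape
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    A = Vt[:k].T                         # initialize with ordinary PCA loadings
    B = np.zeros((p, k))
    for _ in range(n_iter):
        for j in range(k):               # elastic-net step: sparse loading b_j
            enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False)
            B[:, j] = enet.fit(X, X @ A[:, j]).coef_
        U, _, Vt2 = np.linalg.svd(X.T @ X @ B, full_matrices=False)
        A = U @ Vt2                      # SVD (Procrustes) step for A
    norms = np.linalg.norm(B, axis=0)
    norms[norms == 0] = 1.0              # guard against all-zero columns
    return B / norms

rng = np.random.RandomState(0)
X = rng.randn(80, 10)
B = spca(X, k=2)
print(int(np.sum(B == 0)))               # the loadings contain exact zeros
```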
Basic projection theory and algorithms ------ SDA (1) Y is an m*c variable matrix of 0-1 values encoding class membership. • Idea: treat the categorical variable as a quantitative variable and write the problem as a regression. Optimal scoring; penalty matrix; penalized discriminant analysis. Geometric meaning: approximate the class-related quantitative variable in a low-dimensional subspace.
Basic projection theory and algorithms ------ SDA (2) Idea: treat the categorical variable as quantitative and write the problem as a regression with an L1-norm penalty. The optimal sparse projections are obtained by iterating Elastic Net and SVD decomposition. Geometric meaning: approximate the class-related quantitative variable in a low-dimensional subspace.
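A hedged sketch of one step of the optimal-scoring view used by SDA above: the class labels are recoded as an m*c 0-1 indicator matrix Y, a score vector theta turns the classes into a quantitative target Y @ theta, and an elastic-net regression of that target on X yields one sparse discriminant direction. The data, scores, and penalties here are illustrative; a full SDA implementation iterates this step with an SVD update of the scores.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.RandomState(0)
labels = rng.randint(0, 3, size=60)           # 60 samples, 3 classes
X = rng.randn(60, 8) + labels[:, None]        # class-dependent shift in every feature

Y = np.zeros((60, 3))
Y[np.arange(60), labels] = 1                  # m x c 0-1 indicator matrix
theta = np.array([1.0, 0.0, -1.0])            # one illustrative scoring of the classes

enet = ElasticNet(alpha=0.05, l1_ratio=0.8)
beta = enet.fit(X, Y @ theta).coef_           # one sparse discriminant direction
print(beta)
```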
Graph-based sparse projection learning model. The existing sparse learning model (USSL): The sparse discriminant projection (SLDP) learning model proposed in this work:
Comparison of sparse projection vectors and their semantic interpretation. Experiments and analysis (AR face dataset). A face image from the AR face dataset. Binary images of the sparse face subspaces obtained by SLDP (left) and USSL (right), with K = 400; white dots mark nonzero elements and black regions are zeros.
Summary of vector-based sparse projection learning • Advantages: sparse feature extraction also provides a semantic interpretation at the feature level; it can identify the most discriminative features for classification and tells us exactly which features play the key role. • Drawbacks: • High computational complexity; when there are many nonzero elements, these algorithms tend to be time-consuming. • Many projections are needed to separate the classes effectively, further increasing the computational burden. • When applied to face (image) recognition, the resulting projection axes are still hard to interpret in intuitive, reasonable facial-semantic terms; the projection vectors largely lose the attributes of the image object. • The theoretical connection between sparse discriminant projections and compact discriminant projections remains unproven.
Framework for sparse 2D feature extraction based on manifold learning. Existing compact 2D projection learning methods based on image matrices: The sparse projection learning framework proposed in this work:
Fast graph spectral eigendecomposition. These two theorems point the way to fast sparse regression!
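The slide does not spell out the two theorems, so what follows is only a plausible illustration of the kind of speed-up involved: the classic small-matrix trick, where for X of size n x d with n much smaller than d, the top eigenvectors of the d x d matrix X^T X can be read off from the tiny n x n matrix X X^T, since X X^T u = s u implies X^T X (X^T u) = s (X^T u).

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(20, 1000)                   # few samples, high dimension

small = X @ X.T                           # 20 x 20 instead of 1000 x 1000
vals, U = np.linalg.eigh(small)           # eigenvalues in ascending order
v = X.T @ U[:, -1]                        # lift to a 1000-dim eigenvector
v /= np.linalg.norm(v)

big_times_v = X.T @ (X @ v)               # apply X^T X without ever forming it
print(np.allclose(big_times_v, vals[-1] * v))
```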
2D regression extensions based on image matrices. The image-matrix-based 2D ridge regression, 2D Lasso regression, and 2D Elastic Net regression are, respectively:
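A hedged sketch of the simplest of the three, a matrix-based ("2D") ridge regression: each sample is an image matrix A_i rather than a vector, and a projection u is fit by min_u sum_i ||t_i - A_i u||^2 + lam ||u||^2, which has the closed form u = (sum_i A_i^T A_i + lam I)^{-1} sum_i A_i^T t_i. The exact formulation in the talk may differ; the data here are synthetic.

```python
import numpy as np

def ridge_2d(As, Ts, lam):
    """Matrix-based ridge: min_u sum_i ||t_i - A_i u||^2 + lam ||u||^2."""
    n = As[0].shape[1]
    G = lam * np.eye(n)                     # regularized Gram matrix
    b = np.zeros(n)
    for A, t in zip(As, Ts):
        G += A.T @ A                        # accumulate sum_i A_i^T A_i
        b += A.T @ t                        # accumulate sum_i A_i^T t_i
    return np.linalg.solve(G, b)

rng = np.random.RandomState(0)
u_true = np.array([1.0, -2.0, 0.0, 0.5])
As = [rng.randn(6, 4) for _ in range(30)]   # thirty 6x4 "images"
Ts = [A @ u_true for A in As]               # noiseless targets
u = ridge_2d(As, Ts, lam=1e-6)
print(np.round(u, 3))                       # close to u_true
```

Replacing the L2 penalty with an L1 (or mixed) penalty gives the 2D Lasso and 2D Elastic Net variants named on the slide; those no longer have a closed form and are solved iteratively.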
Sparsefaces: the unsupervised S2DLPP algorithm. The objective function of S2DLPP: The S2DLPP algorithm:
Comparison of time and space complexity. Time complexity: greatly accelerates learning. Space complexity: saves memory.
Transformation matrices of the Sparsefaces method. Experiments and analysis on the Yale face dataset. From left to right: the 2DPCA, 2DLDA, 2DLPP, and USSL "faces". The sparse "face" images learned by S2DLPP, with K = 2:2:10. Binary "face" images of the sparse faces; white dots mark zero elements and black regions are nonzero.
Properties of the unsupervised S2DLPP algorithm. Saves 20% of the time. Fast!
Effectiveness of S2DLPP under variations in time, illumination, and expression. Performance of the proposed S2DLPP algorithm. Experimental comparison on the AR face dataset: the first 10 images from the first session were used for training and the first 10 from the second session for testing. S2DLPP is robust to variations in illumination, expression, and time. Fast!
Experiments with S2DLPP on the FERET database: 1400 face images of 200 subjects; the first 5 images per subject were used for training and the remaining 2 for testing; image size 40*40. Nearly 100 times faster than vector-based sparse learning methods!
The supervised S2DLDP algorithm • The objective function of S2DLDP: The S2DLDP algorithm:
Properties of the S2DLDP transformation matrices. Experiments on the Yale face dataset. From left to right: the 2DPCA, 2DLDA, 2DLPP, and 2DLGEDA "faces". The sparse "faces" learned by S2DLDP, with K = 2:2:10. Binary "faces" of S2DLDP; white dots mark nonzero elements and black regions are zeros.
Robustness of S2DLDP. With variations in illumination, expression, and time. With variations in illumination and expression. Recognition rate of S2DLDP on the Yale face database versus the number of nonzero elements and the dimension. Recognition rates of the compared methods on the AR face database versus the dimension.
Mutually orthogonal sparse projection learning model. The existing sparse learning model (USSL): It took me more than half a year to find its solution! The mutual-orthogonality constraint!
Multilinear sparse regression on manifolds. Graph on manifolds.
Conclusions • Sparseness might be necessary! • Sparseness can be more efficient! • Fewer atoms (loadings), higher accuracy!
Possible hot topics in the future! • Effective dictionary learning for classification • Classifier (classification) based optimal dimensionality reduction • Information theory (entropy) based discriminant analysis (such as AIDA) • Game theory based discriminant analysis • (Multilinear) sparse projections and their applications to biometrics and interpretation (such as on gene data)
Some suggestions for younger researchers • Elements: step by step, from smaller to bigger • Writing: going faster does more harm! Rewrite carefully! Details decide success or failure! Aim for 3-4 papers per year! • Submissions: get comments on the paper, then just do it! • Paper (40%) + writing (30%) + reviewers (30%) = 1 • Our vantage point determines our height!
Thanks! Any questions?