Distributed Lasso for In-Network Linear Regression

Juan Andrés Bazerque, Gonzalo Mateos and Georgios B. Giannakis
March 16, 2010
Acknowledgements: ARL/CTA grant DAAD19-01-2-0011, NSF grants CCF-0830480 and ECCS-0824007

Distributed sparse estimation
• Data acquired by J agents
• Agent j
• Linear model with sparse common parameter
• (P1)
Zou, H. “The Adaptive Lasso and its Oracle Properties,” Journal of the American Statistical Association, 101(476), 1418-1429, 2006.
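The criterion (P1) did not survive in this transcript. A plausible reading, assuming it is the standard Lasso applied to the data of all J agents, is sketched below. Here y_j and X_j are the local data and regression matrix of agent j; the names β (common parameter) and λ (sparsity-tuning weight) are assumptions.

```latex
\hat{\beta}_{\mathrm{lasso}} = \arg\min_{\beta} \; \frac{1}{2} \sum_{j=1}^{J} \left\| y_j - X_j \beta \right\|_2^2 + \lambda \left\| \beta \right\|_1
```

The ℓ1 penalty is what drives entries of the estimate to exactly zero, which is the variable-selection behavior the later slides rely on.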

Network structure
• (P1)
• Centralized: fusion center
• Decentralized: ad-hoc
• Scalability
• Robustness
• Lack of infrastructure
• Problem statement: given data yj and regression matrices Xj available locally at agents j = 1,…,J, solve (P1) with local communications among neighbors (in-network processing)

Motivating application
• Scenario: Wireless Communications
• Spectrum cartography
• Goal: Find PSD map across space and frequency
• Specification: coarse approx. suffices
• Approach: basis expansion of the PSD
• Figure axis: Frequency (MHz)
J.-A. Bazerque and G. B. Giannakis, “Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity,” IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1847-1862, March 2010.

Modeling
• Sources
• Sensing radios
• Frequency bases
• Sensed frequencies
• Sparsity present in space and frequency

Space-frequency basis expansion
• Superimposed Tx spectra measured at Rj
• Average path-loss
• Frequency bases
• Linear model in the unknown parameter vector

Consensus-based optimization
• (P1)
• Consider local copies and enforce consensus
• Introduce auxiliary variables for decomposition
• (P2)
• (P1) is equivalent to (P2), which allows a distributed implementation
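One common way to write such a consensus reformulation is sketched below, under the assumption that (P1) is the standard Lasso. Each agent j keeps a local copy β_j, and copies of neighboring agents are tied together. The neighborhood symbol N_j and the even split λ/J of the penalty are assumptions, and the auxiliary variables mentioned above are omitted for brevity.

```latex
\min_{\{\beta_j\}_{j=1}^{J}} \; \sum_{j=1}^{J} \left[ \frac{1}{2} \left\| y_j - X_j \beta_j \right\|_2^2 + \frac{\lambda}{J} \left\| \beta_j \right\|_1 \right]
\quad \text{s.t.} \quad \beta_j = \beta_{j'}, \;\; j = 1,\dots,J, \;\; j' \in \mathcal{N}_j
```

On a connected network the constraints force all local copies to coincide, so the cost collapses to the centralized one and the optimal copies equal the centralized solution.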

Towards closed-form iterates
• Introduce additional variables
• (P3)
• Idea: reduce to orthogonal problem
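An orthogonal problem is attractive because the Lasso then decouples into scalar problems, each solved in closed form by soft-thresholding. The sketch below illustrates that fact only; the function names are illustrative and it is not the (P3) iteration from the slides.

```python
def soft_threshold(b, lam):
    """Minimizer of 0.5*(b - beta)**2 + lam*abs(beta) over a scalar beta."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def orthogonal_lasso(b, lam):
    """Lasso solution when the regression matrix has orthonormal columns.

    With b = X^T y the criterion separates entry by entry, so each
    coefficient is obtained by one scalar soft-thresholding step.
    """
    return [soft_threshold(bi, lam) for bi in b]


print(orthogonal_lasso([3.0, -0.5, -2.0], 1.0))  # [2.0, 0.0, -1.0]
```

Entries of b whose magnitude does not exceed lam are set exactly to zero, which is where the sparsity of the estimate comes from.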

Alternating-direction method of multipliers
• Augmented Lagrangian in three groups of variables and their multipliers
• AD-MoM 1st to 3rd steps: minimize w.r.t. each group of variables in turn
• AD-MoM 4th step: update multipliers
D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, 2nd ed. Athena-Scientific, 1999.

D-Lasso algorithm
• Agent j initializes its local variables and locally runs
FOR k = 1,2,…
• Exchange local estimates with neighboring agents
• Update local variables
END FOR
• Offline: inversion of an Nj × Nj matrix
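The exchange-then-update loop can be illustrated with a generic decentralized consensus ADMM for the Lasso. This is a sketch under stated assumptions, not the exact D-Lasso recursion: the local subproblem is solved here by an inner proximal-gradient loop instead of the closed-form updates the slides aim for, and the names `consensus_lasso`, `local_update`, the step size `c`, the λ/J penalty split, and the ring network are all hypothetical.

```python
import numpy as np


def soft(v, t):
    """Entrywise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def local_update(X, y, mult, anchor, d, c, thresh, beta0, iters):
    """Proximal gradient on the local subproblem
    0.5*||y - X b||^2 + mult^T b + c*d*||b||^2 - anchor^T b + thresh*||b||_1
    (equal, up to a constant, to the ADMM subproblem with d neighbors)."""
    L = np.linalg.norm(X, 2) ** 2 + 2.0 * c * d  # Lipschitz constant
    b = beta0.copy()
    for _ in range(iters):
        g = X.T @ (X @ b - y) + mult + 2.0 * c * d * b - anchor
        b = soft(b - g / L, thresh / L)
    return b


def consensus_lasso(Xs, ys, neighbors, lam, c=1.0, outer=400, inner=50):
    """Each agent j touches only (Xs[j], ys[j]) and its neighbors' estimates."""
    J, p = len(Xs), Xs[0].shape[1]
    beta = [np.zeros(p) for _ in range(J)]
    mult = [np.zeros(p) for _ in range(J)]
    for _ in range(outer):
        # Exchange: multipliers accumulate disagreement with neighbors.
        for j in range(J):
            mult[j] = mult[j] + c * sum(beta[j] - beta[i] for i in neighbors[j])
        # Update: every agent solves its own penalized subproblem.
        new_beta = []
        for j in range(J):
            anchor = c * sum(beta[j] + beta[i] for i in neighbors[j])
            new_beta.append(local_update(Xs[j], ys[j], mult[j], anchor,
                                         len(neighbors[j]), c, lam / J,
                                         beta[j], inner))
        beta = new_beta
    return beta


# Synthetic data (assumed for illustration): 4 agents on a ring.
rng = np.random.default_rng(0)
J, n, p, lam = 4, 10, 5, 2.0
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
Xs = [rng.standard_normal((n, p)) for _ in range(J)]
ys = [X @ beta_true + 0.1 * rng.standard_normal(n) for X in Xs]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
local = consensus_lasso(Xs, ys, neighbors, lam)

# Centralized Lasso on the stacked data, for comparison.
X_all, y_all = np.vstack(Xs), np.concatenate(ys)
central = local_update(X_all, y_all, np.zeros(p), np.zeros(p), 0, 1.0,
                       lam, np.zeros(p), 5000)
gap = max(float(np.max(np.abs(b - central))) for b in local)
print("largest deviation from centralized Lasso:", gap)
```

The comparison at the end checks how far each local estimate is from the centralized Lasso on the stacked data.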

D-Lasso: Convergence
Proposition: For every constant step-size, the local estimates generated by D-Lasso converge to the solution of (P1).
Attractive features
• Consensus achieved across the network
• Affordable communication of sparse estimates with neighbors
• Network-wide data percolates through exchanges
• Distributed numerical operation

Power spectrum cartography
• 5 sources
• Ns = 121 candidate locations, J = 50 sensing radios, p = 969
• Figures: error evolution vs. iteration; aggregate spectrum map
• Convergence to centralized counterpart
• D-Lasso localizes all sources through variable selection

Conclusions and future directions
• Sparse linear model with distributed data
• Lasso estimator
• Ad-hoc network topology
• D-Lasso
• Guaranteed convergence for any constant step-size
• Linear operations per iteration
• Application: Spectrum cartography
• Map of interference across space and time
• Multi-source localization as a byproduct
• Future directions
• Online distributed version
• Asynchronous updates
D. Angelosante, J.-A. Bazerque, and G. B. Giannakis, “Online Adaptive Estimation of Sparse Signals: Where RLS meets the ℓ1-norm,” IEEE Transactions on Signal Processing, vol. 58, 2010 (to appear).
Thank You!

Leave-one-agent-out cross-validation
• Agent j is set aside in round robin fashion
• Remaining agents estimate the parameter
• Compute the prediction error on the data of agent j
• Repeat for λ = λ1,…,λN and select λmin to minimize the error
• Figures: c-v error vs. λ; path of solutions
• Requires sample mean to be computed in distributed fashion
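The round-robin procedure can be sketched as follows. This is a centralized stand-in under stated assumptions: a plain proximal-gradient Lasso solver replaces the distributed estimator, the mean squared prediction error on the held-out agent is an assumed error measure, and the data and the λ grid are synthetic.

```python
import numpy as np


def lasso_ista(X, y, lam, iters=500):
    """Proximal gradient for 0.5*||y - X b||^2 + lam*||b||_1."""
    b = np.zeros(X.shape[1])
    L = np.linalg.norm(X, 2) ** 2
    for _ in range(iters):
        v = b - X.T @ (X @ b - y) / L
        b = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)
    return b


def leave_one_agent_out(Xs, ys, lams):
    """Return (best lambda, list of c-v errors, one per candidate lambda)."""
    J = len(Xs)
    errors = []
    for lam in lams:
        err = 0.0
        for j in range(J):  # agent j is set aside
            X_rest = np.vstack([Xs[i] for i in range(J) if i != j])
            y_rest = np.concatenate([ys[i] for i in range(J) if i != j])
            b = lasso_ista(X_rest, y_rest, lam)
            err += float(np.mean((ys[j] - Xs[j] @ b) ** 2))
        errors.append(err / J)  # sample mean over held-out agents
    return lams[int(np.argmin(errors))], errors


rng = np.random.default_rng(1)
beta_true = np.array([2.0, 0.0, -1.5, 0.0])
Xs = [rng.standard_normal((8, 4)) for _ in range(5)]
ys = [X @ beta_true + 0.1 * rng.standard_normal(8) for X in Xs]
lams = [0.01, 0.1, 1.0, 1e4]
lam_min, errors = leave_one_agent_out(Xs, ys, lams)
print("selected lambda:", lam_min)
```

The average over held-out agents in `errors.append(err / J)` is the sample mean that, in the in-network setting, would have to be computed in distributed fashion.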

Test case: prostate cancer antigen
• 67 patients organized into J = 7 groups
• The response measures the level of antigen for patient n in group j
• p = 8 factors: lcavol, lweight, age, lbph, svi, lcp, gleason, pgg45
• Rows of Xj store factors measured in patients
• Figure panels: Lasso; D-Lasso
• Centralized and distributed solutions coincide
• Volume of cancer is the predominant factor affecting the level of antigen

Distributed elastic net
• Quadratic term regularizes the solution; centralized in [Zou-Zhang’09]
• Figure panels: Elastic net; Ridge regression
• Elastic net achieves variable selection on ill-conditioned problems
H. Zou and H. H. Zhang, “On The Adaptive Elastic-Net With A Diverging Number of Parameters,” Annals of Statistics, vol. 37, no. 4, pp. 1733-1751, 2009.
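The effect of the quadratic term can be illustrated with a small centralized sketch. It assumes the standard elastic-net criterion 0.5*||y - Xb||^2 + 0.5*lam2*||b||^2 + lam1*||b||_1, which may differ in scaling from the one on the slide; `lam1`, `lam2`, and the synthetic ill-conditioned data are assumptions.

```python
import numpy as np


def soft(v, t):
    """Entrywise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def elastic_net(X, y, lam1, lam2, iters=5000):
    """Proximal gradient for
    0.5*||y - X b||^2 + 0.5*lam2*||b||^2 + lam1*||b||_1."""
    b = np.zeros(X.shape[1])
    L = np.linalg.norm(X, 2) ** 2 + lam2  # Lipschitz constant of smooth part
    for _ in range(iters):
        g = X.T @ (X @ b - y) + lam2 * b
        b = soft(b - g / L, lam1 / L)
    return b


# Ill-conditioned design: the first two columns are nearly identical.
rng = np.random.default_rng(2)
x = rng.standard_normal(30)
X = np.column_stack([x, x + 1e-3 * rng.standard_normal(30),
                     rng.standard_normal(30)])
y = X @ np.array([1.0, 1.0, 0.0]) + 0.1 * rng.standard_normal(30)

lam2 = 1.0
# With lam1 = 0 the criterion reduces to ridge regression (closed form).
ridge = np.linalg.solve(X.T @ X + lam2 * np.eye(3), X.T @ y)
enet_as_ridge = elastic_net(X, y, 0.0, lam2)
# With lam1 > 0 the l1 term can additionally zero out coefficients.
enet = elastic_net(X, y, 2.0, lam2)
print("ridge:      ", ridge)
print("elastic net:", enet)
```

The quadratic term makes the smooth part strongly convex even when columns of X are nearly collinear, so the solution stays well defined, while the ℓ1 term keeps the variable-selection behavior that ridge regression lacks.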
