Distributed Lasso for In-Network Linear Regression

Juan Andrés Bazerque, Gonzalo Mateos and Georgios B. Giannakis
March 16, 2010
Acknowledgements: ARL/CTA grant DAAD19-01-2-0011, NSF grants CCF-0830480 and ECCS-0824007

Distributed sparse estimation
• Data acquired by J agents
• Linear model at agent j with sparse common parameter
• (P1): Lasso estimator

Zou, H. “The Adaptive Lasso and its Oracle Properties,” Journal of the American Statistical Association, 101(476), 1418-1429, 2006.
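The equations for the linear model and (P1) are not shown in this transcript. A plausible reading, assuming the standard Lasso form, is sketched below. The symbols y_j and X_j follow the deck's problem statement; β, ε_j, λ and the 1/2 scaling are illustrative choices made here, not taken from the slides.

```latex
% Assumed notation: y_j, X_j are agent j's data and regression matrix;
% \beta (common sparse parameter), \varepsilon_j (noise) and \lambda are illustrative.
\begin{align*}
y_j &= X_j \beta + \varepsilon_j, \qquad j = 1, \dots, J \\
\text{(P1)} \quad \hat{\beta} &= \arg\min_{\beta} \; \frac{1}{2} \sum_{j=1}^{J} \| y_j - X_j \beta \|_2^2 + \lambda \| \beta \|_1
\end{align*}
```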

Network structure
• Centralized: fusion center
• Decentralized: ad-hoc
• Scalability
• Robustness
• Lack of infrastructure
• Problem statement
• Given data yj and regression matrices Xj available locally at agents j = 1,…,J
• Solve (P1) with local communications among neighbors (in-network processing)

Motivating application
• Scenario: wireless communications
• Spectrum cartography
• Goal: find PSD map across space and frequency
• Specification: coarse approximation suffices
• Approach: basis expansion of the PSD
[Figure axis: Frequency (MHz)]

J.-A. Bazerque and G. B. Giannakis, “Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity,” IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1847-1862, March 2010.

Modeling
• Sources
• Sensing radios
• Frequency bases
• Sensed frequencies
• Sparsity present in space and frequency

Space-frequency basis expansion
• Superimposed Tx spectra measured at Rj
• Average path-loss
• Frequency bases
• Linear model in the expansion coefficients

Consensus-based optimization
• Start from (P1)
• Consider local copies and enforce consensus
• Introduce auxiliary variables for decomposition, giving (P2)
• (P1) is equivalent to (P2), which enables a distributed implementation
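The consensus reformulation can be sketched as follows, assuming (P1) is the standard Lasso with cost (1/2) Σ_j ||y_j − X_j β||² + λ||β||_1. The local copies β_j, the auxiliary variables γ_j^i, and the neighborhood N_j are notation introduced here for illustration; the slide's own (P2) may differ in scaling or in how the constraints are written.

```latex
% Assumed notation: \beta_j is agent j's local copy, \gamma_j^{i} are auxiliary
% variables tying agent j to neighbor i, \mathcal{N}_j is the neighbor set of agent j.
\begin{align*}
\text{(P2)} \quad \min_{\{\beta_j\}, \{\gamma_j^{i}\}} \; & \sum_{j=1}^{J} \Big[ \tfrac{1}{2} \| y_j - X_j \beta_j \|_2^2 + \tfrac{\lambda}{J} \| \beta_j \|_1 \Big] \\
\text{s.t.} \quad & \beta_j = \gamma_j^{i}, \quad \beta_i = \gamma_j^{i}, \qquad i \in \mathcal{N}_j, \; j = 1, \dots, J
\end{align*}
```

When the network is connected, the constraints force all local copies to agree, so the summed cost collapses to the centralized one: the J penalty terms of weight λ/J add up to λ||β||_1.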

Towards closed-form iterates
• Introduce additional variables, giving (P3)
• Idea: reduce to orthogonal problem
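Why an orthogonal problem helps: when the regression matrix is the identity, the Lasso decouples entry by entry and its minimizer is the soft-thresholding operator. A minimal sketch in NumPy, where the function name and the test vector are illustrative and not from the slides:

```python
import numpy as np

def soft_threshold(z, lam):
    """Entrywise minimizer of 0.5 * (b - z)**2 + lam * |b|."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Orthogonal (identity-design) Lasso: entries with |z| <= lam are set to zero,
# the others are shrunk toward zero by lam.
z = np.array([3.0, -0.5, 1.2, -2.0])
beta = soft_threshold(z, 1.0)
```

Small entries are zeroed exactly, which is how the estimator performs variable selection.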

Alternating-direction method of multipliers
• Augmented Lagrangian in the primal variables and the multipliers
• AD-MoM 1st step: minimize w.r.t. the first block of primal variables
• AD-MoM 2nd step: minimize w.r.t. the second block of primal variables
• AD-MoM 3rd step: minimize w.r.t. the third block of primal variables
• AD-MoM 4th step: update multipliers

D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, 2nd ed. Athena-Scientific, 1999.

D-Lasso algorithm
• Agent j initializes its local variables and locally runs
  FOR k = 1, 2, …
  • Exchange local estimates with neighboring agents
  • Update local variables
  END FOR
• Matrix inversion of size Nj × Nj is computed offline
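The exchange-then-update loop can be illustrated with a generic decentralized consensus ADMM. This is a sketch under stated assumptions, not the D-Lasso iterates themselves: the slides' closed-form updates are not recoverable from this transcript, so each local minimization is solved here by a few proximal-gradient (ISTA) steps instead. The function names, the ring topology, the penalty c = 1, and the synthetic data are all hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(grad, lipschitz, l1_weight, beta0, n_iter):
    """Proximal-gradient loop for smooth(beta) + l1_weight * ||beta||_1."""
    beta = beta0.copy()
    for _ in range(n_iter):
        beta = soft_threshold(beta - grad(beta) / lipschitz, l1_weight / lipschitz)
    return beta

def decentralized_lasso(X, y, neighbors, lam, c=1.0, n_outer=500, n_inner=30):
    """Generic consensus ADMM; agents only use their own and neighbors' iterates."""
    J, p = len(X), X[0].shape[1]
    beta = [np.zeros(p) for _ in range(J)]   # local estimates
    mult = [np.zeros(p) for _ in range(J)]   # local multipliers
    for _ in range(n_outer):
        new_beta = []
        for j in range(J):
            deg = len(neighbors[j])
            # Sum of midpoints between agent j and each neighbor (previous iterates).
            mid_sum = sum(0.5 * (beta[j] + beta[i]) for i in neighbors[j])

            def grad(b, j=j, deg=deg, mid_sum=mid_sum):
                # Gradient of the local LS cost, multiplier term, and consensus penalty.
                return X[j].T @ (X[j] @ b - y[j]) + mult[j] + 2 * c * (deg * b - mid_sum)

            lipschitz = np.linalg.norm(X[j], 2) ** 2 + 2 * c * deg
            new_beta.append(ista(grad, lipschitz, lam / J, beta[j], n_inner))
        beta = new_beta
        for j in range(J):
            # Multiplier update from disagreement with neighbors.
            mult[j] = mult[j] + c * sum(beta[j] - beta[i] for i in neighbors[j])
    return beta

# Hypothetical synthetic example: 4 agents on a ring, sparse common parameter.
rng = np.random.default_rng(0)
J, n, p = 4, 10, 5
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
X = [rng.standard_normal((n, p)) for _ in range(J)]
y = [Xj @ beta_true + 0.1 * rng.standard_normal(n) for Xj in X]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
lam = 5.0
local = decentralized_lasso(X, y, neighbors, lam)

# Centralized reference: ISTA on the stacked data.
Xc, yc = np.vstack(X), np.concatenate(y)
central = ista(lambda b: Xc.T @ (Xc @ b - yc),
               np.linalg.norm(Xc, 2) ** 2, lam, np.zeros(p), 5000)
```

Each agent touches only its own (Xj, yj) and its neighbors' current estimates, which is the in-network processing pattern the slide describes; `central` is computed only to check that the local estimates agree with the centralized Lasso.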

D-Lasso: Convergence
Proposition: For every constant step-size, the local estimates generated by D-Lasso converge to the solution of (P1).
Attractive features
• Consensus achieved across the network
• Affordable communication of sparse estimates with neighbors
• Network-wide data percolates through exchanges
• Distributed numerical operation

Power spectrum cartography
• 5 sources
• Ns = 121 candidate locations, J = 50 sensing radios, p = 969
• Convergence to centralized counterpart
• D-Lasso localizes all sources through variable selection
[Figures: error evolution vs. iteration; aggregate spectrum map]

Conclusions and future directions
• Sparse linear model with distributed data
  • Lasso estimator
  • Ad-hoc network topology
• D-Lasso
  • Guaranteed convergence for any constant step-size
  • Linear operations per iteration
• Application: spectrum cartography
  • Map of interference across space and time
  • Multi-source localization as a byproduct
• Future directions
  • Online distributed version
  • Asynchronous updates
Thank You!

D. Angelosante, J.-A. Bazerque, and G. B. Giannakis, “Online Adaptive Estimation of Sparse Signals: Where RLS meets the ℓ1-norm,” IEEE Transactions on Signal Processing, vol. 58, 2010 (to appear).

Leave-one-agent-out cross-validation
• Agent j is set aside in round-robin fashion
• The remaining agents estimate the parameter
• Compute the prediction error on the data of agent j
• Repeat for λ = λ1,…,λN and select λmin to minimize the error
• Requires sample mean to be computed in distributed fashion
[Figures: c-v error vs. λ; path of solutions]
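The round-robin procedure can be sketched as below. For brevity this sketch fits each held-out round with a centralized ISTA solver on the stacked data of the remaining agents, and uses `np.mean` where the slide calls for a distributed sample mean; both are stand-ins assumed here. Function names, the λ grid, and the synthetic data are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=1000):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by proximal gradient."""
    lipschitz = np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta = soft_threshold(beta - X.T @ (X @ beta - y) / lipschitz, lam / lipschitz)
    return beta

def leave_one_agent_out_cv(X, y, lambdas):
    """Return the lambda with smallest average held-out error, and all errors."""
    J = len(X)
    errors = []
    for lam in lambdas:
        per_agent = []
        for j in range(J):
            # Agent j is set aside; the remaining agents fit the model.
            X_rest = np.vstack([X[i] for i in range(J) if i != j])
            y_rest = np.concatenate([y[i] for i in range(J) if i != j])
            beta = lasso_ista(X_rest, y_rest, lam)
            # Prediction error on the data of the held-out agent.
            per_agent.append(np.mean((y[j] - X[j] @ beta) ** 2))
        # Sample mean across agents (computed centrally in this sketch).
        errors.append(np.mean(per_agent))
    errors = np.array(errors)
    return lambdas[int(np.argmin(errors))], errors

# Hypothetical synthetic example.
rng = np.random.default_rng(1)
J, n, p = 5, 12, 6
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0])
X = [rng.standard_normal((n, p)) for _ in range(J)]
y = [Xj @ beta_true + 0.1 * rng.standard_normal(n) for Xj in X]
lambdas = np.logspace(-2, 2, 9)
best_lam, errors = leave_one_agent_out_cv(X, y, lambdas)
```

Plotting `errors` against `lambdas` would give a cross-validation error curve of the kind the slide shows.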

Test case: prostate cancer antigen
• 67 patients organized into J = 7 groups
• Entry n of yj measures the level of antigen for patient n in group j
• p = 8 factors: lcavol, lweight, age, lbph, svi, lcp, gleason, pgg45
• Rows of Xj store factors measured in patients
• Centralized and distributed solutions coincide
• Volume of cancer affects predominantly the level of antigen
[Figure panels: Lasso; D-Lasso]

Distributed elastic net
• Quadratic term regularizes the solution; centralized in [Zou-Zhang’09]
• Elastic net achieves variable selection on ill-conditioned problems
[Figure panels: Elastic net; Ridge regression]

H. Zou and H. H. Zhang, “On the Adaptive Elastic-Net with a Diverging Number of Parameters,” Annals of Statistics, vol. 37, no. 4, pp. 1733-1751, 2009.
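One way to see how the quadratic term regularizes an ill-conditioned problem is the classical data-augmentation identity: an elastic net on (X, y) equals a Lasso on an augmented pair whose regression matrix always has full column rank. The sketch below assumes the cost 0.5·||y − Xβ||² + λ1·||β||_1 + 0.5·λ2·||β||²; this scaling, the function names, and the data are illustrative, and the slides' distributed elastic net need not be implemented this way.

```python
import numpy as np

def elastic_net_objective(X, y, beta, lam1, lam2):
    return (0.5 * np.sum((y - X @ beta) ** 2)
            + lam1 * np.sum(np.abs(beta))
            + 0.5 * lam2 * np.sum(beta ** 2))

def lasso_objective(X, y, beta, lam1):
    return 0.5 * np.sum((y - X @ beta) ** 2) + lam1 * np.sum(np.abs(beta))

def augment(X, y, lam2):
    """Stack sqrt(lam2) * I under X and zeros under y."""
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam2) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    return X_aug, y_aug

# Hypothetical ill-conditioned example: two identical columns.
rng = np.random.default_rng(2)
X = rng.standard_normal((20, 4))
X[:, 1] = X[:, 0]
y = rng.standard_normal(20)
lam1, lam2 = 0.7, 0.3
X_aug, y_aug = augment(X, y, lam2)

# The two objectives agree at any beta, so they share minimizers.
beta = rng.standard_normal(4)
en_value = elastic_net_objective(X, y, beta, lam1, lam2)
lasso_value = lasso_objective(X_aug, y_aug, beta, lam1)
rank_before = np.linalg.matrix_rank(X)
rank_after = np.linalg.matrix_rank(X_aug)
```

Because the augmented problem is again a Lasso, any Lasso solver, distributed or not, could in principle be reused on it.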
