Privacy-Preserving Support Vector Machines via Random Kernels
The 2008 International Conference on Data Mining
Olvi Mangasarian, UW Madison & UCSD La Jolla
Edward Wild, UW Madison
Horizontally Partitioned Data
[Figure: the m x n data matrix A (rows 1, …, m are examples, columns 1, …, n are features) split row-wise into blocks A1, A2, A3]
Problem Statement
• Entities with related data wish to learn a classifier based on all data
• The entities are unwilling to reveal their data to each other
• If each entity holds a different set of examples with all features, then the data is said to be horizontally partitioned
• Our approach: privacy-preserving support vector machine (PPSVM) using random kernels
• Provides accurate classification
• Does not reveal private information
Outline
• Support vector machines (SVMs)
• Reduced and random kernel SVMs
• Privacy-preserving SVM for horizontally partitioned data
• Summary
Support Vector Machines
[Figure: + and − training points separated by the bounding surfaces K(x′, A′)u = γ + 1 and K(x′, A′)u = γ − 1, with the separating surface K(x′, A′)u = γ between them]
• x ∈ R^n; the SVM nonlinear surface is defined by parameters u and threshold γ
• A contains all data points: {+ … +} ⊂ A+, {− … −} ⊂ A−; e is a vector of ones
• Require K(A+, A′)u ≥ eγ + e and K(A−, A′)u ≤ eγ − e
• The slack variable y ≥ 0 allows points to be on the wrong side of the bounding surfaces
• Minimize e′y (the hinge loss, or plus function max{·, 0}) to fit the data
• Minimize e′s (equal to ‖u‖₁ at the solution) to reduce overfitting
• Linear kernel: (K(A, B))ij = (AB)ij = AiB·j = K(Ai, B·j)
• Gaussian kernel with parameter μ: (K(A, B))ij = exp(−μ‖Ai′ − B·j‖²)
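For concreteness, a minimal NumPy sketch of these two kernels (the function names are illustrative; B is passed with its points as rows, so B.T plays the role of B′ above):

```python
import numpy as np

def linear_kernel(A, B):
    # (K(A, B'))ij = Ai . Bj : simply the matrix product A B'
    return A @ B.T

def gaussian_kernel(A, B, mu=1.0):
    # (K(A, B'))ij = exp(-mu * ||Ai - Bj||^2)
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-mu * sq_dists)
```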
Reduced and Random Kernel SVMs
• Support Vector Machine: built on the full kernel matrix K(A, A′)
• Reduced Support Vector Machine (L&M, 2001): replace the kernel matrix K(A, A′) with K(A, Ā′), where Ā′ consists of a randomly selected subset of the rows of A
• Random Reduced Support Vector Machine (M&T, 2006): replace the kernel matrix K(A, A′) with K(A, B′), where B′ is a completely random matrix
• Using the random kernel K(A, B′) is a key result for generating a simple and accurate privacy-preserving SVM
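Sketched below is one way the random-kernel classifier might be set up, assuming a Gaussian kernel and substituting scikit-learn's LinearSVC for the authors' 1-norm SVM formulation; fit_random_kernel_svm and its parameters are illustrative names, and gaussian_kernel is the sketch above:

```python
import numpy as np
from sklearn.svm import LinearSVC

def fit_random_kernel_svm(A, d, k, mu=1.0, seed=0):
    """Train on the random kernel K(A, B') instead of the full kernel K(A, A')."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((k, A.shape[1]))  # completely random k x n matrix
    K = gaussian_kernel(A, B, mu)             # m x k random kernel matrix
    return LinearSVC().fit(K, d), B

# A new point x (a 1 x n row) is classified via clf.predict(gaussian_kernel(x, B, mu)).
```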
Error of Random Kernels is Comparable to Full Kernels: Linear Kernels
[Figure: random kernel AB′ error plotted against full kernel AA′ error; each point represents one of 7 datasets from the UCI repository; the diagonal marks equal error for random and full kernels]
• B is a random matrix with the same number of columns as A and either 10% as many rows, or one fewer row than columns
Error of Random Kernels is Comparable to Full Kernels: Gaussian Kernels
[Figure: random kernel K(A, B′) error plotted against full kernel K(A, A′) error, one point per dataset]
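The UCI experiments are not reproduced here, but the comparison itself can be sketched on synthetic data; every name and size below is illustrative, gaussian_kernel is the earlier sketch, and LinearSVC again stands in for the 1-norm SVM:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 30))                  # synthetic stand-in for a UCI dataset
d = np.sign(A[:, 0])                                # synthetic labels
A_tr, A_te, d_tr, d_te = train_test_split(A, d, random_state=0)

B = rng.standard_normal((A_tr.shape[0] // 10, 30))  # 10% as many rows as A_tr
for name, basis in [("full K(A, A')", A_tr), ("random K(A, B')", B)]:
    clf = LinearSVC().fit(gaussian_kernel(A_tr, basis, mu=0.01), d_tr)  # small mu for 30-dim data
    err = 1 - clf.score(gaussian_kernel(A_te, basis, mu=0.01), d_te)
    print(f"{name}: test error {err:.3f}")
```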
Horizontally Partitioned Data: Each entity holds different examples with the same features
[Figure: the data matrix partitioned row-wise into blocks A1, A2, and A3, one block per entity]
Privacy-Preserving SVMs for Horizontally Partitioned Data via Random Kernels
• Each of q entities privately owns a block of data A1, …, Aq that it is unwilling to share with the other q − 1 entities
• The entities all agree on the same random basis matrix B and distribute K(Aj, B′) to all entities
• K(A, B′) = [K(A1, B′); K(A2, B′); …; K(Aq, B′)], the q shared blocks stacked row-wise
• Aj cannot be recovered uniquely from K(Aj, B′)
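A minimal sketch of this protocol, under the same substitutions as before (LinearSVC in place of the 1-norm SVM, gaussian_kernel from the earlier sketch, and a shared seed standing in for however the entities agree on B):

```python
import numpy as np
from sklearn.svm import LinearSVC

def ppsvm_horizontal(blocks, labels, k, mu=1.0, seed=42):
    """blocks[j] is entity j's private mj x n data Aj; only K(Aj, B') is ever shared."""
    n = blocks[0].shape[1]
    # 1. All q entities generate the same random B, e.g. from an agreed-upon seed.
    B = np.random.default_rng(seed).standard_normal((k, n))
    # 2. Each entity computes K(Aj, B') locally and distributes only that matrix.
    shared = [gaussian_kernel(Aj, B, mu) for Aj in blocks]
    # 3. The shared blocks stack row-wise into K(A, B'), on which anyone can train.
    return LinearSVC().fit(np.vstack(shared), np.concatenate(labels)), B
```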
Privacy Preservation: Infinite Number of Solutions for Ai Given AiB′
• Feng and Zhang, 2007: every square submatrix of a random matrix has full rank
• Given Pi = AiB′, consider solving for row r of Ai, 1 ≤ r ≤ mi, from the equation BAir′ = Pir′, Air′ ∈ R^n, where B ∈ R^(k×n) with k < n
• Every square submatrix of the random matrix B is nonsingular, so fixing any n − k of the components of Air and solving the resulting square system for the remaining k components always yields a solution
• There are thus at least C(n, k) distinct solutions Air′ for each row
• Thus there are at least C(n, k)^mi solutions Ai to the equation BAi′ = Pi′
• If each entity has 20 points in R^30 and B has n − 1 = 29 rows, there are at least 30^20 solutions
• Furthermore, each of the infinite number of matrices in the affine hull of these solutions is a solution
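The non-uniqueness is easy to check numerically: adding any rows from the null space of B to Ai leaves AiB′ unchanged. A linear-kernel sketch, with sizes taken from the 20-points-in-R^30 example above:

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
mi, n, k = 20, 30, 29                     # 20 points in R^30, B has k = n - 1 rows
A_i = rng.standard_normal((mi, n))        # entity i's private data
B = rng.standard_normal((k, n))           # the shared random matrix, k < n
P_i = A_i @ B.T                           # all that entity i ever reveals

N = null_space(B)                         # n x (n - k) basis of the null space of B
A_fake = A_i + rng.standard_normal((mi, N.shape[1])) @ N.T
print(np.allclose(A_fake @ B.T, P_i))     # True: A_i is not identifiable from P_i
```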
Results for PPSVM on Horizontally Partitioned Data
• Compare classifiers that share examples with classifiers that do not
• Seven datasets from the UCI repository
• Simulate a situation in which each entity has only a subset of about 25 examples
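A self-contained sketch of such a simulation on synthetic data (the sizes, seeds, linear kernel, and LinearSVC stand-in are all assumptions; the talk's actual experiments use the seven UCI datasets; linear_kernel is the earlier sketch):

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
q, n = 5, 10                                            # 5 entities, 10 features
w = rng.standard_normal(n)                              # synthetic true separator
A = rng.standard_normal((25 * q, n))                    # about 25 examples per entity
d = np.sign(A @ w + 0.1 * rng.standard_normal(len(A)))
A_te = rng.standard_normal((500, n))
d_te = np.sign(A_te @ w)

B = rng.standard_normal((n - 1, n))                     # shared random basis matrix
K_te = linear_kernel(A_te, B)
# Without sharing: each entity trains on its own ~25 examples only.
errs = [1 - LinearSVC().fit(linear_kernel(A[b], B), d[b]).score(K_te, d_te)
        for b in np.array_split(np.arange(25 * q), q)]
# With sharing: train on the stacked K(Aj, B') from all q entities.
err_all = 1 - LinearSVC().fit(linear_kernel(A, B), d).score(K_te, d_te)
print(f"mean error without sharing: {np.mean(errs):.3f}, with sharing: {err_all:.3f}")
```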
Error Rate of Sharing Data is Better than not Sharing: Linear Kernels
[Figure: error rate with sharing plotted against error rate without sharing; 7 UCI datasets represented by one point each]
Error Rate of Sharing Data is Better than not Sharing: Gaussian Kernels
[Figure: error rate with sharing plotted against error rate without sharing, one point per dataset]
Summary
• Privacy-preserving SVM for horizontally partitioned data
• Based on using the random kernel K(A, B′)
• Learns a classifier using all data, but without revealing privately held data
• Classification accuracy is better than an SVM without sharing, and comparable to an SVM where all data is shared
• Related work:
• A similar approach for vertically partitioned data is to appear in ACM TKDD
• Liu et al., 2006: properties of multiplicative data perturbation based on random projection
• Yu et al., 2006: secure computation of K(A, A′)
Questions
• Websites with links to papers and talks:
http://www.cs.wisc.edu/~olvi
http://www.cs.wisc.edu/~wildt