Privacy-preserving SOM-based recommendations on horizontally distributed data • Presenter: Jian-Ren Chen • Authors: Cihan Kaleli, Huseyin Polat • 2012, KBS
Outline • Motivation • Objectives • Methodology • Privacy analysis • Experiments • Conclusions • Comments
Motivation • Collaborative filtering (CF) systems are used to suggest web pages. • A limited number of users’ data -> lack of accuracy -> cold start problem • Users’ data are horizontally partitioned among multiple vendors.
Objectives • Companies holding an inadequate amount of users’ data might decide to combine their data. • Accurate predictions • Performance • Privacy-preserving scheme
Methodology • a. Off-line: i. Cluster users’ data distributed among multiple parties using SOM while preserving data owners’ privacy. ii. Compute aggregate data values required for recommendation estimations. • b. Online: i. Determine the cluster of the active user a. ii. Estimate the prediction after receiving the required aggregate data from other parties. Return the referral to a. • Components: privacy-preserving SOM clustering on horizontally distributed data; privacy-preserving k-nn-based predictions on horizontally distributed data
Methodology: SOM clustering • Determine values of initial constants. • Find the winning Kohonen layer neuron. • Update the weight vectors of all neurons.
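The formulas for these three SOM steps did not survive in this transcript, so the following is only a minimal sketch of a textbook SOM iteration, not necessarily the exact formulation used by the authors. The function names, the one-dimensional neuron grid, the Gaussian neighbourhood, and the exponential decay of the learning rate and radius are all assumptions made for illustration.

```python
import numpy as np

def find_winner(weights, x):
    """Return the index of the neuron whose weight vector is closest to x."""
    distances = np.linalg.norm(weights - x, axis=1)
    return int(np.argmin(distances))

def update_weights(weights, x, winner, learning_rate, radius):
    """Move every neuron toward x, scaled by a Gaussian neighbourhood
    centred on the winner (assumed 1-D neuron grid)."""
    grid_dist = np.abs(np.arange(len(weights)) - winner)
    influence = np.exp(-(grid_dist ** 2) / (2.0 * radius ** 2))
    return weights + learning_rate * influence[:, None] * (x - weights)

def train_som(data, n_neurons, n_epochs=20, lr0=0.5, radius0=1.0, seed=0):
    """Train a small SOM; lr0 and radius0 are the assumed initial constants."""
    rng = np.random.default_rng(seed)
    weights = rng.random((n_neurons, data.shape[1]))
    for epoch in range(n_epochs):
        decay = np.exp(-epoch / n_epochs)  # assumed decay schedule
        for x in data:
            winner = find_winner(weights, x)
            weights = update_weights(weights, x, winner,
                                     lr0 * decay, radius0 * decay)
    return weights
```

After training, a user is assigned to the cluster of the winning neuron returned by `find_winner`.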
Methodology: k-nn-based collaborative filtering • Pearson correlation coefficient • The prediction for a on q
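The two formulas on this slide are missing from the transcript. As a hedged illustration only, the sketch below uses the common Pearson-weighted k-nn predictor: similarities are computed over co-rated items, and the prediction adds a weighted average of the neighbours' z-scores on q, scaled by the active user's standard deviation, to the active user's mean rating. The use of z-scores, the absolute values in the denominator, the NaN encoding for unrated items, and all function names are assumptions rather than the authors' stated definitions.

```python
import numpy as np

def pearson(a, u):
    """Pearson correlation over items rated by both users (NaN = unrated)."""
    both = ~np.isnan(a) & ~np.isnan(u)
    if both.sum() < 2:
        return 0.0
    da = a[both] - a[both].mean()
    du = u[both] - u[both].mean()
    denom = np.sqrt((da ** 2).sum() * (du ** 2).sum())
    return float((da * du).sum() / denom) if denom > 0 else 0.0

def z_score(u, q):
    """z-score of user u's rating on item q, relative to u's own ratings."""
    rated = u[~np.isnan(u)]
    std = rated.std()
    return float((u[q] - rated.mean()) / std) if std > 0 else 0.0

def predict(a, neighbours, q, k=2):
    """Predict a's rating on q from the k most similar users who rated q."""
    candidates = [u for u in neighbours if not np.isnan(u[q])]
    scored = sorted(((pearson(a, u), z_score(u, q)) for u in candidates),
                    key=lambda t: t[0], reverse=True)[:k]
    num = sum(w * z for w, z in scored)
    den = sum(abs(w) for w, _ in scored)
    rated_a = a[~np.isnan(a)]
    if den == 0:
        return float(rated_a.mean())  # fall back to a's mean rating
    return float(rated_a.mean() + rated_a.std() * num / den)
```

For example, `predict(a, neighbours, q=3, k=1)` uses only the single most similar neighbour that has rated item 3.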
Methodology: privacy-preserving SOM clustering on horizontally distributed data • 1. Determine the number of clusters, the sequence of the active parties, and the values of the initial constants. • 2. The first party assigns all users it holds to a cluster and sends the updated W_j vectors to the second party. • 3. The next party repeats step 2 and sends the newly updated W_j vectors to the next party. • 4. The last party sends the updated W_j vectors to the IP.
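The sequential hand-off of the W_j vectors can be sketched roughly as follows. This is a simplified illustration under stated assumptions: only the winning neuron is updated, a single fixed learning rate is used, and none of the paper's privacy measures are modelled. The names `local_som_pass` and `distributed_round` are hypothetical.

```python
import numpy as np

def local_som_pass(weights, local_users, learning_rate):
    """One party assigns each of its users to the nearest neuron and
    updates that neuron's weight vector W_j (winner-only update, assumed)."""
    weights = weights.copy()  # never modify the vectors received from others
    assignments = []
    for x in local_users:
        j = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
        weights[j] += learning_rate * (x - weights[j])
        assignments.append(j)
    return weights, assignments

def distributed_round(initial_weights, parties, learning_rate=0.1):
    """Pass the weight vectors through the parties in the agreed order.
    Only the weights travel; each party's raw user data stays local."""
    weights = initial_weights
    all_assignments = []
    for local_users in parties:
        weights, assignments = local_som_pass(weights, local_users,
                                              learning_rate)
        all_assignments.append(assignments)  # kept by the owning party
    return weights, all_assignments
```

In this sketch the return value of `distributed_round` plays the role of the vectors that the last party sends back at the end of the round.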
Methodology: privacy-preserving k-nn-based predictions on horizontally distributed data • p_aq = v_a + P, where P is: • Among C parties, P can be written: • Parties choose j percent of their z_uj values, remove those values, and replace them with zero, where j in (0, ]. • Parties choose j percent of the users who did not rate q, where j in (0, ).
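The expression for P is missing from this transcript. Purely as an illustration, the sketch below assumes that P is a ratio of similarity-weighted sums, so that each of the C parties can contribute a partial numerator and denominator computed on its own users. The masking helper mirrors the "replace j percent of the z values with zero" idea from the slide. The function names, the absolute values in the denominator, and the way the percentage is applied are assumptions, not the authors' protocol.

```python
import random

def mask_z_values(z_values, percent, rng):
    """Replace a randomly chosen `percent` of the entries with zero."""
    masked = list(z_values)
    n_mask = int(len(masked) * percent / 100)
    for i in rng.sample(range(len(masked)), n_mask):
        masked[i] = 0.0
    return masked

def party_partial_sums(similarities, z_values):
    """Partial numerator and denominator computed locally by one party."""
    num = sum(w * z for w, z in zip(similarities, z_values))
    den = sum(abs(w) for w in similarities)
    return num, den

def combine(partials):
    """Combine the partial sums received from all parties into P (assumed form)."""
    num = sum(n for n, _ in partials)
    den = sum(d for _, d in partials)
    return num / den if den else 0.0
```

A party would call `mask_z_values` on its own z values before `party_partial_sums`, so that only masked aggregates, never individual ratings, leave the party.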
Privacy analysis • Attacks and vulnerabilities: • A1: Parties can collude to capture a target party’s data. • A2: Paying off • V1: Not being able to return any result • V2: Missing values in the aggregate values vector
Experiments • Data sets
Conclusions • Integrating split data significantly improves prediction accuracy. • Although privacy measures reduce accuracy, the accuracy losses are smaller than the accuracy gains due to collaboration.
Comments • Advantages • accuracy, performance, and privacy • Disadvantages • cost, accuracy • Applications • Collaborative filtering • Privacy-preserving scheme