Personalized Social Recommendations – Accurate or Private? A. Machanavajjhala (Yahoo!), with A. Korolova (Stanford), A. Das Sarma (Google)
Social Advertising
Recommend ads based on the private shopping histories of “friends” in the social network.
[Figure: two friends, Alice and Betty, and their shopping histories: Armani, Gucci, Prada and Nikon, HP, Nike]
Social Advertising … in the real world
Items (products/people) liked by Alice’s friends are better recommendations for Alice.
[Figure: real-world example of a social recommendation: “A product that is followed by your friends …”]
Social Advertising … the privacy problem
If only the items (products/people) liked by Alice’s friends are recommended to Alice, the fact that “Betty” liked “VistaPrint” is leaked to “Alice”.
Social Advertising … the privacy problem
Recommending irrelevant items sometimes improves privacy, but reduces accuracy.
Social Advertising … the privacy problem
Alice is recommended ‘X’. Can we provide accurate recommendations to Alice based on the social network, while ensuring that Alice cannot deduce that Betty likes ‘X’?
Outline of this talk • Formal social recommendations problem • Privacy for social recommendations • Accuracy of social recommendations • Example private algorithm and its accuracy • Privacy-Accuracy trade-off • Properties satisfied by a general algorithm • Theoretical bound
Social Recommendations • A set of agents • Yahoo/Facebook users, medical patients • A set of recommended items • Other users (friends) , advertisements, products (drugs) • A network of edges connecting the agents, items • Social network, patient-doctor and patient-drug history • Problem: • Recommend a new item i to agent a based on the network
Social Recommendations(this talk) • A set of agents • Yahoo/Facebook users, medical patients • A set of recommended items • Other users (friends), advertisements, products (drugs) • A network of edges connecting the agents, items • Social network, patient-doctor and patient-drug history • Problem: • Recommend a new friend i to target user a based on the social network
Social Recommendations
• Utility Function – u(a, i): utility of recommending candidate i to target a
• Examples [Liben-Nowell et al. 2003]: # of Common Neighbors, # of Weighted Paths, Personalized PageRank
[Figure: target node a and candidate recommendations i1, i2, i3 with utilities u(a, i1), u(a, i2), u(a, i3)]
Non-Private Recommendation Algorithm
• Utility Function – u(a, i): utility of recommending candidate i to target a
• Algorithm:
    For each target node a:
        For each candidate i: compute p(a, i) so that Σ u(a,i)·p(a,i) is maximized
        Randomly pick one of the candidates with probability p(a,i)
Example: Common Neighbors Utility
• Common Neighbors Utility: “Alice and Bob are likely to be friends if they have many common neighbors”
• In the example graph: u(a,i1) = f(2), u(a, i2) = f(3), u(a,i3) = f(1)
• Non-Private Algorithm: return the candidate with max u(a, i), or randomly pick a candidate with probability proportional to u(a,i) (a sketch of both variants follows below)
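A minimal runnable sketch of this non-private baseline (not the authors' code; the adjacency-set representation and names are illustrative), using the unweighted common-neighbors utility:

```python
import random
from collections import defaultdict

def common_neighbors_utility(graph, a, i):
    """u(a, i) = number of neighbors shared by target a and candidate i."""
    return len(graph[a] & graph[i])

def non_private_recommend(graph, a, proportional=False):
    """Recommend one candidate for target a.

    proportional=False: return the max-utility candidate (deterministic variant).
    proportional=True:  sample a candidate with probability proportional to u(a, i).
    """
    candidates = [i for i in graph if i != a and i not in graph[a]]
    utilities = [common_neighbors_utility(graph, a, i) for i in candidates]
    if not proportional:
        return max(zip(candidates, utilities), key=lambda cu: cu[1])[0]
    return random.choices(candidates, weights=utilities)[0]

# Toy graph as adjacency sets; node names are illustrative.
graph = defaultdict(set)
for x, y in [("a", "b"), ("a", "c"), ("b", "i1"), ("c", "i1"), ("b", "i2")]:
    graph[x].add(y)
    graph[y].add(x)

print(non_private_recommend(graph, "a"))  # "i1": it shares 2 neighbors with "a"
```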
Outline of this talk • Formal social recommendations problem • Privacy for social recommendations • Accuracy of social recommendations • Example private algorithm and its accuracy • Privacy-Accuracy trade-off • Properties satisfied by a general algorithm • Theoretical bound
Differential Privacy [Dwork 2006]
For every pair of inputs D1, D2 that differ in one value, and for every output O, the adversary should not be able to distinguish between D1 and D2 based on O:
    | log ( Pr[D1 → O] / Pr[D2 → O] ) |  <  ε     (for a small ε > 0)
Privacy for Social Recommendations
• Sensitive information: the recommendation should not disclose the existence of an edge between two nodes.
• For any two graphs G1, G2 that differ in a single edge:
    log ( Pr[ recommending (i, a) | G1 ] / Pr[ recommending (i, a) | G2 ] )  <  ε
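One way to make the condition concrete: given the recommendation distributions an algorithm induces on two graphs that differ in a single edge, check the log-ratio bound for every possible output. A small illustrative sketch (function and variable names are mine, not from the paper):

```python
import math

def satisfies_edge_dp(p_g1, p_g2, eps):
    """Check |log(Pr[rec = o | G1] / Pr[rec = o | G2])| <= eps for every output o.

    p_g1, p_g2: dicts mapping each candidate to the probability it is
    recommended under graphs G1 and G2 (which differ in a single edge).
    """
    for o in set(p_g1) | set(p_g2):
        p1, p2 = p_g1.get(o, 0.0), p_g2.get(o, 0.0)
        if p1 == 0.0 and p2 == 0.0:
            continue
        if p1 == 0.0 or p2 == 0.0:  # one side impossible -> ratio is unbounded
            return False
        if abs(math.log(p1 / p2)) > eps:
            return False
    return True

# The deterministic max-utility recommender fails: adding one edge can move
# all of the probability mass from one candidate to another.
print(satisfies_edge_dp({"i1": 1.0}, {"i2": 1.0}, eps=0.5))  # False
```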
Outline of this talk • Formal social recommendations problem • Privacy for social recommendations • Accuracy of social recommendations • Example private algorithm and its accuracy • Privacy-Accuracy trade-off • Properties satisfied by a general algorithm • Theoretical bound
Measuring loss in utility due to privacy
• Suppose algorithm A recommends node i of utility u_i with probability p_i.
• Accuracy of A is defined by comparing its expected utility, Σ_i p_i·u_i, with the utility of the non-private algorithm.
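The exact formula is cut off on this slide; one natural reading, consistent with "comparison with utility of non-private algorithm", is the expected utility Σ_i p_i·u_i normalized by the utility the non-private (maximum-utility) algorithm achieves. A sketch under that assumption:

```python
def accuracy(utilities, probabilities):
    """Expected utility of a randomized recommender, normalized by the utility
    achieved by the non-private algorithm (which picks the best candidate).

    utilities[k] is u_k and probabilities[k] is p_k for candidate k.
    """
    expected = sum(u * p for u, p in zip(utilities, probabilities))
    best = max(utilities)
    return expected / best if best > 0 else 1.0

# A recommender that spreads probability over low-utility candidates loses accuracy.
print(accuracy([3, 1, 0, 0], [0.4, 0.2, 0.2, 0.2]))  # ~0.47
```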
Outline of this talk • Formal social recommendations problem • Privacy for social recommendations • Accuracy of social recommendations • Example private algorithm and its accuracy • Privacy-Accuracy trade-off • Properties satisfied by a general algorithm • Theoretical bound
Algorithms for Differential Privacy
Theorem: No deterministic (non-constant) algorithm guarantees differential privacy.
• Exponential Mechanism: sample from the output space with probabilities weighted by a utility score.
• Laplace Mechanism: add noise from a Laplace distribution to query answers (a sketch follows below).
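For completeness, a minimal sketch of the Laplace mechanism for a numeric query such as a count (noise scale Δ/ε, where Δ is the query's sensitivity). The function name is illustrative and this block is not part of the recommendation algorithm itself:

```python
import random

def laplace_mechanism(true_answer, sensitivity, eps):
    """Release true_answer plus Laplace(0, sensitivity / eps) noise."""
    scale = sensitivity / eps
    # The difference of two independent Exponential(1/scale) draws is
    # Laplace-distributed with the desired scale.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_answer + noise

# E.g. releasing a common-neighbors count (sensitivity 1) with eps = 0.5.
print(laplace_mechanism(true_answer=3, sensitivity=1, eps=0.5))
```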
Privacy Preserving Recommendations
• Must pick a node with non-zero probability even if u = 0.
• Exponential Mechanism [McSherry et al. 2007]: randomly pick a candidate with probability proportional to exp( ε·u(a,i) / Δ ), where Δ is the maximum change in utilities caused by changing one edge.
• Satisfies ε-differential privacy.
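A sketch of the exponential-mechanism recommender as described on this slide (Python and the helper name are mine; the sensitivity Δ depends on the utility function, and Δ = 1 is assumed below for an unweighted common-neighbors count):

```python
import math
import random

def exponential_mechanism_recommend(candidates, utilities, eps, delta):
    """Pick a candidate with probability proportional to exp(eps * u / delta),
    where delta is the maximum change in any utility caused by adding or
    removing a single edge."""
    # Shift by the max utility before exponentiating for numerical stability;
    # this leaves the sampling probabilities unchanged.
    u_max = max(utilities)
    weights = [math.exp(eps * (u - u_max) / delta) for u in utilities]
    return random.choices(candidates, weights=weights)[0]

# Three candidates with common-neighbor counts 2, 3, 1; delta = 1 is assumed
# for an unweighted common-neighbors utility.
print(exponential_mechanism_recommend(["i1", "i2", "i3"], [2, 3, 1], eps=0.5, delta=1))
```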
Accuracy of Exponential Mechanism + Common Neighbors Utility WikiVote Network (ε = 0.5) 60% of users have accuracy < 10%
Accuracy of Exponential Mechanism + Common Neighbors Utility Twitter sample (ε = 1) 98% of users have accuracy < 5%
Can we do better?
• Maybe common neighbors utility is an especially non-private utility … so consider general utility functions that follow intuitive axioms.
• Maybe the Exponential Mechanism does not guarantee sufficient accuracy … so consider any algorithm that satisfies differential privacy.
Outline of this talk • Formal social recommendations problem • Privacy for social recommendations • Accuracy of social recommendations • Example private algorithm and its accuracy • Privacy-Accuracy trade-off • Properties satisfied by a general algorithm • Theoretical bound
Axioms on Utility Functions (Exchangeability)
If candidates i3 and i4 are identical with respect to the target a, then u(a, i3) = u(a, i4).
[Figure: target a with candidates i1, i2, i3, i4]
Axioms on Utility Functions (Concentration)
“Most of the utility of recommendation to a target is concentrated on a small number of candidates.”
Outline of this talk • Formal social recommendations problem • Privacy for social recommendations • Accuracy of social recommendations • Example private algorithm and its accuracy • Privacy-Accuracy trade-off • Properties satisfied by a general algorithm • Theoretical bound
Accuracy-Privacy Tradeoff Common Neighbors & Weighted Paths Utility*: To achieve constant accuracy for target node a, ε > Ω(log n / degree(a)) * under some mild assumptions on the weighted paths utility …
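To get a feel for this bound, a back-of-the-envelope illustration (numbers are purely illustrative and the constants hidden in the Ω(·) are ignored):

```python
import math

def min_eps_for_constant_accuracy(n, degree):
    """Order-of-magnitude lower bound eps = Omega(log n / degree); constants omitted."""
    return math.log(n) / degree

# A low-degree user in a large network can only be served accurately with a weak guarantee.
for n, d in [(10**6, 10), (10**6, 100), (10**8, 10)]:
    print(f"n={n:>9}, degree={d:>3}: eps must be at least ~{min_eps_for_constant_accuracy(n, d):.2f}")
```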
Implications of Accuracy-Privacy Tradeoff WikiVote Network (ε = 0.5) 60% of users have accuracy < 55%
Implications of Accuracy-Privacy Tradeoff Twitter sample (ε = 1) 95% of users have accuracy < 5%
Takeaway …
• “For the majority of the nodes in the network, recommendations must either be inaccurate or violate differential privacy!”
• Maybe this is a “bad idea”
• Or, maybe differential privacy is too strong a privacy definition to shoot for.
Intuition behind main result
Intuition behind main result
[Figure: graphs G1 and G2, differing in one edge; under Gk, candidate i has utility uk(a, i) and is recommended with probability pk(a, i), and similarly for candidate j]
Since G1 and G2 differ in one edge, differential privacy requires
    p1(a,i) / p2(a,i) < e^ε
Intuition behind main result
[Figure: graphs G1, G2, G3, each differing from G1 in a single edge]
    p1(a,i) / p2(a,i) < e^ε        p3(a,j) / p1(a,j) < e^ε
Using Exchangeability
G3 is an isomorphism of G2, so u2(a,i) = u3(a,j), which implies p2(a,i) = p3(a,j).
[Figure: graphs G2 and G3 with nodes a, i, j]
    p1(a,i) / p2(a,i) < e^ε        p3(a,j) / p1(a,j) < e^ε
Using Exchangeability
• G3 is an isomorphism of G2: u2(a,i) = u3(a,j) implies p2(a,i) = p3(a,j).
• Chaining the two inequalities:
    p1(a,i) / p1(a,j) < e^(2ε)
Using Exchangeability
• In general, if node i can be “transformed” into node j in t edge changes,
• then the probability of recommending the highest-utility node is at most e^(tε) times the probability of recommending the worst-utility node:
    p1(a,i) / p1(a,j) < e^(tε)
Final Act: Using Concentration
• Few nodes have high utility for target a: tens of nodes share a common neighbor with a.
• Many nodes have low utility for target a: millions of nodes don’t share a common neighbor with a.
• Thus, to be accurate, there must exist i and j such that p1(a,i) / p1(a,j) = Ω(n), while privacy only allows p1(a,i) / p1(a,j) < e^(tε).
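Written out as a chain of inequalities (a sketch of the argument with constants suppressed, not the paper's exact statement):

```latex
% Accuracy (concentration) forces the ratio to grow with n, while
% privacy (exchangeability over t edge changes) caps it:
\[
  \Omega(n) \;=\; \frac{p_1(a,i)}{p_1(a,j)} \;<\; e^{t\varepsilon}
  \quad\Longrightarrow\quad
  e^{t\varepsilon} \;=\; \Omega(n)
  \quad\Longrightarrow\quad
  \varepsilon \;\geq\; \Omega\!\left(\frac{\log n}{t}\right).
\]
```

With t roughly the number of edge changes needed to turn a high-utility candidate into a low-utility one (on the order of the node degrees involved), this recovers the ε > Ω(log n / degree(a)) bound stated earlier.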
Summary of Social Recommendations
• Question: “Can social recommendations be made while guaranteeing strong privacy conditions?”
  • General utility functions satisfying natural axioms
  • Any algorithm satisfying differential privacy
• Answer: “For the majority of nodes in the network, recommendations must either be inaccurate or violate differential privacy!”
  • Maybe this is a “bad idea”
  • Or, maybe differential privacy is too strong a privacy definition to shoot for.
Summary of Social Recommendations
• Answer: “For the majority of nodes in the network, recommendations must either be inaccurate or violate differential privacy!”
  • Maybe this is a “bad idea”
  • Or, maybe differential privacy is too strong a privacy definition to shoot for.
• Open Question: “What is the minimum amount of personal information that a user must be willing to disclose in order to get personalized recommendations?”