1 / 22

Graph Analysis with Node Differential Privacy

Graph Analysis with Node Differential Privacy . Sofya Raskhodnikova Penn State University. Joint work with Shiva Kasiviswanathan ( GE Research ), Kobbi Nissim ( Ben-Gurion U. and Harvard U. ), Adam Smith ( Penn State ). Publishing information about graphs.

rocco
Download Presentation

Graph Analysis with Node Differential Privacy

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Graph Analysis with Node Differential Privacy SofyaRaskhodnikova Penn State University Joint work withShiva Kasiviswanathan(GE Research), KobbiNissim(Ben-Gurion U. and Harvard U.), Adam Smith (Penn State)

  2. Publishing information about graphs Many datasets can be represented as graphs • “Friendships” in online social network • Financial transactions • Email communication • Romantic relationships image source http://community.expressor-software.com/blogs/mtarallo/36-extracting-data-facebook-social-graph-expressor-tutorial.html Privacy is a big issue! American J. Sociology, Bearman, Moody, Stovel

  3. Differential privacy for graph data Graph G Users Differential privacy [DworkMcSherryNissim Smith 06] An algorithm A is-differentially private if for all pairs of neighborsand all sets of answers S: Trusted curator Government, researchers, businesses (or) malicious adversary ) ( queries A answers image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

  4. Two variants of differential privacy for graphs • Edge differential privacy Two graphs are neighbors if they differ in one edge. • Node differential privacy Two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edges. G: G: G: G:

  5. Node differentially private analysis of graphs Graph G Users Trusted curator Government, researchers, businesses (or) malicious adversary ) ( queries A answers • Two conflicting goals:utility and privacy • Impossible to get both in the worst case • Previously: no node differentially private algorithms that are accurate on realistic graphs image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

  6. Our contributions • First node differentially private algorithms that are accurate for sparse graphs • node differentially privatefor all graphs • accuratefor a subclass of graphs, which includes • graphs with sublinear (not necessarily constant) degree bound • graphs where the tail of the degree distribution is not too heavy • dense graphs • Techniques for node differentially private algorithms • Methodology for analyzing the accuracy of such algorithms on realistic networks Concurrent work on node privacy [Blocki Blum DattaSheffet 13]

  7. Our contributions: algorithms • Node differentially private algorithms for releasing • number of edges • counts of small subgraphs (e.g., triangles, -triangles, -stars) • degree distribution • Accuracy analysis of our algorithms for graphs with not-too-heavy-tailed degree distribution: with -decayfor constant Notation: fraction of nodes in G of degree • Every graph satisfies -decay • Natural graphs (e.g., “scale-free” graphs, Erdos-Renyi) satisfy … … Frequency A graph G satisfies -decayif for all … … … Degrees

  8. Our contributions: accuracy analysis • Node differentially private algorithms for releasing • number of edges • counts of small subgraphs (e.g., triangles, -triangles, -stars) • degree distribution • Accuracy analysis of our algorithms for graphs with not-too-heavy-tailed degree distribution: with -decay for constant • number of edges • counts of small subgraphs (e.g., triangles, -triangles, -stars) • degree distribution … … A graph G satisfies -decay if for all (1+o(1))-approximation }

  9. Previous work on differentially private computations on graphs Edge differentially private algorithms • number of triangles, MST cost [NissimRaskhodnikova Smith 07] • degree distribution[Hay RastogiMiklauSuciu 09, Hay Li MiklauJensen 09, KarwaSlavkovic 12] • small subgraph counts[KarwaRaskhodnikova Smith Yaroslavtsev 11] • cuts[Blocki Blum DattaSheffet12] Edge private against Bayesian adversary (weaker privacy) • small subgraphcounts [Rastogi Hay MiklauSuciu09] Node zero-knowledge private (strongerprivacy) • average degree, distances to nearest connected, Eulerian, cycle-free graphs for dense graphs [GehrkeLui Pass 12]

  10. Challenge for node privacy: high local sensitivity • Local sensitivity [NRS’07]: • Global sensitivity[DMNS’06]: For many functions of the data, node is high. • Consider adding a node connected to all other nodes. • Examples: • (G)= |E(G)|. • (G)= # of s in G. • Edge is 1; node is for all G. Edge is ; node is |E(G)|.

  11. “Projections” on graphs of small degree Let = family of all graphs, = family of graphs of degree . Notation. = node over. =node over . Observation. is low for many useful . Examples: • =(compare to= ) • = (compare to = ) Idea: ``Project’’ on graphs in for a carefully chosen d << n. Goal: privacy for all graphs

  12. Method 1: Lipschitz extensions • Release via GS framework[DMNS’06] • Requires designing Lipschitz extension for each function • we base ours on maximum flow and linear and convex programs A function is a Lipschitz extension of from to if • agrees with on and • = high = low

  13. Lipschitz extension of : flow graph For a graph G=(V, E), define flow graph of G: Add edge iff. (G) is the value of the maximum flow in this graph. Lemma.(G)/2is a Lipschitz extension of . 1 1' 1 2 2' s 3 3' t 4 4' 5 5'

  14. Lipschitz extension of : flow graph For a graph G=(V, E), define flow graph of G: Add edge iff. (G) is the value of the maximum flow in this graph. Lemma.(G)/2is a Lipschitz extension of . Proof: (1) (G) = for all G (2) = 2⋅ 1 1' 1/ 1 2 2' s 3 3' t 4 4' 5 5'

  15. Lipschitz extension of : flow graph For a graph G=(V, E), define flow graph of G: (G) is the value of the maximum flow in this graph. Lemma.(G)/2is a Lipschitz extension of . Proof: (1) (G) = for all G (2) = 2⋅= 2 1 1' 1 2 2' s 3 3' t 4 4' 5 5' 6 6'

  16. Lipschitz extensions via linear/convex programs For a graph G=([n], E), define LP with variables for all triangles : (G) is the value of LP. Lemma.(G)is a Lipschitz extension of . • Can be generalized to other counting queries • Other queries use convex programs Maximize for all triangles for all nodes

  17. Method 2: Generic reduction to privacy over Input: Algorithm B that is node-DP over Output: Algorithm A that is node-DP over , has accuracy similar to B on “nice” graphs • Time(A) = Time(B) + O(m+n) • Reduction works for all functions How it works: Truncation T(G) outputs G with nodes of degree removed. • Answer queries on T(G) instead of G high low • via Smooth Sensitivity framework [NRS’07] • via finding a DP upper bound on [Dwork Lei 09, KRSY’11] • and running any algorithm that is -node-DP over G A T(G) query f T S + noise (G)

  18. Generic Reduction via Truncation • Truncation T(G) removes nodes of degree . • On query , answer How much noise? • Local sensitivity of as a map • Lemma., where = nodes of degree . • Global sensitivityis too large. Frequency Nodes that determine … … d Degrees

  19. Smooth Sensitivity of Truncation Smooth Sensitivity Framework [NRS ‘07] is a smooth bound on local sensitivity ofif • for all neighbors Lemma. is a smooth bound for , computable in time “Chain rule”: is a smooth bound for G A T(G) query f T S + noise (G)

  20. Utility of the Truncation Mechanism Lemma. If we truncate to a random degree in , • Application to releasing the degree distribution: an -node differentially private algorithmsuch that with probability at least if satisfies -decay for Utility:If G is -bounded, expected noise magnitude is . G A T(G) query f T S + noise (G)

  21. Techniques used to obtain our results • Node differentially private algorithms for releasing • number of edges • counts of small subgraphs (e.g., triangles, -triangles, -stars) • degree distribution via Lipschitz extensions } via generic reduction

  22. Conclusions • It is possible to design node differentially private algorithms with good utility on sparse graphs • One can first test whether the graph is sparse privately • Directions for future work • Node-private algorithm for releasing cuts • Node-private synthetic graphs • What are the right notions of privacy for graph data?

More Related