“Adversarial Deletion in Scale Free Random Graph Process” by A.D. Flaxman et al. Hammad Iqbal CS 3150 24 April 2006
Talk Overview • Background • Large graphs • Modeling large graphs • Robustness and Vulnerability • Problem and Mechanism • Main Results • Adversarial Deletions During Graph Generation • Results • Graph Coupling • Construction of the proofs
Large Graphs • Modeling of large graphs has generated growing interest since the 1990s • Driven by the computerization of data acquisition and greater computing power • Theoretical models are still being developed • Modeling difficulties include: • Heterogeneity of elements • Non-local interactions
Large Graphs Examples • Hollywood graph: 225,000 actors as vertices; an edge connects two actors if they were cast in the same movie • World Wide Web: 800 million pages as vertices; links from one page to another are the edges • Citation pattern of scientific publications • Electrical Power-grid of US • Nervous system of the nematode worm Caenorhabditis elegans
Small World of Large Graphs • Large naturally occurring graphs tend to show: • Sparsity: • The Hollywood graph has 13 million edges (vs ~25 billion for a clique on 225,000 vertices) • Clustering: • In the WWW, two pages that link to the same page have a higher probability of linking to one another • Small diameter: • ~log n • D.J. Watts and S.H. Strogatz, Collective dynamics of 'small-world' networks, Nature (1998)
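All three properties are easy to measure on a concrete graph. A minimal sketch of mine (assuming the networkx library, which the talk does not mention), using a Watts-Strogatz small-world graph as a stand-in for a "naturally occurring" graph:

```python
import math
import networkx as nx

# Stand-in for a large natural graph: a connected Watts-Strogatz
# small-world graph (the model from the Nature 1998 paper cited above).
G = nx.connected_watts_strogatz_graph(n=1000, k=10, p=0.1, seed=0)

n, m = G.number_of_nodes(), G.number_of_edges()
print("sparsity:   %d edges out of %d possible" % (m, n * (n - 1) // 2))
print("clustering: %.3f" % nx.average_clustering(G))
print("avg path:   %.2f  (compare log n = %.1f)" %
      (nx.average_shortest_path_length(G), math.log(n)))
```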
Talk Overview • Background • Large graphs • Modeling large graphs • Robustness and Vulnerability • Problem and Mechanism • Main Results • Adversarial Deletions During Graph Generation • Results • Graph Coupling • Construction of the proofs
Erdős–Rényi Random Graphs • Developed around 1960 by Hungarian mathematicians Paul Erdős and Alfréd Rényi • The traditional model of large-scale random graphs • G(n,p): a graph on vertex set [n] = {1,…,n} where each pair of vertices is joined independently with probability p • Weaknesses: • Fixed number of vertices (no growth) • No clustering
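As a quick illustration (my own sketch, not part of the talk), sampling from G(n,p) is a loop over all vertex pairs:

```python
import itertools
import random

def gnp(n, p, seed=None):
    """Sample an Erdos-Renyi graph G(n, p): each of the C(n,2) pairs
    of vertices is included as an edge independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < p]

print(len(gnp(1000, 0.01, seed=0)))   # ~ p * C(n,2), i.e. about 5000 edges
```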
Barabási model • Incorporates growth and preferential attachment • Evolves to a steady 'scale-free' state: the distribution of node degrees doesn't change over time • Prob of finding a vertex with k edges ~ k^-3
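A minimal sketch of the preferential-attachment rule (my own simplification of the Barabási–Albert process, not the exact model analyzed in the paper): each new vertex attaches m edges whose endpoints are chosen with probability proportional to current degree.

```python
import random

def preferential_attachment(n, m, seed=None):
    """Grow a graph vertex by vertex; each new vertex sends m edges to
    existing vertices chosen with probability proportional to degree."""
    rng = random.Random(seed)
    # Start from a small clique on m vertices so every vertex has degree > 0.
    edges = [(i, j) for i in range(m) for j in range(i + 1, m)]
    # 'targets' holds one copy of a vertex per unit of degree, so a uniform
    # pick from it is exactly degree-proportional sampling.
    targets = [v for e in edges for v in e]
    for new in range(m, n):
        chosen = [rng.choice(targets) for _ in range(m)]
        edges.extend((new, t) for t in chosen)
        targets.extend(chosen)
        targets.extend([new] * m)
    return edges
```

Graphs grown this way empirically show the heavy ~k^-3 degree tail quoted above.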
Degree Distribution • Scale free: • P[X ≥ k] ~ ck^-α • Power-law distributed • Heavy tail • Erdős–Rényi graphs: • P[X = k] = e^-λ λ^k / k! • λ depends on n • Poisson distributed • Decays rapidly for large k • P[X ≥ k] → 0 rapidly for large k
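To see the difference between the two tails numerically, a small comparison (the values of λ, c, α below are illustrative choices of mine):

```python
from math import exp, factorial

lam, c, alpha = 10.0, 1.0, 2.0   # illustrative parameters only

def poisson_tail(k, lam):
    """P[X >= k] for Poisson(lam), via the complementary pmf sum."""
    return max(0.0, 1.0 - sum(exp(-lam) * lam**i / factorial(i)
                              for i in range(k)))

def power_law_tail(k, c, alpha):
    """P[X >= k] ~ c * k^(-alpha) for a scale-free degree distribution."""
    return c * k ** (-alpha)

for k in (20, 50, 100):
    # The Poisson tail collapses to ~0 while the power-law tail stays polynomial.
    print(k, poisson_tail(k, lam), power_law_tail(k, c, alpha))
```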
Exponential (ER) vs Scale Free [Figure: two networks, each with 130 vertices and 430 edges; red = the 5 highest-degree vertices, green = neighbors of red. Albert, Jeong, Barabási 2000]
Degree Sequence of WWW • The in-degree of WWW pages is power-law distributed, ~ x^-2.1 • Out-degree ~ x^-2.45 • Average path length between nodes ~ 16
Talk Overview • Background • Large graphs • Modeling large graphs • Robustness and Vulnerability • Problem and Mechanism • Main Results • Adversarial Deletions During Graph Generation • Results • Graph Coupling • Construction of the proofs
Robustness and Vulnerability • Many complex systems display inherent tolerance against random failures • Examples: genetic systems, communication systems (Internet) • Redundant wiring is common but not the only factor • This tolerance is only shown by scale-free graphs (Albert, Jeong, Barabasi 2000)
Inverse Bond Percolation • What happens when a fraction p of the edges is removed from a graph? • Threshold probability pc(N): • The graph stays connected if the edge-removal probability p < pc(N) • Infinite-dimensional percolation • The situation is worse for node removal
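A rough simulation of inverse bond percolation (a sketch using networkx; the base graph and the values of p are my illustrative choices):

```python
import random
import networkx as nx

def giant_after_edge_removal(G, p, seed=None):
    """Remove each edge independently with probability p and return the
    fraction of vertices in the largest remaining component."""
    rng = random.Random(seed)
    H = G.copy()
    H.remove_edges_from([e for e in G.edges() if rng.random() < p])
    return max(len(c) for c in nx.connected_components(H)) / G.number_of_nodes()

G = nx.barabasi_albert_graph(10_000, 3, seed=0)
for p in (0.1, 0.5, 0.9):
    print(p, giant_after_edge_removal(G, p, seed=1))
```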
General Mechanism • Barabasi (2000) - Networks with the same number of nodes and edges, differing only in degree distribution • Two types of node removals: • Randomly selected nodes • Highly connected nodes (Worst case) • Study parameters: • Size of the largest remaining cluster (giant component) S • Average path length l
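The node-removal experiment can be sketched as follows (again networkx, parameters mine): remove a fraction f of the vertices, either uniformly at random or highest-degree first, and report the relative size S of the largest remaining component.

```python
import random
import networkx as nx

def attack(G, f, targeted, seed=None):
    """Remove a fraction f of vertices (random, or highest-degree first)
    and return the relative size S of the largest remaining component."""
    rng = random.Random(seed)
    k = int(f * G.number_of_nodes())
    if targeted:
        victims = sorted(G.nodes(), key=G.degree, reverse=True)[:k]
    else:
        victims = rng.sample(list(G.nodes()), k)
    H = G.copy()
    H.remove_nodes_from(victims)
    return max(len(c) for c in nx.connected_components(H)) / G.number_of_nodes()

G = nx.barabasi_albert_graph(10_000, 3, seed=0)
print("random:  ", attack(G, 0.05, targeted=False, seed=1))
print("targeted:", attack(G, 0.05, targeted=True))
```

The targeted removal shatters a scale-free graph far faster than random removal, which is the asymmetry plotted on the next slide.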
Main Results (deletion occurs after generation) • Why is this important? [Figure: giant component size under node removal; □ = random node removal, ○ = preferential (highest-degree) node removal]
Talk Overview • Background • Large graphs • Modeling large graphs • Robustness and Vulnerability • Problem and Mechanism • Main Results • Adversarial Deletions During Graph Generation • Results • Graph Coupling • Construction of the proofs
Main Result • Time steps {1,…,n} • At each step, a new vertex arrives with m edges attached by preferential attachment • The adversary may delete vertices, at most δn in total • m is sufficiently large relative to δ • whp there is a connected component of size ≥ n/30
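A toy simulation of this kind of process (a heavily simplified sketch of mine: the real adversary may delete whichever vertices it likes, whereas here it just removes the current highest-degree vertex at a fixed rate):

```python
import random

def scale_free_with_deletions(n, m, delta, seed=None):
    """Grow a graph for n steps by preferential attachment (m edges per new
    vertex) while an 'adversary' deletes up to delta*n vertices in total;
    here it removes the current highest-degree vertex every ~1/delta steps.
    Returns the adjacency dict of the surviving graph."""
    rng = random.Random(seed)
    adj = {v: set(range(m)) - {v} for v in range(m)}   # small seed clique
    deletions_left = int(delta * n)
    period = max(1, int(1 / delta))
    for t in range(m, n):
        # Degree-proportional choice: each surviving vertex appears in the
        # pool once per incident edge, so a uniform pick is degree-biased.
        pool = [v for v, nbrs in adj.items() for _ in nbrs] or list(adj)
        adj[t] = set()
        for _ in range(m):
            u = rng.choice(pool)
            adj[t].add(u)
            adj[u].add(t)
        if deletions_left > 0 and t % period == 0:
            victim = max(adj, key=lambda v: len(adj[v]))
            for u in adj.pop(victim):
                adj[u].discard(victim)
            deletions_left -= 1
    return adj

adj = scale_free_with_deletions(n=2000, m=10, delta=0.02, seed=0)

# Largest connected component of the surviving graph (plain DFS).
seen, best = set(), 0
for s in adj:
    if s not in seen:
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u not in comp:
                    comp.add(u)
                    stack.append(u)
        seen |= comp
        best = max(best, len(comp))
print("largest surviving component:", best, "of", len(adj), "vertices")
```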
Formal Statements • Theorem 1 • For any sufficiently small constant δ there exist a sufficiently large constant m = m(δ) and a constant θ = θ(δ, m) such that whp Gn has a “giant” connected component of size at least θn
Graph Coupling [Figure: a random graph G(n′, p) coupled with the scale-free graph; red = vertices of the induced subgraph on Γn]
Informal Proof Construction • A random graph can be tightly coupled with the scale-free graph on an induced subset of the vertices (Theorem 2) • Deleting a few edges from a random graph with relatively many edges still leaves a giant connected component (Lemma 1) • Whp there will be enough vertices for the construction of the induced subset (Lemma 2)
Formal Statements • Theorem 2 • We can couple the constructions of Gn and a random graph Hn such that Hn ~ G(Γn, p) and whp e(Hn \ Gn) ≤ Ae^-Bm n • The difference between the edge sets of Gn and Hn decreases exponentially in m, the number of edges added per vertex
Induced Sub-graph Properties • Vertex classification at each time step t: • Good if: • Created after t/2 • Number of original edges that remain undeleted ≥ m/6 • Bad otherwise • Γt = set of good vertices at time t • Good vertex can become bad • Bad vertex remains bad
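In code, the classification reads as follows (the thresholds t/2 and m/6 are from the slide; the function itself is just my paraphrase):

```python
def is_good(birth_time, surviving_original_edges, t, m):
    """A vertex is 'good' at time t if it was created after time t/2 and
    at least m/6 of its m original edges have not been deleted."""
    return birth_time > t / 2 and surviving_original_edges >= m / 6
```

Both conditions can only flip from true to false as t grows (the vertex ages past t/2, and surviving edges only decrease), which is why a good vertex can become bad but a bad vertex stays bad.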
Proof of Theorem 2: Construction • H[n/2] ~ G(Γn/2, p) • For k > n/2, both G[k] and H[k] are constructed inductively: • G[k+1] is generated from G[k] by the preferential attachment model • H[k+1] is constructed by connecting a new vertex to vertices that are good in G[k] • A difference between the two can only arise in case of a 'failure'
Proof of Theorem 2: Type 0 failure • Occurs if there are not enough good vertices in G[k] • Lemma 2: whp γt ≥ t/10 • The probability of occurrence is therefore o(1) • If this occurs, generate G[n] and H[n] independently
Proof of Theorem 2: Type 1 failure • Occurs if not enough good vertices are chosen by xk+1 in G[k] • r = number of good vertices selected • Let P[a given chosen vertex is good] = ε0 • Failure if r ≤ (1-δ)ε0m • Upper bound: Ae^-Bm, by a Chernoff-type bound
Proof of Theorem 2: Type 2 failure • Occurs if the number of good vertices chosen by xk+1 in G[k] is less than the number of random vertices generated in H[k] • X ~ Bi(r, ε0) and Y ~ Bi(γk, p) • Failure if Y > X • Upper bound on the type 2 failure probability: Ae^-Bm
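A Monte Carlo sanity check of this coupling step (the values of r, ε0, γk, p below are illustrative guesses of mine, not taken from the paper):

```python
import random

def binomial(n, p, rng):
    """Simple Bin(n, p) sample as a sum of Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def type2_failure_rate(r, eps0, gamma_k, p, trials=20_000, seed=0):
    """Estimate P[Y > X] with X ~ Bin(r, eps0) and Y ~ Bin(gamma_k, p)."""
    rng = random.Random(seed)
    return sum(binomial(gamma_k, p, rng) > binomial(r, eps0, rng)
               for _ in range(trials)) / trials

# Illustrative numbers only: when r*eps0 comfortably exceeds gamma_k*p,
# the failure event essentially never happens, mirroring the Ae^-Bm bound.
print(type2_failure_rate(r=100, eps0=0.5, gamma_k=300, p=0.03))
```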
Proof of Theorem 2: Coupling and deletion • Take a random subset of size Y of the good chosen vertices in G[k] and connect them with the new vertex in H[k] • Delete vertices in H[k] that are deleted by the adversary in G[k] • Hn ~ G(Γn, p) • A difference can only occur due to failure
Proof of Theorem 2: Bound on failures • Probability of failure at each step ≤ Ae^-Bm • Expected total number of misplaced edges added: E[M] ≤ Ae^-Bm n
Lemma 1: Statement • Let G be obtained by deleting fewer than n/100 edges from a realization of Gn,c/n. If c ≥ 10, then whp G has a component of size at least n/3
Proof of Lemma 1 • Consider a set S in Gn,c/n of size s, with n/3 ≤ s ≤ n/2 • Show that P[at most n/100 edges join S to the remaining n-s vertices] is small • E[number of edges across this cut] = s(n-s)c/n • Pick some ε so that n/100 ≤ (1-ε)s(n-s)c/n [Diagram: the cut between S (s vertices) and its complement (n-s vertices), with at most n/100 crossing edges]
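The standard calculation behind these bullets looks roughly as follows (my reconstruction of the usual Chernoff-plus-union-bound argument, not quoted from the paper):

```latex
% Expected number of edges across the cut, for n/3 <= s <= n/2 and c >= 10:
\mu \;=\; \frac{s(n-s)\,c}{n} \;\ge\; \frac{\tfrac{n}{3}\cdot\tfrac{2n}{3}\,c}{n}
      \;=\; \frac{2cn}{9} \;\ge\; \frac{20n}{9}.

% Chernoff lower tail for the number of crossing edges X,
% using n/100 <= (1 - \varepsilon)\mu, which allows e.g. \varepsilon = 0.99:
\Pr\big[X \le \tfrac{n}{100}\big] \;\le\; \Pr\big[X \le (1-\varepsilon)\mu\big]
      \;\le\; e^{-\varepsilon^{2}\mu/2}.

% Union bound over the at most 2^n candidate sets S:
2^{n}\, e^{-(0.99)^{2}\cdot\frac{20n}{9}\cdot\frac{1}{2}}
      \;\le\; e^{\,n\ln 2 \,-\, 1.08\,n} \;=\; o(1).
```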
Proof of Lemma 2: Statement and Notation • whp γt ≥ t/10 for n/2 < t ≤ n • Notation: • zt = number of deleted vertices • ν't = number of vertices in Gt • It is sufficient to show that
Proof of Lemma 2: Coupling • Couple two generative processes • P: the adversary deletes vertices at each time step • P*: no vertices are deleted until time t, and then the same vertices are deleted as in P • A difference can only occur because of a 'failure' • Upper bound on zt(P*)
Theorem 1: Statement • For any sufficiently small constant δ there exist a sufficiently large constant m = m(δ) and a constant θ = θ(δ, m) such that whp Gn has a “giant” connected component of size at least θn
Proof of Theorem 1 • Let G1 = Gn and G2 = G(Γn, p) • Let G = G1 ∩ G2 • e(G2 \ G) ≤ Ae^-Bm n by Theorem 2 • whp |Γn| = γn ≥ n/10 by Lemma 2 • Choose m large enough that p > 10/γn • The giant component then follows from Lemma 1