Worms, Viruses, and Cascading Failures in networks

Worms, Viruses, and Cascading Failures in networks. D. Towsley, U. Massachusetts. Collaborators: W. Gong, C. Zou (UMass); A. Ganesh, L. Massoulie (Microsoft). Internet as enabler of terrific apps … but also of malicious behavior.

Presentation Transcript


  1. Worms, Viruses, and Cascading Failures in networks • D. Towsley, U. Massachusetts • Collaborators: W. Gong, C. Zou (UMass); A. Ganesh, L. Massoulie (Microsoft)

  2. Internet as enabler of terrific apps

  3. Internet as enabler of terrific apps • … but also of malicious behavior • worms, viruses • Internet as a complex system • critical DNS, BGP infrastructures

  4. Worms and failures • Code Red worm • more than 360,000 infected in less than one day • disrupted parts of BGP infrastructure • SQL Slammer • less than 15 minutes to infect 75,000 hosts • congested parts of Internet • BGP errors in one network → cascade of faults in BGP in another network

  5. Goals • what are appropriate models? • deterministic • stochastic • what makes worm/virus/failure virulent? • how does topology affect virulence?

  6. Outline • worms, deterministic models • cascading failures, stochastic models • summary

  7. Worm spreading behavior • scan for vulnerable hosts • sequential, random, topological • uniform, local preference • virulence sensitive to • scanning strategy • host speed, bandwidth • protocol • …

  8. Worm spreading model • address space, size W • N vulnerable hosts • scan rate (per host), h

  9. Simple worm spreading model • I(t) - number of infected hosts at time t • Epidemic model: $dI/dt = (h/W)\, I(t)\, [N - I(t)]$, with initial condition I(0)

  10. Code Red: model • measurements from two Class A networks (D. Goldsmith, K. Eichman) • scan rate ∝ I(t) • epidemic model matches increasing part of observed Code Red data (Staniford) • What about decrease? • human countermeasures • congestion • Zou et al., 2002 • [Figure: scan rate vs. time]

  11. Assumptions • classic epidemic model • ignore countermeasures • ignore congestion • Code Red parameters • h = 358/min • N = 360,000 • uniform scan, $W = 2^{32}$ • I(0) = 10 • hundreds of minutes to spread
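As a rough check on the "hundreds of minutes" figure, the sketch below evaluates the closed-form logistic solution of the classic epidemic model. It assumes the model takes the usual form dI/dt = (h/W)·I·(N − I) with W = 2^32; the function name `infected_at` is illustrative and does not come from the slides.

```python
import math

def infected_at(t_min, h=358.0, N=360_000, W=2**32, I0=10):
    """Closed-form logistic solution of dI/dt = (h/W) * I * (N - I).

    Assumes the classic epidemic form; t_min is in minutes, h in scans/min.
    """
    growth = h * N / W                    # early exponential growth rate, 1/min
    e = math.exp(growth * t_min)
    return N * I0 * e / (N - I0 + I0 * e)

# Print the infected population at a few sample times (minutes).
for t in (0, 200, 350, 500):
    print(t, round(infected_at(t)))
```

Under these assumptions about half of the vulnerable hosts are infected after roughly 350 minutes, which is consistent with the slide's order of magnitude.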

  12. Worm virulence • increase h • increase I(0) • decrease W

  13. Worm virulence • increase h • increase I(0) • decrease W • smarter scanning

  14. The perfect worm • perfect worm • scan vulnerable nodes exactly once • flash worm (Staniford,…) • uniform scan of vulnerable nodes (W = N)

  15. Perfect Code Red worm • I(0) = 10 • h = 358/min • N = 360,000 • all hosts infected within 2 sec. • add 2 sec. infection delay -> six-fold slowdown • random scan almost perfect!

  16. Perfect Code Red worm • I(0) = 10 • h = 358/min • N = 360,000 • all hosts infected within 2 sec. • add 2 sec. infection delay -> six-fold slowdown • random scan almost perfect!
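The "within 2 sec." figure can be sanity-checked with a back-of-the-envelope sketch. It assumes an idealized perfect worm in which every scan infects a distinct, still-uninfected vulnerable host with zero infection delay, so that I(t) = I(0)·exp(h·t) until I reaches N. The function name is hypothetical.

```python
import math

def perfect_worm_seconds(h_per_min=358.0, N=360_000, I0=10):
    """Seconds for I(t) = I0 * exp(h * t) to reach N.

    Assumes no wasted scans and no infection delay (idealization).
    """
    h_per_sec = h_per_min / 60.0
    return math.log(N / I0) / h_per_sec

print(f"{perfect_worm_seconds():.2f} s")
```

With the Code Red parameters this idealization gives a little under two seconds.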

  17. Hitlist, routing worms • hitlist worm • increases I(0) • routing worm • decreases W • BGP table information: $W = 0.29 \times 2^{32}$ • 29% of IP address space

  18. Hitlist, routing worms • Code Red style worm • h = 358/min • N = 360,000 • hitlist, I(0) = 10,000 • routing worm as effective as hitlist worm • hitlist/routing worm extremely virulent
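To see why a routing worm can be about as effective as a hitlist worm, the following sketch compares the time to infect half of the vulnerable hosts. It assumes the classic logistic model dI/dt = (h/W)·I·(N − I) and inverts its closed-form solution; `minutes_to_fraction` is an illustrative name, and W = 0.29 × 2^32 is taken from the routing-worm slide.

```python
import math

def minutes_to_fraction(frac, h=358.0, N=360_000, W=2**32, I0=10):
    """Minutes for the logistic model dI/dt = (h/W) I (N - I) to reach frac * N."""
    growth = h * N / W                    # growth rate, 1/min
    target = frac * N
    # Invert I(t) = N*I0*e / (N - I0 + I0*e), where e = exp(growth * t).
    return math.log(target * (N - I0) / (I0 * (N - target))) / growth

baseline = minutes_to_fraction(0.5)                      # plain Code Red style
hitlist = minutes_to_fraction(0.5, I0=10_000)            # larger I(0)
routing = minutes_to_fraction(0.5, W=0.29 * 2**32)       # smaller W
print(round(baseline), round(hitlist), round(routing))
```

Under these assumptions both variants cut the half-infection time from about 350 minutes to roughly 100 to 120 minutes.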

  19. Local preference worm • K subnetworks • p – probability of scanning the local subnet • (1-p) – probability of scanning outside the local subnet

  20. Local preference worm • $N_k$, no. vulnerable hosts in subnet k • $I_k(t)$, no. infected hosts in subnet k • fits epidemic model for interacting groups • set of coupled ODEs

  21. Local preference worm • K = 116 • $N_k$ = 360,000/K • $I_1(0)$ = 10; $I_k(0)$ = 0, k > 1 • h = 358/min • provides some of the locality of a routing worm
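The coupled ODEs for the local preference worm are not written out in the transcript. The sketch below integrates one plausible version with forward Euler. It assumes each subnet is a /8 (2^24 addresses), that local scans are uniform over the host's own subnet, that remote scans are uniform over the other K − 1 subnets, and it uses a hypothetical p = 0.5. None of these choices are confirmed by the slides.

```python
def simulate_local_preference(K=116, N=360_000, h=358.0, p=0.5,
                              subnet_size=2**24, t_end=600.0, dt=0.1):
    """Forward-Euler integration of an assumed local-preference model:

    dI_k/dt = (N_k - I_k) * (a * I_k + b * sum_{j != k} I_j),
    a = p*h/subnet_size, b = (1-p)*h/((K-1)*subnet_size).
    Time is in minutes; returns the list [I_1, ..., I_K] at t_end.
    """
    n_k = N / K                                   # vulnerable hosts per subnet
    infected = [0.0] * K
    infected[0] = 10.0                            # I_1(0) = 10, others 0
    a = p * h / subnet_size                       # per-pair local hit rate
    b = (1.0 - p) * h / ((K - 1) * subnet_size)   # per-pair remote hit rate
    for _ in range(int(round(t_end / dt))):
        total = sum(infected)
        infected = [i + dt * (n_k - i) * (a * i + b * (total - i))
                    for i in infected]
    return infected

state = simulate_local_preference(t_end=300.0)
print(round(state[0]), round(sum(state)))
```

Varying p in this sketch is one way to explore how much locality helps the worm; the absolute numbers depend entirely on the assumed subnet layout.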

  22. Questions • topological worms • sequential scan • bandwidth constraints

  23. topology? • failure recovery?

  24. Topology and fast/slow recovery • model description • general network topologies • conditions for fast-slow recovery • specific network topologies • complete graphs (BGP routers) • hypercubes (peer-to-peer networks) • power-law graphs (Internet AS graph; E-mail address book graph)

  25. Susceptible-Infective-Susceptible (SIS) epidemic model • also known as contact process; see [Liggett] • topology: undirected, finite, connected graph G = (V, E) • $X_v = 1$ if node v is down (infected) • $X_v = 0$ if node v is up (healthy)

  26. Model • $\{X_v\}_{v \in V}$ is a Markov process on $\{0,1\}^V$ with jump rates: • $X_v \to 1$ with rate $\beta \sum_{w \sim v} X_w$ • $X_v \to 0$ with rate $\delta$ • unique absorbing state at 0 • all other states communicate, 0 is reachable
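The jump rates above can be simulated directly with a Gillespie-style loop. The sketch below assumes β is the per-edge infection rate and δ > 0 the per-node recovery rate, represents the graph as a dict of neighbor lists, and returns the time at which the all-healthy state is reached (or the first event time past `t_max`, a hypothetical cutoff added so the loop always ends).

```python
import random

def sis_extinction_time(adj, beta, delta, infected, rng, t_max=1e6):
    """Simulate the SIS/contact process until absorption at the all-healthy state.

    adj: dict node -> list of neighbors (undirected); requires delta > 0.
    Returns the absorption time, or a value >= t_max if censored.
    """
    infected = set(infected)
    t = 0.0
    while infected and t < t_max:
        # Number of infected neighbors of each healthy node.
        pressure = {}
        for w in infected:
            for v in adj[w]:
                if v not in infected:
                    pressure[v] = pressure.get(v, 0) + 1
        rate_recover = delta * len(infected)
        rate_total = beta * sum(pressure.values()) + rate_recover
        t += rng.expovariate(rate_total)          # time to the next jump
        if rng.random() * rate_total < rate_recover:
            infected.remove(rng.choice(sorted(infected)))   # X_v -> 0
        else:
            nodes = list(pressure)
            weights = [pressure[v] for v in nodes]
            infected.add(rng.choices(nodes, weights=weights)[0])  # X_v -> 1
    return t

ring = {v: [(v - 1) % 10, (v + 1) % 10] for v in range(10)}
print(sis_extinction_time(ring, 0.2, 1.0, range(10), random.Random(1)))
```

Averaging the returned time over many runs gives an empirical estimate of E[T] for a given topology and rate pair.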

  27. Time to absorption • system eventually recovers • how long does this take? • T = time to hit 0 (from a given initial condition) • how does E[T] depend on $\beta$, $\delta$, G?

  28. Example • G = line segment or ring with n nodes • Fix $\delta = 1$ • Theorem (Durrett and Liu): there is a critical $\beta_c > 0$ such that • if $\beta < \beta_c$, then E[T] = O(log n) • if $\beta > \beta_c$, then log E[T] ≈ na • signature of phase transition in infinite 1-D lattice

  29. Fast recovery, spectral radius • $\rho$ - spectral radius of graph adjacency matrix A; n = |V| • Then $P(X(t) \neq 0) \le c\, n^{1/2} \exp([\beta\rho - \delta]\, t)$ • Hence, when $\beta\rho < \delta$, survival time T satisfies $E[T] \le [\log(n) + 1] / [\delta - \beta\rho]$
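The fast-recovery condition can be evaluated numerically for a small graph. The sketch below assumes the bound reads E[T] ≤ (log n + 1)/(δ − βρ) with β the infection rate, δ the recovery rate and ρ the spectral radius of the adjacency matrix. It estimates ρ by power iteration on A + I (the shift is a design choice to avoid oscillation on bipartite graphs such as stars). Function names are illustrative.

```python
import math

def spectral_radius(adj, iterations=500):
    """Estimate the largest adjacency eigenvalue of a connected undirected graph."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # One step of power iteration with the shifted matrix A + I.
        y = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {v: val / norm for v, val in y.items()}
    # Rayleigh quotient x^T A x for the unit vector x.
    return sum(x[v] * sum(x[w] for w in adj[v]) for v in adj)

def fast_recovery_bound(adj, beta, delta):
    """Return (log n + 1)/(delta - beta*rho), or None if beta*rho >= delta."""
    rho = spectral_radius(adj)
    if beta * rho >= delta:
        return None
    return (math.log(len(adj)) + 1.0) / (delta - beta * rho)

complete5 = {v: [w for w in range(5) if w != v] for v in range(5)}
print(spectral_radius(complete5), fast_recovery_bound(complete5, 0.1, 1.0))
```

A return value of `None` only means this particular bound does not apply; it says nothing by itself about slow recovery.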

  30. Coupling proof • Consider “Branching Random Walk”, i.e. Markov process $\{Y_v\}_{v \in V}$ • $Y_v \to Y_v + 1$ with rate $\beta \sum_{w \sim v} Y_w = \beta (AY)_v$ • $Y_v \to Y_v - 1$ with rate $\delta Y_v$ • Can couple processes so that, for all t, X(t) ≤ Y(t)

  31. Branching random walk bound • By “linearity” of Y, $dE[Y(t)]/dt = (\beta A - \delta I)\, E[Y(t)]$, so $E[Y(t)] = \exp(t\,(\beta A - \delta I))\, Y(0)$ • Use $P(X(t) \neq 0) \le \sum_{v \in V} E[Y_v(t)]$

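  32. Slow recovery • Graph isoperimetric constant: $\eta = \min_{S \subset V,\ |S| \le n/2} |E(S, S^c)| / |S|$ • “perimeter”: $|E(S, S^c)|$, the number of edges leaving S • “area”: |S|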

  33. Generalized isoperimetric constant • $\eta_m = \min_{S \subset V,\ |S| \le m} |E(S, S^c)| / |S|$
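For small graphs the generalized isoperimetric constant can be computed by brute force. The sketch below assumes the definition η_m = min over nonempty node sets S with |S| ≤ m of (number of edges leaving S)/|S|, which matches the values quoted later for complete graphs, hypercubes and stars. The enumeration is exponential in m, so it is only meant for toy examples.

```python
from itertools import combinations

def isoperimetric_constant(adj, m):
    """Brute-force eta_m: min over 1 <= |S| <= m of |E(S, S^c)| / |S|."""
    nodes = list(adj)
    best = float("inf")
    for size in range(1, m + 1):
        for subset in combinations(nodes, size):
            s = set(subset)
            # Count edges with exactly one endpoint in S.
            boundary = sum(1 for v in s for w in adj[v] if w not in s)
            best = min(best, boundary / size)
    return best

complete6 = {v: [w for w in range(6) if w != v] for v in range(6)}
cube3 = {v: [v ^ (1 << b) for b in range(3)] for v in range(8)}
print(isoperimetric_constant(complete6, 3), isoperimetric_constant(cube3, 4))
```

The two printed values correspond to n − m for the complete graph on 6 nodes with m = 3, and d − k for the 3-dimensional hypercube with m = 2^2.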

  34. Slow die-out and isoperimetric constant • Suppose for some m ≤ n/2, $r := \beta \eta_m / \delta > 1$ • Then, with positive probability, epidemics survive for time at least $r^m / [2m]$ • Hence, if $m = n^a$, survival time T satisfies $\log(E[T]) = \Omega(n^a)$

  35. Coupling proof • Let $|X| = \sum_v X_v$ • Then |X| dominates process Z on {0,…,m} with transition rates: • z → z+1 at rate $\beta \eta_m z$ • z → z-1 at rate $\delta z$ • Then study absorption time for Z

  36. Complete graph • Here, $\rho = n-1$, $\eta_m = n-m$ • By picking $m = n^a$, a < 1, thresholds: • fast recovery if $\beta/\delta < 1/(n-1)$ • slow recovery if $\beta/\delta > 1/(n-n^a)$
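A quick worked example shows how close the two complete-graph thresholds are. It assumes the thresholds are stated for the ratio β/δ (infection rate over recovery rate) and picks n = 10^4 and a = 1/2 purely for illustration.

```latex
\[
n = 10^{4},\quad a = \tfrac{1}{2},\quad m = n^{a} = 100:
\qquad
\frac{1}{n-1} = \frac{1}{9999} \approx 1.0001 \times 10^{-4},
\qquad
\frac{1}{n-n^{a}} = \frac{1}{9900} \approx 1.0101 \times 10^{-4}.
\]
```

For these values the fast-recovery and slow-recovery thresholds differ by only about 1%, so on the complete graph the two conditions nearly meet.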

  37. Hypercube $\{0,1\}^d$ • Here, $d = \log_2(n)$ and $\rho = d$ • For $m = 2^k$, k < d, $\eta_m = d-k$ • Hence, for $k = \alpha d$, thresholds: • fast recovery if $\beta/\delta < 1/d$ • slow recovery if $\beta/\delta > 1/[d(1-\alpha)]$

  38. Erdős-Rényi random graph • edge between each pair of nodes present with probability $p_n$, independent of others • dense: $d_n := n p_n = \Omega(\log n)$ • then $\rho \sim \eta \sim d_n$ with high probability

  39. Star network • spectral radius: $n^{1/2}$ • isoperimetric constant: $\eta_m = 1$ for all m < n/2 • general results not useful • Specialized analysis yields: • for arbitrary constant c > 0, if $\beta/\delta < c/n^{1/2}$, fast recovery, E[T] = O(log(n)) • if $\beta/\delta > n^{a-1/2}$, for a > 0, slow recovery, $\log(E[T]) = \Omega(n^a)$

  40. Power-law random graph • Power-law graph with exponent $\gamma$: number of degree-k vertices ∝ $k^{-\gamma}$ • E.g. Internet AS graph with $\gamma = 2.1$ • Expected degree PLRG [Chung et al]: • expected degrees $w_1 > \cdots > w_n$: edge (i,j) present w.p. $w_i w_j / \sum_k w_k$ • particular choice: $w_i = c_1 (i + c_2)^{-1/(\gamma - 1)}$
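The expected-degree construction can be sketched in a few lines. The code below assumes weights w_i = c1·(i + c2)^(−1/(γ−1)) and edge probability w_i·w_j / Σ_k w_k. The values of c1 and c2, the clamping of probabilities at 1, and the omission of self-loops are simplifying assumptions of this sketch rather than details taken from the slides.

```python
import random

def expected_degrees(n, gamma, c1, c2):
    """Weights w_i = c1 * (i + c2)^(-1/(gamma - 1)) for i = 1..n."""
    return [c1 * (i + c2) ** (-1.0 / (gamma - 1.0)) for i in range(1, n + 1)]

def sample_plrg(weights, rng):
    """Sample an undirected graph with P(edge i,j) = w_i * w_j / sum(w)."""
    total = sum(weights)
    n = len(weights)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):                 # skip self-loops (assumption)
            prob = min(1.0, weights[i] * weights[j] / total)   # clamp (assumption)
            if rng.random() < prob:
                adj[i].add(j)
                adj[j].add(i)
    return adj

w = expected_degrees(200, 2.1, c1=10.0, c2=5.0)   # hypothetical c1, c2
graph = sample_plrg(w, random.Random(0))
print(max(len(nbrs) for nbrs in graph.values()))
```

The pairwise loop is quadratic in n, which is fine for small experiments but would need a smarter sampler for graphs at Internet scale.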

  41. Power-law random graph (2) Spectral radius of PLRG [Chung et al.,03]: Denote by m max. expected degree (m=w1), and by d average of expected degrees. Then:

  42. PLRG, $\gamma > 2.5$ • Epidemics on full graph live longer than on sub-graph • Look at star induced by node 1: slow die-out for $\beta/\delta > m^{\alpha - 1/2}$ • Compare to spectral radius condition: fast die-out for $\beta/\delta < m^{-1/2}$ • Two thresholds differ by $m^{\alpha}$; same gap as for star

  43. PLRG, $2 < \gamma < 2.5$ • Consider top N nodes, for suitable N • Erdős-Rényi core, with isoperimetric constant: $\eta = F(\gamma)\, \rho$ • Gap between thresholds $\eta$ and $\rho$: constant factor, $F(\gamma)$

  44. Open problems • gap between upper and lower bounds in • sparse ER graphs • power law random graphs for $\gamma < 2.5$ • spectral radius bound tight in examples, always true? • conditioned on slow recovery, how many nodes are down at intermediate times? • extensions to other graphs and to SIR epidemics

  45. Observations • neither parameter tight • gap for topologies with diverse degrees • spectral radius “seems” to be right • nothing between log n and exp(n)?

  46. Hitlist, routing worms • hitlist worm • increase I(0) • routing worm • decrease W • BGP table information: $W = 0.29 \times 2^{32}$ • 29% of IP address space • /8 aggregation: $W = 0.45 \times 2^{32}$ • 116 out of 256 possible 8-bit prefixes

  47. The appearance of phase transitions • N = 200, $k_s = 1$, $k_l = 0.01$ • Mean time to absorption goes down from $10^{47}$ to about 0 in a matter of a few states

  48. Accuracy of fluid model • population: 360,000 • scan rate h ~ N(358/min, $100^2$), normal distribution • scanning space: $2^{32}$ • I(0) = 1 • 100 simulations

  49. Accuracy of fluid model • population: 360,000 • scan rate h ~ N(358/min, $100^2$), normal distribution • scanning space: $2^{32}$ • I(0) = 10 • 100 simulations

  50. Accuracy of fluid model • population: 360,000 • scan rate h ~ N(358/min, $100^2$), normal distribution • scanning space: $2^{32}$ • I(0) = 10 • 100 simulations
