Reactive Patching: a viable worm defence strategy? Milan Vojnović & Ayalvadi Ganesh, Microsoft Research Cambridge, United Kingdom. Tutorial, Performance 2005, Juan-les-Pins, France, Oct 4, 2005
Who is this tutorial intended for? • Security non-specialists: learn some strategies of worm spread & the effectiveness of some countermeasures • Security specialists: fundamental limits of patching • Who is it not for? • Those interested in the gory details of particular worms and the vulnerabilities they exploit
What is a Worm? • Self-replicating malicious code that • Exploits a known or unknown software vulnerability (e.g. buffer overflow) • To gain (partial?) control over the host • Uses the host to propagate copies of itself • Typically, does not require human intervention • Unlike viruses • Contributes to speed of spread
Buffer overflow vulnerability • Example: Web form with a Name field • [Figure: memory laid out as Name, Data, Program; an over-long Name input carrying worm data overwrites the program]
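The picture on this slide can be mimicked with a toy memory model. This is only a sketch in Python (which is itself memory-safe); the 8-byte buffer size, the layout and the function names are assumptions made for illustration, not details of any real exploit.

```python
# Toy model of the slide's picture: a fixed-size Name buffer sits directly
# before the program bytes, and the copy routine does not check lengths.
NAME_SIZE = 8  # hypothetical size of the Name buffer


def make_memory() -> bytearray:
    """Memory layout: NAME_SIZE bytes for Name, followed by the program."""
    return bytearray(b"\x00" * NAME_SIZE + b"PROGRAM!")


def unchecked_copy(memory: bytearray, data: bytes) -> None:
    """Copy the input starting at offset 0 with no bounds check (the bug)."""
    memory[0:len(data)] = data


def checked_copy(memory: bytearray, data: bytes) -> None:
    """Same copy, but the input is truncated to the Name buffer (the fix)."""
    data = data[:NAME_SIZE]
    memory[0:len(data)] = data


payload = b"A" * NAME_SIZE + b"WORM"  # longer than the Name buffer

victim = make_memory()
unchecked_copy(victim, payload)
print(bytes(victim[NAME_SIZE:]))  # program region after the unchecked copy

safe = make_memory()
checked_copy(safe, payload)
print(bytes(safe[NAME_SIZE:]))  # program region after the checked copy
```

The unchecked copy lets the tail of the input land in the program region, which is the "worm data overwrites program" arrow in the figure.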
Motivation for studying worms • Self-replicating malicious code spreads very quickly • Code Red v2: 360,000 hosts in ~ 24 hours • Slammer: 75,000 hosts in ~ 10 minutes • Causes huge economic damage • Backbone saturation, cleanup • Things could be worse! • Hard or impossible to eliminate all vulnerabilities in code
Roadmap • Worm spread strategies • Target discovery mechanisms • SI epidemic model of worm spread • Patch spread strategies • Analysis of patching • Patching • Patching with filtering • Candidate strategies: PUSH, P2P • Summary and conclusions
Target discovery (1) • No a priori knowledge of vulnerable hosts • Random scanning • Generate IP address at random • If vulnerable host found at that address, infect it • Commonly used in current generation of worms, e.g., Code Red I, Slammer • Not very smart, but can still be fast
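A single random scan, as described in the bullets above, might be sketched as follows. The function names and the use of Python sets for the vulnerable and infected populations are assumptions for illustration; real worms use their own (often flawed) pseudo-random generators.

```python
import ipaddress
import random


def random_ipv4(rng: random.Random) -> ipaddress.IPv4Address:
    """Draw an address uniformly from the whole 32-bit IPv4 space."""
    return ipaddress.IPv4Address(rng.getrandbits(32))


def scan_once(rng: random.Random, vulnerable: set, infected: set) -> bool:
    """One scan: infect the target if it is vulnerable and not yet infected.

    Returns True on a new infection (a "hit"), False otherwise.
    """
    target = random_ipv4(rng)
    if target in vulnerable and target not in infected:
        infected.add(target)
        return True
    return False
```

Most scans miss, because vulnerable hosts occupy only a small fraction of the address space; the speed comes from the sheer scan rate.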
Host population types • S (susceptible) • I (infected) • P (patched)
Random scanning worm • [Figure: an infected host scans a random IP address; a scan that lands on a susceptible host is a hit]
Target discovery (2) • Local preference / subnet-biased scanning • E.g. Code Red II, Zotob • Infected hosts split their scanning effort between • Local subnet (IP addresses sharing the same 1-, 2- or 3-octet prefix) & • Global Internet • Why?
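Subnet preference: Code Red II • Probability 1/2: address in the same /8 as the infected host • Probability 3/8: address in the same /16 • Probability 1/8: address anywhere in the IP address space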
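The split of scanning effort can be sketched as a target-selection function. This is a simplified illustration: the function name and the integer representation of addresses are assumptions, and the sketch omits details of the real worm such as excluded address ranges.

```python
import random


def code_red_ii_target(own_ip: int, rng: random.Random) -> int:
    """Pick a scan target with Code Red II style subnet preference.

    own_ip is the infected host's IPv4 address as a 32-bit integer.
    """
    u = rng.random()
    rand = rng.getrandbits(32)
    if u < 1 / 2:
        # Same /8: keep the first octet, randomise the remaining 24 bits.
        return (own_ip & 0xFF000000) | (rand & 0x00FFFFFF)
    if u < 1 / 2 + 3 / 8:
        # Same /16: keep the first two octets, randomise the last 16 bits.
        return (own_ip & 0xFFFF0000) | (rand & 0x0000FFFF)
    # Remaining probability 1/8: anywhere in the address space.
    return rand
```

Biasing scans towards nearby prefixes pays off when vulnerable hosts are clustered, for example machines with the same configuration inside one organisation.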
Subnet preference: Zotob • Scans local /16 address space until • 512 consecutive scans miss, or • Until 32 scans miss if there has been no success • Then switches to random scanning of entire IP address space
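The switching rule on this slide can be read as a small state machine. The sketch below is one interpretation of the two bullets (the class and attribute names are hypothetical): the miss limit is 32 while no scan has ever succeeded, and 512 consecutive misses once there has been a success.

```python
from dataclasses import dataclass


@dataclass
class ZotobScanState:
    """Tracks when to leave the local /16 phase, per the rule on the slide."""

    consecutive_misses: int = 0
    any_success: bool = False
    local_phase: bool = True

    def record(self, hit: bool) -> None:
        """Update the state with the outcome of one local scan."""
        if not self.local_phase:
            return  # already scanning the whole address space
        if hit:
            self.any_success = True
            self.consecutive_misses = 0
            return
        self.consecutive_misses += 1
        limit = 512 if self.any_success else 32
        if self.consecutive_misses >= limit:
            self.local_phase = False
```

A worm loop would call `record` after each local scan and switch to random scanning once `local_phase` becomes false.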
Target discovery (3) • Hit lists: Worm seeded with list of vulnerable targets identified in advance • Carried in worm payload, or • Looked up from external server, e.g., meta-server for games • Objective: accelerate initial spread
Target discovery (4) • Topological worms • Target lists obtained from data residing on host, e.g., • Local DNS cache • Instant messenger contact lists • Neighbour lists of P2P applications • Potentially very fast, and hard to distinguish from legitimate use
Model of random scanning • Address space of size Ω = 2^32 • N vulnerable hosts, occupy fraction N/Ω of address space • Infected hosts scan addresses randomly at rate η • Code Red: η = 360 per minute • Slammer (UDP): η = 4000 per second • If scan locates a vulnerable host, it is infected
Stochastic epidemic model • Infected hosts scan IP address space at points of Poisson process of rate η • Independent at distinct hosts • Rate at which scans hit vulnerable hosts: β = η N / Ω • I(t) : Number of infected hosts, evolves as a Markov process • High-level model: ignores network congestion, latency
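The Markov process I(t) described above can be simulated directly. This is a minimal sketch under the slide's assumptions: with I infected hosts, each scanning at rate η, scans land on still-susceptible hosts at total rate η I (N − I)/Ω, so the time to the next infection is exponential with that rate. The function name and arguments are illustrative.

```python
import random


def simulate_si(n_vulnerable: int, eta: float, omega: int,
                i0: int, rng: random.Random) -> list:
    """Return the times at which I(t) jumps from i0 up to n_vulnerable."""
    t = 0.0
    infected = i0
    jump_times = []
    while infected < n_vulnerable:
        # Total rate at which scans hit hosts that are still susceptible.
        rate = eta * infected * (n_vulnerable - infected) / omega
        t += rng.expovariate(rate)
        infected += 1
        jump_times.append(t)
    return jump_times


# Example: 1,000 vulnerable hosts (a hypothetical, small population),
# Code Red scan rate of 360 scans per minute, one initial infective.
times = simulate_si(1000, 360.0, 2 ** 32, 1, random.Random(0))
print(len(times), times[-1])
```

As in the slide, the sketch ignores congestion and latency; every scan is assumed to complete instantly.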
Deterministic epidemic model • Large population limit: • N→∞, η/Ω fixed • i(t) = I(t)/N : fraction of hosts infected • i(t): density dependent Markov chain • Converges to limit deterministic ODE i’(t) = β i(t) [1-i(t)]
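The limit ODE has a closed-form solution, which makes the logistic shape explicit. The derivation below is the standard one; i(0), the initial infected fraction, is a symbol introduced here and is not on the slide.

```latex
\begin{align*}
i'(t) &= \beta\, i(t)\,[1 - i(t)] \\
\frac{d}{dt}\,\ln\frac{i(t)}{1 - i(t)} &= \frac{i'(t)}{i(t)\,[1 - i(t)]} = \beta \\
i(t) &= \frac{i(0)\, e^{\beta t}}{1 - i(0) + i(0)\, e^{\beta t}}
\end{align*}
```

For small i(0) and early t the denominator is close to 1, so growth is exponential at rate β; as t grows, i(t) saturates at 1.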
Characteristic time scale • Epidemic growth follows logistic curve • Initially exponential, then saturates • Time to infect most of the susceptible population is a small multiple of 1/β: • ~ 40 minutes for Code Red • ~ 10 seconds for Slammer • Time scale for network-wide infection is hours for Code Red, minutes for Slammer
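The orders of magnitude quoted above can be checked from β = η N / Ω. The sketch below assumes that N equals the infected-host counts quoted earlier in the tutorial (360,000 for Code Red, 75,000 for Slammer) and that the epidemic starts from a single host; both are assumptions, so the output is indicative only.

```python
import math

OMEGA = 2 ** 32  # size of the IPv4 address space


def infection_rate(eta: float, n_vulnerable: int, omega: int = OMEGA) -> float:
    """beta = eta * N / Omega, in the time unit of eta."""
    return eta * n_vulnerable / omega


def time_to_fraction(beta: float, i0: float, target: float) -> float:
    """Time for the logistic curve to grow from fraction i0 to target."""
    def logit(x: float) -> float:
        return math.log(x / (1.0 - x))
    return (logit(target) - logit(i0)) / beta


code_red = infection_rate(eta=360, n_vulnerable=360_000)   # per minute
slammer = infection_rate(eta=4000, n_vulnerable=75_000)    # per second

print("Code Red: 1/beta (min):", 1 / code_red)
print("Code Red: time to 90% (min):", time_to_fraction(code_red, 1 / 360_000, 0.9))
print("Slammer: 1/beta (s):", 1 / slammer)
print("Slammer: time to 90% (s):", time_to_fraction(slammer, 1 / 75_000, 0.9))
```

Under these assumptions 1/β comes out at tens of minutes for Code Red and tens of seconds for Slammer, and the time to reach 90% is a matter of hours and minutes respectively, in line with the slide.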
Not considered: non-constant per-infective scan rate • Some random scanning worms exhibit a non-constant per-infective scan rate • Example: Slammer (2003) • Plausible cause: bandwidth saturation • See “extras” • Per-infective scan rate = (observed scans per unit time) / (number of infected hosts) [Moore+04]
Patching strategies • Patching: Identify vulnerabilities and develop patches • Filtering: Detect and quarantine infected hosts / subnets • Other: ensure code has no vulnerabilities (non-trivial in practice)
Current approaches • Vulnerability is found & patch developed first • Patch is released • Worm is subsequently reverse engineered from patch • Patch needs to be installed before worm is released – hours to days
Example: Zotob • Aug 9-05: MS05-039 public disclosure • Plug-and-play vulnerability affecting mostly Win2k • Aug 12-05: Exploit code released • Aug 14-05: Zotob worm discovered • Followed by > 10 variants
Deficiencies • Zero-day worms: vulnerability not known or patch not yet available • Requires automatic response, involving • Detection of worm spread • Automatic patch generation • Automatic patch dissemination • Human reaction times too slow
Example: Vigilante [Costa+05] • Detectors distributed through network • Detect worms by analysis of stack in code execution • Can be combined with honeypots etc • Generate self-certifying alerts (SCAs) proving vulnerability • Disseminate to hosts which verify SCA and create filters (patches)
Problems we address • Architecture for alert dissemination • Vigilante uses structured overlay interconnecting all end hosts • We propose a hierarchical scheme • Analysis of competing spread of worm and patch • To establish if patching is feasible, and • quantify system requirements
Patching • Hierarchical dissemination: • Phase 1: among patching servers • Phase 2: patching-servers to hosts
Network partitioned into subnets • [Figure: subnets 1, 2, …, j, …, J]
Patching server in each subnet • patching servers termed superhosts • [Figure: one patching server in each of the subnets 1, 2, …, j, …, J]
Superhosts interconnected by an overlay • alerts or patches disseminate over overlay • with alerts, patch generated at superhosts • non-essential in modelling
PULL • hosts poll a superhost with unit rate • superhost service rate = μ • results in a patched host, if the polling host was susceptible • s(t) = fraction of susceptible hosts at time t • patching rate at time t = μ s(t)
Host population dynamics • if μ = 0, standard logistic • in general: • patch susceptible hosts only • assumes worm prevents patching an infected host • plausible assumption for automatic patching (no human intervention) • Patching system:
Limit host population • Result • Implication • Tight bound whenever the infection rate β is sufficiently small (final fraction of infectives small) • Exponential in the infection-to-patch rate ratio!
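The formulas for this slide did not survive in the transcript. The derivation below is a reconstruction, not the authors' text: it assumes the patching system implied by the earlier bullets (infection at rate β i s, patching of susceptibles only at rate μ s, no host patched at time 0), writing β for the infection rate and μ for the patching rate.

```latex
\begin{align*}
i'(t) &= \beta\, i(t)\, s(t), &
s'(t) &= -\beta\, i(t)\, s(t) - \mu\, s(t) \\
\frac{ds}{di} &= -1 - \frac{\mu}{\beta\, i} \\
s(t) - s(0) &= -\,[\, i(t) - i(0) \,] - \frac{\mu}{\beta}\, \ln\frac{i(t)}{i(0)} \\
i(\infty) &= i(0)\, \exp\!\Big( \frac{\beta}{\mu}\, [\, 1 - i(\infty) \,] \Big)
\;\le\; i(0)\, e^{\beta/\mu}
\end{align*}
```

The last line uses s(∞) = 0 and s(0) = 1 − i(0). With μ = 0 and s = 1 − i the first equation reduces to the logistic ODE, and the bound is tight when i(∞) is small, which is consistent with the bullets on this slide.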
Limit host population (example) • 10,000 vulnerable hosts • β = 0.1 • dots = Monte Carlo • [Figure: plot for this example, with the Monte Carlo results shown as dots]
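A deterministic counterpart of this example can be computed numerically. The sketch below assumes the patching system i' = β i s, s' = −β i s − μ s (infection rate β, patching rate μ), uses β = 0.1 as on the slide, and takes an initial infected fraction of 1/10,000 and a few patching rates as purely illustrative values. It compares the final infected fraction with the expression i(0) e^{β/μ}, which is an upper bound for this assumed system.

```python
import math


def final_infected_fraction(beta: float, mu: float, i0: float,
                            dt: float = 0.01, t_max: float = 2000.0) -> float:
    """Forward-Euler integration of i' = beta*i*s, s' = -beta*i*s - mu*s."""
    i = i0
    s = 1.0 - i0
    t = 0.0
    while t < t_max and s > 1e-12:
        di = beta * i * s
        ds = -beta * i * s - mu * s
        i += dt * di
        s += dt * ds
        t += dt
    return i


BETA = 0.1          # infection rate from the slide
I0 = 1.0 / 10_000   # assumed: one initial infective among 10,000 hosts

for mu in (0.05, 0.1, 0.2):  # hypothetical patching rates
    i_inf = final_infected_fraction(BETA, mu, I0)
    bound = I0 * math.exp(BETA / mu)
    print(f"mu={mu}: final infected fraction={i_inf:.3e}, bound={bound:.3e}")
```

Halving μ multiplies the bound by a factor that itself grows with β/μ, which is the "exponential in the infection to patch rate ratio" effect.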
Subnets (cont’d) • patching with rate μ only in alerted subnets • g(t) = fraction of alerted subnets at time t • [Figure: subnets 1, 2, …, j, …, J, with the alerted subnets marked]
Broadcast curve • Natural candidate: logistic function • g(t) = fraction of alerted subnets at time t • T = broadcast time • [Figure: g(t) against t, rising from 0 to 1, with the broadcast time T marked on the time axis] • Many-superhosts limit for random gossip: T = O(log(J)) • Same order for standard overlays
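The logarithmic broadcast time of random gossip can be illustrated with a small simulation. This sketch assumes a synchronous push protocol (in each round, every alerted superhost alerts one superhost chosen uniformly at random), which is one common gossip variant and not necessarily the one the authors analyse.

```python
import math
import random


def gossip_curve(j: int, rng: random.Random) -> list:
    """Return g after each round: the fraction of the J superhosts alerted."""
    alerted = {0}            # superhost 0 raises the first alert
    curve = [1 / j]
    while len(alerted) < j:
        # Every alerted superhost pushes the alert to one random superhost.
        targets = {rng.randrange(j) for _ in range(len(alerted))}
        alerted |= targets
        curve.append(len(alerted) / j)
    return curve


J = 1024
curve = gossip_curve(J, random.Random(0))
rounds = len(curve) - 1
print("rounds to alert all superhosts:", rounds, "log2(J) =", math.log2(J))
```

The alerted fraction roughly doubles per round at first and then saturates, giving an S-shaped curve like the logistic candidate on the slide.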
Broadcast curve (example) • Example: • Pastry overlay of J superhosts • Topology = GaTech • Broadcasting = Flooding • Exhibits logistic growth • (Such overlays are randomly constructed: locally tree-like)
Minimum Broadcast Curve • A curve m(t) such that at any time t, fraction of alerted superhosts ≥ m(t) • Comparison: Minimum broadcast curve yields an upper bound on the fraction of infectives • Example: flooding on Pastry overlay & logistic minimum broadcast curve m(t)
Host population dynamics • [Equations not recovered; their annotated terms are the fraction of infectives in alerted subnets and the fraction of susceptibles in alerted subnets]
The migrations • g(t)(i(t)-i1(t)) = J g(t)(1-g(t)) × [i(t)-i1(t)] / [J(1-g(t))] • Assume at time t, J(1-g(t)) = 5 subnets are not yet alerted • Pick one of them at random • M := # infectives in the randomly picked subnet • E(M) = (I1+…+I5) / 5 • [Figure: five subnets with infective counts I1, …, I5]
Host population dynamics • The last ODE: Riccati • Use substitution: w = 1 + 1/z (w = 1 is a particular solution) • Familiar one-subnet patching system, but with patching rate = μ w(t)
Per-susceptible patching rate • bottleneck = patching within subnets • bottleneck = alerts over overlay
Overlays that satisfy a logistic minimum broadcast curve • Fast-overlay asymptotic: small μ, β and μ/β fixed • Intuitive: T replaced with log(1/g(0)), the overlay “diameter” • Heuristics: take g(0) = 1/J, so log(1/g(0)) = log(J)
Known broadcast time T • Result • Implies: • If T = 0, then consistent with one-subnet patching • Uses minimum broadcast curve m(t) = 1{t ≥ T} • No patching until all subnets alerted • [Figure: g(t) against t, with the levels 0 and 1 and the time T marked]