Understand the fast spread of Code Red worm, predict propagation, and find mitigation solutions. Explore epidemic modeling and human countermeasures. Evaluate the impact and patterns of the worm propagation using observed data. Validate with results and consider possible overestimation issues.
Code Red Worm Propagation Modeling and Analysis Cliff Changchun Zou, Weibo Gong, Don Towsley Univ. Massachusetts, Amherst
Motivation
• Code Red worm incident of July 19th, 2001:
  • Showed how fast a worm can spread.
  • More than 350,000 hosts infected in less than one day.
• A friendly worm?
  • No real damage to compromised computers.
  • Did not send out flooding traffic.
• A good model can:
  • Predict worm propagation and damage.
  • Help us understand the worm's spreading characteristics.
  • Help find effective mitigation techniques.
Code Red worm background
• Sent an HTTP GET request that exploited a buffer overflow in Windows IIS servers.
• It generated 100 threads to scan simultaneously.
  • One reason for its fast spreading.
  • Huge scan traffic might have caused congestion.
• Characteristics:
  • Uniformly picked IP addresses to send scan packets to.
Epidemic modeling introduction
[Figure: state diagram with susceptible, infectious, and removed states]
• “infectious” hosts: continuously infect others.
• “removed” hosts in epidemiology:
  • Recovered and immune to the virus.
  • Dead because of the disease.
• “removed” hosts in computer networks:
  • Patched computers that are clean and immune to the worm.
  • Computers that are shut down or cut off from the worm’s circulation.
Epidemic modeling introduction
• Homogeneous assumption:
  • Any host has an equal probability of contacting any other host in the system.
  • Number of contacts ∝ I × S
• Code Red propagation has the homogeneous property:
  • Direct connection via IP.
  • Uniform IP scanning.
Deterministic epidemic models: Simple epidemic model
[Figure: susceptible to infectious state transition; plot of I(t) versus t]
• State transition: susceptible → infectious
• N: population; S(t): susceptible hosts; I(t): infectious hosts; β: infection rate
    dI(t)/dt = β S(t) I(t)
    S(t) + I(t) = N
• I(t) and S(t) are symmetric.
• Problems:
  • Constant infection rate β.
  • No “removed” state.
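The simple epidemic model above can be explored numerically. The following is a minimal sketch, assuming β denotes the constant infection rate and using forward-Euler integration; the function name, step size, and any parameter values passed in are illustrative choices, not values taken from the slides.

```python
def simulate_simple_epidemic(n, i0, beta, hours, dt=0.01):
    """Forward-Euler integration of dI/dt = beta * S * I with S = N - I.

    n: total population N, i0: initially infectious hosts,
    beta: constant infection rate, hours: simulated time span,
    dt: step size in hours (an assumed default, not from the slides).
    Returns the list of I(t) values, one per step, starting at t = 0.
    """
    steps = int(round(hours / dt))
    i = float(i0)
    trajectory = [i]
    for _ in range(steps):
        s = n - i                  # S(t) + I(t) = N
        i += beta * s * i * dt     # dI/dt = beta * S * I
        trajectory.append(i)
    return trajectory


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the S-shaped growth curve.
    curve = simulate_simple_epidemic(n=360_000, i0=1, beta=1.8 / 360_000, hours=24)
    for hour in (0, 4, 8, 12, 16, 20, 24):
        print(hour, round(curve[hour * 100]))
```

Because the model has no removed state, I(t) can only grow until it reaches N, which is the limitation the slide lists under "Problems".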
Deterministic epidemic models: Kermack-McKendrick epidemic model
[Figure: susceptible → infectious → removed state transition; plot of I(t) versus t]
• State transition: susceptible → infectious → removed
• R(t): hosts removed from the infectious population; γ: removal rate
    dI(t)/dt = β S(t) I(t) − dR(t)/dt
    dR(t)/dt = γ I(t)
    S(t) + I(t) + R(t) = N
• Epidemic threshold:
  • No outbreak if S(0) < γ/β.
• Problems:
  • Constant infection rate β.
  • No removal from susceptible hosts.
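The Kermack-McKendrick equations and the epidemic threshold can be checked with a short simulation. This is a sketch under assumptions: β is the infection rate, γ the removal rate, integration is forward Euler, and the function names and sample values are hypothetical.

```python
def simulate_km(n, i0, beta, gamma, hours, dt=0.01):
    """Forward-Euler integration of the Kermack-McKendrick model.

    dI/dt = beta*S*I - dR/dt, dR/dt = gamma*I, S + I + R = N.
    Returns a list of (S, I, R) tuples, one per time step.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    history = [(s, i, r)]
    for _ in range(int(round(hours / dt))):
        new_infections = beta * s * i * dt   # hosts moving S -> I
        new_removals = gamma * i * dt        # hosts moving I -> R
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


def outbreak_possible(s0, beta, gamma):
    """Epidemic threshold: an outbreak needs S(0) > gamma / beta."""
    return s0 > gamma / beta


if __name__ == "__main__":
    # Illustrative values only: one run above and one below the threshold.
    for beta in (0.001, 0.0001):
        run = simulate_km(n=1000, i0=10, beta=beta, gamma=0.5, hours=48)
        peak = max(i for _, i, _ in run)
        print(beta, outbreak_possible(990, beta, 0.5), round(peak, 1))
```

When S(0) is below γ/β, each infectious host is removed faster than it infects others, so I(t) shrinks from the start.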
Code Red modeling: Consider human countermeasures
[Figure: state diagram with susceptible, infectious, and removed states]
• Human countermeasures:
  • Clean and patch: download cleaning programs and patches.
  • Filter: put filters on firewalls and gateways.
  • Disconnect computers.
• Reasons for countermeasures:
  • Suppress most new viruses/worms before they break out.
  • Eventually eliminate virulent viruses/worms.
• Result: removal of both susceptible and infectious hosts.
Code Red modeling: Consider human countermeasures
• Model (extended from the KM model):
  • Q(t): hosts removed from the susceptible population.
  • R(t): hosts removed from the infectious population.
  • I(t): infectious hosts.
  • J(t) ≡ I(t) + R(t): number of infected hosts, i.e., hosts that have ever been infected.
    dS(t)/dt = −β S(t) I(t) − dQ(t)/dt
    dR(t)/dt = γ I(t)
    dQ(t)/dt = μ S(t) J(t)
    S(t) + I(t) + R(t) + Q(t) = N
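The slide gives rates for S(t), R(t), and Q(t) but not for I(t). Assuming the rate symbols are β (infection), γ (removal of infectious hosts), and μ (removal of susceptible hosts), the rate for I(t) follows from differentiating the conservation law S(t) + I(t) + R(t) + Q(t) = N:

```latex
\begin{align*}
\frac{dI(t)}{dt}
  &= -\frac{dS(t)}{dt} - \frac{dR(t)}{dt} - \frac{dQ(t)}{dt} \\
  &= \Bigl[\beta S(t) I(t) + \frac{dQ(t)}{dt}\Bigr] - \gamma I(t) - \frac{dQ(t)}{dt} \\
  &= \beta S(t) I(t) - \gamma I(t)
\end{align*}
```

This has the same form as the Kermack-McKendrick equation for I(t); the countermeasure term Q(t) affects I(t) only indirectly, by shrinking S(t).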
Code Red modeling: Two-factor worm model
• The Code Red worm may have caused congestion:
  • Huge number of scan packets sent to unused IP addresses.
  • Routing table cache misses (about 30% of the IP space is used).
  • Generation of ICMP (router error) messages for invalid IP addresses.
  • Possible BGP instability.
• Effect: the worm's infection rate slows down and becomes time-dependent: β(t)
• Two-factor worm model:
    dS(t)/dt = −β(t) S(t) I(t) − dQ(t)/dt
    dR(t)/dt = γ I(t)
    dQ(t)/dt = μ S(t) J(t)
    β(t) = β₀ [1 − I(t)/N]
    S(t) + I(t) + R(t) + Q(t) = N
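The two-factor equations can be integrated numerically to see how the two factors (human countermeasures and a decreasing infection rate) shape the curve. The sketch below assumes β₀ is the initial infection rate, γ and μ are the removal rates for infectious and susceptible hosts, and β(t) = β₀[1 − I(t)/N] as written on the slide; the function name, the `congestion` switch, and the Euler step size are illustrative assumptions.

```python
def simulate_two_factor(n, i0, beta0, gamma, mu, hours, dt=0.01, congestion=True):
    """Forward-Euler integration of the two-factor worm model.

    Returns two lists sampled at every step: I(t) (currently infectious)
    and J(t) = I(t) + R(t) (hosts that have ever been infected).
    With congestion=False the infection rate stays at beta0.
    """
    s, i, r, q = float(n - i0), float(i0), 0.0, 0.0
    infectious, ever_infected = [i], [i + r]
    for _ in range(int(round(hours / dt))):
        beta = beta0 * (1.0 - i / n) if congestion else beta0
        j = i + r
        new_infections = beta * s * i * dt   # S -> I
        new_q = mu * s * j * dt              # S -> Q (patched before infection)
        new_r = gamma * i * dt               # I -> R (cleaned, patched, disconnected)
        s -= new_infections + new_q
        i += new_infections - new_r
        r += new_r
        q += new_q
        infectious.append(i)
        ever_infected.append(i + r)
    return infectious, ever_infected


if __name__ == "__main__":
    # Hypothetical parameters, not fitted to the Code Red data.
    i_t, j_t = simulate_two_factor(n=1000, i0=1, beta0=0.002, gamma=0.14,
                                   mu=0.0002, hours=24)
    print("peak infectious:", round(max(i_t)), "ever infected:", round(j_t[-1]))
```

Passing `congestion=False` reduces the model to the countermeasure-only extension of the Kermack-McKendrick model, which makes it easy to compare runs with and without the congestion factor.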
Validation of observed data on Code Red
• Local observation preserves the global worm propagation pattern.
• Network monitor:
  • Records Code Red scan traffic into the local network.
• The Code Red worm uniformly picked IP addresses to scan:
  • # of scans a site received ∝ size of the site's IP space.
  • # of scans a site received at time t ∝ overall scans in the Internet at t.
  • # of infectious hosts that sent scans to a site at time t ∝ overall infectious hosts in the Internet at t.
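The proportionality argument above can be made concrete for scan counts. This is a minimal sketch, assuming IPv4 and perfectly uniform scanning; the function name and inputs are hypothetical. It scales only scan counts: the number of distinct source IPs seen locally is treated on the slide as proportional to the overall number of infectious hosts, not as something that scales by the same factor.

```python
def estimate_global_scans(local_scans, prefix_len):
    """Scale scans seen by one network up to the whole IPv4 space.

    A network with a /prefix_len prefix covers 2**(32 - prefix_len) of the
    2**32 IPv4 addresses, so under uniform scanning it receives a fraction
    2**(-prefix_len) of all scans.
    """
    if not 0 <= prefix_len <= 32:
        raise ValueError("prefix_len must be between 0 and 32")
    return local_scans * 2 ** prefix_len


if __name__ == "__main__":
    # A /16 network sees 1/65536 of uniformly distributed scans.
    print(estimate_global_scans(local_scans=100, prefix_len=16))
```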
Observed data on Code Red worm
[Figures: number of source IPs and number of scans versus UTC hours (July 19-20)]
• Two independent Class B networks: x.x.0.0/16 (each 1/65536 of the IP space).
• Count the # of Code Red scan packets and source IPs for each hour.
  • This corresponds to infectious hosts I(t) at each hour, not infected hosts J(t) = I(t) + R(t).
• Uniform IP scanning ⇒ two networks, same results.
Code Red worm modeling: Simple epidemic modeling
[Figure: number of scans versus time; axes labeled UTC hours (July 19-20) and EDT hours (July 19)]
• Staniford et al. used the simple epidemic model approach.
• Conclusions from this model:
  • At around 20:00 UTC (16:00 EDT), Code Red had infected almost all susceptible hosts.
  • On average, each infected host infected 1.8 susceptible hosts per hour.
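The simple epidemic model has a closed-form logistic solution, which shows what a rate of 1.8 per hour implies. The sketch below assumes the infected fraction a(t) = I(t)/N obeys da/dt = K a (1 − a) with K = 1.8 per hour; `t_half` (the time at which half the population is infected) is a hypothetical parameter, since the slides do not give it.

```python
import math


def infected_fraction(t, k, t_half):
    """Closed-form solution a(t) of da/dt = k * a * (1 - a).

    t_half is the time at which a(t) = 0.5 (an assumed free parameter).
    """
    return 1.0 / (1.0 + math.exp(-k * (t - t_half)))


def hours_between(k, low, high):
    """Time the logistic curve needs to grow from fraction low to high."""
    def logit(a):
        return math.log(a / (1.0 - a))
    return (logit(high) - logit(low)) / k


if __name__ == "__main__":
    k = 1.8  # infections per infected host per hour, from the slide
    print("hours from 10% to 90% infected:", hours_between(k, 0.1, 0.9))
```

The growth window depends only on K, not on the population size, which is why a fitted constant rate leads to the conclusion that nearly all susceptible hosts were infected within a few hours.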
Code Red worm modeling: Simple epidemic modeling
• Possible overestimation?
• Issues with using the simple epidemic model for Code Red:
  • Constant infection rate: does not consider the impact of worm traffic.
  • No recovery: no removal from infectious hosts.
  • No patching before infection: no removal from susceptible hosts.
Code Red modeling numerical analysis: Two-factor model
[Figure: two-factor model]
• Conclusions:
  • At 20:00 UTC (16:00 EDT), 60% to 70% of hosts had been infected.
  • The simple epidemic model overestimates worm spreading.
  • γ = 0.14: 14% of infectious hosts would be removed after an hour.
Code Red modeling: If no congestion is considered
[Figure: model results if no congestion is considered]
• The congestion assumption is reasonable.
Summary
• We must consider the changing environment when we model virus/worm propagation:
  • Human countermeasures and changes in behavior.
  • Virus/worm impact on Internet infrastructure.
• Worm modeling limitations:
  • Models only the continuously spreading phase of a worm.
  • Assumes homogeneous systems.
• Future work: how to predict propagation before a worm's outbreak?
  • Determine the parameters of a virus/worm model.