Automated Worm Fingerprinting. Authors: Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage. Published in: OSDI '04. Presenter: YanYan Wang. Introduction. Recent large-scale Internet worms pose a profound threat. Traditional detection methods are usually expensive and slow.
Automated Worm Fingerprinting Authors: Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage Publish: OSDI'04. Presenter: YanYan Wang
Introduction • Recent large-scale Internet worms pose a profound threat. • Traditional detection methods are usually expensive and slow. • This paper investigates “EarlyBird”, a method that automatically detects and contains new worms on the network using precise signatures.
Existing Detection Techniques • Scan detection • Example: Code Red. • Network telescope: passive network monitors that observe large ranges of unused, yet routable, address space. • Assumption: worms select target victims at random. • Limitation: not suited to worms that spread non-randomly.
Existing Detection Techniques • Honeypots • Monitor idle hosts with unpatched vulnerabilities. • Limitations: require a significant amount of slow manual analysis and depend on the honeypot being infected quickly.
Existing Detection Techniques • Behavioral techniques at end hosts • Dynamically analyze patterns of system calls for anomalous activity. • Limitations: expensive, and only detect attacks against a single host.
Characterization • A priori vulnerability signatures: match known exploitable vulnerabilities in deployed software. • Automated signature extraction: lets decoy programs be infected in a controlled environment, extracts the infected portions, and identifies invariant code strings. • Autograph: automated signature generation from network traffic (compare EarlyBird).
Containment • Goal: slow or stop the spread of an active worm. • Host quarantine: prevents an infected host from communicating with other hosts. • String matching: matches network traffic against particular strings, or signatures. • Connection throttling: limits the rate of all outgoing connections made by a machine; this slows the spread but does not stop it.
Worm Behavior • Content invariance: the worm program is identical across every host it infects, though some worms have limited polymorphism. • Content prevalence: content that is not prevalent is not useful for constructing signatures. • Address dispersion: the number of infected hosts grows over time.
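Finding Worm Signature: Content Sifting • For each network packet: • Extract the content and process its substrings. • Index each substring into a prevalence table. • Each table entry includes the source and destination IP addresses seen with that substring. • Sort the table by substring count and by the size of the address lists.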
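The steps on this slide can be sketched as a naive, memory-hungry version of content sifting. This is only an illustration under stated assumptions: the names `PrevalenceEntry` and `sift` and both threshold values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class PrevalenceEntry:
    """One prevalence-table row: a repeat count plus the addresses seen."""

    def __init__(self):
        self.count = 0
        self.sources = set()
        self.destinations = set()


def sift(packets, k=40, prevalence_threshold=3, dispersion_threshold=3):
    """Return substrings that are both prevalent and widely dispersed.

    packets is an iterable of (src_ip, dst_ip, payload_bytes) tuples.
    Both thresholds are illustrative assumptions.
    """
    table = defaultdict(PrevalenceEntry)
    for src, dst, payload in packets:
        # Count each distinct k-byte substring once per packet.
        substrings = {payload[i:i + k] for i in range(len(payload) - k + 1)}
        for sub in substrings:
            entry = table[sub]
            entry.count += 1
            entry.sources.add(src)
            entry.destinations.add(dst)
    suspects = [
        (sub, entry) for sub, entry in table.items()
        if entry.count >= prevalence_threshold
        and len(entry.sources) >= dispersion_threshold
        and len(entry.destinations) >= dispersion_threshold
    ]
    # Most frequently repeated content first.
    suspects.sort(key=lambda item: item[1].count, reverse=True)
    return suspects
```

Keeping a full address set per substring is what makes this version impractical at line rate, which motivates the memory-saving techniques on the following slides.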
Finding Worm Signature: Content Sifting • Problem: a full prevalence table consumes a huge amount of memory. • Approach: multistage filters.
Finding Worm Signature: Content Sifting • Address dispersion: trade precision for dramatic reductions in memory requirements. • Example: to count up to 64 sources using 32 bits, one might hash sources into a space from 0 to 63 yet only set bits for values that hash between 0 and 31, thus ignoring half of the sources.
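A minimal sketch of the 32-bit example above. The class name `HalfSampledBitmap`, the use of SHA-256 as the hash, and the simple "set bits times two" estimate are assumptions made for illustration; the paper's actual estimator may differ.

```python
import hashlib


class HalfSampledBitmap:
    """Approximate count of distinct sources using a 32-bit bitmap.

    Sources hash into 0..63, but only hash values 0..31 set a bit,
    so roughly half of the sources are ignored.
    """

    def __init__(self, bits=32, space=64):
        self.bits = bits
        self.space = space
        self.bitmap = 0

    def _hash(self, address):
        digest = hashlib.sha256(address.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.space

    def add(self, address):
        h = self._hash(address)
        if h < self.bits:  # values 32..63 are deliberately dropped
            self.bitmap |= 1 << h

    def estimate(self):
        # Scale the number of set bits back up by the sampling ratio (here 2).
        set_bits = bin(self.bitmap).count("1")
        return set_bits * self.space // self.bits
```

Adding the same address repeatedly never changes the estimate, so the structure counts distinct sources rather than packets.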
Finding Worm Signature: Content Sifting • Processing every payload substring is expensive: use value sampling. • Select only those substrings whose fingerprint matches a certain pattern. • Example: if $f$ is the fraction of substrings tracked (e.g. $f = 1/64$ if we track the substrings whose Rabin fingerprint ends in six 0s) and $\beta$ is the substring length, then the probability of detecting a worm with a signature of length $x$ is $p_{\text{track}}(x) = 1 - e^{-f(x - \beta + 1)}$.
Finding Worm Signature: Content Sifting • If $f = 1/64$ and the substring length $\beta = 40$, the probability of tracking a worm with a signature of 100 bytes is 55%, but for a worm with a signature of 200 bytes it increases to 92%, and for 400 bytes to 99.64%.
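Value sampling as described on these slides might look like the sketch below. It uses a simple polynomial rolling hash as a stand-in for a true Rabin fingerprint; `BASE`, `MOD`, and all function names are assumptions chosen for illustration.

```python
BASE = 257
MOD = (1 << 61) - 1  # a Mersenne prime; stand-in modulus, not from the paper


def direct_fingerprint(substring):
    """Reference (non-rolling) fingerprint of one substring."""
    fp = 0
    for byte in substring:
        fp = (fp * BASE + byte) % MOD
    return fp


def rolling_fingerprints(payload, k=40):
    """Yield (offset, fingerprint) for every k-byte substring of payload."""
    if len(payload) < k:
        return
    high = pow(BASE, k - 1, MOD)  # weight of the byte that leaves the window
    fp = direct_fingerprint(payload[:k])
    yield 0, fp
    for i in range(k, len(payload)):
        # Drop the oldest byte, shift, and append the newest byte.
        fp = ((fp - payload[i - k] * high) * BASE + payload[i]) % MOD
        yield i - k + 1, fp


def sampled_substrings(payload, k=40, zero_bits=6):
    """Keep only substrings whose fingerprint ends in zero_bits zero bits."""
    mask = (1 << zero_bits) - 1
    return [(off, fp) for off, fp in rolling_fingerprints(payload, k)
            if fp & mask == 0]
```

With `zero_bits=6`, roughly one substring in 64 is expected to be tracked, which corresponds to the sampling fraction used in the slide's example.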
EarlyBird • As each packet arrives, its content (or substrings of its content) is hashed, and the protocol identifier and destination port are appended to the hash to produce a content hash code. • 32-bit cyclic redundancy check (CRC) as the packet hash • 40-byte Rabin fingerprints for substring hashes
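One way to picture the content hash code is the sketch below. The exact bit layout (CRC in the high bits, then an 8-bit protocol number and a 16-bit port) and the function name `content_hash` are assumptions made here; the slides only state that the protocol identifier and destination port are appended to the hash.

```python
import zlib


def content_hash(payload: bytes, protocol: int, dst_port: int) -> int:
    """Combine a 32-bit CRC of the payload with protocol and destination port.

    Assumed layout: [32-bit CRC][8-bit protocol][16-bit destination port].
    """
    if not 0 <= protocol <= 0xFF:
        raise ValueError("protocol must fit in 8 bits")
    if not 0 <= dst_port <= 0xFFFF:
        raise ValueError("dst_port must fit in 16 bits")
    crc = zlib.crc32(payload)  # unsigned 32-bit value in Python 3
    return (crc << 24) | (protocol << 16) | dst_port
```

Because the port and protocol are part of the key, identical bytes sent to different services produce different content hash codes and are counted separately.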
EarlyBird • If the content hash is not found in the dispersion table, it is indexed into the content prevalence table. • Four independent hash functions create indexes into four counter arrays.
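The lookup order on this slide can be sketched as follows. The promotion rule (move a key to the dispersion table once all four of its counters reach a threshold), the threshold value, the array size, and the names `MultistageFilter` and `process` are assumptions for illustration, not the authors' implementation.

```python
import hashlib


class MultistageFilter:
    """Four counter arrays, each indexed by its own hash of the content key."""

    def __init__(self, stages=4, counters_per_stage=1024):
        self.stages = stages
        self.counters_per_stage = counters_per_stage
        self.counters = [[0] * counters_per_stage for _ in range(stages)]

    def _index(self, stage, key):
        # Prefixing the stage number yields a different hash per stage.
        digest = hashlib.sha256(bytes([stage]) + key).digest()
        return int.from_bytes(digest[:8], "big") % self.counters_per_stage

    def increment(self, key):
        """Bump the key's counter in every stage and return the smallest one."""
        values = []
        for stage in range(self.stages):
            idx = self._index(stage, key)
            self.counters[stage][idx] += 1
            values.append(self.counters[stage][idx])
        return min(values)


def process(key, src, dst, prevalence, dispersion, threshold=3):
    """Handle one content key (bytes) observed from src to dst."""
    if key in dispersion:
        # Already considered prevalent: just track address dispersion.
        dispersion[key]["sources"].add(src)
        dispersion[key]["destinations"].add(dst)
    elif prevalence.increment(key) >= threshold:
        # All stages agree the key is frequent: start tracking addresses.
        dispersion[key] = {"sources": {src}, "destinations": {dst}}
```

A key is only promoted when every stage's counter is high, so an unrelated key that happens to collide in one array is unlikely to be promoted by mistake.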
Prototype System: EarlyBird • Sensor: sifts through traffic on configurable address-space “zones” of responsibility and reports anomalous signatures. • Aggregator: coordinates real-time updates from the sensors, coalesces related signatures, activates any network-level or host-level blocking services, and is responsible for administrative reporting and control. • Single-threaded, executes at user level, and captures packets using the libpcap library.
What’s the paper’s contribution? • A combination of existing and novel algorithms for content sifting • Low memory and CPU requirements
What’s the paper’s weakness? • Depends on invariant content • Attackers can design worms with variant content • Attackers can evade detection by creating metamorphic worms and by using traditional IDS evasion techniques • Assumes a maximum growth time • Attackers can abuse automated containment by deliberately triggering the worm defense.
How to improve the paper? • Hybrid pattern matching: separate non-code strings from potential exploits • Investigate traffic normalization • Maintain triggering data across multiple time scales • Develop efficient mechanisms for comparing signatures with an existing traffic corpus