ICSI Honeyfarm Status • Weidong Cui, Vern Paxson, Nicholas Weaver
General Concept: A Breeding Ground for Worms • We want a controlled, automatic breeding ground for worms and other self-propagating attacks: • Worm attacks a "monitored" address and begins to propagate in our system • As the worm propagates, we have a suite of automatic analyzers to study the worm • What can it infect? • Any particulars of interest? • How does it attack? • And automatically analyze defense strategies • Does this signature block the worm? • All within a very short time: a few seconds • And with a single point of trust for exporting information • Also want to leverage the infrastructure for detecting other things: • Human attackers/non-self-propagating attacks • Non-random worms
Honeyfarm: Objectives • We use network telescopes and a honeyfarm to detect scanning worms • Network Telescopes • Distributed unallocated IP address ranges • Honeyfarm: • Centralized cluster of honeypots • On-demand: emulating a large number of hosts on a small number of honeypots • Detecting self-propagation • Detect self-propagation inside the honeyfarm by redirecting propagations from one honeypot to other honeypots • Other detectors possible: • Tripwire/modification detectors • Monitored honeypots, etc. • [Figure: honeyfarm diagram with labels Global Internet, Controller, Honeypots]
The Overall Goal • Framework for automatically detecting and analyzing new worms and other attacks • For self-propagating attacks, we want to generate: • Vulnerability signatures: What is vulnerable • Behavior signatures: What the worm needs to propagate • Attack signatures: Signatures which detect and block the attack • All signatures should be verified for effectiveness • For non-self-propagating attacks, as much of the above as possible • Based on providing a fertile ground for constrained propagation • Receive data from multiple sources • Small distributed telescopes, Large telescopes • Spam, Crawling? • For a RANDOM worm, with k addresses, V victims, and M systems infected: • $P_{\text{detect}} = 1 - \left(\frac{V-k}{V}\right)^{M}$ after M machines infected • High probability of detection when M = V/k
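The detection formula on this slide can be evaluated directly. The following is a minimal sketch that uses the slide's symbols k, V, and M unchanged; the function name and the sample values (a 2^32 address space and 2^18 monitored addresses) are illustrative assumptions, not figures from the talk.

```python
def p_detect(k: int, v: int, m: int) -> float:
    """Evaluate P_detect = 1 - ((V - k) / V) ** M with the slide's symbols."""
    if v <= 0 or not 0 <= k <= v or m < 0:
        raise ValueError("need V > 0, 0 <= k <= V, and M >= 0")
    return 1.0 - ((v - k) / v) ** m


if __name__ == "__main__":
    v, k = 2**32, 2**18  # illustrative values only (assumption)
    for m in (1, v // k // 10, v // k, 3 * (v // k)):
        print(f"M={m:6d}  P_detect={p_detect(k, v, m):.3f}")
```

When k is much smaller than V, the expression at M = V/k is close to 1 - 1/e, roughly 0.63, and it approaches 1 as M grows beyond that point.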
ICSI's Honeyfarms • Honeyfarm Safety • ICSI's features: • Windows Centric • Hot Telescope • Replay • Replay-based filtering • Spam Telescope • The Main ICSI Honeyfarm • Other possibilities: • "Run this" Wormholes
ICSI Focus: Windows • Microsoft Windows is our primary (currently only) hosted OS • This requirement dictates VM choice: • VMWare Workstation or ESX Server • Workstation: prototyping • Limited scalability • Runs on everything • ESX Server: production • Stringent hardware requirements • Memory sharing for (some) scalability • Could be better • But can work across multiple close variants due to coalescing • For now, NO host-OS specific customization • Dictates mechanism for demand allocation: NAT, instead of customization • Allows the possibility of non-virtual honeypots as well • Apple systems?
ICSI's Architecture • [Figure: architecture diagram. Labels: Attacker, Network Telescope, GRE Tunnel, Honeyfarm; Policing, Filtering, Mapping, Containment, Detection; VManager, VM Clusters]
Note on Architecture • Most components implemented in Click • Provides a modular, reusable framework • Components in red we want to merge with UCSD • Need to better coordinate in this area • Relatively low overlap so far, but need
Safety: A Common Focus of Both UCSD and ICSI • What if a worm propagates through the honeyfarm and then infects somebody else? • "But they would get infected anyway" doesn't cut it… • Two safety features: • Containment: the basic decision making on what is allowed outbound • Connections back to the infecting host • Some "phone-home" channels may also be allowed • Much malcode/attacks grab code from a third-party site • An independent policing module • Shuts down the honeyfarm once it detects any abnormal behavior on outbound connections • This is a safety belt; it should NEVER actually be invoked • Want a third safety feature as well: • A monitoring system which observes the control plane • Has the ability to turn off the honeyfarm by power-sequencing the network connections • Many more details on policies in UCSD's talk
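The containment decision described on this slide can be sketched as a small policy function. This is a minimal sketch, not the honeyfarm's actual policy: the function name, the "allow"/"redirect" return values, and the data structures are hypothetical, and treating every other outbound connection as a redirect to another honeypot is an assumption borrowed from the self-propagation detection described on the objectives slide.

```python
def containment_decision(honeypot, dst, infectors, phone_home_allowlist=frozenset()):
    """Decide what to do with an outbound connection from a honeypot.

    infectors maps a honeypot id to the set of hosts that attacked it.
    phone_home_allowlist is a hypothetical set of permitted third-party hosts.
    """
    # Connections back to the infecting host are allowed.
    if dst in infectors.get(honeypot, set()):
        return "allow"
    # Some "phone-home" channels may also be allowed.
    if dst in phone_home_allowlist:
        return "allow"
    # Assumption: anything else stays inside the honeyfarm.
    return "redirect"


if __name__ == "__main__":
    infectors = {"hp1": {"203.0.113.9"}}
    print(containment_decision("hp1", "203.0.113.9", infectors))
    print(containment_decision("hp1", "198.51.100.7", infectors))
```

Keeping the decision in one pure function makes it easy for a separate policing module to audit the same inputs independently.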
The Telescopes • We have 4 /16s arranged as two (almost) contiguous /15s belonging to ESNet… • Network is directly advertised and routed by ESNet • But we also have, on loan, a "special" /23 netblock • Also advertised and routed by ESNet • Much malcode is NOT random: • Linear scanners starting from the local address: • Blaster and others • Local subnet preference • Nimda, etc • By selecting highly-likely addresses, we can gain an advantage in detection time • Local subnet preferences in particular have proven very effective
Filtering • But we can't allow all communication: • Honeypot allocation/deallocation is very expensive for us • VMWare doesn't support a lightweight clone • We want to filter out known threats • But we still want to detect new attacks for existing vulnerabilities • We want to detect Welchia as well as Blaster: • New attacks may require new signatures • New variants may be substantially more disruptive • And we would like to avoid identification by attackers as a honeypot system • Thus we need a low-cost mechanism to say whether an attack is worth forwarding to a real honeypot
Basic Filtering • Scan filtering • Allow traffic to the first N destinations from a source. • Intuition: Scans from a source are homogeneous • Init-Data filtering • Detect known attacks by looking at the first data transfer from a source • Intuition: Many simple attacks (e.g., CodeRed, Blaster, Slammer) can be filtered. • Scheme: Acknowledge SYNs and any data packets that follow • University of Michigan scheme • Is this enough? • Far too many active sources on the Internet • No, many attacks require complicated "conversations" before exposing their unique malicious intent • See Pang et al. "Characterizing Internet Background Radiation" • Application-level responders are expensive in terms of development • Also, can't do "cut-through forwarding" if the attack deviates from the known script • Our idea: replay-based filtering
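The scan-filtering rule above (forward traffic only to the first N destinations contacted by each source) can be sketched as follows. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, addresses are plain strings, and state is never expired, which a real filter would need to handle.

```python
from collections import defaultdict


class ScanFilter:
    """Forward traffic only to the first N distinct destinations per source."""

    def __init__(self, max_destinations):
        self.max_destinations = max_destinations
        self._seen = defaultdict(set)  # source -> destinations already admitted

    def allow(self, src, dst):
        seen = self._seen[src]
        if dst in seen:
            return True  # already admitted for this source
        if len(seen) < self.max_destinations:
            seen.add(dst)
            return True
        return False  # source has used up its N destinations


if __name__ == "__main__":
    f = ScanFilter(max_destinations=2)
    for dst in ("10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.1"):
        print(dst, f.allow("198.51.100.7", dst))
```

The per-source set keeps repeat traffic to an admitted destination flowing while capping how many honeypots a single scanner can consume.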
Application-Independent Replay • Positively identifying a probe as coming from a known or unknown source requires a complex dialog • E.g., Windows SMB file transfer • We can't build target-specific responders • Too many variants and new targets • Can we use an existing dialog as a script for replaying an application session? • Take one or two instances of a dialog • E.g., a recorded attack by a particular worm against one of our honeypots • Recognize certain idioms: • Addresses, ports, and names encoded in the dialog • Ports which open for subsequent transfers • "Cookies" or session identifiers • Length fields • Prestated arguments • Then use the current interaction as a guide • Update ports/addresses/subsequent connections as appropriate • Mimic back cookies and other changes
Responder-Side Replay • [Figure: original flow (Attacker, Victim) and replay flow (Attacker, Filter), each with messages 1-5; outcomes labeled "Infected!" and "Detected!"]
Replay Status • This works for single dialogs • For both the initiator (client) and responder (server) • Tested with: • NFS file manipulation • FTP file transfer • Including changing the filename argument for the client • CIFS/SMB file transfers • The Blaster worm • W32.Randex.D worm • Performs attack through open file shares • Currently expanding to support multiple, simultaneous dialogs • Primarily for server-side replay to act as a radiation filter • Possibility: Recognize commands by where dialogs diverge? • Also desire replay for: • "Toxicology Screen": For this attack, what can get infected • Testing network devices, evaluating servers, interacting with Internet servers for measurement purposes…
Replay-Based Filters • There are 1700 different application dialogs among 143224 connections to port 445/tcp • Connections to active honeypots • Used tethereal to generate a one-line summary for each data packet • Formulated each dialog in a canonical format • Want to ignore anything in the "known" dialogs set, while allowing anything in the "unknown" set • So use replay: • Replay as the server with the group of known dialogs • If replay successful, classify and ignore that source • If replay fails, begin replaying the new dialog against a honeypot as the client • Using the previous dialog as the starting script • Also, mark source as unknown and allow it to contact a live honeypot if seen again
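The known/unknown classification step can be sketched as a prefix match over canonical dialogs. This is a deliberately simplified sketch: it assumes each dialog is a tuple of one-line message summaries and models "replay succeeds" as an exact match at every step, whereas the real replay adapts cookies, ports, and lengths. The function name and the summary strings are hypothetical placeholders, not actual tethereal output.

```python
def classify_source(known_dialogs, observed):
    """Return "known" if the observed messages follow a known dialog to its end."""
    candidates = list(known_dialogs)
    for step, summary in enumerate(observed):
        # Keep only known dialogs that still agree with the source at this step.
        candidates = [d for d in candidates if step < len(d) and d[step] == summary]
        if not candidates:
            return "unknown"  # replay failed: the dialog diverged from every script
    # The source is known only if some script was followed all the way through.
    complete = any(len(d) == len(observed) for d in candidates)
    return "known" if complete else "unknown"


if __name__ == "__main__":
    known = [
        ("negotiate", "session-setup", "tree-connect"),      # placeholder summaries
        ("negotiate", "session-setup", "open-pipe", "write"),
    ]
    print(classify_source(known, ("negotiate", "session-setup", "tree-connect")))
    print(classify_source(known, ("negotiate", "session-setup", "delete")))
```

A source classified as "unknown" would then be handed to initiator-side replay against a honeypot, as the slide describes.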
[Figure: Attacker, Filter, and VM; responder-side replay with a "Known?" decision, then initiator-side replay of messages 1-5 against the VM, ending in "Infected!"]
The Spam Telescope • Half of the emails to @acme.com are sent to our email server • 100,000 messages per day • 6000 unique executables in 4 days • We implemented a real-time process to parse emails and retrieve attachments • Hash attachments to gain some statistics • We plan to run attached executables on our honeypots to detect new email worms or multimode worms • Use email to penetrate the firewall, then exploit with local exploits
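The parse-and-hash step can be sketched with Python's standard `email` and `hashlib` modules. This is a minimal sketch, not the process used on the spam telescope: the function name is hypothetical, and SHA-256 is an assumed choice since the slide does not say which hash is used.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def attachment_hashes(raw_message):
    """Return (filename, sha256 hex digest) for each attachment in a raw email."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    results = []
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""  # decoded attachment bytes
        name = part.get_filename() or "(unnamed)"
        results.append((name, hashlib.sha256(payload).hexdigest()))
    return results


if __name__ == "__main__":
    # Build a small test message; the attachment bytes are made up.
    m = EmailMessage()
    m["Subject"] = "demo"
    m.set_content("see attachment")
    m.add_attachment(b"MZ\x90\x00", maintype="application",
                     subtype="octet-stream", filename="sample.exe")
    for name, digest in attachment_hashes(m.as_bytes()):
        print(name, digest)
```

Counting distinct digests over a window of mail is one way to arrive at a "unique executables" statistic like the one on the slide.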
The Main Honeyfarm • Located at LBNL in ESNet's machine room • Designed around HP DL360 G4 1U, dual processor servers • Currently: • 1 server as "head unit" • Previous head was a DL380, but suffered a catastrophic motherboard failure • 7 servers running ESX for honeypots • Near term expansion (next couple of weeks) • Convert one ESX server into raw Linux for processing acme.com email • Attach 3 TB disk array for tertiary storage • Add 6 more 1U servers • Add a redundant switch • Increase the disk space on the existing servers • Generous support from: • ESNet: Network connectivity and rackspace • Hewlett Packard: Equipment • Microsoft: OS and software licenses • VMWare: VMWare licenses
Possibility: The "Run This" Wormholes • We also want small, easy-to-use endpoints: • Distributed secrets • Endpoints in LANs • Nonblacklistable endpoints for crawlers • Our plan is to create a "Run This" endpoint in Click • Creates a new MAC address derived from the host's MAC • Obtains a DHCP lease • Opens a GRE tunnel to the specified honeyfarm • All traffic is forwarded through the tunnel • Outgoing traffic is strongly policed by the "Run This" module: • Limited fanout • No contacting local addresses • What to do about LAN broadcast packets? • Goal is an easy-to-use and trustable endpoint • Which does not trust the honeyfarm.
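The outgoing-traffic policing described above (limited fanout, no contact with local addresses) can be sketched in a few lines. This is a minimal sketch in Python rather than Click, and everything in it is an assumption for illustration: the class name, the fanout limit, and the example networks are hypothetical, and the open question about LAN broadcast packets is not addressed.

```python
import ipaddress


class OutboundPolicer:
    """Police outgoing traffic: block local addresses and cap distinct destinations."""

    def __init__(self, local_networks, max_fanout):
        self.local_networks = [ipaddress.ip_network(n) for n in local_networks]
        self.max_fanout = max_fanout
        self._contacted = set()  # distinct destinations admitted so far

    def allow(self, dst):
        addr = ipaddress.ip_address(dst)
        # No contacting local addresses.
        if any(addr in net for net in self.local_networks):
            return False
        if addr in self._contacted:
            return True
        # Limited fanout: refuse new destinations once the cap is reached.
        if len(self._contacted) >= self.max_fanout:
            return False
        self._contacted.add(addr)
        return True


if __name__ == "__main__":
    policer = OutboundPolicer(["192.168.1.0/24"], max_fanout=2)
    for dst in ("192.168.1.5", "198.51.100.1", "198.51.100.2", "198.51.100.3"):
        print(dst, policer.allow(dst))
```

Enforcing these checks on the endpoint itself, rather than at the honeyfarm, matches the stated goal of an endpoint that does not have to trust the honeyfarm.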