Global Intrusion Detection Using Distributed Hash Tables
Jason Skicewicz, Laurence Berland, Yan Chen
Northwestern University, 6/2004
Current Architecture
• Intrusion Detection Systems
  • Vulnerable to attack
  • Many false alarms
  • Limited network view
  • Varying degrees of intelligence
• Centralized Data Aggregation
  • Generally done manually
  • Post-mortem global view
  • Not real time!
Sensor Fusion Centers
• Sensor fusion centers (SFCs) aggregate information from sensors throughout the network
  • More global view
  • Larger information pool
• Still vulnerable to attack
• Potential overload if multiple attacks arrive simultaneously
• Can't we leverage all the participants?
Distributed Fusion Centers
• Different fusion centers for different anomalies
• An attacker must hit all fusion centers, or learn the fusion center assignments
• Still must be manually set up and routed to
• What if things were redundant and self-organizing?
What is a DHT?
• A DHT, or Distributed Hash Table, is a peer-to-peer system in which the location of a resource or file is found by hashing its key
• DHTs include Chord, CAN, Pastry, and Tapestry
• A DHT attempts to spread the keyspace across as many nodes as possible
• Different DHTs use different topologies
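A minimal sketch (not from the slides) of the core DHT idea: hash a key into the keyspace and map it to a node. The function name and the naive modulo placement are illustrative assumptions; real DHTs like Chord or CAN use consistent hashing or coordinate spaces so nodes can join and leave without remapping every key.

```python
import hashlib

def dht_node_for_key(key: str, nodes: list[str]) -> str:
    """Map a key to a node by hashing it into the keyspace.

    Hypothetical helper for illustration: a real DHT routes
    toward the responsible node hop by hop rather than doing
    a global modulo lookup.
    """
    digest = hashlib.sha1(key.encode()).digest()
    point = int.from_bytes(digest, "big")
    return nodes[point % len(nodes)]

print(dht_node_for_key("worm-probe-445", ["n0", "n1", "n2", "n3"]))
```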
CAN
• CAN is based on a multi-reality, n-dimensional toroid for routing (Ratnasamy et al.)
CAN
• Each reality is a complete toroid, providing full redundancy
• The network covers the entire address space, splitting the space dynamically as nodes join
• Messages route across the CAN, so a sensor need not connect directly to the fusion center
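A rough sketch of greedy CAN-style forwarding on a 2-dimensional unit torus, assuming each node knows its neighbors' zone centers. The coordinate representation and function names are assumptions for illustration, not the paper's implementation.

```python
import math

def torus_dist(a, b, dims=2):
    """Distance on the unit d-torus: each coordinate wraps at 1.0."""
    total = 0.0
    for i in range(dims):
        d = abs(a[i] - b[i])
        total += min(d, 1.0 - d) ** 2
    return math.sqrt(total)

def next_hop(neighbors, dest):
    """Greedy CAN forwarding: pick the neighbor whose zone center
    is closest (on the torus) to the destination point."""
    return min(neighbors, key=lambda n: torus_dist(n, dest))

# Toy example: the wrap-around makes (0.0, 0.2) the best hop
# toward (0.9, 0.1), even though it looks "far" in flat space.
neighbors = [(0.4, 0.2), (0.2, 0.4), (0.0, 0.2)]
print(next_hop(neighbors, (0.9, 0.1)))
```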
GIDS over DHT
• Fusion centers are organized on a distributed hash table
  • Peer-to-peer
  • Self-organized
  • Decentralized
  • Resilient
• We use the Content Addressable Network (CAN)
  • Highly redundant
  • N-dimensional toroid enhances reachability
[Diagram: an infected machine on the Internet sends a worm probe; the NIDS and the IDS on the probed host report to the fusion center, and the peer-to-peer CAN directs the reports to the appropriate fusion center]
Reporting Information
• Fusion centers need enough information to make reasonable decisions
• IDSes all have different proprietary reporting formats
• Fusion centers would be overloaded with data if full packet dumps were sent
• We need a concise, standardized format for reporting anomalies
Symptom Vector
• A standardized set of information reported to fusion centers
• Plugins could be written for existing IDSes to produce these vectors and connect to the CAN
• Flexibility for reporting more details
Symptom Vector
<src_addr, dst_addr, proto, src_port, dst_port, payload, event_type, lower_limit, upper_limit>
• Payload: a descriptor of the actual packet payload, most useful for worms. The two choices we have considered so far are a hash of the contents or the size in bytes
• Event_type: a code specifying an event type, such as a worm probe or a SYN flood
• Lower_limit, upper_limit: two numerical fields, interpreted according to event_type, available for the reporting IDS to provide more information
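A possible encoding of the symptom vector as a Python structure. The field names follow the slide's 9-tuple; the types, event codes, and example values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class SymptomVector:
    """One anomaly report sent from an IDS to a fusion center."""
    src_addr: str
    dst_addr: str
    proto: str          # e.g. "tcp", "udp"
    src_port: int
    dst_port: int
    payload: int        # hash of the payload, or its size in bytes
    event_type: int     # e.g. 1 = worm probe, 2 = SYN flood (hypothetical codes)
    lower_limit: float  # event_type-specific extra information
    upper_limit: float

report = SymptomVector("10.0.0.5", "192.168.1.9", "tcp",
                       31337, 445, 383, 1, 0.0, 0.0)
```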
Payload Reporting
• Hash: a semi-unique string produced by mathematical transformations on the content
  • Uniquely identifies the content
  • Cannot easily be matched on "similarity," so polymorphic worms are hard to spot
• Size: the number of bytes the worm occupies
  • Non-unique: two worms could be the same size, though we are researching how often that actually occurs
  • Much easier to spot polymorphism: simple changes cause no change, or only a small change, in size
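A small sketch contrasting the two payload descriptors. SHA-1 stands in for whatever hash the system would use, and the toy worm bytes are invented for illustration: a one-byte mutation changes the hash completely but leaves the size untouched.

```python
import hashlib

def payload_descriptors(payload: bytes):
    """Return the slide's two candidate descriptors: a content hash
    (exact match, breaks under polymorphism) and the size in bytes
    (coarser, but stable under small mutations)."""
    content_hash = hashlib.sha1(payload).hexdigest()
    size = len(payload)
    return content_hash, size

worm_a = b"\x90" * 376 + b"exploit"
worm_b = b"\x90" * 375 + b"\x41" + b"exploit"   # one-byte polymorphic mutation
print(payload_descriptors(worm_a))  # different hash from worm_b, same size (383)
print(payload_descriptors(worm_b))
```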
Routing Information
• A DHT is traditionally a peer-to-peer file-sharing network
  • Locates content based on name, hash, etc.
  • Not traditionally used to locate resources
• We develop a routing vector in place of traditional DHT addressing methods, and use it to locate the appropriate fusion center(s)
Routing Vector
• Based on the anomaly type
• Generalized so that similar anomalies go to the same fusion center, while disparate anomalies are distributed across the network for better resource allocation
• Worm routing vector: <dst_port, payload, event_type, lower_limit, upper_limit>
Routing Vector
• The worm routing vector omits less relevant fields such as the source port and IP addresses
• Designed to use only information that will be fairly consistent across any given worm
• Used to locate the fusion center, which then receives the full symptom vector for detailed analysis
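A hedged sketch of deriving the worm routing vector from a symptom vector and hashing it to a point in the CAN's coordinate space. The SHA-1-based torus mapping is an assumption for illustration, not the authors' exact scheme.

```python
import hashlib
from collections import namedtuple

# Reusing the symptom-vector fields from the earlier sketch.
SV = namedtuple("SV", "src_addr dst_addr proto src_port dst_port "
                      "payload event_type lower_limit upper_limit")

def routing_vector(sv):
    """Project a symptom vector onto the routing fields: per the
    slide, drop volatile fields (source port, IP addresses) and
    keep what is stable across instances of the same worm."""
    return (sv.dst_port, sv.payload, sv.event_type,
            sv.lower_limit, sv.upper_limit)

def can_point(rvec, dims=2):
    """Hash a routing vector to a point on the CAN's unit d-torus,
    so identical worms land at the same fusion center."""
    digest = hashlib.sha1(repr(rvec).encode()).digest()
    return tuple(int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2**32
                 for i in range(dims))

sv = SV("10.0.0.5", "192.168.1.9", "tcp", 31337, 445, 383, 1, 0.0, 0.0)
print(can_point(routing_vector(sv)))
```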
Size and the Boundary Problem
• Assume a CAN with several nodes, each allocated a range of sizes, say in blocks of 1000 bytes
• Assume node A has range 4000-5000 and node B has range 5000-6000
• If a polymorphic worm's size ranges between 4980 and 5080, the information is split between the two nodes
• Solution? Replicate information across the boundary: node A sends copies of anything with size > 4900 to node B, and node B sends anything with size < 5100 to node A
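One way the cross-boundary replication could look in code. The block size, the 100-byte overlap, and the function name are assumptions drawn from the slide's example numbers.

```python
def owners_for_size(size, block=1000, overlap=100):
    """Return every node range that should receive a report of this
    size, replicating near block boundaries per the slide's fix."""
    ranges = [(size // block) * block]          # the primary owner's range
    if size % block >= block - overlap:          # near the upper boundary
        ranges.append(ranges[0] + block)
    if size % block < overlap and size >= block:  # near the lower boundary
        ranges.append(ranges[0] - block)
    return [(lo, lo + block) for lo in ranges]

print(owners_for_size(4980))  # [(4000, 5000), (5000, 6000)]
print(owners_for_size(5080))  # [(5000, 6000), (4000, 5000)]
```

With this overlap, both ends of the 4980-5080 polymorphic range reach both node A and node B, so neither fusion center sees only half the picture.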
To DHT or Not to DHT
• The DHT automatically organizes everything for us
• The DHT ensures anomalies are somewhat spread out across the network
• The DHT routes in real time, without substantial prior knowledge of the anomaly
• The DHT is redundant, making an attack against the sensor fusion center tricky at worst and impossible to coordinate at best
Simulating the System
• We build a simple array of nodes and have them generate symptom and routing vectors as they encounter anomalies
• Not yet complete; work in progress
• Demonstrates that information fuses appropriately and that multiple simultaneous anomalies do not interfere
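A toy stand-in for the described simulation, assuming a hash of the anomaly type as a placeholder for CAN routing; all names and parameters here are hypothetical, since the actual simulator is still in progress.

```python
import hashlib
import random

def simulate(num_nodes=16, num_events=100, seed=0):
    """Array of sensor nodes emitting anomaly reports: check that
    reports for the same anomaly fuse at one center while different
    anomalies land on different centers (non-interference)."""
    random.seed(seed)
    fusion = {}  # routing key -> list of (sensor, anomaly) reports
    for _ in range(num_events):
        sensor = random.randrange(num_nodes)
        anomaly = random.choice(["worm-A", "worm-B", "syn-flood"])
        # Stand-in for CAN routing: hash the anomaly to a center.
        key = int(hashlib.sha1(anomaly.encode()).hexdigest(), 16) % num_nodes
        fusion.setdefault(key, []).append((sensor, anomaly))
    for key, reports in sorted(fusion.items()):
        kinds = {a for _, a in reports}
        print(f"fusion center {key}: {len(reports)} reports, kinds={kinds}")

simulate()
```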
Further Work
• Complete the paper (duh)
• Add CAN to the simulation to actually route
• Include real-world packet dumps in the simulation
• Test on more complex topologies?