Enriching Network Security Analysis with Time Travel
Gregor Maier¹, Robin Sommer², Holger Dreger³, Anja Feldmann¹, Vern Paxson⁴, Fabian Schneider¹
ACM SIGCOMM 2008
¹TU Berlin / DT Lab, ²ICSI / LBNL, ³Siemens AG Corporate Technology, ⁴ICSI / UC Berkeley
References
• Stefan Kornexl, Vern Paxson, Holger Dreger, Anja Feldmann, Robin Sommer, “Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic,” 5th ACM IMC 2005.
• Stefan Kornexl, “High-Performance Packet Recording for Network Intrusion Detection,” Master’s thesis, 2005.
• Gregor Maier, Robin Sommer, Holger Dreger, Anja Feldmann, Vern Paxson, Fabian Schneider, “Enriching Network Security Analysis with Time Travel,” ACM SIGCOMM 2008.
• Time Machine webpage: http://www.net.t-labs.tu-berlin.de/research/tm/
Speaker: Li-Ming Chen
Outline
• Introduction
• Time Machine (TM) Design
• Performance Evaluation
• Coupling TM with a Network Intrusion Detection System (NIDS)
• Discussion
• Conclusion & comments
Introduction
• Definition
  • Time travel is the capability to conveniently “travel back in time”
  • A Time Machine is a system that provides this capability
• This paper presents a Time Machine (TM) for network traffic that enables later inspection of activity that becomes interesting only in retrospect
• Benefits for network security monitoring
  • Security forensics
  • Network troubleshooting
  • Event correlation
Problems
• (Storage) Wholesale recording and retention of entire data streams is infeasible
  • A Gigabit network can produce several TB per day
  • However, network traces with full packet content provide the most information for investigating security incidents
• (Data selection) Only a very small subset of the traffic is relevant for later analysis
  • How can we decide beforehand which data will be crucial?
• (Analysis) Data retrieval is like finding a needle in a haystack
  • It is time-consuming and cumbersome
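The “several TB per day” figure follows from simple arithmetic. The sketch below is only a back-of-the-envelope estimate; the function names are illustrative, and the 25% average utilization is an assumed value, not a number from the slides.

```python
# Back-of-the-envelope storage estimate for full packet capture.
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(link_bps: float, utilization: float = 1.0) -> float:
    """Bytes seen in one day on a link of `link_bps` bits/s at the given average utilization."""
    return link_bps * utilization / 8 * SECONDS_PER_DAY


def terabytes(n_bytes: float) -> float:
    """Convert bytes to decimal terabytes (1 TB = 10**12 bytes)."""
    return n_bytes / 1e12


# A fully loaded 1 Gbps link versus an assumed 25% average utilization.
full_tb = terabytes(bytes_per_day(1e9))
quarter_tb = terabytes(bytes_per_day(1e9, utilization=0.25))
print(f"100% load: {full_tb:.1f} TB/day, 25% load: {quarter_tb:.1f} TB/day")
```

Even at a modest average load, a single Gigabit link yields terabytes of packets per day, which is why unfiltered bulk recording does not scale.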
Common Practice at LBNL (Before Using the TM)
• LBNL: Lawrence Berkeley National Laboratory
  • About 10,000 hosts
  • 10 Gbps Internet connectivity
  • 1-2 TB per day
  • 320 Mbps (37 Kpps) at busy hour (IMC’05)
• Bulk recording with tcpdump
  • Due to storage constraints:
    • Omits key services (HTTP, FTP, etc.)
    • Omits some high-volume hosts
• Manual analysis of traces after an incident
  • The omissions constitute a blind spot during analysis
  • An increasing number of attacks are carried out over HTTP
Objective
• Design a Time Machine prototype (IMC’05)
  • Record raw packets (full contents, not only headers; no aggregation or attribution)
  • Leverage heavy tails to capture nearly all of the likely-interesting traffic while storing only a small fraction of the total volume
• A better Time Machine (SIGCOMM’08)
  • Re-architected for better performance, based on real-world experience
  • Coupled with a rich query interface
  • Facilitates both manual (operator-driven) and automated (NIDS-driven) retrospective analysis
Time Machine (Key Insight)
• Network traffic follows a “heavy-tailed” distribution
  • Most network connections are quite short
    • 91% of connections are smaller than 10 KB
  • A minority of connections carries most of the volume
    • Bulk data transfer (video, audio, etc.)
• Relevant/interesting data is mostly at the beginning of a connection
  • Handshakes, application protocol headers…
  • In most attacks, the compromise happens at the beginning
• For forensics and troubleshooting applications, the beginning of a large connection contains the most significant information
Time Machine (Employ Cutoff Limit)
• Exploit the “heavy-tailed” nature of traffic to partition the stream into a small subset of high interest and a large remainder of low interest
  • Then record the small subset and discard the rest
• Cutoff limit, N:
  • Store only the first N bytes of each connection
  • Greatly reduces the traffic we must buffer
  • Retains full context for small connections and the beginning of large connections
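The cutoff rule above can be sketched in a few lines of Python. This is a minimal illustration, not the TM's actual implementation: the `Packet` and `CutoffRecorder` names and the connection key are hypothetical, and the sketch stores whole packets, so a connection may exceed N by less than one packet.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    conn_id: tuple    # hypothetical connection key, e.g. the 5-tuple
    payload_len: int  # bytes carried by this packet


class CutoffRecorder:
    """Keep packets of a connection until `cutoff_bytes` have been recorded for it."""

    def __init__(self, cutoff_bytes: int):
        self.cutoff = cutoff_bytes
        self.recorded = defaultdict(int)  # bytes recorded so far, per connection
        self.stored = []                  # stand-in for the packet buffer

    def handle(self, pkt: Packet) -> bool:
        """Store the packet if its connection is still below the cutoff; return True if stored."""
        if self.recorded[pkt.conn_id] >= self.cutoff:
            return False  # connection already hit the cutoff: discard
        self.recorded[pkt.conn_id] += pkt.payload_len
        self.stored.append(pkt)
        return True


# A short connection is kept in full; a bulk transfer is truncated.
rec = CutoffRecorder(cutoff_bytes=1000)
small = [Packet(("a", 1), 200) for _ in range(3)]
big = [Packet(("b", 2), 500) for _ in range(10)]
results = [rec.handle(p) for p in small + big]
```

With a heavy-tailed size distribution, most connections behave like `small` and are stored completely, while the few large ones contribute only their first N bytes.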
TM “Multi-threaded” Architecture (annotations from the architecture diagram)
• Capturing packets using libpcap
• Mapping packets to connections
• Enforcing the cutoff for each connection
• Separating storage classes; different classes can have different cutoffs and buffer budgets
• Managing buffer budgets; subject to the budget constraints, the TM always stores the most recent packets
• Managing indexes
• Supporting efficient queries; indexes can be configured for any subset of a packet’s header fields (depending on the queries), and a query must refer to indexed fields
• Supporting query subscriptions
• Supporting 2 delivery methods
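Two of the annotations above, budget management and indexing, can be illustrated together. The sketch below is an assumption-laden toy: `BudgetedBuffer` is a hypothetical name, it indexes on a single field (the source address), and it keeps everything in memory, whereas the slides describe configurable indexes and separate budgets per storage class.

```python
from collections import defaultdict, deque


class BudgetedBuffer:
    """Keep the most recent packets within a byte budget, with an index on source address."""

    def __init__(self, budget_bytes: int):
        self.budget = budget_bytes
        self.used = 0
        self.packets = deque()           # all packets, oldest on the left
        self.index = defaultdict(deque)  # src_ip -> packets from that address, oldest first

    def add(self, src_ip: str, data: bytes) -> None:
        pkt = (src_ip, data)
        self.packets.append(pkt)
        self.index[src_ip].append(pkt)
        self.used += len(data)
        # Evict the oldest packets until the buffer fits its budget again.
        while self.used > self.budget and self.packets:
            old_ip, old_data = self.packets.popleft()
            # The globally oldest packet is also the oldest one for its address.
            self.index[old_ip].popleft()
            if not self.index[old_ip]:
                del self.index[old_ip]
            self.used -= len(old_data)

    def query(self, src_ip: str) -> list:
        """Return the buffered payloads for one source address, oldest first."""
        return [data for _, data in self.index.get(src_ip, ())]


buf = BudgetedBuffer(budget_bytes=10)
buf.add("10.0.0.1", b"12345")
buf.add("10.0.0.2", b"12345")
buf.add("10.0.0.1", b"xyz")  # exceeds the budget, so the oldest packet is evicted
```

A query here touches only the index entry for the requested address, which mirrors the requirement that queries refer to indexed fields.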
TM Live Deployments at MWN and LBNL
• Endace DAG card: http://www.endace.com/our-products/dag-network-monitoring-cards/
Recording: Cutoff vs. Data Volume
[Figure annotations: average data rate; data reduction rate]
• Connections at LBNL are more lightweight
• Bulk data transfer at MWN
• LBNL exhibits higher variability (shows a diurnal variation)
Recording: Does the TM Have Sufficient CPU Resources for Query Processing?
• For recording and indexing, CPU utilization is low
Recording: Retention Time (How Long Do We Store Packet Data?)
• Avg. 4 days (original volume 3~6 TB/day)
• LBNL has a longer retention time, even though its budgets are small
Querying: Number of Queries the TM Can Handle
• At LBNL, focusing on in-memory queries
• Suffices to cope with the number of automated queries generated by a NIDS (discussed later)
Querying: Latency Between Issuing Queries and Receiving the Corresponding Replies
• At LBNL, with live traffic
• Naturally, we wish to keep the latency low, both to provide timely responses and to ensure accessibility of the data (in-memory queries)
[Figure panels: in-memory queries; on-disk queries]
Experiences from Operating the “Original” TM (IMC’05) at LBNL
• 1.) Manual querying is infeasible
  • Many NIDS alerts require the analyst to interact manually with the TM to extract the corresponding traffic before inspecting it
  • Provide a direct interface between the NIDS and the TM to extract the relevant traffic
• 2.) The TM requires dynamic adaptation
  • Sometimes the analyst needs more details of problematic connections, obtained by bulk recording
  • The NIDS can automatically instruct the TM to suspend the cutoff
Experiences from Operating the “Original” TM (IMC’05) at LBNL (cont’d)
• 3.) Support a two-tiered analysis strategy
  • Use cheap, preliminary heuristics to find a pool of possibly problematic connections,
  • and then perform much more expensive analysis on just that pool
  • Coupling the TM with a NIDS enables the NIDS to perform retrospective analysis
• 4.) Fine-tune the TM’s performance
  • Accommodate the interactions among recording, indexing, and random queries under rigorous real-time requirements
Prototype Deployment at LBNL
• Two weeks of experience:
  • Network traffic: 22.7 TB
  • TM records 0.6 TB
  • Retention time: 11 days
  • NIDS reports 66K alerts
  • 98% of alerts are due to scanning activity
• NIDS: Bro
• Improved forensics support:
  • NIDS controls the TM
  • NIDS retrieves data from the TM
  • Support for retrospective analysis
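The deployment numbers above imply a data reduction that is easy to check. The short calculation below uses only the figures quoted on the slide; the variable names are illustrative, and “66K” is read as 66,000.

```python
# Figures quoted for the two-week LBNL prototype deployment.
total_tb = 22.7     # traffic seen on the link
recorded_tb = 0.6   # traffic the TM actually stored
alerts = 66_000     # "66K" NIDS alerts
scan_share = 0.98   # share of alerts caused by scanning

recorded_fraction = recorded_tb / total_tb
reduction = 1 - recorded_fraction
non_scan_alerts = round(alerts * (1 - scan_share))

print(f"stored {recorded_fraction:.1%} of the traffic ({reduction:.1%} reduction)")
print(f"about {non_scan_alerts} alerts were not scan-related")
```

The TM kept under 3% of the traffic, and only a small minority of the alerts concerned something other than scanning, which is the pool that benefits most from deeper recording.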
NIDS Controls the TM
• The NIDS dynamically changes the TM’s parameters
  • It moves the attacker’s source IP address to a storage class with a more conservative set of parameters
    • Higher cutoff
    • Larger budget (longer retention time)
• Storage classes:
  • Original (benign): 15 KB cutoff
  • Scanners (for scan notifications): 50 KB cutoff
  • Alarms (for non-scan notifications): cutoff disabled
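The three storage classes above map naturally onto a small lookup table. The sketch below is illustrative only: `StorageClass`, `ClassTable`, and `on_alert` are hypothetical names rather than the TM or Bro interface, 1 KB is assumed to be 1024 bytes, and the rule that an address is never moved back to a less conservative class is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageClass:
    name: str
    cutoff_bytes: Optional[int]  # None means the cutoff is disabled


# Cutoffs as listed on the slide (assuming 1 KB = 1024 bytes).
CLASSES = {
    "original": StorageClass("original", 15 * 1024),
    "scanners": StorageClass("scanners", 50 * 1024),
    "alarms": StorageClass("alarms", None),
}
RANK = {"original": 0, "scanners": 1, "alarms": 2}


class ClassTable:
    """Track which storage class each source address currently belongs to."""

    def __init__(self):
        self.by_ip = {}

    def on_alert(self, src_ip: str, is_scan: bool) -> None:
        """Escalate the address's class on a NIDS alert; never downgrade (assumption)."""
        target = "scanners" if is_scan else "alarms"
        current = self.by_ip.get(src_ip, "original")
        if RANK[target] > RANK[current]:
            self.by_ip[src_ip] = target

    def class_for(self, src_ip: str) -> StorageClass:
        return CLASSES[self.by_ip.get(src_ip, "original")]


table = ClassTable()
before = table.class_for("192.0.2.7").cutoff_bytes
table.on_alert("192.0.2.7", is_scan=True)
after_scan = table.class_for("192.0.2.7").cutoff_bytes
table.on_alert("192.0.2.7", is_scan=False)
after_alarm = table.class_for("192.0.2.7").cutoff_bytes
```

Per-class buffer budgets are omitted here for brevity; in the design described above each class would also carry its own budget.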
NIDS Retrieves Data from the TM
• The NIDS queries the TM for the relevant packets
  • The packets are fed back to the NIDS, which stores the reassembled payload stream on disk
  • This eases subsequent manual inspection of the activity
• Examples:
  • HTTP 200 OK
  • Applications running on non-standard ports
• Also designed a web interface to notifications and their corresponding network traffic
Retrospective Analysis
• A tighter integration of the TM and the NIDS
• Recovering from packet drops
  • The NIDS may incur measurement drops
  • The NIDS can query for connections that are missing packets and reprocess them
• Offloading the NIDS
  • Addresses the tradeoff between analysis depth and resource usage of the NIDS
• Broadening the analysis context
  • Analyze traffic from the past
Deployment Tradeoffs
• Risk of evasion (a fundamental limitation)
  • Solution: use different storage classes and a randomized cutoff limit
• Network load
  • Solution: better hardware, TM clustering
• Floods
  • DDoS might stress the TM’s connection handling, undermine the capture of useful packets, reduce retention time…
  • Solution: flood detection and mitigation
• Retrieval time
  • Disk queries are resource-consuming and should be issued with care
• NIDS and cutoff
  • The NIDS controls the TM only for future activity; what about the past?
Conclusion
• Built and evaluated an efficient Time Machine
  • Runs on commodity hardware for Gigabit networks
  • Used operationally
• Cutoff heuristic: keep the first N bytes of every connection
  • Reduces volume typically by more than 90%
  • Retains days/weeks of full-payload traffic traces
• Coupled the TM with a NIDS (Bro)
  • Improved forensics support
  • Automatic queries for deeper inspection
Future Work
• Mitigate evasion risk
  • Use a randomized cutoff
  • Keep some packets even after the cutoff is hit
  • Use the NIDS to disable the cutoff
• Cutoff processing in hardware
  • E.g., NetFPGA
• Aggregation instead of direct eviction
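One possible reading of the “randomized cutoff” item is that each connection draws its own limit, so an attacker cannot know exactly where recording stops. The sketch below is speculative: the slides do not specify a distribution, so the uniform draw, the `base`/`spread` parameters, and the `RandomizedCutoff` name are all assumptions.

```python
import random


class RandomizedCutoff:
    """Assign each connection a cutoff drawn uniformly from [base, base + spread]."""

    def __init__(self, base: int, spread: int, seed=None):
        self.base = base
        self.spread = spread
        self.rng = random.Random(seed)  # a seed makes runs reproducible for testing
        self.limits = {}                # connection id -> chosen cutoff in bytes

    def limit_for(self, conn_id) -> int:
        """Return the connection's cutoff, drawing it on first use and reusing it afterwards."""
        if conn_id not in self.limits:
            self.limits[conn_id] = self.base + self.rng.randrange(self.spread + 1)
        return self.limits[conn_id]


cutoffs = RandomizedCutoff(base=15_000, spread=5_000, seed=1)
first = cutoffs.limit_for("conn-1")
again = cutoffs.limit_for("conn-1")  # same connection, same limit
```

The draw is stored per connection so that the limit stays stable for the connection's lifetime; only the attacker's ability to predict it is removed.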
Comments
• Privacy concerns in full-payload recording
• Performance was evaluated only for the original TM
  • When coupled with a NIDS, how does the performance of recording and querying change?
  • Data volume, retention time, query latency?
• The NIDS controls the TM for deeper inspection; when should it stop?
• Where is the critical evidence of attacks?
  • (TM) For connections, interesting data is mostly at the beginning
  • (Gestalt) For connections/associations, interesting data is mostly at procedure violations
  • (My research) For hosts, interesting data is mostly at contact activity violations
  • What else…?