Mitigating the Insider Threat using High-dimensional Search and Modeling. DARPA IPTO Program: Self Regenerative Systems (SRS). Program Manager: Lee Badger. PI: Eric van den Berg. Presenter: Eric van den Berg, evdb@research.telcordia.com. Wednesday, July 13, 2005. Team: Shambhu Upadhyaya, Hung Ngo (SUNY Buffalo); Muthu Muthukrishnan, Raj Rajagopalan (Rutgers)
Project overview • Project goal: to build a system that defends critical services and resources against insiders, which • Correlates large numbers of sensor measurements • Synthesizes appropriate pro-active responses • What is done today? • Reactive systems: Detect attacks late in cycle • Anomaly detection systems: Few streams for correlation • Human-based systems: not scalable • Collateral damage may be large
Project overview (continued) • Technical Approach • Large network of sensors, to let insider trigger alerts • High dimensional network state description using sensor alerts • Search engine finds top-K past states similar to sensor snapshot • Insider modeler and analyzer tool used to identify attack points, train search engine, guide sensor placement • Response engine to analyze impact on critical services and synthesize reconfiguration response • Technical Challenges • Testing SVD-based search technology in a new domain • New ‘Insider analyzer’ key-challenge graph problem is hard • Training search engine, labeling and annotating states
Project overview (continued) • Quantitative Metrics to measure success and overheads • False alarm / detection rate • Test detection for novel variations of known attacks • Major Achievements to date • Initial prototype for sensor network • Initial prototype for SVD-based search engine • Initial prototype for Insider modeler and analyzer tool • First test with independent ‘insiders’
[Schedule chart: tasks (milestones) Design (document), Prototyping (software), Testing (report), plotted over the periods Jul-Dec 04, Jan-Jun 05, Jul-Dec 05]
Insider analyzer and modeler • Insider threat manifests in two forms: • Insider abuse while staying within legitimate privileges • Insider abuse while exceeding assigned privileges • Focus on an insider's view of an organization: hosts, reachability and access control • A new threat model called a “key challenge graph” • Similar to attack graphs, less emphasis on details • Allows static analysis of insider threat • More in papers
Key challenge graph model (component: abstraction):
Vertex: Hosts, People
Edge: Connectivity, Reachability
Key: Information, Capability
Key Challenge: Access Control
Starting Vertex: Location of insider
Target Vertex: Actual target
Cost of Attack: Threat analysis metric
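The key challenge graph model can be made concrete with a small sketch. The one below is an illustration only, not the MAPIT implementation: it assumes each edge carries one key challenge with a low cost if the insider already holds the key and a high cost otherwise, and that reaching a vertex yields the key stored there. The function name, the host names, and all cost values are hypothetical. The search runs over (vertex, keys held) states, so it grows exponentially with the number of keys, which fits the slides' remark that the analysis problem is hard.

```python
import heapq
from itertools import count

def min_attack_cost(edges, keys_at, start, target, initial_keys=()):
    """Cheapest attack cost from start to target, or None if unreachable.

    edges:   {vertex: [(next_vertex, key, cost_with_key, cost_without_key), ...]}
    keys_at: {vertex: key the insider obtains on reaching that vertex}
    """
    def collect(keys, vertex):
        # Reaching a vertex yields the key stored there, if any.
        if vertex in keys_at:
            return keys | {keys_at[vertex]}
        return keys

    tie = count()  # tie-breaker so the heap never compares states
    start_state = (start, collect(frozenset(initial_keys), start))
    best = {start_state: 0}
    heap = [(0, next(tie), start_state)]
    while heap:
        cost, _, (vertex, keys) = heapq.heappop(heap)
        if vertex == target:
            return cost
        if cost > best[(vertex, keys)]:
            continue  # stale heap entry
        for nxt, key, with_key, without_key in edges.get(vertex, []):
            # The key challenge is cheap to pass only if the key is held.
            step = with_key if key in keys else without_key
            state = (nxt, collect(keys, nxt))
            if cost + step < best.get(state, float("inf")):
                best[state] = cost + step
                heapq.heappush(heap, (cost + step, next(tie), state))
    return None

# Hypothetical three-host organization.
edges = {
    "desk": [("db", "db_pw", 1, 50), ("fileserver", "user_login", 1, 20)],
    "fileserver": [("db", "db_pw", 1, 50)],
}
keys_at = {"fileserver": "db_pw"}  # the database password is stored on the file server
print(min_attack_cost(edges, keys_at, "desk", "db", initial_keys=["user_login"]))  # 2
```

In this toy case the insider's legitimate login makes the detour through the file server far cheaper than attacking the database directly, which is the kind of attack point a static analysis is meant to surface.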
Insider modeler and analyzer: MAPIT tool architecture [Architecture diagram. Labels: Network entity rules, Network topology, Vulnerabilities, Cost rules, Authentication mechanism, Social engineering awareness, MAPIT Engine, Key challenge graph, Defense-centric analysis, Sensitivity analysis]
Sensors to detect insider attacks • Detect changes from user ‘normal behavior’ • Profile anomaly detector • Statistical sequential change point detection • Future: biometrics, e.g. keystroke dynamics? • Detect access to target resources • Pluggable Authentication Module, File integrity checker • Other useful sources: • web, audit logs (e.g. internal website searches) • network intrusion detectors (signature, anomaly)
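As an illustration of statistical sequential change point detection, here is a minimal one-sided CUSUM sketch. CUSUM is one standard technique of this kind; the slides do not say which method the project's sensor uses, and the function name, parameters, and sample values below are assumptions.

```python
def cusum_alarm(samples, baseline_mean, slack, threshold):
    """Return the index of the first sample at which the one-sided CUSUM
    statistic exceeds threshold, or None if it never does.

    slack absorbs small fluctuations above baseline_mean, so that only a
    sustained upward shift accumulates.
    """
    statistic = 0.0
    for index, value in enumerate(samples):
        statistic = max(0.0, statistic + (value - baseline_mean - slack))
        if statistic > threshold:
            return index
    return None

# Hypothetical per-interval activity counts for one user; the level shifts upward.
activity = [5, 6, 4, 5, 5, 9, 10, 11, 9, 10]
print(cusum_alarm(activity, baseline_mean=5, slack=1, threshold=8))  # 7
```

The alarm fires two samples after the shift begins: the statistic needs several elevated readings to cross the threshold, which is the usual trade-off between detection delay and false alarms.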
Network traffic anomaly detector • Streaming data model • Large data volume and speed: in backbone 1 billion packets/hour/router • Large data domain: IPv4: 2^32 addresses, IPv6: 2^128 • Consequences: • Can scan data (at most) once • Need small-space structure to summarize data • Hard to store O(n) data points when n=2^32 • Cannot store at 2^128 • Idea: build synopsis data structure for IP-packets • CM-sketches, deltoid group-testing • Detect attacks based on changes in traffic volume • Currently: traffic to destination IP address (likely targets) • Can detect attacks exhibiting large changes in packet distribution
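A Count-Min (CM) sketch is one of the synopsis structures named above. The sketch below is a minimal illustration, not the project's detector: the class name, table sizes, and hashing scheme are assumptions, and deltoid group testing is not shown. It keeps a fixed-size table regardless of how many distinct addresses appear, and its estimates never fall below the true count.

```python
import hashlib

class CountMinSketch:
    """Fixed-size summary of per-key counts; estimates can only overcount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, amount=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += amount

    def estimate(self, key):
        # Collisions only inflate counters, so the minimum is the tightest bound.
        return min(self.table[row][self._index(row, key)] for row in range(self.depth))

def volume_change(previous, current, destination):
    """Estimated change in packet count to one destination between two windows."""
    return current.estimate(destination) - previous.estimate(destination)

# Hypothetical traffic: one destination receives a burst in the second window.
window_1, window_2 = CountMinSketch(), CountMinSketch()
for _ in range(10):
    window_1.add("10.0.0.5")
for _ in range(500):
    window_2.add("10.0.0.5")
change = volume_change(window_1, window_2, "10.0.0.5")
```

Each packet is processed once and then discarded, which matches the single-pass, small-space constraints listed on the slide.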
Example: Network anomaly detector • Based on week 2 of the 1999 MITLL data • from the inside sniffer • Traffic-volume-based anomaly detection • Ipsweep, portsweep, phf, httptunnel, etc. • Detects the targets of all four attacks named above • Also flags additional large changes (~1%) that are not attacks • Search engine to filter out non-attacks
Sensor alert message format • We use IDMEF (Intrusion Detection Message Exchange Format) to transmit and store sensor alerts • Between sensors and database • Between search engine and response engine • Alert storage in MySQL database with IDMEF-based schema
Network state description • Network state is constructed from sensor alerts: • Accommodate heterogeneous sensor types • Account for different sensitivity of sensor types • Tolerate possibly delayed or missing, ‘out of order’ alerts • Alerts are mapped to a high-dimensional vector for search • Coordinates correspond to different sensor-alert types • Some possibilities for mapping values: • Total number of sensor alerts of given type in (sliding) time window • Indicator: sensor alert occurred in (sliding) time window • Network state is labeled: • With Classification e.g. ‘Normal’, ‘Insider’ • With Response for Response Engine
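The two mappings listed above (alert counts and indicators over a sliding time window) can be sketched as follows. The function name, the alert-type names, and the timestamps are hypothetical. Because the vector depends only on which alerts fall inside the window, the order in which alerts arrive does not matter.

```python
from collections import Counter

def state_vector(alerts, alert_types, now, window, indicator=False):
    """Map alerts to one coordinate per alert type.

    alerts:      iterable of (timestamp, alert_type), in any order
    alert_types: fixed ordering of the coordinates
    Returns counts within (now - window, now], or 0/1 flags if indicator is True.
    """
    counts = Counter(
        alert_type
        for timestamp, alert_type in alerts
        if now - window < timestamp <= now
    )
    if indicator:
        return [1 if counts[alert_type] else 0 for alert_type in alert_types]
    return [counts[alert_type] for alert_type in alert_types]

# Hypothetical alert types and out-of-order alerts.
alert_types = ["pam_failure", "file_integrity", "traffic_anomaly"]
alerts = [(130, "pam_failure"), (95, "file_integrity"), (100, "pam_failure"), (20, "traffic_anomaly")]
print(state_vector(alerts, alert_types, now=150, window=60))                  # [2, 1, 0]
print(state_vector(alerts, alert_types, now=150, window=60, indicator=True))  # [1, 1, 0]
```

Per-sensor sensitivity could be handled by weighting coordinates, but the slides do not specify how, so no weighting is shown here.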
High-dimensional search engine • Goal: Find historical documented network states most similar to the current network state snapshot • Output: Top-K list of ranked/prioritized similar states • Ranking can be based on similarity metric, or • potential impact, e.g. attack ‘risk’ • Impact of historical network states is documented, • impact of current state analyzed with Response engine • Search engine reduces search space dimensionality • Using Singular Value Decomposition, or random projection • Similar states found by nearest neighbor search • distance metric: e.g. cosine similarity, Euclidean distance
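The search step can be illustrated with NumPy: reduce dimensionality with a truncated SVD, then rank historical states by cosine similarity to the current snapshot. This is a sketch under assumed names and made-up data, not the project's engine; random projection and impact-based ranking are not shown.

```python
import numpy as np

def build_index(states, rank):
    """states: (n_states, n_dims) array of historical state vectors.
    Returns the projection basis and the states in the reduced space."""
    _, _, vt = np.linalg.svd(states, full_matrices=False)
    basis = vt[:rank].T                      # (n_dims, rank)
    return basis, states @ basis

def top_k(query, basis, reduced, k):
    """Return [(state_index, cosine_similarity), ...] for the k most similar states."""
    projected = query @ basis
    norms = np.linalg.norm(reduced, axis=1) * np.linalg.norm(projected)
    similarities = (reduced @ projected) / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-similarities)[:k]
    return [(int(i), float(similarities[i])) for i in order]

# Hypothetical labeled history: four states over four alert types.
history = np.array([
    [5.0, 0.0, 0.0, 1.0],
    [4.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 6.0, 2.0],
    [0.0, 1.0, 5.0, 3.0],
])
labels = ["Normal", "Normal", "Insider", "Insider"]

basis, reduced = build_index(history, rank=2)
snapshot = np.array([0.0, 0.0, 4.0, 2.0])
matches = top_k(snapshot, basis, reduced, k=2)
matched_labels = [labels[i] for i, _ in matches]
```

Here the snapshot's two nearest neighbours are the states labeled 'Insider', so the label and documented response of those states can be handed on for ranking and impact analysis.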
Ranking via alert correlation • Combine alert information from network and host sensors • Segment alert state vector to reflect activity by host and user • Reinforce or weaken ‘attack’ hypothesis • Useful as component to detect or visualize specific attack patterns (moving from host to host)
SVD-based anomaly detection • Statistical Methods using ideas from Principal Component Analysis (PCA) • Imagine alarm vectors come from multivariate normal distribution • Compute sample mean, covariance / correlation matrix for training data • Eigenvalue decomposition of covariance matrix to separate data into normalized independent components
SVD-based anomaly detection (cont.) • Test new vector of alarms • Check for alarms not in training data • Check for fit to training distribution • Status • Code ready • Still to determine thresholds • How far to use normality assumptions vs. switching to nonparametric methods
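The training and test steps described on these two slides can be sketched as below. The function names and data are assumptions, and no threshold is chosen, since the slides state that thresholds are still to be determined. The score is the squared Mahalanobis distance computed in the eigenbasis of the training covariance; under the multivariate normal assumption it would follow a chi-square distribution.

```python
import numpy as np

def fit(training):
    """training: (n_samples, n_alarm_types) array of alarm vectors."""
    mean = training.mean(axis=0)
    covariance = np.cov(training, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    seen = training.any(axis=0)   # alarm types that fired at least once in training
    return mean, eigenvalues, eigenvectors, seen

def score(vector, model, eps=1e-9):
    """Return (squared distance from the training distribution,
    indices of alarm types that are active now but never seen in training)."""
    mean, eigenvalues, eigenvectors, seen = model
    unseen = np.flatnonzero((vector != 0) & ~seen)
    keep = eigenvalues > eps      # drop directions with no training variance
    # Project onto principal directions and normalize each to unit variance.
    components = (eigenvectors[:, keep].T @ (vector - mean)) / np.sqrt(eigenvalues[keep])
    return float(components @ components), unseen.tolist()

# Hypothetical training data: two active alarm types, a third that never fires.
rng = np.random.default_rng(0)
active = rng.normal(loc=[5.0, 2.0], scale=[1.0, 0.5], size=(200, 2))
training = np.column_stack([active, np.zeros(200)])
model = fit(training)

typical_score, typical_unseen = score(np.array([5.0, 2.0, 0.0]), model)
odd_score, odd_unseen = score(np.array([12.0, 2.0, 1.0]), model)
```

The two checks are kept separate on purpose: an alarm type absent from training has zero variance there, so it cannot be scored against the fitted distribution and is reported on its own.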
Early detection of insider attacks • How to represent time evolution in multi-stage attacks? • Like learning attacks from documented historical network states, we can also document attack precursors or attack stages • Full attack now represented as a sequence of network state vectors • Robust against slow attacks: no explicit dependence on time • Would like to make ‘precursor’ annotation (semi-) automatic • Approaches to automatic precursor annotation • Temporal precursors • Spatial precursors
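One way to read "no explicit dependence on time" is that only the order of attack stages matters, not the gaps between them. The sketch below assumes each network snapshot has already been assigned a stage label (for example, the label of its nearest historical state) and counts how many stages of a documented attack have appeared in order. The function name is hypothetical; the stage names paraphrase Scenario 1 of the MITLL test set described later in the deck.

```python
def stages_matched(observed_labels, attack_stages):
    """Count how many leading stages of attack_stages occur, in order,
    within observed_labels. Unrelated states in between are ignored."""
    matched = 0
    for label in observed_labels:
        if matched < len(attack_stages) and label == attack_stages[matched]:
            matched += 1
    return matched

scenario = ["IP sweep", "sadmind probe", "sadmind break-in", "mstream install", "DDoS launch"]
# Hypothetical label sequence for a slow attack interleaved with normal activity.
observed = ["Normal", "IP sweep", "Normal", "Normal", "sadmind probe", "Normal"]
print(stages_matched(observed, scenario))  # 2
```

A partial match such as two of five stages marks the earlier states as precursors, which allows a response before the final stage is reached.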
Impact Analysis using Response Engine • Building upon Smart Firewalls technology from Dynamic Coalitions program; Response Engine • Has overview of current network configuration • Logically validates Policies, expressed in terms of end-to-end service availability • Generates candidate reconfigurations to comply with Policies as much as possible • In this project • Detected attack type and location is translated into its effect on the stated policies and current network configuration • E.g. Server failure due to a Denial of Service attack • Response Engine can analyze the impact of both the attack and its candidate responses on the availability of critical resources • E.g. Analyze impact of vulnerability exploit: how widespread is the vulnerability? • Administrator can push response into the network
Response using policy-based architecture [Architecture diagram. Labels: Policy, Correlated Alerts, Topology, Response Engine, High-level Policy Configuration, Summarized Configuration, Security Policy Adaptors, Detailed Configuration, Device-level Policy Configuration, Routers & Switches, Control & Monitor]
First test results • First system test by independent insiders • Goal: extract operations-sensitive military information • Four volunteer ‘insiders’ given • existing account information • starting location and • nature of target • Result: 3 out of 4 attackers detected • Program goal: delay / thwart 10% of insider attacks
Next / future steps • Show effectiveness against wide range of attacks • Measure false positive rate • Adapt detection system to heterogeneous environments
SVD-based search on test alert set • Attacks from MITLL Scenario Specific datasets • Alerts from NC State (TIAA) • Generated by RealSecure IDS on MITLL attack data sets • Scenario 1: • IP sweep • Probe for sadmind vulnerability • Break-in via sadmind exploit • Installation of mstream DDoS software • Launch DDoS attack • Scenario 2: • Probe of DNS server via HINFO query • Break-in via sadmind exploit • FTP upload of mstream DDoS software and attack script • Initiate attack on other hosts • Launch DDoS attack