Bayesian Security Analysis: Opportunities and Challenges ARO Workshop, Nov 14, 2007. Jason Li Intelligent Automation Inc Peng Liu Penn State University. Outline. Introductions Overview of Bayesian Networks Opportunities in Security Analysis Challenges Roadmap. Securing Large Networks.
Bayesian Security Analysis: Opportunities and Challenges • ARO Workshop, Nov 14, 2007 • Jason Li, Intelligent Automation Inc • Peng Liu, Penn State University
Outline • Introductions • Overview of Bayesian Networks • Opportunities in Security Analysis • Challenges • Roadmap
Securing Large Networks • A network defender’s primary advantage over an attacker is intimate knowledge of the network • Defender’s Arsenal • Vulnerability scanners • Firewall / Routers / other infrastructure • Databases • Intrusion Detection Systems • Network defense must fully leverage that advantage
Connecting the Dots • Where are the vulnerabilities? What do they mean? (NVD: Vuln. 1 … Vuln. 4) • How can an attacker get to them? (ALLOW 10.0.0.4 -> …) • What is the situation? (Alerts: src A, dst B, attack C) • It is challenging to do this automatically and quickly [Figure: attack graph from "Attacker start" through Vuln. 1 … Vuln. 4 to states such as "Root on Host 1" and "User on Host 2"]
Dream Tools for System Administrator • Automatic tools to assist consistent and secure configuration to enable normal operations • Equipped with sufficient security sensors for a rainy day • No alarms under normal operations: life is beautiful • When the sensors go off, don’t flood me with just alarms • With tons of alarms: don’t know what’s going on; ignore them • Instead, tell me some in-depth knowledge • What’s wrong? (e.g. where, what, scope) • What does this mean? (e.g. severity, impact assessment) • What will happen next? (e.g. downstream) • What can I do? (e.g. suggestions please) • Better yet, tell me all these within several minutes of the alarms • Preventive: Is there some layered protection so that most (common) attacks won’t even be able to cause damage?
From Information to Intelligence • However, today’s technology is far from being capable of reaching such goals. • Current security analysis tools for enterprise networks typically examine only individual firewalls, routers, or hosts separately • They do not comprehensively analyze overall network security. • Certainly not sufficient • Our observations: much work on transforming “data” to “information” (e.g. alarms in IDS), relatively little, and insufficient, work on transforming “information” to “intelligence” (e.g. situational awareness, action planning, etc.)
Introduction: Attack Graphs • To address this problem, attack graphs have emerged as the mainstream technology • Network-wide analysis • Multi-stage attacks • General Idea • Nodes represent network security states • Edges represent state transitions via exploits • To make attack graph tools useful, we identify the following requirements
Introduction: Requirements • Automatic generation algorithms • Attack graphs must be scalable • Thousands of nodes • Efficient and Powerful Analysis • The attack graph size must be scalable • The semantics must be rich enough, but not richer • Static analysis, situational awareness, what-if, etc. • Attack graphs must be practical • Attack graph tools that entail laborious manual efforts, poor scalability, and clumsy analysis are considered impractical • Network reachability information (e.g. analyze firewall rules) • Real-time software tool
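The general idea of an attack graph (nodes as security states, edges as exploit-driven transitions) can be sketched in a few lines of Python. This is only a toy illustration: the state names, exploit labels, and edges below are hypothetical and are not taken from the talk, and the path enumeration assumes an acyclic graph.

```python
from collections import defaultdict

# Hypothetical toy attack graph: nodes are security states,
# edges are exploits that move the attacker between states.
EDGES = [
    ("attacker_start", "user_on_host1", "vuln_1"),
    ("attacker_start", "user_on_host2", "vuln_4"),
    ("user_on_host1", "root_on_host1", "vuln_2"),
    ("user_on_host2", "root_on_host1", "vuln_3"),
]

def build_graph(edges):
    """Adjacency list: state -> list of (next_state, exploit)."""
    graph = defaultdict(list)
    for src, dst, exploit in edges:
        graph[src].append((dst, exploit))
    return graph

def attack_paths(graph, start, goal, path=()):
    """Enumerate multi-stage attack paths as tuples of exploits.

    Depth-first search; assumes the graph has no cycles.
    """
    if start == goal:
        yield path
        return
    for nxt, exploit in graph.get(start, []):
        yield from attack_paths(graph, nxt, goal, path + (exploit,))

GRAPH = build_graph(EDGES)
PATHS = sorted(attack_paths(GRAPH, "attacker_start", "root_on_host1"))
```

Exhaustive path enumeration like this is exactly what stops scaling at thousands of nodes, which is why the requirements stress scalable generation and analysis.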
Review of Prior Art Our goal: Appropriate Semantics and Powerful Analysis
Outline • Introductions • Overview of Bayesian Networks • Opportunities in Security Analysis • Challenges • Roadmap
What is a Bayesian Network? A Bayesian network is a graphical model that represents the problem domain in a probabilistic manner. • Nodes represent propositions of interest • Directed links represent immediate influence • The parameters associated with each node represent the strength of such immediate influence, stored in a Conditional Probability Table (CPT)
Representation: Breaking the Joint A joint distribution can always be broken down into a product of conditional probabilities using repeated applications of the product rule P(A,B,E,J,M) = P(A) P(B|A) P(E|A,B) P(J|A,B,E) P(M|A,B,E,J) We can order the variables however we like P(A,B,E,J,M) = P(B) P(E|B) P(A|B,E) P(J|B,E,A) P(M|B,E,A,J)
Compact Representation • A Bayesian network represents the assumption that each node is conditionally independent of all its non-descendants given its parents: P(J|B,E,A) = P(J|A), P(M|B,E,A,J) = P(M|A) • The joint as a product of CPTs: P(A,B,E,J,M) = P(B) P(E) P(A|B,E) P(J|A) P(M|A) • So the CPTs determine the full joint distribution
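As a minimal sketch of the factorization P(A,B,E,J,M) = P(B) P(E) P(A|B,E) P(J|A) P(M|A), the Python below builds the full joint from the five CPTs. The slides give no numbers; the values used here are the widely quoted textbook burglary-alarm figures, and reading B, E, A, J, M as that example is an assumption.

```python
from itertools import product

# Illustrative CPT values (textbook burglary-alarm numbers;
# the slides themselves give no numbers).
P_B = 0.001                      # P(B = true)
P_E = 0.002                      # P(E = true)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A = true | B, E)
P_J = {True: 0.90, False: 0.05}  # P(J = true | A)
P_M = {True: 0.70, False: 0.01}  # P(M = true | A)

def bern(p_true, value):
    """Probability that a binary variable takes `value`, given P(true)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """P(B,E,A,J,M) = P(B) P(E) P(A|B,E) P(J|A) P(M|A)."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

# Ten CPT numbers determine all 32 joint entries, which must sum to 1.
TOTAL = sum(joint(*vals) for vals in product([True, False], repeat=5))

# One joint entry: no B, no E, but A, J and M are all true.
EXAMPLE = joint(False, False, True, True, True)
```

The compactness is visible in the counts: ten parameters here versus 31 free parameters for an unrestricted joint over five binary variables.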
The Basic Inference Problem Given 1. A Bayesian network BN 2. Evidence e - an instantiation of some of the variables in BN (e can be empty) 3. A query variable Q Compute P(Q|e) - the (marginal) conditional distribution over Q Given what we do know, compute distribution over what we don’t
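The inference problem can be made concrete with brute-force enumeration: sum the joint over every unobserved variable and normalize. The sketch below assumes a hypothetical three-node network (Attack influences both Alert and Compromise) with made-up numbers; it is exponential in the number of variables and is meant only to show what P(Q|e) computes, not how a practical engine works.

```python
from itertools import product

# Hypothetical three-node network: attack -> alert, attack -> compromise.
# All numbers are made up for illustration.
VARS = ["attack", "alert", "compromise"]
PARENTS = {"attack": (), "alert": ("attack",), "compromise": ("attack",)}
# Each CPT maps a tuple of parent values to P(variable = True | parents).
CPT = {
    "attack": {(): 0.01},
    "alert": {(True,): 0.90, (False,): 0.05},
    "compromise": {(True,): 0.60, (False,): 0.0},
}

def joint(assign):
    """Probability of a complete assignment: the product of CPT entries."""
    p = 1.0
    for var in VARS:
        p_true = CPT[var][tuple(assign[pa] for pa in PARENTS[var])]
        p *= p_true if assign[var] else 1.0 - p_true
    return p

def query(q, evidence):
    """P(q | evidence), summing the joint over all unobserved variables."""
    dist = {True: 0.0, False: 0.0}
    hidden = [v for v in VARS if v not in evidence]
    for values in product([True, False], repeat=len(hidden)):
        assign = dict(evidence, **dict(zip(hidden, values)))
        dist[assign[q]] += joint(assign)
    norm = dist[True] + dist[False]
    return {k: v / norm for k, v in dist.items()}

PRIOR = query("compromise", {})                   # empty evidence
POSTERIOR = query("compromise", {"alert": True})  # evidence: an alert fired
```

With these assumed numbers, observing an alert raises the belief in a compromise well above its prior, yet it stays far below certainty because most alerts are false positives.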
Why Bayesian Networks • Uncertainty management • Local independence structure and d-separation • Compact representation • Efficient inference • General expressiveness • Supporting planning and action modeling: • Provides belief states • Game theory, Markov Decision Processes
Scope • Focus on basic Bayesian networks for insights • Will not discuss other (more advanced) BN models • DBN (Dynamic BN) • MEBN (Multi-entity Bayesian net) • MSBN (Multi-Sectioned Bayesian net) • SLBN (Semantically Linked Bayesian net) • OOBN (Object-oriented Bayesian net) • Deep understanding is necessary • The problem domain • The appropriate BN models • High level security analysis (not alert correlation)
Outline • Introductions • Overview of Bayesian Networks • Opportunities in Security Analysis • Challenges • Roadmap
Powerful Analysis Made Possible • Look at a well-known example in the BN community • Our BN model for cyber security analysis will share a similar flavor (work in progress) [Figure: Bayesian network with nodes Visit to Asia (A), Smoking (S), Tuberculosis (T), Lung cancer (L), Bronchitis (B), Either tuberculosis or cancer (E), Dyspnoea (D), Positive X-ray (X)]
Support All Kinds of Inference: Diagnosis [Figure: the same network (A, S, T, L, B, E, D, X) with evidence and query nodes marked]
Support All Kinds of Inference: Prediction [Figure: the same network with evidence and query nodes marked]
Support All Kinds of Inference: Mixed [Figure: the same network with evidence and query nodes marked]
Inference with Intervention • Most probabilistic models (including general Bayesian nets) describe a distribution over possible events but say nothing about what will happen if a certain intervention occurs • A causal network adds the property that the parents of each node are its direct causes, and thus goes beyond regular probabilistic models • Mechanisms = stable functional relationships = graphs (equations) • Interventions = surgeries on mechanisms
Seeing vs. Doing • Seeing (passive observation): alerts • Would like to know the consequences of, and the possible causes for, such observations (via regular inference algorithms) • Doing (active setting): set the value of a node via active experiment • “Would the problematic circuit work normally if I replace this suspicious component with a good one?” • External reasons (the human diagnoser) explain why the suspicious component becomes good • None of its parent nodes should count as causes • Delete all links that point to this node • Other belief updating is not affected
An Example [Figure: two copies of the sprinkler network over SEASON, SPRINKLER, RAIN, WET, and SLIPPERY (variables X1 to X5); in the second copy SPRINKLER is set to ON]
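The difference between seeing and doing in the sprinkler example can be sketched as follows. The edge structure (SEASON influences SPRINKLER and RAIN, both influence WET, WET influences SLIPPERY) is the standard one for this example and is assumed here, since the figure did not survive; all CPT numbers are hypothetical, and SEASON is reduced to a binary "dry season" flag.

```python
from itertools import product

# Sprinkler network; structure assumed, all CPT numbers hypothetical.
# season=True means "dry season".
PARENTS = {
    "season": (), "sprinkler": ("season",), "rain": ("season",),
    "wet": ("sprinkler", "rain"), "slippery": ("wet",),
}
CPT = {  # tuple of parent values -> P(node = True | parents)
    "season": {(): 0.5},
    "sprinkler": {(True,): 0.7, (False,): 0.1},
    "rain": {(True,): 0.1, (False,): 0.6},
    "wet": {(True, True): 0.99, (True, False): 0.9,
            (False, True): 0.9, (False, False): 0.0},
    "slippery": {(True,): 0.8, (False,): 0.0},
}

def prob(q, evidence, parents=PARENTS, cpt=CPT):
    """P(q = True | evidence) by enumeration over the full joint."""
    names = list(parents)
    num = den = 0.0
    for values in product([True, False], repeat=len(names)):
        a = dict(zip(names, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # assignment contradicts the evidence
        p = 1.0
        for n in names:
            pt = cpt[n][tuple(a[pa] for pa in parents[n])]
            p *= pt if a[n] else 1.0 - pt
        den += p
        num += p if a[q] else 0.0
    return num / den

def do(node, value):
    """Graph surgery: delete all links into `node` and clamp it to `value`."""
    parents = dict(PARENTS, **{node: ()})
    cpt = dict(CPT, **{node: {(): 1.0 if value else 0.0}})
    return parents, cpt

BASELINE = prob("rain", {})
SEEING = prob("rain", {"sprinkler": True})                         # observation
DOING = prob("rain", {"sprinkler": True}, *do("sprinkler", True))  # intervention
```

Observing the sprinkler on is evidence about the season and therefore lowers the belief in rain; switching it on by hand carries no such information, so the belief in rain stays at its baseline while downstream nodes such as WET still respond.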
What-if Analysis made possible! • Provide a what-if dialog for the system admin • Execute “graph surgery” • Implement using multi-agent system paradigm for efficient inference • Provide timely results
What Bayesian Networks can do for us • Situational awareness: “what is going on?” • Prediction: “given the current situation, what is most likely to happen next?” • What-if analysis: “what will happen if I patch this service?” • Specify additional tests to perform: “which sensors to look at first to confirm/rule out?” • Suggest appropriate/cost-effective treatments/actions: “what to do first to obtain the maximum gain?” • Preventive maintenance: “what are the most vulnerable spots?”
Outline • Introductions • Overview of Bayesian Networks • Opportunities in Security Analysis • Challenges • Roadmap
Challenges of Using Bayesian Networks • Representation • Capturing the uncertainty in cyber security domain • From attack graphs to Bayesian networks • Semantics, semantics, semantics • Inference • Powerful and responsive • Learning • Tune the Bayesian networks
Challenges: Representation • Uncertainty management • Alerts themselves • Exploit sequence • Attack consequences • Attack intent • … and so on • Connecting uncertainty management with attack graph models • Semantics compatibility (node and link semantics) • Translation algorithm • Does this make sense?
Challenges: Inference • Tracking dynamic attacks on large-scale networks will be a very processor-intensive task. • Evaluating what-if solutions must be done in real time, in order to allow the human operator time to find and enforce his or her course of action. • Available standard BN products do not scale • A scalable, (much) faster inference engine is needed
Challenges: Learning • Mining from some dataset • What dataset • Appropriate for mining (relevant information) • Learn the structure • Model selection • Meaningful structure for security analysis • Expertise vs. learning • Learn the parameters • From dataset (e.g. EM algorithm) • Subjective nature of the parameters • Do the parameters reflect the situations?
Outline • Introductions • Overview of Bayesian Networks • Opportunities in Security Analysis • Challenges • Roadmap
How do we use Bayesian Nets? • Build Bayesian network models • Capturing uncertainty • Roadmap to build Bayesian network models • Powerful analysis algorithms • Clique tree based message passing algorithms • Multi-agent based approach • Learning (not included in this talk)
Capturing Uncertainty in Cyber Security • Class 1: uncertainty about alerts • Whether the alert is true, or false positive • Class 2: uncertainty about exploit sequence • Class 3: uncertainty about possible consequences • Misconfigurations • Inconsistent patches [Figure: exploits e1, e2, e3 and states S1, S2, with links labeled p(e2|e1) and p(e3|e1)]
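Class 1 uncertainty (is an alert a true or a false positive?) is a direct application of Bayes' rule. The following is a small sketch with a hypothetical sensor; the prior, detection rate, and false positive rate are assumed values chosen only to illustrate the base-rate effect.

```python
def posterior_attack(prior, detect_rate, false_positive_rate):
    """P(attack | alert) by Bayes' rule.

    prior               -- P(attack)
    detect_rate         -- P(alert | attack)
    false_positive_rate -- P(alert | no attack)
    """
    p_alert = detect_rate * prior + false_positive_rate * (1.0 - prior)
    return detect_rate * prior / p_alert

# Hypothetical sensor: 1% of events are attacks,
# 95% detection rate, 5% false positive rate.
P_TRUE_ALERT = posterior_attack(prior=0.01, detect_rate=0.95,
                                false_positive_rate=0.05)
```

Even with a seemingly accurate sensor, most alerts under these assumptions are false positives, which is one reason a flood of raw alarms carries little intelligence on its own.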
Building Bayesian Networks: Semantics • Nodes • Aggregate exploits • There are too many specific exploits; checking each and every one is infeasible • Some exploits have common signatures • Aggregate states • Similar hosts (in terms of network segment, software configuration, etc.) are equivalent • May represent some intermediate stage of multi-stage attacks (e.g. gaining a user account, with the goal of root privilege) • Directed links • “lead to” (e.g., exploit e1 leads to aggregate state s3) [Figure: exploit e1 with a directed link to state S3]
Our Approach to Building Bayesian Networks • Structure • Start from the deterministic attack graph (which embodies too many repetitive structures and is sometimes misleading) • Nodes are created based on aggregation techniques (reachability group, same enclave/configurations, etc.) • Develop an algorithm to generate links based on nodes and the attack graph • Similar to attack graph structure to some extent • Where do the numbers come from? • Frequency in the logs, subjective • Robust to parameter values • So what is it? • Hybrid model across abstraction levels (exploit, state, aggregates, subgoals, goals), with what-if questions at such levels • Embeds intelligence from the network, attack structures, and humans
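One simple way to obtain numbers from "frequency in the logs" while leaving room for subjective input is counting with Laplace smoothing. The sketch below is an assumption about how such an estimate could look, not the authors' procedure; the log records and the pseudo-count `alpha` are hypothetical.

```python
from collections import Counter

def estimate_cpt(records, child, parent, alpha=1.0):
    """Estimate P(child = True | parent) from boolean records.

    `alpha` is a pseudo-count (Laplace smoothing); it can be read as a
    weak subjective prior that keeps unseen cases away from 0 and 1.
    """
    counts = Counter((r[parent], r[child]) for r in records)
    cpt = {}
    for pv in (True, False):
        total = counts[(pv, True)] + counts[(pv, False)]
        cpt[pv] = (counts[(pv, True)] + alpha) / (total + 2 * alpha)
    return cpt

# Hypothetical log: each record says whether exploit e1 was seen and
# whether aggregate state s3 was reached afterwards.
LOG = ([{"e1": True, "s3": True}] * 7
       + [{"e1": True, "s3": False}] * 1
       + [{"e1": False, "s3": False}] * 8)

CPT_S3 = estimate_cpt(LOG, child="s3", parent="e1")
```

Because the smoothed estimates move only slightly when a few records change, this kind of estimate fits the stated goal of being robust to parameter values.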
Bayesian Network Inference • Inference is NP-hard on general Bayesian networks • For tree-structured BNs, an efficient algorithm exists based on message passing (J. Pearl) • But tree structure is too limited in practice • For multiply-connected BNs (each node can have multiple parent nodes) • This is the most applicable case • Clique tree based message passing algorithms • Shafer-Shenoy algorithm • Lauritzen-Spiegelhalter algorithm • Hugin Expert tool • Netica tool
Clique Tree based Inference • From variable elimination algorithms, the nodes can be organized into cliques • Rule 1: each clique node waits to send its message to a given neighbor until it has received messages from all its other neighbors • Rule 2: when a node is ready to send its message to a particular neighbor, it computes the message by collecting all its messages from other neighbors, multiplying its own table by these messages, and marginalizing the product to its intersection with the neighbor to whom it is sending
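The two rules can be sketched in the Shafer-Shenoy style on a very small clique tree. Everything concrete below is assumed for illustration: a hypothetical chain network A → B → C → D with made-up CPTs, giving cliques {A,B}, {B,C}, {C,D}; no evidence is entered; and factors are plain dictionaries rather than an optimized representation.

```python
from itertools import product

def multiply(f, g):
    """Pointwise product of two factors; a factor is (vars, table)."""
    fv, ft = f
    gv, gt = g
    vs = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for vals in product([True, False], repeat=len(vs)):
        a = dict(zip(vs, vals))
        table[vals] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return vs, table

def marginalize(f, keep):
    """Sum out every variable of f that is not in `keep`."""
    fv, ft = f
    vs = tuple(v for v in fv if v in keep)
    table = {}
    for vals, p in ft.items():
        key = tuple(val for v, val in zip(fv, vals) if v in keep)
        table[key] = table.get(key, 0.0) + p
    return vs, table

def cond(parent, child, p_true):
    """Factor for P(child | parent); p_true maps parent value -> P(child=True)."""
    return (parent, child), {
        (pv, cv): (p_true[pv] if cv else 1.0 - p_true[pv])
        for pv in (True, False) for cv in (True, False)}

# Hypothetical chain A -> B -> C -> D; each CPT is assigned to one clique.
PRIOR_A = (("A",), {(True,): 0.1, (False,): 0.9})
POTENTIAL = {
    "C1": multiply(PRIOR_A, cond("A", "B", {True: 0.8, False: 0.1})),
    "C2": cond("B", "C", {True: 0.5, False: 0.0}),
    "C3": cond("C", "D", {True: 0.9, False: 0.05}),
}
NEIGHBORS = {"C1": ["C2"], "C2": ["C1", "C3"], "C3": ["C2"]}

def message(src, dst, cache):
    """Message from clique `src` to neighbor `dst`.

    Rule 1: the recursion ensures messages from all other neighbors exist first.
    Rule 2: multiply them into src's table, then marginalize to the
    intersection (separator) of src and dst.
    """
    if (src, dst) not in cache:
        f = POTENTIAL[src]
        for nb in NEIGHBORS[src]:
            if nb != dst:
                f = multiply(f, message(nb, src, cache))
        sep = set(POTENTIAL[src][0]) & set(POTENTIAL[dst][0])
        cache[(src, dst)] = marginalize(f, sep)
    return cache[(src, dst)]

def marginal(var):
    """P(var = True), read off the belief of a clique that contains var."""
    cache = {}
    clique = next(c for c in POTENTIAL if var in POTENTIAL[c][0])
    belief = POTENTIAL[clique]
    for nb in NEIGHBORS[clique]:
        belief = multiply(belief, message(nb, clique, cache))
    return marginalize(belief, {var})[1][(True,)]
```

This sketch computes messages sequentially by recursion. In a larger tree, messages that do not depend on each other could be computed concurrently, since each one needs only the sender's table and its incoming messages.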
Opportunities and IAI Unique Expertise • Each clique can be modeled as an autonomous agent • The message passing can be run in parallel • The whole inference process can be modeled as a multi-agent system (MAS) • IAI is a leader in agent technology and MAS • Agent infrastructure: Cybele • Scalable multi-agent system: tens of thousands of agents • This unique combination will further improve the scalability and enhance the response time
Distributed Bayesian Network Engine • Why? • Tracking dynamic attacks on large-scale networks will be a very processor-intensive task. • Evaluating what-if solutions must be done in real time, in order to allow the human operator time to find and enforce his or her course of action. • Available standard BN engines do not scale • Solution: • Create a novel Distributed Bayesian Network engine to accommodate the kind of processing power needed. • Use general software engineering rules and methodology so that the distributed BN engine can be re-used in other domains.
Conclusions • Graphical models can be powerful for cyber security analysis and management in enterprise networks • To enable powerful analysis, we look into the potential of Bayesian networks • Lots of opportunities, but also full of challenges • Our approach • Understand the problem domain and BN models • Capture uncertainty • Obtain Bayesian nets from attack graphs • Distributed agent-based inference engine • The outcome can only be as good as your model …