Knowledge Engineering for Probabilistic Models: A Tutorial. Presented at: Fifteenth Conference on Uncertainty in Artificial Intelligence. July 28, 1999. Kathryn Laskey George Mason University and IET, Inc. Suzanne Mahoney Information Extraction and Transport, Inc.
Thanks to Mike Shwe for significant contribution to an earlier version of this tutorial.
Outline • Getting started • KA Process: building a model • Managing the process • An example
What is Knowledge Acquisition? • Objective: Construct a model to perform defined task • Participants: Collaboration between problem expert(s) and modeling expert(s) • Process: Iterate until done • Define task objective • Construct model • Evaluate model
Participants • Naive view • “Put problem experts and modeling experts in a room together and magic will happen” • Realistic view • Pure “problem experts” and pure “modeling experts” will talk past each other • Modeling experts must learn about the problem and problem experts must learn what models can do • This process can be time consuming and frustrating • Team will be more productive if both sides expect and tolerate this process • Training • The most productive way of training modelers and problem experts is to construct very simple models of stylized domain problems • Goal is understanding and NOT realism or accuracy!
The Domain and the Expert • Domains well suited to measurably good performance • Tasks are repeatable • Outcome feedback is available • Problems are decomposable • Phenomena are inherently predictable • Human behavior and/or “gaming” not involved • If this is not your domain, all is not lost! • All models are wrong but some are useful • KNOW YOUR OBJECTIVE! • Characteristics to look for in an expert • Expertise acknowledged by peers • Articulate • Interest and ability to reason about reasoning process • Tolerant of messy model-building process
Selecting a Sub-problem • For initial model or expanding existing model • Characteristics of a good sub-problem • Manageable size • Interesting in its own right • Path to expansion • How to restrict the problem • “Focus” variables • Restrict to subset of variables of interest • Restrict to subset of values • Evidence variables • Restrict to subset of evidence sources • Context variables • Restrict to subset of contextual conditions (sensing conditions, background causal conditions)
Outline • Getting started • KA Process: building a model • Managing the process • An example
KA Process: Construct Model • What are the variables? • What are their states? • What is the graph structure? • What is the structure of the local models? • What are the probability distributions?
What are the Variables? • Begin with “focus variable” and spread out to related variables • Ask about causally related variables • Variables that could cause a state to be true • Variables that could prevent a state from being true • Ask about enabling variables • Conditions that permit, enhance or inhibit operation of a cause • Ask about effects of a variable • Ask about associated variables • Knowing value provides information about another variable • Ask about observables
What are the States? • States must be exclusive and exhaustive • Naive modelers sometimes create separate variables for different states of the same variable • Types of variable • Binary (2-valued) • Qualitative • Numeric discrete • Numeric continuous • Dealing with infinite and continuous state sets • Some Bayes net software requires discretizing continuous variables • Bucket boundaries should represent meaningful differences in effect on related variables
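A minimal sketch of discretizing a continuous variable into exclusive and exhaustive states. The boundaries, state names, and the `discretize` helper are assumptions chosen for illustration; in practice the expert should pick boundaries that mark meaningful differences in effect on related variables.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries (degrees F) and state labels, for illustration only.
BOUNDARIES = [100.4, 103.0]
STATES = ["none", "moderate", "high"]  # always one more state than boundaries


def discretize(temp_f: float) -> str:
    """Map a continuous reading to exactly one state.

    Every real number falls in exactly one bucket, so the states are
    exclusive and exhaustive by construction. A reading equal to a
    boundary goes to the higher bucket.
    """
    return STATES[bisect_right(BOUNDARIES, temp_f)]


if __name__ == "__main__":
    for reading in (98.6, 101.0, 103.0, 105.2):
        print(reading, "->", discretize(reading))
```

Because each boundary is an explicit number, every state produced this way also has an unambiguous operational definition.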
The Clarity Test • Usually begin with loose structure to develop understanding of problem • Final model must have clear operational meaning shared among participants • Modeler, subject matter expert, user • Clarity test: • Could a clairvoyant unambiguously specify value of all nodes and states? • “Fever is high” does not pass clarity test • “Fever ≥ 103° F” passes clarity test
What is the Graph Structure? • Goals in specifying graph structure • Minimize probability elicitation • Fewer nodes, fewer arcs, smaller state spaces • Maximize fidelity of model • Sometimes requires more nodes, arcs, states • Balance benefit against cost of additional modeling • Too much detail can decrease accuracy • Maximize expert comfort with probability assessments • Drawing arcs in causal direction not required but often • Increases conditional independence • Increases ease of probability elicitation
Adding States to Parent • Original model: UTI binary with states {present, absent} • P(Malaise | UTI=present, Fever=present) > P(Malaise | UTI=present) • Redefine UTI states {absent, mild, moderate, severe} • P(Malaise | UTI=severe, Fever=present) ≈ P(Malaise | UTI=severe)
Adding Intermediate Variables • “True state” variable creates conditional independence of sensor reports • Intermediate mechanism creates independence among a set of related findings [Figure: two example networks. Node labels in the first: UTI, True abd. rebound, Surgeon's rebound rpt., Abdominal rebound rpt. Node labels in the second: Insufficient salt, Excess sugar, Excess yeast, Yeast frenzy, Soupy overflow, Sunken top, Large holes.]
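A minimal sketch of why a "true state" variable helps: once reports are conditionally independent given the true state, each report needs only its own likelihood, and the posterior is the normalized product. The function name `fuse_reports` and all numbers are hypothetical.

```python
def fuse_reports(prior, likelihoods):
    """Posterior over the true state given conditionally independent reports.

    prior: dict mapping state -> P(state)
    likelihoods: list of dicts mapping state -> P(observed report | state),
                 one dict per report actually received
    """
    unnormalized = dict(prior)
    for likelihood in likelihoods:
        for state in unnormalized:
            unnormalized[state] *= likelihood[state]
    total = sum(unnormalized.values())
    return {state: value / total for state, value in unnormalized.items()}


if __name__ == "__main__":
    # Hypothetical numbers: two observers both report the finding as present.
    prior = {"present": 0.1, "absent": 0.9}
    report = {"present": 0.8, "absent": 0.1}  # P(report says present | true state)
    print(fuse_reports(prior, [report, report]))
```

Without the true-state node, the second report would need a distribution conditioned on the first, and the number of assessments would grow with each added reporter.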
What is the Local Model Structure? • Local model: set of conditional probability distributions of child given values of parents • One distribution for each combination of values of parent variables • Assessment is exponential in number of parent variables • Assessment can be reduced by exploiting structure within local model • Examples of local model structure • Partitions of parent state space • Independence of causal influence • Contingent substructures • It helps if your software supports local model structure!
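To make "exponential in number of parent variables" concrete, the sketch below counts how many distributions and free probabilities a full conditional probability table needs. The helper name `cpt_size` is an assumption for illustration.

```python
from math import prod


def cpt_size(child_states: int, parent_states: list[int]) -> tuple[int, int]:
    """Return (number of distributions, number of free probabilities).

    One distribution is needed per combination of parent values, and each
    distribution over k child states has k - 1 free probabilities.
    """
    n_distributions = prod(parent_states)  # prod([]) == 1 for a root node
    return n_distributions, n_distributions * (child_states - 1)


if __name__ == "__main__":
    for n_parents in (1, 5, 10):
        print(n_parents, "binary parents ->", cpt_size(2, [2] * n_parents))
```

A binary child with ten binary parents already needs 1024 separate assessments, which is why the structured local models that follow matter.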
Simplifying Local Models: Elicitation by Partition • Partition state set of parents into subsets • Set of subsets is called a partition • Each subset is a partition element • Elicit one probability distribution per partition element • Child is independent of parent given partition element • Examples: • P(reported location | location, sensor-type, weather) independent of sensor type given weather=sunny • P(fever=high | disease) is the same for disease ∈ {flu, measles}
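A minimal sketch of elicitation by partition, loosely following the sensor example but simplified to two parents and a binary child ("report is accurate"). The sensor names, partition labels, and probabilities are hypothetical. The expert supplies one number per partition element and the full table is expanded mechanically.

```python
from itertools import product

# Parent variables and their states (hypothetical).
PARENTS = {
    "sensor_type": ["radar", "optical"],
    "weather": ["sunny", "cloudy"],
}


def partition_element(sensor_type: str, weather: str) -> str:
    """Map a parent combination to its partition element.

    In sunny weather the sensor type is assumed not to matter, so all
    sunny combinations share one element.
    """
    if weather == "sunny":
        return "sunny_any_sensor"
    return f"cloudy_{sensor_type}"


# One elicited value per partition element: P(report accurate | element).
ELICITED = {
    "sunny_any_sensor": 0.95,
    "cloudy_radar": 0.90,
    "cloudy_optical": 0.60,
}

# Expand to the full table expected by standard belief network software.
FULL_CPT = {
    combo: ELICITED[partition_element(*combo)]
    for combo in product(*PARENTS.values())
}

if __name__ == "__main__":
    for combo, p in FULL_CPT.items():
        print(combo, p)
```

Here three assessments fill four table rows; the savings grow quickly as parents and states are added.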
Simplifying Local Models: Independence of Causal Influence (ICI) • Assumption: causal influences operate independently of each other in producing effect • Probability that C1 causes effect does not depend on whether C2 is operating • Excludes synergy or inhibition • Examples • Noisy logic gates (Noisy-OR, Noisy-AND, Noisy-XOR) • Noisy adder • Noisy MAX • General noisy deterministic function
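As an illustration of ICI, here is a minimal Noisy-OR sketch without a leak term. Each cause gets a single link probability (the chance that it alone produces the effect), and the full table follows from the independence assumption. The function name and example numbers are hypothetical.

```python
from itertools import product


def noisy_or(link_probs):
    """Build P(effect = present | causes) for every combination of causes.

    link_probs[i] is the probability that cause i, acting alone, produces
    the effect. The effect is absent only if every active cause
    independently fails to produce it.
    """
    cpt = {}
    for config in product([False, True], repeat=len(link_probs)):
        p_effect_absent = 1.0
        for active, p in zip(config, link_probs):
            if active:
                p_effect_absent *= 1.0 - p
        cpt[config] = 1.0 - p_effect_absent
    return cpt


if __name__ == "__main__":
    # Two causes with hypothetical link probabilities 0.8 and 0.5.
    for config, p in noisy_or([0.8, 0.5]).items():
        print(config, round(p, 3))
```

With n binary causes the expert assesses n numbers instead of 2^n, at the price of ruling out synergy and inhibition.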
Conditional ICI • Combining ICI with standard conditional probability tables • e.g., age-dependent localization of pain [Figure: two network diagrams. Node labels: Age, Gastritis, Pancreatitis, and Epigastric pn in both; Age-Panc-Epig, Age-Gastritis-Epig, and MAX in one.]
Example: Noisy-MAX • Idea: effect takes on value of maximum contribution from individual causes • Restrictions: • Each cause must have an off state, which does not contribute to effect • Effect is off when all causes are off • Effect must have consecutive escalating values: e.g., absent, mild, moderate, severe. [Figure: two panels. “In theory”: node labels Cause1, Cause2, ..., Causen; Effect1, Effect2, ..., Effectn; MAX; Observed effect. “In modeling tool”: node labels Cause1, Cause2, ..., Causen; MAX; Observed effect.]
Assessing Noisy-MAX • “Assume that the effect X is absent and that all of the causes are absent. Cause Y_i changes from absent to y_ij. What is the probability that X = x_k?” • In practice, for example, “What is probability of severe malaise given that fever is moderate (102-104 °F), all other causes of malaise being absent?”
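A minimal sketch of how the answers to these questions combine under Noisy-MAX. Each cause, in its current state, contributes an independently distributed effect level, and the observed effect is the maximum. The state names follow the slides; the probabilities and the `noisy_max` helper are hypothetical.

```python
EFFECT_STATES = ["absent", "mild", "moderate", "severe"]  # escalating order


def noisy_max(contributions):
    """Distribution of the observed effect under Noisy-MAX.

    contributions: one list per cause, giving the elicited distribution over
    EFFECT_STATES for that cause in its current state (all other causes off).
    Uses P(max <= k) = product over causes of P(contribution <= k).
    """
    result = []
    previous_cumulative = 0.0
    for k in range(len(EFFECT_STATES)):
        cumulative = 1.0
        for dist in contributions:
            cumulative *= sum(dist[: k + 1])
        result.append(cumulative - previous_cumulative)
        previous_cumulative = cumulative
    return result


if __name__ == "__main__":
    fever_moderate = [0.2, 0.3, 0.3, 0.2]  # hypothetical elicited answers
    uti_mild = [0.5, 0.4, 0.1, 0.0]        # hypothetical elicited answers
    for state, p in zip(EFFECT_STATES, noisy_max([fever_moderate, uti_mild])):
        print(state, round(p, 3))
```

A cause in its off state contributes [1, 0, 0, 0] and so leaves the result unchanged, which matches the restriction that off states do not contribute to the effect.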
Contingent Substructures • Parts of model may be contingent on context • Variable • States of variable • Arcs • Sub-structures of model • Examples: • In a model for diagnosing depression, variable “Time in Menstrual Cycle” would not exist for male patients • In a model for predicting the behavior of military forces, a sub-model representing activities in a bridge-building operation would not exist if the unit is not an engineering unit • Modeling contingent substructures • Contingent values and variables can be represented in standard belief network software by special value NA (not applicable)
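A minimal sketch of the NA convention in a standard table. Where the variable does not apply, all probability goes to NA; where it applies, NA gets probability zero. The state names, placeholder probabilities, and the `is_consistent` check are hypothetical.

```python
NA = "NA"  # special "not applicable" state

# Hypothetical states and placeholder probabilities, for illustration only.
CYCLE_CPT = {
    "female": {"phase_1": 0.5, "phase_2": 0.5, NA: 0.0},
    "male": {"phase_1": 0.0, "phase_2": 0.0, NA: 1.0},
}


def is_consistent(cpt, applicable):
    """Check the NA convention for a contingent variable.

    applicable: set of parent states under which the variable exists.
    """
    for parent_state, dist in cpt.items():
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            return False
        expected_na = 0.0 if parent_state in applicable else 1.0
        if dist[NA] != expected_na:
            return False
    return True


if __name__ == "__main__":
    print(is_consistent(CYCLE_CPT, applicable={"female"}))
```

A check like this is cheap to run whenever the model changes and catches tables where the contingent variable leaks probability into states that cannot occur.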
What are the Probability Distributions? • Discrete variables • Direct elicitation: “p=.7” • Odds (for very small probabilities): “1 in 10,000” • Continuous variables • Bisection method • Elicit median: equally likely to be above and below • Elicit 25th percentile: bisects interval below median • Continue with other percentiles till fine enough discrimination • Often useful to fit standard functional form to expert’s judgments • Need to discretize for most belief net packages • There is a large literature on probability elicitation
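A minimal sketch of the last two steps for a continuous variable: fitting a standard functional form to elicited percentiles and then discretizing. It assumes a normal distribution, which is only one possible choice and ignores any asymmetry in the expert's quartiles; the function names and numbers are hypothetical.

```python
from statistics import NormalDist


def fit_normal(q25: float, median: float, q75: float) -> NormalDist:
    """Fit a normal distribution to an elicited median and quartiles.

    The mean is set to the median and the spread is chosen so that the
    interquartile range matches the expert's.
    """
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    sigma = (q75 - q25) / (2 * z75)
    return NormalDist(mu=median, sigma=sigma)


def bucket_probs(dist: NormalDist, boundaries: list[float]) -> list[float]:
    """Probability of each bucket defined by sorted boundaries."""
    cdfs = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]
    return [hi - lo for lo, hi in zip(cdfs, cdfs[1:])]


if __name__ == "__main__":
    fitted = fit_normal(q25=90.0, median=100.0, q75=110.0)
    print(bucket_probs(fitted, [80.0, 100.0, 120.0]))
```

Showing the expert the resulting bucket probabilities is a useful consistency check on the elicited percentiles.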
The Probability Wheel • Use for probabilities between about .1 and .9 • Ask expert which is more probable • event being assessed? • spinner landing in gray area? • Adjust gray area until expert agrees probabilities are same
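The adjustment loop can be sketched as a bisection search. Halving the interval is an assumption about how the elicitor adjusts the gray area, and `prefers_event` is a hypothetical stand-in for asking the expert.

```python
def probability_wheel(prefers_event, lo=0.0, hi=1.0, tolerance=0.01):
    """Narrow in on the gray-area fraction at which the expert is indifferent.

    prefers_event(gray) should return True when the expert judges the event
    more probable than the spinner landing in a gray area of that size.
    """
    while hi - lo > tolerance:
        gray = (lo + hi) / 2
        if prefers_event(gray):
            lo = gray  # event seems more likely: enlarge the gray area
        else:
            hi = gray  # spinner seems more likely: shrink the gray area
    return (lo + hi) / 2


if __name__ == "__main__":
    # Simulated expert whose underlying belief is 0.3 (hypothetical).
    estimate = probability_wheel(lambda gray: 0.3 > gray)
    print(round(estimate, 3))
```

A real expert's answers are noisy near the indifference point, so in practice the loop ends when the expert can no longer choose rather than at a fixed tolerance.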
Outline • Getting started • KA Process: building a model • Managing the process • An example
Managing Knowledge Elicitation • Record rationale for modeling decisions • Develop “style guide” to maintain consistency across multiple sub-problems • Naming conventions • Variable definitions • Modeling conventions • Enforce configuration management • History of model versions • Protocols for making and logging changes to current model • Rationale for changes • Develop protocol for testing models
Model Evaluation • Elicitation review • Review variable and state definitions • Clarity test, agreement on definitions, consistency • Review graph and local model structure • d-separation, review of partition definitions • Review probabilities • Compare probabilities with each other • Sensitivity analysis • Measures effect of one variable on another • Compare with expert intuition to evaluate model • Evaluate whether additional modeling is needed • Case-based evaluation • Run model on set of test cases • Compare with expert judgment or “ground truth”
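A minimal sketch of one-way sensitivity analysis on a two-node network (Disease → Test). One elicited parameter is swept over a range while the others are held fixed, and the resulting posterior is compared with expert intuition. All names and numbers are hypothetical.

```python
def posterior(prior: float, sensitivity: float, false_positive: float) -> float:
    """P(disease | positive test) by Bayes' rule."""
    numerator = prior * sensitivity
    return numerator / (numerator + (1.0 - prior) * false_positive)


def sweep_prior(priors, sensitivity=0.9, false_positive=0.1):
    """Vary the prior while holding the test characteristics fixed."""
    return [(p, posterior(p, sensitivity, false_positive)) for p in priors]


if __name__ == "__main__":
    for prior, post in sweep_prior([0.01, 0.05, 0.1, 0.3, 0.5]):
        print(f"prior={prior:.2f} -> posterior={post:.3f}")
```

If the conclusion changes sharply over the range of values the expert considers plausible, that parameter deserves more elicitation effort; if it barely moves, additional modeling there is probably not needed.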
Outline • The knowledge acquisition process • Modeling approaches to simplify knowledge acquisition • An example
Modeling Problem (p.1) Stanley Geek is a young recruit to the Islandian Intelligence Service. Geek’s first assignment is to build a model to predict the future course of events in the stormy relationship between Depravia and Impestuostria, who have historically been at odds over a region internationally recognized as being inside Depravia, but which the Impestuostrians regard as rightfully theirs. Geek has spent several weeks gathering evidence from experts, and has summarized his findings as follows in a memo to his superior intelligence officer. • The Impestuostrians are holding an election in two months. A strong candidate for president has ties to the fundamentalist Zappists, who believe the disputed territory was given by God to Impestuostria, the Depravians are occupying it against the will of God, and God is calling the Zappists to a holy war to repossess this part of their rightful homeland. The Zappist-connected candidate has been making inflammatory statements. Islandia is concerned that if he is elected, the new government will be more likely to initiate military action, and the situation will become unstable. • If the new candidate is elected, it is more likely that the Impestuostrian military will be placed on heightened alert, or might even attack Depravia.
Modeling Problem (p.2) • Indicators of heightened alert status include increased communications traffic, increased troop movements, movement of military units to strategically important areas, and increased conscription of recruits into the military. These same indicators would also be observed, but at a more intense level, if an attack was being planned. A real attack would be preceded by massing of troops near the Depravian/Impestuostrian border. • For years it has been suspected that Impestuostria was doing research on chemical weapons. Impestuostria might begin to manufacture chemical weapons in response to increased tensions. If this were the case, the most likely location would be the Crooked Creek chemical plant. If so, increased traffic in and out of the plant would be expected, especially of military vehicles. Islandia has secretly placed sensors in the vicinity of the plant that are sensitive to compounds used in the manufacture of chemical weapons. If the plant is being used to manufacture chemical weapons, it is expected that international observers would observe suspicious activity. The Zappists have always opposed the inspections negotiated under the nonproliferation treaty, and the Islandians are worried that if the new government is elected, the Impestuostrians will pull out of the treaty and expel the inspectors.
Modeling Problem (p.3) • The Crooked Creek facility is near a military base. Being a major industrial facility, Crooked Creek is one of the strategic locations Impestuostria would want to protect. Therefore, they are likely to respond to increased tensions by beefing up the military base near the plant. If the plant is being used as a chemical weapons facility, an increased military presence at the base is even more likely. The military might disguise a protective military presence around the plant as exercises. • If the Zappists are elected, the Impestuostrians are likely to step up terrorist activity inside Depravia. This is even more likely if they are planning an attack. • If the Impestuostrian economy continues to deteriorate, it will make it more likely that the Zappist candidate will win the election. If the Zappist candidate is elected and the economic situation is poor, it is more likely that the Impestuostrians will initiate an attack to distract the populace from its economic problems at home. Draw the graph of a Bayesian network that Stanley might build to model this problem. Define each node and its states clearly. Justify your structure. Assign local probability tables to the variables and justify your probabilities.
To dig deeper… • The knowledge elicitation process • Mahoney, S. and Laskey, K. Network Engineering for Complex Belief Networks, UAI-96 • Morgan, M.G., Henrion, M. and Small, M. Uncertainty: A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis, Cambridge University Press, 1990 • Barclay, S., Brown, R., Kelly, C., Peterson, C., Phillips, L. and Selvidge, J., Handbook for Decision Analysis, Decisions and Designs, Inc., 1977 (a classic but hard to find) • Probability elicitation • Upcoming special issue of IEEE TKDE • Morgan, Henrion, and Small, ibid. • DDI, ibid. • Spetzler, C.S. and Stael von Holstein, C-A.S., Probability Encoding in Decision Analysis, Management Science 22(3), 1975 (a classic) • Independence of causal influence • Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988 (for noisy OR: not the original source, but easily available) • Srinivas, S. A Generalization of the Noisy-OR Model, UAI-93 • Elicitation by partitions • Heckerman, D. Probabilistic Similarity Networks, MIT Press, 1991 • Models and Tools to Support KE • Pradhan, M., Provan, G., Middleton, B. and Henrion, M. Knowledge Engineering for Large Belief Networks, UAI-94 (A very limited and idiosyncratic sampling of sources of further information.)