
Yuval Shahar, M.D., Ph.D.

Judgment and Decision Making in Information Systems: Computing with Influence Diagrams and the PathFinder Project. Yuval Shahar, M.D., Ph.D.


Presentation Transcript


  1. Judgment and Decision Making in Information Systems: Computing with Influence Diagrams and the PathFinder Project. Yuval Shahar, M.D., Ph.D.

  2. Influence Diagrams • A graphical notation for modeling situations involving multiple decisions, probabilities, and utilities • Computationally: equivalent to decision trees • Advantages relative to decision trees: • conciseness • representation in assessment order • explicit (in)dependencies represented intuitively • Disadvantages relative to decision trees: • Ambiguous timing of decisions • Hiding of internal relationships • Hiding of asymmetry

  3. Influence Diagrams: Node Conventions Chance node Deterministic node Decision node Utility node

  4. Link Semantics in Influence Diagrams Dependence link (possible probabilistic relationship) Information link “No-forgetting” link Influence link

  5. The HIV Example as a Decision Tree Decision node Chance node

  6. The HIV Example as an Influence Diagram

  7. Internal Structure of Influence Diagrams

  8. Evaluating Influence Diagrams: The Shachter Arc Reversal and Node Removal Algorithm • Step 1: Eliminate all nodes (except Value) that do not point to another node (barren nodes). • Step 2: As long as one or more nodes point into the Value node: • (a) If there is a decision node D that points into Value, and all other nodes that point into Value also point into D, remove D by policy determination. Remove any nodes (other than Value) that no longer point into another node. Go to Step 2. • (b) If there is a chance node that points only into Value, remove it by averaging. Go to Step 2. • (c) Find a chance node C that points into Value and not into a decision. Reverse all arcs that point from C into other chance nodes without creating a cycle. Now C will point only into Value. Go to Step 2.
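
Step 1 of the algorithm above (barren-node elimination) can be sketched as follows. This is only a minimal illustration, not Shachter's implementation: the diagram is assumed to be stored as a dict that maps each node to the set of its children, and the node names in the example are hypothetical.

```python
def remove_barren_nodes(children, value_node="Value"):
    """Repeatedly delete nodes (other than the value node) that have no children.

    `children` maps each node name to the set of nodes it points to.
    Returns a new graph; the input is left untouched.
    """
    graph = {node: set(kids) for node, kids in children.items()}
    while True:
        barren = [n for n, kids in graph.items() if not kids and n != value_node]
        if not barren:
            return graph
        for n in barren:
            del graph[n]
        # Deleting a node also deletes every arc that pointed into it,
        # which may make its former parents barren in the next pass.
        for kids in graph.values():
            kids.difference_update(barren)


# Hypothetical diagram: "Symptom" and "Note" influence nothing downstream.
example = {
    "Disease": {"Test", "Value", "Symptom"},
    "Test": {"Treat?"},
    "Treat?": {"Value"},
    "Symptom": {"Note"},
    "Note": set(),
    "Value": set(),
}
pruned = remove_barren_nodes(example)
```

Note that removing "Note" makes "Symptom" barren in turn, which is why the elimination is repeated until nothing changes.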

  9. Evaluating Influence Diagrams: Computational Notes • Removal of any type of node involves drawing arcs from its parents to its child (Value) • In Step 2a, the decision D pointing into Value has observed all the relevant information (chance outcomes), so we can choose the best policy and remove D • In Step 2b, the outcomes of C are revealed after all decisions were made, so we can average the values (equivalent to folding back a branch of a decision tree) and remove C • In Step 2c, the outcomes of C are also revealed after all decisions were made, but we first need to reverse the arcs pointing from it to other chance nodes (by application of Bayes’ theorem) to get an observation order; note that arc reversal might involve adding new arcs, so that the two nodes have the same parents
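
Arc reversal between two chance nodes is an application of Bayes' theorem. The sketch below assumes the simplest case, a node C with a single arc into a node T and no other parents; the probabilities are made-up illustrative numbers, not those of the HIV example.

```python
def reverse_arc(prior_c, p_t_given_c):
    """Reverse the arc C -> T into T -> C.

    prior_c:      {c: P(c)}
    p_t_given_c:  {c: {t: P(t | c)}}
    Returns (P(t), {t: {c: P(c | t)}}). Assumes every P(t) is non-zero.
    """
    outcomes_t = list(next(iter(p_t_given_c.values())).keys())
    # Marginal of T: sum over the outcomes of C.
    p_t = {t: sum(prior_c[c] * p_t_given_c[c][t] for c in prior_c) for t in outcomes_t}
    # Bayes' theorem: P(c | t) = P(c) P(t | c) / P(t).
    p_c_given_t = {
        t: {c: prior_c[c] * p_t_given_c[c][t] / p_t[t] for c in prior_c}
        for t in outcomes_t
    }
    return p_t, p_c_given_t


# Hypothetical numbers for illustration only.
prior = {"present": 0.1, "absent": 0.9}
likelihood = {
    "present": {"pos": 0.9, "neg": 0.1},
    "absent": {"pos": 0.2, "neg": 0.8},
}
p_test, p_disease_given_test = reverse_arc(prior, likelihood)
```

When the two nodes have additional parents, the same computation is carried out for every combination of parent values, which is why reversal can add arcs.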

  10. Evaluating the HIV Example (I) Note: We cannot remove any nodes, so we reverse the arc

  11. Evaluating the HIV Example (II) Note: “HIV Status” is now conditioned on “Obtain PCR?” and “PCR Result” by adding an arc from “Obtain PCR?” and reversing the arcs

  12. Evaluating the HIV Example (III) Note: “HIV Status” has been removed and Value is conditioned on “Obtain PCR?,” “PCR Result,” and “Treat?” which enables us to remove “Treat?” by policy determination for each case

  13. Evaluating the HIV Example (IV) Note: “Treat?” has been removed and only the maximal values are used (we will usually record the decision direction we actually used for each test result); we can now remove “PCR Result” by averaging Value using the outcomes of “PCR Result”

  14. Evaluating the HIV Example (V) Note: “PCR Result” has been removed, enabling us to remove “Obtain PCR?” by policy determination (maximization of value), recording the decision (Yes) and the resulting expected value (70.2969)
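
The last three steps (policy determination, averaging, policy determination) can be sketched numerically. The probabilities and values below are hypothetical and are NOT the ones behind the 70.2969 result on the slide, whose inputs are not given in the transcript; the option names "treat" and "wait" are likewise assumptions.

```python
# Hypothetical numbers, not the ones used in the slides.
# After arc reversal and removal of the disease node, Value depends on
# (obtain_test, test_result, treat) and the result depends on obtain_test.
p_result = {
    "yes": {"pos": 0.27, "neg": 0.73},
    "no": {"none": 1.0},  # no test performed, so a single dummy result
}
value = {
    ("yes", "pos", "treat"): 60.0, ("yes", "pos", "wait"): 40.0,
    ("yes", "neg", "treat"): 70.0, ("yes", "neg", "wait"): 90.0,
    ("no", "none", "treat"): 68.0, ("no", "none", "wait"): 75.0,
}
TREAT_OPTIONS = ("treat", "wait")

# Remove "Treat?" by policy determination: keep the best action per case
# and record which action was chosen.
policy, best_value = {}, {}
for obtain, results in p_result.items():
    for result in results:
        action = max(TREAT_OPTIONS, key=lambda a: value[(obtain, result, a)])
        policy[(obtain, result)] = action
        best_value[(obtain, result)] = value[(obtain, result, action)]

# Remove "PCR Result" by averaging Value over its outcomes.
expected = {
    obtain: sum(p * best_value[(obtain, result)] for result, p in results.items())
    for obtain, results in p_result.items()
}

# Remove "Obtain PCR?" by policy determination (maximization of value).
best_obtain = max(expected, key=expected.get)
```

With these made-up inputs the recorded policy is to treat after a positive result and wait after a negative one, and testing is preferred to not testing.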

  15. The Pathfinder Project (Heckerman, Horvitz, Nathwani 1992) • Task and domain: Diagnosis of lymph node biopsy, an important medical problem • Large difference between expert and general pathologist opinions (almost 65%!) • Problems in the domain include • Misrecognition of features (information gathering) • Misintegration of evidence (information processing) • The Pathfinder project focused mainly on assistance in information processing • A Stanford/USC collaboration; eventually commercialized as Intellipath, marketed by the ACP, used as early as 1992 by at least 200 pathology sites

  16. Pathfinder Domain • More than 60 diseases • More than 130 findings, such as: • Microscopic • immunological • molecular biology • Laboratory • Clinical • Commercial product extended to at least 10 more medical domains

  17. Pathfinder I/O behavior • Input: set of <Feature, Instance> (<Fi, Ii>) pairs (e.g., <NECROSIS, ABSENT>) • Instances are mutually exclusive values of each feature • Prior probability of each disease Dk is known • P(F1I1, F2I2…FtIt | Dk,x) is in acquired knowledge base • Output: P(Dk|F1I1, F2I2…FmIm,x) • x = background knowledge (context) • User can ask what is the next best (cost-effective) feature to investigate or enter • Probabilistic (decision-theoretic) hypothetico-deductive approach • Distribution of each Dk is updated dynamically

  18. Pathfinder Methodology: Probabilities and Utilities • Decision-theoretic computation • Bayesian approach: Probabilities represent beliefs of experts (data can update beliefs) • Utilities represented as a matrix of all diseases • A matrix entry pair <Dj, Dk> encodes the (patient) utility of diagnosing Dk when the patient really has Dj • Since no therapeutic recommendations are made, the model can use one representative patient (the expert), whose utilities are expressed in micromorts and willingness to pay to avoid the risk of each outcome

  19. Pathfinder Computation • Normally we would use the general form of Bayes Theorem: • But that involves exponential number of probabilities to be acquired and represented
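
The equation on this slide did not survive in the transcript. In the notation of slide 17, the general form of Bayes' theorem would presumably read as follows (a reconstruction, not the slide's own typesetting):

```latex
P(D_k \mid F_1 I_1, \ldots, F_m I_m, x) =
  \frac{P(F_1 I_1, \ldots, F_m I_m \mid D_k, x)\, P(D_k \mid x)}
       {P(F_1 I_1, \ldots, F_m I_m \mid x)}
```

The joint likelihood in the numerator needs one probability for every combination of feature instances and every disease, which is the source of the exponential growth.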

  20. Pathfinder 1: The Simple Bayes Version • Assuming conditional independence of features (Simple or Naïve Bayes): • Assuming mutual exclusivity and exhaustiveness of diseases the overall computation is tractable:
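
The two equations on this slide are also missing from the transcript. Under conditional independence the joint likelihood presumably factors as P(F1I1, …, FmIm | Dk, x) = Πi P(FiIi | Dk, x), and with mutually exclusive and exhaustive diseases the denominator becomes a sum over all diseases. A minimal sketch of that computation follows; the disease names, the FIBROSIS feature, and all numbers are hypothetical, and this is not Pathfinder's actual code.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observations):
    """Posterior over diseases under the simple (naive) Bayes assumptions.

    priors:        {disease: P(disease | x)}
    likelihoods:   {disease: {feature: {instance: P(instance | disease, x)}}}
    observations:  {feature: observed instance}
    """
    # Prior times the product of per-feature likelihoods (conditional independence).
    unnormalized = {
        d: priors[d] * prod(likelihoods[d][f][i] for f, i in observations.items())
        for d in priors
    }
    # Summing over diseases is valid only if they are exclusive and exhaustive.
    total = sum(unnormalized.values())
    return {d: u / total for d, u in unnormalized.items()}


# Hypothetical knowledge base with two diseases and two features.
priors = {"D1": 0.3, "D2": 0.7}
likelihoods = {
    "D1": {"NECROSIS": {"ABSENT": 0.2, "PRESENT": 0.8},
           "FIBROSIS": {"ABSENT": 0.5, "PRESENT": 0.5}},
    "D2": {"NECROSIS": {"ABSENT": 0.9, "PRESENT": 0.1},
           "FIBROSIS": {"ABSENT": 0.6, "PRESENT": 0.4}},
}
posterior = naive_bayes_posterior(
    priors, likelihoods, {"NECROSIS": "PRESENT", "FIBROSIS": "PRESENT"}
)
```

The number of stored probabilities grows linearly with the number of features rather than exponentially, which is what makes the computation tractable.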

  21. Pathfinder 2: The Belief Network Version • Mutual exclusivity and exhaustiveness of diseases are reasonable assumptions in lymph node pathology • Single disease per examined lymph node • Large, exhaustive knowledge base • Conditional independence is less reasonable and can lead to erroneous conclusions • The simple Bayes representation of Pathfinder 1 was therefore enhanced to a belief network in Pathfinder 2, which included explicit dependencies between different features while still taking advantage of any explicit global and conditional independencies

  22. Decision-Theoretic Diagnosis • Using the utility matrix and given observations f, the expected utility of diagnosing Dk is averaged over all possible true diagnoses Dj: • EU(Dk(f)) = Σj P(Dj | f) U(Dj, Dk) • Thus, Dx(f) = ARGMAXk [EU(Dk(f))] • However, since the diagnosis is sensitive to the utility model, Pathfinder does not recommend a diagnosis; it reports only the probabilities P(Dk | f)
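
The expected-utility rule on this slide can be sketched in a few lines. The two diagnoses and all numbers are hypothetical; `utility[(dj, dk)]` is assumed to hold U(Dj, Dk), the utility of diagnosing Dk when the true disease is Dj.

```python
def best_diagnosis(posterior, utility):
    """Return (argmax_k EU(Dk), {Dk: EU(Dk)}).

    posterior: {Dj: P(Dj | f)}
    utility:   {(Dj, Dk): utility of diagnosing Dk when the truth is Dj}
    """
    expected = {
        dk: sum(posterior[dj] * utility[(dj, dk)] for dj in posterior)
        for dk in posterior
    }
    return max(expected, key=expected.get), expected


# Hypothetical posterior and utility matrix.
posterior = {"benign": 0.6, "malignant": 0.4}
utility = {
    ("benign", "benign"): 1.0, ("benign", "malignant"): 0.8,
    ("malignant", "benign"): 0.0, ("malignant", "malignant"): 1.0,
}
diagnosis, expected_utility = best_diagnosis(posterior, utility)
```

In this made-up case the utility-maximizing diagnosis is "malignant" even though "benign" is more probable, which illustrates how sensitive the recommended diagnosis is to the utility model.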

  23. Value of Information (VI) • We often need to decide what would be the next best diagnostic test to perform—for example, the next best blood test or even the best next question to ask • Recall: The Value of Information (VI) of feature f is the marginal expected utility of an optimal decision made knowing f, compared to making it without knowing f • The net value of information (NVI) of f = VI(f)-cost(f) • NVI is highly useful for deciding what would be the next test, if any, to perform, in a diagnostic setting
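
Written out in the notation of slide 22, the definitions above might read as follows. This is a sketch under stated assumptions: e stands for the evidence observed so far and i ranges over the instances of feature f, neither of which is the slide's own notation.

```latex
\mathrm{VI}(f) = \sum_{i} P(f = i \mid e)\, \max_{k} EU\bigl(D_k \mid e, f = i\bigr)
                 \;-\; \max_{k} EU\bigl(D_k \mid e\bigr),
\qquad
\mathrm{NVI}(f) = \mathrm{VI}(f) - \mathrm{cost}(f)
```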

  24. Pathfinder: Gathering Information • Next best feature to observe is recommended using a myopic approximation, which considers only up to one single feature to be observed • The feature chosen maximizes EU given that a diagnosis would be made after observing it • Feature f is chosen that maximizes NVI(f) • Although myopic approximation could backfire, in practice it works well • especially when U(Dj, Dk) is set to 0 if one of the diseases is malignant and the other benign, and set to 1 if they are both malignant or both benign
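
A minimal sketch of the myopic selection described above, assuming a diagnosis is made immediately after observing one feature. The feature names, probabilities, and costs are hypothetical, the cost is assumed to be expressed in the same units as the utilities, and this is not Pathfinder's actual implementation.

```python
def expected_utility_of_best(posterior, utility):
    """max_k sum_j P(Dj) U(Dj, Dk)."""
    return max(
        sum(posterior[dj] * utility[(dj, dk)] for dj in posterior)
        for dk in posterior
    )


def myopic_nvi(prior, likelihood, utility, cost):
    """Net value of information of observing a single feature.

    prior:       {disease: P(disease | evidence so far)}
    likelihood:  {disease: {instance: P(instance | disease)}}
    """
    baseline = expected_utility_of_best(prior, utility)
    instances = list(next(iter(likelihood.values())).keys())
    with_feature = 0.0
    for inst in instances:
        p_inst = sum(prior[d] * likelihood[d][inst] for d in prior)
        if p_inst == 0:
            continue  # an impossible outcome contributes nothing
        posterior = {d: prior[d] * likelihood[d][inst] / p_inst for d in prior}
        with_feature += p_inst * expected_utility_of_best(posterior, utility)
    return with_feature - baseline - cost


# Hypothetical setup using the 0/1 utility described on the slide.
prior = {"benign": 0.6, "malignant": 0.4}
utility = {(dj, dk): 1.0 if dj == dk else 0.0 for dj in prior for dk in prior}
features = {
    "A": {"benign": {"pos": 0.1, "neg": 0.9}, "malignant": {"pos": 0.8, "neg": 0.2}},
    "B": {"benign": {"pos": 0.5, "neg": 0.5}, "malignant": {"pos": 0.5, "neg": 0.5}},
}
nvi = {name: myopic_nvi(prior, lik, utility, cost=0.05) for name, lik in features.items()}
next_feature = max(nvi, key=nvi.get)
```

Feature "B" is uninformative, so its NVI is just the negative of its cost; a system following this rule would stop asking for features once every NVI is non-positive.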

  25. Pathfinder 2: Knowledge Acquisition • To facilitate acquisition of multiple probabilities, a Similarity Network model was developed • Using similarity networks, an expert creates multiple small belief networks, each representing 2 or more diseases that are difficult to distinguish • The local belief networks are then unified into a global belief network, preserving soundness • The graphical interface also allows partitioning of diseases into sets within each of which a given feature is independent of the specific disease, thus further assisting in the construction

  26. Pathfinder 1 and 2: Evaluation • Pathfinder 1 was compared to Pathfinder 2 using 53 cases, a new user, and a thorough analysis of each case • Diagnostic accuracy of PF2 is greater than that of PF1 (gold standard: the main domain expert’s distribution and his assessment on a scale of 1 to 10) • Difference is due to better probabilistic representation (better acquisition and inference) • Cost of constructing PF2 rather than PF1 is justified by the improvements (measure: the utility of the diagnosis) • PF2 is at least as good as the main domain expert with respect to diagnostic accuracy
