
CHAPTER 7


Presentation Transcript


  1. CHAPTER 7: BAYESIAN NETWORK INDEPENDENCE, BAYESIAN NETWORK INFERENCE, MACHINE LEARNING ISSUES

  2. Review: Alarm Network

  3. Causality?
  • When Bayesian Networks reflect the true causal patterns:
    – Often simpler (nodes have fewer parents)
    – Often easier to think about
    – Often easier to elicit from experts
  • BNs need not actually be causal
    – Sometimes no causal net exists over the domain
    – E.g. consider the variables Traffic and RoofDrips
    – End up with arrows that reflect correlation, not causation
  • What do the arrows really mean?
    – Topology may happen to encode causal structure
    – Topology really encodes conditional independencies
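To make the correlation-vs-causation point concrete, here is a minimal Python sketch (my own illustration, not from the slides; the numbers are made up) showing that a one-arrow network Traffic → RoofDrips and a one-arrow network RoofDrips → Traffic encode exactly the same joint distribution, so the arrow direction by itself says nothing about causation:

```python
# Made-up joint P(T, D): Traffic and RoofDrips tend to co-occur (it rains).
joint = {(1, 1): 0.30, (1, 0): 0.10, (0, 1): 0.15, (0, 0): 0.45}

# Marginals P(T) and P(D).
p_t = {t: sum(p for (tt, _), p in joint.items() if tt == t) for t in (0, 1)}
p_d = {d: sum(p for (_, dd), p in joint.items() if dd == d) for d in (0, 1)}

# CPTs for the two possible one-arrow networks.
p_d_given_t = {(d, t): joint[(t, d)] / p_t[t] for t in (0, 1) for d in (0, 1)}
p_t_given_d = {(t, d): joint[(t, d)] / p_d[d] for t in (0, 1) for d in (0, 1)}

for (t, d), p in joint.items():
    assert abs(p_t[t] * p_d_given_t[(d, t)] - p) < 1e-12   # T -> D factorization
    assert abs(p_d[d] * p_t_given_d[(t, d)] - p) < 1e-12   # D -> T factorization
print("Both arrow directions encode the same joint distribution.")
```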

  4. Creating Bayes’ Nets
  • Last time: we talked about how any fixed Bayesian Network encodes a joint distribution
  • Today: how to represent a fixed distribution as a Bayesian Network
    – Key ingredient: conditional independence
    – The exercise we did in “causal” assembly of BNs was a kind of intuitive use of conditional independence
    – Now we have to formalize the process
  • After that: how to answer queries (inference)
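As a small illustration of what formalizing a conditional independence statement can look like, here is a sketch (my own, with a made-up joint table constructed so the statement holds) that checks X ⟂ Y | Z directly against the definition P(x, y | z) = P(x | z) · P(y | z):

```python
from itertools import product

# Made-up P(x, y, z) over three binary variables, at positions 0, 1, 2.
joint = {
    (0, 0, 0): 0.135, (0, 1, 0): 0.135, (1, 0, 0): 0.315, (1, 1, 0): 0.315,
    (0, 0, 1): 0.036, (0, 1, 1): 0.024, (1, 0, 1): 0.024, (1, 1, 1): 0.016,
}

def marginal(dist, keep):
    """Sum out every variable position not listed in `keep`."""
    out = {}
    for assignment, p in dist.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def cond_indep(dist, x=0, y=1, z=2, tol=1e-9):
    """True iff P(x, y | z) == P(x | z) * P(y | z) for every assignment."""
    p_z  = marginal(dist, (z,))
    p_xz = marginal(dist, (x, z))
    p_yz = marginal(dist, (y, z))
    for xv, yv, zv in product((0, 1), repeat=3):
        key = [None, None, None]
        key[x], key[y], key[z] = xv, yv, zv
        lhs = dist[tuple(key)] / p_z[(zv,)]                       # P(x, y | z)
        rhs = (p_xz[(xv, zv)] / p_z[(zv,)]) * (p_yz[(yv, zv)] / p_z[(zv,)])
        if abs(lhs - rhs) > tol:
            return False
    return True

print(cond_indep(joint))   # True for this particular table
```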

  5. Conditional Independence

  6. Conditional Independence

  7. Independence in a BN

  8. Causal Chains

  9. Common Cause

  10. Common Effect

  11. The General Case

  12. Reachability

  13. Reachability (the Bayes Ball)

  14. Example

  15. Inference

  16. Reminder: Alarm Network

  17. Atomic Inference

  18. Inference by Enumeration

  19. Evaluation Tree

  20. Variable Elimination
  • Still lots of redundant work in the computation tree!
  • We can save time if we cache all partial results
    – This is the basic idea behind the variable elimination algorithm
  • Compute and store factors over variables, which represent results of intermediate computations
    – All CPDs are factors, but not all factors are CPDs
    – Thus not always “human interpretable”
  • Only improves efficiency; it doesn’t improve the worst-case time complexity
    – Still exponential in the number of variables
  • That’s all we’ll expect you to know!
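The following is a minimal sketch of the two factor operations behind variable elimination: multiplying factors and summing a variable out. It is my own illustrative implementation (binary variables only, made-up CPTs), not code from the lecture:

```python
from itertools import product

def multiply(f1, f2):
    """Pointwise product of two factors over the union of their variables."""
    v1, t1 = f1
    v2, t2 = f2
    joint_vars = v1 + tuple(v for v in v2 if v not in v1)
    table = {}
    for assign in product((0, 1), repeat=len(joint_vars)):   # binary variables only
        a = dict(zip(joint_vars, assign))
        table[assign] = t1[tuple(a[v] for v in v1)] * t2[tuple(a[v] for v in v2)]
    return joint_vars, table

def sum_out(f, var):
    """Eliminate `var` from factor f by summing over its values."""
    vs, t = f
    keep = tuple(v for v in vs if v != var)
    out = {}
    for assign, p in t.items():
        key = tuple(val for v, val in zip(vs, assign) if v != var)
        out[key] = out.get(key, 0.0) + p
    return keep, out

# Tiny two-node chain R -> T with made-up CPTs P(R) and P(T | R);
# computing P(T) means multiplying the two factors and summing out R.
P_R  = (("R",), {(1,): 0.1, (0,): 0.9})
P_TR = (("T", "R"), {(1, 1): 0.8, (0, 1): 0.2, (1, 0): 0.1, (0, 0): 0.9})
P_T  = sum_out(multiply(P_R, P_TR), "R")
print(P_T)   # factor over ('T',) with P(T=1) = 0.1*0.8 + 0.9*0.1 = 0.17
```

In a full elimination run you would repeat multiply/sum_out once per hidden variable, in some elimination ordering; the intermediate factors produced along the way are exactly the cached partial results the slide refers to.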

  21. Classification

  22. Tuning on Held-Out Data

  23. Confidences from a Classifier

  24. Precision vs. Recall

  25. Precision vs. Recall

  26. Errors, and What to Do

  27. What to Do About Errors?
  • Need more features: words aren’t enough!
    – Have you emailed the sender before?
    – Have 1K other people just gotten the same email?
    – Is the sending information consistent?
    – Is the email in ALL CAPS?
    – Do inline URLs point where they say they point?
    – Does the email address you by (your) name?
  • Naïve Bayes models can incorporate a variety of features, but tend to do best in homogeneous cases (e.g. all features are word occurrences)

  28. Features
  • A feature is a function which signals a property of the input
  • Examples:
    – ALL_CAPS: value is 1 iff the email is in all caps
    – HAS_URL: value is 1 iff the email has a URL
    – NUM_URLS: number of URLs in the email
    – VERY_LONG: 1 iff the email is longer than 1K
    – SUSPICIOUS_SENDER: 1 iff the reply-to domain doesn’t match the originating server
  • Features are anything you can write code to evaluate on an input
    – Some are cheap, some very expensive to calculate
    – Can even be the output of another classifier
    – Domain knowledge goes here!
  • In Naïve Bayes, how did we encode features?
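To make the features-as-functions idea concrete, here is a small sketch of extractors along the lines of the slide’s examples. The email representation (a dict with `body`, `from`, and `reply_to` fields) and the 1K threshold are assumptions made for illustration:

```python
import re

def all_caps(email):
    """ALL_CAPS: 1 iff the email body is entirely upper case."""
    return int(email["body"].isupper())

def has_url(email):
    """HAS_URL: 1 iff the body contains a URL."""
    return int(bool(re.search(r"https?://", email["body"])))

def num_urls(email):
    """NUM_URLS: number of URLs in the body."""
    return len(re.findall(r"https?://", email["body"]))

def very_long(email):
    """VERY_LONG: 1 iff the body is longer than 1K characters."""
    return int(len(email["body"]) > 1000)

def suspicious_sender(email):
    """SUSPICIOUS_SENDER: 1 iff the reply-to domain differs from the sender's domain."""
    return int(email["reply_to"].split("@")[-1] != email["from"].split("@")[-1])

def extract_features(email):
    """Run every extractor and return a named feature dictionary."""
    return {
        "ALL_CAPS": all_caps(email),
        "HAS_URL": has_url(email),
        "NUM_URLS": num_urls(email),
        "VERY_LONG": very_long(email),
        "SUSPICIOUS_SENDER": suspicious_sender(email),
    }

email = {
    "body": "CLICK http://example.com TO CLAIM YOUR PRIZE",
    "from": "promo@deals.example",
    "reply_to": "x@other.example",
}
print(extract_features(email))
# {'ALL_CAPS': 0, 'HAS_URL': 1, 'NUM_URLS': 1, 'VERY_LONG': 0, 'SUSPICIOUS_SENDER': 1}
```

In the word-based Naïve Bayes model, the features were simply word occurrences, which is the homogeneous case the previous slide refers to.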

  29. Feature Extractors

  30. Generative vs. Discriminative
  • Generative classifiers:
    – E.g. Naïve Bayes
    – We build a causal model of the variables
    – We then query that model for causes, given evidence
  • Discriminative classifiers:
    – E.g. Perceptron (next)
    – No causal model, no Bayes rule, often no probabilities
    – Try to predict the output directly
    – Loosely: mistake driven rather than model driven
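A compact way to see the distinction: a generative classifier scores a label by querying a probability model of how the data was produced, while a discriminative one computes a score from the features directly. The sketch below is my own illustration; the probability tables and weights are made-up numbers:

```python
import math

features = {"HAS_URL": 1, "ALL_CAPS": 0, "SUSPICIOUS_SENDER": 1}

# Generative (Naive Bayes style): model P(label) and P(feature | label),
# then query the model: score(y) = log P(y) + sum_f log P(f | y).
prior = {"spam": 0.3, "ham": 0.7}
likelihood = {          # P(feature = 1 | label), made up
    "spam": {"HAS_URL": 0.8, "ALL_CAPS": 0.5, "SUSPICIOUS_SENDER": 0.7},
    "ham":  {"HAS_URL": 0.3, "ALL_CAPS": 0.1, "SUSPICIOUS_SENDER": 0.05},
}

def nb_score(label):
    s = math.log(prior[label])
    for f, v in features.items():
        p = likelihood[label][f]
        s += math.log(p if v == 1 else 1 - p)
    return s

# Discriminative (perceptron style): no probabilities, just a weight vector;
# predict spam iff the activation w . f is positive.
weights = {"HAS_URL": 1.5, "ALL_CAPS": 0.5, "SUSPICIOUS_SENDER": 2.0, "BIAS": -2.0}
activation = weights["BIAS"] + sum(weights[f] * v for f, v in features.items())

print("Naive Bayes picks:", max(("spam", "ham"), key=nb_score))
print("Perceptron picks:", "spam" if activation > 0 else "ham")
```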

  31. Some (Vague) Biology
