
Final Exam Review



  1. Final Exam Review Final Exam: Thursday, May 10

  2. Bayesian reasoning • If event E occurs, then the probability that event H will occur is p(H|E) • IF E (evidence) is true THEN H (hypothesis) is true with probability p • Bayes' rule gives this probability as p(H|E) = p(E|H) · p(H) / p(E)

  3. Bayesian reasoning Example: Cancer and Test • P(C) = 0.01 P(¬C) = 0.99 • P(+|C) = 0.9 P(-|C) = 0.1 • P(+|¬C) = 0.2 P(-|¬C) = 0.8 • P(C|+) = ?
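Working the example through with Bayes' rule, p(C|+) = p(+|C)·p(C) / p(+), gives about 0.043. A minimal Python sketch of the same arithmetic (variable names are mine; the probabilities are from the slide):

    # Bayes' rule for the cancer/test example
    p_c, p_not_c = 0.01, 0.99            # P(C), P(not C)
    p_pos_given_c = 0.9                  # P(+|C)
    p_pos_given_not_c = 0.2              # P(+|not C)

    # Total probability of a positive test: P(+) = P(+|C)P(C) + P(+|not C)P(not C)
    p_pos = p_pos_given_c * p_c + p_pos_given_not_c * p_not_c   # 0.009 + 0.198 = 0.207

    p_c_given_pos = p_pos_given_c * p_c / p_pos
    print(p_c_given_pos)                 # ~0.043: only about 4.3% despite the positive test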

  4. Bayesian reasoning with multiple hypotheses and evidences • Expand the Bayesian rule to work with multiple hypotheses (H1...Hm) and evidences (E1...En) Assuming conditional independence among evidences E1...En

  5. Bayesian reasoning Example • Expert data: prior probabilities p(Hi) for each hypothesis and conditional probabilities p(Ej|Hi) for each evidence (table not reproduced in the transcript)

  6. Bayesian reasoning Example • The user observes E3, then E1 and E2

  7. Bayesian reasoning Example • The expert system computes the posterior probabilities of the hypotheses after the user observes E2
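The expert-data table for this example is not included in the transcript, so the numbers below are illustrative placeholders. The sketch only shows the mechanics of the expanded rule: multiply each hypothesis's prior by the likelihoods of the observed evidences (assuming conditional independence) and normalise:

    # Posterior over hypotheses H1..Hm given observed evidences, assuming the
    # evidences are conditionally independent given each hypothesis:
    #   p(Hi | E1..En)  is proportional to  p(E1|Hi) * ... * p(En|Hi) * p(Hi)
    def posteriors(priors, likelihoods, observed):
        unnorm = {}
        for h, p_h in priors.items():
            p = p_h
            for e in observed:
                p *= likelihoods[h][e]
            unnorm[h] = p
        total = sum(unnorm.values())
        return {h: p / total for h, p in unnorm.items()}

    # Illustrative numbers only (not the expert data from the slide):
    priors = {"H1": 0.40, "H2": 0.35, "H3": 0.25}
    likelihoods = {
        "H1": {"E1": 0.3, "E2": 0.9, "E3": 0.6},
        "H2": {"E1": 0.8, "E2": 0.0, "E3": 0.7},
        "H3": {"E1": 0.5, "E2": 0.7, "E3": 0.9},
    }
    print(posteriors(priors, likelihoods, ["E3"]))                # after observing E3
    print(posteriors(priors, likelihoods, ["E1", "E2", "E3"]))    # after also observing E1 and E2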

  8. Propagation of CFs • For a single antecedent rule: cf(H, E) = cf(E) × cf(R) • cf(E) is the certainty factor of the evidence. • cf(R) is the certainty factor attached to the rule.

  9. Single antecedent rule example • IF patient has toothache THEN problem is cavity {cf 0.3} • Patient has toothache {cf 0.9} • What is cf(cavity, toothache)?
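Working this through with the single-antecedent formula above: cf(cavity, toothache) = cf(E) × cf(R) = 0.9 × 0.3 = 0.27.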

  10. Propagation of CFs (multiple antecedents) • For conjunctive rules: • IF <evidence E1> AND <evidence E2> ... AND <evidence En> THEN <Hypothesis H> {cf} • For two evidences E1 and E2: • cf(E1 AND E2) = min(cf(E1), cf(E2))

  11. Propagation of CFs (multiple antecedents) • For disjunctive rules: • IF <evidence E1> OR <evidence E2> ... OR <evidence En> THEN <Hypothesis H> {cf} • For two evidences E1 and E2: • cf(E1 OR E2) = max(cf(E1), cf(E2))

  12. Exercise • IF (P1 AND P2) OR P3 THEN C1 (0.7) AND C2 (0.3) • Assume cf(P1) = 0.6, cf(P2) = 0.4, cf(P3) = 0.2 • What are cf(C1) and cf(C2)?
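A short sketch of how the propagation rules combine for this exercise (variable names are mine; the cf values are from the slide):

    # (P1 AND P2) OR P3: AND of antecedents -> min, OR -> max,
    # then multiply by the cf attached to each conclusion.
    cf_p1, cf_p2, cf_p3 = 0.6, 0.4, 0.2

    cf_antecedent = max(min(cf_p1, cf_p2), cf_p3)   # max(0.4, 0.2) = 0.4

    cf_c1 = cf_antecedent * 0.7   # 0.28
    cf_c2 = cf_antecedent * 0.3   # 0.12
    print(cf_c1, cf_c2)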

  13. Defining fuzzy sets with fit-vectors • A fuzzy set A can be written as a fit-vector of membership/value pairs, A = {μA(x1)/x1, ..., μA(xn)/xn} • So, for example: • Tall men = (0/180, 1/190) • Short men = (1/160, 0/170) • Average men = (0/165, 1/175, 0/185)
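A minimal sketch of a fit-vector as a list of (value, membership) anchor points, with the membership of in-between values found by linear interpolation (the usual reading of this shorthand; the function name is mine):

    # Membership of x in a fuzzy set given as a fit-vector of (value, membership) points.
    def membership(fit_vector, x):
        points = sorted(fit_vector)
        if x <= points[0][0]:
            return points[0][1]
        if x >= points[-1][0]:
            return points[-1][1]
        for (x0, m0), (x1, m1) in zip(points, points[1:]):
            if x0 <= x <= x1:
                # linear interpolation between the two surrounding anchors
                return m0 + (m1 - m0) * (x - x0) / (x1 - x0)

    tall_men = [(180, 0.0), (190, 1.0)]
    average_men = [(165, 0.0), (175, 1.0), (185, 0.0)]
    print(membership(tall_men, 185))      # 0.5
    print(membership(average_men, 180))   # 0.5, matching the 0.5/180 used on later slides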

  14. Qualifiers & Hedges • What about linguistic values with qualifiers? • e.g. very tall, extremely short, etc. • Hedges are qualifying terms that modifythe shape of fuzzy sets • e.g. very, somewhat, quite, slightly, extremely, etc.

  15.–17. Representing Hedges (figures only; not reproduced in the transcript)
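The hedge figures are not in the transcript, but the containment example on slide 20 shows the effect of "very": each membership of very tall men is the square of the corresponding membership of tall men. A small sketch, assuming the usual hedge definitions (very = square, extremely = cube, somewhat = square root):

    import math

    # Common hedge operations on a membership value mu (assumed standard definitions)
    def very(mu):       return mu ** 2          # concentration
    def extremely(mu):  return mu ** 3
    def somewhat(mu):   return math.sqrt(mu)    # dilation

    tall_men = {180: 0.0, 182: 0.25, 185: 0.5, 187: 0.75, 190: 1.0}
    very_tall_men = {x: round(very(m), 2) for x, m in tall_men.items()}
    print(very_tall_men)   # {180: 0.0, 182: 0.06, 185: 0.25, 187: 0.56, 190: 1.0}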

  18. Crisp Set Operations

  19. Fuzzy Set Operations • Complement • To what degree do elements not belong to this set? • tall men = {0/180, 0.25/182, 0.5/185, 0.75/187, 1/190}; • not tall men = {1/180, 0.75/182, 0.5/185, 0.25/187, 0/190}; • μ¬A(x) = 1 − μA(x)

  20. Fuzzy Set Operations • Containment • Which sets belong to other sets? • Each element of the contained fuzzy subset has a membership no larger than in the containing set • tall men = {0/180, 0.25/182, 0.5/185, 0.75/187, 1/190}; • very tall men = {0/180, 0.06/182, 0.25/185, 0.56/187, 1/190};

  21. Fuzzy Set Operations • Intersection • To what degree is the element in both sets? • μA∩B(x) = min[μA(x), μB(x)]

  22. μA∩B(x) = min[μA(x), μB(x)] • tall men = {0/165, 0/175, 0/180, 0.25/182, 0.5/185, 1/190}; • average men = {0/165, 1/175, 0.5/180, 0.25/182, 0/185, 0/190}; • tall men ∩ average men = {0/165, 0/175, 0/180, 0.25/182, 0/185, 0/190}; • or, more compactly: • tall men ∩ average men = {0/180, 0.25/182, 0/185};

  23. Fuzzy Set Operations • Union • To what degree is the element in either or both sets? • μA∪B(x) = max[μA(x), μB(x)]

  24. mAB(x) = max[ mA(x), mB(x) ] • tall men = {0/165, 0/175, 0/180, 0.25/182, 0.5/185, 1/190}; • average men = {0/165, 1/175, 0.5/180, 0.25/182, 0/185, 0/190}; • tall men  average men = {0/165, 1/175, 0.5/180, 0.25/182, 0.5/185, 1/190};

  25. Choosing the Best Attribute: Binary Classification • Want a formal measure that returns a maximum value when an attribute makes a perfect split and a minimum when it makes no distinction • Information theory (Shannon and Weaver, 1949) • Entropy: a measure of the uncertainty of a random variable • A coin that always comes up heads --> 0 bits • A flip of a fair coin (heads or tails) --> 1 bit • The roll of a fair four-sided die --> 2 bits • Information gain: the expected reduction in entropy caused by partitioning the examples according to this attribute

  26. Formula for Entropy • H(p1, ..., pn) = −Σi pi log2 pi • Examples: • Suppose we have a collection of 10 examples, 5 positive, 5 negative: H(1/2, 1/2) = −(1/2)log2(1/2) − (1/2)log2(1/2) = 1 bit • Suppose we have a collection of 100 examples, 1 positive and 99 negative: H(1/100, 99/100) = −0.01·log2(0.01) − 0.99·log2(0.99) ≈ 0.08 bits
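A short sketch of the formula applied to the two collections above (the function name is mine):

    import math

    # H(p1, ..., pn) = -sum(pi * log2(pi)), treating 0*log2(0) as 0
    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))      # 1.0 bit    (5 positive, 5 negative)
    print(entropy([0.01, 0.99]))    # ~0.08 bits (1 positive, 99 negative)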

  27. Information gain • Information gain (from an attribute test) = the difference between the original information requirement and the new requirement after the split • Information Gain (IG), the reduction in entropy from testing attribute A: Gain(A) = I(p/(p+n), n/(p+n)) − Σk ((pk+nk)/(p+n)) · I(pk/(pk+nk), nk/(pk+nk)) • Choose the attribute with the largest IG

  28. Information gain • For the training set, p = n = 6, so I(6/12, 6/12) = 1 bit • Consider the attributes Patrons and Type (and the others too): Patrons has the highest IG of all attributes and so is chosen by the DTL algorithm as the root
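A sketch of the gain calculation for Patrons, assuming the standard restaurant-domain split this slide appears to describe (None: 0 positive / 2 negative, Some: 4/0, Full: 2/4); those counts are my assumption, not given in the transcript:

    import math

    def entropy2(p, n):
        # I(p/(p+n), n/(p+n)) for a two-class node, with 0*log2(0) taken as 0
        total = p + n
        return -sum(c / total * math.log2(c / total) for c in (p, n) if c > 0)

    def information_gain(splits, p, n):
        # splits: one (positives, negatives) pair per attribute value
        remainder = sum((pk + nk) / (p + n) * entropy2(pk, nk) for pk, nk in splits)
        return entropy2(p, n) - remainder

    # Assumed Patrons split: None -> (0, 2), Some -> (4, 0), Full -> (2, 4)
    print(information_gain([(0, 2), (4, 0), (2, 4)], p=6, n=6))          # ~0.541 bits
    # Type splits the 12 examples evenly, so its gain is 0
    print(information_gain([(1, 1), (1, 1), (2, 2), (2, 2)], p=6, n=6))  # 0.0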

  29. Example contd. • Decision tree learned from the 12 examples: • Substantially simpler than the “true” tree

  30. Perceptrons • Weighted sum of the inputs: X = x1w1 + x2w2 • Output through the step (hard-limiter) activation: Y = Ystep = 1 if X ≥ Θ, 0 otherwise

  31. Perceptrons • How does a perceptron learn? • A perceptron has initial (often random) weights, typically in the range [-0.5, 0.5] • Apply an established training dataset • Calculate the error as expected output minus actual output: error e = Yexpected − Yactual • Adjust the weights to reduce the error

  32. Perceptrons • How do we adjust a perceptron’s weights to produce Yexpected? • If e is positive, we need to increase Yactual (and vice versa) • Use this formula: wi ← wi + Δwi, where Δwi = α × xi × e • α is the learning rate (between 0 and 1) • e is the calculated error
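For example (hypothetical numbers, only to show the rule): with α = 0.1, input x1 = 1 and error e = −1, the adjustment is Δw1 = 0.1 × 1 × (−1) = −0.1, so a weight of 0.3 would become 0.2.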

  33.–34. Perceptron Example – AND • Train a perceptron to recognize logical AND • Use threshold Θ = 0.2 and learning rate α = 0.1 (training tables not reproduced in the transcript)

  35. Perceptron Example – AND • Use threshold Θ = 0.2 and learning rate α = 0.1 • Repeat until convergence, i.e. the final weights do not change and there is no error
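A runnable sketch of the whole procedure for the AND example, using the slide's threshold Θ = 0.2 and learning rate α = 0.1 (the initial weights are an arbitrary choice in [-0.5, 0.5], as on slide 31):

    import random

    # Perceptron trained to recognize logical AND, with a hard-limiter activation.
    THETA, ALPHA = 0.2, 0.1
    training_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

    w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]  # initial weights

    def output(x, w):
        # Y = 1 if x1*w1 + x2*w2 >= theta, else 0
        return 1 if x[0] * w[0] + x[1] * w[1] >= THETA else 0

    epoch = 0
    while True:
        epoch += 1
        total_error = 0
        for x, y_expected in training_set:
            e = y_expected - output(x, w)                        # error on this example
            total_error += abs(e)
            w = [wi + ALPHA * xi * e for wi, xi in zip(w, x)]    # wi <- wi + alpha * xi * e
        if total_error == 0:   # convergence: a full pass with no error, so no weight change
            break

    print("converged after", epoch, "epochs; weights =", [round(wi, 2) for wi in w])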
