

  1. Bayes and continuous PDFs prof. dr. Lambert Schomaker Kunstmatige Intelligentie / RuG

  2. discrete vs continuous • Bayes theory is usually introduced on the basis of discrete PDFs (alarm? true/false) • … in a set-theoretic framework • but: numbers along a dimension can be considered as points in a set: {x ∈ ℝ}

  3. Bayes revisited • P(C|x) = P(x|C) P(C) / P(x), where: C is a “class” of observations; x is an observed scalar feature; P(C) is the prior probability of finding that class; P(x) is the prior probability (the evidence) of the observable value x; P(x|C) is the probability of finding x in case of C (the likelihood)
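A minimal numeric sketch of the rule, using the alarm example mentioned on slide 2; all of the probabilities below are made up for illustration:

```python
# Discrete Bayes example with hypothetical numbers: P(burglary | alarm).
p_burglary = 0.001                 # prior P(C)
p_alarm_given_burglary = 0.95      # likelihood P(x|C)
p_alarm_given_no_burglary = 0.01   # false-alarm rate

# Evidence P(x) via total probability over both classes.
p_alarm = (p_alarm_given_burglary * p_burglary
           + p_alarm_given_no_burglary * (1.0 - p_burglary))

# Posterior P(C|x) = P(x|C) P(C) / P(x)
p_burglary_given_alarm = p_alarm_given_burglary * p_burglary / p_alarm
print(f"P(burglary | alarm) = {p_burglary_given_alarm:.4f}")  # ~0.087
```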

  4. Bayes & continuous PDFs • P(C|x) = P(x|C) P(C) / P(x) where C is a “class” of observations x is an observed scalar feature • If x is a real number: P(x|C) is the probability density function (PDF) or histogram of feature values observed for class C P(x) is the PDF of x “at all” (all possible classes)

  5. Example: temperature classification • classes: Cold, Normal, Warm, Hot [Figure: class-conditional likelihoods P(x|C), P(x|N), P(x|W), P(x|H) over temperature x, together with the overall likelihood P(x) of the x values]

  6. Bayes: probability “blow up” [Figure: posterior probabilities P(C|x), P(N|x), P(W|x), P(H|x) computed from the class-conditional PDFs P(x|C), P(x|N), P(x|W), P(x|H) for Cold, Normal, Warm, Hot]
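A sketch of this “blow up” for the temperature example, with hypothetical Gaussian class-conditionals (the means, widths, and priors below are invented):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical Gaussian class-conditionals for the four temperature classes.
classes = ["Cold", "Normal", "Warm", "Hot"]
means   = [5.0, 15.0, 25.0, 35.0]   # degrees Celsius (illustrative)
sigmas  = [4.0, 4.0, 4.0, 4.0]
priors  = [0.25, 0.25, 0.25, 0.25]  # equal priors P(C)

def posteriors(x):
    """P(class | x) for all classes; P(x) is the sum over the classes."""
    lik = np.array([norm.pdf(x, m, s) for m, s in zip(means, sigmas)])
    joint = lik * np.array(priors)   # P(x|C) P(C)
    return joint / joint.sum()       # divide by P(x)

for name, p in zip(classes, posteriors(18.0)):
    print(f"P({name} | x=18) = {p:.3f}")
```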

  7. even with an irregular PDF shape in P(x|C) … • the Bayesian output P(C|x) = P(x|C) P(C) / P(x) has a nice plateau

  8. Puzzle • So if Bayes is optimal and can be used for continuous data too, why has it become popular so late, i.e., much later than neural networks?

  9. Why Bayes has become popular so late… [Figure: a 1-dimensional PDF P(x) over feature x] • Note: the example was 1-dimensional • A PDF (histogram) with 100 bins for one dimension will cost 10000 bins for two dimensions, etc. • → Ncells = Nbins^ndims
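The growth can be checked directly, using the 100 bins per dimension from the slide:

```python
# Number of histogram cells as a function of dimensionality:
# N_cells = N_bins ** n_dims
n_bins = 100
for n_dims in range(1, 6):
    print(f"{n_dims} dim(s): {n_bins ** n_dims:,} cells")
```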

  10. Why Bayes has become popular so late… • → Ncells = Nbins^ndims • Yes… but you could use n-dimensional theoretical distributions (Gauss, Weibull, etc.) instead of empirically measured PDFs…
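A sketch of that parametric alternative: fit a 2-dimensional Gaussian (a mean vector plus a covariance matrix) to samples instead of filling a 2-D histogram; the data here are synthetic:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)

# Hypothetical 2-D feature samples for one class.
samples = rng.multivariate_normal([1.0, 2.0], [[1.0, 0.3], [0.3, 0.5]], size=500)

# Only a mean vector and a covariance matrix need to be estimated,
# instead of 100 x 100 histogram cells.
mu    = samples.mean(axis=0)
sigma = np.cov(samples, rowvar=False)
pdf   = multivariate_normal(mean=mu, cov=sigma)

print("P(x|C) density at (1, 2):", pdf.pdf([1.0, 2.0]))
```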

  11. Why Bayes has become popular so late… • … even using theoretical distributions instead of empirically measured PDFs… • the dimensionality is still a problem: • ~20 samples needed to estimate a 1-dim. Gaussian PDF • ~400 samples needed to estimate a 2-dim. Gaussian, etc. • massive amounts of labeled data are needed to estimate the probabilities reliably!
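Following the slide's rule of thumb (about 20 samples per dimension, compounding multiplicatively: 20 for 1-D, 400 for 2-D), the required sample counts grow just as fast as the cell counts:

```python
# Labeled samples needed per the slide's rule of thumb: ~20 ** n_dims.
for n_dims in range(1, 6):
    print(f"{n_dims}-dim Gaussian: ~{20 ** n_dims:,} labeled samples")
```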

  12. Labeled (ground-truthed) data • Example: client evaluation in insurance (seven feature values per client, plus a class label)

  0.10  0.54  0.53  0.874  8.455  0.001  -0.111   risk
  0.20  0.59  0.01  0.974  8.400  0.002  -0.315   risk
  0.11  0.40  0.30  0.432  7.455  0.013  -0.222   safe
  0.20  0.64  0.13  0.774  8.123  0.001  -0.415   risk
  0.10  0.17  0.59  0.813  9.451  0.021  -0.319   risk
  0.80  0.43  0.55  0.874  8.852  0.011  -0.227   safe
  0.10  0.78  0.63  0.870  8.115  0.002  -0.254   risk
  …
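A naive-Bayes sketch over exactly these rows, modelling each feature with an independent 1-D Gaussian per class; the query client is invented, and seven samples are of course far too few for reliable estimates:

```python
import numpy as np

# The labeled rows from the slide: seven features, class "risk" or "safe".
X = np.array([
    [0.10, 0.54, 0.53, 0.874, 8.455, 0.001, -0.111],
    [0.20, 0.59, 0.01, 0.974, 8.400, 0.002, -0.315],
    [0.11, 0.40, 0.30, 0.432, 7.455, 0.013, -0.222],
    [0.20, 0.64, 0.13, 0.774, 8.123, 0.001, -0.415],
    [0.10, 0.17, 0.59, 0.813, 9.451, 0.021, -0.319],
    [0.80, 0.43, 0.55, 0.874, 8.852, 0.011, -0.227],
    [0.10, 0.78, 0.63, 0.870, 8.115, 0.002, -0.254],
])
y = np.array(["risk", "risk", "safe", "risk", "risk", "safe", "risk"])

def log_gaussian(x, mu, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def classify(x):
    """Pick the class with the highest log P(C) + sum_i log P(x_i|C)."""
    scores = {}
    for c in np.unique(y):
        Xc = X[y == c]
        prior = len(Xc) / len(X)
        mu, var = Xc.mean(axis=0), Xc.var(axis=0) + 1e-6  # variance floor
        scores[c] = np.log(prior) + log_gaussian(x, mu, var).sum()
    return max(scores, key=scores.get)

print(classify(np.array([0.15, 0.55, 0.40, 0.85, 8.3, 0.002, -0.30])))
```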

  13. Success of speech recognition • massive amounts of data • increased computing power • cheap computer memory • allowed for the use of Bayes in hidden Markov Models for speech recognition • similarly (but slower): application of Bayes in script recognition
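The Bayesian core of a hidden Markov Model is the forward recursion over hidden states; a toy sketch follows (a hypothetical 2-state, 2-symbol model, nothing like a real speech system):

```python
import numpy as np

# Forward algorithm: alpha_t(j) = P(obs_1..t, state_t = j),
# updated by summing over the previous hidden states.
A = np.array([[0.7, 0.3],    # state transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],    # P(observation | state)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # initial state distribution

obs = [0, 1, 1, 0]           # an observed symbol sequence

alpha = pi * B[:, obs[0]]
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]

print("P(observation sequence) =", alpha.sum())
```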

  14. Global structure: • year • title • date • date and number of entry (Rappt) • redundant lines between paragraphs • jargon words: Notificatie, Besluit fiat • imprint with page number • → XML model

  15. Local probabilistic structure: • P(“Novb 16 is a date” | “sticks out to the left” & “is left of ‘Rappt’”) ?
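Such a probability could be estimated from ground-truthed line annotations by simple counting; a sketch with invented field names and counts:

```python
# Hypothetical annotated lines; the fields and values are illustrative only.
lines = [
    {"sticks_out_left": True,  "left_of_rappt": True,  "is_date": True},
    {"sticks_out_left": True,  "left_of_rappt": True,  "is_date": True},
    {"sticks_out_left": True,  "left_of_rappt": True,  "is_date": False},
    {"sticks_out_left": True,  "left_of_rappt": False, "is_date": False},
    {"sticks_out_left": False, "left_of_rappt": False, "is_date": False},
]

# Relative frequency of "is a date" among lines matching both conditions.
evidence = [l for l in lines if l["sticks_out_left"] and l["left_of_rappt"]]
p = sum(l["is_date"] for l in evidence) / len(evidence)
print(f'P(is a date | sticks out left & left of "Rappt") ~ {p:.2f}')
```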
