Using Probabilistic Information
Read J & M Chapter 5, pages 141–156
Why Do We Need Probabilities? (or Why Aren't We Sure?)
• Noisy channel: "I haf to go." "geneology"
• Sentences are flat. Knowledge is structured. "Joe hit the ball with the bat."
• It would be too inefficient to have to say everything. "He bought it."
• Our programs still don't know as much as people do.
Conditional Probability
Definition: P(A | B) = P(A ∩ B) / P(B)
Intuition: [Venn diagram: two overlapping regions A and B; P(A | B) is the fraction of B's region covered by the overlap A ∩ B]
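A quick worked instance of the definition (added here for concreteness, not part of the original slide): for one roll of a fair die, let B = "the roll is even" and A = "the roll is 2". Then P(A ∩ B) = 1/6 and P(B) = 1/2, so P(A | B) = (1/6) / (1/2) = 1/3.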
Using Conditional Probability for Recognition
Our task: find the object (word, structure, or whatever) that is most likely given our observation.
ŵ = argmax_{w ∈ V} P(w | O) = argmax_{w ∈ V} P(w ∩ O) / P(O)
Example:
P(word = "have" | sound = "haf") = P(word = "have" ∩ sound = "haf") / P(sound = "haf")
But what do we actually know?
• P(sound = "haf" | word = "have")
• P(word = "have")
Bayes Theorem
P(A | B) = P(A ∩ B) / P(B)
         = [P(A ∩ B) / P(A)] × [P(A) / P(B)]
         = P(B | A) × P(A) / P(B)
Using Bayes Theorem
P(A | B) = P(B | A) P(A) / P(B)
Example:
P(word = "have" | sound = "haf") = P(sound = "haf" | word = "have") × P(word = "have") / P(sound = "haf")
But if we are comparing candidate interpretations for "haf", we can ignore the denominator, since it is the same for every candidate.
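A minimal sketch of this ranking idea in Python (the probability tables are invented for illustration; only relative scores matter, so the shared denominator P(O) is skipped):

```python
# Score each candidate word w by P(O | w) * P(w); the denominator P(O)
# is identical for all candidates, so it can be ignored for the argmax.

likelihood = {          # P(sound = "haf" | word = w), hypothetical values
    "have": 0.30,
    "half": 0.20,
    "calf": 0.001,
}
prior = {               # P(word = w), hypothetical unigram probabilities
    "have": 0.02,
    "half": 0.005,
    "calf": 0.0001,
}

best = max(prior, key=lambda w: likelihood[w] * prior[w])
print(best)  # -> have
```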
Spelling Correction: Choices
Common assumption: just one mistake (covers about 80% of nonword errors).
Four kinds of mistakes: insertion, deletion, transposition, substitution.
Example: the typo "acress" could come from actress (deletion of "t"), cress (insertion of "a"), caress (transposition of "ca"), access (substitution of "r" for "c"), across (substitution of "e" for "o"), or acres (insertion of "s").
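The four edit types can be enumerated mechanically. Here is a short sketch (in the style of Peter Norvig's well-known spelling corrector, not code from these slides) that generates every string one edit away from a typo and keeps the ones found in a small dictionary:

```python
import string

def edits1(word):
    """All strings one deletion, transposition, substitution, or insertion away."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    substitutes = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + substitutes + inserts)

# A toy dictionary; a real system would use a full lexicon.
dictionary = {"actress", "cress", "caress", "access", "across", "acres"}
print(sorted(edits1("acress") & dictionary))
```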
Spelling Correction: Priors
ĉ = argmax_{c ∈ C} P(c | t) = argmax_{c ∈ C} P(t | c) P(c)
Example: observed (typed) word: acress
Note: the P(c) estimates are smoothed by adding 0.5 to each count.
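A sketch of that smoothed prior, assuming word frequencies from a corpus of N tokens and V word types (all numbers here are illustrative, not from the slides):

```python
# Add-0.5 smoothed unigram prior: P(c) = (count(c) + 0.5) / (N + 0.5 * V).
# Smoothing keeps zero-count words such as "cress" from getting P(c) = 0.
counts = {"actress": 1343, "cress": 0, "caress": 4,
          "access": 2280, "across": 8436, "acres": 2879}
N = 44_000_000   # corpus size in tokens (illustrative)
V = 400_000      # vocabulary size in types (illustrative)

def prior(c):
    return (counts.get(c, 0) + 0.5) / (N + 0.5 * V)

for c in counts:
    print(f"P({c}) = {prior(c):.3e}")
```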
Spelling Correction: Conditional Probs
ĉ = argmax_{c ∈ C} P(c | t) = argmax_{c ∈ C} P(t | c) P(c)
Example: What is P(deleting "t" following "c"), as in actress → acress?
Answer: We need to collect data from a training set and encode it in some useful way. Confusion matrices contain counts: del[x,y], ins[x,y], sub[x,y], trans[x,y].
From these counts, we can compute probabilities. For a deletion, where t is c with its i-th character deleted:
P(t | c) = del[c_{i−1}, c_i] / count[c_{i−1} c_i]
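A sketch of the deletion case, assuming the confusion-matrix counts have already been collected (the two tables below are invented placeholders):

```python
# P(t | c) for a deletion: candidate c becomes typo t when character c[i]
# is dropped after c[i-1].
del_counts = {("c", "t"): 117}       # times "t" was deleted after "c" (invented)
bigram_counts = {"ct": 2_500_000}    # corpus count of the bigram "ct" (invented)

def p_deletion(c, i):
    """P(t | c) when t is c with its i-th character deleted (i >= 1)."""
    x, y = c[i - 1], c[i]
    return del_counts.get((x, y), 0) / bigram_counts.get(x + y, 1)

# "acress" is "actress" with the "t" (index 2) deleted:
print(p_deletion("actress", 2))
```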
Spelling Correction: All Together
Typed word = acress
Intended word = ?
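Putting the pieces together (a sketch only; the channel and prior values below are toy numbers standing in for the confusion-matrix and smoothed-prior estimates described above):

```python
# Rank candidate corrections c for the typo "acress" by P(t | c) * P(c).
channel = {"actress": 1.2e-4, "cress": 1.4e-6, "caress": 1.6e-6,
           "access": 2.1e-7, "across": 9.3e-6, "acres": 3.2e-5}   # toy P(t | c)
prior   = {"actress": 3.1e-5, "cress": 1.1e-8, "caress": 1.0e-7,
           "access": 5.2e-5, "across": 1.9e-4, "acres": 6.6e-5}   # toy P(c)

for c in sorted(channel, key=lambda c: channel[c] * prior[c], reverse=True):
    print(f"{c:8s}  score = {channel[c] * prior[c]:.2e}")
```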
Spelling Correction – the Britney Example
P(word = "britney" | O = "britne") = P(O = "britne" | word = "britney") × P(word = "britney") / P(O = "britne")
"The data below shows some of the misspellings detected by our spelling correction system for the query [ britney spears ], and the count of how many different users spelled her name that way. Each of these variations was entered by at least two different unique users within a three month period, and was corrected to [britney spears] by our spelling correction system (data for the correctly spelled query is shown for comparison)."
From http://www.google.com/jobs/britney.html
488941 britney spears
40134 brittany spears
36315 brittney spears
24342 britany spears
7331 britny spears
6633 briteny spears
2696 britteny spears
1807 briney spears
1635 brittny spears
1479 brintey spears
1479 britanny spears
1338 britiny spears
1211 britnet spears
1096 britiney spears
991 britaney spears
991 britnay spears
811 brithney spears
811 brtiney spears
664 birtney spears
664 brintney spears
664 briteney spears
601 bitney spears
601 brinty spears
544 brittaney spears
544 brittnay spears
364 britey spears
364 brittiny spears
329 brtney spears
269 bretney spears
269 britneys spears
244 britne spears
244 brytney spears
220 breatney spears
220 britiany spears
199 britnney spears
163 britnry spears
147 breatny spears
147 brittiney spears
147 britty spears
147 brotney spears
147 brutney spears
133 britteney spears
133 briyney spears
121 bittany spears
121 bridney spears
121 britainy spears
121 britmey spears
109 brietney spears
109 brithny spears
109 britni spears
109 brittant spears
98 bittney spears
98 brithey spears
98 brittiany spears
98 btitney spears
89 brietny spears
89 brinety spears
89 brintny spears
89 britnie spears
89 brittey spears
89 brittnet spears
89 brity spears
89 ritney spears
80 bretny spears
80 britnany spears
73 brinteny spears
73 brittainy spears
73 pritney spears
66 brintany spears
66 britnery spears
59 briitney spears
59 britinay spears
54 britneay spears
54 britner spears
54 britney's spears
54 britnye spears
54 britt spears
54 brttany spears
48 bitany spears
48 briny spears
48 brirney spears
48 britant spears
48 britnety spears
48 brittanny spears
48 brttney spears
44 birttany spears
44 brittani spears
44 brityney spears
44 brtitney spears
39 brienty spears
39 brritney spears
36 bbritney spears
36 briitany spears
36 britanney spears
36 briterny spears
36 britneey spears
36 britnei spears
36 britniy spears
32 britbey spears
32 britneu spears
2 brtittny spears
2 brttiny spears
2 brtttany spears
2 brydney spears
2 brynty spears
2 brythey spears
2 bryttney spears
2 btiany spears
2 btirtney spears
2 btitiney spears
2 btittny spears
2 btritany spears
2 buttney spears
2 grittney spears
2 prietny spears
2 pritany spears
2 prittany spears
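Query logs like these are exactly the kind of training data the channel model needs. A small sketch (using only the first few rows of the table above) estimating P(observed spelling | intended = "britney spears") by relative frequency:

```python
# Relative-frequency estimate of the channel model P(O | "britney spears"),
# using just the top rows of the query-log table above.
log_counts = {
    "britney spears": 488941,
    "brittany spears": 40134,
    "brittney spears": 36315,
    "britany spears": 24342,
    "britny spears": 7331,
}
total = sum(log_counts.values())
for spelling, n in log_counts.items():
    print(f"P({spelling!r}) = {n / total:.4f}")
```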
Other Examples of Bayes Theorem - Glasses
We observe Joe wearing glasses. We want to decide whether it is more likely that Joe is a salesman or a librarian. Here are the facts (L means librarian, S means salesman, G means glasses):
• P(G) = .1
• P(L) = .0001
• P(S) = .01
• P(G | L) = 1
• P(G | S) = .05
P(L | G) = P(G | L) × P(L) / P(G) = 1 × .0001 / .1 = .001
P(S | G) = P(G | S) × P(S) / P(G) = .05 × .01 / .1 = .005
So Joe is five times more likely to be a salesman than a librarian: the much larger prior P(S) outweighs the fact that every librarian wears glasses.
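The same comparison in code (a sketch; note that P(G) divides both posteriors, so it could be dropped without changing which hypothesis wins):

```python
# Compare the two hypotheses given the observation "Joe wears glasses".
prior = {"librarian": 0.0001, "salesman": 0.01}
likelihood = {"librarian": 1.0, "salesman": 0.05}   # P(glasses | job)
p_glasses = 0.1

for job in prior:
    posterior = likelihood[job] * prior[job] / p_glasses
    print(f"P({job} | glasses) = {posterior}")
```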
Other Examples of Bayes Theorem - Drugs
We want to compute the probability that Joe uses heroin given that he tests positive for it. Here are the facts (H means heroin use, E means a positive test for heroin):
• Sensitivity = P(E | H) = 0.95
• Specificity = 1 − P(E | ~H) = 0.90, so P(E | ~H) = 0.10
• Baseline "prior" probability = P(H) = 0.03
P(H | E) = P(E | H) × P(H) / P(E)
         = 0.95 × 0.03 / (0.03 × 0.95 + 0.97 × 0.10)
         = 0.0285 / 0.1255
         ≈ 0.227
Despite the fairly accurate test, the low prior means a positive result still leaves less than a one-in-four chance that Joe uses heroin.
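The same arithmetic as a sketch, with the denominator expanded by the law of total probability, P(E) = P(E | H) P(H) + P(E | ~H) P(~H):

```python
# Posterior probability of heroin use given a positive test.
sensitivity = 0.95       # P(E | H)
false_positive = 0.10    # P(E | ~H) = 1 - specificity
prior = 0.03             # P(H)

p_e = sensitivity * prior + false_positive * (1 - prior)   # total probability
posterior = sensitivity * prior / p_e
print(round(posterior, 3))   # -> 0.227
```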