1 / 37

Probability theory

Probability theory. LING 570 Fei Xia Week 2: 10/01/07. Misc. Patas account and dropbox Course website, “Collect it”, and GoPost. Mailing list Received message on Thursday? Questions about hw1?. Outline. Quiz #1 Unix commands Linguistics Elementary Probability theory: M&S 2.1.

tryna
Download Presentation

Probability theory

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Probability theory LING 570 Fei Xia Week 2: 10/01/07

  2. Misc. • Patas account and dropbox • Course website, “Collect it”, and GoPost. • Mailing list • Received message on Thursday? • Questions about hw1?

  3. Outline • Quiz #1 • Unix commands • Linguistics • Elementary Probability theory: M&S 2.1

  4. Quiz #1 Five areas: weight ave • Programming: 4.0 (3.74) • Try Perl or Python • Unix commands: 1.2 (0.99) • Probability: 2.0 (1.09) • Regular expression: 2.0 (1.62) • Linguistics knowledge: 0.8 (0.71)

  5. Results • 9.0-10: 4 • 8.0-8.9: 8 • < 8.0: 8

  6. Unix commands • ls (list), cp (copy), rm (remove) • more, less, cat • cd, mkdir, rmdir, pwd • chmod: to change file permission • tar, gzip: to tar/zip files • ssh, sftp: to log on or ftp files • man: to learn a command

  7. Unix commands (cont) • compilers: javac, gcc, g++, perl, … • ps, top, • which • Pipe: cat input_file | eng_tokenizer.sh | make_voc.sh > output_file • sort, unique, awk, grep grep “the” voc | awk ‘{print $2}’ | sort | uniq –c | sort -nr

  8. Examples • Set the permission of foo.pl so it is readable and executable by the user and the group. rwx rwx rwx => 101 101 000 chmod 550 foo.pl • Move a file, foo.pl, from your home dir to /tmp mv ~/foo.pl /tmp

  9. Linguistics: POS tags • Open class: Noun, verb, adjective, adverb • Auxiliary verb/modal: can, will, might, .. • Temporal noun: tomorrow • Adverb: adj+ly, always, still, not, … • Closed class: Preposition, conjunction, determiner, pron, • Conjunction: CC (and), SC (if, although) • Complementizer: that,

  10. Linguistics: syntactic structure • Two kinds: • Phrase structure (a.k.a. parse tree): • Dependency structure • Examples: • John said that he would call Mary tomorrow

  11. Outline • Quiz #1 • Unix commands • Linguistics • Elementary Probability theory

  12. Probability Theory

  13. Basic concepts • Sample space, event, event space • Random variable and random vector • Conditional probability, joint probability, marginal probability (prior)

  14. Sample space, event, event space • Sample space (Ω): the set of all possible outcomes. • Ex: toss a coin three times: {HHH, HHT, HTH, HTT, …} • Event: an event is a subset of Ω. • Ex: an event is {HHT, HTH, THH} • Event space (2Ω): the set of all possible events.

  15. Probability function • A probability function (a.k.a. a probability distribution) distributes a probability mass of 1 throughout the sample space . • It is a function from 2! [0,1] such that: P() = 1 For any disjoint sets Aj2 2, P( Aj) =  P(Aj) - Ex: P({HHT, HTH, HTT}) = P({HHT}) + P({HTH}) + P({HTT})

  16. The coin example • The prob of getting a head is 0.1 for one toss. What is the prob of getting two heads out of three tosses? • P(“Getting two heads”) = P({HHT, HTH, THH}) = P(HHT) + P(HTH) + P(THH) = 0.1*0.1*0.9 + 0.1*0.9*0.1+0.9*0.1*0.1 = 3*0.1*0.1*0.9

  17. Random variable • The outcome of an experiment need not be a number. • We often want to represent outcomes as numbers. • A random variable X is a function: ΩR. • Ex: the number of heads with three tosses: X(HHT)=2, X(HTH)=2, X(HTT)=1, …

  18. The coin example (cont) • X = the number of heads with three tosses • P(X=2) = P({HHT, HTH, THH}) = P({HHT}) + P({HTH}) + P({THH})

  19. Two types of random variables • Discrete: X takes on only a countable number of possible values. • Ex: Toss a coin three times. X is the number of heads that are noted. • Continuous: X takes on an uncountable number of possible values. • Ex: X is the speed of a car

  20. Common trick #1: Maximum likelihood estimation • An example: toss a coin 3 times, and got two heads. What is the probability of getting a head with one toss? • Maximum likelihood: (ML) * = arg max P(data | ) • In the example, • P(X=2) = 3 * p * p * (1-p) e.g., the prob is 3/8 when p=1/2, and is 12/27 when p=2/3 3/8 < 12/27

  21. Random vector • Random vector is a finite-dimensional vector of random variables: X=[X1,…,Xk]. • P(x) = P(x1,x2,…,xn)=P(X1=x1,…., Xn=xn) • Ex: P(w1, …, wn, t1, …, tn)

  22. Notation • X, Y, Xi, Yi are random variables. • x, y, xi are values. • P(X=x) is written as P(x) • P(X=x | Y=y) is written as P(x | y).

  23. Three types of probability • Joint prob: P(x,y)= prob of X=x and Y=y happening together • Conditional prob: P(x | y) = prob of X=x given a specific value of Y=y • Marginal prob: P(x) = prob of X=x for all possible values of Y.

  24. An example • There are two coins. Choose a coin and then toss it. Do that 10 times. • Coin 1 is chosen 4 times: one head and three tails. • Coin 2 is chosen six times: four heads and two tails. • Let’s calculate the probabilities.

  25. Probabilities • P(C=1) = 4/10, P(C=2) = 6/10 • P(X=h) = 5/10, P(X=t) = 5/10 • P(X=h | C=1) = ¼, P(X=h |C=2) =4/6 • P(X=t | C=1) = ¾, P(X=t |C=2) = 2/6 • P(X=h, C=1) =1/10, P(X=h, C=2)= 4/10 • P(X=t, C=1) = 3/10, P(X=t | C=2) = 2/10

  26. Relation between different types of probabilities P(X=h, C=1) = P(C=1) * P(X=h | C=1) = 4/10 * ¼ = 1/10 P(X=h) = P(X=h, C=1) + P(X=h, C=2) = 1/10 + 4/10 = 5/10

  27. Common trick #2:Chain rule

  28. Common trick #3:joint probMarginal prob

  29. Common trick #4:Bayes’ rule

  30. Independent random variables • Two random variables X and Y are independent iff the value of X has no influence on the value of Y and vice versa. • P(X,Y) = P(X) P(Y) • P(Y|X) = P(Y) • P(X|Y) = P(X) • Our previous examples: P(X, C) != P(X) P(C)

  31. Conditional independence Once we know C, the value of A does not affect the value of B and vice versa. • P(A,B | C) = P(A|C) P(B|C) • P(A|B,C) = P(A | C) • P(B|A, C) = P(B |C)

  32. Independence and conditional independence • If A and B are independent, are they conditional independent? • Example: • Burglar, Earthquake • Alarm

  33. Common trick #5:Independence assumption

  34. An example • P(w1 w2 … wn) = P(w1) P(w2 | w1) P(w3 | w1 w2) * … * P(wn | w1 …, wn-1) ¼ P(w1) P(w2 | w1) …. P(wn | wn-1) • Why do we make independence assumption which we know are not true?

  35. Summary of elementaryprobability theory • Basic concepts: sample space, event space, random variable, random vector • Joint / conditional /marginal probability • Independence and conditional independence • Five common tricks: • Max likelihood estimation • Chain rule • Calculating marginal probability from joint probability • Bayes’ rule • Independence assumption

  36. Outline • Quiz #1 • Unix commands • Linguistics • Elementary Probability theory

  37. Next time • J&M Chapt 2 • Formal language and formal grammar • Regular expression • Hw1 is due at 3pm on Wed.

More Related