
Probability

Probability. Principles of probability calculations. Probability values range from 0 to 1. The probabilities of all outcomes in the sample space add up to 1. The probability that an event A will not occur is 1 minus the probability of A.

zion

Presentation Transcript


  1. Probability

  2. Principles of probability calculations • Probability values range from 0 to 1. • The probabilities of all outcomes in the sample space add up to 1. • The probability that an event A will not occur is 1 minus the probability of A. • If two events are mutually exclusive, the probability that one or the other event occurs is the sum of their individual probabilities.
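The principles above can be checked on a small example. This is a minimal sketch that assumes a fair six-sided die (the die and the variable names are illustrative, not part of the slides); the addition rule is applied to two mutually exclusive outcomes, rolling a 5 or a 6.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each outcome has probability 1/6.
sample_space = [1, 2, 3, 4, 5, 6]
p = {outcome: Fraction(1, 6) for outcome in sample_space}

# Every probability lies between 0 and 1, and together they add up to 1.
all_in_range = all(0 <= value <= 1 for value in p.values())
total = sum(p.values())

# Complement rule: P(not 6) = 1 - P(6).
p_not_six = 1 - p[6]

# Addition rule for mutually exclusive events: P(5 or 6) = P(5) + P(6).
p_five_or_six = p[5] + p[6]

print(all_in_range, total, p_not_six, p_five_or_six)
```

Exact fractions are used so that the sum is exactly 1 rather than a rounded floating-point value.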

  3. Simple probability Sample space: 1, 2, 3, 4, 5, 6 P(A) = 1/6 ≈ 0.1667

  4. Joint probability For independent events: P(A,B) = P(A) × P(B) P(5,6) = 0.1667 × 0.1667 ≈ 0.0278
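The joint probability of a 5 followed by a 6 can be verified by counting. The sketch below assumes two independent fair dice (an illustrative setup) and compares the counted probability with the product of the two individual probabilities.

```python
from fractions import Fraction
from itertools import product

# Assumed setup: two independent fair dice, giving 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Joint probability by counting: first die shows 5 and second die shows 6.
favourable = [o for o in outcomes if o == (5, 6)]
p_joint_counted = Fraction(len(favourable), len(outcomes))

# Joint probability by the product rule for independent events.
p_joint_product = Fraction(1, 6) * Fraction(1, 6)

print(p_joint_counted, p_joint_product, round(float(p_joint_product), 4))
```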

  5. Joint probability (1) keep the dogs on the beach → V NP PP (the PP attaches to the verb) → V [NP PP] (the PP attaches to the noun phrase)

  6. Conditional probability • Rule: VP → V NP XP [.15] • Verb preference: keep: V NP XP [.81] • Parse of "keep the dogs on the beach": [VP V NP PP] • .15 × .81 = .12

  7. Conditional probability • Rules: VP → V NP XP [.15], NP → NP PP [.14] • Verb preference: keep: V NP [.19] • Parse of "keep the dogs on the beach": [VP V [NP NP PP]] • .19 × .39 × .14 = .01
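The two parse scores on the preceding slides are plain products of the listed probabilities. A short sketch of the arithmetic follows; the list names are hypothetical, and the .39 factor is taken from the slide's product as is, since the slide does not label which rule it belongs to.

```python
from math import prod

# Probabilities copied from the two slides above.
verb_attachment = [0.15, 0.81]        # VP -> V NP XP, keep: V NP XP
noun_attachment = [0.19, 0.39, 0.14]  # keep: V NP, unlabeled .39 factor, NP -> NP PP

# The score of each parse is the product of its probabilities.
p_verb_attachment = prod(verb_attachment)
p_noun_attachment = prod(noun_attachment)

print(round(p_verb_attachment, 2), round(p_noun_attachment, 2))
```

With these numbers the verb-attachment parse scores higher than the noun-attachment parse.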

  8. Conditional probability

  9. Conditional probability In a corpus including 12,000 nouns and 3,500 adjectives, 2,000 adjectives precede a noun. What is the likelihood that a noun occurs after an adjective? $P(\mathrm{ADJ} \mid \mathrm{N}) = \frac{2000}{12000} \approx 0.1667$

  10. Conditional probability What is the likelihood that an adjective precedes a noun? $P(\mathrm{N} \mid \mathrm{ADJ}) = \frac{2000}{3500} \approx 0.5714$
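Both conditional probabilities come from the same count of adjective-noun sequences, divided by a different total each time. A minimal sketch using the corpus counts from the slides (variable names are illustrative):

```python
# Corpus counts from the slides.
nouns = 12_000
adjectives = 3_500
adj_before_noun = 2_000  # adjectives that directly precede a noun

# P(ADJ | N): share of nouns that are preceded by an adjective.
p_adj_given_n = adj_before_noun / nouns

# P(N | ADJ): share of adjectives that are followed by a noun.
p_n_given_adj = adj_before_noun / adjectives

print(round(p_adj_given_n, 4), round(p_n_given_adj, 4))
```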

  11. Probability distribution

  12. Types of probability distributions • Discrete probability distribution • Continuous probability distribution

  13. Binomial distribution

  14. Binomial distribution Bernoulli trial: • two possible outcomes on each trial • the outcomes are independent of each other • the probability of each outcome is constant across trials

  15. Binomial distribution H T HH HT TH TT

  16. Binomial distribution 0 heads = TT 1 head = HT + TH 2 heads = HH

  17. Binomial distribution Sample space: HH HT TH TT Random variable (number of heads): 0 1 2

  18. Binomial distribution

  19. H T HH HT TH TT HHH HHT HTH HTT THH THT TTH TTT

  20. Sample space: HHH TTT HHT TTH HTH THT THH HTT Random variable (number of heads): • 0 heads: 1/8 = 0.125 • 1 head: 3/8 = 0.375 • 2 heads: 3/8 = 0.375 • 3 heads: 1/8 = 0.125
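The distribution for three coin flips can be reproduced by enumerating the sample space and counting heads. The sketch below assumes a fair coin (p = 0.5) and, as a cross-check, compares the counts with the standard binomial formula C(n, k) · p^k · (1 − p)^(n − k), which the slides do not state explicitly.

```python
from collections import Counter
from itertools import product
from math import comb

n_flips = 3

# Sample space: every sequence of H and T of length n_flips (8 outcomes).
sample_space = ["".join(flips) for flips in product("HT", repeat=n_flips)]

# Random variable: number of heads in each outcome.
heads_counts = Counter(outcome.count("H") for outcome in sample_space)
distribution = {k: heads_counts[k] / len(sample_space) for k in range(n_flips + 1)}

# Cross-check with the binomial formula for a fair coin (assumed p = 0.5).
p = 0.5
formula = {k: comb(n_flips, k) * p**k * (1 - p) ** (n_flips - k) for k in range(n_flips + 1)}

print(distribution)
```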

  21. Binomial distribution

  22. Poisson distribution

  23. Normal distribution

  24. Normal distribution • The center of the curve represents the mean, median, and mode. • The curve is symmetrical around the mean. • The tails approach the x-axis but never touch it. • The curve is bell-shaped. • The total area under the curve is equal to 1 (by definition).

  25. Normal distribution

  26. Standard normal distribution 1.96
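The value 1.96 on this slide presumably marks the boundary of the central 95% of the standard normal curve; that reading is an assumption, since the figure itself is not part of the transcript. A minimal check with Python's standard library:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between -1.96 and +1.96.
area_within = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# Symmetry around the mean: the density is the same at -1 and +1.
symmetric = abs(standard_normal.pdf(-1.0) - standard_normal.pdf(1.0)) < 1e-12

print(round(area_within, 3), symmetric)
```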

  27. z-scores $z = \frac{x_1 - \bar{x}}{SD}$

  28. z-scores Two candidates took two different language tests. Candidate A scored 121 points, candidate B scored 177 points. In the first test (taken by candidate A), the mean was 92 and the standard deviation was 14; in the second test (taken by candidate B), the mean was 143 and the standard deviation was 21. Which of the two candidates performed better (compared to all other candidates)? $Z_A = \frac{121 - 92}{14} = 2.07$ $Z_B = \frac{177 - 143}{21} = 1.62$ Candidate A has the higher z-score and therefore performed better relative to the other candidates.
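The comparison of the two candidates can be sketched in a few lines. The function name `z_score` is illustrative; the scores, means, and standard deviations are those given on the slide.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, measured in standard deviations."""
    return (score - mean) / sd

# Candidate A: test with mean 92 and standard deviation 14.
z_a = z_score(121, 92, 14)

# Candidate B: test with mean 143 and standard deviation 21.
z_b = z_score(177, 143, 21)

print(round(z_a, 2), round(z_b, 2))
```

Because z-scores express each result in units of its own test's standard deviation, they can be compared even though the raw scores come from different scales.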

  29. Central limit theorem

  30. Central limit theorem 6, 2, 5, 6, 2, 3, 1, 6, 1, 1, 4, 6, 6, 2, 2, 1, 1, 5, 1, 3 = 2.64

  31. Central limit theorem

  32. Central limit theorem

  33. Central limit theorem

  34. Central limit theorem

  35. Central limit theorem

  36. Mean of sample means $\frac{4.75 + 3.0 + 3.0 + 2.75 + 2.5}{5} = 3.2$
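The five sample means on this slide can be reproduced from the 20 values listed under slide 30 if those values are split into consecutive samples of four. The sample size of four is an inference from the numbers, not something the slides state.

```python
# The 20 values listed on slide 30.
rolls = [6, 2, 5, 6, 2, 3, 1, 6, 1, 1, 4, 6, 6, 2, 2, 1, 1, 5, 1, 3]

# Assumption: consecutive samples of four, which reproduces the slide's means.
sample_size = 4
samples = [rolls[i:i + sample_size] for i in range(0, len(rolls), sample_size)]

# Mean of each sample, then the mean of those sample means.
sample_means = [sum(sample) / len(sample) for sample in samples]
mean_of_means = sum(sample_means) / len(sample_means)

print(sample_means, mean_of_means)
```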

  37. Central limit theorem The sample means are approximately normally distributed (even if the phenomenon in the parent population is not normally distributed).

  38. Central limit theorem • The mean of the individual sample means approaches the mean of the true population. • The sample means are normally distributed, even if the phenomenon under study is not normally distributed in the true population. • All parametric tests make use of the fact that the sample means (from a certain number of samples onward) are normally distributed.
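A small simulation can illustrate these points. The sketch below is only an illustration under stated assumptions: the parent population is an exponential distribution with mean 1 (chosen because it is heavily skewed), each sample has 30 observations, and 2,000 samples are drawn with a fixed seed. None of these choices come from the slides.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def draw_sample(size):
    # Assumed parent population: exponential with mean 1 (heavily skewed).
    return [rng.expovariate(1.0) for _ in range(size)]

sample_size = 30   # observations per sample (assumed)
n_samples = 2000   # number of samples (assumed)
sample_means = [mean(draw_sample(sample_size)) for _ in range(n_samples)]

# The mean of the sample means should lie close to the population mean of 1.
mean_of_means = mean(sample_means)

# Empirical standard deviation of the sample means.
sd_of_means = stdev(sample_means)

# Share of sample means within 1.96 SDs of their mean; close to 0.95
# if the sample means are roughly normally distributed.
within = sum(abs(m - mean_of_means) <= 1.96 * sd_of_means for m in sample_means) / n_samples

print(round(mean_of_means, 2), round(sd_of_means, 2), round(within, 2))
```

Changing `sample_size` or the parent distribution shows how quickly the sample means settle into a bell shape for differently skewed populations.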

  39. population

  40. population sample

  41. population sample mean of this sample

  42. population sample mean of this sample distribution of many sample means

  43. Are your data normally distributed? How many samples do you need to assume that the mean of the sample means is normally distributed?

  44. Are your data normally distributed? • The distribution in the parent population (normal, slightly skewed, heavily skewed). • The number of observations in the individual sample. • The total number of individual samples.

  45. Confidence intervals

  46. Confidence intervals Confidence intervals indicate a range within which the mean (or other parameters) of the true population is located given the values of your sample and assuming a particular degree of certainty.

  47. Confidence intervals • The mean of the sample means • The SD of the sample means, i.e. the standard error • The degree of certainty with which you want to state the estimate
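The three ingredients above can be combined into an interval. The following is a rough sketch under several assumptions that the slides do not spell out: the standard error is estimated from a single sample as SD / √N, the degree of certainty is 95%, and the normal value 1.96 is used as the multiplier (for a sample this small, a t-based multiplier would give a somewhat wider interval). The data are the 20 values from slide 30.

```python
from math import sqrt
from statistics import mean, stdev

# The 20 values listed on slide 30, treated here as one sample.
data = [6, 2, 5, 6, 2, 3, 1, 6, 1, 1, 4, 6, 6, 2, 2, 1, 1, 5, 1, 3]

sample_mean = mean(data)

# Assumption: standard error estimated as sample SD divided by sqrt(N).
standard_error = stdev(data) / sqrt(len(data))

# Assumption: 95% certainty with the normal multiplier 1.96.
z = 1.96
lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(round(lower, 2), round(upper, 2))
```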

  48. Standard deviation $SD = \sqrt{\frac{\sum (x_n - \bar{x})^2}{N - 1}}$
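The standard deviation with N − 1 in the denominator can be written out step by step. This sketch assumes the usual sample formula (square root of the summed squared deviations divided by N − 1) and applies it to the five sample means from slide 36; the function name `sample_sd` is illustrative. The result is compared with `statistics.stdev` from the standard library.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation with N - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return sqrt(squared_deviations / (n - 1))

# The five sample means from slide 36.
sample_means = [4.75, 3.0, 3.0, 2.75, 2.5]

sd = sample_sd(sample_means)

print(round(sd, 3), round(stdev(sample_means), 3))
```

Applied to sample means, as here, this standard deviation is what the slides call the standard error.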

  49. Standard error

  50. Standard error
