
The Geometric Distribution

The Geometric Distribution. Probability of the 1st success on the Nth trial, given a probability, p, of success. $P(\text{roll 1st 6 on the } i\text{th roll}) = (1 - 1/6)^{i-1}\,(1/6)$; $P(\text{1st heads on the } i\text{th flip}) = (1 - 1/2)^{i-1}\,(1/2)$.


Presentation Transcript


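  1. The Geometric Distribution
     • Probability of the 1st success on the Nth trial, given a probability, p, of success: $P(N = j) = (1-p)^{j-1}\,p$
     • $P(\text{roll 1st 6 on the } i\text{th roll}) = (1 - 1/6)^{i-1}\,(1/6)$
     • $P(\text{1st heads on the } i\text{th flip}) = (1 - 1/2)^{i-1}\,(1/2)$
     • To show P(N=j) is a proper pdf: $\sum_{j=1}^{\infty} (1-p)^{j-1}\,p = p \cdot \frac{1}{1-(1-p)} = 1$
     Comp 790 – Distributions & Coalescence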
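A minimal sketch of the geometric probabilities on this slide, assuming the standard form P(N = j) = (1-p)^(j-1) p. The function name `geometric_pmf` is made up for illustration. It evaluates the dice example and checks numerically that the probabilities add up to 1.

```python
def geometric_pmf(j: int, p: float) -> float:
    """P(N = j): j-1 failures followed by one success."""
    if j < 1:
        raise ValueError("trial index j must be >= 1")
    return (1.0 - p) ** (j - 1) * p


# First 6 on the 3rd roll of a fair die: (5/6)^2 * (1/6) = 25/216.
third_roll = geometric_pmf(3, 1 / 6)

# Partial sums approach 1, as the geometric series argument predicts.
partial_sum = sum(geometric_pmf(j, 1 / 6) for j in range(1, 201))

print(third_roll, partial_sum)
```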

  2. Example
     • Difference from “Binomial” distribution
     • Binomial(k) = P(k successes in N trials) = $\binom{N}{k}\,p^k\,(1-p)^{N-k}$
     • Geometric(k) = P(1st success after k-1 failures) = $(1-p)^{k-1}\,p$
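To make the contrast concrete, the sketch below evaluates both distributions for a fair coin. The function names are illustrative, and the formulas are the textbook ones: the binomial counts successes anywhere among N trials, while the geometric fixes the position of the first success.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k: int, p: float) -> float:
    """P(first success on trial k, after k-1 failures)."""
    return (1 - p) ** (k - 1) * p


p = 0.5
# Exactly one head somewhere in 3 flips (3 possible positions).
one_head_in_three = binomial_pmf(1, 3, p)
# First head lands on flip 3 specifically (tails, tails, heads).
first_head_on_third = geometric_pmf(3, p)

print(one_head_in_three, first_head_on_third)
```

The binomial value is larger because it does not care which of the three flips produced the head.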

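  3. Expected Value Proof
     • Expected value is the sum of each value times its probability: $E[N] = \sum_{j=1}^{\infty} j\,(1-p)^{j-1}\,p$
     • Recall the relation: $\sum_{j=1}^{\infty} j\,x^{j-1} = \frac{1}{(1-x)^2}$ for $|x| < 1$
     • Substituting $x = 1-p$ gives: $E[N] = \frac{p}{(1-(1-p))^2} = \frac{1}{p}$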
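As a numerical sanity check, the sketch below sums value times probability directly and compares the result with 1/p, the standard mean of the geometric distribution. The truncation point `terms` is an arbitrary choice for illustration, large enough that the tail is negligible for the p used here.

```python
def geometric_mean_truncated(p: float, terms: int = 10_000) -> float:
    """Approximate E[N] by summing j * P(N = j) over the first `terms` trials."""
    return sum(j * (1 - p) ** (j - 1) * p for j in range(1, terms + 1))


# Expected number of rolls until the first 6 on a fair die.
mean_rolls = geometric_mean_truncated(1 / 6)
print(mean_rolls, 1 / (1 / 6))
```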

  4. Other Properties
     • Markov Property
     • The probability of the “next step” in a discrete or continuous process depends only on the process's present state
     • The process is without memory of previous events
     • For the geometric distribution: $P(N > m + n \mid N > m) = P(N > n)$
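One way to see the lack of memory for the geometric distribution is to compare a conditional tail probability with an unconditional one. The sketch assumes the standard tail formula P(N > n) = (1-p)^n; the function names are illustrative.

```python
def survival(n: int, p: float) -> float:
    """P(N > n): the first n trials are all failures."""
    return (1 - p) ** n


def conditional_survival(m: int, n: int, p: float) -> float:
    """P(N > m + n | N > m): n more failures, given m failures so far."""
    return survival(m + n, p) / survival(m, p)


p = 1 / 6
# After 4 rolls without a 6, the chance of 3 more misses is the same
# as the chance of 3 misses starting fresh.
print(conditional_survival(4, 3, p), survival(3, p))
```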

  5. Continuous Generalization
     • Geometric distributions characterize “discrete” events
     • Sometimes we’d like to pose questions about a continuous variable, for example:
     • Probability that a population will be inbred after T years, rather than after N generations, where T is a real number and N is an integer
     • The “continuous” counterpart of the geometric distribution is the “exponential” distribution

  6. Exponential Distribution
     • The Exponential density function is characterized by one parameter, a, called the “rate” or “intensity”: $\mathrm{Exp}(a,t) = a\,e^{-at}$ for $t \ge 0$
     • To show Exp(a,t) is a proper pdf: $\int_0^{\infty} a\,e^{-at}\,dt = \left[-e^{-at}\right]_0^{\infty} = 1$
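The normalization can also be checked numerically. The sketch below assumes the usual density a·e^(-at) for t ≥ 0 and integrates it with a simple midpoint rule; the helper names, the rate a = 2, and the upper limit of 20 are illustrative choices, with the limit picked so the remaining tail is negligible.

```python
from math import exp


def exp_pdf(a: float, t: float) -> float:
    """Exponential density with rate a, zero for negative t."""
    return a * exp(-a * t) if t >= 0 else 0.0


def integrate(f, lo: float, hi: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


a = 2.0
total = integrate(lambda t: exp_pdf(a, t), 0.0, 20.0)
print(total)
```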

  7. Exponential Properties
     • Other useful properties of U = Exp(a,t) include:
     • Markov property, where t2 > t1: $P(U > t_2 \mid U > t_1) = P(U > t_2 - t_1)$
     • Assuming a second independent exponential process, V = Exp(b,t)
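The transcript does not preserve what the slide states about the second process V. Two standard facts about independent exponentials are a natural fit, and the sketch below checks them by simulation under that assumption: the minimum of U and V is exponential with rate a + b (so its mean is 1/(a+b)), and P(U < V) = a/(a+b). The function name, rates, sample count, and seed are all illustrative.

```python
import random


def simulate_two_processes(a: float, b: float, samples: int, seed: int = 0):
    """Return (mean of min(U, V), fraction of draws with U < V)."""
    rng = random.Random(seed)
    u_first = 0
    total_min = 0.0
    for _ in range(samples):
        u = rng.expovariate(a)  # waiting time of the rate-a process
        v = rng.expovariate(b)  # waiting time of the rate-b process
        total_min += min(u, v)
        if u < v:
            u_first += 1
    return total_min / samples, u_first / samples


mean_min, frac_u_first = simulate_two_processes(a=1.0, b=3.0, samples=100_000)
print(mean_min, 1 / (1.0 + 3.0))        # simulated vs. 1/(a+b)
print(frac_u_first, 1.0 / (1.0 + 3.0))  # simulated vs. a/(a+b)
```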

  8. Approximations
     • The geometric distribution can be approximated with the exponential distribution in various ways
     • Consider the following geometric distribution, in which there are at least “j” failures before the first success: $P(N > j) = (1-p)^j$
     • We can model discrete time as a rational fraction of some very large number, M, that includes all intervals of interest (i.e. 1/M, 2/M, …, N/M, …, M/M, rather than 1, 2, 3, …)
     • Assuming p is small and N is large, we can approximate “continuous” time as t = j/M and a = pM

  9. Approximations (cont)
     • Recalling t = j/M and a = pM, we can rewrite $(1-p)^j$ as: $(1-p)^j = \left(1 - \frac{a}{M}\right)^{Mt}$
     • Also note, for large M: $\left(1 - \frac{a}{M}\right)^{M} \approx e^{-a}$, so $(1-p)^j \approx e^{-at}$
     • Thus, P(T = t) = a P(N/M > t) ≈ $a\,e^{-at}$ is approximately exponential with intensity a
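The quality of the approximation can be seen by comparing the geometric tail (1-p)^j with the exponential tail e^(-at) at matching times t = j/M. The values of M and p below are illustrative choices that give a = pM = 2; the function names are made up for this sketch.

```python
from math import exp


def geometric_tail(j: int, p: float) -> float:
    """P(N > j) = (1 - p)^j: at least j failures before the first success."""
    return (1 - p) ** j


def exponential_tail(t: float, a: float) -> float:
    """P(T > t) = e^(-a t) for an exponential with intensity a."""
    return exp(-a * t)


M = 100_000   # number of discrete steps per unit of continuous time
p = 2e-5      # small per-step success probability
a = p * M     # intensity, here about 2

for t in (0.1, 0.5, 1.0, 2.0):
    j = round(t * M)
    print(t, geometric_tail(j, p), exponential_tail(t, a))
```

For these values the two tails agree to several decimal places, and the gap shrinks further as M grows with a held fixed.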

  10. The Discrete-Time Coalescent
     • We consider the N-coalescent, or the coalescent for a sample of N genes (Kingman 1982)
     • N-coalescent: What is the distribution of the number of generations to find the Most Recent Common Ancestor (MRCA) for a fixed population of 2N genes?
     • We use 2N because we recognize that the diploid case is more realistic, and it is related to the simpler haploid case by a factor of 2

  11. MRCA Examples

  12. Coalescence of two genes
     • What is the distribution of the number of prior generations for the MRCA (waiting time)?
     • Probability of a common parent (i.e. the MRCA is in the immediately previous generation) is: $\frac{1}{2N}$
     • The first gene can choose its ancestor freely, but the second must choose the same one as the first, thus it has 1 out of 2N choices
     • Probability that 2 genes have different parents is: $1 - \frac{1}{2N}$

  13. Going back further
     • Since sampling in successive generations is independent of the past, the probability that two genes find a common ancestor j generations back is: $\left(1 - \frac{1}{2N}\right)^{j-1} \frac{1}{2N}$
     • In the first j-1 generations they chose different ancestors, and then in generation j they chose the same ancestor
     • Which is a geometric distribution with p = 1/(2N)
     • Thus, the expected coalescence time for 2 genes is: $1/p = 2N$ generations
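A small simulation can illustrate the two-gene case. It assumes each gene picks its parent uniformly and independently among the 2N genes of the previous generation, as the slides describe; the function name, the population size, the number of trials, and the seed are illustrative. Since the waiting time is geometric with p = 1/(2N), the average should land near 1/p = 2N.

```python
import random


def two_gene_coalescence_time(two_n: int, rng: random.Random) -> int:
    """Generations back until two genes pick the same parent among two_n genes."""
    generations = 0
    while True:
        generations += 1
        # Each gene picks its parent uniformly at random, independently.
        if rng.randrange(two_n) == rng.randrange(two_n):
            return generations


rng = random.Random(790)
two_n = 20  # 2N genes, i.e. N = 10 diploid individuals
times = [two_gene_coalescence_time(two_n, rng) for _ in range(20_000)]
mean_time = sum(times) / len(times)

print(mean_time, two_n)  # simulated mean vs. 2N
```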

  14. MRCA Examples, N = 10

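  15. N-genes, no common parent
     • The waiting time for k ≤ 2N genes to have fewer than k lineages depends on the probability that all k genes have different parents: $\frac{2N-1}{2N} \cdot \frac{2N-2}{2N} \cdots \frac{2N-(k-1)}{2N}$
     • The 1st gene can choose its parent freely, but the next k-1 must choose from the remainder (genes without a child)
     • Manipulating a little: $\prod_{i=1}^{k-1}\left(1 - \frac{i}{2N}\right) = 1 - \binom{k}{2}\frac{1}{2N} + O\!\left(\frac{1}{N^2}\right)$
     • Where, for large N, $1/N^2$ is negligible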
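To see how little is lost by dropping the 1/N² terms, the sketch below compares two quantities. The exact one is the probability that k genes all pick different parents, assumed here to be the product of (1 - i/(2N)) for i = 1..k-1. The first-order approximation is 1 - C(k,2)/(2N). Function names and the sample values of k and 2N are illustrative.

```python
from math import comb, prod


def prob_all_distinct_parents(k: int, two_n: int) -> float:
    """Exact probability that k genes choose k different parents among two_n."""
    return prod(1 - i / two_n for i in range(1, k))


def approx_all_distinct(k: int, two_n: int) -> float:
    """First-order approximation, dropping terms of order 1/N^2."""
    return 1 - comb(k, 2) / two_n


for two_n in (20, 200, 20_000):
    exact = prob_all_distinct_parents(5, two_n)
    approx = approx_all_distinct(5, two_n)
    print(two_n, exact, approx, exact - approx)
```

The difference shrinks roughly a hundredfold each time 2N grows tenfold, which is the 1/N² behaviour the slide refers to.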

  16. N-gene Coalescence
     • The probability k genes have different parents is: $\approx 1 - \binom{k}{2}\frac{1}{2N}$
     • And one or more have a common parent: $\approx \binom{k}{2}\frac{1}{2N}$
     • Repeated failures for j generations lead to a geometric distribution, with $p = \binom{k}{2}\frac{1}{2N}$
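Under the large-N approximation, the waiting time until k lineages drop to fewer than k is geometric with p = C(k,2)/(2N), so its mean is 2N/C(k,2). The sketch below tabulates this for a few k; the function names and the value of 2N are illustrative, and the numbers carry the same approximation as the slide.

```python
from math import comb


def coalescence_probability(k: int, two_n: int) -> float:
    """Approximate per-generation probability that some pair of k genes coalesces."""
    return comb(k, 2) / two_n


def waiting_time_pmf(j: int, k: int, two_n: int) -> float:
    """P(first coalescence among k genes happens j generations back)."""
    p = coalescence_probability(k, two_n)
    return (1 - p) ** (j - 1) * p


def expected_wait(k: int, two_n: int) -> float:
    """Mean of the geometric waiting time, 1/p = 2N / C(k, 2)."""
    return two_n / comb(k, 2)


two_n = 2000
for k in (2, 5, 10):
    print(k, coalescence_probability(k, two_n), expected_wait(k, two_n))
```

More lineages means more pairs that can coalesce, so the expected wait falls quickly as k grows.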

  17. Next Time
     • Finish coalescence of N genes
     • The effect of approximations
     • The continuous-time coalescent
     • The effective population size
