A NASA satellite to track carbon dioxide in the Earth’s atmosphere failed to reach its orbit during launching Tuesday morning, scuttling the $278 million mission. (Photo: Andrew Lee/U.S. Air Force, via Associated Press) The Orbiting Carbon Observatory lifted off from Vandenberg Air Force Base in California aboard a four-stage Taurus XL rocket on Tuesday morning but failed to reach orbit and fell back to Earth, landing in the ocean just short of Antarctica. (New York Times, Feb. 24, 2009)
What do we know? We have in our tool bag a lot of the probability basics and some ready-made special distributions. These are: • Bernoulli • Binomial • Hypergeometric • Geometric • Negative Binomial. All, in one way or another, are based on the Bernoulli experiment structure.
The Poisson distribution Section 3.6 An important probability model that occurs when we are interested in counting the number of successes (S) regardless of the number of failures (F). We can get to the Poisson model in two ways: • As an approximation of the Binomial distribution • As a model describing the Poisson process
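The Poisson distribution Section 3.6 Approximating the Binomial distribution Rules for approximation: The math ones are: If $n \to \infty$, $p \to 0$, and $np \to \lambda$, then $b(x; n, p) \to p(x; \lambda)$. In practice: If n is large (n > 50) and p is small such that np < 5, then we can approximate $b(x; n, p)$ with $p(x; \lambda)$, where $\lambda = np$.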
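A short sketch of why the limit works may help here. It is not from the slides, and it makes the simplifying assumption that $p = \lambda/n$ exactly (rather than only $np \to \lambda$), with $b(x; n, p)$ the binomial pmf and $p(x; \lambda)$ the Poisson pmf:

```latex
\begin{align*}
b(x; n, p) &= \binom{n}{x} p^{x} (1-p)^{n-x}, \qquad p = \frac{\lambda}{n} \\
&= \frac{n(n-1)\cdots(n-x+1)}{n^{x}} \cdot \frac{\lambda^{x}}{x!}
   \cdot \left(1 - \frac{\lambda}{n}\right)^{n}
   \cdot \left(1 - \frac{\lambda}{n}\right)^{-x} \\
&\xrightarrow[n \to \infty]{} 1 \cdot \frac{\lambda^{x}}{x!} \cdot e^{-\lambda} \cdot 1
 = \frac{e^{-\lambda} \lambda^{x}}{x!} = p(x; \lambda).
\end{align*}
```

For fixed x, the first and last factors tend to 1 and the third tends to $e^{-\lambda}$, which is why a large n with a small p is what the practical rule asks for.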
The Poisson distribution Section 3.6 Approximating the Binomial distribution Identify the experiment of interest and understand it well (including the associated population) A binomial experiment with large n and small p that conforms to the rules above.
The Poisson distribution Section 3.6 Approximating the Binomial distribution Identify the sample space (all possible outcomes) Interested in counting the number of successes (S), so we can go directly to: S = {0, 1, 2, 3, …} Avoiding the tediousness of listing all successes and failures.
The Poisson distribution Section 3.6 Approximating the Binomial distribution Identify an appropriate random variable that reflects what you are studying. It is a one-to-one mapping of the above S! Snew = S = {0, 1, 2, 3, …}
The Poisson distribution Section 3.6 Approximating the Binomial distribution Construct the probability distribution associated with the simple events based on the random variable. Notation for the Poisson: Poisson random variable X = the number of successes (S). We say X is distributed Poisson with parameter $\lambda$ ($\lambda > 0$), pmf: $p(x; \lambda) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$, for $x = 0, 1, 2, \ldots$
The Poisson distribution Section 3.6 Approximating the Binomial distribution Using the command dpois(x, lambda=8) in R, the pmf looks like this: [Figure: plot of the Poisson pmf with λ = 8]
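The Poisson distribution Section 3.6 Construct the probability distribution associated with the simple events based on the random variable. CDF: $F(x; \lambda) = \sum_{y=0}^{x} \dfrac{e^{-\lambda} \lambda^{y}}{y!}$, tabulated in Table A.2, page 667. Mean: $E(X) = \lambda$. Variance: $V(X) = \lambda$. Standard deviation: $\sigma_X = \sqrt{\lambda}$.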
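For a Poisson random variable the pmf is $e^{-\lambda}\lambda^{x}/x!$, and the mean and variance both equal $\lambda$. These facts can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the names `poisson_pmf` and `poisson_cdf` are made up for illustration, with `poisson_pmf(x, lam)` playing the role of R's `dpois(x, lambda)`. It uses λ = 8 and truncates the infinite support at 200, an assumption that is harmless here because the tail beyond that point is negligible.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), computed on the log scale to avoid overflow."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def poisson_cdf(x, lam):
    """P(X <= x): sum of the pmf from 0 through x."""
    return sum(poisson_pmf(y, lam) for y in range(x + 1))


lam = 8.0
support = range(200)  # assumed truncation point; the tail beyond it is negligible for lam = 8

# Mean and variance computed directly from their definitions as weighted sums.
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)

print("P(X <= 8) =", round(poisson_cdf(8, lam), 4))
print("mean      =", round(mean, 4))
print("variance  =", round(variance, 4))
print("std. dev. =", round(math.sqrt(variance), 4))
```

Both the mean and the variance should come out at 8 up to rounding, and P(X ≤ 8) at roughly 0.59.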
The Poisson distribution Section 3.6 Example: Forensics in evolution! Say we have two virus DNA sequences. We have no idea where these sequences came from, though we know that they represent the same gene. The length of this gene is 500 bp (base pairs). We also know from observing the evolutionary process that, if the sequences come from the same viral species, the chance that any given base pair differs between the two sequences (due to mutation) is 0.004. This value is chosen so that we can use the tables; in practice it is usually about 0.00001.
DNA Sequence 1: AACTTTTGTTAAACCCTTTT…
DNA Sequence 2: AACTTTTGTTAAACCCTGTT…
The Poisson distribution Section 3.6 Identify the experiment of interest and understand it well (including the associated population) One can think of this experiment as obtaining a set of matched base pairs of length n (=500 in this case) out of a large set representing the whole genome (N>6000bp usually for viruses). The mutation rate (mutation probability p = 0.004) is determined based on an entire population of viruses and is independent from this particular genome. So we can justify the use of the Binomial as a model with n = 500 and p = 0.004.
The Poisson distribution Section 3.6 Identify the experiment of interest and understand it well (including the associated population) But n is large (n = 500 > 50) and p is small, with np = 2 < 5, so we can simplify life and approximate using the Poisson with $\lambda = np = 500 \times 0.004 = 2$.
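The Poisson distribution Section 3.6 Identify the sample space (all possible outcomes): S = {0, 1, 2, 3, …}. Identify an appropriate random variable that reflects what you are studying: X = the number of base pairs that differ between the two sequences. Construct the probability distribution associated with the simple events based on the random variable. pmf: $p(x; 2) = \dfrac{e^{-2}\, 2^{x}}{x!}$, for $x = 0, 1, 2, \ldots$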
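To see how close the approximation is in the DNA example, the exact binomial probabilities (n = 500, p = 0.004) can be put next to the Poisson ones (λ = 2). This is a minimal Python sketch rather than anything from the slides; the function names are hypothetical, and the choice to tabulate only x = 0 through 6 is an assumption made to keep the output short.

```python
import math


def binomial_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson probability p(x; lam) = exp(-lam) * lam**x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


n, p = 500, 0.004  # sequence length and per-base-pair mutation probability from the example
lam = n * p        # Poisson parameter, lambda = np = 2

# Probability of x differing base pairs, exact versus approximate, for small x.
rows = [(x, binomial_pmf(x, n, p), poisson_pmf(x, lam)) for x in range(7)]
for x, exact, approx in rows:
    print(f"x={x}  binomial={exact:.4f}  poisson={approx:.4f}")

largest_gap = max(abs(exact - approx) for _, exact, approx in rows)
print("largest gap:", largest_gap)
```

Over this range the two columns differ by less than 0.001, which is the sense in which the Poisson "simplifies life" here.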
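The Poisson distribution Section 3.6 As a model describing the Poisson process This is a process of counting events, usually over time. Assumptions of this process:
1. There exists a parameter $\alpha > 0$ such that, for any short time interval of length $\Delta t$, the probability that exactly one event occurs is $\alpha \Delta t + o(\Delta t)$.
2. There is a very small chance, $o(\Delta t)$, that 2 or more events will occur in $\Delta t$.
3. The number of events observed in $\Delta t$ is independent of the number occurring in any other period.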
The Poisson distribution Section 3.6 As a model describing the Poisson process $o(\Delta t)$ is a very small value such that $o(\Delta t)/\Delta t \to 0$ very fast as $\Delta t \to 0$.
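As a sketch of how these assumptions lead to a Poisson count, consider the probability of seeing no events by time t. The notation $P_0(t)$ is introduced here for illustration and is not from the slides. The derivation assumes that in a short interval $\Delta t$ exactly one event has probability $\alpha \Delta t + o(\Delta t)$, two or more events have probability $o(\Delta t)$, and disjoint intervals are independent:

```latex
\begin{align*}
P_0(t + \Delta t) &= P_0(t)\,\bigl[1 - \alpha \Delta t + o(\Delta t)\bigr] \\
\frac{P_0(t + \Delta t) - P_0(t)}{\Delta t}
  &= -\alpha P_0(t) + P_0(t)\,\frac{o(\Delta t)}{\Delta t} \\
\Delta t \to 0:\quad P_0'(t) &= -\alpha P_0(t), \qquad P_0(0) = 1 \\
\Rightarrow\quad P_0(t) &= e^{-\alpha t}.
\end{align*}
```

The result $e^{-\alpha t}$ is the x = 0 value of a Poisson pmf with parameter $\alpha t$; the probabilities for x = 1, 2, … follow from the same kind of argument.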
The Poisson distribution Section 3.6 As a model describing the Poisson process Identify the experiment of interest and understand it well (including the associated population) A Poisson process where we are counting the number of successes (S) over a time period t. The rate of success per unit time is $\alpha$.
The Poisson distribution Section 3.6 As a model describing the Poisson process Identify the sample space (all possible outcomes) Interested in counting the number of successes (S) within a time interval t, so we can go directly to: S = {0, 1, 2, 3, …} Avoiding the tediousness of listing all successes and failures.
The Poisson distribution Section 3.6 As a model describing the Poisson process Identify an appropriate random variable that reflects what you are studying within time period t. It is a one-to-one mapping of the above S! Snew = S = {0, 1, 2, 3, …}
The Poisson distribution Section 3.6 As a model describing the Poisson process Construct the probability distribution associated with the simple events based on the random variable. Poisson random variable X = the number of successes (S) within time period t. We say X is distributed Poisson with parameter $\alpha t$, pmf: $p(x; \alpha t) = \dfrac{e^{-\alpha t} (\alpha t)^{x}}{x!}$, for $x = 0, 1, 2, \ldots$
The Poisson distribution Section 3.6 Using the command dpois(x, lambda=8) in R, where lambda is set to αt = 8, the pmf looks like this: [Figure: plot of the Poisson pmf with αt = 8]
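The Poisson distribution Section 3.6 Construct the probability distribution associated with the simple events based on the random variable. CDF: $F(x; \alpha t) = \sum_{y=0}^{x} \dfrac{e^{-\alpha t} (\alpha t)^{y}}{y!}$, tabulated in Table A.2, page 667. Mean: $E(X) = \alpha t$. Variance: $V(X) = \alpha t$. Standard deviation: $\sigma_X = \sqrt{\alpha t}$.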
The Poisson distribution Section 3.6 Example: The mean number of cars passing the Sixth and Mountain View intersection, close to the edge of Moscow, is 5 per hour.
• Find the probability of observing more than 15 cars pass by that intersection in 2 hours.
• Find the chance of observing fewer than 6 cars pass through in 3 hours.
• What is the mean number of cars you expect to observe pass through in 4 hours? What is the standard deviation?
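One way to work this example is to treat the car count over t hours as Poisson with parameter αt, where α = 5 cars per hour. The sketch below is a hedged illustration in Python, not the slides' own solution; the helper names are made up, and the same numbers could be read from the cumulative Poisson table instead.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability p(x; lam) = exp(-lam) * lam**x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x): sum of the pmf from 0 through x."""
    return sum(poisson_pmf(y, lam) for y in range(x + 1))


alpha = 5.0  # mean rate from the example, in cars per hour

# More than 15 cars in 2 hours: parameter alpha * t = 10, and P(X > 15) = 1 - P(X <= 15).
p_more_than_15 = 1 - poisson_cdf(15, alpha * 2)

# Fewer than 6 cars in 3 hours: parameter 15, and "fewer than 6" means X <= 5.
p_fewer_than_6 = poisson_cdf(5, alpha * 3)

# Over 4 hours the mean is alpha * t and the standard deviation is its square root.
mean_4h = alpha * 4
sd_4h = math.sqrt(mean_4h)

print(round(p_more_than_15, 4))  # 0.0487
print(round(p_fewer_than_6, 4))  # 0.0028
print(mean_4h)                   # 20.0
print(round(sd_4h, 3))           # 4.472
```

Note that the parameter changes with the length of the observation window, so each part of the question uses a different value of αt.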
The Poisson distribution Section 3.6 We can get to the Poisson model in two ways: • As an approximation of the Binomial distribution • As a model describing the Poisson process. From a data perspective: plot the data; if it is count data and the variation increases as the count increases, then it can be modeled using a Poisson distribution. We’ll keep this in mind until later.
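The remark about variation growing with the count can be illustrated with simulated data. The following Python sketch is an assumption-laden illustration, not part of the slides: it draws Poisson counts using Knuth's multiplication method (a standard textbook sampler, swapped in here because the Python standard library has no Poisson generator), and the three mean values and the seed are arbitrary choices.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Keep multiplying uniforms until the running product drops to exp(-lam) or below.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(2009)  # fixed seed (arbitrary) so the run is repeatable
summary = {}
for lam in (2, 5, 20):     # hypothetical mean counts, chosen only for illustration
    draws = [sample_poisson(lam, rng) for _ in range(5000)]
    summary[lam] = (statistics.mean(draws), statistics.variance(draws))

for lam, (m, v) in summary.items():
    print(f"lambda={lam}: sample mean={m:.2f}, sample variance={v:.2f}")
```

In each row the sample variance should land near the sample mean, so groups with larger counts also show larger spread, which is the pattern to look for in a plot of real count data.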