Topics Covered • Probability-Related Concepts • How to Assign Probabilities to Experimental Outcomes • Probability Rules • Discrete Random Variables • Continuous Random Variables • Probability Distribution & Functions
Concepts • An event – Any phenomenon you can observe that can have more than one outcome (e.g., flipping a coin) • An outcome – Any unique condition that can be the result of an event (e.g., for a coin flip: heads or tails), a.k.a. a simple event or sample point • Sample space – The set of all possible outcomes associated with an event • Probability – A measure of the likelihood of each possible outcome
Probability Distributions • The usual application of probability distributions is to find a theoretical distribution • Reflects a process that explains what we see in some observed sample of a geographic phenomenon • Compare the form of the sampled information and theoretical distribution through a test of significance • Geography: discrete random events in space and time (e.g. how often will a tornado occur?)
Discrete Probability Distributions • Discrete probability distributions • The Uniform Distribution • The Binomial Distribution • The Poisson Distribution • Each is appropriately applied in certain situations and to particular phenomena
The Uniform Distribution Source: http://davidmlane.com/hyperstat/A12237.html
The Uniform Distribution • Describes the situation where the probability of all outcomes is the same • For n outcomes, P(x_i) = 1/n • e.g. flipping a coin: P(x_heads) = 1/2 = P(x_tails) [Figure: a uniform probability mass function, P(x_i) plotted against x_i (heads, tails)]
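Source: http://en.wikipedia.org/wiki/Uniform_distribution_(discrete) [Figure: piecewise definitions of the discrete uniform distribution, with cases a <= x <= b and otherwise, and x < a, a <= x <= b, x > b]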
The Uniform Distribution • May seem simplistic and of little use • But it applies well in two situations • 1. The probability of each outcome is truly equal (e.g. the coin toss) • 2. There is no prior knowledge of how a variable is distributed (i.e. complete uncertainty); the uniform distribution is then the first one to use, because it makes no assumptions about the distribution
The Uniform Distribution • However, truly uniformly distributed geographic phenomena are somewhat rare • We often encounter the situation of not knowing how something is distributed until we sample it • When we are resisting making assumptions we usually apply the uniform distribution as a sort of null hypothesis of distribution
The Uniform Distribution • Example – Predict the direction of the prevailing wind with no prior knowledge of the weather system’s tendencies in the area • We would have to begin with the idea that P(x_North) = 1/4, P(x_East) = 1/4, P(x_South) = 1/4, P(x_West) = 1/4 • Until we had an opportunity to sample and find out some tendency in the wind pattern based on those observations [Figure: uniform probability mass function, P(x_i) plotted against x_i (E, N, S, W)]
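The equal-probability assignment used for the coin and the wind direction can be sketched in a few lines of Python. This is only an illustration; the helper name `uniform_pmf` is an assumption made for this sketch and does not come from the slides.

```python
from fractions import Fraction

def uniform_pmf(outcomes):
    """Assign the same probability 1/n to each of the n listed outcomes.

    `uniform_pmf` is a hypothetical helper name used only for this sketch.
    """
    n = len(outcomes)
    return {outcome: Fraction(1, n) for outcome in outcomes}

wind = uniform_pmf(["North", "East", "South", "West"])
coin = uniform_pmf(["heads", "tails"])

print(wind["North"], coin["heads"])  # 1/4 1/2
print(sum(wind.values()))            # 1
```

Exact fractions are used so that the probabilities sum to exactly 1, which is easy to check by eye.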
(Binford 2005) Remote Sensing Supervised classification (Lillesand et al. 2004)
The Binomial Distribution • Provides information about the probability of the repetition of events when there are only two possible outcomes • e.g. heads or tails, left or right, success or failure, rain or no rain … • Events with multiple outcomes may be simplified as events with two outcomes (e.g., forest or non-forest) • It characterizes the probability that a given proportion of the events have a certain outcome over a specified number of events
The Binomial Distribution • A binomial distribution is produced by a set of Bernoulli trials (Jacques Bernoulli) • The law of large numbers for independent trials is at the heart of probability theory • Given enough observed events, the observed relative frequencies should approach the theoretical probabilities given by the probability distribution • e.g. with enough coin tosses, the observed proportion of each outcome (heads or tails) should approach P = 0.5
How to test the law? • A set of Bernoulli trials is the way to operationally test the law of large numbers using an event with two possible outcomes: • (1) N independent trials of an experiment (i.e. an event like a coin toss) are performed • (2) Every trial must have the same set of possible outcomes (e.g., heads and tails)
Bernoulli Trials • (3) The probability of each outcome must be the same for all trials, i.e. P(x_i) must be the same each time for both x_i values • (4) The resulting random variable is the number of successes in the trials (a success is one of the two outcomes) • p = the probability of success in a trial • q = (1 – p) = the probability of failure in a trial • p + q = 1
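A small simulation can illustrate how a set of Bernoulli trials is used to check the law of large numbers. The sketch below assumes a pseudo-random generator is an acceptable stand-in for a real coin; the function name `success_fraction` and the fixed seed are choices made for this example only.

```python
import random

def success_fraction(n_trials, p=0.5, seed=0):
    """Run n_trials independent Bernoulli trials with success probability p
    and return the observed fraction of successes."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    successes = sum(1 for _ in range(n_trials) if rng.random() < p)
    return successes / n_trials

# The observed fraction should settle near p = 0.5 as the number of trials grows.
for n in (10, 100, 10_000, 1_000_000):
    print(n, success_fraction(n))
```

Each trial uses the same two outcomes and the same p, matching conditions (1) to (4); only the number of trials changes between runs.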
Bernoulli Trials • Suppose on a series of successive days, we will record whether or not it rains in Chapel Hill • We will denote the 2 outcomes using R when it rains and N when it does not rain
n   Possible Outcomes   # of Rain Days   P(# of Rain Days)
1   R                   1                p = p
1   N                   0                (1 – p) = q
2   RR                  2                p^2 = p^2
2   RN NR               1                2[p * (1 – p)] = 2pq
2   NN                  0                (1 – p)^2 = q^2
3   RRR                 3                p^3 = p^3
3   RRN RNR NRR         2                3[p^2 * (1 – p)] = 3p^2q
3   NNR NRN RNN         1                3[p * (1 – p)^2] = 3pq^2
3   NNN                 0                (1 – p)^3 = q^3
Bernoulli Trials • If we have a value for P(R) = p, we can substitute it into the above equations to get the probability of each outcome from a series of successive samples (e.g. p = 0.2)
n   Possible Outcomes   # of Rain Days   P(# of Rain Days)
1   R                   1                p = 0.2
1   N                   0                q = 0.8
2   RR                  2                p^2 = 0.04
2   RN NR               1                2pq = 0.32
2   NN                  0                q^2 = 0.64
3   RRR                 3                p^3 = 0.008
3   RRN RNR NRR         2                3p^2q = 0.096
3   NNR NRN RNN         1                3pq^2 = 0.384
3   NNN                 0                q^3 = 0.512
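The table values can be checked by brute force: list every sequence of R and N for n days, multiply the per-day probabilities, and add up the sequences that share the same number of rain days. The sketch below does this for n = 3 and p = 0.2; the function name `rain_day_distribution` is an assumption made for this example.

```python
from collections import defaultdict
from itertools import product

def rain_day_distribution(n_days, p):
    """Return {number of rain days: probability} by enumerating all R/N sequences."""
    q = 1 - p
    dist = defaultdict(float)
    for sequence in product("RN", repeat=n_days):  # e.g. ('R', 'N', 'R')
        k = sequence.count("R")                    # rain days in this sequence
        dist[k] += p**k * q**(n_days - k)          # independent days multiply
    return dict(dist)

dist = rain_day_distribution(3, 0.2)
for k in sorted(dist, reverse=True):
    print(k, round(dist[k], 3))
# 3 0.008
# 2 0.096
# 1 0.384
# 0 0.512
```

Enumeration grows as 2^n, so it is only practical for small n, but it makes the origin of the coefficients 1, 3, 3, 1 visible.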
Bernoulli Trials • A graphical representation [Figure: probability plotted against # of successes. Source: Earickson, RJ, and Harlin, JM. 1994. Geographic Measurement and Quantitative Analysis. USA: Macmillan College Publishing Co., p. 132.] • The sum of the probabilities can be expressed using the binomial expansion of (p + q)^n, where n = # of events:
1 event:  S = (p + q)^1 = p + q
2 events: S = (p + q)^2 = p^2 + 2pq + q^2
3 events: S = (p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3
4 events: S = (p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4
The Binomial Distribution • A general formula for calculating the probability of x successes (given n trials and a probability p of success): P(x) = C(n,x) * p^x * (1 – p)^(n – x) • where C(n,x) is the number of possible combinations of x successes and (n – x) failures: C(n,x) = n! / (x! * (n – x)!)
The Binomial Distribution – Example • e.g., the probability of 2 successes in 4 trials, given p = 0.2, is:
P(2) = [4! / (2! * (4 – 2)!)] * (0.2)^2 * (1 – 0.2)^(4 – 2)
P(2) = [24 / (2 * 2)] * (0.2)^2 * (0.8)^2
P(2) = 6 * (0.04) * (0.64) = 0.1536
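The general formula translates directly into code. The sketch below uses `math.comb` from the Python standard library (available from Python 3.8) for C(n,x); the function name `binomial_pmf` is an assumption made for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * (1 - p)^(n - x): probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(comb(4, 2))                         # 6
print(round(binomial_pmf(2, 4, 0.2), 4))  # 0.1536
```

The two printed values correspond to the combination count and the final probability in the worked example for 2 successes in 4 trials.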
The Binomial Distribution – Example • Calculating the probabilities of all possible numbers of rain days out of four days (p = 0.2):
x   C(n,x)   p^x       (1 – p)^(n – x)   P(x)
0   1        (0.2)^0   (0.8)^4           0.4096
1   4        (0.2)^1   (0.8)^3           0.4096
2   6        (0.2)^2   (0.8)^2           0.1536
3   4        (0.2)^3   (0.8)^1           0.0256
4   1        (0.2)^4   (0.8)^0           0.0016
• The chance of having one or more days of rain out of four: P(1) + P(2) + P(3) + P(4) = 0.5904
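The full table for n = 4 and p = 0.2 can be reproduced with a short loop, and the chance of at least one rain day can be cross-checked through the complement 1 – P(0). The variable names below are illustrative only.

```python
from math import comb

n, p = 4, 0.2

# P(x) for x = 0, 1, ..., n using the binomial formula
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(x, comb(n, x), round(prob, 4))

at_least_one = sum(pmf[1:])    # P(1) + P(2) + P(3) + P(4)
via_complement = 1 - pmf[0]    # same quantity, computed as 1 - P(0)
print(round(at_least_one, 4), round(via_complement, 4))  # 0.5904 0.5904
```

Using the complement is usually the shorter route when the question is "one or more", since only P(0) has to be evaluated.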
The Binomial Distribution – Example • Naturally, we can plot the probability mass function produced by this binomial distribution:
x_i   P(x_i)
0     0.4096
1     0.4096
2     0.1536
3     0.0256
4     0.0016
[Figure: binomial probability mass function, P(x_i) plotted against x_i = 0, 1, 2, 3, 4]
The following is the plot of the binomial probability density function for four values of p and n = 100 Source: http://www.itl.nist.gov/div898/handbook/eda/section3/eda366i.htm
Source: http://home.xnet.com/~fidler/triton/math/review/mat170/probty/p-dist/discrete/Binom/binom1.htm
Source: http://www.mpimet.mpg.de/~vonstorch.jinsong/stat_vls/s3.pdf
Rare Discrete Random Events • Some discrete random events in question happen rarely (if at all), and the time and place of these events are independent and random (e.g., tornados) • The greatest probability is zero occurrences at a certain time or place, with a small chance of one occurrence, an even smaller chance of two occurrences, etc. • The resulting distribution is heavily peaked and skewed [Figure: probability mass function, P(x_i) plotted against x_i = 0, 1, 2, 3, 4]
The Poisson Distribution • In the 1830s, S.D. Poisson described a distribution with these characteristics • Describing the number of events that will occur within a certain area or duration (e.g. # of meteorite impacts per state, # of tornados per year, # of hurricanes in NC) • Poisson distribution’s characteristics: • 1. It is used to count the number of occurrences of an event within a given unit of time, area, volume, etc. (therefore a discrete distribution)
The Poisson Distribution • 2. The probability that an event will occur within a given unit must be the same for all units (i.e. the underlying process governing the phenomenon must be invariant) • 3. The number of events occurring per unit must be independent of the number of events occurring in other units (no interactions) • 4. The mean or expected number of events per unit (λ) is found by past experience (observations)
The Poisson Distribution • Poisson formulated his distribution as follows: P(x) = (e^(–λ) * λ^x) / x! • where e = 2.71828 (base of the natural logarithm), λ = the mean or expected value, x = 0, 1, 2, … is the number of occurrences, and x! = x * (x – 1) * (x – 2) * … * 2 * 1 (with 0! = 1) • To calculate a Poisson distribution, you must know λ
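The Poisson formula can be evaluated directly with the standard library. In the sketch below, the function name `poisson_pmf` and the value λ = 0.5 are assumptions chosen only to illustrate the calculation.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)

lam = 0.5  # illustrative mean number of events per unit (an assumption)
for x in range(5):
    print(x, round(poisson_pmf(x, lam), 4))
```

For a small λ such as this one, the largest probability falls at x = 0 and the values drop off quickly, which is the peaked, skewed shape expected of rare events.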
The Poisson Distribution • P(x) = (e^(–λ) * λ^x) / x! • The shape of the distribution depends strongly upon the value of λ: as λ increases, the distribution becomes less skewed, eventually approaching a normal-shaped distribution as λ gets quite large • We can evaluate P(x) for any value of x, but large values of x will have very small values of P(x)
The Poisson Distribution • P(x) = (e^(–λ) * λ^x) / x! • The Poisson distribution can be defined as the limiting case of the binomial distribution as n → ∞ and p → 0 while the product np = λ stays constant
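The limiting relationship can be checked numerically by holding np = λ fixed while n grows. The sketch below is an illustration under assumed values (λ = 2 and x = 3); the function names are hypothetical helpers, not part of the slides.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x occurrences with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 2.0, 3  # illustrative values (assumptions)
for n in (10, 100, 1_000, 10_000):
    p = lam / n  # p shrinks as n grows, so n * p stays equal to lam
    print(n, binomial_pmf(x, n, p), poisson_pmf(x, lam))
```

The binomial column should move toward the fixed Poisson value as n increases.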
The Poisson Distribution • The Poisson distribution is sometimes known as the Law of Small Numbers, because it describes the behavior of events that are rare • We can observe the frequency of some rare phenomenon, find its mean occurrence, and then construct a Poisson distribution and compare our observed values to those from the distribution (effectively expected values) to see the degree to which our observed phenomenon is obeying the Law of Small Numbers:
Murder rates in Fayetteville • Fitting a Poisson distribution to the 24-hour murder rates in Fayetteville in a 31-day month (to ask the question ‘Do murders randomly occur in time?’)
# of Murders   Days (Frequency)
0              17
1              9
2              3
3              1
4              1
Total          31 days
Murder rates in Fayetteville
# of Murders   Days      # of Murders * # of Days
0              17        0
1              9         9
2              3         6
3              1         3
4              1         4
Total          31 days   22
λ = mean murders per day = 22 / 31 = 0.71
Murder rates in Fayetteville • λ = mean murders per day = 22 / 31 = 0.71
x (# of Murders)   Obs. Frequency (Fobs)   x * Fobs   Fexp
0                  17                      0          15.2
1                  9                       9          10.9
2                  3                       6          3.7
3                  1                       3          0.9
4                  1                       4          0.2
Total              31                      22         30.9
• We can compare Fobs to Fexp using a chi-square (χ²) test to see whether the observations match a Poisson distribution
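As a rough illustration of the comparison, the chi-square statistic is the sum of (Fobs – Fexp)^2 / Fexp over the classes. The sketch below only computes that statistic from the tabulated values; it does not look up a critical value, and it assumes every class is kept separate, whereas classes with very small expected frequencies are often pooled before testing.

```python
# Observed and expected frequencies as tabulated for x = 0, 1, 2, 3, 4 murders per day
f_obs = [17, 9, 3, 1, 1]
f_exp = [15.2, 10.9, 3.7, 0.9, 0.2]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
# Assumption: no pooling of sparse classes, which a real test would usually require.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(f_obs, f_exp))
print(round(chi_sq, 3))
```

Most of the statistic comes from the x = 4 class, where the expected frequency is only 0.2; that sensitivity is why sparse classes are commonly combined.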
The Poisson Distribution • Procedure for finding Poisson probabilities and expected frequencies: • (1) Set up a table with five columns as on the previous slide • (2) Multiply the values of x by their observed frequencies (x * Fobs) • (3) Sum the columns of Fobs (observed frequency) and x * Fobs • (4) Compute λ = Σ (x * Fobs) / Σ Fobs • (5) Compute P(x) values using the equation or a table • (6) Compute the values of Fexp = P(x) * Σ Fobs
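The six steps above can be sketched in Python using the Fayetteville frequencies. Variable names are illustrative, and because λ is kept at full precision here rather than rounded to 0.71, the expected frequencies may differ slightly from the tabulated ones.

```python
from math import exp, factorial

# Steps 1-2: observed frequencies, keyed by x = number of murders per day
observed = {0: 17, 1: 9, 2: 3, 3: 1, 4: 1}

# Step 3: column sums
total_days = sum(observed.values())                      # sum of Fobs
total_murders = sum(x * f for x, f in observed.items())  # sum of x * Fobs

# Step 4: mean number of events per unit
lam = total_murders / total_days

# Steps 5-6: Poisson probabilities and expected frequencies
expected = {}
for x in observed:
    p_x = exp(-lam) * lam**x / factorial(x)
    expected[x] = p_x * total_days

print(total_days, total_murders, round(lam, 2))  # 31 22 0.71
for x in observed:
    print(x, observed[x], round(expected[x], 1))
```

The expected frequencies sum to slightly less than 31 because the Poisson distribution also assigns a small probability to x = 5 and above.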
Source: http://www.mpimet.mpg.de/~vonstorch.jinsong/stat_vls/s3.pdf