Probability Distributions and Dataset Properties
Lecture 2
Likelihood Methods in Forest Ecology, October 9th–20th, 2006
Statistical Inference
[Diagram: Data, a Probability Model (the statistical hypothesis), and a Scientific Model (the scientific hypothesis), linked by Inference.]
Parametric perspective on inference
[Diagram: Data lead to Inference through a Scientific Model (hypothesis test, often with linear models) and a Probability Model (typically normal).]
Likelihood perspective on inference
[Diagram: Data lead to Inference through a Probability Model and a Scientific Model (the hypothesis).]
An example...
The Data: xi = measurements of DBH on 50 trees; yi = measurements of crown radius on those trees.
The Scientific Model: yi = a + b xi + e (a linear relationship, with two parameters (a, b) and an error term e, the residuals).
The Probability Model: e is normally distributed, with E[e] = 0 and variance estimated from the observed variance of the residuals...
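A minimal sketch of this setup, with simulated data standing in for the 50 trees (the "true" coefficients a = 1.2 and b = 0.07 and the noise level are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Scientific model: crown radius = a + b * DBH; hypothetical "true" values
a_true, b_true, sigma = 1.2, 0.07, 0.5

# Simulate 50 trees: DBH in cm, then crown radius with normal noise
dbh = rng.uniform(10, 80, size=50)
crown = a_true + b_true * dbh + rng.normal(0, sigma, size=50)

# Fit the linear (scientific) model; np.polyfit returns [slope, intercept]
b_hat, a_hat = np.polyfit(dbh, crown, 1)

# Probability model: residuals assumed normal with mean 0;
# estimate the error variance from the observed residuals
residuals = crown - (a_hat + b_hat * dbh)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, var(e) = {residuals.var(ddof=2):.3f}")
```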
The triangle of statistical inference: Model
• Models clarify our understanding of nature.
• They help us understand the importance (or unimportance) of individual processes and mechanisms.
• Since they are not hypotheses, they can never be “correct”.
• We don’t “reject” models; we assess their validity.
• We establish what’s “true” by establishing which model the data support.
The triangle of statistical inference: Probability distributions
• Data are never “clean”.
• Most models are deterministic: they describe the average behavior of a system but not the noise or variability. To compare models with data, we need a statistical model that describes the variability.
• We must understand the processes giving rise to variability in order to select the correct probability density function (error structure) for the noise.
An example: Can we predict crown radius using tree diameter?
[Diagram: Data, Probability Model, and Scientific Model linked by Inference.]
The Data: xi = measurements of DBH on 50 trees; yi = measurements of crown radius on those trees.
The Scientific Model: yi = a + b DBHi + e.
The Probability Model: e is normally distributed.
Why do we care about probability?
• It is the foundation of the theory of statistics.
• It describes uncertainty (error): measurement error and process error.
• It is needed to understand likelihood theory, which is required for:
• Estimating model parameters.
• Model selection (which hypothesis do the data support?).
Error (noise, variability) is your friend! • Classical statistics are built around the assumption that the variability is normally distributed. • But…normality is in fact rare in ecology. • Non-normality is an opportunity to: • Represent variability in a more realistic way. • Gain insights into the process of interest.
The likelihood framework
Ask a biological question → Collect data → Ecological model (models the signal) → Probability model (models the noise) → Model selection, estimate parameters, estimate support regions → Answer the question. (After Bolker, notes.)
Probability Concepts
• An experiment is an operation with an uncertain outcome.
• A sample space is the set of all possible outcomes of an experiment.
• An event is a particular outcome of an experiment, a subset of the sample space.
Random Variables
• A random variable is a function that assigns a numeric value to every outcome (event) of an experiment or sample. For instance:
• Tree growth = f(DBH, light, soil, …)
Functions and probability density functions
• A function is a formula expressing a relationship between two variables. Functions serve as the scientific model, e.g. crown radius = a + b * DBH.
• Pdf’s are used to model the noise: e = Y − (a + b * DBH).
• All pdf’s are functions, BUT NOT all functions are pdf’s. WE WILL TALK ABOUT THIS LATER.
Probability Density Functions: properties
• A function that assigns probabilities to ALL the possible values of a random variable (x).
[Figure: probability density f(x) plotted against x.]
Probability Density Functions: Expectations
• The expectation of a random variable X is the weighted average of the possible values that X can take, each value weighted by the probability that X assumes it.
• Analogous to a “center of gravity”. The first moment.
• Example: for X taking the values −1, 0, 1, 2 with p(−1) = 0.10, p(0) = 0.25, p(1) = 0.30, p(2) = 0.35, E[X] = (−1)(0.10) + (0)(0.25) + (1)(0.30) + (2)(0.35) = 0.90.
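A one-line numpy check of that worked example (values and probabilities taken from the slide):

```python
import numpy as np

values = np.array([-1, 0, 1, 2])
probs = np.array([0.10, 0.25, 0.30, 0.35])

# Expectation: probability-weighted average of the values
print(np.sum(values * probs))  # 0.9
```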
Probability Density Functions: Variance
• The variance of a random variable X reflects the spread of its values around the expected value: Var(X) = E[(X − E[X])²].
• The second central moment of a distribution.
Probability Distributions
• A function that assigns probabilities to the possible values of a random variable (X).
• They come in two flavors:
• DISCRETE: outcomes are a set of discrete possibilities, such as integers (e.g., counts).
• CONTINUOUS: a probability distribution over a continuous range (the real numbers or the non-negative real numbers).
Probability Mass Functions
For a discrete random variable X, the probability that X takes on the value x is given by a discrete density function f(x), also known as the probability mass (or distribution) function.
[Figure: bar chart of probability against event x = 0, 1, …, 20.]
Probability Density Functions: Continuous variables
A probability density function f(x) gives the probability that a random variable X takes on values within a range:
P{a < X < b} = ∫ab f(x) dx, with f(x) ≥ 0 and ∫−∞∞ f(x) dx = 1.
[Figure: density curve with the area between a and b shaded.]
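A quick numerical illustration of these properties, using the standard normal as the pdf (an arbitrary choice):

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

# P{a < X < b} is the area under the density between a and b
a, b = -1.0, 1.0
area, _ = integrate.quad(norm.pdf, a, b)
print(f"P({a} < X < {b}) = {area:.4f}")   # ~0.6827 for a standard normal

# A pdf is non-negative everywhere and integrates to 1 over the real line
total, _ = integrate.quad(norm.pdf, -np.inf, np.inf)
print(f"Total area = {total:.4f}")        # 1.0000
```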
Some rules of probability, assuming independence
• For independent events A and B: P(A and B) = P(A) P(B).
• For any events A and B: P(A or B) = P(A) + P(B) − P(A and B).
[Venn diagram of events A and B.]
Real data: Histograms
[Figure: histograms of the same variable at sample sizes n = 10, 50, 100, 500, and 1000; as n increases, the shape of the distribution becomes increasingly smooth and well defined.]
Histograms and PDF’s
Probability density functions approximate the distribution of finite data sets.
[Figure: histogram of n = 1000 observations.]
Uses of Frequency Distributions
• Empirical (frequentist):
• Make predictions about the frequency of a particular event.
• Judge whether an observation belongs to a population.
• Theoretical:
• Make predictions about the distribution of the data based on basic assumptions about the nature of the forces acting on a particular biological system.
• Describe the randomness in the data.
Some useful distributions
• Discrete
• Binomial: two possible outcomes.
• Poisson: counts.
• Negative binomial: counts.
• Multinomial: multiple categorical outcomes.
• Continuous
• Normal.
• Lognormal.
• Exponential.
• Gamma.
• Beta.
An example: Seed predation
Let x = the number of seeds taken between observations at t1 and t2, out of N available (x ranges from 0 to N). Assume each seed has an equal probability p of being taken. Then x follows a binomial distribution:
P(x) = [N! / (x!(N − x)!)] p^x (1 − p)^(N − x),
where the binomial coefficient N!/(x!(N − x)!) is the normalization constant.
Binomial distribution: Discrete events that can take one of two values
E[x] = np; Variance = np(1 − p), where n = number of sites (here n = 20) and p = probability of survival (here p = 0.5).
Example: probability of survival derived from population data.
[Figure: binomial probability mass function for n = 20, p = 0.5.]
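A short sketch of this slide's binomial example (n = 20 sites, p = 0.5, as above):

```python
from scipy.stats import binom

n, p = 20, 0.5  # number of sites, probability of survival

# Probability mass function over all possible outcomes 0..n
for x in range(n + 1):
    print(f"P(x = {x:2d}) = {binom.pmf(x, n, p):.4f}")

# Moments match the formulas on the slide: E[x] = np, Var = np(1 - p)
mean, var = binom.stats(n, p, moments="mv")
print(f"E[x] = {mean}, Var[x] = {var}")  # 10.0, 5.0
```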
Poisson Distribution: Counts (or getting hit in the head by a horse)
k = number of seedlings; λ = arrival rate (alternative parameterization: λ = rt).
[Figure: Poisson probability mass function over the number of seedlings per quadrat, 0 to 7.]
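A sketch of the Poisson pmf for seedling counts (the rate λ = 1.0 is a hypothetical value, chosen only for illustration):

```python
from scipy.stats import poisson

lam = 1.0  # arrival rate (lambda); hypothetical value

# P(k seedlings in a quadrat) for small counts
for k in range(8):
    print(f"P(k = {k}) = {poisson.pmf(k, lam):.4f}")

# A defining property of the Poisson: mean and variance are both lambda
mean, var = poisson.stats(lam, moments="mv")
print(f"E[X] = {mean}, Var[X] = {var}")
```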
Example: Number of seedlings in census quadrats
Alchornea latifolia. (Data from LFDP, Puerto Rico.)
[Figure: histogram of the number of seedlings per trap, 0 to 100.]
Clustering in space or time
• Poisson process: E[X] = Variance[X].
• Overdispersed (clumped or patchy): E[X] < Variance[X]; consider a negative binomial.
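A minimal sketch of this diagnostic, comparing the sample mean and variance of hypothetical clumped counts:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical clumped counts: a Poisson whose rate itself varies by quadrat
# (a gamma-Poisson mixture, which is one route to the negative binomial)
rates = rng.gamma(shape=0.5, scale=2.0, size=1000)
counts = rng.poisson(rates)

m, v = counts.mean(), counts.var(ddof=1)
print(f"mean = {m:.3f}, variance = {v:.3f}")
# variance > mean indicates overdispersion relative to a Poisson process
```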
Negative binomial: Tables 4.2 and 4.3 in H&M, bycatch data
E[X] = 0.279; Variance[X] = 1.56. The variance far exceeds the mean, which suggests temporal or spatial aggregation in the data!
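A method-of-moments sketch matching a negative binomial to these bycatch moments (the parameterization follows scipy's nbinom; the moment conversion is a standard one, not taken from H&M):

```python
from scipy.stats import nbinom

m, v = 0.279, 1.56  # sample mean and variance from the bycatch data

# Method of moments: scipy's nbinom(n, p) has mean n(1-p)/p and
# variance n(1-p)/p**2, so p = m/v and n = m**2/(v - m)
p = m / v
n = m**2 / (v - m)
print(f"n = {n:.4f}, p = {p:.4f}")

# Check that the fitted distribution reproduces the observed moments
mean, var = nbinom.stats(n, p, moments="mv")
print(f"E[X] = {mean:.3f}, Var[X] = {var:.3f}")  # ~0.279, ~1.560
```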
Negative Binomial: Counts
[Figure: negative binomial probability mass function over the number of seeds, 0 to 50.]
Negative Binomial: Count data
Prestoea acuminata. (Data from LFDP, Puerto Rico.)
[Figure: histogram of the number of seedlings per quadrat, 0 to 100.]
Normal Distribution
E[x] = μ; Variance = σ².
[Figure: normal pdf with mean 0, for variances 0.25, 0.5, 1, 2, 5, and 10.]
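A small sketch of the curves in that figure (mean 0 and the six variances listed above):

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-5, 5, 11)
for var in (0.25, 0.5, 1, 2, 5, 10):
    # scipy parameterizes the normal by its standard deviation (scale)
    density = norm.pdf(x, loc=0, scale=np.sqrt(var))
    print(f"var = {var:5.2f}: peak density = {density.max():.3f}")
# Smaller variance -> taller, narrower curve around the mean
```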
Lognormal: One tail and no negative values
x is always positive.
[Figure: lognormal pdf, f(x) against x from 0 to 70.]
Lognormal: Radial growth data
Hemlock and red cedar. (Data from Date Creek, British Columbia.)
[Figure: histograms of radial growth (cm/yr) for hemlock and red cedar.]
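A sketch of fitting a lognormal to positive growth measurements (the data here are simulated stand-ins, not the Date Creek values):

```python
import numpy as np
from scipy.stats import lognorm

rng = np.random.default_rng(1)

# Simulated stand-in for radial growth (cm/yr): strictly positive,
# with a single right-hand tail
growth = rng.lognormal(mean=-0.5, sigma=0.6, size=200)

# Fit a lognormal with the location pinned at zero (no negative growth)
shape, loc, scale = lognorm.fit(growth, floc=0)
print(f"sigma = {shape:.3f}, median = {scale:.3f}")
```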
Exponential
[Figure: exponential distribution of a variable, 0 to 6.]
Exponential: Growth data (negative values assumed to be 0)
Beilschmiedia pendula. (Data from BCI, Panama.)
[Figure: histogram of growth (mm/yr), strongly concentrated near zero.]
Gamma: “raw” growth data
Alseis blackiana and Cordia bicolor. (Data from BCI, Panama.)
[Figure: histograms of growth (mm/yr) for the two species.]
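A sketch of fitting a gamma to such growth data (again with simulated stand-in values rather than the BCI measurements):

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(2)

# Simulated stand-in for "raw" growth data (mm/yr)
growth = rng.gamma(shape=1.5, scale=1.2, size=300)

# Fit shape and scale with the location pinned at zero
shape, loc, scale = gamma.fit(growth, floc=0)
print(f"shape = {shape:.3f}, scale = {scale:.3f}")
# Shape < 1 gives a mode at zero; shape > 1 gives an interior mode
```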
Beta: Light interception by crown trees (Data from Luquillo, PR)
Mixture models
• What do you do when your data don’t fit any known distribution?
• Add covariates.
• Use mixture models (see the sketch below):
• Discrete.
• Continuous.
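As one concrete discrete example, a zero-inflated Poisson, a common mixture for count data with excess zeros (the parameter values below are invented for illustration):

```python
import numpy as np
from scipy.stats import poisson

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi the count is a structural
    zero; otherwise it is drawn from a Poisson with rate lam."""
    pmf = (1 - pi) * poisson.pmf(k, lam)
    return np.where(k == 0, pi + pmf, pmf)

k = np.arange(8)
print(zip_pmf(k, pi=0.4, lam=2.0))
# The extra mass at zero lets the mixture fit data that no single
# standard distribution can match
```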