Selecting Input Probability Distributions


Introduction
• Part of modeling: what input probability distributions to use as input to simulation for:
  • Interarrival times
  • Service/machining times
  • Demand/batch sizes
  • Machine up/down times
• Inappropriate input distribution(s) can lead to incorrect output, bad decisions
• Given observed data on input quantities, we can use them in different ways

Data Usage

Parameterization of Distributions - 1
• There are alternative ways to parameterize most distributions
• Typically, parameters can be classified as one of:
  • Location parameter γ (also called shift parameter): specifies an abscissa (x axis) location point of a distribution’s range of values, often some kind of midpoint of the distribution
    • Example: μ for normal distribution
    • As γ changes, the distribution just shifts left or right without changing its spread or shape
    • If X has location parameter 0, then X + γ has location parameter γ

Parameterization of Distributions - 2
• Scale parameter β: determines scale, or units of measurement, or spread, of a distribution
  • Example: σ for normal distribution, β for exponential distribution
  • As β changes, the distribution is compressed or expanded without changing its shape
  • If X has scale parameter 1, then βX has scale parameter β
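
The location and scale rules above (X + γ and βX) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `shift_and_scale`, the seed, and the values γ = 5 and β = 2 are illustrative assumptions, not part of the original slides.

```python
import random
import statistics

def shift_and_scale(samples, gamma=0.0, beta=1.0):
    """Return gamma + beta * x for each x: beta sets the scale, gamma the location."""
    return [gamma + beta * x for x in samples]

rng = random.Random(42)                                  # fixed seed for repeatability
base = [rng.gauss(0.0, 1.0) for _ in range(10_000)]      # location 0, scale 1
moved = shift_and_scale(base, gamma=5.0, beta=2.0)       # location 5, scale 2

# The mean is mapped to gamma + beta * mean; the standard deviation is multiplied by beta.
mean_gap = statistics.mean(moved) - (5.0 + 2.0 * statistics.mean(base))
sd_ratio = statistics.stdev(moved) / statistics.stdev(base)
print(f"mean gap = {mean_gap:.2e}, sd ratio = {sd_ratio:.6f}")
```

The shape of the histogram is unchanged by this transformation; only its position and width move.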

Parameterization of Distributions - 3
• Shape parameter α: determines, separately from location and scale, the basic form or shape of a distribution
  • Examples: normal and exponential distributions do not have a shape parameter; α for Gamma and Weibull distributions
  • May have more than one shape parameter (Beta distribution has two shape parameters)
  • Change in shape parameter(s) alters a distribution’s shape more fundamentally than changes in scale or location parameters

Continuous and Discrete Distributions
• Compendium of 13 continuous and 6 discrete distributions given in the textbook with details on
  • Possible applications
  • Density and distribution functions (where applicable)
  • Parameter definitions and ranges
  • Range of possible values
  • Mean, variance, mode
  • Maximum-likelihood estimator formula or method
  • General comments, including relationships to other distributions
  • Plots of densities

Summary Measures from Moments
• Mean and variance
• Coefficient of variation is a measure of variability relative to the mean: CV(X) = σ_X / μ_X.
• Higher moments also give useful information
  • Skewness coefficient gives information about the shape (asymmetry).
  • Kurtosis coefficient gives information about the tail weight (likelihood of extreme values).
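
As an illustration of these measures, here is a minimal sketch that computes them from a sample. It assumes population-form central moments (dividing by n) and the non-excess kurtosis convention in which the normal distribution has kurtosis 3; the function name `summary_measures` is hypothetical.

```python
import math

def summary_measures(xs):
    """Mean, variance, CV, skewness and kurtosis from central moments (dividing by n)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in xs) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n   # fourth central moment
    sd = math.sqrt(m2)
    return {
        "mean": mean,
        "variance": m2,
        "cv": sd / mean,              # variability relative to the mean
        "skewness": m3 / sd ** 3,     # > 0 means a longer right tail
        "kurtosis": m4 / m2 ** 2,     # tail weight; 3 for a normal distribution
    }

stats = summary_measures([3, 5, 6, 7, 9, 12])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Sample-based estimators that divide by n - 1 give slightly different values for small samples.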

Example
• Find:
  • Mean
  • Variance
  • Coefficient of variation
  • Median
  • Skewness coefficient

Exponential Expo(β)

Exponential Expo(β)
[Figure: Expo(1) density function]

Exponential: Properties
• Coefficient of variation is a measure of variability relative to the mean: CV(X) = σ_X / μ_X.
• Its coefficient of variation is 1 (unless it is shifted).
• The density function is monotonically decreasing (at an exponential rate).
• Times between events: most likely to be small, but can be large with small probability.
• Skewness γ = 2, kurtosis (tail weight) κ = 9.
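
These properties can be checked approximately by simulation. The sketch below assumes a mean of β = 2 and a fixed seed, both chosen only for illustration. Note that Python's `random.expovariate` takes the rate 1/β rather than the mean.

```python
import math
import random

rng = random.Random(1)      # fixed seed so the run is repeatable
beta = 2.0                  # assumed mean (scale parameter), for illustration only
n = 200_000

# random.expovariate expects the rate, i.e. 1 / mean
xs = [rng.expovariate(1.0 / beta) for _ in range(n)]

mean = sum(xs) / n
m2 = sum((x - mean) ** 2 for x in xs) / n
m3 = sum((x - mean) ** 3 for x in xs) / n
m4 = sum((x - mean) ** 4 for x in xs) / n

cv = math.sqrt(m2) / mean   # theory: 1
skewness = m3 / m2 ** 1.5   # theory: 2
kurtosis = m4 / m2 ** 2     # theory: 9

print(f"mean={mean:.3f} cv={cv:.3f} skewness={skewness:.2f} kurtosis={kurtosis:.2f}")
```

The estimates of the higher moments converge slowly because they are dominated by rare large values, so the kurtosis estimate in particular can wander noticeably around 9.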

Poisson(λ)
• Bimodal: two modes (for integer λ, the modes are λ - 1 and λ)

Poisson(λ)

Poisson: Properties
• Counts the number of events over time.
• If arrivals occur according to a Poisson process with rate λ, times between arrivals are exponential with mean 1/λ.
• Its coefficient of variation is 1/√λ.
• Events (e.g., customer arrivals) are generated by a large potential population where each customer chooses to arrive in a given small interval with a very small probability.
• Examples: number of outbreaks of war over time, number of goals scored in World Cup games.
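
The link between exponential interarrival times and Poisson counts can be sketched by simulation. In the example below, the rate λ = 4, the number of unit-length intervals, the seed, and the helper name `poisson_counts` are all illustrative assumptions.

```python
import math
import random

def poisson_counts(rate, n_intervals, rng):
    """Count arrivals per unit-length interval when interarrival times are exponential with mean 1/rate."""
    counts = [0] * n_intervals
    t = rng.expovariate(rate)          # time of the first arrival
    while t < n_intervals:
        counts[int(t)] += 1            # interval [k, k+1) that contains this arrival
        t += rng.expovariate(rate)     # add the next exponential interarrival time
    return counts

rng = random.Random(7)
lam = 4.0
counts = poisson_counts(lam, 50_000, rng)

mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / len(counts)
cv = math.sqrt(var_count) / mean_count

print(f"mean count = {mean_count:.3f} (theory {lam})")
print(f"CV         = {cv:.3f} (theory {1 / math.sqrt(lam):.3f})")
```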

Normal Distribution: Properties
• Supported by the Central Limit Theorem: the random variable is a sum of several small random variables (e.g., total consumer demand).
• It is symmetrical (skewness = 0, mean = median).
• Kurtosis = 3.
• It’s usually not appropriate for modeling times between events (it can take negative values).
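
To gauge how serious the negative-value problem is for a given fit, one can compute P(X < 0) directly. This is a small sketch using `statistics.NormalDist` from the Python standard library; the means and standard deviations are hypothetical fitted values chosen for illustration.

```python
from statistics import NormalDist

def prob_negative(mu, sigma):
    """Probability that a Normal(mu, sigma) variate falls below zero."""
    return NormalDist(mu, sigma).cdf(0.0)

# Hypothetical service-time fits: same mean, different spread
p_wide = prob_negative(10.0, 5.0)     # mean is only 2 standard deviations above zero
p_narrow = prob_negative(10.0, 2.0)   # mean is 5 standard deviations above zero

print(f"P(X < 0), sigma = 5: {p_wide:.5f}")
print(f"P(X < 0), sigma = 2: {p_narrow:.2e}")
```

When the mean is only a few standard deviations above zero, a noticeable fraction of generated times would be negative, which is why the normal distribution is usually avoided for times between events.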

Gamma Distribution: Properties
• Shape parameter α > 0, scale parameter β > 0
• A special case: for integer α, the sum of α independent exponential(β) random variables; α = 1 corresponds to exponential(β).
• In general, skewness is positive.
• The CV is less than one if the shape parameter α > 1.
• Plots: scale = 1, shape = 2; scale = 1, shape = 20
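
The sum-of-exponentials special case and the CV claim can be sketched as follows. The code relies on the standard gamma results mean = αβ and variance = αβ², which give CV = 1/√α; the helper names, seed, and parameter values are illustrative assumptions.

```python
import math
import random

def gamma_cv(alpha):
    """Theoretical CV of a gamma distribution: sqrt(alpha) * beta / (alpha * beta)."""
    return 1.0 / math.sqrt(alpha)

def sum_of_exponentials(k, beta, rng):
    """Sum of k independent exponential variates, each with mean beta."""
    return sum(rng.expovariate(1.0 / beta) for _ in range(k))

rng = random.Random(11)
k, beta, n = 2, 1.0, 50_000           # shape 2, scale 1, as in the first plot
xs = [sum_of_exponentials(k, beta, rng) for _ in range(n)]

mean = sum(xs) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

print(f"simulated mean = {mean:.3f} (theory {k * beta})")
print(f"simulated CV   = {sd / mean:.3f} (theory {gamma_cv(k):.3f})")
print(f"CV for shape 20: {gamma_cv(20):.3f}")
```

With shape 1 the CV equals 1 (the exponential case), and it shrinks as the shape parameter grows.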

Weibull Distribution: Properties
• Shape parameter α > 0, scale parameter β > 0
• Very versatile
• Plots: scale = 1, shape = 1.5; scale = 1, shape = 10
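
One way to see this versatility is to look at how the CV changes with the shape parameter. The sketch below uses the standard Weibull moment formulas, mean = βΓ(1 + 1/α) and variance = β²[Γ(1 + 2/α) - Γ(1 + 1/α)²]; these formulas and the function name `weibull_mean_cv` are not taken from the slides.

```python
import math

def weibull_mean_cv(alpha, beta):
    """Mean and CV of a Weibull distribution with shape alpha and scale beta."""
    g1 = math.gamma(1.0 + 1.0 / alpha)
    g2 = math.gamma(1.0 + 2.0 / alpha)
    mean = beta * g1
    sd = beta * math.sqrt(g2 - g1 ** 2)
    return mean, sd / mean

# Shapes 1.5 and 10 match the two plots; shape 1 reduces to the exponential case
for shape in (1.0, 1.5, 10.0):
    mean, cv = weibull_mean_cv(shape, 1.0)
    print(f"shape={shape:>4}: mean={mean:.3f}, CV={cv:.3f}")
```

A shape of 1 reproduces the exponential distribution (CV = 1), while large shape values concentrate the mass tightly around the scale parameter.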

Lognormal Distribution: Properties
• Y = ln(X) is Normal(μ, σ).
• Models the product of several independent random factors (X = X1 X2 … Xn).
• Very versatile: like gamma and Weibull, but can have a spike near zero.
• Plots: scale = 1, shape = 0.5; scale = 2, shape = 0.1
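
The product-of-factors interpretation can be illustrated by simulation: the log of a product is a sum of logs, so the Central Limit Theorem pushes ln(X) toward a normal distribution. In this sketch the choice of Uniform(0.9, 1.1) factors, the number of factors, and the seed are arbitrary assumptions made only for illustration.

```python
import math
import random

def product_of_factors(n_factors, rng):
    """Product of independent positive random factors (here Uniform(0.9, 1.1))."""
    p = 1.0
    for _ in range(n_factors):
        p *= rng.uniform(0.9, 1.1)
    return p

def skewness(xs):
    """Skewness coefficient from central moments (dividing by n)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(3)
xs = [product_of_factors(30, rng) for _ in range(10_000)]
logs = [math.log(x) for x in xs]

skew_x = skewness(xs)       # clearly positive: right-skewed like a lognormal
skew_log = skewness(logs)   # close to zero: ln(X) is roughly normal

print(f"skewness of X     = {skew_x:.3f}")
print(f"skewness of ln(X) = {skew_log:.3f}")
```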

Empirical Distributions
• There may be no standard distribution that fits the data adequately: use the observed data themselves to specify an empirical distribution directly
• There are many different ways to specify empirical distributions, resulting in different distributions with different properties.

Continuous Empirical Distributions
• If original individual data points are available (i.e., data are not grouped)
  • Sort data X1, X2, ..., Xn into increasing order: X(i) is the ith smallest
  • Define F(X(i)) = (i - 1)/(n - 1), approximately (for large n) the proportion of the data less than X(i), and interpolate linearly between observed data points
  • That is, F(x) = (i - 1)/(n - 1) + (x - X(i)) / [(n - 1)(X(i+1) - X(i))] for X(i) ≤ x < X(i+1), with F(x) = 0 for x < X(1) and F(x) = 1 for x ≥ X(n)

Continuous Empirical Distributions
• Rises most steeply over regions where observations are dense, as desired.
• Sample: 3, 5, 6, 7, 9, 12
• F(3) = 0, F(5) = 1/5, F(6) = 2/5, F(7) = 3/5, F(9) = 4/5, F(12) = 1
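
The piecewise-linear definition and the sample values above can be sketched in code as follows. The function names `empirical_cdf` and `sample_empirical` are hypothetical, and the sampler uses inverse-transform sampling, which is one reasonable way (not necessarily the textbook's) to generate variates from this distribution.

```python
import bisect
import random

def empirical_cdf(data):
    """Build F with F(X(i)) = (i - 1)/(n - 1) and linear interpolation in between."""
    xs = sorted(data)
    n = len(xs)

    def F(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        i = bisect.bisect_right(xs, x)        # xs[i-1] <= x < xs[i], with i counted from 1
        lo, hi = xs[i - 1], xs[i]
        return ((i - 1) + (x - lo) / (hi - lo)) / (n - 1)

    return F

def sample_empirical(sorted_xs, rng):
    """Inverse-transform draw: pick a segment uniformly, then a uniform point inside it."""
    p = rng.random() * (len(sorted_xs) - 1)   # position in [0, n - 1)
    i = int(p)
    return sorted_xs[i] + (p - i) * (sorted_xs[i + 1] - sorted_xs[i])

sample = [3, 5, 6, 7, 9, 12]
F = empirical_cdf(sample)
print([round(F(x), 3) for x in sample])       # values at the data points
print(F(4))                                   # halfway between 3 and 5

rng = random.Random(0)
draws = [sample_empirical(sorted(sample), rng) for _ in range(1000)]
print(min(draws), max(draws))                 # always inside the observed range
```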

Continuous Empirical Distributions
• Potential disadvantages:
  • Generated data will be within the range of the observed data
  • Expected value of this distribution is not the sample mean
• There are other ways to define continuous empirical distributions, including putting an exponential tail on the right to make the range infinite on the right
• If only grouped data are available
  • Don’t know individual data values, but counts of observations in adjacent intervals
  • Define empirical distribution function G(x) with properties similar to F(x) above for individual data points
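
The second disadvantage can be made concrete with the sample 3, 5, 6, 7, 9, 12 used earlier. Under the piecewise-linear F, each of the n - 1 gaps between consecutive sorted points carries probability 1/(n - 1), spread uniformly, so the expected value is the average of the gap midpoints. A minimal sketch, with the hypothetical helper name `interpolated_mean`:

```python
def interpolated_mean(data):
    """Expected value of the piecewise-linear empirical distribution."""
    xs = sorted(data)
    n = len(xs)
    # Each segment [X(i), X(i+1)] has probability 1/(n - 1) and mean equal to its midpoint
    return sum((a + b) / 2 for a, b in zip(xs, xs[1:])) / (n - 1)

sample = [3, 5, 6, 7, 9, 12]
sample_mean = sum(sample) / len(sample)
emp_mean = interpolated_mean(sample)

print(f"sample mean           = {sample_mean}")     # 7.0
print(f"empirical distr. mean = {emp_mean:.2f}")    # 6.90
```

The two values differ because the smallest and largest observations each contribute to only one segment, while interior points contribute to two.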

Discrete Empirical Distributions
• If original individual data points are available (i.e., data are not grouped)
  • For each possible value x, define p(x) = proportion of the data values that are equal to x
• If only grouped data are available
  • Define a probability mass function such that the sum of the p(x)’s for the x’s in an interval is equal to the proportion of the data in that interval
  • Allocation of p(x)’s for x’s in an interval is arbitrary
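
For the ungrouped case, the definition of p(x) translates directly into code. The sketch below uses a made-up data set (hypothetical batch sizes) and the hypothetical helper name `empirical_pmf`; sampling is done with `random.choices`, which draws values in proportion to the given weights.

```python
import random
from collections import Counter

def empirical_pmf(data):
    """p(x) = proportion of the data values that are equal to x."""
    n = len(data)
    return {x: count / n for x, count in sorted(Counter(data).items())}

# Hypothetical observed batch sizes, for illustration only
observed = [2, 3, 3, 5, 2, 3, 4, 3]
pmf = empirical_pmf(observed)
print(pmf)

# Generate new values from the empirical distribution
rng = random.Random(0)
draws = rng.choices(list(pmf), weights=list(pmf.values()), k=1000)
print(Counter(draws))
```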
