Selecting Input Probability Distributions
Introduction
• Part of modeling: deciding what probability distributions to use as input to the simulation for:
  • Interarrival times
  • Service/machining times
  • Demand/batch sizes
  • Machine up/down times
• Inappropriate input distribution(s) can lead to incorrect output and bad decisions
• Given observed data on input quantities, we can use them in different ways
Parameterization of Distributions - 1
• There are alternative ways to parameterize most distributions
• Typically, parameters can be classified as one of: location, scale, or shape
• Location parameter γ (also called shift parameter): specifies an abscissa (x-axis) location point of a distribution’s range of values, often some kind of midpoint of the distribution
  • Example: μ for the normal distribution
  • As γ changes, the distribution just shifts left or right without changing its spread or shape
  • If X has location parameter 0, then X + γ has location parameter γ
Parameterization of Distributions - 2
• Scale parameter β: determines the scale, or units of measurement, or spread, of a distribution
  • Examples: σ for the normal distribution, β for the exponential distribution
  • As β changes, the distribution is compressed or expanded without changing its shape
  • If X has scale parameter 1, then βX has scale parameter β
Parameterization of Distributions - 3
• Shape parameter α: determines, separately from location and scale, the basic form or shape of a distribution
  • Examples: the normal and exponential distributions do not have a shape parameter; α for the gamma and Weibull distributions
  • A distribution may have more than one shape parameter (the beta distribution has two)
  • A change in the shape parameter(s) alters the distribution’s shape more fundamentally than changes in the scale or location parameters
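The effect of location and scale parameters can be checked numerically. The sketch below is only an illustration: the gamma base sample, the shift of 5, and the scale factor of 3 are assumed values, not taken from the slides. It shows that a shift changes the mean but not the spread, while a rescaling changes both but leaves the coefficient of variation untouched.

```python
import random
import statistics

def summarize(xs):
    """Return (mean, standard deviation, coefficient of variation)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return m, s, s / m

random.seed(1)
# Base sample: gamma with shape alpha = 2 and scale beta = 1 (illustrative values).
base = [random.gammavariate(2.0, 1.0) for _ in range(10_000)]

shifted = [x + 5.0 for x in base]  # location shift: X + gamma with gamma = 5
scaled = [3.0 * x for x in base]   # scale change: beta * X with beta = 3

m0, s0, cv0 = summarize(base)
m1, s1, cv1 = summarize(shifted)
m2, s2, cv2 = summarize(scaled)

print(f"base:    mean={m0:.3f} sd={s0:.3f} cv={cv0:.3f}")
print(f"shifted: mean={m1:.3f} sd={s1:.3f} cv={cv1:.3f}")
print(f"scaled:  mean={m2:.3f} sd={s2:.3f} cv={cv2:.3f}")
```

The shifted sample keeps the same standard deviation, and the scaled sample keeps the same coefficient of variation; only a change in the shape parameter would alter the form of the histogram itself.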
Continuous and Discrete Distributions
• Compendium of 13 continuous and 6 discrete distributions given in the textbook, with details on:
  • Possible applications
  • Density and distribution functions (where applicable)
  • Parameter definitions and ranges
  • Range of possible values
  • Mean, variance, mode
  • Maximum-likelihood estimator formula or method
  • General comments, including relationships to other distributions
  • Plots of densities
Summary Measures from Moments
• Mean and variance
• The coefficient of variation measures variability relative to the mean: CV(X) = σ_X/μ_X
• Higher moments also give useful information
  • The skewness coefficient gives information about the shape (asymmetry)
  • The kurtosis coefficient gives information about the tail weight (likelihood of extreme values)
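These summary measures can be estimated from a sample using central moments. The function below is a minimal sketch using population-style estimators (dividing by n); the function name and the sample values are illustrative assumptions, and other texts use slightly different bias corrections.

```python
import math

def moment_summary(xs):
    """Sample mean, variance, CV, skewness and kurtosis (moments divided by n)."""
    n = len(xs)
    mean = sum(xs) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sd = math.sqrt(m2)
    return {
        "mean": mean,
        "variance": m2,
        "cv": sd / mean,           # variability relative to the mean
        "skewness": m3 / sd ** 3,  # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2,  # 3 for the normal distribution
    }

stats = moment_summary([3, 5, 6, 7, 9, 12])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

A positive skewness, as in this sample, indicates a longer right tail.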
Example
• Find:
  • Mean
  • Variance
  • Coefficient of variation
  • Median
  • Skewness coefficient
Exponential: Expo(β)
[Figure: Expo(1) density function]
Exponential: Properties
• The coefficient of variation measures variability relative to the mean: CV(X) = σ_X/μ_X
• Its coefficient of variation is 1 (unless the distribution is shifted)
• The density function is monotonically decreasing (at an exponential rate)
• Times of events: most likely to be small, but can be large with small probability
• Skewness γ = 2, kurtosis (tail weight) κ = 9
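The exponential properties above can be checked by simulation. This is a rough sketch with an assumed mean of β = 2 and an arbitrary seed; the estimates only approximate the theoretical values (CV = 1, skewness = 2).

```python
import random
import statistics

random.seed(42)
beta = 2.0  # mean of the exponential distribution (illustrative value)
# random.expovariate takes the rate, i.e. 1 / mean.
sample = [random.expovariate(1.0 / beta) for _ in range(200_000)]

mean = statistics.fmean(sample)
sd = statistics.pstdev(sample)
cv = sd / mean
skew = sum((x - mean) ** 3 for x in sample) / len(sample) / sd ** 3
# Most observations are small: the share below the mean is about 1 - exp(-1).
frac_below_mean = sum(x < mean for x in sample) / len(sample)

print(f"mean={mean:.3f} cv={cv:.3f} skewness={skew:.3f}")
print(f"fraction of values below the mean: {frac_below_mean:.3f}")
```

Roughly 63% of the values fall below the mean, which matches the slide's remark that event times are most likely small but occasionally large.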
Poisson(λ)
[Figure: Poisson(λ) mass function; annotation: bimodal, two modes]
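The "two modes" annotation presumably refers to the case where λ is an integer, in which case the Poisson mass function takes its maximum at both λ − 1 and λ. The sketch below checks this with an assumed value λ = 4; the helper names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability mass p(k) = exp(-lam) * lam**k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_modes(lam, kmax=50):
    """Return every k in 0..kmax where the mass function attains its maximum."""
    pmf = [poisson_pmf(k, lam) for k in range(kmax + 1)]
    peak = max(pmf)
    return [k for k, p in enumerate(pmf) if math.isclose(p, peak)]

modes_integer = poisson_modes(4)    # integer rate: two modes expected
modes_other = poisson_modes(4.5)    # non-integer rate: a single mode expected
print(modes_integer, modes_other)
```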
Poisson: Properties
• Counts the number of events over time
• If arrivals occur according to a Poisson process with rate λ, times between arrivals are exponential with mean 1/λ
• Its coefficient of variation is 1/√λ
• Events (e.g., customer arrivals) are generated by a large potential population, where each customer chooses to arrive in a given small interval with a very small probability
• Examples: number of outbreaks of war over time, number of goals scored in World Cup games
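The link between exponential interarrival times and Poisson counts can be illustrated by simulation. In this sketch the rate λ = 9, the unit-length counting interval, and the seed are assumptions chosen for illustration; the counts should have mean close to λ and coefficient of variation close to 1/√λ.

```python
import random
import statistics

def count_arrivals(rate, horizon, rng):
    """Count arrivals in [0, horizon) when interarrival times are exponential."""
    t = rng.expovariate(rate)  # time of the first arrival
    count = 0
    while t < horizon:
        count += 1
        t += rng.expovariate(rate)  # add the next interarrival time
    return count

rng = random.Random(7)
lam = 9.0  # arrival rate per unit time (illustrative value)
counts = [count_arrivals(lam, 1.0, rng) for _ in range(20_000)]

mean_count = statistics.fmean(counts)
cv_count = statistics.pstdev(counts) / mean_count
print(f"mean count={mean_count:.3f} (rate {lam})")
print(f"cv={cv_count:.3f} (1/sqrt(rate)={lam ** -0.5:.3f})")
```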
Normal Distribution: Properties
• Supported by the Central Limit Theorem: appropriate when the random variable is a sum of several small random variables (e.g., total consumer demand)
• It is symmetric (skewness = 0, mean = median)
• Kurtosis = 3
• It is usually not appropriate for modeling times between events, because it can take negative values
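How serious the negative-value problem is depends on how many standard deviations the mean lies above zero. The sketch below uses Python's `statistics.NormalDist` with assumed parameter values to compute the probability that a normal "time" would come out negative.

```python
from statistics import NormalDist

def prob_negative(mu, sigma):
    """Probability that a Normal(mu, sigma) variate is below zero."""
    return NormalDist(mu, sigma).cdf(0.0)

# Mean only two standard deviations above zero: negatives are not negligible.
p_close = prob_negative(10.0, 5.0)
# Mean ten standard deviations above zero: negatives are practically impossible.
p_far = prob_negative(10.0, 1.0)

print(f"P(X < 0) with mu=10, sigma=5: {p_close:.5f}")
print(f"P(X < 0) with mu=10, sigma=1: {p_far:.3e}")
```

When the first probability is not negligible, a distribution on the positive half-line (gamma, Weibull, lognormal) is usually the safer modeling choice.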
Gamma Distribution: Properties
• Shape parameter α > 0, scale parameter β > 0
• Special cases: α = 1 gives the exponential(β) distribution; for integer α, the gamma is the distribution of a sum of α independent exponential(β) random variables
• Skewness is positive
• The CV is less than one if the shape parameter α > 1
[Figures: gamma densities for scale = 1, shape = 2 and for scale = 1, shape = 20]
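The gamma properties follow from its standard closed-form moments (mean αβ, variance αβ², CV 1/√α, skewness 2/√α). The sketch below tabulates them and also checks the sum-of-exponentials case by simulation; α = 4, β = 1.5, and the seed are assumed values for illustration.

```python
import math
import random
import statistics

def gamma_summary(alpha, beta):
    """Closed-form moments of a gamma with shape alpha and scale beta."""
    return {
        "mean": alpha * beta,
        "variance": alpha * beta ** 2,
        "cv": 1.0 / math.sqrt(alpha),
        "skewness": 2.0 / math.sqrt(alpha),
    }

rng = random.Random(11)
alpha, beta = 4, 1.5  # integer shape so the sum-of-exponentials view applies
# Each draw is a sum of alpha independent exponentials with mean beta.
sums = [sum(rng.expovariate(1.0 / beta) for _ in range(alpha))
        for _ in range(50_000)]

sim_mean = statistics.fmean(sums)
sim_cv = statistics.pstdev(sums) / sim_mean
theory = gamma_summary(alpha, beta)
print(f"simulated mean={sim_mean:.3f}, theory={theory['mean']:.3f}")
print(f"simulated cv={sim_cv:.3f}, theory={theory['cv']:.3f}")
```

With α = 1 the summary reduces to the exponential values (CV = 1, skewness = 2), and as α grows the CV and skewness both shrink toward 0.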
Weibull Distribution: Properties
• Shape parameter α > 0, scale parameter β > 0
• Very versatile
[Figures: Weibull densities for scale = 1, shape = 1.5 and for scale = 1, shape = 10]
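One way to see the Weibull's versatility is to look at how its coefficient of variation changes with the shape parameter. The sketch below uses the standard closed-form moments, mean βΓ(1 + 1/α) and variance β²[Γ(1 + 2/α) − Γ(1 + 1/α)²]; the function name is a hypothetical helper, and the shape values 1.5 and 10 are those of the plotted densities.

```python
import math

def weibull_summary(alpha, beta):
    """Return (mean, variance, CV) of a Weibull with shape alpha and scale beta."""
    g1 = math.gamma(1.0 + 1.0 / alpha)
    g2 = math.gamma(1.0 + 2.0 / alpha)
    mean = beta * g1
    variance = beta ** 2 * (g2 - g1 ** 2)
    return mean, variance, math.sqrt(variance) / mean

for shape in (1.0, 1.5, 10.0):
    mean, variance, cv = weibull_summary(shape, 1.0)
    print(f"shape={shape}: mean={mean:.3f} variance={variance:.4f} cv={cv:.3f}")
```

With shape 1 the Weibull coincides with the exponential (CV = 1); larger shape values give a much more concentrated distribution.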
Lognormal Distribution: Properties
• Y = ln(X) is Normal(μ, σ)
• Models the product of several independent random factors (X = X1 X2 … Xn)
• Very versatile: like the gamma and Weibull, but can have a spike near zero
[Figures: lognormal densities for scale = 1, shape = 0.5 and for scale = 2, shape = 0.1]
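The product-of-factors interpretation can be illustrated by simulation: the logarithm of a product is a sum, so the Central Limit Theorem makes ln(X) approximately normal. In this sketch the number of factors, their uniform(0.8, 1.25) distribution, and the seed are arbitrary assumptions.

```python
import math
import random
import statistics

def skewness(xs):
    """Sample skewness using moments divided by n."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / len(xs) / sd ** 3

rng = random.Random(3)
n_factors = 50

def product_of_factors():
    # Each factor is an independent positive multiplier (illustrative choice).
    return math.prod(rng.uniform(0.8, 1.25) for _ in range(n_factors))

products = [product_of_factors() for _ in range(20_000)]
logs = [math.log(x) for x in products]

skew_products = skewness(products)
skew_logs = skewness(logs)
print(f"skewness of X:     {skew_products:.3f}")
print(f"skewness of ln(X): {skew_logs:.3f}")
```

The products are strongly right-skewed, while their logarithms are close to symmetric, as expected if X is approximately lognormal.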
Empirical Distributions
• There may be no standard distribution that fits the data adequately: use the observed data themselves to specify an empirical distribution directly
• There are many different ways to specify empirical distributions, resulting in different distributions with different properties
Continuous Empirical Distributions
• If original individual data points are available (i.e., data are not grouped)
  • Sort data X1, X2, ..., Xn into increasing order: X(i) is the ith smallest
  • Define F(X(i)) = (i – 1)/(n – 1), approximately (for large n) the proportion of the data less than X(i), and interpolate linearly between observed data points:
    F(x) = 0 if x < X(1)
    F(x) = (i – 1)/(n – 1) + (x – X(i)) / [(n – 1)(X(i+1) – X(i))] if X(i) ≤ x < X(i+1), for i = 1, 2, ..., n – 1
    F(x) = 1 if x ≥ X(n)
Continuous Empirical Distributions
• F rises most steeply over regions where observations are dense, as desired
• Example sample: 3, 5, 6, 7, 9, 12 (n = 6)
  • F(3) = 0, F(5) = 1/5, F(6) = 2/5, F(7) = 3/5, F(9) = 4/5, F(12) = 1
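The piecewise-linear empirical distribution function can be sketched in a few lines. This is a minimal illustration that assumes the data values are distinct; `make_empirical_cdf` is a hypothetical helper name, and the sample is the one from the slide.

```python
from bisect import bisect_right

def make_empirical_cdf(data):
    """Build F with F(X(i)) = (i - 1)/(n - 1) and linear interpolation between points."""
    xs = sorted(data)
    n = len(xs)

    def F(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        # i is the 1-based rank of the largest observation <= x,
        # so xs[i - 1] <= x < xs[i].
        i = bisect_right(xs, x)
        left, right = xs[i - 1], xs[i]
        return ((i - 1) + (x - left) / (right - left)) / (n - 1)

    return F

F = make_empirical_cdf([3, 5, 6, 7, 9, 12])
for x in (3, 4, 5, 6, 7, 9, 10.5, 12):
    print(f"F({x}) = {F(x):.2f}")
```

At the observed points the function reproduces the values listed on the slide, and between them it interpolates linearly, for example halfway between F(3) and F(5) at x = 4.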
Continuous Empirical Distributions
• Potential disadvantages:
  • Generated data will always lie within the range of the observed data
  • The expected value of this distribution is generally not the sample mean
• There are other ways to define continuous empirical distributions, including putting an exponential tail on the right to make the range infinite on the right
• If only grouped data are available
  • Individual data values are not known, only the counts of observations in adjacent intervals
  • Define an empirical distribution function G(x) with properties similar to F(x) above for individual data points
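Both disadvantages can be seen on a small example. In the sketch below, which assumes the piecewise-linear definition with probability 1/(n − 1) on each interval between consecutive sorted observations, the mean of the empirical distribution is the average of the interval midpoints, and generated values never leave the observed range. The helper names, sample, and seed are illustrative.

```python
import random
import statistics

def empirical_mean(data):
    """Mean of the piecewise-linear empirical distribution."""
    xs = sorted(data)
    n = len(xs)
    # Each of the n - 1 intervals has probability 1/(n - 1), spread uniformly,
    # so it contributes its midpoint.
    return sum((a + b) / 2 for a, b in zip(xs, xs[1:])) / (n - 1)

def draw(data, rng):
    """Generate one variate by inverting the piecewise-linear F."""
    xs = sorted(data)
    p = rng.random() * (len(xs) - 1)  # position in [0, n - 1)
    i = int(p)                        # index of the interval
    return xs[i] + (p - i) * (xs[i + 1] - xs[i])

sample = [3, 5, 6, 7, 9, 12]
rng = random.Random(5)
draws = [draw(sample, rng) for _ in range(10_000)]

print("sample mean:            ", statistics.fmean(sample))
print("empirical-distribution mean:", empirical_mean(sample))
print("range of generated values:  ", min(draws), max(draws))
```

For this sample the empirical-distribution mean differs slightly from the sample mean, and no generated value falls below 3 or above 12.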
Discrete Empirical Distributions
• If original individual data points are available (i.e., data are not grouped)
  • For each possible value x, define p(x) = proportion of the data values that are equal to x
• If only grouped data are available
  • Define a probability mass function such that the sum of the p(x)’s for the x’s in an interval is equal to the proportion of the data in that interval
  • Allocation of the p(x)’s among the x’s in an interval is arbitrary
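Both discrete cases can be sketched briefly. For grouped data the allocation within an interval is arbitrary, so the even split used below is just one assumed choice; the function names, the integer-valued intervals, and the example data are all illustrative.

```python
from collections import Counter
from fractions import Fraction

def empirical_pmf(data):
    """p(x) = proportion of data values equal to x (individual data available)."""
    counts = Counter(data)
    n = len(data)
    return {x: Fraction(c, n) for x, c in sorted(counts.items())}

def grouped_pmf(groups):
    """Mass function from (low, high, count) integer intervals, inclusive.

    Each interval's proportion is split evenly over its integer values;
    this even split is an arbitrary choice.
    """
    total = sum(count for _, _, count in groups)
    pmf = {}
    for low, high, count in groups:
        values = range(low, high + 1)
        for x in values:
            pmf[x] = Fraction(count, total) / len(values)
    return pmf

pmf = empirical_pmf([1, 2, 2, 3, 3, 3])
gpmf = grouped_pmf([(0, 1, 4), (2, 4, 6)])
print(pmf)
print(gpmf)
```

Exact fractions are used so that the probabilities in each interval sum exactly to that interval's share of the data.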