Terminating Statistical Analysis By Dr. Jason Merrick
Statistical Analysis of Output Data: Terminating Simulations
• Random input leads to random output (RIRO)
• Run a simulation (once): what does it mean?
  • Was this run “typical” or not?
  • Variability from run to run (of the same model)?
• Need statistical analysis of output data
• Time frame of simulations
  • Terminating: specific starting and stopping conditions
  • Steady-state: long-run (technically forever)
  • Here: terminating
Simulation with Arena — Intermediate Modeling and Terminating Statistical Analysis
Point and Interval Estimation
• Suppose we are trying to estimate an output measure $E[Y] = \theta$ based upon a simulated sample $Y_1, \dots, Y_n$
• We come up with an estimate $\hat{\theta}$
  • For instance, the sample mean $\hat{\theta} = \bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$
• How good is this estimate?
  • Unbiased
  • Low variance (possibly minimum variance)
  • Consistent
  • Confidence interval
T-distribution
• The t-statistic is given by $t = \dfrac{\hat{\theta} - \theta}{\hat{\sigma}(\hat{\theta})}$
• If the $Y_1, \dots, Y_n$ are normally distributed and $\hat{\theta}$ is the sample mean $\bar{Y}$, then the t-statistic is t-distributed
• If the $Y_1, \dots, Y_n$ are not normally distributed, but $\hat{\theta}$ is the sample mean $\bar{Y}$, then the t-statistic is approximately t-distributed thanks to the Central Limit Theorem
  • This requires a reasonably large sample size n
• We require an estimate of the variance of $\hat{\theta}$, denoted $\hat{\sigma}^2(\hat{\theta})$
T-distribution Confidence Interval
• An approximate $100(1-\alpha)\%$ confidence interval for $\theta$ is then $\hat{\theta} \pm t_{\alpha/2,f}\,\hat{\sigma}(\hat{\theta})$
• The center of the confidence interval is $\hat{\theta}$
• The half-width of the confidence interval is $t_{\alpha/2,f}\,\hat{\sigma}(\hat{\theta})$
• $t_{\alpha/2,f}$ is the $100(1-\alpha/2)\%$ percentile of a t-distribution with f degrees of freedom
T-distribution Confidence Interval
• Case 1: $Y_1, \dots, Y_n$ are independent
  • This is the case when you are making n independent replications of the simulation
    • Terminating simulations
    • Try to force this with steady-state simulations
  • Compute your estimate $\hat{\theta} = \bar{Y}$ and then compute the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y})^2$
  • $s^2$ is an unbiased estimator of the population variance, so $s^2/n$ is an unbiased estimator of $\sigma^2(\hat{\theta})$, with f = n-1 degrees of freedom
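The Case 1 interval can be sketched in a few lines of Python. This is only an illustration, not Arena output: the function name, the sample values, and the choice to pass the t critical value in from a table (2.262 for 9 degrees of freedom at 95% confidence) are all assumptions made for the example.

```python
import math
import statistics


def t_confidence_interval(replication_outputs, t_crit):
    """Return (center, half_width) for independent replication outputs.

    t_crit is the t critical value for f = n - 1 degrees of freedom,
    supplied by the caller (for example, read from a t table).
    """
    n = len(replication_outputs)
    if n < 2:
        raise ValueError("need at least two replications")
    theta_hat = statistics.fmean(replication_outputs)   # point estimate
    s2 = statistics.variance(replication_outputs)       # sample variance, n - 1 divisor
    half_width = t_crit * math.sqrt(s2 / n)             # t * sqrt(s^2 / n)
    return theta_hat, half_width


# Hypothetical outputs (one number per replication), made up for illustration.
outputs = [12.1, 9.8, 11.4, 10.6, 13.0, 9.5, 10.9, 11.7, 10.2, 12.3]
T_CRIT_9_DF_95 = 2.262  # table value for 9 degrees of freedom, 95% confidence

center, half_width = t_confidence_interval(outputs, T_CRIT_9_DF_95)
print(f"{center:.3f} +/- {half_width:.3f}")
```

The interval is only trustworthy when each value comes from a separate, independent replication; feeding it observations from within a single run is the Case 2 situation.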
T-distribution Confidence Interval
• Case 2: $Y_1, \dots, Y_n$ are not independent
  • This is the case when you are using data generated within a single simulation run
    • Sequences of observations in long-run steady-state simulations
  • $s^2/n$ is a biased estimator of $\sigma^2(\hat{\theta})$
  • $Y_1, \dots, Y_n$ is an autocorrelated sequence, or time series
  • Suppose that our point estimator for $\theta$ is the sample mean $\hat{\theta} = \bar{Y}$; a general result from mathematical statistics is $\sigma^2(\hat{\theta}) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{cov}(Y_i, Y_j)$
T-distribution Confidence Interval
• Case 2: $Y_1, \dots, Y_n$ are not independent
  • For n observations there are $n^2$ covariances to estimate
  • However, most simulations are covariance stationary, that is, $\mathrm{cov}(Y_i, Y_{i+k}) = \mathrm{cov}(Y_j, Y_{j+k})$ for all i, j and k
  • Recall that k is the lag, so for a given lag the covariance remains the same throughout the sequence
  • If this is the case then there are n-1 lagged covariances to estimate, denoted $\gamma_k$, and $\sigma^2(\hat{\theta}) = \frac{\gamma_0}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho_k\right]$, where $\rho_k = \gamma_k/\gamma_0$ is the lag-k autocorrelation
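As a quick numerical check, the sketch below compares the general double-sum expression for the variance of the sample mean with the single-sum form that holds under covariance stationarity. The autocovariance function used here (geometric decay with $\gamma_0 = 4$ and ratio 0.6) is a made-up example, and the function names are illustrative.

```python
def var_mean_double_sum(gamma, n):
    """(1/n^2) * sum over i, j of cov(Y_i, Y_j), with cov depending only on |i - j|."""
    return sum(gamma(abs(i - j)) for i in range(n) for j in range(n)) / n**2


def var_mean_lagged(gamma, n):
    """(gamma_0 / n) * [1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho_k]."""
    gamma0 = gamma(0)
    correction = 1 + 2 * sum((1 - k / n) * gamma(k) / gamma0 for k in range(1, n))
    return gamma0 / n * correction


def example_gamma(k):
    # Hypothetical lag-k autocovariance: variance 4, positive correlation decaying with lag.
    return 4.0 * 0.6**k


n = 50
double_sum_value = var_mean_double_sum(example_gamma, n)
lagged_value = var_mean_lagged(example_gamma, n)
naive_value = example_gamma(0) / n  # what gamma_0 / n alone would give

print(double_sum_value, lagged_value, naive_value)
```

The two expressions agree up to floating-point rounding, and with positive correlation both are larger than $\gamma_0/n$, the value that would apply to independent observations.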
Time-Series Examples
[Figure: four example sequences]
• Positively correlated sequence with lag 1
• Positively correlated sequence with lags 1 & 2
• Positively correlated, covariance non-stationary sequence
• Negatively correlated sequence with lag 1
T-distribution Confidence Interval
• Case 2: $Y_1, \dots, Y_n$ are not independent
  • What is the effect of this bias term? Write $E[s^2/n] = B\,\sigma^2(\hat{\theta})$
  • For primarily positively correlated sequences B < 1, so the half-width of the confidence interval will be too small
    • Overstating the precision => you draw conclusions you shouldn't
  • For primarily negatively correlated sequences B > 1, so the half-width of the confidence interval will be too large
    • Understating the precision => you fail to draw conclusions you should
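To see the direction of the bias, the sketch below evaluates B for a covariance-stationary sequence. It assumes the standard closed form $B = (n/c - 1)/(n - 1)$ with $c = 1 + 2\sum_{k=1}^{n-1}(1 - k/n)\rho_k$, which follows from taking the expectation of $s^2$ under covariance stationarity. The geometric autocorrelations $\rho_k = \phi^k$ are a hypothetical example, not something the slides specify.

```python
def bias_factor(rho, n):
    """B such that E[s^2 / n] = B * Var(sample mean), for lag-k autocorrelation rho(k)."""
    c = 1 + 2 * sum((1 - k / n) * rho(k) for k in range(1, n))
    return (n / c - 1) / (n - 1)


def geometric_rho(phi):
    # Hypothetical autocorrelation function rho_k = phi**k, assumed for illustration.
    return lambda k: phi**k


n = 100
b_positive = bias_factor(geometric_rho(0.5), n)      # primarily positive correlation
b_independent = bias_factor(geometric_rho(0.0), n)   # no correlation
b_negative = bias_factor(geometric_rho(-0.5), n)     # alternating-sign correlation

print(b_positive, b_independent, b_negative)
```

With no correlation B equals 1, so $s^2/n$ is unbiased. The positively correlated case gives B below 1 (an interval that is too narrow) and the negatively correlated case gives B above 1 (an interval that is too wide).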
Strategy for Terminating Simulations
• For terminating case, make IID replications
  • Simulate module: Number of Replications field
  • Check both boxes for Initialization Between Reps.
  • Get multiple independent Summary Reports
  • Different random seeds for each replication
• How many replications?
  • Trial and error (now)
  • Approximate no. for acceptable precision
  • Sequential sampling
• Save summary statistics (e.g. average, variance) across replications
  • Statistics Module, Outputs Area, save to files
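Outside Arena, the same strategy can be sketched with a toy model. The single-server queue below, its arrival and service rates, the number of customers, and the seed values are all hypothetical; the point is only that each replication starts from the same empty state, uses its own seed, and contributes one summary number.

```python
import random
import statistics


def run_replication(seed, num_customers=100):
    """One terminating run of a toy single-server queue; returns the average wait in queue."""
    rng = random.Random(seed)        # separate generator (and seed) for each replication
    arrival_time = 0.0
    server_free_at = 0.0             # system starts empty and idle in every replication
    total_wait = 0.0
    for _ in range(num_customers):   # stopping condition: a fixed number of customers
        arrival_time += rng.expovariate(1.0)             # interarrival times, rate 1.0
        service_start = max(arrival_time, server_free_at)
        total_wait += service_start - arrival_time
        server_free_at = service_start + rng.expovariate(1.25)  # service times, rate 1.25
    return total_wait / num_customers


seeds = range(1, 11)                 # ten replications, ten different seeds
averages = [run_replication(seed) for seed in seeds]

# Summary statistics across replications, one observation per replication.
print(statistics.fmean(averages), statistics.variance(averages))
```

Because the replications share nothing but the model logic, the ten averages can be treated as an IID sample for the confidence-interval calculations.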
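Half Width and Number of Replications
• Prefer smaller confidence intervals (more precision)
• Notation: n = number of replications, $\bar{Y}$ = sample mean, s = sample standard deviation, $t_{\alpha/2,n-1}$ = critical value from the t tables
• Confidence interval: $\bar{Y} \pm t_{\alpha/2,n-1}\,\dfrac{s}{\sqrt{n}}$
• Half-width = $t_{\alpha/2,n-1}\,\dfrac{s}{\sqrt{n}}$. Want this to be “small,” say < h, where h is prespecified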
Half Width and Number of Replications
• To improve the half-width, we can
  • Increase the length of each simulation run and so increase the $m_i$
    • What does increasing the run length do?
  • Increase the number of replications
Half Width and Number of Replications (cont’d.)
• Set half-width = h, solve for $n = t_{\alpha/2,n-1}^2\,\dfrac{s^2}{h^2}$
• Not really solved for n (t, s depend on n)
• Approximation:
  • Replace t by z, the corresponding normal critical value
  • Pretend that current s will hold for larger samples
  • Get $n \approx z_{\alpha/2}^2\,\dfrac{s^2}{h^2}$, where s = sample standard deviation from “initial” number $n_0$ of replications
• Easier but different approximation: $n \approx n_0\,\dfrac{h_0^2}{h^2}$, where $h_0$ = half width from “initial” number $n_0$ of replications
• n grows quadratically as h decreases
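Both approximations are easy to evaluate. The sketch below is illustrative only: the function names and the pilot-run numbers are assumptions, and rounding up to a whole number of replications is a choice made here rather than something the slides state.

```python
import math
from statistics import NormalDist


def replications_normal_approx(s, h, confidence=0.95):
    """n ~= z^2 * s^2 / h^2, with s taken from an initial set of replications."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # normal critical value
    return math.ceil((z * s / h) ** 2)


def replications_from_half_width(n0, h0, h):
    """n ~= n0 * h0^2 / h^2, using the half-width h0 observed with n0 replications."""
    return math.ceil(n0 * (h0 / h) ** 2)


# Hypothetical pilot run: 10 replications gave a half-width of 2.0,
# and the target half-width is 1.0.
n_needed = replications_from_half_width(n0=10, h0=2.0, h=1.0)
print(n_needed)  # 40: halving the half-width takes four times the replications

# Hypothetical pilot standard deviation of 3.0 with the same target half-width.
print(replications_normal_approx(s=3.0, h=1.0))
```

Both formulas treat the pilot estimate (s or h0) as if it would stay fixed for the larger sample, so the result is a planning figure rather than a guarantee.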