Lecture 10: Probability and Statistics (part 2)

COMP155 Computer Simulation October 1, 2008 Lecture 10: Probability and Statistics (part 2)

This Week • Review of probability and statistics needed to understand simulation • follow Appendix C in Arena text • Outline • Monday (C.1, C.2): • Probability – basic ideas, terminology • Random variables, joint distributions • Today (C.3-C.5): • Sampling • Statistical inference – point estimation, confidence intervals, hypothesis testing

Sampling • Statistical analysis: purpose is to estimate or infer something about a large population • population is a set of data points • population is too large to look at completely, so we only look at a sample from the population • if the sample is randomly selected from the population, the distribution of the sample should be approximately the same as the distribution of the population • in practice: determine a PMF or PDF for a sample and assume that distribution holds for the entire population
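As a small illustration of "determine a PMF for a sample", the sketch below computes relative frequencies from a sample of discrete observations. The function name `empirical_pmf` and the arrival counts are hypothetical, chosen only for illustration.

```python
from collections import Counter

def empirical_pmf(sample):
    """Return {value: relative frequency} for a sample of discrete observations."""
    n = len(sample)
    counts = Counter(sample)
    return {value: count / n for value, count in sorted(counts.items())}

# Hypothetical sample: customers arriving in each of 10 one-minute intervals.
arrivals = [0, 1, 1, 2, 0, 1, 3, 1, 2, 1]
pmf = empirical_pmf(arrivals)
print(pmf)
```

The resulting frequencies are then assumed to describe the whole population, which is only reasonable if the sample was drawn at random and is large enough.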

Sampling • Random sample is a set of independent and identically distributed (IID) observations X1, X2, …, Xn from the population • Input modeling: • observations come from the real world • Arena’s input analyzer can be used to determine distribution function • Output analysis: • observations are the results of multiple runs/replications of the simulation • Arena’s output analyzer can be used to characterize the output population from the observations.

Sampling: Simulation Output • Random sample is a set of independent and identically distributed (IID) observations X1, X2, …, Xn from the population • Input modeling: • observations come from the real world • Arena’s input analyzer can be used to determine distribution function • Output analysis: • observations are the results of multiple runs/replications of the simulation • Arena’s output analyzer can be used to characterize the output population from the observations.

Estimating Distribution from Samples • Samples: X1, X2, …, Xn • Assuming a normal distribution, compute: • sample mean X̄ = (X1 + X2 + … + Xn)/n • sample variance s² = Σ(Xi – X̄)²/(n – 1) • These statistics have their own sampling distributions, which are generally approximately normal when n is large
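A minimal sketch of the two statistics, assuming the usual definitions (sample mean as the arithmetic average, sample variance with divisor n – 1). The observations are made-up values; the standard-library `statistics` module is used only as a cross-check.

```python
import statistics

# Hypothetical observations X1..Xn (e.g., measured service times in minutes).
observations = [4.2, 5.1, 3.8, 6.0, 4.9]
n = len(observations)

# Sample mean: arithmetic average of the observations.
sample_mean = sum(observations) / n

# Sample variance: squared deviations from the sample mean, divided by n - 1.
sample_var = sum((x - sample_mean) ** 2 for x in observations) / (n - 1)

print(f"sample mean     = {sample_mean:.3f}")
print(f"sample variance = {sample_var:.3f}")

# Cross-check against the standard library (statistics.variance also divides by n - 1).
assert abs(sample_mean - statistics.mean(observations)) < 1e-9
assert abs(sample_var - statistics.variance(observations)) < 1e-9
```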

Sampling Distributions • If X1, X2, …, Xn are IID with E(Xi) = μ and Var(Xi) = σ², then E(X̄) = μ and Var(X̄) = σ²/n • If the underlying distribution of X is normal, then the distribution of X̄ is also normal.
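A standard result is that the mean of n IID observations with mean μ and variance σ² itself has mean μ and variance σ²/n. The sketch below checks this empirically by drawing many samples and looking at the spread of their means. The parameter values (μ = 10, σ = 2, n = 25) and the function name `sample_means` are arbitrary choices for illustration.

```python
import random
import statistics

def sample_means(n, trials, mu=10.0, sigma=2.0, seed=1):
    """Draw `trials` normal samples of size n and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]

means = sample_means(n=25, trials=5000)

# Theory: E(sample mean) = 10 and Var(sample mean) = 2**2 / 25 = 0.16.
print(f"average of sample means  = {statistics.fmean(means):.3f}")
print(f"variance of sample means = {statistics.variance(means):.3f}")
```

Increasing n shrinks the variance of the sample mean, which is why larger samples give more precise estimates.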

Point Estimation • Point estimates are estimates of population distribution parameters (μ, σ², …) • Properties of point estimates • Unbiased: E(estimate) = parameter • Efficient: Var(estimate) is lowest among competing point estimators • Consistent: Var(estimate) decreases (usually to 0) as the sample size increases
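To make "unbiased" concrete, the sketch below compares two variance estimators by simulation: dividing the sum of squared deviations by n – 1 (unbiased) versus dividing by n (biased low). The setup (normal data with σ = 3, samples of size 5) and the function name are assumptions made for this example.

```python
import random

def average_variance_estimates(n, trials, sigma=3.0, seed=2):
    """Average the n-1 and n divisor variance estimates over many samples."""
    rng = random.Random(seed)
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(xs) / n
        ss = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations
        total_unbiased += ss / (n - 1)
        total_biased += ss / n
    return total_unbiased / trials, total_biased / trials

unbiased_avg, biased_avg = average_variance_estimates(n=5, trials=20000)

# True variance is 3**2 = 9; the n-divisor estimator averages about 9 * (n-1)/n = 7.2.
print(f"divisor n-1: {unbiased_avg:.2f}")
print(f"divisor n:   {biased_avg:.2f}")
```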

Confidence Intervals • A confidence interval quantifies the likely imprecision in a point estimator • An interval that contains (covers) the unknown population parameter with some specified probability • Called a 100(1 – α)% confidence interval for the parameter • Example: 87 < μ < 123 with probability 95% • The value of μ is in (87, 123) with 95% confidence • We’ll leave the computation of confidence intervals to a statistics course … or to Arena’s output analyzer tool.
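The slide leaves the computation to other sources, so the following is only a sketch under stated assumptions: it uses the common t-based interval (sample mean ± t × s/√n) with a hard-coded critical value for n = 10, and checks by simulation how often such an interval covers a known mean. All parameter values and function names are hypothetical.

```python
import math
import random
import statistics

T_CRIT_95_DF9 = 2.262  # 0.975 quantile of Student's t with 9 degrees of freedom

def t_interval(sample, t_crit):
    """Return (low, high) for the interval: mean +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width

def coverage(trials=2000, n=10, mu=100.0, sigma=15.0, seed=3):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = t_interval(sample, T_CRIT_95_DF9)
        hits += low <= mu <= high
    return hits / trials

print(f"fraction of intervals covering the true mean: {coverage():.3f}")
```

The fraction should come out near 0.95, which is what "95% confidence" refers to: the procedure covers the true parameter in about 95% of repeated samples.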

Confidence Intervals in Simulation • Run simulation replications, get results • View each replication of the simulation as a data point • Form a confidence interval • The confidence interval tells you how close you are to getting the “true” expected output (what you’d get by averaging an infinite number of replications)
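The replication workflow above can be sketched as follows. Here `run_replication` is a hypothetical stand-in for a real simulation run (it just averages random service times), and the interval uses the usual t-based formula with a hard-coded critical value for 10 replications; neither is taken from Arena.

```python
import math
import random
import statistics

T_CRIT_95_DF9 = 2.262  # 0.975 quantile of Student's t with 9 degrees of freedom

def run_replication(seed, customers=200, mean_service=5.0):
    """Hypothetical stand-in for one simulation run: average of random service times."""
    rng = random.Random(seed)
    return statistics.fmean(
        rng.expovariate(1 / mean_service) for _ in range(customers)
    )

# Each replication gets its own seed and is treated as one independent data point.
results = [run_replication(seed) for seed in range(10)]

mean = statistics.fmean(results)
half_width = T_CRIT_95_DF9 * statistics.stdev(results) / math.sqrt(len(results))
print(f"95% CI for expected output: {mean:.3f} +/- {half_width:.3f}")
```

Running more replications narrows the half-width, roughly in proportion to 1/√n.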

Hypothesis Tests • A hypothesis test is used to test some assertion about the population or its parameters • With sampling, we don’t get a true/false result; we only get evidence that points one way or another • Null hypothesis (H0) – what is to be tested • Alternative hypothesis (H1 or HA) – denial of H0 • H0: μ = 6 vs. H1: μ ≠ 6 • H0: σ < 10 vs. H1: σ ≥ 10 • H0: μ1 = μ2 vs. H1: μ1 ≠ μ2 • Develop a decision rule to decide on H0 or H1 based on sample data
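One possible decision rule for a hypothesis such as H0: μ = 6 is a two-sided one-sample t test: reject H0 when the t statistic is larger in magnitude than a critical value. The sketch below assumes that rule, a 5% significance level, a hard-coded critical value for n = 10, and made-up data.

```python
import math
import statistics

T_CRIT_95_DF9 = 2.262  # two-sided 5% critical value, Student's t with 9 degrees of freedom

def t_statistic(sample, mu0):
    """Distance between the sample mean and mu0, in standard-error units."""
    n = len(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.fmean(sample) - mu0) / std_error

def reject_h0(sample, mu0, t_crit):
    """Decision rule: reject H0 (mu = mu0) when |t| exceeds the critical value."""
    return abs(t_statistic(sample, mu0)) > t_crit

# Hypothetical sample of 10 observations.
sample = [6.8, 7.1, 6.5, 7.4, 6.9, 7.0, 6.6, 7.3, 6.7, 7.2]

print("H0: mu = 6 ->", "reject" if reject_h0(sample, 6.0, T_CRIT_95_DF9) else "fail to reject")
print("H0: mu = 7 ->", "reject" if reject_h0(sample, 7.0, T_CRIT_95_DF9) else "fail to reject")
```

Failing to reject H0 is not proof that H0 is true; it only means the sample does not provide strong evidence against it.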

Errors in Hypothesis Testing • 1 – α is the confidence level of the corresponding confidence interval, where α is the probability of a Type I error (rejecting H0 when it is true)

Hypothesis Testing in Simulation • Input side • Specify input distributions to drive the simulation • Collect real-world data on corresponding processes • “Fit” a probability distribution to the observed real-world data • Test H0: the data are well represented by the selected distribution • Output side • Have two or more “competing” designs modeled • Test H0: all designs perform the same on output, or test H0: one design is better than another • Selection of a “best” model scenario
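For the input side, one common way to test H0 "the data are well represented by the selected distribution" is a chi-square goodness-of-fit test. The sketch below assumes an exponential candidate distribution, 5 equiprobable bins, and a hard-coded 5% critical value for 3 degrees of freedom (5 bins – 1 – 1 estimated parameter). The data sets and function name are hypothetical, and this is not a description of how Arena's Input Analyzer is implemented.

```python
import math
import random

CHI2_CRIT_95_DF3 = 7.815  # 0.95 quantile of chi-square with 3 degrees of freedom

def chi_square_exponential(data, k=5):
    """Chi-square statistic for H0: data are exponential, using k equiprobable bins."""
    n = len(data)
    mean = sum(data) / n  # fitted parameter, estimated from the data
    # Interior bin edges from the fitted inverse CDF: F^-1(p) = -mean * ln(1 - p).
    edges = [-mean * math.log(1 - j / k) for j in range(1, k)]
    observed = [0] * k
    for x in data:
        index = sum(x > edge for edge in edges)  # count of edges below x = bin index
        observed[index] += 1
    expected = n / k  # equiprobable bins: same expected count in every bin
    return sum((o - expected) ** 2 / expected for o in observed)

rng = random.Random(4)
exp_like = [rng.expovariate(1 / 5.0) for _ in range(100)]  # drawn from an exponential
clustered = [rng.gauss(5.0, 0.1) for _ in range(100)]      # clearly not exponential

for name, data in [("exponential-like", exp_like), ("clustered", clustered)]:
    stat = chi_square_exponential(data)
    verdict = "reject H0" if stat > CHI2_CRIT_95_DF3 else "fail to reject H0"
    print(f"{name}: chi-square = {stat:.2f} -> {verdict}")
```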
