Lecture 2: Basic Experimental Statistics
What does this mean? From Snee, R.D. (1983) “Graphical Analysis of Process Variation Studies,” Journal of Quality Technology, 15, 76-88
Are the specimens different? (Plot labels: Sp #1, Sp #2, Sp #3, Sp #5, Sp #6)
Are the operators different? (Plot labels: Operator #1, Operator #2, Operator #3)
Statistics in Experimentation • Project Planning Phase • What is to be measured? • How large is the likely variation? • What are the influential factors? • Experimental Design Phase • Control known sources of variation • Estimate the size of uncontrolled variation • Investigate suitable models • Statistical Analysis Phase • Next few weeks…
Population • All possible items or units that determine an outcome of a well-defined experiment are collectively called a “population”. • Examples: • All 30-ohm resistors produced by a certain manufacturer during a fixed time period. • All measurements of the fracture strength of one-inch-thick underwater welds on a steel alloy base plate that is 200 ft deep in salt water.
Process • A repeatable series of actions that results in an observable characteristic or measurement. • Examples: • The effect of aspirin on blood pressure.
Sample • A group of observations taken from a population or a process. • We usually take a “convenience sample” (i.e., easy to obtain), but these can be of dubious value because they may not be representative of the variation in the population!! • Goal: random sample
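A minimal Python sketch of why a convenience sample can mislead. The population here is hypothetical (1,000 resistors whose resistance drifts upward over the production run), and the names `population`, `convenience`, and `random_sample` are illustrative only.

```python
import random
import statistics

# Hypothetical population: 1000 resistors, nominally 30 ohms, whose
# resistance drifts upward by 0.001 ohm per unit over the production run.
population = [30.0 + 0.001 * i for i in range(1000)]

# Convenience sample: the first 20 units off the line (easy to obtain).
convenience = population[:20]

# Random sample: 20 units chosen so every unit is equally likely to be picked.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
random_sample = rng.sample(population, 20)

pop_mean = statistics.mean(population)
conv_error = abs(statistics.mean(convenience) - pop_mean)
rand_error = abs(statistics.mean(random_sample) - pop_mean)

print(f"population mean:          {pop_mean:.4f} ohm")
print(f"convenience sample error: {conv_error:.4f} ohm")
print(f"random sample error:      {rand_error:.4f} ohm")
```

The convenience sample sees only the start of the run, so it misses the drift entirely; the random sample covers the whole run and lands much closer to the population mean.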
Other definitions • Variable: A property or characteristic on which information is obtained in an experiment. • Distribution: A tabular or graphical description of the values of a variable using some measure of how frequently they occur in a population, process, or sample. (think: histogram)
Example Population or a Sample?
Example Convenience Sample or Random Sample?
Example Is this a Normal Distribution?
Beware the Normal Distribution • aka, Gaussian Distribution • Approximately normal distributions occur in many situations, as explained by the central limit theorem. When there is reason to suspect the presence of a large number of small effects acting additively and independently, it is reasonable to assume that observations will be normal. • Good example: laser light intensity
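The "many small effects acting additively and independently" argument can be checked with a short simulation. This is only a sketch under assumed numbers: each observation is the sum of 50 uniform random effects (an arbitrary choice), and the result is compared against the 68% and 95% coverage expected of a normal distribution.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

def one_observation(n_effects=50):
    # Each observation is the sum of many small, independent effects,
    # here drawn from a (non-normal) uniform distribution on [-1, 1].
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_effects))

observations = [one_observation() for _ in range(20000)]
mean = statistics.mean(observations)
s = statistics.stdev(observations)

# For a normal distribution, about 68% of values lie within 1 s of the mean
# and about 95% within 2 s.
within_1s = sum(abs(x - mean) <= s for x in observations) / len(observations)
within_2s = sum(abs(x - mean) <= 2 * s for x in observations) / len(observations)
print(f"within 1 s: {within_1s:.3f}, within 2 s: {within_2s:.3f}")
```

Even though each individual effect is uniformly distributed, the sums come out close to normal, which is what the central limit theorem predicts for additive, independent effects.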
Beware the Normal Distribution • Bad example: Black–Scholes model • Changes in the logarithm of exchange rates, price indices, and stock market indices: these variables behave like compound interest, not like simple interest, and so are multiplicative. • While the Black–Scholes model assumes normality, in reality these variables exhibit heavy tails, as seen in stock market crashes (see The Black Swan).
Beware the Normal Distribution • Concrete example: S&P 500 daily change: mean = +0.035%, standard deviation = 0.9% (from W. Egan) • Based on this, and assuming a normal distribution, you conclude: • 68% of the time, changes will be smaller than 0.9% in magnitude • 95% of the time, changes will be smaller than 1.8% • 99.7% of the time, changes will be smaller than 2.7% • 99.994% of the time, changes will be smaller than 3.6% • (a larger change only 1 in every 17,000 trading days, or about once every 70 years) • A 4.5% change should almost never happen, but it has happened multiple times in your lifetimes!
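The percentages above follow from the normal tail probability P(|Z| > k) = erfc(k/√2). A minimal sketch, assuming 252 trading days per year (a figure not given on the slide) and ignoring the small +0.035% mean:

```python
import math

def two_sided_tail(k):
    # Probability that a normal variable lies more than k standard
    # deviations from its mean: P(|Z| > k) = erfc(k / sqrt(2)).
    return math.erfc(k / math.sqrt(2.0))

sigma_pct = 0.9               # daily standard deviation from the slide, in percent
trading_days_per_year = 252   # assumption, not from the slide

for k in range(1, 6):
    p = two_sided_tail(k)
    days = 1.0 / p
    print(f"|change| > {k * sigma_pct:.1f}%: p = {p:.3g}, "
          f"about once every {days:,.0f} trading days "
          f"({days / trading_days_per_year:,.1f} years)")
```

Under the normality assumption a 3.6% move (4 standard deviations) comes out at roughly once in sixty-odd years, close to the rounded figure on the slide, and a 4.5% move (5 standard deviations) at roughly once in several thousand years. The real market record disagrees, which is the point of the warning.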
Sample Mean or Average • Sample median: • M=x(q) if n is odd, where q=(n+1)/2. • M=[x(q) + x(q+1)]/2 if n is even where q=n/2 • Do not use text for equations (like I just did) in reports! It looks like crap!
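The median rule above translates directly into code. A minimal sketch: `sample_median` follows the slide's definition using 1-based order statistics x(q), while `sample_mean` uses the standard definition (sum divided by n), since the slide's own formula for the mean did not survive. The readings are made-up values.

```python
def sample_mean(values):
    # Standard definition: sum of the observations divided by n.
    return sum(values) / len(values)

def sample_median(values):
    # Sort to obtain the order statistics x(1) <= x(2) <= ... <= x(n).
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        q = (n + 1) // 2          # 1-based position of the middle value
        return x[q - 1]           # Python lists are 0-based
    q = n // 2
    return (x[q - 1] + x[q]) / 2  # average of x(q) and x(q+1)

readings = [4.1, 3.9, 4.0, 9.7, 4.2]  # hypothetical data with one outlier
print(f"mean = {sample_mean(readings):.2f}, median = {sample_median(readings):.2f}")
```

With an outlier in the data the mean is pulled toward it while the median is not, which is one reason to report both.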
Sample Standard Deviation • In your reports, all values should be reported as: or • INCLUDE UNITS!!!!
Sample Standard Error • Some people like to use standard errors because the smaller values seem to imply better results. You need to be careful with such people!
Std. Err. vs. Std. Dev. Thanks to Wikipedia…. • Standard error is an estimate of how close to the population mean your sample mean is likely to be; it behaves like a confidence interval. • Standard deviation is the degree to which individuals within the sample differ from the sample mean (“spread”). • Standard error should decrease with larger sample sizes, as the estimate of the population mean improves. Standard deviation does not systematically shrink as the sample size grows; it settles toward the population value.
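A quick simulation shows the difference in behavior. This is a sketch under assumptions: the measurements are hypothetical normal draws with true mean 10.0 and true standard deviation 2.0, and the standard error is computed with the usual definition s/√n.

```python
import math
import random
import statistics

rng = random.Random(2)  # fixed seed so the sketch is repeatable

def summarize(n):
    # Hypothetical measurements: true mean 10.0, true standard deviation 2.0.
    sample = [rng.gauss(10.0, 2.0) for _ in range(n)]
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)          # standard error of the mean
    return s, se

results = {n: summarize(n) for n in (25, 400, 10000)}
for n, (s, se) in results.items():
    print(f"n = {n:5d}: s = {s:.3f}, standard error = {se:.3f}")
```

The standard deviation stays near 2 at every sample size, while the standard error keeps falling as n grows. A small standard error therefore says as much about how many measurements were taken as about how scattered they are.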
Percent Error • Note: The order of terms in the numerator can vary. • Percent Difference • Note: There is no universally accepted choice for the value that should go in the denominator. • If one value is believed to be closer to the true value, that should be in the denominator.
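The slide's formulas did not survive, so the sketch below uses common conventions that are assumptions here: percent error is taken relative to the accepted value (signed, since the order of terms can vary), and percent difference divides by the mean of the two values, one of several possible denominators. The function names and the numbers are illustrative.

```python
def percent_error(measured, reference):
    # Signed percent error relative to the accepted (reference) value.
    # Some texts use (reference - measured) or take the absolute value.
    return (measured - reference) / reference * 100.0

def percent_difference(a, b):
    # One common convention: divide by the mean of the two values,
    # used when neither is believed to be closer to the true value.
    return abs(a - b) / ((a + b) / 2.0) * 100.0

# Hypothetical measurements of g in m/s^2, with 9.81 m/s^2 as the accepted value.
print(f"percent error:      {percent_error(9.62, 9.81):.2f}%")
print(f"percent difference: {percent_difference(9.62, 9.74):.2f}%")
```

If one of the two values is believed to be closer to the truth, `percent_error` with that value as `reference` is the more appropriate choice, as the last bullet notes.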
Which to use? • We almost always need to give an indication of how accurate our experimental measurements are. Which measure should we use? • When should we use s? • When should we use % error or % diff?