Some Basic Statistical Concepts Dr. Tai-Yue Wang Department of Industrial and Information Management National Cheng Kung University Tainan, TAIWAN, ROC
Outline • Introduction • Basic Statistical Concepts • Inferences about the Differences in Means, Randomized Designs • Inferences about the Differences in Means, Paired Comparison Designs • Inferences about the Variances of Normal Distributions
Introduction • Formulation of a cement mortar • Original formulation and modified formulation • 10 samples for each formulation • One factor: formulation • Two formulations: two treatments, i.e., two levels of the factor "formulation"
Introduction • Results: the 10 observations from each formulation (data table from the original slide not reproduced here)
Introduction • Dot diagram of the two samples (figure from the original slide not reproduced here)
Basic Statistical Concepts • Experiences from the above example • Run – each of the above observations • Noise, experimental error, error – the run-to-run variation in the observations • Statistical error – arises from variation that is uncontrolled and generally unavoidable • The presence of error means that the response variable is a random variable • A random variable can be discrete or continuous
Basic Statistical Concepts • Describing sample data • Graphical descriptions • Dot diagram – central tendency, spread • Box plot – median, quartiles, minimum, and maximum • Histogram – shape of the distribution
Basic Statistical Concepts • Discrete vs continuous
Basic Statistical Concepts • Probability distribution • Discrete • Continuous
Basic Statistical Concepts • Probability distribution • Mean – a measure of the distribution's central tendency • Expected value – the long-run average value
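The defining formulas (the equation displayed on the original slide is not reproduced in the extracted text) take the standard form, for a continuous density f(y) or a discrete probability function p(y):

\mu = E(y) = \int_{-\infty}^{\infty} y \, f(y) \, dy \qquad \text{or} \qquad \mu = E(y) = \sum_{\text{all } y} y \, p(y)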
Basic Statistical Concepts • Probability distribution • Variance – the variability or dispersion of a distribution
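The corresponding standard definition (again reconstructing the slide's displayed equation; a sum replaces the integral in the discrete case):

\sigma^2 = V(y) = E\left[(y - \mu)^2\right] = \int_{-\infty}^{\infty} (y - \mu)^2 \, f(y) \, dy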
Basic Statistical Concepts • Probability distribution • Properties (c is a constant): • E(c) = c • E(y) = μ • E(cy) = cE(y) = cμ • V(c) = 0 • V(y) = σ² • V(cy) = c²V(y) = c²σ² • E(y1 + y2) = μ1 + μ2
Basic Statistical Concepts • Probability distribution • Properties (c is a constant): • V(y1 + y2) = V(y1) + V(y2) + 2Cov(y1, y2) • V(y1 - y2) = V(y1) + V(y2) - 2Cov(y1, y2) • If y1 and y2 are independent, Cov(y1, y2) = 0 and E(y1·y2) = E(y1)·E(y2) = μ1·μ2 • E(y1/y2) is not necessarily equal to E(y1)/E(y2); a numerical check appears below
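A minimal simulation sketch (not part of the original slides) that checks these properties numerically; it assumes NumPy is available, and the distributions, means, covariance, and sample size are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two correlated normal variables y1, y2 (illustrative parameters only)
mean = [2.0, 10.0]
cov = [[1.0, 0.6],
       [0.6, 2.0]]
y1, y2 = rng.multivariate_normal(mean, cov, size=n).T

# V(y1 + y2) is close to V(y1) + V(y2) + 2*Cov(y1, y2)
lhs = np.var(y1 + y2)
rhs = np.var(y1) + np.var(y2) + 2 * np.cov(y1, y2)[0, 1]
print(lhs, rhs)                       # agree up to sampling error

# E(y1*y2) = E(y1)*E(y2) only when Cov(y1, y2) = 0; here they differ by ~0.6
print(np.mean(y1 * y2), np.mean(y1) * np.mean(y2))

# E(y1/y2) is generally not E(y1)/E(y2) (the difference is small but real here)
print(np.mean(y1 / y2), np.mean(y1) / np.mean(y2))
```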
Basic Statistical Concepts • Sampling and sampling distribution • Random sample – if the population contains N elements and a sample of n of them is to be selected, the sample is random if each of the N!/[(N-n)!n!] possible samples has an equal probability of being chosen • Random sampling – the procedure above • Statistic – any function of the observations in a sample that does not contain unknown parameters
Basic Statistical Concepts • Sampling and sampling distribution • Sample mean • Sample variance
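The standard definitions of these two statistics (the formulas displayed on the original slide are not reproduced in the text):

\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i, \qquad S^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}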
Basic Statistical Concepts • Sampling and sampling distribution • Estimator – a statistic that corresponds to an unknown parameter • Estimate – a particular numerical value of an estimator • Point estimators: ȳ for μ and S² for σ² • Properties of the sample mean and variance: • The point estimator should be unbiased • An unbiased estimator should have minimum variance
Basic Statistical Concepts • Sampling and sampling distribution • Sum of squares, SS – the sum of squared deviations of the observations from the sample mean (the numerator of S²), defined below
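In the usual notation:

SS = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad S^2 = \frac{SS}{n - 1}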
Basic Statistical Concepts • Sampling and sampling distribution • Degrees of freedom, ν – the number of independent elements in a sum of squares; for the sample variance it is given below
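For the sum of squares above, the familiar unbiasedness result follows:

\nu = n - 1, \qquad E\left(\frac{SS}{\nu}\right) = E(S^2) = \sigma^2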
Basic Statistical Concepts • Sampling and sampling distribution • Normal distribution, N(μ, σ²)
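The normal density, in its standard form (the equation displayed on the original slide is not reproduced in the text):

f(y) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(y - \mu)^2 / (2\sigma^2)}, \qquad -\infty < y < \infty, \qquad y \sim N(\mu, \sigma^2)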
Basic Statistical Concepts • Sampling and sampling distribution • Standard normal distribution, z – a normal distribution with μ = 0 and σ² = 1
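Any normal random variable can be converted to standard normal form by the usual standardization:

z = \frac{y - \mu}{\sigma} \sim N(0, 1) \qquad \text{when } y \sim N(\mu, \sigma^2)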
Basic Statistical Concepts • Sampling and sampling distribution • Central Limit Theorem – if y1, y2, …, yn is a sequence of n independent and identically distributed random variables with E(yi) = μ and V(yi) = σ², and x = y1 + y2 + … + yn, then the limiting distribution of z_n = (x - nμ)/√(nσ²), as n → ∞, is the standard normal distribution
Basic Statistical Concepts • Sampling and sampling distribution • Chi-square, χ², distribution – if z1, z2, …, zk are normally and independently distributed random variables with mean 0 and variance 1, NID(0, 1), then the random variable χ² = z1² + z2² + … + zk² follows the chi-square distribution with k degrees of freedom
Basic Statistical Concepts • Sampling and sampling distribution • Chi-square distribution – example: if y1, y2, …, yn is a random sample from a N(μ, σ²) distribution, then SS/σ² = Σ(yi - ȳ)²/σ² follows the chi-square distribution with n - 1 degrees of freedom • For the sample variance from NID(μ, σ²) observations, S² = SS/(n - 1), so (n - 1)S²/σ² also follows the chi-square distribution with n - 1 degrees of freedom
Basic Statistical Concepts • Sampling and sampling distribution • t distribution – if z and χ²_k are independent standard normal and chi-square random variables, respectively, then the random variable t_k = z / √(χ²_k / k) follows the t distribution with k degrees of freedom
Basic Statistical Concepts • Sampling and sampling distribution • pdf of t distribution – mean μ = 0 and variance σ² = k/(k-2) for k > 2; the density is given below
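The density itself, in the standard textbook form (reconstructed, since the slide's displayed equation is not in the extracted text):

f(t) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\,\Gamma\!\left(\frac{k}{2}\right)} \left(1 + \frac{t^2}{k}\right)^{-(k+1)/2}, \qquad -\infty < t < \infty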
Basic Statistical Concepts • Sampling and sampling distribution • If y1, y2, …, yn is a random sample from N(μ, σ²), then the quantity t = (ȳ - μ)/(S/√n) is distributed as t with n - 1 degrees of freedom
Basic Statistical Concepts • Sampling and sampling distribution • F distribution – if χ²_u and χ²_v are two independent chi-square random variables with u and v degrees of freedom, respectively, then the ratio F_{u,v} = (χ²_u / u)/(χ²_v / v) follows the F distribution with u numerator degrees of freedom and v denominator degrees of freedom
Basic Statistical Concepts • Sampling and sampling distribution • pdf of F distribution – given below
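The density, in the standard textbook form (reconstructed, since the slide's displayed equation is not in the extracted text):

h(x) = \frac{\Gamma\!\left(\frac{u+v}{2}\right) \left(\frac{u}{v}\right)^{u/2} x^{u/2 - 1}}{\Gamma\!\left(\frac{u}{2}\right) \Gamma\!\left(\frac{v}{2}\right) \left[\left(\frac{u}{v}\right) x + 1\right]^{(u+v)/2}}, \qquad 0 < x < \infty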
Basic Statistical Concepts • Sampling and sampling distribution • F distribution – example: suppose we have two independent normal populations with common variance σ². If y11, y12, …, y1n1 is a random sample of n1 observations from the first population and y21, y22, …, y2n2 is a random sample of n2 observations from the second population, then the ratio of sample variances S1²/S2² follows the F distribution with n1 - 1 and n2 - 1 degrees of freedom
The Hypothesis Testing Framework • Statistical hypothesis testing is a useful framework for many experimental situations • Origins of the methodology date from the early 1900s • We will use a procedure known as the two-sample t-test
Two-Sample-t-Test • Suppose we have two independent normal populations. Let y11, y12, …, y1n1 be a random sample of n1 observations from the first population and y21, y22, …, y2n2 be a random sample of n2 observations from the second population
Two-Sample-t-Test • A model for the data: y_ij = μ_i + ε_ij, i = 1, 2, j = 1, 2, …, n_i, where μ_i is the mean of population i and ε_ij is a random error
Two-Sample-t-Test • Sampling from a normal distribution • Statistical hypotheses: H0: μ1 = μ2 versus H1: μ1 ≠ μ2
Two-Sample-t-Test • H0 is called the null hypothesis and H1 is called the alternative hypothesis • One-sided vs two-sided hypotheses • Type I error, α: the null hypothesis is rejected when it is true • Type II error, β: the null hypothesis is not rejected when it is false
Two-Sample-t-Test • Power of the test: 1 - β = P(reject H0 | H0 is false) • Type I error α = significance level • 1 - α = confidence level
Two-Sample-t-Test • Two-sample t-test • Hypotheses: H0: μ1 = μ2 versus H1: μ1 ≠ μ2 • Test statistic: given below
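The pooled test statistic, in its standard form (the formulas displayed on the original slide are not reproduced in the text):

t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}

Under H0, t_0 follows the t distribution with n_1 + n_2 - 2 degrees of freedom.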
Example – Summary Statistics (values taken from the Minitab output that follows)
Formulation 1 "New recipe": n1 = 10, ȳ1 = 16.76, S1 = 0.316
Formulation 2 "Original recipe": n2 = 10, ȳ2 = 17.04, S2 = 0.248
Two-Sample-t-Test – How the Two-Sample t-Test Works: • Values of t0 that are near zero are consistent with the null hypothesis • Values of t0 that are very different from zero are consistent with the alternative hypothesis • t0 is a "distance" measure: how far apart the sample averages are, expressed in standard deviation units • Notice the interpretation of t0 as a signal-to-noise ratio
Two-Sample-t-Test • P-value – the smallest level of significance that would lead to rejection of the null hypothesis • Computer application (Minitab output):

Two-Sample T-Test and CI
Sample   N   Mean    StDev   SE Mean
1       10   16.760  0.316   0.10
2       10   17.040  0.248   0.078
Difference = mu (1) - mu (2)
Estimate for difference: -0.280
95% CI for difference: (-0.547, -0.013)
T-Test of difference = 0 (vs not =): T-Value = -2.20  P-Value = 0.041  DF = 18
Both use Pooled StDev = 0.2840
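A minimal Python sketch (not part of the original slides) that reproduces this pooled t-test from the summary statistics above; it assumes SciPy is available and works from the summary values rather than the raw data:

```python
from math import sqrt
from scipy import stats

# Summary statistics from the Minitab output
n1, ybar1, s1 = 10, 16.760, 0.316   # formulation 1 (new recipe)
n2, ybar2, s2 = 10, 17.040, 0.248   # formulation 2 (original recipe)

# Pooled standard deviation
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Two-sample pooled t statistic and two-sided P-value
t0 = (ybar1 - ybar2) / (sp * sqrt(1.0 / n1 + 1.0 / n2))
df = n1 + n2 - 2
p_value = 2 * stats.t.sf(abs(t0), df)

print(f"Pooled StDev = {sp:.4f}")                              # ~0.2840
print(f"t0 = {t0:.2f}, df = {df}, P-value = {p_value:.3f}")    # ~ -2.20, 18, 0.041
```

The same result could also be obtained with scipy.stats.ttest_ind_from_stats(ybar1, s1, n1, ybar2, s2, n2, equal_var=True), which performs the pooled test directly from summary statistics.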
William Sealy Gosset (1876-1937) Gosset's interest in barley cultivation led him to speculate that design of experiments should aim, not only at improving the average yield, but also at breeding varieties whose yield was insensitive (robust) to variation in soil and climate. Developed the t-test (1908). Gosset was a friend of both Karl Pearson and R. A. Fisher, an achievement, for each had a monumental ego and a loathing for the other. Gosset was a modest man who cut short an admirer with the comment that "Fisher would have discovered it all anyway."
The Two-Sample (Pooled) t-Test t0 = -2.20 • So far, we haven't really done any "statistics" • We need an objective basis for deciding how large the test statistic t0 really is • In 1908, W. S. Gosset derived the reference distribution for t0 … called the t distribution • Tables of the t distribution – see textbook appendix
The Two-Sample (Pooled) t-Test t0 = -2.20 • A value of t0 between –2.101 and 2.101 is consistent with equality of means (±2.101 is the two-sided 5% critical value t0.025,18) • It is possible for the means to be equal and t0 to exceed either 2.101 or –2.101, but it would be a "rare event" … this leads to the conclusion that the means are different • Could also use the P-value approach (see the check below)
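A short check (not from the original slides), assuming SciPy, of where the ±2.101 cutoff and the P-value come from:

```python
from scipy import stats

df = 18
t0 = -2.20

# Two-sided 5% critical value: t_{0.025, 18}
t_crit = stats.t.ppf(0.975, df)
print(f"t_0.025,18 = {t_crit:.3f}")       # ~2.101

# P-value for the observed test statistic
p_value = 2 * stats.t.sf(abs(t0), df)
print(f"P-value = {p_value:.3f}")          # ~0.041
```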