Dr. Jerrell T. Stracener

EMIS 7370 / STAT 5340
Department of Engineering Management, Information and Systems
Probability and Statistics for Scientists and Engineers
Estimation
Basic Concepts & Estimation of Proportions
Dr. Jerrell T. Stracener

Estimation
• Types of estimates and methods of estimation
• Estimation - Binomial distribution
  - Estimation of a proportion
  - Estimation of the difference between two proportions
• Estimation - Normal distribution
  - Estimation of the mean
  - Estimation of the standard deviation
  - Estimation of the difference between two means

Estimation
• Estimation - Normal distribution (continued)
  - Estimation of the ratio of the two standard deviations
  - Tolerance intervals
• Estimation - Lognormal distribution
• Estimation - Weibull distribution
• Estimation - Unknown distribution types
  - Continuous populations
  - Finite populations

Estimation Types of Estimates & Methods of Estimation

Definition - Statistic
A statistic is a function of only the values of a random sample, $X_1, X_2, \ldots, X_n$. For example, the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is a statistic.

Properties of Estimates
A statistic $\hat{\Theta}$ is said to be an unbiased estimator of the parameter $\theta$ if $E(\hat{\Theta}) = \theta$.
If we consider all possible unbiased estimators of some parameter $\theta$, the one with the smallest variance is called the most efficient estimator of $\theta$.

Types of Estimates
• Point Estimate
  A function of the values of a random sample that yields a single value, i.e., a point
• Interval Estimate
  An interval, whose end points are functions of the values of a random sample, for which one can assert with a specified confidence that the interval contains the parameter being estimated

Types of Estimates & Methods of Estimation
If we use a sample mean to estimate the mean of a population, a sample proportion to estimate the probability of success on an individual trial, or a sample variance to estimate the variance of a population, we are in each case using a point estimate of the parameter in question. These estimates are called point estimates since they are single numbers, single points, used, respectively, to estimate $\mu$, $p$, and $\sigma^2$. Since we can hardly expect point estimates based on samples to hit the parameters they are supposed to estimate exactly 'on the nose', it is often desirable to give an interval rather than a single number.

Types of Estimates & Methods of Estimation We can then assert with a certain probability (or degree of confidence) that such an interval contains the parameter it is intended to estimate. For instance, when estimating the average IQ of all college students in the US, we might arrive at a point estimate of 117, or we might arrive at an interval estimate to the effect that the interval from 113 to 121 contains the ‘true’ average IQ of all college students in the US.

[Figure: Interval Estimates of $\mu$ for Different Samples]

Method of Maximum Likelihood
Given independent observations $x_1, x_2, \ldots, x_n$ from a probability density function (continuous case) $f(x; \theta)$ or probability mass function (discrete case) $p(x; \theta)$, the maximum likelihood estimator $\hat{\theta}$ is the value of $\theta$ that maximizes the likelihood function
$L(x_1, x_2, \ldots, x_n; \theta) = f(x_1; \theta) \cdot f(x_2; \theta) \cdots f(x_n; \theta)$, if X is continuous
$L(x_1, x_2, \ldots, x_n; \theta) = p(x_1; \theta) \cdot p(x_2; \theta) \cdots p(x_n; \theta)$, if X is discrete

Method of Maximum Likelihood
Let $x_1, x_2, \ldots, x_n$ denote observed values in a sample. In the case of a discrete random variable the interpretation is very clear. The quantity $L(x_1, x_2, \ldots, x_n; \theta)$, the likelihood of the sample, is the joint probability
$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$.
This is the probability of obtaining the sample values $x_1, x_2, \ldots, x_n$. For the discrete case, the maximum likelihood estimator is the one that results in a maximum value for this joint probability, that is, the one that maximizes the likelihood of the sample.
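The discrete-case interpretation can be illustrated numerically. The following is a minimal sketch, not part of the original slides: it assumes a hypothetical sample of ten Bernoulli trials (1 = success, 0 = failure), evaluates the likelihood as the product of the individual probabilities, and searches a grid of candidate values of θ for the maximizer. The function names and the sample are illustrative assumptions.

```python
def bernoulli_pmf(x, theta):
    """p(x; theta) for a single Bernoulli trial: theta if x == 1, else 1 - theta."""
    return theta if x == 1 else 1.0 - theta


def likelihood(sample, theta):
    """L(x1, ..., xn; theta): product of the individual probabilities."""
    value = 1.0
    for x in sample:
        value *= bernoulli_pmf(x, theta)
    return value


def grid_mle(sample, steps=1000):
    """Return the grid point in [0, 1] that maximizes the likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda theta: likelihood(sample, theta))


# Hypothetical sample (an assumption for illustration): 3 successes in 10 trials.
sample = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
theta_hat = grid_mle(sample)
print(theta_hat)  # coincides with the sample proportion 3/10
```

For this sample the grid search lands on the sample proportion of successes, which is the closed-form maximum likelihood estimate for a Bernoulli parameter.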

Estimation of Proportions

Estimation of Proportions
• Estimation of the proportion, p, based on a random sample from the binomial distribution B(n, p)
  - Point estimation
  - Interval estimation
    - Approximate method
    - Sample size
    - Exact method
• Estimation of the difference between two proportions, p1 - p2, based on random samples from B(n1, p1) and B(n2, p2)

Estimation of Proportions • Point Estimation • Interval Estimation

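Estimation - Binomial Distribution
• Estimation of a Proportion, p
• $X_1, X_2, \ldots, X_n$ is a random sample of size n from B(n, p), where $X_i = 1$ if the i-th trial is a success and $X_i = 0$ otherwise
• Point estimate of p: $\hat{p} = \bar{X} = \frac{f_s}{n}$, where $f_s$ = number of successes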

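Estimation - Binomial Distribution
• Approximate $(1 - \alpha) \cdot 100\%$ confidence interval for p: $(p_L, p_U)$
• where $p_L = \hat{p} - z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and $p_U = \hat{p} + z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$
• where $\hat{q} = 1 - \hat{p}$,
• and $z_{\alpha/2}$ is the value of the standard normal random variable Z such that $P(Z > z_{\alpha/2}) = \alpha/2$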

Note
When n is small and the unknown proportion p is believed to be close to 0 or to 1, the approximate confidence interval procedure established here is unreliable and, therefore, should not be used. To be on the safe side, one should require both $n\hat{p} \ge 5$ and $n\hat{q} \ge 5$.

Error in Estimating p by $\hat{p}$
[Figure: number line illustrating the error between $\hat{p}$ and p]

Error in Estimating p by $\hat{p}$
• If $\hat{p}$ is used as an estimate of p, we can be $(1 - \alpha) \cdot 100\%$ confident that the error will not exceed $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$
• If $\hat{p}$ is used as an estimate of p, we can be $(1 - \alpha) \cdot 100\%$ confident that the error will be less than a specified amount e when the sample size is $n = \frac{z_{\alpha/2}^2 \, \hat{p}\hat{q}}{e^2}$

Error in Estimating p by $\hat{p}$
• If no preliminary estimate $\hat{p}$ is available to use in the sample size formula, we can be at least $(1 - \alpha) \cdot 100\%$ confident that the error will not exceed a specified amount e when the sample size is $n = \frac{z_{\alpha/2}^2}{4e^2}$

Example
In a random sample of n = 500 families owning television sets in the city of Hamilton, Canada, it is found that x = 340 subscribed to HBO.
a. Treating these 500 families as a preliminary sample, how large a sample is required if we want to be 95% confident that our estimate of p, $\hat{p}$, is within 0.02?
b. If no preliminary sample is available, how large a sample is required if we want to be at least 95% confident that our estimate of p is within 0.02?

Example - Solution
a. Let us treat the 500 families as a preliminary sample providing an estimate $\hat{p} = 340/500 = 0.68$. Then
$n = \frac{z_{0.025}^2 \, \hat{p}\hat{q}}{e^2} = \frac{(1.96)^2 (0.68)(0.32)}{(0.02)^2} \approx 2090$.
Therefore, if we base our estimate of p on a random sample of size 2090, we can be 95% confident that our sample proportion will not differ from the true proportion by more than 0.02.

Example - Solution
b. We shall now assume that no preliminary sample has been taken to provide an estimate of p. Consequently, we can be at least 95% confident that our sample proportion will not differ from the true proportion by more than 0.02 if we choose a sample of size
$n = \frac{z_{0.025}^2}{4e^2} = \frac{(1.96)^2}{4(0.02)^2} = 2401$.
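Both sample-size calculations can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the normal quantile comes from Python's standard-library `statistics.NormalDist` rather than the rounded table value 1.96, and results are rounded up to the next whole observation.

```python
from math import ceil
from statistics import NormalDist


def z_value(alpha):
    """Return z such that P(Z > z) = alpha / 2 for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_with_estimate(p_hat, e, alpha=0.05):
    """n = z^2 * p_hat * q_hat / e^2, using a preliminary estimate p_hat."""
    z = z_value(alpha)
    return ceil(z**2 * p_hat * (1 - p_hat) / e**2)


def sample_size_conservative(e, alpha=0.05):
    """n = z^2 / (4 e^2), the bound that needs no preliminary estimate."""
    z = z_value(alpha)
    return ceil(z**2 / (4 * e**2))


n_a = sample_size_with_estimate(340 / 500, 0.02)  # part a
n_b = sample_size_conservative(0.02)              # part b
print(n_a, n_b)  # 2090 2401
```

The second formula is never smaller than the first because $\hat{p}\hat{q}$ is at most 1/4, which is why it gives "at least" the stated confidence.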

Example: Estimation of Binomial parameter p In a random sample of n = 500 families owning television sets in the city of Hamilton, Canada, it was found that fS = 340 owned color sets. Estimate the population proportion of families with color TV sets and determine a 95% confidence interval for the actual proportion of families in this city with color sets.

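Example: solution
The point estimate of p is $\hat{p} = 340/500 = 0.68$. Then an approximate 95% confidence interval for p is $(p_L, p_U)$, where
$p_L = \hat{p} - z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and $p_U = \hat{p} + z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$,
with $\alpha = 0.05$, $z_{0.025} = 1.96$ and $\hat{q} = 1 - \hat{p} = 0.32$, so that
$z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 1.96\sqrt{(0.68)(0.32)/500} = 0.04089$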

Example: solution
and an approximate 95% confidence interval for p is (0.63911, 0.72089). Therefore, our "best" estimate (the point estimate) of p is 0.68, and we are about 95% confident that p is between 0.64 and 0.72.
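The interval above can be reproduced in a few lines. The sketch below is illustrative only: the function name is hypothetical, the quantile is taken from `statistics.NormalDist` instead of a table, and the guard requiring at least 5 expected successes and 5 expected failures is a common rule of thumb assumed here, not a definitive validity test.

```python
from math import sqrt
from statistics import NormalDist


def approx_proportion_ci(successes, n, confidence=0.95):
    """Approximate (normal-theory) confidence interval for a binomial proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Assumed rule of thumb: the normal approximation is unreliable
    # when either expected count is small.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation unreliable: need n*p_hat >= 5 and n*q_hat >= 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


lower, upper = approx_proportion_ci(340, 500)
print(round(lower, 4), round(upper, 4))
```

Because the script uses the unrounded quantile rather than 1.96, the end points may differ from the hand calculation in the fifth or sixth decimal place.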

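Estimation - Binomial Population
• Exact $(1 - \alpha) \cdot 100\%$ confidence interval for p: $(P_L, P_U)$
• $P_L = \frac{f_s}{f_s + (n - f_s + 1) F_1}$, where $F_1 = F_{\alpha/2;\, 2(n - f_s + 1),\, 2 f_s}$
• and $P_U = \frac{(f_s + 1) F_2}{(n - f_s) + (f_s + 1) F_2}$, where $F_2 = F_{\alpha/2;\, 2(f_s + 1),\, 2(n - f_s)}$,
• and $F_{\gamma;\, \nu_1,\, \nu_2}$ is the value of x for which $P(X > x) = \gamma$ when X has an F distribution with $\nu_1$ and $\nu_2$ degrees of freedom
• NOTE: Use the 'FINV' function in Excel to get the values of F1 and F2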

Example
A random sample of 25 vehicle records is selected for audit from a large number of county records. It is found that 5 have errors. Estimate the population proportion of vehicle records having errors in terms of a point estimate and a 95% confidence interval.

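Example - solution
An approximate 95% confidence interval for p is $(p_L, p_U)$, where $\alpha = 0.05$ and $\hat{p} = 5/25 = 0.20$.
Then $\hat{q} = 1 - \hat{p} = 0.80$, $z_{0.025} = 1.96$, and
$z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 1.96\sqrt{(0.20)(0.80)/25} = 0.1568$,
so that
$p_L = 0.20 - 0.1568 = 0.0432$ and $p_U = 0.20 + 0.1568 = 0.3568$.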

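Example - solution
An exact 95% confidence interval for p is $(P_L, P_U)$, where, with $f_s = 5$, $n = 25$ and $\alpha = 0.05$,
$P_L = \frac{5}{5 + 21 F_1} \approx 0.068$, with $F_1 = F_{0.025;\, 42,\, 10}$,
and
$P_U = \frac{6 F_2}{20 + 6 F_2} \approx 0.407$, with $F_2 = F_{0.025;\, 12,\, 40}$.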
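The exact interval can also be obtained without F tables. The sketch below swaps in a different but equivalent technique: instead of the F-distribution formulas, it solves the defining binomial tail equations by bisection (the Clopper-Pearson construction), so that $P(X \ge f_s)$ equals $\alpha/2$ at the lower limit and $P(X \le f_s)$ equals $\alpha/2$ at the upper limit. The function names are hypothetical and only the standard library is used.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def solve_increasing(func, target, iterations=100):
    """Bisection on [0, 1] for func(p) = target, where func increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_proportion_ci(successes, n, confidence=0.95):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion."""
    alpha = 1 - confidence
    x = successes
    # Lower limit: P(X >= x) = alpha/2; by convention 0 when x == 0.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: P(X <= x) = alpha/2, i.e. P(X > x) = 1 - alpha/2; 1 when x == n.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


exact_lower, exact_upper = exact_proportion_ci(5, 25)
print(round(exact_lower, 3), round(exact_upper, 3))
```

For the vehicle-record data the exact interval is wider and shifted to the right compared with the approximate interval, which is expected when n is small and the proportion is far from 1/2.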

Estimation - Binomial Populations
• Estimation of the difference between two proportions
• Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be random samples from B(n1, p1) and B(n2, p2), respectively
• Point estimation of $\Delta p = p_1 - p_2$: $\Delta\hat{p} = \hat{p}_1 - \hat{p}_2$

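Estimation - Binomial Populations
• Approximate $(1 - \alpha) \cdot 100\%$ confidence interval for $\Delta p = p_1 - p_2$: $(\Delta p_L, \Delta p_U)$
• where $\Delta p_L = \Delta\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$
• and $\Delta p_U = \Delta\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$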

Example: Estimation of p1 - p2
A certain change in a manufacturing process for component parts is being considered. Samples are taken using both the existing and the new procedure in order to determine if the new procedure results in an improvement. If 75 of 1500 items from the existing procedure were found to be defective and 80 of 2000 items from the new procedure were found to be defective, find a 90% confidence interval for the true difference in the fraction of defectives between the existing and the new process.

Example: solution
Let p1 and p2 be the true proportions of defectives for the existing and new procedures, respectively. Hence
$\hat{p}_1 = 75/1500 = 0.05$ and $\hat{p}_2 = 80/2000 = 0.04$,
and the point estimate of $\Delta p = p_1 - p_2$ is
$\Delta\hat{p} = \hat{p}_1 - \hat{p}_2 = 0.05 - 0.04 = 0.01$

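Example: solution
An approximate 90% confidence interval for $\Delta p = p_1 - p_2$ is $(\Delta p_L, \Delta p_U)$, where
$\Delta p_L = \Delta\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$ and $\Delta p_U = \Delta\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$,
with $\alpha = 0.10$, $z_{0.05} = 1.645$, $\hat{q}_1 = 0.95$, $\hat{q}_2 = 0.96$, $n_1 = 1500$ and $n_2 = 2000$, so that
$z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = 1.645\sqrt{\frac{(0.05)(0.95)}{1500} + \frac{(0.04)(0.96)}{2000}} = 0.0117$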

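Example: solution
Then $\Delta p_L = 0.01 - 0.0117 = -0.0017$ and $\Delta p_U = 0.01 + 0.0117 = 0.0217$.
Therefore an approximate 90% confidence interval for $\Delta p = p_1 - p_2$ is (-0.0017, 0.0217). About 93% of the length of the confidence interval lies above zero, which favors the new procedure.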
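The two-sample calculation can be verified numerically. This is a minimal sketch with a hypothetical function name, using `statistics.NormalDist` for the quantile instead of the table value 1.645; it also computes the share of the interval's length that lies above zero.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.90):
    """Approximate confidence interval for p1 - p2 from two binomial samples."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_err = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * std_err, diff + z * std_err


# Existing procedure: 75 defectives of 1500; new procedure: 80 of 2000.
d_lower, d_upper = diff_proportions_ci(75, 1500, 80, 2000)
share_above_zero = d_upper / (d_upper - d_lower)
print(round(d_lower, 4), round(d_upper, 4), round(share_above_zero, 2))
```

Note that the interval includes zero, so at the 90% level the data do not rule out the two procedures having the same fraction of defectives.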
