Inferential Statistics: Parameter Estimation
Inferential statistics is a body of quantitative techniques that enable the scientist to make appropriate generalizations from limited observations. (Frank & Althoen, “Statistics: Concepts and Applications”, 1994)
Inferential Statistics
Inferential statistics (also called statistical inference or statistical induction) refers to systems of procedures for drawing conclusions from datasets that arise from systems affected by random variation, such as observational errors, random sampling, or random experimentation.
Inferential Statistics
• From sample data, inferential statistics are used for
  • Parameter estimation
  • Hypothesis testing
• Parametric statistics is a branch of statistics that assumes the data come from a type of probability distribution and makes inferences about the parameters of that distribution. (http://en.wikipedia.org/wiki/Parametric_statistics)
• Non-parametric statistics refers to statistics that do not assume the data or population have any characteristic structure or parameters. (http://en.wikipedia.org/wiki/Non-parametric_statistics)
Inferential Statistics
• Estimation
  • Estimate population means
  • Estimate population proportions
  • Estimate population variances
• Hypothesis testing
  • Testing population means
  • Testing categorical data / proportions
  • Testing population variances
  • Hypotheses about many population means
    • One-way ANOVA
    • Two-way ANOVA
Estimation
• To infer a population parameter from a sample statistic
  • From x̄ (sample mean) to μ (population mean)
  • From the sample proportion to the population proportion
  • From S² (sample variance) to σ² (population variance)
• Point estimation
• Interval estimation
Estimator (http://en.wikipedia.org/wiki/Estimator)
• An estimator is a rule for calculating an estimate of a given quantity based on observed data
• An estimator should
  • Be unbiased
  • Be consistent (converge to the true parameter as the sample size increases)
  • Have minimum variance among competing estimators
Point Estimation
“A point estimator is a single-valued statistic that approximates the value of a population parameter.”
• Usually the corresponding (unbiased) sample statistic is used as the estimate
• Point estimation of μ: the sample mean x̄
• Point estimation of population proportion p: the sample proportion
• Point estimation of σ²: the unbiased sample variance S² (divisor n − 1)
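A minimal sketch of the three point estimates listed above, using Python's standard `statistics` module. The sample values and the success count are hypothetical and serve only as an illustration.

```python
import statistics

# Hypothetical sample of measurements (illustrative values only).
sample = [4.8, 5.1, 5.0, 4.9, 5.2]

# Point estimate of the population mean: the sample mean.
mean_hat = statistics.mean(sample)

# Point estimate of the population variance: the unbiased sample
# variance, which divides by n - 1 rather than n.
var_hat = statistics.variance(sample)

# Point estimate of a population proportion: items of interest / sample size.
successes, n = 45, 100  # hypothetical counts
p_hat = successes / n

print(mean_hat, var_hat, p_hat)
```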
Interval Estimation
“A confidence interval is a range of values that is expected to include the population parameter.”
• Involves calculating a ± margin from the sample statistics, depending on the confidence level
• The point estimate itself is calculated in the usual way
• Interval estimation of mean μ
• Interval estimation of population proportion p
• Interval estimation of variance σ²
Interval Estimation of Mean
• Single population
• Two independent samples
  • Known population variances, large samples
  • Unknown population variances, small samples
• Two dependent samples
Interval Estimation of μ
• Single population
• Confidence level 95% (α = 0.05)
• Interval: x̄ ± z(α/2) · σ/√n
• x̄ = sample mean
• σ = population SD
• n = sample size
• z(α/2) = z-score from the table at α/2 (here z(0.025) = 1.96)
Example
A water supplier wants to estimate the mean volume of water in its 1-gallon containers. The manager reports that the standard deviation of the fill volume is 0.02 gallon. In a sample of 50 containers, the mean volume is 0.995 gallon. Assuming a normal distribution and a 99% confidence level, find the confidence interval for the mean container volume.
Example
• x̄ = 0.995, σ = 0.02, n = 50
• Confidence level 99% (α = 0.01)
• z(0.01/2) = z(0.005) = 2.575
• 0.995 ± 2.575 × 0.02/√50 = 0.995 ± 0.007
• The confidence interval is between 0.988 and 1.002
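The water-container example can be checked with a short sketch. The function name `mean_ci_known_sigma` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed z-table, so it is about 2.576 instead of the tabulated 2.575.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(xbar, sigma, n, confidence):
    """Two-sided z-interval for a mean when the population SD is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z(alpha/2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Figures from the water-container example.
lower, upper = mean_ci_known_sigma(xbar=0.995, sigma=0.02, n=50, confidence=0.99)
print(f"{lower:.3f} to {upper:.3f}")  # 0.988 to 1.002
```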
Interval Estimation of μ
• Two independent samples
• Known population variances, large samples (n ≥ 30)
• Estimates the difference between the two population means, μ1 − μ2
• Interval: (x̄1 − x̄2) ± z(α/2) · √(σ1²/n1 + σ2²/n2)
• x̄1 and x̄2 = means of samples 1 and 2
• σ1 and σ2 = SDs of populations 1 and 2
• n1 and n2 = sizes of samples 1 and 2
• z(α/2) = z-score from the table
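A sketch of the two-sample case with known population SDs, assuming the usual interval (x̄1 − x̄2) ± z(α/2) · √(σ1²/n1 + σ2²/n2). The function name and the summary statistics are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def mean_diff_ci_known_sigma(xbar1, sigma1, n1, xbar2, sigma2, n2, confidence=0.95):
    """Two-sided z-interval for mu1 - mu2 with known population SDs."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard error of the difference
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se

# Hypothetical summary statistics, for illustration only.
lower, upper = mean_diff_ci_known_sigma(10.0, 2.0, 40, 8.0, 3.0, 50)
print(f"{lower:.3f} to {upper:.3f}")
```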
Interval Estimation of μ
• Two independent samples
• Unknown, unequal population variances, small samples (n < 30)
• Interval: (x̄1 − x̄2) ± t(α/2, ν) · √(S1²/n1 + S2²/n2)
• t(α/2, ν) = t-score from the table at the given α and degrees of freedom ν (nu)
• S1 and S2 = SDs of samples 1 and 2
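A sketch of the unequal-variance case. It assumes that ν is the Welch-Satterthwaite approximation, which is the usual choice when the variances are unequal but may not be the exact formula on the original slide. The function names and summary statistics are hypothetical. The t critical value is supplied by hand from a table because the standard library has no t-distribution.

```python
from math import sqrt

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def mean_diff_ci_t(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """Two-sided t-interval for mu1 - mu2; t_crit is t(alpha/2, nu) from a table."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - t_crit * se, diff + t_crit * se

# Hypothetical summary statistics (not the credit card data).
df = welch_df(8.0, 10, 5.0, 12)  # about 14.55, rounded down to 14 for the table
T_CRIT = 2.145                   # table value t(0.025, 14)
lower, upper = mean_diff_ci_t(52.0, 8.0, 10, 45.0, 5.0, 12, T_CRIT)
print(f"df = {df:.2f}, interval = {lower:.3f} to {upper:.3f}")
```

Rounding ν down before the table lookup is a conservative convention, since it gives a slightly wider interval.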
Example
Estimate the 95% confidence interval of the mean difference between the credit card payments of 10 male employees and 12 female employees from the data collected in the table below. Assume normally distributed populations with unequal variances.
Example • μ1 = average credit card payment of male employees • μ2 = average credit card payment of female employees • Male group • n1 = 10,
Example Female group
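Interval Estimation of μ
• Two dependent samples (paired samples)
• Interval: d̄ ± t(α/2, n − 1) · s_d/√n
• d = difference within each pair
• d̄ = mean of the differences
• s_d = standard deviation of the differences
• n = number of pairs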
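A sketch of the paired-sample interval, assuming the standard form d̄ ± t(α/2, n − 1) · s_d/√n. The scores are hypothetical. The t critical value is entered by hand from a table.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(before, after, t_crit):
    """Two-sided t-interval for the mean of the paired differences (after - before)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = mean(diffs)   # mean difference
    s_d = stdev(diffs)    # sample SD of the differences (divisor n - 1)
    margin = t_crit * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin

# Hypothetical pretest and posttest scores for five subjects.
pretest = [70, 65, 80, 75, 60]
posttest = [75, 70, 82, 80, 66]
T_CRIT = 2.132  # table value t(0.05, 4) for a 90% interval with 4 degrees of freedom

lower, upper = paired_ci(pretest, posttest, T_CRIT)
print(f"{lower:.3f} to {upper:.3f}")
```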
Example Find the 90% confidence interval of the difference between the following posttest scores and pretest scores
Point Estimation of Proportion
• The sample proportion is the number of items of interest divided by the sample size
• The population proportion is the number of items of interest divided by the population size
• The sample value is used as the point estimate
• If we survey a sample of 1000 people and 450 of them have used the Skytrain, then p = 450 / 1000 = 0.45
Interval Estimation of P
• Single population
• Two populations
Interval Estimation of P
• Single population
• Analogous to interval estimation of the mean of a single population, but requires a sufficiently large sample (np ≥ 5)
• Interval estimate of the population proportion: p ± z(α/2) · √(pq/n)
• n = sample size
• p = sample proportion
• pq = variance of the proportion (q = 1 − p)
• Assumes the sample proportion is approximately normally distributed
Example
A marketing company holds a campaign for a beverage product involving a taste test and a 10% discount. After one week, 450 individuals have tested the product, 50 of whom have bought it. Find the 95% confidence interval of the proportion of customers who bought the product.
• p = 50 / 450 = 0.11
• q = 1 − p = 1 − 0.11 = 0.89
• z(α/2) = z(0.025) = 1.96
Example
• 0.11 ± 1.96 × √(0.11 × 0.89 / 450) = 0.11 ± 0.03
• The 95% confidence interval is then between 0.08 and 0.14 (that is, 8% to 14%)
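The beverage example can be reproduced with a short sketch, assuming the normal-approximation interval p ± z(α/2) · √(pq/n). The function name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a population proportion."""
    p = successes / n
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * q / n)
    return p - margin, p + margin

# 50 buyers among 450 people who tested the product.
lower, upper = proportion_ci(50, 450)
print(f"{lower:.2f} to {upper:.2f}")  # 0.08 to 0.14
```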
Interval Estimation of P
• Two populations: difference of proportions
• Requires large sample sizes (n ≥ 30)
• Interval: (p1 − p2) ± z(α/2) · √(p1q1/n1 + p2q2/n2)
Example
From a sample of 98 students of the Faculty of IT, 48 students use smartphones. From a sample of 127 students of the Faculty of Art, 21 students use smartphones. Find the 95% confidence interval of the difference in proportions between the two populations.
• p_IT = 48 / 98 = 0.49
• p_Art = 21 / 127 = 0.165
Example
The 95% confidence interval of the difference in proportions between the two populations is then between 0.207 and 0.443
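A sketch of the smartphone example, assuming the interval (p1 − p2) ± z(α/2) · √(p1q1/n1 + p2q2/n2). The function name is hypothetical. With unrounded proportions the bounds come out near 0.206 and 0.443, which agrees with the slide's figures up to rounding of the intermediate values.

```python
from math import sqrt
from statistics import NormalDist

def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# 48 of 98 IT students versus 21 of 127 Art students.
lower, upper = proportion_diff_ci(48, 98, 21, 127)
print(f"{lower:.3f} to {upper:.3f}")
```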
Estimation of Variance
• Point estimation: the unbiased sample variance
• Interval estimation
  • Single population
  • Two populations
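Interval Estimation of σ²
• Single population
• Uses the chi-square distribution with ν = n − 1 degrees of freedom
• Interval: (n − 1)S² / χ²(α/2, ν) ≤ σ² ≤ (n − 1)S² / χ²(1 − α/2, ν)
• χ²(a, ν) = chi-square value from the table with upper-tail area a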
Example
From a sample of 15 tablets of a drug, the standard deviation of its chemical compound is 0.80. Find the 90% confidence interval of the variance of the tablet population.
• n = 15, S = 0.80, α = 0.10, α/2 = 0.05, ν = n − 1 = 14 (degrees of freedom)
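Example
• (n − 1)S² = 14 × 0.64 = 8.96
• Table values for upper-tail areas 0.05 and 0.95 with ν = 14: χ² = 23.685 and χ² = 6.571
• The 90% confidence interval of the variance of the drug's chemical compound is then between 0.378 and 1.364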
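The tablet example can be checked with a short sketch. The function name `variance_ci` is hypothetical. The chi-square critical values are entered by hand from a table for ν = 14, because the standard library has no chi-square quantile function.

```python
def variance_ci(n, s, chi2_upper, chi2_lower):
    """Chi-square interval for a population variance.

    chi2_upper is the table value with upper-tail area alpha/2,
    chi2_lower the table value with upper-tail area 1 - alpha/2,
    both with n - 1 degrees of freedom.
    """
    ss = (n - 1) * s**2  # (n - 1) * S^2
    return ss / chi2_upper, ss / chi2_lower

# Table values for 14 degrees of freedom and a 90% interval.
CHI2_005_14 = 23.685  # upper-tail area 0.05
CHI2_095_14 = 6.571   # upper-tail area 0.95

lower, upper = variance_ci(n=15, s=0.80, chi2_upper=CHI2_005_14, chi2_lower=CHI2_095_14)
print(f"{lower:.3f} to {upper:.3f}")  # 0.378 to 1.364
```

Dividing by the larger critical value gives the lower bound, so the interval is not symmetric around S² = 0.64.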
Interval Estimation of σ²
• Two populations: interval for the ratio of the two population variances
• Uses the F-distribution
Interval Estimation of σ²
• Confidence interval for the variance ratio σ1²/σ2²
• Degrees of freedom (ν1, ν2), where ν1 = n1 − 1 and ν2 = n2 − 1
Example
Two groups of bankers are compared in terms of revenue generation. The first group of 6 bankers increases revenue by 9.972% with SD 7.470. The second group of 9 bankers increases revenue by 2.098% with SD 10.834. Assuming normal distributions, find the 90% confidence interval of the variance ratio of the increased revenue.
• σ1² = variance of increased revenue of the first group
• σ2² = variance of increased revenue of the second group
Example The 90% confidence interval of the variance ratio of the increased revenue is between 0.149 and 2.65