Inferential Statistics: Parameter Estimation

Inferential Statistics
• Inferential statistics is a body of quantitative techniques that enable the scientist to make appropriate generalizations from limited observations. (Frank & Althoen, “Statistics: Concepts and applications”, 1994)
• Inferential statistics (also called statistical inference or statistical induction) describes systems of procedures for drawing conclusions from datasets that arise from systems affected by random variation, such as observational errors, random sampling, or random experimentation.

Inferential Statistics
• From sample data, inferential statistics are used for
  • Parameter estimation
  • Hypothesis testing
• Parametric statistics is a branch of statistics that assumes the data come from a type of probability distribution and makes inferences about the parameters of that distribution. (http://en.wikipedia.org/wiki/Parametric_statistics)
• Non-parametric statistics refers to statistics that do not assume the data or population have any characteristic structure or parameters. (http://en.wikipedia.org/wiki/Non-parametric_statistics)

Inferential Statistics
• Estimation
  • Estimate population means
  • Estimate population proportion
  • Estimate population variance
• Hypothesis testing
  • Testing population means
  • Testing categorical data / proportion
  • Testing population variances
  • Hypothesis about many population means
    • One-way ANOVA
    • Two-way ANOVA

Estimation
• To infer a parameter of a population from a statistic of a sample
  • From x̄ (sample mean) to μ (population mean)
  • From the sample proportion to the population proportion
  • From S² (sample variance) to σ² (population variance)
• Point estimation
• Interval estimation

Estimator
http://en.wikipedia.org/wiki/Estimator
• An estimator is a rule for calculating an estimate of a given quantity based on observed data
• An estimator should
  • Be unbiased
  • Be consistent (converge to the true value as the sample size increases)
  • Have minimum variance (be efficient)

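Point Estimation
• “A point estimator is a single-valued statistic that approximates the value of a population parameter.”
• Usually the corresponding (unbiased) sample statistic is used as the estimate
• Point estimation of μ: the sample mean, x̄ = Σxi / n
• Point estimation of population proportion p: the sample proportion
• Point estimation of σ²: the unbiased sample variance, S² = Σ(xi - x̄)² / (n - 1)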
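The three point estimates can be sketched with Python's standard library. The sample below is hypothetical (invented for illustration, not taken from the slides), as is the "value of at least 5" criterion used for the proportion. Note that `statistics.variance` divides by n - 1 (the unbiased estimate), while `statistics.pvariance` divides by n.

```python
from statistics import mean, variance, pvariance

# Hypothetical sample, invented only for illustration (not from the slides).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mu_hat = mean(sample)              # point estimate of the population mean
sigma2_hat = variance(sample)      # unbiased estimate: divides by n - 1
sigma2_biased = pvariance(sample)  # divides by n, shown only for contrast

# Point estimate of a proportion: share of observations meeting a criterion.
# The criterion (value >= 5) is an arbitrary assumption for this sketch.
p_hat = sum(1 for x in sample if x >= 5) / len(sample)

print(mu_hat, sigma2_hat, sigma2_biased, p_hat)
```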

Interval Estimation
• “A confidence interval is a range of values that is expected to include the population parameter.”
• Involves calculating a ± margin around the sample statistic; the margin depends on the confidence level
• The point estimate at the center of the interval is calculated in the usual way
• Interval estimation of mean μ
• Interval estimation of population proportion p
• Interval estimation of variance σ²

Interval Estimation of Mean
• Single population
• Two independent samples
  • Known population variances, large sample
  • Unknown population variances, small sample
• Two dependent samples

Interval Estimation of μ
• Single population
• Confidence level 95% (α = 0.05)
• Confidence interval: x̄ ± z(α/2) · σ / √n
  • x̄ = mean of sample
  • σ = SD of population
  • n = sample size
  • z = z-score from the table at the given α (in this case α/2)

Common Confidence Level
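The z-scores for commonly used confidence levels can also be computed instead of read from a table. The sketch below uses `statistics.NormalDist` from the standard library; the helper name `z_critical` is an assumption introduced here. Printed tables often round the 99% value to 2.575, while the computed value rounds to 2.576.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided critical value z(alpha/2) for a given confidence level."""
    alpha = 1.0 - confidence
    # Upper-tail area alpha/2 corresponds to the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```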

Example
A water supplier wants to estimate the water volume in its 1-gallon containers. The manager reports that the standard deviation of the container volume is 0.02 gallon. From a sample of 50 containers, the mean volume is 0.995 gallon. Assuming a normal distribution and a 99% confidence level, find the confidence interval of the mean container volume.

Example
• x̄ = 0.995, σ = 0.02, n = 50
• Confidence level 99% (α = 0.01)
• z(0.01/2) = z(0.005) = 2.575
• 0.995 ± 2.575 · 0.02 / √50 = 0.995 ± 0.007
• The confidence interval is between 0.988 and 1.002
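The water-container calculation can be reproduced with a few lines of Python. This is a minimal sketch of the usual known-σ interval, x̄ ± z · σ/√n; the function name is an assumption, and the z-score is passed in as read from the table (2.575).

```python
from math import sqrt

def mean_ci_known_sigma(x_bar, sigma, n, z):
    """Confidence interval for a mean when the population SD is known."""
    margin = z * sigma / sqrt(n)  # half-width of the interval
    return x_bar - margin, x_bar + margin

# Values from the water-container example; z = 2.575 is the table value for 99%.
low, high = mean_ci_known_sigma(x_bar=0.995, sigma=0.02, n=50, z=2.575)
print(f"{low:.3f} to {high:.3f}")  # 0.988 to 1.002
```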

Interval Estimation of μ
• Two independent samples
• Known population variances, large sample (n ≥ 30)
• Estimate the difference between the two population means, μ1 - μ2
• Confidence interval: (x̄1 - x̄2) ± z(α/2) · √(σ1²/n1 + σ2²/n2)
  • x̄1 and x̄2 = means of samples 1 and 2
  • σ1 and σ2 = SDs of populations 1 and 2
  • n1 and n2 = sizes of samples 1 and 2
  • z = z-score from table
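For two large independent samples with known population SDs, the usual interval is (x̄1 - x̄2) ± z · √(σ1²/n1 + σ2²/n2). The sketch below assumes that formula; the function name and the numbers in the call are hypothetical and do not come from the slides.

```python
from math import sqrt

def diff_means_ci_known_sigma(x1, x2, sigma1, sigma2, n1, n2, z):
    """CI for mu1 - mu2 with known population SDs (large samples)."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    margin = z * standard_error
    diff = x1 - x2
    return diff - margin, diff + margin

# Hypothetical inputs, for illustration only; z = 1.96 corresponds to 95%.
low, high = diff_means_ci_known_sigma(
    x1=10.0, x2=8.0, sigma1=3.0, sigma2=4.0, n1=36, n2=64, z=1.96
)
print(f"{low:.3f} to {high:.3f}")
```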

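Interval Estimation of μ
• Two independent samples
• Unknown, unequal population variances, small sample (n < 30)
• Confidence interval: (x̄1 - x̄2) ± t(α/2, ν) · √(S1²/n1 + S2²/n2)
  • t = t-score from the table at the given α and degrees of freedom ν (nu)
  • ν = (S1²/n1 + S2²/n2)² / [ (S1²/n1)²/(n1 - 1) + (S2²/n2)²/(n2 - 1) ]
  • S = SD of sample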
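A sketch of the unequal-variance case follows. It assumes the Welch-Satterthwaite approximation for the degrees of freedom, which is the common choice for this setting but is not confirmed by the surviving slide text. The function names are hypothetical, and the t-score is passed in as read from a t-table.

```python
from math import sqrt

def welch_df(s1, s2, n1, n2):
    """Approximate degrees of freedom (Welch-Satterthwaite; an assumption here)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def diff_means_ci_welch(x1, x2, s1, s2, n1, n2, t_crit):
    """CI for mu1 - mu2 with unknown, unequal variances.

    t_crit is the table value t(alpha/2, nu), looked up by the caller.
    """
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin
```

In practice `welch_df` is evaluated first, its result is rounded down to a whole number, and that value is used to look up `t_crit` in the table.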

Example
Estimate the 95% confidence interval of the mean difference between the credit card payments of 10 male employees and 12 female employees from the data collected in the table below. Assume normally distributed populations with unequal variances.

Example
• μ1 = average credit card payment of male employees
• μ2 = average credit card payment of female employees
• Male group
  • n1 = 10,

Example Female group

Example

t-score table

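Interval Estimation of μ
• Two dependent samples (paired samples)
• Confidence interval: d̄ ± t(α/2, n - 1) · s_d / √n
  • d = difference within each pair
  • d̄ = average difference
  • s_d = standard deviation of the differences
  • n = number of pairs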
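For paired samples the usual interval is d̄ ± t · s_d/√n, with n - 1 degrees of freedom. The sketch below assumes that formula; the function name and the scores are hypothetical (invented for illustration, not the data of the example that follows), and the t-score is passed in as read from a t-table.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(after, before, t_crit):
    """CI for the mean of paired differences (after - before)."""
    deltas = [a - b for a, b in zip(after, before)]
    d_bar = mean(deltas)   # average difference
    s_d = stdev(deltas)    # sample SD of the differences (n - 1 divisor)
    margin = t_crit * s_d / sqrt(len(deltas))
    return d_bar - margin, d_bar + margin

# Hypothetical scores for 5 pairs; t(0.05, 4) = 2.132 gives a 90% interval.
posttest = [78, 85, 69, 90, 74]
pretest = [72, 80, 70, 84, 70]
low, high = paired_ci(posttest, pretest, t_crit=2.132)
print(f"{low:.2f} to {high:.2f}")
```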

Example Find the 90% confidence interval of the difference between the following posttest scores and pretest scores

Example

Point Estimation of Proportion
• Sample proportion is the number of cases of interest divided by the sample size
• Population proportion is the number of cases of interest divided by the population size
• The sample proportion is used as the point estimate of the population proportion
• If we study a sample of 1000 persons, of whom 450 have used the Skytrain, then p = 450 / 1000 = 0.45

Interval Estimation of P
• Single population
• Two populations

Interval Estimation of P
• Single population
• Analogous to interval estimation of the mean of a single population, but requires the sample to be sufficiently large: np ≥ 5 (and likewise nq ≥ 5)
• Interval estimation of population proportion: p ± z(α/2) · √(pq / n)
  • n = size of sample
  • p = proportion in sample
  • pq = proportion variance (q = 1 - p)
• Assume normal distribution of population

Example
A marketing company holds a campaign for a beverage product involving a taste test and a 10% discount. After one week, 450 individuals have tested the product, 50 of whom have bought it. Find the 95% confidence interval of the proportion of customers who bought the product.
• p = 50 / 450 = 0.11
• q = 1 - p = 1 - 0.11 = 0.89
• z(α/2) = z(0.025) = 1.96

Example
• 0.11 ± 1.96 · √(0.11 · 0.89 / 450) = 0.11 ± 0.03
• The 95% confidence interval is then between 0.08 and 0.14 (that is, between 8% and 14%)
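The taste-test calculation can be checked with a short sketch of the usual large-sample interval, p ± z · √(pq/n). The function name is an assumption. The result, about 0.082 to 0.140, rounds to proportions of 0.08 and 0.14, or roughly 8% to 14%.

```python
from math import sqrt

def proportion_ci(successes, n, z):
    """Large-sample CI for a population proportion."""
    p = successes / n
    q = 1 - p
    margin = z * sqrt(p * q / n)
    return p - margin, p + margin

# 50 buyers out of 450 testers; z = 1.96 for 95% confidence.
low, high = proportion_ci(successes=50, n=450, z=1.96)
print(f"{low:.3f} to {high:.3f}")  # 0.082 to 0.140
```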

Interval Estimation of P
• Two populations: difference of proportions
• Requires large sample sizes (n ≥ 30)
• Confidence interval: (p1 - p2) ± z(α/2) · √(p1q1/n1 + p2q2/n2)

Example
From a sample of 98 students of the faculty of IT, 48 students use smartphones. From a sample of 127 students of the faculty of art, 21 students use smartphones. Find the 95% confidence interval of the difference in proportion between the two populations.
• p_IT = 48 / 98 = 0.49
• p_art = 21 / 127 = 0.165

Example
• (0.49 - 0.165) ± 1.96 · √(0.49 · 0.51 / 98 + 0.165 · 0.835 / 127) = 0.325 ± 0.118
• The 95% confidence interval of the difference in proportion between the two populations is then between 0.207 and 0.443
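The smartphone example can be reproduced with a sketch of the usual large-sample interval, (p1 - p2) ± z · √(p1q1/n1 + p2q2/n2). The function name is an assumption. Because the code keeps the unrounded proportions, its lower limit comes out near 0.206 rather than the 0.207 obtained from the rounded values 0.49 and 0.165.

```python
from math import sqrt

def diff_proportions_ci(x1, n1, x2, n2, z):
    """Large-sample CI for the difference of two population proportions."""
    p1, p2 = x1 / n1, x2 / n2
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z * standard_error
    diff = p1 - p2
    return diff - margin, diff + margin

# IT faculty: 48 of 98; art faculty: 21 of 127; z = 1.96 for 95% confidence.
low, high = diff_proportions_ci(x1=48, n1=98, x2=21, n2=127, z=1.96)
print(f"{low:.3f} to {high:.3f}")
```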

Estimation of Variance
• Point estimation: the unbiased sample variance
• Interval estimation
  • Single population
  • Two populations

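Interval Estimation of σ²
• Single population
• Use the chi-square distribution
• Confidence interval: (n - 1)S² / χ²(α/2, ν) ≤ σ² ≤ (n - 1)S² / χ²(1 - α/2, ν), with ν = n - 1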

Example
From a sample of 15 tablets of a drug, the standard deviation of its chemical compound is 0.80. Find the 90% confidence interval of the variance of the tablet population.
• n = 15, S = 0.80, α = 0.10, α/2 = 0.05, ν = n - 1 = 14 (degrees of freedom)

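Example
• χ²(0.05, 14) = 23.685 and χ²(0.95, 14) = 6.571
• (n - 1)S² = 14 · 0.64 = 8.96
• 8.96 / 23.685 = 0.378 and 8.96 / 6.571 = 1.364
• The 90% confidence interval of the variance of the drug's chemical compound is then between 0.378 and 1.364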
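The tablet example can be checked with a sketch of the chi-square interval, (n - 1)S²/χ²(α/2) to (n - 1)S²/χ²(1 - α/2). The function name is an assumption. The two critical values for 14 degrees of freedom are table values supplied by the caller, since the standard library has no chi-square quantile function.

```python
def variance_ci(s, n, chi2_upper_tail, chi2_lower_tail):
    """CI for a population variance from a sample SD.

    chi2_upper_tail = chi-square value with area alpha/2 to its right,
    chi2_lower_tail = chi-square value with area 1 - alpha/2 to its right,
    both read from a table at n - 1 degrees of freedom.
    """
    sum_of_squares = (n - 1) * s**2
    return sum_of_squares / chi2_upper_tail, sum_of_squares / chi2_lower_tail

# n = 15, S = 0.80; table values for 14 d.f.: 23.685 (0.05) and 6.571 (0.95).
low, high = variance_ci(s=0.80, n=15, chi2_upper_tail=23.685, chi2_lower_tail=6.571)
print(f"{low:.3f} to {high:.3f}")  # 0.378 to 1.364
```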

Interval Estimation of σ²
• Two populations: interval of the variance ratio of the two populations
• Use the F-distribution

Interval Estimation of σ²
• Confidence interval: (S1²/S2²) / F(α/2; ν1, ν2) ≤ σ1²/σ2² ≤ (S1²/S2²) · F(α/2; ν2, ν1)
• Degrees of freedom (ν1, ν2), where ν1 = n1 - 1, ν2 = n2 - 1
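A sketch of the variance-ratio interval follows, assuming the usual form: the ratio S1²/S2² is divided by F(α/2; ν1, ν2) for the lower limit and multiplied by F(α/2; ν2, ν1) for the upper limit. The function name and the sample SDs in the call are hypothetical. The F values are supplied by the caller from an F-table, since the standard library has no F quantile function.

```python
def variance_ratio_ci(s1, s2, f_12, f_21):
    """CI for sigma1^2 / sigma2^2 from two sample SDs.

    f_12 = F(alpha/2; nu1, nu2) and f_21 = F(alpha/2; nu2, nu1),
    both read from an F-table by the caller.
    """
    ratio = s1**2 / s2**2
    return ratio / f_12, ratio * f_21

# Hypothetical SDs for samples of size 6 and 9 (nu1 = 5, nu2 = 8), 90% level:
# table values F(0.05; 5, 8) = 3.69 and F(0.05; 8, 5) = 4.82.
low, high = variance_ratio_ci(s1=2.0, s2=3.0, f_12=3.69, f_21=4.82)
print(f"{low:.3f} to {high:.3f}")
```

Note that the two F values swap the order of the degrees of freedom, so two separate table lookups are needed.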

Example
Study two groups of bankers in terms of revenue generation. The first group of 6 bankers increased revenue by 9.972% with SD 7.470. The second group of 9 bankers increased revenue by 2.098% with SD 10.834. Assuming normal distributions, find the 90% confidence interval of the variance ratio of the increased revenue.
• σ1² = variance of increased revenue of the first group
• σ2² = variance of increased revenue of the second group

Example The 90% confidence interval of the variance ratio of the increased revenue is between 0.149 and 2.65
