Minimal sufficient statistic

Minimal sufficient statistic Department of Computer and Information Science (IDA) Linköpings universitet, Sweden

A statistic T defines a partition of the sample space of (X1, … , Xn) into classes satisfying T(x1, … , xn) = t for different values of t. If such a partition puts the samples x = (x1, … , xn) and y = (y1, … , yn) into the same class if and only if the likelihood ratio $L(\theta; x) / L(\theta; y)$ does not depend on $\theta$, then T is minimal sufficient for $\theta$.
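As a hedged illustration that is not part of the original slides, the likelihood-ratio criterion can be checked numerically for an i.i.d. Bernoulli($\theta$) sample, where $T(x) = \sum x_i$. The function names and the data below are hypothetical, and the check is done on a finite grid of $\theta$ values only.

```python
import math

def bernoulli_likelihood(theta, sample):
    """Likelihood L(theta; x) of an i.i.d. Bernoulli(theta) sample of 0/1 values."""
    s = sum(sample)
    return theta ** s * (1.0 - theta) ** (len(sample) - s)

def ratio_is_free_of_theta(x, y, thetas=(0.1, 0.3, 0.5, 0.7, 0.9), tol=1e-9):
    """Check on a grid of theta values whether L(theta; x) / L(theta; y) is constant."""
    ratios = [bernoulli_likelihood(t, x) / bernoulli_likelihood(t, y) for t in thetas]
    return all(math.isclose(r, ratios[0], rel_tol=tol) for r in ratios)

x = [1, 0, 1, 0]   # sum = 2
y = [0, 1, 1, 0]   # sum = 2, same class as x under T = sum
z = [1, 1, 1, 0]   # sum = 3, different class

print(ratio_is_free_of_theta(x, y))  # True: the ratio is 1 for every theta
print(ratio_is_free_of_theta(x, z))  # False: the ratio is (1 - theta) / theta
```

Samples with the same sum give a ratio that is constant in $\theta$, while samples with different sums do not, which is consistent with $\sum x_i$ being minimal sufficient in this model.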

Example

Rao-Blackwell theorem

The Exponential family of distributions
• A random variable X belongs to the (k-parameter) exponential family of probability distributions if the p.d.f. of X can be written
$$f(x; \theta) = \exp\Big\{ \sum_{j=1}^{k} A_j(\theta) B_j(x) + C(x) + D(\theta) \Big\}$$
• What about
• $N(\mu, \sigma^2)$?
• $Po(\lambda)$?
• $U(0, \theta)$?

For a random sample x = (x1, … , xn) from a distribution belonging to the exponential family
$$L(\theta; x) = \prod_{i=1}^{n} f(x_i; \theta) = \exp\Big\{ \sum_{j=1}^{k} A_j(\theta) \sum_{i=1}^{n} B_j(x_i) + \sum_{i=1}^{n} C(x_i) + n D(\theta) \Big\}$$

Exponential family written on the canonical form:
$$f(x; \phi) = \exp\Big\{ \sum_{j=1}^{k} \phi_j B_j(x) + C(x) + D(\phi) \Big\}, \qquad \phi_j = A_j(\theta)$$
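One way to approach the question about $N(\mu, \sigma^2)$ is sketched below. It assumes the exponential family form $\exp\{\sum_j A_j(\theta) B_j(x) + C(x) + D(\theta)\}$ and one possible choice of the functions $A_j$, $B_j$, $C$, $D$ for the normal distribution. The function names are hypothetical and the check is numerical only.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Usual N(mu, sigma2) density."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def normal_pdf_expfam(x, mu, sigma2):
    """The same density written as exp{A1*B1 + A2*B2 + C + D} (assumed decomposition)."""
    A = (mu / sigma2, -1.0 / (2.0 * sigma2))    # A_1(theta), A_2(theta)
    B = (x, x * x)                              # B_1(x), B_2(x)
    C = 0.0                                     # C(x)
    D = -mu ** 2 / (2.0 * sigma2) - 0.5 * math.log(2.0 * math.pi * sigma2)  # D(theta)
    return math.exp(sum(a * b for a, b in zip(A, B)) + C + D)

def sufficient_statistic(sample):
    """(sum of B_1(x_i), sum of B_2(x_i)) = (sum x_i, sum x_i^2)."""
    return (sum(sample), sum(v * v for v in sample))

for point in (-1.0, 0.0, 2.5):
    print(normal_pdf(point, 1.0, 4.0), normal_pdf_expfam(point, 1.0, 4.0))
print(sufficient_statistic([1, 2, 3]))
```

Under the same assumed form, $Po(\lambda)$ fits with $k = 1$, $A_1 = \ln\lambda$, $B_1(x) = x$, $C(x) = -\ln x!$ and $D = -\lambda$. $U(0, \theta)$ does not fit, because its support depends on $\theta$.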

Completeness
• Let x1, … , xn be a random sample from a distribution with p.d.f. $f(x; \theta)$ and T = T(x1, … , xn) a statistic
• Then T is complete for $\theta$ if, whenever $h_T(T)$ is a function of T such that $E[h_T(T)] = 0$ for all values of $\theta$, then $\Pr(h_T(T) \equiv 0) = 1$
• Important lemmas from this definition:
• Lemma 2.6: If T is a complete sufficient statistic for $\theta$ and $h(T)$ is a function of T such that $E[h(T)] = \theta$, then h is unique (there is at most one such function)
• Lemma 2.7: If there exists a Minimum Variance Unbiased Estimator (MVUE) for $\theta$ and $h(T)$ is an unbiased estimator for $\theta$, where T is a complete minimal sufficient statistic for $\theta$, then $h(T)$ is MVUE
• Lemma 2.8: If a sample is from a distribution belonging to the exponential family, then $\big(\sum_{i=1}^{n} B_1(x_i), \ldots, \sum_{i=1}^{n} B_k(x_i)\big)$ is complete and minimal sufficient for $\theta_1, \ldots, \theta_k$

Maximum-Likelihood estimation
Consider as usual a random sample x = x1, … , xn from a distribution with p.d.f. $f(x; \theta)$ (and c.d.f. $F(x; \theta)$).
The maximum likelihood point estimator of $\theta$ is the value of $\theta$ that maximizes $L(\theta; x)$ or, equivalently, maximizes $l(\theta; x) = \ln L(\theta; x)$.
Useful notation: $\hat{\theta}_{ML} = \arg\max_{\theta} L(\theta; x)$
With a k-dimensional parameter: $\hat{\boldsymbol{\theta}}_{ML} = \arg\max_{\boldsymbol{\theta}} L(\boldsymbol{\theta}; x)$, where $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_k)$

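Complete sample case: If all sample values are explicitly known, then
$$L(\theta; x) = \prod_{i=1}^{n} f(x_i; \theta)$$
Censored data case: If some (say $n_c$) of the sample values are censored, e.g. $x_i < k_1$ or $x_i > k_2$, then
$$L(\theta; x) = \Big[ \prod_{i=1}^{n - n_c} f(x_i; \theta) \Big] \cdot \big[ \Pr(X < k_1; \theta) \big]^{n_{c,l}} \cdot \big[ \Pr(X > k_2; \theta) \big]^{n_{c,u}}$$
where the product runs over the uncensored values, $n_{c,l}$ and $n_{c,u}$ are the numbers of values censored below $k_1$ and above $k_2$, and $n_c = n_{c,l} + n_{c,u}$.

When the sample comes from a continuous distribution the censored data case can be written
$$L(\theta; x) = \Big[ \prod_{i=1}^{n - n_c} f(x_i; \theta) \Big] \cdot \big[ F(k_1; \theta) \big]^{n_{c,l}} \cdot \big[ 1 - F(k_2; \theta) \big]^{n_{c,u}}$$
In the case the distribution is discrete the use of F is also possible: If $k_1$ and $k_2$ are values that can be attained by the random variables then we may write
$$L(\theta; x) = \Big[ \prod_{i=1}^{n - n_c} f(x_i; \theta) \Big] \cdot \big[ F(k_1; \theta) - f(k_1; \theta) \big]^{n_{c,l}} \cdot \big[ 1 - F(k_2; \theta) \big]^{n_{c,u}}$$
where $f(k_1; \theta) = \Pr(X = k_1; \theta)$, so that $F(k_1; \theta) - f(k_1; \theta) = \Pr(X < k_1; \theta)$.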
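A minimal sketch of a censored-data likelihood, assuming an exponential distribution with rate $\theta$ and right-censoring only at a hypothetical limit `k2`. The model, data and function names are illustrative choices, not taken from the slides. The log-likelihood is maximized by a golden-section search and compared with the closed-form MLE available for this particular model.

```python
import math

def censored_exp_loglik(rate, observed, n_censored, k2):
    """Log-likelihood for Exp(rate) data with n_censored values known only to exceed k2."""
    # Fully observed values contribute log f(x; rate) = log(rate) - rate * x.
    ll = sum(math.log(rate) - rate * x for x in observed)
    # Each right-censored value contributes log(1 - F(k2; rate)) = -rate * k2.
    ll += n_censored * (-rate * k2)
    return ll

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d      # the maximum lies in [a, d]
        else:
            a = c      # the maximum lies in [c, b]
    return (a + b) / 2.0

observed = [0.5, 1.2, 0.3, 2.0, 0.9]   # made-up uncensored values
n_censored, k2 = 2, 2.5                # two values known only to exceed 2.5

numeric = golden_section_max(
    lambda r: censored_exp_loglik(r, observed, n_censored, k2), 1e-6, 10.0)
closed_form = len(observed) / (sum(observed) + n_censored * k2)
print(numeric, closed_form)
```

For most censored models no closed form exists, and a numerical search of this kind (or a library optimizer) is the only practical route.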

Too complicated to find an analytical solution. Solve by a numerical routine!

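Exponential family distributions: Use the canonical form (natural parameterization):
$$f(x; \phi) = \exp\Big\{ \sum_{j=1}^{k} \phi_j B_j(x) + C(x) + D(\phi) \Big\}$$
Let
$$T_j = \sum_{i=1}^{n} B_j(X_i), \qquad t_j = \sum_{i=1}^{n} B_j(x_i), \qquad j = 1, \ldots, k$$
Then the maximum likelihood estimators (MLEs) of $\phi_1, \ldots, \phi_k$ are found by solving the system of equations
$$E_{\phi}[T_j] = t_j, \qquad j = 1, \ldots, k$$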

Computational aspects
• When the MLEs can be found by evaluating $\partial l(\theta; x) / \partial \theta = 0$, numerical routines for solving the generic equation $g(\theta) = 0$ can be used:
• Newton-Raphson method
• Fisher’s method of scoring, which makes use of the fact that under regularity conditions
$$E\Big[ \frac{\partial^2 l}{\partial \theta_i \, \partial \theta_j} \Big] = -E\Big[ \frac{\partial l}{\partial \theta_i} \cdot \frac{\partial l}{\partial \theta_j} \Big]$$
• This is the multidimensional analogue of Lemma 2.1 (see page 17)
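The two routines can be sketched for a one-parameter problem. The example below assumes a zero-truncated Poisson model, chosen here because its score equation has no closed-form solution. The model, data and function names are illustrative, not from the slides. Both methods iterate $\theta \leftarrow \theta + g(\theta) / \text{information}(\theta)$: Newton-Raphson uses the observed information $-g'(\theta)$, and Fisher scoring uses its expectation.

```python
import math

def score(lam, total, n):
    """Score g(lam) = dl/dlam for a zero-truncated Poisson sample."""
    return total / lam - n / (1.0 - math.exp(-lam))

def observed_information(lam, total, n):
    """Minus the second derivative of the log-likelihood (Newton-Raphson)."""
    q = 1.0 - math.exp(-lam)
    return total / lam ** 2 - n * math.exp(-lam) / q ** 2

def expected_information(lam, n):
    """Fisher information: observed information with sum(x) replaced by n * E[X]."""
    q = 1.0 - math.exp(-lam)
    return n / (lam * q) - n * math.exp(-lam) / q ** 2

def solve(g, information, start, tol=1e-12, max_iter=100):
    """Iterate theta <- theta + g(theta) / information(theta) until the step is tiny."""
    theta = start
    for _ in range(max_iter):
        step = g(theta) / information(theta)
        theta += step
        if abs(step) < tol:
            return theta
    raise RuntimeError("iteration did not converge")

sample = [1, 2, 1, 3, 2, 4, 1, 2]     # made-up counts, all >= 1
total, n = sum(sample), len(sample)
start = total / n                      # sample mean as starting value

lam_newton = solve(lambda t: score(t, total, n),
                   lambda t: observed_information(t, total, n), start)
lam_scoring = solve(lambda t: score(t, total, n),
                    lambda t: expected_information(t, n), start)
print(lam_newton, lam_scoring)
```

Both runs should agree and satisfy $\hat\lambda / (1 - e^{-\hat\lambda}) = \bar{x}$. The sketch does not guard against a step that leaves the parameter space, which a production routine would need.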

When the MLEs cannot be found the above way, other numerical routines must be used:
• Simplex method
• EM-algorithm
• For a description of the numerical routines, see the textbook.
• Maximum likelihood estimation comes into natural use not for handling the standard case, i.e. a complete random sample from a distribution within the exponential family, but for finding estimators in more non-standard and complex situations.

Properties of MLEs
Invariance: If $\hat{\theta}$ is the MLE of $\theta$ and $g$ is a function of $\theta$, then $g(\hat{\theta})$ is the MLE of $g(\theta)$.
Consistency: Under some weak regularity conditions all MLEs are consistent.

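Efficiency: Under the usual regularity conditions, $\hat{\theta}_{ML}$ is asymptotically distributed as $N(\theta, I_{\theta}^{-1})$, where $I_{\theta}$ is the Fisher information of the sample (asymptotically efficient and normally distributed).
Sufficiency: A unique MLE is a function of the minimal sufficient statistic.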

Invariance property

i.e. the two MLEs are asymptotically uncorrelated (and, because they are asymptotically normally distributed, independent)

Modifications and extensions
Ancillarity and conditional sufficiency:

Profile likelihood: This concept has its main use in cases where $\theta_1$ contains the parameters of “interest” and $\theta_2$ contains nuisance parameters. The same ML point estimator for $\theta_1$ is obtained by maximizing the profile likelihood as by maximizing the full likelihood function.
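A small sketch of the idea, assuming the usual definition $L_p(\theta_1; x) = \max_{\theta_2} L(\theta_1, \theta_2; x)$ and a normal sample with $\theta_1 = \mu$ (interest) and $\theta_2 = \sigma^2$ (nuisance). The data, grid and function names are illustrative only. For fixed $\mu$ the maximizing $\sigma^2$ is the mean squared deviation from $\mu$, so the profile log-likelihood has a closed form.

```python
import math

def profile_loglik_mu(mu, sample):
    """Profile log-likelihood of mu for a normal sample, with sigma^2 maximized out."""
    n = len(sample)
    # For fixed mu, the likelihood is maximized over sigma^2 at the mean squared deviation.
    sigma2_hat = sum((x - mu) ** 2 for x in sample) / n
    # Plugging sigma2_hat into the full log-likelihood gives this expression.
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2_hat) + 1.0)

sample = [4.1, 5.3, 3.8, 6.0, 4.8]             # made-up data
grid = [3.0 + 0.001 * i for i in range(4001)]  # candidate values of mu in [3, 7]

mu_profile = max(grid, key=lambda m: profile_loglik_mu(m, sample))
mu_full = sum(sample) / len(sample)            # full-likelihood MLE of mu
print(mu_profile, mu_full)
```

Up to the grid resolution, the maximizer of the profile likelihood coincides with the sample mean, which is the MLE of $\mu$ from the full likelihood.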

Marginal and conditional likelihood: Again, these concepts have their main use in cases where $\theta_1$ contains the parameters of “interest” and $\theta_2$ contains nuisance parameters.

Penalized likelihood: MLEs can be derived subject to some criteria of smoothness. In particular this is applicable when the parameter is no longer a single value (one- or multidimensional), but a function such as an unknown density function or a regression curve. The penalized log-likelihood function is written
$$l_P(\theta; x) = l(\theta; x) - \lambda R(\theta)$$
where $R(\theta)$ is a roughness penalty and $\lambda \ge 0$ is a smoothing parameter.

Method of moments estimation (MM)

The method of moments point estimator of $\theta = (\theta_1, \ldots, \theta_k)$ is obtained by solving for $\theta_1, \ldots, \theta_k$ the system of equations
$$E[X^r] = \frac{1}{n} \sum_{i=1}^{n} x_i^r, \qquad r = 1, \ldots, k$$
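As a hedged example, take a Gamma distribution with shape $\alpha$ and rate $\beta$ (a model chosen here for illustration, not taken from the slides). Its first two moments are $E[X] = \alpha/\beta$ and $E[X^2] = \alpha(\alpha+1)/\beta^2$, and equating them to the sample moments gives a system that can be solved in closed form. The function name and data are hypothetical.

```python
def gamma_method_of_moments(sample):
    """Solve E[X] = m1 and E[X^2] = m2 for the shape and rate of a Gamma distribution."""
    n = len(sample)
    m1 = sum(sample) / n                   # first sample moment
    m2 = sum(x * x for x in sample) / n    # second sample moment
    variance = m2 - m1 ** 2                # equals alpha / beta^2 under the model
    shape = m1 ** 2 / variance
    rate = m1 / variance
    return shape, rate

sample = [1.2, 0.7, 2.5, 1.9, 0.4, 3.1]    # made-up positive data
shape, rate = gamma_method_of_moments(sample)
print(shape, rate)
```

The fitted parameters reproduce the first two sample moments exactly, which is what defines a method of moments estimate. A constant sample would give zero variance and is not handled by this sketch.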

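Method of Least Squares (LS)
First principles: Assume a sample x where the random variable $X_i$ can be written
$$X_i = m_i(\theta) + \varepsilon_i$$
with errors $\varepsilon_i$ that have zero mean and constant variance.
The least-squares estimator of $\theta$ is the value of $\theta$ that minimizes $\sum_{i=1}^{n} (x_i - m_i(\theta))^2$, i.e.
$$\hat{\theta}_{LS} = \arg\min_{\theta} \sum_{i=1}^{n} \big( x_i - m_i(\theta) \big)^2$$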

A more general approach
Assume the sample can be written (x, z), where $x_i$ represents the random variable of interest (endogenous variable) and $z_i$ represents either an auxiliary random variable (exogenous) or a given constant for sample point i.
The least squares estimator of $\theta$ is then
$$\hat{\theta}_{LS} = \arg\min_{\theta} \sum_{i=1}^{n} \big( x_i - m(z_i; \theta) \big)^2$$
where $m(z_i; \theta)$ denotes the mean of $X_i$ given $z_i$.

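Special cases
The ordinary linear regression model: $X_i = \beta_0 + \beta_1 z_{i1} + \ldots + \beta_p z_{ip} + \varepsilon_i$ with $\mathrm{Var}(\varepsilon_i) = \sigma^2$ for all i
The heteroscedastic regression model: the same mean structure, but with $\mathrm{Var}(\varepsilon_i) = \sigma_i^2$ varying between sample points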
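A minimal sketch of least squares for a straight-line model with a single regressor, $X_i = \beta_0 + \beta_1 z_i + \varepsilon_i$. The optional weights (hypothetically $w_i = 1/\sigma_i^2$) turn it into weighted least squares, which is one common way of handling the heteroscedastic case and is an assumption here rather than the slides' method. The function name and data are illustrative.

```python
def fit_line(z, x, weights=None):
    """Return (intercept, slope) minimizing sum of w_i * (x_i - b0 - b1 * z_i)^2."""
    if weights is None:
        weights = [1.0] * len(z)           # ordinary least squares
    w_sum = sum(weights)
    z_bar = sum(w * zi for w, zi in zip(weights, z)) / w_sum
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    s_zx = sum(w * (zi - z_bar) * (xi - x_bar) for w, zi, xi in zip(weights, z, x))
    s_zz = sum(w * (zi - z_bar) ** 2 for w, zi in zip(weights, z))
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    return intercept, slope

z = [0.0, 1.0, 2.0, 3.0]
x = [1.1, 2.9, 5.2, 6.8]                   # made-up responses near 1 + 2 * z

print(fit_line(z, x))                                  # ordinary least squares
print(fit_line(z, x, weights=[4.0, 1.0, 1.0, 0.25]))   # weights 1 / sigma_i^2
```

With equal weights the formulas reduce to the familiar ordinary least-squares slope and intercept.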

The first-order auto-regressive model:
$$X_t = \rho X_{t-1} + \varepsilon_t, \qquad t = 2, \ldots, n$$

The conditional least-squares estimator of $\rho$ (given $x_1$) is
$$\hat{\rho}_{CLS} = \arg\min_{\rho} \sum_{t=2}^{n} (x_t - \rho x_{t-1})^2 = \frac{\sum_{t=2}^{n} x_t x_{t-1}}{\sum_{t=2}^{n} x_{t-1}^2}$$
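A short sketch of conditional least squares, assuming the zero-mean model $X_t = \rho X_{t-1} + \varepsilon_t$ and conditioning on the first observation. The symbol $\rho$, the function name and the data are illustrative assumptions. Minimizing $\sum_{t \ge 2} (x_t - \rho x_{t-1})^2$ over $\rho$ gives a ratio of two sums.

```python
def ar1_conditional_least_squares(series):
    """Estimate rho in X_t = rho * X_{t-1} + e_t by least squares, given the first value."""
    pairs = list(zip(series[1:], series[:-1]))       # (x_t, x_{t-1}) for t = 2..n
    numerator = sum(cur * prev for cur, prev in pairs)
    denominator = sum(prev * prev for _, prev in pairs)
    return numerator / denominator

noise_free = [8.0, 4.0, 2.0, 1.0, 0.5]       # exactly x_t = 0.5 * x_{t-1}
noisy = [1.0, 0.7, 0.2, 0.3, -0.1, 0.1]      # made-up series

print(ar1_conditional_least_squares(noise_free))
print(ar1_conditional_least_squares(noisy))
```

On the noise-free series the estimator recovers the generating coefficient 0.5, which serves as a basic sanity check of the formula.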
