Minimal sufficient statistic Department of Computer and Information Science (IDA) Linköpings universitet, Sweden
A statistic $T$ defines a partition of the sample space of $(X_1, \dots, X_n)$ into classes satisfying $T(x_1, \dots, x_n) = t$ for different values of $t$. If such a partition puts the samples $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ into the same class if and only if the likelihood ratio $L(\theta; x) / L(\theta; y)$ does not depend on $\theta$, then $T$ is minimal sufficient for $\theta$.
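As a worked illustration of the partition criterion, consider a random sample from $\mathrm{Po}(\lambda)$. The distribution is chosen here for illustration only and is not taken from the slides; the criterion assumed is the usual one, that two samples share a class exactly when their likelihood ratio is free of the parameter.

```latex
\frac{L(\lambda; x)}{L(\lambda; y)}
  = \frac{\prod_{i=1}^{n} e^{-\lambda}\lambda^{x_i}/x_i!}
         {\prod_{i=1}^{n} e^{-\lambda}\lambda^{y_i}/y_i!}
  = \lambda^{\sum_i x_i - \sum_i y_i}\,
    \frac{\prod_{i=1}^{n} y_i!}{\prod_{i=1}^{n} x_i!}
```

The ratio is free of $\lambda$ exactly when $\sum_i x_i = \sum_i y_i$, so under this criterion $T = \sum_i X_i$ is minimal sufficient for $\lambda$.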
Example
Rao-Blackwell theorem
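In its textbook form the theorem says that if $\hat\theta$ is unbiased for $\theta$ and $T$ is sufficient, then $E[\hat\theta \mid T]$ is also unbiased and has variance no larger than that of $\hat\theta$. The simulation below is a minimal sketch of that statement, not an example from the slides. The assumptions are mine: a $\mathrm{Po}(\lambda)$ sample, the target $e^{-\lambda} = \Pr(X = 0)$, the crude estimator $\mathbf{1}\{X_1 = 0\}$, and the hypothetical helper names `poisson_draw` and `rao_blackwell_demo`.

```python
import math
import random

def poisson_draw(rng, lam):
    """One Poisson(lam) draw by Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def mean(values):
    return sum(values) / len(values)

def variance(values):
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values) / (len(values) - 1)

def rao_blackwell_demo(lam=1.5, n=10, reps=4000, seed=1):
    """Compare a crude unbiased estimator of exp(-lam) with its conditional mean given sum(x)."""
    rng = random.Random(seed)
    crude, improved = [], []
    for _ in range(reps):
        sample = [poisson_draw(rng, lam) for _ in range(n)]
        # Crude unbiased estimator: indicator that the first observation is zero.
        crude.append(1.0 if sample[0] == 0 else 0.0)
        # Conditional expectation given T = sum(x): X_1 | T = t is Binomial(t, 1/n),
        # so Pr(X_1 = 0 | T = t) = ((n - 1) / n) ** t.
        improved.append(((n - 1) / n) ** sum(sample))
    return {
        "crude_mean": mean(crude),
        "crude_var": variance(crude),
        "improved_mean": mean(improved),
        "improved_var": variance(improved),
    }

result = rao_blackwell_demo()
print(result)
```

Both estimators should average close to $e^{-1.5}$, while the conditioned estimator shows a much smaller sampling variance.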
The exponential family of distributions
• A random variable $X$ belongs to the ($k$-parameter) exponential family of probability distributions if the p.d.f. of $X$ can be written
$$f(x; \theta) = \exp\left\{ \sum_{j=1}^{k} A_j(\theta) B_j(x) + C(x) + D(\theta) \right\}$$
• What about
  • $N(\mu, \sigma^2)$?
  • $\mathrm{Po}(\lambda)$?
  • $U(0, \theta)$?
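The three questions can be checked directly. The sketch below assumes the factorization $f(x;\theta) = \exp\{\sum_j A_j(\theta) B_j(x) + C(x) + D(\theta)\}$ with a support that does not depend on $\theta$; the particular assignments of $A_j$, $B_j$, $C$ and $D$ are one valid choice, not necessarily the one used in the lecture.

```latex
\begin{aligned}
\mathrm{Po}(\lambda):\quad
  & f(x;\lambda) = \exp\{x\log\lambda - \log x! - \lambda\}, \\
  & A(\lambda)=\log\lambda,\quad B(x)=x,\quad C(x)=-\log x!,\quad D(\lambda)=-\lambda \\[4pt]
N(\mu,\sigma^2):\quad
  & f(x;\mu,\sigma^2) = \exp\Big\{\frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2
      - \frac{\mu^2}{2\sigma^2} - \tfrac12\log(2\pi\sigma^2)\Big\}, \\
  & A_1=\frac{\mu}{\sigma^2},\ B_1(x)=x,\quad A_2=-\frac{1}{2\sigma^2},\ B_2(x)=x^2,\quad
    C(x)=0,\quad D=-\frac{\mu^2}{2\sigma^2}-\tfrac12\log(2\pi\sigma^2) \\[4pt]
U(0,\theta):\quad
  & f(x;\theta) = \theta^{-1}\,\mathbf{1}\{0 < x < \theta\}
\end{aligned}
```

The Poisson distribution is a one-parameter member and the normal distribution a two-parameter member. For $U(0, \theta)$ the indicator ties $x$ and $\theta$ together through the support, so the density cannot be put in this form and the distribution is not a member.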
For a random sample $x = (x_1, \dots, x_n)$ from a distribution belonging to the exponential family, the joint p.d.f. is
$$f(x; \theta) = \exp\left\{ \sum_{j=1}^{k} A_j(\theta) \sum_{i=1}^{n} B_j(x_i) + \sum_{i=1}^{n} C(x_i) + n D(\theta) \right\}$$
Exponential family written in canonical form, with natural parameters $\phi_j = A_j(\theta)$:
$$f(x; \phi) = \exp\left\{ \sum_{j=1}^{k} \phi_j B_j(x) + C(x) + D(\phi) \right\}$$
Completeness
• Let $x_1, \dots, x_n$ be a random sample from a distribution with p.d.f. $f(x; \theta)$ and let $T = T(x_1, \dots, x_n)$ be a statistic.
• Then $T$ is complete for $\theta$ if, whenever $h_T(T)$ is a function of $T$ such that $E[h_T(T)] = 0$ for all values of $\theta$, then $\Pr(h_T(T) = 0) = 1$.
• Important lemmas from this definition:
  • Lemma 2.6: If $T$ is a complete sufficient statistic for $\theta$ and $h(T)$ is a function of $T$ such that $E[h(T)] = \theta$, then $h$ is unique (there is at most one such function).
  • Lemma 2.7: If there exists a Minimum Variance Unbiased Estimator (MVUE) for $\theta$ and $h(T)$ is an unbiased estimator for $\theta$, where $T$ is a complete minimal sufficient statistic for $\theta$, then $h(T)$ is MVUE.
  • Lemma 2.8: If a sample is from a distribution belonging to the exponential family, then $\left(\sum_{i=1}^{n} B_1(x_i), \dots, \sum_{i=1}^{n} B_k(x_i)\right)$ is complete and minimal sufficient for $\theta_1, \dots, \theta_k$.
Maximum-likelihood estimation
Consider as usual a random sample $x = (x_1, \dots, x_n)$ from a distribution with p.d.f. $f(x; \theta)$ (and c.d.f. $F(x; \theta)$).
The maximum likelihood point estimator of $\theta$ is the value of $\theta$ that maximizes the likelihood $L(\theta; x)$ or, equivalently, the log-likelihood $l(\theta; x) = \log L(\theta; x)$.
Useful notation:
With a $k$-dimensional parameter:
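Complete sample case: If all sample values are explicitly known, then
$$L(\theta; x) = \prod_{i=1}^{n} f(x_i; \theta)$$
Censored data case: If some (say $n_c$) of the sample values are censored, e.g. $x_i < k_1$ or $x_i > k_2$, then
$$L(\theta; x) = \prod_{i=1}^{n - n_c} f(x_i; \theta) \cdot \left[\Pr(X < k_1)\right]^{n_{c,l}} \cdot \left[\Pr(X > k_2)\right]^{n_{c,u}}$$
where the product runs over the uncensored values, $n_{c,l}$ and $n_{c,u}$ are the numbers of values censored below $k_1$ and above $k_2$, and $n_{c,l} + n_{c,u} = n_c$.
When the sample comes from a continuous distribution, the censored data case can be written
$$L(\theta; x) = \prod_{i=1}^{n - n_c} f(x_i; \theta) \cdot \left[F(k_1; \theta)\right]^{n_{c,l}} \cdot \left[1 - F(k_2; \theta)\right]^{n_{c,u}}$$
If the distribution is discrete, the use of $F$ is also possible: if $k_1$ and $k_2$ are values that can be attained by the random variables, then we may write
$$L(\theta; x) = \prod_{i=1}^{n - n_c} f(x_i; \theta) \cdot \left[F(k_1^{-}; \theta)\right]^{n_{c,l}} \cdot \left[1 - F(k_2; \theta)\right]^{n_{c,u}}$$
where $k_1^{-}$ is the largest attainable value below $k_1$, so that $F(k_1^{-}; \theta) = \Pr(X < k_1)$.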
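A minimal sketch of a censored-data likelihood for a continuous distribution follows. The model is an assumption made for illustration (an exponential distribution with rate $\theta$ and right censoring only, at a threshold called `k2` here), and the data values and function names are made up. Each uncensored value contributes $\log f(x_i; \theta)$ and each censored value contributes $\log(1 - F(k_2; \theta)) = -\theta k_2$.

```python
import math

def censored_exp_loglik(rate, observed, n_right_censored, k2):
    """Log-likelihood for Exp(rate) data with n_right_censored values known only to exceed k2."""
    # Uncensored part: sum of log f(x_i) = log(rate) - rate * x_i.
    uncensored = sum(math.log(rate) - rate * x for x in observed)
    # Censored part: each value contributes log(1 - F(k2)) = -rate * k2.
    censored = n_right_censored * (-rate * k2)
    return uncensored + censored

def censored_exp_mle(observed, n_right_censored, k2):
    """Closed-form maximiser: number of uncensored values over total observed exposure."""
    return len(observed) / (sum(observed) + n_right_censored * k2)

observed = [0.4, 1.1, 0.7, 2.3, 1.8]   # illustrative uncensored values
n_cens, k2 = 3, 2.5                    # three values known only to exceed 2.5
rate_hat = censored_exp_mle(observed, n_cens, k2)
print(rate_hat, censored_exp_loglik(rate_hat, observed, n_cens, k2))
```

Setting the derivative of the log-likelihood to zero gives the closed form used in `censored_exp_mle`; for most other models the censored terms prevent a closed form and a numerical routine is needed.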
Example
Example
Too complicated to find an analytical solution. Solve by a numerical routine!
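Exponential family distributions: Use the canonical form (natural parameterization):
$$f(x; \phi) = \exp\left\{ \sum_{j=1}^{k} \phi_j B_j(x) + C(x) + D(\phi) \right\}$$
Let
$$T_j = \sum_{i=1}^{n} B_j(X_i), \quad j = 1, \dots, k,$$
with observed values $t_j$. Then the maximum likelihood estimators (MLEs) of $\phi_1, \dots, \phi_k$ are found by solving the system of equations
$$E[T_j] = t_j, \quad j = 1, \dots, k$$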
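A short worked case shows how the system of equations is used. The Poisson distribution is assumed here for illustration; it is not necessarily the example from the lecture. With $f(x;\lambda) = \exp\{x \log\lambda - \log x! - \lambda\}$ the natural parameter is $\phi = \log\lambda$ and $B(x) = x$.

```latex
\begin{aligned}
T &= \sum_{i=1}^{n} X_i, \qquad E[T] = n\lambda = n e^{\phi}, \\
E[T] = t \;&\Longrightarrow\; n e^{\hat\phi} = \sum_{i=1}^{n} x_i
  \;\Longrightarrow\; \hat\phi = \log \bar{x}, \qquad \hat\lambda = e^{\hat\phi} = \bar{x}
\end{aligned}
```

Equating the expected value of the sufficient statistic to its observed value reproduces the familiar result that the MLE of a Poisson mean is the sample mean.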
Example
Computational aspects
• When the MLEs can be found by solving the likelihood equations
$$\frac{\partial l(\theta; x)}{\partial \theta_j} = 0, \quad j = 1, \dots, k,$$
numerical routines for solving the generic equation $g(\theta) = 0$ can be used:
  • Newton-Raphson method
  • Fisher's method of scoring, which makes use of the fact that under regularity conditions
$$E\left[\frac{\partial^2 l}{\partial \theta_i \, \partial \theta_j}\right] = -E\left[\frac{\partial l}{\partial \theta_i} \cdot \frac{\partial l}{\partial \theta_j}\right]$$
• This is the multidimensional analogue of Lemma 2.1 (see page 17).
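The two routines differ only in which curvature they divide the score by: Newton-Raphson uses the observed information $-l''(\theta)$, Fisher scoring its expectation. The sketch below illustrates this on a one-parameter problem that I chose because it has no closed-form MLE, a zero-truncated Poisson sample; the data, the starting value and the function names are illustrative assumptions rather than material from the slides.

```python
import math

def iterate(score, information, start, tol=1e-12, max_iter=100):
    """Repeat theta <- theta + score(theta) / information(theta) until the step is tiny."""
    theta = start
    for _ in range(max_iter):
        step = score(theta) / information(theta)
        theta += step
        if abs(step) < tol:
            return theta
    raise RuntimeError("iteration did not converge")

counts = [1, 2, 1, 3, 1, 2, 4, 1, 2, 3]   # illustrative zero-truncated counts
n, total = len(counts), sum(counts)

def score(lam):
    # l(lam) = total*log(lam) - n*lam - n*log(1 - exp(-lam)) + const
    return total / lam - n - n / math.expm1(lam)

def observed_information(lam):
    # Minus the second derivative of the log-likelihood (Newton-Raphson).
    return total / lam ** 2 - n * math.exp(lam) / math.expm1(lam) ** 2

def expected_information(lam):
    # Expectation of the observed information, using E[X] = lam / (1 - exp(-lam)).
    growth = math.expm1(lam)
    return n * math.exp(lam) * (1.0 / (lam * growth) - 1.0 / growth ** 2)

start = total / n   # the sample mean as a rough starting value
newton = iterate(score, observed_information, start)
scoring = iterate(score, expected_information, start)
print(newton, scoring)
```

Both iterations should settle on the same root of the score equation. Scoring is often preferred when the observed information is awkward to compute or not guaranteed to be positive away from the maximum.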
When the MLEs cannot be found in the above way, other numerical routines must be used:
• Simplex method
• EM algorithm
For a description of the numerical routines, see the textbook.
Maximum likelihood estimation comes into natural use not for handling the standard case, i.e. a complete random sample from a distribution within the exponential family, but for finding estimators in more non-standard and complex situations.
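To give a feel for the EM algorithm, here is a minimal sketch on a problem small enough to check by hand: exponential data with some values right-censored at a threshold. The model, the data and the name `em_censored_exponential` are assumptions made for this illustration. The E-step replaces each censored value by its conditional expectation $k_2 + 1/\theta$, and the M-step applies the complete-data MLE $n / \sum x_i$.

```python
def em_censored_exponential(observed, n_censored, k2, start=1.0, tol=1e-12, max_iter=1000):
    """EM iterations for the rate of an exponential sample with right censoring at k2."""
    n = len(observed) + n_censored
    rate = start
    for _ in range(max_iter):
        # E-step: by memorylessness, E[X | X > k2] = k2 + 1 / rate.
        expected_censored = k2 + 1.0 / rate
        # M-step: complete-data MLE with censored values replaced by their expectations.
        new_rate = n / (sum(observed) + n_censored * expected_censored)
        if abs(new_rate - rate) < tol:
            return new_rate
        rate = new_rate
    raise RuntimeError("EM did not converge")

observed = [0.4, 1.1, 0.7, 2.3, 1.8]   # illustrative uncensored values
n_cens, k2 = 3, 2.5                    # three values known only to exceed 2.5
rate_em = em_censored_exponential(observed, n_cens, k2)
rate_direct = len(observed) / (sum(observed) + n_cens * k2)
print(rate_em, rate_direct)
```

In this simple model the direct maximiser is available in closed form, which makes it a convenient check that the EM fixed point is the MLE. EM earns its keep in problems where the M-step is easy but the observed-data likelihood is not.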
Example
Properties of MLEs
Invariance: If $\hat\theta$ is the MLE of $\theta$ and $g$ is a function of $\theta$, then $g(\hat\theta)$ is the MLE of $g(\theta)$.
Consistency: Under some weak regularity conditions, all MLEs are consistent.
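Efficiency: Under the usual regularity conditions,
$$\hat\theta \;\overset{a}{\sim}\; N\left(\theta, \; I_\theta^{-1}\right)$$
where $I_\theta$ is the Fisher information in the sample (asymptotically efficient and normally distributed).
Sufficiency: If $T$ is sufficient for $\theta$, a unique MLE depends on the sample only through $T$.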
Example
Invariance property
i.e. the two MLEs are asymptotically uncorrelated (and, being asymptotically jointly normal, asymptotically independent).
Modifications and extensions
Ancillarity and conditional sufficiency:
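Profile likelihood: Write $\theta = (\theta_1, \theta_2)$ and let $\hat\theta_2(\theta_1)$ denote the MLE of $\theta_2$ for a given value of $\theta_1$. The profile likelihood for $\theta_1$ is
$$L_P(\theta_1; x) = L\left(\theta_1, \hat\theta_2(\theta_1); x\right)$$
This concept has its main use in cases where $\theta_1$ contains the parameters of "interest" and $\theta_2$ contains nuisance parameters. The same ML point estimator for $\theta_1$ is obtained by maximizing the profile likelihood as by maximizing the full likelihood function.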
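The claim that profiling leaves the point estimator unchanged can be checked numerically. The sketch below assumes a normal sample with $\mu$ as the parameter of interest and $\sigma^2$ as the nuisance parameter. This choice, the data and the grid search are illustrative assumptions. For fixed $\mu$ the maximising variance is $\hat\sigma^2(\mu) = n^{-1}\sum_i (x_i - \mu)^2$, which is substituted back into the log-likelihood.

```python
import math

def profile_loglik_mu(mu, data):
    """Normal log-likelihood with sigma^2 replaced by its MLE for the given mu."""
    n = len(data)
    sigma2_hat = sum((x - mu) ** 2 for x in data) / n   # nuisance parameter maximised out
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1.0)

data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7]                  # illustrative sample
grid = [3.0 + 0.001 * i for i in range(3001)]          # candidate values of mu in [3, 6]
mu_profile = max(grid, key=lambda mu: profile_loglik_mu(mu, data))
mu_full = sum(data) / len(data)                        # MLE of mu from the full likelihood
print(mu_profile, mu_full)
```

Up to the grid resolution, the maximiser of the profile log-likelihood coincides with the sample mean, the MLE of $\mu$ from the full likelihood.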
Marginal and conditional likelihood: Again, these concepts have their main use in cases where $\theta_1$ contains the parameters of "interest" and $\theta_2$ contains nuisance parameters.
Penalized likelihood: MLEs can be derived subject to some criterion of smoothness. In particular, this is applicable when the parameter is no longer a single value (one- or multidimensional) but a function, such as an unknown density function or a regression curve. The penalized log-likelihood function is written
$$l_P(\theta; x) = l(\theta; x) - \lambda R(\theta)$$
where $R(\theta)$ is a roughness penalty and $\lambda \ge 0$ is a smoothing parameter.
Method of moments estimation (MM)
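The method of moments point estimator of $\theta = (\theta_1, \dots, \theta_k)$ is obtained by solving for $\theta_1, \dots, \theta_k$ the system of equations
$$\mu'_r(\theta) = m'_r, \quad r = 1, \dots, k$$
where $\mu'_r(\theta) = E[X^r]$ is the $r$th population moment and $m'_r = n^{-1} \sum_{i=1}^{n} x_i^r$ is the $r$th sample moment.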
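A two-parameter case makes the system of moment equations concrete. The sketch assumes a gamma distribution with shape $\alpha$ and rate $\beta$, for which $E[X] = \alpha/\beta$ and $E[X^2] = \alpha(\alpha + 1)/\beta^2$. The distribution, the data and the function name are illustrative assumptions. Equating these to the first two sample moments and solving gives closed-form estimators.

```python
def gamma_method_of_moments(data):
    """Solve E[X] = m1 and E[X^2] = m2 for the shape and rate of a gamma distribution."""
    n = len(data)
    m1 = sum(data) / n                    # first sample moment
    m2 = sum(x * x for x in data) / n     # second sample moment (about zero)
    spread = m2 - m1 ** 2                 # equals alpha / beta^2 under the model
    shape = m1 ** 2 / spread
    rate = m1 / spread
    return shape, rate

data = [2.1, 0.8, 3.4, 1.9, 2.7, 1.2, 4.0, 2.2]   # illustrative sample
shape_hat, rate_hat = gamma_method_of_moments(data)
print(shape_hat, rate_hat)
```

Substituting the estimates back into the population moments reproduces the two sample moments exactly, which is the defining property of the method.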
Example
Method of least squares (LS)
First principles: Assume a sample $x$ where the random variable $X_i$ can be written
$$X_i = m(\theta) + \varepsilon_i$$
with $\varepsilon_i$ a random error with mean zero and constant variance. The least-squares estimator of $\theta$ is the value of $\theta$ that minimizes $\sum_{i=1}^{n} (x_i - m(\theta))^2$, i.e.
$$\hat\theta_{LS} = \arg\min_{\theta} \sum_{i=1}^{n} \left(x_i - m(\theta)\right)^2$$
A more general approach: Assume the sample can be written $(x, z)$, where $x_i$ represents the random variable of interest (endogenous variable) and $z_i$ represents either an auxiliary random variable (exogenous) or a given constant for sample point $i$, with $E[X_i \mid z_i] = m(\theta; z_i)$. The least-squares estimator of $\theta$ is then
$$\hat\theta_{LS} = \arg\min_{\theta} \sum_{i=1}^{n} \left(x_i - m(\theta; z_i)\right)^2$$
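Special cases
The ordinary linear regression model:
$$X_i = \beta_0 + \beta_1 z_i + \varepsilon_i, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2$$
The heteroscedastic regression model:
$$X_i = \beta_0 + \beta_1 z_i + \varepsilon_i, \quad \mathrm{Var}(\varepsilon_i) = \sigma_i^2$$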
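Both special cases can be handled by one small routine if the heteroscedastic model is fitted by weighted least squares. The sketch below assumes a straight-line mean $\beta_0 + \beta_1 z_i$ and weights proportional to $1/\sigma_i^2$; these choices, the data and the name `weighted_line_fit` are illustrative assumptions, not the lecture's own formulation.

```python
def weighted_line_fit(z, x, weights=None):
    """Minimise sum w_i * (x_i - b0 - b1 * z_i)^2; equal weights give ordinary least squares."""
    if weights is None:
        weights = [1.0] * len(z)
    total = sum(weights)
    z_bar = sum(w * zi for w, zi in zip(weights, z)) / total
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    s_zx = sum(w * (zi - z_bar) * (xi - x_bar) for w, zi, xi in zip(weights, z, x))
    s_zz = sum(w * (zi - z_bar) ** 2 for w, zi in zip(weights, z))
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    return intercept, slope

z = [0.0, 1.0, 2.0, 3.0, 4.0]                 # illustrative exogenous values
x = [1.1, 2.9, 5.2, 6.8, 9.1]                 # illustrative responses
ols = weighted_line_fit(z, x)                 # constant error variance
sds = [0.5, 0.5, 1.0, 1.0, 2.0]               # assumed error standard deviations
wls = weighted_line_fit(z, x, [1.0 / s ** 2 for s in sds])
print(ols, wls)
```

With equal weights the routine reduces to the ordinary least-squares formulas; with unequal weights, observations with larger assumed variance pull less on the fitted line.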
The first-order auto-regressive model:
$$X_i = \theta X_{i-1} + \varepsilon_i$$
The conditional least-squares estimator of $\theta$ (given $x_1$) is
$$\hat\theta_{CLS} = \arg\min_{\theta} \sum_{i=2}^{n} \left(x_i - \theta x_{i-1}\right)^2 = \frac{\sum_{i=2}^{n} x_i x_{i-1}}{\sum_{i=2}^{n} x_{i-1}^2}$$
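A minimal sketch of conditional least squares for a first-order autoregression follows. It assumes the zero-mean form $X_i = \theta X_{i-1} + \varepsilon_i$ and conditions on the first observation, so the sum of squares runs from the second value onwards; the simulation settings and function names are illustrative assumptions.

```python
import random

def ar1_conditional_ls(series):
    """Minimise sum_{i>=2} (x_i - theta * x_{i-1})^2, treating the first value as given."""
    numerator = sum(cur * prev for prev, cur in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator

def simulate_ar1(theta, n, seed=0):
    """Simulate x_i = theta * x_{i-1} + e_i with standard normal errors."""
    rng = random.Random(seed)
    series = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        series.append(theta * series[-1] + rng.gauss(0.0, 1.0))
    return series

series = simulate_ar1(theta=0.6, n=2000)
theta_hat = ar1_conditional_ls(series)
print(theta_hat)
```

Because the criterion is quadratic in $\theta$, the minimiser is a ratio of two sums and no iteration is needed; the estimate should fall near the value used in the simulation.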