DATA ANALYSIS Module Code: CA660 Lecture Block 6: Alternative estimation methods and their implementation
MAXIMUM LIKELIHOOD ESTIMATION
• Recall general points: Estimation, and the definition of the Likelihood function $L(\theta)$ for a vector of parameters $\theta$ and a set of observed values $x$.
• Find the most likely value of $\theta$, i.e. maximise the Likelihood function.
• Also defined: the Log-likelihood (Support function $S(\theta) = \ln L(\theta)$) and its derivative, the Score $\partial S(\theta)/\partial\theta$, together with the Information content per observation, which for a single-parameter likelihood is given by $I(\theta) = E\left[\left(\partial \ln f(x;\theta)/\partial\theta\right)^2\right] = -E\left[\partial^2 \ln f(x;\theta)/\partial\theta^2\right]$.
• Why MLE? (Need to know the underlying distribution.)
• Properties: consistency; sufficiency; asymptotic efficiency (linked to variance); unique maximum; invariance and, hence, freedom to use the most convenient parameterisation; usually MVUE; amenable to conventional optimisation methods.
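VARIANCE, BIAS & CONFIDENCE
• Variance of an Estimator, usual form: $Var(\hat\theta) = E\left[(\hat\theta - E[\hat\theta])^2\right]$, or
• for $k$ independent estimates $\hat\theta_1, \dots, \hat\theta_k$ with mean $\bar\theta$: $\sum_i (\hat\theta_i - \bar\theta)^2 / (k-1)$.
• For a large sample of size $n$, the variance of the MLE can be approximated by $Var(\hat\theta) \approx 1 / (n I(\hat\theta))$, with $I$ the Information per observation.
• Can also estimate the variance empirically, using re-sampling* techniques.
• Variance of a linear function (of several estimates): $Var\left(\sum_i a_i \hat\theta_i\right) = \sum_i a_i^2 Var(\hat\theta_i) + 2\sum_{i<j} a_i a_j Cov(\hat\theta_i, \hat\theta_j)$ (a common need in genomics analysis, e.g. heritability, and in risk analysis).
• Recall the Bias of the Estimator: $Bias(\hat\theta) = E[\hat\theta] - \theta$;
• then the Mean Square Error is defined to be $MSE(\hat\theta) = E\left[(\hat\theta - \theta)^2\right]$,
• which expands to $Var(\hat\theta) + [Bias(\hat\theta)]^2$,
• so we have the basis for C.I. and tests of hypothesis.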
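The decomposition of the Mean Square Error into variance plus squared bias can be checked numerically. The sketch below is only an illustration: the five estimates and the "true" parameter value are hypothetical numbers, not data from the lecture.

```python
# Hypothetical estimates of a parameter from k repeated experiments
# (illustrative values only, not taken from the lecture).
estimates = [0.18, 0.22, 0.25, 0.21, 0.24]
true_theta = 0.20  # assumed "true" value, for illustration

k = len(estimates)
mean_est = sum(estimates) / k

# Bias: expected value of the estimator minus the true value.
bias = mean_est - true_theta

# Variance about the mean of the estimates. Divisor k (not k - 1) is used
# here so that the decomposition below holds exactly for this finite set.
variance = sum((e - mean_est) ** 2 for e in estimates) / k

# Mean Square Error: average squared distance from the true value.
mse = sum((e - true_theta) ** 2 for e in estimates) / k

print("bias            =", bias)
print("variance        =", variance)
print("MSE             =", mse)
print("variance+bias^2 =", variance + bias ** 2)
```

The last two printed values agree up to floating-point rounding, which is the identity MSE = Variance + Bias² in miniature.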
COMMONLY-USED METHODS of obtaining MLE
• Analytical: solving $dL/d\theta = 0$ or $dS/d\theta = 0$, when simple solutions exist
• Grid search or likelihood profile approach
• Newton-Raphson iteration methods
• EM (expectation and maximisation) algorithm
• N.B. Work with the Log-likelihood, because its maximum occurs at the same parameter value as that of the Likelihood
• Easier to compute
• Close relationship between the statistical properties of the MLE and the Log-likelihood
MLE Methods in outline
• Analytical: recall the Binomial example earlier.
• Example: for the Normal, the MLEs of mean and variance (taking derivatives w.r.t. mean and variance separately) are $\hat\mu = \bar{x}$ and $\hat\sigma^2 = \sum_i (x_i - \bar{x})^2 / N$, i.e. equivalent to the sample mean and the actual variance (divisor $N$),
• the variance estimate being unbiased if the mean is known, biased if not.
• Invariance: one-to-one relationships preserved.
• Used: when the MLE has a simple solution.
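As a small illustration of the analytical Normal case, the sketch below computes the closed-form MLEs and compares the divisor-N variance with the usual divisor-(N-1) estimate. The data values are hypothetical.

```python
# Hypothetical sample (illustrative values only).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

# MLE of the mean: the sample mean.
mu_hat = sum(data) / n

# MLE of the variance: sum of squared deviations divided by N.
sum_sq = sum((x - mu_hat) ** 2 for x in data)
var_mle = sum_sq / n

# Usual unbiased estimate when the mean is estimated from the same data:
# divisor N - 1 instead of N.
var_unbiased = sum_sq / (n - 1)

print("mean (MLE)          =", mu_hat)
print("variance, /N (MLE)  =", var_mle)
print("variance, /(N-1)    =", var_unbiased)
```

The MLE of the variance is smaller than the divisor-(N-1) estimate by the factor (N-1)/N, which is the bias referred to above when the mean is not known.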
MLE Methods in outline contd.
• Grid Search: computational
• Plot likelihood or log-likelihood vs parameter. Various features:
• Relative Likelihood = Likelihood / Max. Likelihood (ML set = 1).
• Peak of R.L. can be visually identified or sought algorithmically.
• e.g. a plot of the likelihood over the parameter space range $0 \le \theta \le 1$ gives 2 peaks, symmetrical around $\theta = 0.5$ (likelihood profile for e.g. the well-known mixed linkage analysis problem, or for a similar example of populations following known proportion splits).
• If we now constrain $0 \le \theta \le 0.5$, the MLE solution is unique, e.g. $\theta$ = R.F. between genes (possible mixed linkage phase).
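A minimal sketch of a relative-likelihood grid search with two symmetric peaks follows. It assumes, purely for illustration, an equal-weight mixture of the two linkage phases with k recombinant-type offspring out of n; the function name, the counts (k = 3, n = 20) and the grid spacing are all hypothetical choices, not values from the lecture.

```python
def likelihood(theta, k=3, n=20):
    """Mixed-phase likelihood (assumed form): each phase has weight 0.5."""
    phase_1 = theta ** k * (1 - theta) ** (n - k)
    phase_2 = (1 - theta) ** k * theta ** (n - k)
    return 0.5 * phase_1 + 0.5 * phase_2

# Evaluate the likelihood on a grid over the open interval (0, 1).
grid = [i / 100 for i in range(1, 100)]
lik = [likelihood(t) for t in grid]

# Relative likelihood: scale so that the maximum equals 1.
l_max = max(lik)
rel_lik = [v / l_max for v in lik]

# Grid points whose relative likelihood is (numerically) 1: the peaks.
peaks = [t for t, r in zip(grid, rel_lik) if r > 0.999999]

# Constraining theta to [0, 0.5] leaves a single maximum.
theta_hat = max((t for t in grid if t <= 0.5), key=likelihood)

print("peaks on (0, 1):        ", peaks)
print("MLE with theta <= 0.5:  ", theta_hat)
```

The unconstrained search finds two mirror-image peaks, while the constrained search returns one value, which is the point of restricting the recombination fraction to at most 0.5.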
MLE Methods in outline contd.
• Graphic/numerical implementation: start from an initial estimate of $\theta$. The direction of search is determined by evaluating the likelihood to both sides of $\theta$. The search takes the direction giving an increase, because we are looking for a maximum. Initial search increments are large, e.g. 0.1; when the likelihood change starts to decrease or becomes negative, stop and refine the increment.
Issues:
• Multiple peaks: can miss the global maximum; computationally intensive; see e.g. http://statgen.iop.kcl.ac.uk/bgim/mle/sslike_1.html
• Multiple parameters: grid search. Interpretation of Likelihood profiles can be difficult, e.g. http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml/
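The search-and-refine procedure described above can be sketched as follows. This is one possible reading of the outline, not a definitive implementation: the function names, the starting value 0.5, the shrink factor of 10 and the binomial-type log-likelihood with counts a = 8, b = 2 are all assumptions made for illustration.

```python
import math

def log_lik(theta, a=8, b=2):
    """Assumed log-likelihood b*ln(theta) + a*ln(1 - theta)."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # outside the parameter space
    return b * math.log(theta) + a * math.log(1.0 - theta)

def refine_search(f, theta, step=0.1, min_step=1e-6):
    """Move to whichever neighbour increases f; shrink the step when neither does."""
    best = f(theta)
    while step >= min_step:
        moved = False
        for candidate in (theta + step, theta - step):
            value = f(candidate)
            if value > best:           # direction giving an increase
                theta, best, moved = candidate, value, True
                break
        if not moved:                  # no improvement either side: refine
            step /= 10.0
    return theta

theta_hat = refine_search(log_lik, theta=0.5)
print("estimate from step-refinement search:", theta_hat)
```

For this single-peaked log-likelihood the search settles near b/(a+b) = 0.2. With several peaks the same procedure can stop at a local maximum, which is the first issue listed above.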
Example in outline
• Data used e.g. to show a linkage relationship (non-independence) between e.g. a marker and a given disease gene, or e.g. between sex and purchase of computer games.
• Escapes = individuals who are susceptible, but show no disease phenotype under experimental conditions (or who express interest but have no purchase record). So define two parameters: the proportion of escapes and the R.F. $\theta$, respectively.
• One minus the proportion of escapes is the penetrance for the disease trait, or the probability of purchasing, i.e.
• P{individual with susceptible genotype has disease phenotype}, or
• P{individual of given sex and interested actually buys}.
• Purpose of the experiment: typically to estimate the R.F. between marker and gene, or the proportion of a sex that purchases.
• Use: Support function = Log-Likelihood. Often quite complex: for the above example it involves both parameters.
Example contd.
• Set the 1st derivatives (Scores) w.r.t. $\theta$ and w.r.t. the proportion of escapes equal to zero.
• The expected value of the Score w.r.t. $\theta$ is zero (see analogies in classical sampling/hypothesis testing). Similarly for the proportion of escapes. Here, however, there is no simple analytical solution, so we cannot solve directly for either.
• A grid search is used instead to find the parameter values at which the likelihood reaches its maximum.
• In general, this type of experiment tests H0: independence between the factors (marker and gene), (sex and purchase),
• and H0: no escapes.
• Uses Likelihood Ratio Test statistics (the M.L.E. equivalent of $\chi^2$).
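MLE Methods in outline contd.
• Newton-Raphson Iteration
• Have Score($\theta$) = 0 from previously. N-R consists of replacing the Score by the linear terms of its Taylor expansion, so if $\theta''$ is a solution and $\theta'$ a 1st guess: $Score(\theta'') \approx Score(\theta') + (\theta'' - \theta')\, \left.\frac{d\,Score}{d\theta}\right|_{\theta'} = 0$, giving $\theta'' = \theta' - Score(\theta') \Big/ \left.\frac{d\,Score}{d\theta}\right|_{\theta'}$.
• Repeat with $\theta''$ replacing $\theta'$.
• Each iteration fits a parabola to the (Log-)Likelihood function.
• Problems: multiple peaks, zero Information, extreme estimates.
• Multiple parameters: need matrix notation, where the Score vector has elements = derivatives of the Support function $S$ w.r.t. each parameter in turn. Similarly, the Information matrix has terms of the form $-E\left[\partial^2 S / \partial\theta_i \partial\theta_j\right]$.
• Estimates are then updated as: new estimate = old estimate + (Information matrix)$^{-1}$ × (Score vector), built from the 2nd and 1st derivatives of the Log-likelihood $S(\theta)$ respectively; the Information is also the variance of the Score.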
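A single-parameter Newton-Raphson iteration can be sketched as below. The log-likelihood b·ln(θ) + a·ln(1−θ), the counts a = 8, b = 2, the starting value and the function names are assumptions for illustration; the observed information (minus the second derivative) is used in place of its expectation.

```python
def score(theta, a, b):
    """First derivative of the assumed log-likelihood b*ln(theta) + a*ln(1 - theta)."""
    return b / theta - a / (1.0 - theta)

def information(theta, a, b):
    """Observed information: minus the second derivative of the log-likelihood."""
    return b / theta ** 2 + a / (1.0 - theta) ** 2

def newton_raphson(theta, a, b, tol=1e-10, max_iter=50):
    """Iterate theta'' = theta' + Score(theta') / Information(theta')."""
    for _ in range(max_iter):
        new_theta = theta + score(theta, a, b) / information(theta, a, b)
        if abs(new_theta - theta) < tol:   # converged: no further change
            return new_theta
        theta = new_theta
    return theta

a, b = 8, 2                     # illustrative counts
theta_hat = newton_raphson(0.1, a, b)

# Large-sample variance of the MLE: inverse of the information at the MLE.
var_hat = 1.0 / information(theta_hat, a, b)

print("Newton-Raphson estimate:", theta_hat)
print("approximate variance:   ", var_hat)
```

A poor starting value, or a point where the information is close to zero, can send the update outside the parameter space, which corresponds to the "zero Information, extreme estimates" problems noted above.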
MLE Methods in outline contd.
• Expectation-Maximisation Algorithm: iterative; for incomplete data.
• (Much genomic, financial and other data fit this situation, e.g. linkage analysis with marker genotypes of F2 progeny. Usually 9 categories are observed for a 2-locus, 2-allele model, but 16 = complete information, while 14 give information on linkage. Some are hidden, but if the linkage parameter is known, expected frequencies can be predicted and the complete data restored using expectation.)
• Steps: (1) Expectation estimates the statistics of the complete data, given the observed incomplete data.
• (2) Maximisation uses the estimated complete data to give the MLE.
• Iterate until convergence (no further change).
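E-M contd.
• Implementation
• Initial guess, $\theta'$, chosen (e.g. $\theta' = 0.25$, say, for the R.F.).
• Taking this as "true", the complete data are estimated by distributional statements, e.g. P(individual is recombinant, given observed genotype) for R.F. estimation.
• MLE estimate $\theta''$ computed.
• This, for the R.F., is the sum of recombinants / N.
• Thus the MLE, for $f_i$ the observed count in category $i$, is $\theta'' = \frac{1}{N} \sum_i f_i \, P(\text{recombinant} \mid \text{category } i;\, \theta')$.
• Convergence: $\theta'' = \theta'$, or $|\theta'' - \theta'| < \epsilon$ for a small tolerance $\epsilon$.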
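The E and M steps can be sketched on a small incomplete-data problem. The example below uses a four-category linkage-type multinomial that is widely used to illustrate EM, with cell probabilities (1/2 + θ/4, (1−θ)/4, (1−θ)/4, θ/4); the first observed category hides two sub-categories. The counts, the parameterisation and the function name are assumptions for illustration and are not taken from the lecture; θ here is a generic linkage parameter rather than the R.F. itself.

```python
# Illustrative observed counts for the four categories (assumed values).
y1, y2, y3, y4 = 125, 18, 20, 34

def em_step(theta):
    """One EM iteration for cell probabilities (1/2 + t/4, (1-t)/4, (1-t)/4, t/4)."""
    # E-step: expected number of the y1 individuals that belong to the
    # hidden sub-category with probability theta/4, taking theta as "true".
    hidden = y1 * (theta / 4.0) / (0.5 + theta / 4.0)
    # M-step: MLE from the restored complete data (a simple proportion).
    return (hidden + y4) / (hidden + y2 + y3 + y4)

theta = 0.25                      # initial guess
for iteration in range(1, 201):
    new_theta = em_step(theta)
    converged = abs(new_theta - theta) < 1e-10
    theta = new_theta
    if converged:                 # no further change
        break

theta_hat = theta
print("EM estimate:", theta_hat, "after", iteration, "iterations")
```

Each pass restores the hidden split using the current estimate and then re-estimates θ as a proportion, mirroring steps (1) and (2); the loop stops once successive estimates agree to within the chosen tolerance.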
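LIKELIHOOD: C.I. and H.T.
• Likelihood Ratio Tests: c.f. with $\chi^2$.
• Principal advantage of $G$ is Power, as unknown parameters are involved in the hypothesis test.
• Have: the Likelihood of $\theta$ taking a value $\theta_A$ which maximises it, i.e. its MLE, and the likelihood under H0: $\theta = \theta_N$ (e.g. $\theta_N = 0.5$).
• Form of the L.R. Test Statistic: $G = -2 \ln\left[ L(\theta_N) / L(\theta_A) \right]$
• or, conventionally, $G = 2 \ln\left[ L(\theta_A) / L(\theta_N) \right]$
• choose the conventional form; easier to interpret.
• Distribution of $G$ ~ approx. $\chi^2$ (d.o.f. = difference in dimension of the parameter spaces for $L(\theta_A)$, $L(\theta_N)$).
• Goodness of Fit: notation as for $\chi^2$, $G = 2 \sum_i O_i \ln(O_i / E_i)$, $G \sim \chi^2_{n-1}$.
• Independence: notation again as for $\chi^2$, $G = 2 \sum_i \sum_j O_{ij} \ln(O_{ij} / E_{ij})$.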
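A small numerical sketch of the likelihood ratio statistic follows. It assumes a binomial-type log-likelihood b·ln(θ) + a·ln(1−θ) with illustrative counts a = 8, b = 2 and a null value of 0.5; the function and variable names are hypothetical. The same value is also computed from the observed/expected form used for goodness of fit.

```python
import math

def log_lik(theta, a, b):
    """Assumed log-likelihood: b*ln(theta) + a*ln(1 - theta)."""
    return b * math.log(theta) + a * math.log(1.0 - theta)

a, b = 8, 2                      # illustrative counts
theta_A = b / (a + b)            # MLE under the alternative
theta_N = 0.5                    # value specified by H0

# Likelihood ratio statistic in its conventional form.
G = 2.0 * (log_lik(theta_A, a, b) - log_lik(theta_N, a, b))

# Equivalent goodness-of-fit form: G = 2 * sum(O * ln(O / E)).
observed = [a, b]
expected = [(a + b) * (1.0 - theta_N), (a + b) * theta_N]
G_fit = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected))

# 5% critical value of chi-square with 1 degree of freedom.
CHI2_1DF_5PCT = 3.841
reject_h0 = G > CHI2_1DF_5PCT

print("G (likelihood form)      =", G)
print("G (observed/expected)    =", G_fit)
print("reject H0 at 5% level?   ", reject_h0)
```

One degree of freedom applies here because the alternative has one free parameter and the null has none, i.e. the difference in dimension of the two parameter spaces.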
Likelihood C.I.'s: graphical method
• Example: consider the following Likelihood function, $L(\theta) = \theta^b (1-\theta)^a$,
• where $\theta$ is the unknown parameter; $a$, $b$ are observed counts.
• For 4 data sets observed,
• A: (a,b) = (8,2), B: (a,b) = (16,4), C: (a,b) = (80,20), D: (a,b) = (400,100)
• Likelihood values can be plotted vs possible parameter values, with MLE = peak value.
• e.g. MLE $\hat\theta = 0.2$, $L_{max} = 0.0067$ for A, and $L_{max} = 0.000045$ for B etc.
• Set A: $\ln L_{max} - \ln L = \ln(0.0067) - \ln(0.00091) = 2$ gives the 95% C.I.,
• so $\theta = (0.035, 0.506)$, corresponding to $L = 0.00091$, is the 95% C.I. for A.
• Similarly, manipulating this expression, the Likelihood value corresponding to the 95% confidence interval is given as $L = (7.389)^{-1} L_{max}$, i.e. $L = e^{-2} L_{max}$.
• Note: usually plot Log-likelihood vs parameter, rather than Likelihood.
• As sample size increases, the C.I. becomes narrower and more symmetric.
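The graphical interval for data set A can also be located numerically. The sketch below assumes the likelihood has the binomial-type form θ^b (1−θ)^a, which reproduces the quoted L_max of about 0.0067 at θ = 0.2 for (a, b) = (8, 2); the function names and the bisection tolerance are illustrative choices.

```python
import math

def log_lik(theta, a, b):
    """Assumed log-likelihood: b*ln(theta) + a*ln(1 - theta)."""
    return b * math.log(theta) + a * math.log(1.0 - theta)

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid      # root lies in the upper half
        else:
            hi = mid                   # root lies in the lower half
    return 0.5 * (lo + hi)

a, b = 8, 2                            # data set A
theta_hat = b / (a + b)                # MLE, the peak of the likelihood
log_l_max = log_lik(theta_hat, a, b)
l_max = math.exp(log_l_max)

# Interval limits: points where the log-likelihood has dropped 2 units
# below its maximum, i.e. L = exp(-2) * L_max.
def drop(theta):
    return log_lik(theta, a, b) - (log_l_max - 2.0)

lower = bisect(drop, 1e-6, theta_hat)
upper = bisect(drop, theta_hat, 1.0 - 1e-6)

print("L_max    =", l_max)
print("interval =", (lower, upper))
```

Re-running with the counts of data sets B, C and D gives progressively narrower and more symmetric intervals around 0.2, in line with the final bullet above.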
Maximum Likelihood Benefits
• Strong estimator properties: sufficiency, efficiency, consistency, non-bias etc., as before
• Good Confidence Intervals
• Coverage probability realised and intervals meaningful
• MLE a good estimator of a C.I.
• MSE consistent
• Absence of Bias
• does not "stand alone"; minimum variance is also important
• Asymptotically Normal
• Precise for large samples
• Inferences valid, ranges realistic