280 likes | 371 Views
Parametric Families of Distributions and Their Interaction with the Workshop Title. Chris Jones The Open University, U.K. How the talk will pan out …. it will start as a talk in distribution theory concentrating on generating one family of distributions
E N D
Parametric Families of Distributions and Their Interaction with the Workshop Title Chris Jones The Open University, U.K.
How the talk will pan out … • it will start as a talk in distribution theory • concentrating on generating one family of distributions • then will continue as a talk in distribution theory • concentrating on generating a different family of distributions • but in this second part, the talk will metamorphose through links with kernels and quantiles … • … and finally get on to a more serious application to smooth (nonparametric) QR • the parts of the talk involving QR are joint withKeming Yu
Set Starting point: simple symmetric gHow might we introduce (at most two) shape parameters a and b which will account for skewness and/or “kurtosis”/tailweight (while retaining unimodality)?Modelling data with such families of distributions will, inter alia, afford robust estimation of location (and maybe scale).
FAMILY 1 g
Actual density of order statistic: (i,n integer) Generalised density of order statistic: (a,b>0 real)
Roles of a and b • a=b=1: f = g • a=b: family of symmetric distributions • a≠b: skew distributions • a controls left-hand tail weight, b controls right • the smaller a or b, the heavier the corresponding tail
Properties of (Generalised) Order Statistic Distributions • Distribution function: • Tail behaviour. For large x>0: • power tails: • exponential tails: • Limiting distributions: • a and b large: normal distribution • one of a or b large, appropriate extreme value distribution Other properties such as moments and modality need to be examined on a case-by-case basis For more, see Jones (2004, Test)
Tractable Example 1 Jones & Faddy’s (2003, JRSSB)skew t density When a=b, Student t density on 2a d.f.
Tractable Example 2 • : The (order statistics of the) logistic distribution generate the ??? • : Log F distribution • This has exponential tails
These examples, seen before, are therefore log F distributions
The simple exponential tail property is shared by: • the log F distribution • the asymmetric Laplace distribution • the hyperbolic distribution Is there a general form for such distributions?
FAMILY 2: distributions with simple exponential tails Starting point: simple symmetric g with distribution function G and General form for density is:
Special Cases • G is point mass at zero, G^[2]=xI(x>0) • f is asymmetric Laplace • G is logistic, G^[2]=log(1+exp(x)) • f is log F • G is t_2, G^[2]=½(x+√(1+x^2)) • f is hyperbolic • G is normal, G^[2]= xΦ(x)+φ(x) • G uniform, G^[2]=½(1+x)I(-1<x<1)+I(x>1)
solid line: log F dashed line: hyperbolic dotted line: normal-based
Practical Point 1 • the asymmetric Laplace is a three parameter distribution; other members of family have four; • fourth parameter is redundant in practice: (asymptotic) correlations between ML estimates of σ and either of a or b are very near 1; • reason: σ, a and b are all scale parameters, yet you only need two such parameters to describe main scale-related aspects of distribution [either (i) a left-scale and a right-scale or (ii) an overall scale and a left-right comparer]
Practical Point 2Parametrise by μ, σ, a=1-p, b=p.Then, score equation for μ reads: This is kernel quantile estimation, with kernel G and bandwidth σ
Includes bandwidth selection by choosing σ to solve the second score equation: But its simulation performance is variable:
And so to Quantile Regression: The usual (regression) log-likelihood, is kernel localised to point x by
Writing and this (version of) DOUBLE KERNEL LOCAL LINEAR QUANTILE REGRESSIONsatisfies Contrast this with Yu & Jones (1998, JASA) version of DKLLQR: where
The ‘vertical’ bandwidth σ=σ(x) can also be estimated by ML: solve Compare 3 versions of DKLLQR: Yu & Jones (1998) including r-o-t σ and h; new version including r-o-t σ and h; new version including above σ and r-o-t h.
Based on this limited evidence: • Clear recommendation: • replace Yu & Jones (1998) DKLLQR method by (gently but consistently improved) new version • Unclear non-recommendation: • use new bandwidth selection?