140 likes | 151 Views
Computing the Posterior Probability. The posterior probability distribution contains the complete information concerning the parameters, but need often need to integrate it E.g., normalizing posterior requires an n-dimensional integral for an n-parameter posterior
E N D
Computing the Posterior Probability • The posterior probability distribution contains the complete information concerning the parameters, but need often need to integrate it • E.g., normalizing posterior requires an n-dimensional integral for an n-parameter posterior • Rarely can analytically integrate posterior • Numerical integration is also difficult when n > several
Markov-Chain Monte Carlo Instead of integrating, sample from posterior in a way that results in a set of parameters values that is identically-distributed as the posterior The histogram of chain values for a parameter is a visual representation of the (marginalized) probability distribution for that parameter • Can then easily compute confidence intervals: • Sum histogram from best-fit value (often peak of histogram) in both directions • Stop when x% of values summed for an x% confidence interval
MCMC Algorithms MCMC is similar to a random-walk approach Most common techniques are Metropolis-Hastings and Gibbs Sampling (see Wikipedia entries on these) Metropolis-Hastings is simplest algorithm since it only requires evaluations of the posterior for a given set of parameter values
Gibbs Sampling • Derive conditional probability of each parameter given values of the other parameters • Pick parameter at random • Draw from conditional probability of that parameter given values of all other parameters from previous iteration • Repeat until chain converges Example: Case of a two-variable posterior p(x,y), for each MCMC draw i: xi ~ p(x|y=yi-1) yi ~ p(y|x=xi-1) “~” means “distributed as” A good approach is to form “hierarchical” models that result in simple conditional probabilities.
Metropolis-Hastings • Can be visualized as similar to the rejection method of random number generation • Use a “proposal” distribution that is similar in shape (especially width) to the expected posterior distribution to generate new parameter values • Accept new step when probability of new values is higher, occasionally accept new step otherwise (to go “up hill”, avoiding relative minima)
M-H with a Gaussian Proposal Distr. • For each parameter θ estimate/determine “step” size σ • For each chain iteration, new proposal value θ’= θi-1 + N(σ), N = normal deviate with st. dev. = σ • If p(θ’)/p(θi-1) > 1, θi = θ’ • If p(θ’)/p(θi-1) < 1, accept θi = θ’ with probability p(θ’)/p(θi-1), θi = θi-1 otherwise
M-H Issues • Can be very slow to converge, especially when there are correlated variables • Use multivariate proposal distributions (done in XSPEC approach) • Transform correlated variables: • If x and y are correlated, instead set y = ax + b, fit for x, a, b, compute y from chain values afterwards (may also need to also fix a or b with a tight prior) • Convergence • Run multiple chains, compute convergence statistics
MCMC Example • In Ptak et al. (2007) we used MCMC to fit the X-ray luminosity functions of normal galaxies in the GOODS area (see poster) • Tested code first by fitting published FIR luminosity function • Key advantages: • visualizing full probability space of parameters • ability to derive quantities from MCMC chain value (e.g., luminosity density)
Sanity Check: Fitting local 60 mm LF Φ Fit Saunders et al (1990) LF assuming Gaussian errors and ignoring upper limits Param. S1990 MCMC α 1.09 ± 0.12 1.04 ± 0.08 σ 0.72 ± 0.03 0.75 ± 0.02 Φ* 0.026 ± 0.008 0.026 ± 0.003 log L* 8.47 ± 0.23 8.39 ± 0.15 log L/L○
(Ugly) Posterior Probabilities z< 0.5 X-ray luminosity functions Early-type Galaxies Late-type Galaxies Red crosses show 68% confidence interval
Marginalized Posterior Probabilities Dashed curves show Gaussian with same mean & st. dev. as posterior Dotted curves show prior log φ* log L* s a a s Note: α and σ tightly constrained by (Gaussian) prior, rather than being “fixed”
XSPEC MCMC is based on the Metropolis-Hastings algorithm. The chain proposal command is used to set the proposal distribution. The basic options are multivariate gaussian or cauchy although user-defined distributions can be entered. The covariance matrix for these distributions can be calculated from the best-fit, entered from the command line, or calculated from previous MCMC chain(s). MCMC is integrated into other XSPEC commands. If chains are loaded then these are used to generate confidence regions on parameters, fluxes and luminosities. This is more accurate than the current method for estimating errors on fluxes and luminosities. The tclout simpars command returns a set of parameter values drawn from the probability distribution defined by the currently loaded chain(s). Chains can be saved either as ascii or FITS files. MCMC in XSPEC
XSPEC MCMC Output Histogram and probability density plot (2-d histogram) of spectral fit parameters from an XSPEC MCMC run produced by fv (see https://astrophysics.gsfc.nasa.gov/XSPECwiki)
Future • Use “physical” priors… have posterior from previous work be prior for current work • Use observed distribution of photon indices of nearby AGN when fitting for NH in deep surveys • Incorporate calibration uncertainty into fitting (Kashyap AISR project) • XSPEC has a plug-in mechanism for user-defined proposal distributions… would be good to also allow user-defined priors • Code repository/WIKI for MCMC analysis in astronomy