On the Statistical Analysis of Dirty Pictures Julian Besag
Image Processing • Required in a very wide range of practical problems • Computer vision • Computed tomography • Agriculture • Many more… • Picture acquisition techniques are noisy
Problem Statement • Given a noisy picture • And two sources of information (assumptions) • A multivariate record for each pixel • Pixels close together tend to be alike • Reconstruct the true scene
Notation • S – 2D region, partitioned into pixels numbered 1…n • x = (x1, x2, …, xn) – a coloring of S • x* (realization of X) – the true coloring of S • y = (y1, y2, …, yn) (realization of Y) – the observed records, one per pixel
Assumption #1 • Given a scene x, the random variables Y1, Y2, …, Yn are conditionally independent, and each Yi has the same known conditional density function f(yi|xi), depending only on xi • f(yi|xi) thus captures how records arise from true colors, including the probability of correct acquisition
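In symbols, Assumption #1 says the likelihood of the full record factorizes over pixels (written here in the l(y|x) notation that appears later in the deck):

```latex
l(y \mid x) = \prod_{i=1}^{n} f(y_i \mid x_i)
```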
Assumption #2 • The true coloring x* is a realization of a locally dependent Markov random field with specified distribution {p(x)}
Locally Dependent M.r.f.s • Generally, the conditional distribution of the color at pixel i may depend on all other pixels, S\i • We are only concerned with local dependencies, as formalized below
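Writing ∂i for the set of neighbors of pixel i (notation assumed here), local dependence means the conditional distribution at pixel i reduces to

```latex
p(x_i \mid x_{S \setminus i}) = p(x_i \mid x_{\partial i})
```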
Previous Methodology • Maximum Probability Estimation • Classification by Maximum Marginal Probabilities
Maximum Probability Estimation • Choose an estimate x that has maximum probability given the record vector y • In the Bayesian framework, this is the MAP estimate • In decision theory, it corresponds to a 0–1 loss function
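Writing x̂ for the estimate, this is (in standard Bayesian notation, consistent with the deck's l(y|x) and p(x)):

```latex
\hat{x} = \arg\max_{x} P(x \mid y), \qquad P(x \mid y) \propto l(y \mid x)\, p(x)
```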
Maximum Probability Estimation • Iterate over each pixel • Choose the color xi at pixel i by sampling from its conditional probability • Slowly decreasing the temperature T guarantees convergence (simulated annealing)
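As a sketch of this step (this is Geman and Geman's simulated annealing; the form below is a standard way to write it): at temperature T, the color at pixel i is sampled from

```latex
P_T(x_i \mid y, x_{S \setminus i}) \;\propto\; \left\{ f(y_i \mid x_i)\, p(x_i \mid x_{\partial i}) \right\}^{1/T}
```

and lowering T slowly toward zero concentrates the sampler on the maximum probability estimate.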
Classification by Maximum Marginal Probabilities • Maximize the proportion of correctly classified pixels • Note that P(xi | y) depends on all records • Another proposal: use a small neighborhood for maximization • Still computationally hard because P is not available in closed form
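In symbols (notation as above), this estimator classifies each pixel by its marginal posterior mode:

```latex
\hat{x}_i = \arg\max_{x_i} P(x_i \mid y), \qquad i = 1, \dots, n
```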
Problems • Large scale effects • Favors scenes of single color • Computationally expensive
Estimation by Iterated Conditional Modes • The previously discussed methods have enormous computational demands and undesirable large-scale properties • We want a faster method with good large-scale properties
Iterated Conditional Modes • Update pixel i to the color xi that maximizes its conditional probability given the records y and the current colors of all other pixels • Applied to each pixel in turn, this update defines a single cycle of an iterative algorithm for estimating x*
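A minimal sketch of one such cycle, assuming c unordered colors, univariate Gaussian records with color-dependent means and common variance, and the 8-pixel neighborhood used in the examples below (icm_cycle, mu, and kappa are illustrative names, not from the paper):

```python
import numpy as np

def icm_cycle(x, y, mu, kappa, beta):
    """One ICM cycle: visit each pixel in turn and set it to the color
    maximizing log f(y_i | x_i) + beta * (number of like-colored neighbors)."""
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            best_color, best_score = x[i, j], -np.inf
            for color in range(len(mu)):
                # Log-likelihood of the record under this candidate color
                score = -(y[i, j] - mu[color]) ** 2 / (2 * kappa)
                # Potts-style prior: count 8-neighbors sharing this color
                u = 0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        if (di, dj) != (0, 0) and 0 <= i + di < h and 0 <= j + dj < w:
                            u += int(x[i + di, j + dj] == color)
                score += beta * u
                if score > best_score:
                    best_color, best_score = color, score
            x[i, j] = best_color
    return x
```

Initializing x with the maximum likelihood classification (β = 0) and running a handful of cycles matches the usage in the examples that follow, which report 6–8 cycles.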
Examples of ICM • Each example involves: • c unordered colors • A neighborhood of the 8 surrounding pixels • A known scene x* • At each pixel i, a record yi generated from a Gaussian distribution with mean determined by the true color xi* and variance κ
Extremes of β • β = 0 gives the maximum likelihood classifier, with which ICM is initialized • At β = ∞, xi is determined by a majority vote of its neighbors, with the records yi used only to break ties
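Both extremes are visible in the per-pixel objective that ICM maximizes under this model (a standard form, with u_i(x_i) counting the neighbors of pixel i currently colored x_i and μ(x_i) the color-dependent mean):

```latex
-\frac{\left(y_i - \mu(x_i)\right)^2}{2\kappa} + \beta\, u_i(x_i)
```

At β = 0 only the likelihood term survives; as β → ∞ the neighbor count dominates and the likelihood merely breaks ties.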
Example 1 • 6 cycles of ICM were applied, with β = 1.5
Example 2 • Hand-drawn to display a wide array of features • The records yi were generated by superimposing independent Gaussian noise with √κ = .6 • 8 cycles of ICM, with β increasing from .5 to 1.5 over the first 6
Models for the true scene • Most of the material here is speculative, a topic for future research • There are many kinds of images possessing special structures in the true scene • What we have seen so far in the examples are discrete, unordered colors
Examples of special types of images • Unordered colors • These are generally codes for some other attribute, such as crop identities • Excluded adjacencies • It may be known that certain colors cannot appear on neighboring pixels in the true scene.
More special cases… • Grey-level scenes • Colors may have a natural ordering, such as intensity. The author did not have the computing equipment to process, display, and experiment with 256 grey levels. • Continuous intensities • {p(x)} is a Gaussian M.r.f. with zero mean
More special cases… • Special features, such as thin lines • The author had some success reproducing hedges and roads in radar images • Pixel overlap
Parameter Estimation • This may be computationally expensive • This is often unnecessary • We may need to estimate θ in l(y|x; θ) • Learn how records result from true scenes. • And we may need to estimate Φ in p(x;Φ) • Learn probabilities of true scenes.
Parameter Estimation, cont. • Estimation from training data • Estimation during ICM
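A sketch of how estimation during ICM can proceed (this uses Besag's pseudo-likelihood for the field parameter β and, assuming the Gaussian record model of the examples, the usual variance estimate for κ; both are re-fitted from the current reconstruction x̂ between cycles):

```latex
\hat{\beta} = \arg\max_{\beta} \prod_{i=1}^{n} p\!\left(\hat{x}_i \mid \hat{x}_{\partial i}; \beta\right), \qquad \hat{\kappa} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \mu(\hat{x}_i)\right)^2
```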
Example of Parameter Estimation • Records produced with Gaussian noise, κ = .36 • The correct value of κ, with gradually increasing β, gives 1.2% error • Estimating β = 1.83 and κ = .366 gives 1.2% error • κ known but β estimated as 1.8 gives 1.1% error
Block reconstruction • Suppose the blocks B form 2×2 blocks of four pixels, with overlap between adjacent blocks • At each stage, the block in question must be assigned one of c⁴ colorings, based on its 4 records and its 26 direct and diagonal adjacencies
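In the same spirit as the single-pixel update, the block version can be written as (a sketch, with B the current block and ∂B the pixels in its 26 adjacencies):

```latex
\hat{x}_B \leftarrow \arg\max_{x_B} \left\{ \prod_{i \in B} f(y_i \mid x_i) \right\} p\!\left(x_B \mid \hat{x}_{\partial B}\right)
```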
Block reconstruction example • Univariate Gaussian records with κ = .9105 • Basic ICM with β = 1.5 gives a 9% error rate • ICM on 2×2 blocks with β = ∞ gives a 5.7% error rate
Conclusion • We began by adopting a strict probabilistic formulation for the true scene and the generated records • We then abandoned strict adherence to it in favor of ICM, on grounds of computation and to avoid unwelcome large-scale effects • There is a vast number of problems in image processing and pattern recognition to which statisticians might usefully contribute