Retraining maximum likelihood classifiers using a low-rank model

Arnt-Børre Salberg, Norwegian Computing Center, Oslo, Norway. IGARSS, July 25, 2011

Introduction • Challenge: the dataset shift problem • Training data match the test data poorly due to atmospheric, geographical, botanical and phenological variations in the image data → reduced classification performance • The class-dependent data distribution varies • between training images • between test and training images • Goal: Develop a method that re-estimates the parameters such that the classifier fits the test data well

Introduction • Many surface reflectance algorithms require data from external sources • LEDAPS (Landsat): • ozone and water vapor measurements • Phenological, botanical and geographical variation, in addition to atmospheric variation, makes the calibration problem even harder

An existing method… • Models the test image as a mixture distribution and estimates all parameters using the EM algorithm, with the parameters estimated from training data as initial values • Too many degrees of freedom: the statistical fit is excellent, but class labels get mixed.

Low-rank parameter modeling • Training image k: • Class mean vector and covariance matrix (class i) • Class mean vector and covariance matrix model for the test image • a and b are unknown parameter vectors to be estimated from the data

Low-rank data modeling • The proposed method for modeling the test data is a low-rank approach since the number of parameters in a is L < D. • This is much less than estimating all C·D parameters in m_i, i = 1,…,C • By using a low-rank estimation of the class mean vectors of the test data, the spectral differences between the classes are to a larger degree maintained
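To make the parameter counts concrete, the comparison can be worked out for the two experiments in this deck, using the class and feature counts listed on their slides (6 classes and 2 bands for cloud detection; 4 classes and 6 bands for tree cover mapping). The values of L are not given in the transcript, so only the bound L < D is shown.

```latex
\begin{aligned}
\text{Experiment 1:}\quad & C \cdot D = 6 \cdot 2 = 12 \ \text{mean parameters}
  &&\text{vs.}\quad L < D = 2 \;\Rightarrow\; L = 1,\\
\text{Experiment 2:}\quad & C \cdot D = 4 \cdot 6 = 24 \ \text{mean parameters}
  &&\text{vs.}\quad L < D = 6 \;\Rightarrow\; L \le 5.
\end{aligned}
```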

Parameter estimation • Procedure for estimating a and b: • Select N random samples {y1, y2, …, yN} from the test image • Model them using a Gaussian mixture distribution • Estimate the parameters by maximizing the likelihood
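The procedure above can be sketched as a small EM loop. The model equations did not survive in this transcript, so the sketch assumes a hypothetical affine form: the test-image mean of class i is mu[i] + basis[i] @ a, with the training covariances kept fixed and no b parameter. The names retrain_means, basis and gaussian_pdf are illustrative and not taken from the presentation.

```python
import numpy as np

def gaussian_pdf(y, mean, cov):
    """Multivariate normal density evaluated at each row of y."""
    d = y.shape[1]
    diff = y - mean
    inv = np.linalg.inv(cov)
    maha = np.einsum("nd,de,ne->n", diff, inv, diff)  # squared Mahalanobis distances
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def retrain_means(y, mu, cov, basis, n_iter=50):
    """EM estimate of a shared parameter vector a (assumed model, see lead-in).

    y:     (N, D) random samples from the test image
    mu:    list of C training mean vectors, each (D,)
    cov:   list of C training covariance matrices, each (D, D), kept fixed
    basis: list of C matrices, each (D, L) with L < D (hypothetical modeling functions)
    """
    n_classes = len(mu)
    n_params = basis[0].shape[1]
    a = np.zeros(n_params)                      # start at the training-data means
    priors = np.full(n_classes, 1.0 / n_classes)
    inv_cov = [np.linalg.inv(s) for s in cov]
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every sample
        dens = np.stack(
            [priors[i] * gaussian_pdf(y, mu[i] + basis[i] @ a, cov[i])
             for i in range(n_classes)],
            axis=1,
        )
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted least-squares solution for a, then update the priors
        lhs = np.zeros((n_params, n_params))
        rhs = np.zeros(n_params)
        for i in range(n_classes):
            w = resp[:, i]
            bt_s = basis[i].T @ inv_cov[i]
            lhs += w.sum() * (bt_s @ basis[i])
            rhs += bt_s @ (w[:, None] * (y - mu[i])).sum(axis=0)
        a = np.linalg.solve(lhs, rhs)
        priors = resp.mean(axis=0)
    return a, priors
```

Because only the L entries of a (plus the class priors) are free, all class means move together, which is one way to read the claim that a low-rank model keeps class labels from getting mixed.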

Experiment 1: Cloud detection in optical images • 15 different QuickBird and WorldView-2 images covering 7 different scenes in Norway • Features • Band 2 (green) • Band 3 (red) • Classes • clouds, cloud shadows, vegetation, concrete/asphalt/etc., haze and water • Resolution down-sampled to 19.2 m (16.0 m) • 4 different training (sub)images

Experiment 1: Cloud detection in optical images • Model • d_i is the eigenvector corresponding to the largest eigenvalue n_i of the matrix • [Figure labels: eigenvector, test average]
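Extracting the eigenvector that belongs to the largest eigenvalue can be sketched as follows. The matrix itself is not preserved in this transcript, so the example assumes, purely for illustration, the sample covariance of one class's mean vectors across four training images; the numeric values are made up.

```python
import numpy as np

def leading_eigenvector(matrix):
    """Return the largest eigenvalue of a symmetric matrix and its unit eigenvector."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    return eigenvalues[-1], eigenvectors[:, -1]

# Hypothetical class means (green, red) from four training images; values are made up.
class_means = np.array([
    [0.30, 0.28],
    [0.35, 0.34],
    [0.41, 0.39],
    [0.33, 0.31],
])
scatter = np.cov(class_means, rowvar=False)  # assumed stand-in for the slide's matrix
n_i, d_i = leading_eigenvector(scatter)
```

np.linalg.eigh is used rather than np.linalg.eig because it is intended for symmetric matrices and returns real eigenvalues in sorted order.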

Experiment 1: Cloud detection in optical images • Parameter estimation at iteration l+1 • [update equations not preserved in the transcript]

Results: Cloud detection in optical images • [Figure panels: without retraining, with retraining]

Experiment 2: Tree cover mapping of tropical forest • 13 different Landsat TM images covering an area near Amani, Tanzania (path/row 166/063) • Features • Bands 1-5 and 7 • Classes • Forest, sparse forest, grass and soil • Two training images (1986-10-06 and 2010-02-10)

Experiment 2: Tree cover mapping of tropical forest • Model • a is constrained to contain only positive elements • Solution found using non-negative least squares in combination with iterative maximum-likelihood estimation
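The non-negative least-squares step on its own can be sketched with SciPy. The slide's model equation is missing from the transcript, so this example assumes a hypothetical setup in which a six-band mean vector is expressed as a non-negative combination of the same class's means from the two training images. All numbers are made up, and the iterative maximum-likelihood part is not shown.

```python
import numpy as np
from scipy.optimize import nnls

# Columns: hypothetical class mean vectors (bands 1-5 and 7) from two training images.
train_means = np.array([
    [0.05, 0.07],
    [0.06, 0.09],
    [0.04, 0.08],
    [0.30, 0.25],
    [0.15, 0.20],
    [0.07, 0.12],
])

# Synthetic target built from known non-negative weights, so the fit is exact.
true_a = np.array([0.7, 0.3])
target_mean = train_means @ true_a

# nnls solves min ||train_means @ a - target_mean||_2 subject to a >= 0.
a, residual_norm = nnls(train_means, target_mean)
```

In a full retraining loop the target would presumably come from responsibility-weighted sample statistics at each iteration rather than from a known vector.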

Experiment 2: Tree cover mapping of tropical forest • Parameter estimation at iteration l+1 • [update equations not preserved in the transcript]

Results: Tree cover mapping of tropical forest • [Figure panels: July 2009 and February 2010, each without and with retraining]

Summary and conclusion • Proposed a simple method for handling the dataset shift between training and test data • Cloud detection: Evaluated successfully on many different QuickBird and WorldView-2 images. • Haze versus clouds • Confuses snow and clouds • Guidelines on how to select the low-rank modeling functions are needed • EM algorithm and the local minima problem • More testing and validation of the method is necessary
