Chapter 7 Maximum likelihood classification of remotely sensed imagery 遥感影像分类的最大似然法 ZHANG Jingxiong (张景雄) School of Remote Sensing Information Engineering Wuhan University
Introduction • Thematic classification: an important technique in remote sensing for providing spatial information • For example, land cover information is a key input to biophysical parameterization, environmental modeling, resource management, and other applications.
Historically, visual interpretation (目视判读) was applied to identify homogeneous areal classes based on aerial photographs. • With digital images, it is possible to substantially automate this process using two methods.
The maximum likelihood classification (极大似然分类) - MLC • The most common method for supervised classification • Classes (类别): {ωk, k = 1, …, c} • Measurement vector (观测向量/数据): z(x) • Posterior probability (验后概率): p(ωk | z(x)) • Classification rule (分类规则): assign x to ωi if p(ωi | z(x)) > p(ωj | z(x)) for all j ≠ i
Bayes’ theorem at work … • Candidate cover types {ωk, k = 1, …, c}, c being the total number of classes considered, such as {urban, forest, agriculture} • Bayes’ theorem allows evaluation of the posterior probability (验后概率) p(ωk | z(x)), the probability of class ωk given the data z(x)
Based on: 1) prior probability (先验概率) – the expected occurrence of candidate class ωk: p(ωk) or pk (known before measurement) 2) class-conditional probability density – the occurrence of measurement vector z(x) given class ωk: p(z(x) | ωk) or pk(z(x)) (estimated from training data)
Bayes’ theorem then gives the posterior probability: • p(ωk | z(x)) = p(ωk, z(x)) / p(z(x)) = pk pk(z(x)) / p(z(x)), where p(z(x)) = Σk p(ωk, z(x)) = Σk pk pk(z(x))
Assign observation z(x) to the class ωk with the highest posterior probability p(ωk | z(x)), or • equivalently, the class with the highest product pk pk(z(x)), since dividing by p(z(x)) does not change the relative magnitude of p(ωk | z(x)) across k.
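The decision rule above can be sketched in a few lines of Python. The class names, priors, and density values below are made-up illustration numbers, not from the slides:

```python
# Hypothetical priors p_k and class-conditional densities p_k(z(x)) for one pixel.
priors = {"urban": 0.2, "forest": 0.5, "agriculture": 0.3}
densities = {"urban": 0.010, "forest": 0.004, "agriculture": 0.008}

# The evidence p(z(x)) is a common divisor, so ranking the products
# p_k * p_k(z(x)) ranks the posteriors p(k|z(x)) identically.
scores = {k: priors[k] * densities[k] for k in priors}
evidence = sum(scores.values())                        # p(z(x)) = sum_k p_k p_k(z(x))
posteriors = {k: s / evidence for k, s in scores.items()}

best = max(scores, key=scores.get)                     # same winner as max posterior
```

With these assumed numbers the pixel would be labeled agriculture: 0.3 × 0.008 exceeds the other two products even though forest has the largest prior.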
Computing p(z(x) | ωk) with normally distributed data • The occurrence of measurement vector z(x) given class ωk, assuming a multivariate normal distribution: pk(z(x)) = (2π)^(-b/2) det(covZ|k)^(-1/2) exp(-dis²/2)
dis² is the squared Mahalanobis distance between observation z(x) and the class mean mZ|k: dis² = (z(x) - mZ|k)^T covZ|k^(-1) (z(x) - mZ|k) • covZ|k: variance-covariance matrix of variable Z conditional to class ωk • mZ|k: the mean vector of Z conditional to class ωk • b: the number of features (e.g., spectral bands)
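The density and distance above can be written as one short NumPy function. This is a sketch for illustration; the function name is my own, and the number of bands b is inferred from the mean vector:

```python
import numpy as np

def class_conditional_density(z, mean, cov):
    """p_k(z) = (2*pi)^(-b/2) * det(cov)^(-1/2) * exp(-dis^2 / 2),
    where dis^2 is the squared Mahalanobis distance from z to the class mean."""
    z = np.asarray(z, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    b = mean.size                                   # number of bands/features
    diff = z - mean
    dis2 = diff @ np.linalg.inv(cov) @ diff         # squared Mahalanobis distance
    return (2 * np.pi) ** (-b / 2) * np.linalg.det(cov) ** (-0.5) * np.exp(-0.5 * dis2)
```

At the class mean the distance term vanishes, so with an identity covariance in two bands the density is simply (2π)⁻¹.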
Mathematical detail • The objective/criterion (准则): to minimize the conditional average loss • mis(ωi, ωj): the cost of labeling as class ωi a pixel that actually belongs to class ωj
“0-1 loss function”: 0 – no cost for correct classification; 1 – unit cost for misclassification • The expected loss incurred if pixel x with observation z(x) is classified as ωi: L(ωi | z(x)) = Σj mis(ωi, ωj) p(ωj | z(x)), which for the 0-1 loss reduces to 1 - p(ωi | z(x))
A decision rule that minimizes this loss is: decide ωi for z(x) if and only if p(ωi | z(x)) ≥ p(ωj | z(x)) for all j ≠ i • The Bayesian classification rule thus works as a kind of maximum likelihood classification.
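Under the 0-1 loss, minimizing the expected loss and maximizing the posterior pick the same class. A small sketch with assumed posterior values (not from the slides):

```python
# Assumed posteriors p(j|z) for classes 0, 1, 2 at one pixel.
posteriors = [0.6, 0.3, 0.1]

def expected_loss(i, posteriors):
    """sum_j mis(i, j) p(j|z) with mis(i, j) = 0 if i == j else 1,
    i.e. the total posterior mass of all classes other than i."""
    return sum(p for j, p in enumerate(posteriors) if j != i)

losses = [expected_loss(i, posteriors) for i in range(len(posteriors))]
decision = min(range(len(posteriors)), key=losses.__getitem__)
# decision is class 0: the minimum-loss class is exactly the maximum-posterior class.
```

Because each loss equals 1 minus the corresponding posterior, the two rankings are mirror images of each other, which is the equivalence the slide states.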
Examples • Two classes ω1 and ω2 • Means and dispersion matrices:
m1 = [4 2], m2 = [3 3]
COV1 = |3 4|        COV2 = |4 5|
       |4 6|               |5 7|
COV1^(-1) = | 3   -2 |     COV2^(-1) = | 7/3  -5/3|
            |-2   1.5|                 |-5/3   4/3|
To decide to which class the measurement vector z = (4, 3) belongs. • squared Mahalanobis distances between this measurement vector and the two class means: z - m1 = (0, 1) gives dis1² = 3/2; z - m2 = (1, 0) gives dis2² = 7/3 • class-conditional probability densities: p1(z) = 1/(2π·1.414) exp(-3/4) = 0.0532, p2(z) = 1/(2π·1.732) exp(-7/6) = 0.0286
Assume equal prior class probabilities: • p1 = 1/2 and p2 = 1/2 • the posterior probabilities are: p(ω1|z) = 0.0532 · (1/2) / (0.0266 + 0.0143) = 0.65, p(ω2|z) = 0.0286 · (1/2) / (0.0266 + 0.0143) = 0.35 • so the measurement z is classified into class ω1
When p1 = 1/3 and p2 = 2/3, • the posterior probabilities are: p(ω1|z) = 0.0532 · (1/3) / (0.0177 + 0.0191) = 0.48, p(ω2|z) = 0.0286 · (2/3) / (0.0177 + 0.0191) = 0.52 • Then class ω2 is favored for z over class ω1.
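The two-class example can be checked numerically. Below is a plain-Python sketch of the same computation; the function names are my own, and the code is restricted to the 2-band case:

```python
import math

def mahalanobis_sq(z, m, cov_inv):
    """Squared Mahalanobis distance (z - m)^T cov_inv (z - m), 2-D case."""
    d0, d1 = z[0] - m[0], z[1] - m[1]
    return (d0 * (cov_inv[0][0] * d0 + cov_inv[0][1] * d1)
            + d1 * (cov_inv[1][0] * d0 + cov_inv[1][1] * d1))

def density(z, m, cov, cov_inv):
    """Bivariate normal density (2*pi)^(-1) det(cov)^(-1/2) exp(-dis^2/2)."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    return math.exp(-0.5 * mahalanobis_sq(z, m, cov_inv)) / (2 * math.pi * math.sqrt(det))

# Data from the example: observation, class means, covariances and their inverses.
z = (4.0, 3.0)
m1, m2 = (4.0, 2.0), (3.0, 3.0)
cov1, inv1 = ((3.0, 4.0), (4.0, 6.0)), ((3.0, -2.0), (-2.0, 1.5))
cov2, inv2 = ((4.0, 5.0), (5.0, 7.0)), ((7 / 3, -5 / 3), (-5 / 3, 4 / 3))

def decide(p1, p2):
    """Return the winning class (1 or 2) for priors (p1, p2)."""
    s1 = p1 * density(z, m1, cov1, inv1)
    s2 = p2 * density(z, m2, cov2, inv2)
    return 1 if s1 > s2 else 2

# With equal priors class 1 wins; with priors (1/3, 2/3) the decision shifts to class 2.
```

This mirrors the role of the prior in the example: the likelihoods alone favor class 1, but a strong enough prior for class 2 reverses the decision.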
Looking back … • Pixel-based methods • Parcel-based methods: adaptations of the MLC? Object-oriented approaches?
Extraction of Impervious Surfaces Using Object-Oriented Image Segmentation • Figure: impervious surfaces, USGS NAPP 1 × 1 m DOQQ of an area in North Carolina
Methods for thematic mapping • Parametric: MLC • Non-parametric: artificial neural networks • Non-metric: expert systems, decision-tree classifiers, machine learning
Questions 1. Discuss the importance of statistics in thematic classification of remotely sensed imagery.