Introduction to Unsupervised Image Classification in Remote Sensing

Introduction to Remote Sensing Lecture 12

Unsupervised Classification

Unsupervised Image Classification • Definition: identification of natural groups, or structures, within multispectral data. • Does NOT use training data for individual information classes as the basis for classification. • Image pixels are examined and aggregated into a number of spectral classes based on natural clustering in multi-dimensional space. • Unsupervised classification (UC) is the definition, identification, labeling and mapping of natural spectral classes.

Unsupervised Image Classification • Assumption • Natural spectral groupings exist within a scene (inherently uniform in respect to brightness in several spectral channels). • Spectral classes within a given cover class should "cluster" close together, whereas data in different classes should be well separated. • Supervised: define information categories and then examine their spectral separability. • Unsupervised: determine spectrally separable classes and then define their informational usefulness.

Stages to Unsupervised Classification • Definition of the minimum and maximum number of categories to be generated by the particular classification algorithm (based on an analyst's knowledge or user requirements). • Random selection of pixels to form cluster centres. • The algorithm then finds distances between pixels and forms initial estimates of cluster centres as permitted by user-defined criteria. • As pixels are added to the initial estimates, new class means are calculated. This process is iterated until the means do not change significantly from one iteration to the next.
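The iterative stages above can be sketched in a few lines of Python. This is a minimal illustration only: it is plain k-means with a fixed number of clusters, so it does not implement the minimum/maximum category limits mentioned above, and the names `kmeans`, `max_iter`, `tol` and `seed` are assumptions made for the example rather than anything taken from the lecture.

```python
import math
import random

def kmeans(pixels, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster pixel vectors into k spectral classes (plain k-means sketch)."""
    rng = random.Random(seed)
    # Randomly pick k pixels as the initial cluster centres.
    centres = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(max_iter):
        # Assign every pixel to its nearest centre (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in pixels
        ]
        # Recompute each centre as the mean of its member pixels.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                new_centres.append(
                    [sum(band) / len(members) for band in zip(*members)]
                )
            else:
                new_centres.append(centres[c])  # empty cluster: keep old centre
        # Stop once no centre has moved by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift <= tol:
            break
    return centres, labels

# Hypothetical two-band pixel values forming two obvious groups.
pixels = [(10, 12), (11, 10), (12, 11), (80, 82), (81, 80), (82, 81)]
centres, labels = kmeans(pixels, k=2)
```

The returned labels are spectral classes only; matching them to information classes is a separate step.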

Natural Clusters in 2-band data

We can visualise natural clusters within 3-band RS data with the aid of this diagram, taken from Sabins, "Remote Sensing: Principles and Interpretation", 2nd Edition, for four classes: A = Agriculture; D = Desert; M = Mountains; W = Water.

Assignment of Spectral Categories to Information Categories • The output from this process is a map of the uniform groupings of pixels. • As such they only become useful if they can be matched to one or more ground/information classes in order to produce a final product (e.g. land-use map). • Sometimes the assignment of identifiers to classes can be made on a purely spectral basis (e.g. water). • However, we can only rarely use spectral properties in isolation - other field information is necessary.

True Composite Bands 4, 3, 2 Churn Farm

Nine Clusters Broad detail

Nine Clusters Fine detail

All Clusters

Unsupervised Image Classification • Advantages (relative to supervised classification) • no extensive/detailed a priori knowledge of the region is required (nature of the knowledge required for UC differs from that required for supervised classification) • minimize human errors/biases (fewer decisions by analyst) • produces more uniform classes • spectrally distinct classes present in the data may not have initially been apparent to the analyst.

Unsupervised Image Classification • Disadvantages (relative to supervised classification) • Spectral groupings may not correspond to the information classes of interest to the analyst. • Limited control over the "menu" of classes. • Spectral properties of specific classes will change over time (relationships between information classes and spectral classes are not constant).

Supervised Classification

The Classification Stage • Numerous mathematical approaches to spectral pattern recognition have been developed. 1. Minimum distance to mean 2. Parallelepiped classifier 3. Maximum likelihood classifier • In order to demonstrate some of these methods, it is important to look at the relationship between the spectral response of selected cover-types in relation to the spectral band-widths of the sensor.

Clusters of data in feature-space corresponding to different surfaces

Parallelepiped Classifier • In this classifier, the range of spectral measurements is taken into account. The range is defined by the lowest and highest digital numbers in each band of the training data for a class. • An unknown pixel is therefore classified according to the class range in which it falls. However, difficulties occur when class ranges overlap. This can happen when classes exhibit a high degree of correlation or covariance. • This can be partially overcome by introducing stepped borders to the class ranges.
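The range test described above can be sketched as follows. This is an illustrative sketch, not a reference implementation: the function names, the class names and the digital numbers are invented for the example, and the stepped-border refinement is not included.

```python
def train_ranges(training):
    """Map each class to its (lowest, highest) digital number per band.

    training: {class_name: [pixel, ...]}, each pixel a tuple of band values.
    """
    return {
        name: [(min(band), max(band)) for band in zip(*samples)]
        for name, samples in training.items()
    }

def classify_parallelepiped(pixel, ranges):
    """Return every class whose box contains the pixel.

    The list is empty for an unclassified pixel and holds more than one
    name where class ranges overlap.
    """
    return [
        name
        for name, box in ranges.items()
        if all(lo <= value <= hi for value, (lo, hi) in zip(pixel, box))
    ]

# Hypothetical two-band training pixels.
training = {
    "water": [(10, 5), (14, 9)],
    "veg": [(12, 40), (30, 60)],
    "soil": [(25, 50), (40, 70)],
}
ranges = train_ranges(training)
print(classify_parallelepiped((12, 7), ranges))   # inside one box
print(classify_parallelepiped((28, 55), ranges))  # inside two overlapping boxes
print(classify_parallelepiped((50, 50), ranges))  # inside no box
```

Returning a list rather than a single label makes the overlap problem visible instead of hiding it behind an arbitrary tie-break.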

Simple parallelepiped classification

Parallelepiped classification with more precise boundaries

Reference Image Parallelepiped result

Minimum distance to means classifier 1. Calculate the mean spectral value in each band for each category. 2. Represent the band means of each category as a mean vector. 3. A pixel of unknown identity is classified by computing the distance between its value and each of the category means. 4. After computing the distances, the unknown pixel is assigned to the closest class. • Limitations of this process include the fact that it is insensitive to different degrees of variance within spectral measurements.
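A minimal sketch of these four steps, assuming Euclidean distance in feature space (the slides do not name a distance measure) and using invented class names and values:

```python
import math

def class_means(training):
    """Steps 1-2: mean vector (one mean per band) for each category."""
    return {
        name: [sum(band) / len(samples) for band in zip(*samples)]
        for name, samples in training.items()
    }

def classify_min_distance(pixel, means):
    """Steps 3-4: assign the pixel to the category with the nearest mean."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))

# Hypothetical two-band training pixels.
training = {
    "water": [(10, 5), (14, 9)],
    "veg": [(12, 40), (30, 60)],
}
means = class_means(training)
label = classify_min_distance((13, 20), means)
```

Because only the means are used, a compact class and a widely scattered class are treated identically, which is the limitation noted above.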

Min-distance result Reference Image

Minimum distance to means classification method

Maximum Likelihood Classifier • This classifier quantitatively evaluates both the variance and covariance of the trained spectral response patterns when deciding the fate of an unknown pixel. • To do this, the classifier assumes that the points for each cover-type are normally distributed. • Under this assumption, the distribution of a category response can be completely described by the mean vector and the covariance matrix. • Given these values, the classifier computes the probability that an unknown pixel belongs to each category.
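The idea can be sketched for the two-band case, where the 2x2 covariance matrix can be inverted by hand. This is an illustration under stated assumptions: equal prior probabilities for all classes, exactly two bands, and invented function names and training values. A real implementation would handle any number of bands with a linear-algebra library.

```python
import math

def fit_gaussian(samples):
    """Mean vector and covariance terms (c11, c12, c22) of two-band pixels.

    Needs at least three non-collinear samples so that the covariance
    matrix is invertible.
    """
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    c11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    c22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    c12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), (c11, c12, c22)

def log_likelihood(pixel, mean, cov):
    """Log of the bivariate normal density at the pixel."""
    c11, c12, c22 = cov
    det = c11 * c22 - c12 * c12
    d1, d2 = pixel[0] - mean[0], pixel[1] - mean[1]
    # Squared Mahalanobis distance, using the explicit 2x2 inverse.
    maha = (c22 * d1 * d1 - 2 * c12 * d1 * d2 + c11 * d2 * d2) / det
    return -0.5 * (maha + math.log(det) + 2 * math.log(2 * math.pi))

def classify_max_likelihood(pixel, models):
    """Pick the class with the highest likelihood (equal priors assumed)."""
    return max(models, key=lambda name: log_likelihood(pixel, *models[name]))

# Hypothetical training pixels: one compact class, one widely scattered class.
training = {
    "tight": [(10, 10), (11, 12), (12, 10), (11, 11)],
    "wide": [(30, 30), (50, 35), (40, 60), (35, 45)],
}
models = {name: fit_gaussian(samples) for name, samples in training.items()}
label = classify_max_likelihood((20, 20), models)
```

In this made-up example the pixel (20, 20) lies closer to the mean of the compact class, yet it is assigned to the scattered class because that class's larger variance makes the value far more probable there.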

Maximum likelihood classification method

Reference Image Max-likelihood result

Accuracy Assessment • Accuracy assessment is effectively the detailed assessment of agreement between two maps at specific locations. • Disagreement at those locations is commonly referred to as classification error. • In this case, the units of assessment are simply pixels derived from remote sensing data, and errors are defined as misidentification of the identities of these individual pixels. • The standard form of reporting site-specific errors is an error matrix.
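An error matrix can be built by counting, pixel by pixel, how the reference label and the classified label pair up. The sketch below assumes rows hold the reference classes and columns the classified ones (conventions vary), and all names and labels are invented for illustration.

```python
def error_matrix(reference, predicted, classes):
    """Count pixels by (reference class, classified class).

    Rows are reference classes and columns are classified classes,
    both in the order given by `classes`.
    """
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix

def overall_accuracy(matrix):
    """Share of pixels on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical labels for six check pixels.
classes = ["water", "crop", "urban"]
reference = ["water", "water", "crop", "crop", "crop", "urban"]
predicted = ["water", "crop", "crop", "crop", "urban", "urban"]
matrix = error_matrix(reference, predicted, classes)
accuracy = overall_accuracy(matrix)
```

Off-diagonal cells show which classes are confused with which, something a single accuracy figure cannot convey.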

Advantages of Supervised Classification • Analyst has control • Processing is tied to specific areas of known identity • Analyst not faced with the problem of matching categories on the final map with field information • Operator can detect errors, and often remedy them

Disadvantages of Supervised Classification • The analyst imposes a structure on the data, which may not match reality (it may be over-simple). • Training classes are generally based on field identification, and not on spectral properties (signatures are forced). • Training data selected by the analyst may not be representative of conditions encountered throughout the image (heterogeneity in classes is common). • Collecting training data can be time-consuming and costly (an iterative process). • Unable to recognise and represent special or unique categories not represented in the training data.

Final Technicality: Selection of the Correct Classification Algorithm • There are many classification algorithms available for land-cover mapping. Selection of the appropriate classifier should be made on the basis of local experience. • Unsupervised and supervised methods are appropriate for sites where, respectively, very little or very complete field records are available. • However, even when the accuracy of a classification is determined, it is difficult to anticipate the balance between the effects of the choice of classifier, selection of data, characteristics of landscape, and other factors.

Mixed Pixels

Fuzzy Classifiers • Fuzzy classifiers are a simple evolution of traditional hard classifiers. • In hard classification each pixel is assigned to a class based on some measure of distance in feature-space, from the point representing the pixel to each of the class means. • Pixels are assigned to the class to which they are nearest.

Fuzzy Classifiers • The distances contain all of the information necessary to estimate sub-pixel proportions of land-cover. • The closer a point is to a given class mean, the greater the proportion of the pixel that is assigned to that class. • The exact proportions are determined from the full set of distances to the class means.
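One way to turn the full set of distances into proportions is to weight each class by the inverse of its squared distance and then normalise the weights to sum to one. The slides do not give a formula, so this weighting is an assumption chosen for illustration (it resembles the membership rule used in fuzzy c-means), and the function and class names are invented.

```python
import math

def fuzzy_memberships(pixel, means):
    """Estimate class proportions for a pixel from its distances to class means.

    Assumes inverse-squared-distance weighting, normalised to sum to 1.
    """
    dists = {name: math.dist(pixel, mean) for name, mean in means.items()}
    # A pixel lying exactly on a class mean belongs wholly to that class.
    exact = [name for name, d in dists.items() if d == 0]
    if exact:
        return {name: (1.0 if name == exact[0] else 0.0) for name in dists}
    weights = {name: 1.0 / d ** 2 for name, d in dists.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical two-band class means.
means = {"water": (10, 10), "veg": (30, 10)}
proportions = fuzzy_memberships((15, 10), means)
```

A hard classifier would simply label this pixel "water"; the memberships keep the information that part of it resembles vegetation.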

Mixture Modelling • Mixture modelling aims to map the proportions of each component within each pixel of an image (Settle and Drake, 1993). • Assumption 1: mixing is linear (additive), i.e. components are distributed as patches that are large enough to allow the light to interact only with a specific patch of each component. • Assumption 2: all materials within the image have sufficient spectral contrast to allow their separation. • Assumption 3: the number and identity of the components can be defined in some way. • Assumption 4: pure examples of each component (end-member pixels) in the mixture must be known in order to fit the spectra of these pixels to the imagery and estimate their proportions in each pixel.
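Under the linear mixing assumption, a pixel's reflectance in each channel is the proportion-weighted sum of the end-member reflectances. For the simplest case of two end-members whose proportions sum to one, the least-squares proportion has a closed form, sketched below. The two-end-member restriction, the sum-to-one constraint, the clipping to the range 0 to 1, and all names and reflectance values are assumptions made for the example; practical unmixing with more end-members needs a general least-squares solver.

```python
def unmix_two_endmembers(pixel, em_a, em_b):
    """Least-squares proportions of two end-members in a mixed pixel.

    Assumes pixel = f * em_a + (1 - f) * em_b in every channel and
    returns (f, 1 - f), with f clipped to the range [0, 1].
    """
    diff = [a - b for a, b in zip(em_a, em_b)]
    # Project (pixel - em_b) onto the line joining the two end-members.
    num = sum((x - b) * d for x, b, d in zip(pixel, em_b, diff))
    den = sum(d * d for d in diff)
    f = num / den
    f = min(1.0, max(0.0, f))
    return f, 1.0 - f

# Hypothetical three-channel end-member reflectances.
vegetation = (0.05, 0.40, 0.30)
soil = (0.20, 0.25, 0.35)
# A pixel made of 30% vegetation and 70% soil.
mixed = (0.155, 0.295, 0.335)
f_veg, f_soil = unmix_two_endmembers(mixed, vegetation, soil)
```

The denominator is zero if the two end-members are spectrally identical, which is the situation Assumption 2 rules out.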

Mixture model as a matrix equation: in-pixel end-member proportions and in-pixel reflectance, by image channel

End-members EM1 to EM6 plotted in principal-component space (PC1 against PC2), with points A and B

Case Study

LCM90 - 2000 • The Land Cover Map of Great Britain (1990) is a digital dataset, providing classification of land cover types into 25 classes, at a 25 m (or greater) resolution. • Data from the map provides: • the first complete map of the land cover of Great Britain since the 1960s • the first time the land cover of Great Britain has been comprehensively mapped from satellite information • the first digital map of national land cover • accuracy to the field scale, checked against ground survey

LCM90 Methods • The Land Cover Map of Great Britain (LCMGB) was produced using supervised maximum likelihood classifications of Landsat Thematic Mapper data (Fuller et al. 1994a). • The map, based on a 25 m grid, records 25 cover types, consisting of sea and inland water, beaches and bare ground, developed and arable land, and 18 types of semi-natural vegetation - these are described more fully below. • By combining summer and winter data, classification accuracies were substantially improved over single-date analyses (Fuller et al. 1994b).

The Isle of Wight, a landscape dominated by mixed farming

A three-dimensional land cover map of a part of the North York Moors, created by draping LCM2000 data over a digital terrain model.
