Digital Image Processing: A Remote Sensing Perspective. FW 5560 Per-Pixel Classification Algorithms. Per-Pixel Supervised Classification Algorithms: Parallelepiped Classification Algorithm, Minimum Distance to Means.
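Using a one-standard-deviation threshold, a parallelepiped algorithm decides that BVijk is in class c if, and only if:
$$\mu_{ck} - \sigma_{ck} \le BV_{ijk} \le \mu_{ck} + \sigma_{ck}$$
where
• $\mu_{ck}$ and $\sigma_{ck}$ are the mean and standard deviation of the training data for class c in band k,
• c = 1, 2, 3, …, m, where m is the number of classes, and
• k = 1, 2, 3, …, n, where n is the number of bands.
Therefore, if the low and high decision boundaries are defined as:
$$L_{ck} = \mu_{ck} - \sigma_{ck}$$
and
$$H_{ck} = \mu_{ck} + \sigma_{ck}$$
the parallelepiped algorithm becomes
$$L_{ck} \le BV_{ijk} \le H_{ck}$$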
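The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name, the `n_std` parameter, and the training statistics are hypothetical, and resolving overlapping boxes by taking the first matching class is an assumption the slides do not specify.

```python
def parallelepiped_classify(pixel, means, stds, n_std=1.0):
    """Return the index of the first class whose box contains the pixel, or None.

    pixel: brightness values, one per band (BV_ijk for k = 1..n).
    means, stds: per-class lists of per-band means and standard deviations.
    """
    for c, (mu_c, sigma_c) in enumerate(zip(means, stds)):
        # The pixel must fall between the low and high boundary in every band.
        inside = all(
            mu - n_std * sigma <= bv <= mu + n_std * sigma
            for bv, mu, sigma in zip(pixel, mu_c, sigma_c)
        )
        if inside:
            return c  # assumption: first matching class wins when boxes overlap
    return None  # pixel falls outside every parallelepiped


# Hypothetical two-class, two-band training statistics.
means = [[40.0, 60.0], [100.0, 30.0]]
stds = [[5.0, 8.0], [10.0, 4.0]]

print(parallelepiped_classify([42.0, 55.0], means, stds))  # inside the class 0 box
print(parallelepiped_classify([95.0, 33.0], means, stds))  # inside the class 1 box
print(parallelepiped_classify([70.0, 70.0], means, stds))  # outside both boxes
```

A pixel that lies outside every box is left unclassified (`None` here), which is the characteristic weakness of this classifier.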
Minimum Distance to Means Classification Algorithm
The minimum distance to means algorithm is computationally simple. The user provides the mean vector for each class in each band, µck, from the training data. To perform a minimum distance classification, the program calculates the distance from each unknown pixel (BVijk) to each mean vector µck and assigns the pixel to the class whose mean is nearest.
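As a rough sketch of the procedure just described, the following Python computes the distance from one pixel to every class mean and picks the nearest. Euclidean distance is an assumption here (the text does not name the metric), and the function name and sample means are hypothetical.

```python
import math


def minimum_distance_classify(pixel, means):
    """Assign the pixel to the class with the nearest mean vector.

    Returns (class_index, distances), using Euclidean distance (an assumption).
    """
    distances = [math.dist(pixel, mu_c) for mu_c in means]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances


# Hypothetical mean vectors for two classes in two bands.
means = [[40.0, 60.0], [100.0, 30.0]]

label, distances = minimum_distance_classify([50.0, 50.0], means)
print(label, distances)
```

Unlike the parallelepiped rule, every pixel receives a label, because some class mean is always nearest.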
Mahalanobis Distance Classification
• Mahalanobis distance classification is similar to minimum distance, except that the covariance matrix is used. Variance and covariance are taken into account so that highly varied clusters lead to similarly varied classes, and vice versa. For example, when classifying urban areas (typically a class whose pixels vary widely), correctly classified pixels may lie farther from the mean than those of a class such as water, which usually varies little.
• The equation is:
$$D = (X - M_c)^{T} \, \mathrm{Cov}_c^{-1} \, (X - M_c)$$
• Where:
• D = Mahalanobis distance
• c = a particular class
• X = the measurement vector of the candidate pixel
• Mc = the mean vector of the signature of class c
• Covc = the covariance matrix of the pixels in the signature of class c
• Covc⁻¹ = inverse of Covc
• T = transposition
• The pixel is assigned to the class, c, for which D is the lowest.
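The urban versus water example can be made concrete with a small sketch. To stay dependency-free it handles only two bands, inverting the 2×2 covariance matrix by hand; the helper names and the class statistics are hypothetical. The test pixel is deliberately equidistant from both means in the Euclidean sense.

```python
def inverse_2x2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def mahalanobis(x, mean, cov):
    """D = (X - Mc)^T Covc^-1 (X - Mc), restricted to two bands."""
    inv = inverse_2x2(cov)
    dx = [x[0] - mean[0], x[1] - mean[1]]
    # First multiply Covc^-1 by (X - Mc), then take the dot product with (X - Mc).
    t0 = inv[0][0] * dx[0] + inv[0][1] * dx[1]
    t1 = inv[1][0] * dx[0] + inv[1][1] * dx[1]
    return dx[0] * t0 + dx[1] * t1


def mahalanobis_classify(x, means, covs):
    """Assign x to the class with the lowest D; returns (class_index, distances)."""
    d = [mahalanobis(x, m, c) for m, c in zip(means, covs)]
    return min(range(len(d)), key=d.__getitem__), d


# Hypothetical signatures: class 0 = urban (high variance), class 1 = water (low variance).
means = [[80.0, 80.0], [20.0, 10.0]]
covs = [[[400.0, 0.0], [0.0, 400.0]], [[4.0, 0.0], [0.0, 4.0]]]

label, d = mahalanobis_classify([50.0, 45.0], means, covs)
print(label, d)
```

Because the urban class is far more variable, the same geometric offset yields a much smaller D for urban than for water, so the pixel is assigned to urban even though a plain Euclidean rule would see a tie.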
Maximum Likelihood Classification Algorithm
• The aforementioned classifiers are based on identifying decision boundaries in feature space using multispectral distance measurements. The maximum likelihood decision rule is based on probability.
• It assigns each pixel having pattern measurements or features X to the class i whose units are most probable or likely to have given rise to feature vector X.
• In other words, the probability of a pixel belonging to each of a predefined set of m classes is calculated, and the pixel is then assigned to the class for which the probability is the highest.
• The maximum likelihood decision rule is one of the most widely used supervised classification algorithms.
The maximum likelihood procedure assumes that the training data statistics for each class in each band are normally distributed (Gaussian). Training data with bimodal or n-modal histograms in a single band are not ideal. The individual modes probably represent distinct classes that should be trained on individually and labeled as separate training classes. Doing so produces unimodal, Gaussian training class statistics that fulfill the normal distribution requirement.
For example, consider the hypothetical histogram (data frequency distribution) of forest training data obtained in band k. We could choose to store the values contained in this histogram in the computer, but a more elegant solution is to approximate the distribution by a normal probability density function (curve), as shown superimposed on the histogram.
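The estimated probability density function for class wi with one band of data is computed using the equation:
$$\hat{p}(x \mid w_i) = \frac{1}{(2\pi)^{1/2}\,\hat{\sigma}_i} \exp\left[-\frac{1}{2}\,\frac{(x - \hat{\mu}_i)^2}{\hat{\sigma}_i^{2}}\right]$$
Where:
• exp [ ] is e (the base of the natural logarithms) raised to the computed power,
• x is one of the brightness values on the x-axis,
• $\hat{\mu}_i$ is the estimated mean of all the values in the forest training class, and
• $\hat{\sigma}_i^{2}$ is the estimated variance of all the measurements in this class.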
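To illustrate the single-band case, the sketch below estimates the mean and variance from a handful of training values and evaluates the normal density at a brightness value. The forest brightness values are invented for illustration, and using the sample variance (dividing by n − 1) is an assumption.

```python
import math
import statistics


def gaussian_density(x, mean, variance):
    """Normal probability density at x for the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2 * math.pi * variance)


# Hypothetical forest training brightness values in one band.
forest = [38.0, 40.0, 41.0, 42.0, 42.0, 43.0, 44.0, 46.0]

mu = statistics.mean(forest)       # estimated mean
var = statistics.variance(forest)  # estimated (sample) variance, an assumption

print(mu, var)
print(gaussian_density(42.0, mu, var))  # density at the class mean
print(gaussian_density(50.0, mu, var))  # much lower density far from the mean
```

Only two numbers per class and band (mean and variance) need to be stored, rather than the full histogram.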
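If the training data consist of multiple bands of remote sensor data for the classes of interest, we compute an n-dimensional multivariate normal density function using:
$$p(X \mid w_i) = \frac{1}{(2\pi)^{n/2}\,\lvert V_i \rvert^{1/2}} \exp\left[-\frac{1}{2}\,(X - M_i)^{T}\, V_i^{-1}\, (X - M_i)\right]$$
Where:
• $\lvert V_i \rvert$ is the determinant of the covariance matrix,
• $V_i^{-1}$ is the inverse of the covariance matrix, and
• $(X - M_i)^{T}$ is the transpose of the vector $(X - M_i)$.
The mean vectors (Mi) and covariance matrices (Vi) for each class are estimated from the training data.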
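A minimal sketch of the resulting decision rule follows, assuming equal prior probabilities for all classes and only two bands (so the 2×2 inverse and determinant can be written out by hand). It compares log densities rather than raw densities, which gives the same ranking but avoids underflow. Function names and class statistics are hypothetical.

```python
import math


def log_density_2band(x, mean, cov):
    """Natural log of the bivariate normal density p(X | w_i) (n = 2 bands)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (X - M)^T V^-1 (X - M), using the closed-form 2x2 inverse.
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    # With n = 2, (2*pi)^(n/2) is simply 2*pi.
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def maximum_likelihood_classify(x, means, covs):
    """Assign x to the class with the highest density (equal priors assumed)."""
    scores = [log_density_2band(x, m, v) for m, v in zip(means, covs)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical training statistics: class 0 = urban, class 1 = water.
means = [[80.0, 80.0], [20.0, 10.0]]
covs = [[[400.0, 0.0], [0.0, 400.0]], [[4.0, 0.0], [0.0, 4.0]]]

print(maximum_likelihood_classify([50.0, 45.0], means, covs)[0])  # urban
print(maximum_likelihood_classify([21.0, 11.0], means, covs)[0])  # water
```

The determinant term is what separates this rule from a pure Mahalanobis ranking: a class with a very large covariance pays a penalty for spreading its probability over a larger region of feature space.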
ISODATA: Unsupervised Classification Algorithm
The Iterative Self-Organizing Data Analysis Technique (ISODATA) is a comprehensive set of heuristic (rule of thumb) procedures that have been incorporated into an iterative classification algorithm. The ISODATA algorithm includes a) rules for merging clusters if their separation distance in multispectral feature space is below a user-specified threshold and b) rules for splitting a single cluster into two clusters.
ISODATA is iterative: it makes a large number of passes through the dataset. The initial arbitrary assignment of all Cmax clusters takes place along an n-dimensional vector that runs between very specific points in feature space. The region in feature space is defined using the mean, µk, and standard deviation, σk, of each band in the analysis. This method of automatically seeding the original Cmax vectors ensures that the first few lines of data do not bias the creation of clusters.
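One plausible reading of this seeding step is sketched below. It assumes the vector runs from (µk − σk) to (µk + σk) in every band and that the Cmax initial means are spaced evenly along it; the slides do not state the exact endpoints or spacing, so both are assumptions, as are the function name and the band statistics.

```python
def seed_cluster_means(band_means, band_stds, c_max):
    """Place c_max initial cluster means evenly along the vector from
    (mean - std) to (mean + std) in every band (assumed endpoints)."""
    low = [m - s for m, s in zip(band_means, band_stds)]
    high = [m + s for m, s in zip(band_means, band_stds)]
    if c_max == 1:
        return [list(band_means)]  # a single seed sits at the overall mean
    seeds = []
    for i in range(c_max):
        t = i / (c_max - 1)  # fraction of the way along the vector, 0..1
        seeds.append([lo + t * (hi - lo) for lo, hi in zip(low, high)])
    return seeds


# Hypothetical statistics for a two-band image.
seeds = seed_cluster_means([50.0, 100.0], [10.0, 20.0], c_max=5)
for s in seeds:
    print(s)
```

Because the seeds depend only on whole-image band statistics, the starting point is the same regardless of the order in which image lines are read.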
ISODATA is self-organizing because it requires relatively little human input. It normally requires the analyst to specify the following criteria:
• Cmax: the maximum number of clusters to be identified by the algorithm (e.g., 20 clusters).
• T: the maximum percentage of pixels whose class values are allowed to be unchanged between iterations. When this number is reached, the ISODATA algorithm terminates.
• M: the maximum number of times ISODATA is to classify pixels and recalculate cluster mean vectors. The ISODATA algorithm terminates when this number is reached.
• Minimum members in a cluster (%): If a cluster contains fewer than the minimum percentage of members, it is deleted and its members are assigned to an alternative cluster.
• Maximum standard deviation (σmax): When the standard deviation for a cluster exceeds the specified maximum standard deviation and the number of members in the class is greater than twice the specified minimum members in a class, the cluster is split into two clusters. The mean vectors for the two new clusters are the old class centers ±1σ.
• Minimum distance between cluster means (C): Clusters with a weighted distance less than this value are merged.
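The stopping, splitting, and merging criteria listed above can be sketched as small helper functions. Several simplifications are assumptions rather than part of the slides: member thresholds are expressed as pixel counts instead of percentages, a split is triggered by the largest per-band standard deviation, plain Euclidean distance stands in for the weighted distance, and merged means are weighted by member count. All names are hypothetical.

```python
import math


def should_stop(pct_unchanged, iteration, t_threshold, m_max):
    """Terminate when T percent of pixels are unchanged or M iterations are done."""
    return pct_unchanged >= t_threshold or iteration >= m_max


def should_split(stds, n_members, sigma_max, min_members):
    """Split when the spread exceeds sigma_max and the cluster is large enough."""
    return max(stds) > sigma_max and n_members > 2 * min_members


def split_cluster(mean, stds):
    """New means are the old cluster center minus and plus one standard deviation."""
    lower = [m - s for m, s in zip(mean, stds)]
    upper = [m + s for m, s in zip(mean, stds)]
    return lower, upper


def should_merge(mean_a, mean_b, min_distance):
    """Merge when the (unweighted, assumed) distance between means is below C."""
    return math.dist(mean_a, mean_b) < min_distance


def merge_clusters(mean_a, n_a, mean_b, n_b):
    """Member-weighted average of two cluster means (an assumed merge rule)."""
    total = n_a + n_b
    return [(a * n_a + b * n_b) / total for a, b in zip(mean_a, mean_b)]


# Hypothetical cluster with a wide spread in band 1.
print(should_split([12.0, 4.0], n_members=500, sigma_max=10.0, min_members=100))
print(split_cluster([60.0, 40.0], [12.0, 4.0]))
print(should_merge([48.0, 36.0], [50.0, 37.0], min_distance=5.0))
print(merge_clusters([48.0, 36.0], 100, [50.0, 37.0], 300))
print(should_stop(pct_unchanged=96.0, iteration=7, t_threshold=95.0, m_max=20))
```

In a full ISODATA loop these checks would run after each pass of assigning pixels to the nearest cluster mean and recomputing the means.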