Mathematical Analysis of MaxEnt for Mixed Pixel Decomposition
Lidan Miao, AICIP Group Meeting, Feb. 23, 2006
Motivation
• Why do we choose maximum entropy?
• What role does maximum entropy play in the algorithm?
• In what sense does the algorithm converge to the optimal solution? (Is there anything else beyond the maximum entropy?)
Decomposition Problem
• Decompose a mixture into constituent materials (assumed to be known in this presentation) and their proportions.
• Mixing model: x = As + n, where the columns of A hold the material signatures, s is the abundance vector, and n is noise (see the sketch below).
• Given A and x, find s: a linear regression problem.
• Physical constraints:
  • Non-negativity: s >= 0
  • Sum-to-one: the components of s sum to 1
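A minimal numerical sketch of this setup in Python; the signatures in A, the abundances s_true, and the noise level are made-up values for illustration only, not data from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)

L, p = 50, 3                        # L spectral bands, p materials
A = rng.random((L, p))              # columns: known material signatures (made up)
s_true = np.array([0.6, 0.3, 0.1])  # non-negative, sums to one

x = A @ s_true + 0.01 * rng.standard_normal(L)  # observed mixed pixel
# Task: given A and x, recover s subject to s >= 0 and s.sum() == 1.
```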
QP & FCLS[1] • Quadratic programming (QP) • Nonlinear optimization • Computationally expensive • Fully constrained least squares (FCLS) • Integrates SCLS and NCLS • NCLS is based on standard algorithm NNLS Is the least square estimation the best in all cases?
Geometric Illustration
• The mixing model is a convex combination of the source vectors
• Unconstrained least squares solves the linear combination problem
• Constrained least squares (QP and FCLS):
  • Sum-to-one: solution lies on the line connecting a1 and a2
  • Non-negative: solution lies in the cone C determined by a1 and a2
• Feasible set of the convex combination: the line segment a1a2
MaxEnt [2]
• Objective function: maximize the entropy H(s) = -Σ_i s_i ln s_i (equivalently, minimize the negative relative entropy)
• Optimization method: penalty function method, i.e., minimize -H(s) + K_k ||As - x||^2 for an increasing sequence K_k
• Limitations:
  • Low convergence rate
  • Theoretically, K_k needs to go to infinity
  • For each K_k, s has no closed-form solution; a numerical method (gradient descent) is needed
  • Low performance when SNR is high: it can never fit the measurement model exactly because K_k cannot reach infinity
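A schematic sketch of the penalty idea, not the exact algorithm of [2]: for a growing sequence K_k, minimize -H(s) + K_k ||As - x||^2. For simplicity the simplex constraints are handed to a generic solver (SLSQP) here, and the K_k schedule is an arbitrary assumption; the point is that the data fit only becomes exact in the limit K_k -> infinity.

```python
import numpy as np
from scipy.optimize import minimize

def penalty_maxent(A, x, Ks=(1.0, 10.0, 100.0, 1000.0)):
    """Penalty-method MaxEnt sketch; Ks is an illustrative increasing schedule."""
    p = A.shape[1]
    s = np.full(p, 1.0 / p)          # start at the maximum entropy point
    for K in Ks:                     # K_k should grow without bound
        def obj(s):
            neg_entropy = np.sum(s * np.log(s + 1e-12))
            return neg_entropy + K * np.sum((A @ s - x) ** 2)
        res = minimize(obj, s, method="SLSQP",
                       bounds=[(0.0, 1.0)] * p,
                       constraints=[{"type": "eq", "fun": lambda s: s.sum() - 1.0}])
        s = res.x
    return s
```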
Gradient Descent MaxEnt (GDME)
• Optimization formulation: minimize the negative entropy Σ_i s_i ln s_i subject to the measurement model and the sum-to-one constraint
• Optimization method:
  • Lagrange multiplier method, which yields s_i = exp(a_i^T lambda) / Σ_j exp(a_j^T lambda)
  • Gradient descent learning of the multiplier lambda (a minimal sketch follows)
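A minimal runnable sketch of the GDME update as described above, with an assumed constant learning rate eta and iteration cap: s is the normalized exponential of the inner products a_i^T lambda (hence automatically non-negative and sum-to-one), and lambda moves along the scaled error vector.

```python
import numpy as np

def gdme(A, x, eta=0.1, n_iter=5000, tol=1e-8):
    """Gradient descent MaxEnt sketch; eta, n_iter, tol are illustrative."""
    L, p = A.shape
    lam = np.zeros(L)                 # lambda = 0 gives equal abundances
    s_prev = None
    for _ in range(n_iter):
        z = A.T @ lam                 # inner products a_i^T lambda
        s = np.exp(z - z.max())       # exponential guarantees non-negativity
        s /= s.sum()                  # denominator only normalizes
        if s_prev is not None and np.linalg.norm(s - s_prev) < tol:
            break                     # stop when s is relatively stable
        s_prev = s
        lam += eta * (x - A @ s)      # multiplier update: scaled error vector
    return s
```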
Convergence Analysis of GDME
• Initialization: the plain negative entropy objective, whose minimum is at equal abundances
• Lambda warps the original objective function to fit the data measurement model
• The search stays inside the feasible set
• Example run: true s1 = 0.1619, estimate 0.1638
• Like MaxEnt [2], the solution is obtained by warping the objective function. Unlike MaxEnt [2], the warping force is a vector instead of a scalar, so it need not go to infinity to fit the measurement model.
Convergence Analysis (cont.)
• Take the first iteration as an example:
  • The multiplier lambda is the scaled error vector
  • s depends on the inner products of the columns of A with lambda; this inner product is the key
  • The denominator of s is only for normalization
  • The exponential function is used to generate a non-negative number
Convergence Analysis (cont.)
• The 2D case is simple to visualize.
• [Figure: 2D illustration annotated "Where to move?", "Objective?", "Proof"]
Stopping Conditions
• FCLS: stop when all components of s are non-negative
  • Generates least squares solutions, minimizing ||As - x||
  • When SNR is low, the algorithm overfits the noise
• MaxEnt [2]: the same condition as FCLS
  • The solution is not least squares
  • It can never fit the data perfectly
• GDME: stop when s is relatively stable
  • Able to give the least squares solution
  • The solution lies somewhere between equal abundances and the least squares estimate, determined by the stopping condition (illustrated below)
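Reusing the gdme sketch from above (with A and x from the earlier setup), the trade-off can be seen by varying the iteration budget; the counts here are arbitrary. A small budget leaves the estimate near equal abundances, a large one drives it toward the least squares fit.

```python
for n in (10, 100, 10000):
    print(n, gdme(A, x, n_iter=n))  # drifts from near-uniform toward the LS solution
```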
Experimental Results
• MaxEnt [2] is too slow to apply to a large image, so the simple data from ref. [2] are used
• Two groups of tests: 250 mixed pixels with randomly generated abundances
• Results: the average of 50 runs
• Parameters: 500, 4
• [Figure: data and results of the two test groups]
Experimental Results (cont.)
• Apply to synthetic hyperspectral images
• Metrics: ARMSE, AAD, AID (sketched below)
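The slide does not define the metrics; the sketch below assumes common readings in the unmixing literature — ARMSE as the average per-pixel root mean square error, AAD as the average angle distance, and AID as the average (symmetric) information divergence between true and estimated abundance vectors.

```python
import numpy as np

def armse(S_true, S_est):
    """Average per-pixel root mean square error; inputs are (pixels, p) arrays."""
    return np.mean(np.sqrt(np.mean((S_true - S_est) ** 2, axis=1)))

def aad(S_true, S_est):
    """Average angle (radians) between true and estimated abundance vectors."""
    cos = np.sum(S_true * S_est, axis=1) / (
        np.linalg.norm(S_true, axis=1) * np.linalg.norm(S_est, axis=1))
    return np.mean(np.arccos(np.clip(cos, -1.0, 1.0)))

def aid(S_true, S_est, eps=1e-12):
    """Average symmetric KL divergence between abundance vectors."""
    P, Q = S_true + eps, S_est + eps
    return np.mean(np.sum(P * np.log(P / Q) + Q * np.log(Q / P), axis=1))
```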
Summary of GDME
• Same target as QP and FCLS, i.e., min ||As - x||
• The maximum entropy formulation incorporates the two constraints through the exponential function and normalization
• Does maximum entropy really play a role?
• With a carefully selected stopping condition, GDME on average generates better abundance estimates
• Convergence is faster than QP and MaxEnt [2] and similar to FCLS (based on experiments)
• GDME is more flexible, showing strong robustness in low-SNR cases
• GDME performs better when the source vectors are close to each other
Future Work
• Speed up the learning algorithm
• Investigate optimal stopping conditions (what is the relationship between SNR and the stopping condition?)
• Study the performance with respect to the number of constituent materials
References
• [1] D. C. Heinz and C.-I Chang, "Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery," IEEE Trans. Geosci. Remote Sensing, vol. 39, no. 3, pp. 529–545, 2001.
• [2] S. Chettri and N. Netanyahu, "Spectral unmixing of remotely sensed imagery using maximum entropy," in Proc. SPIE, vol. 2962, pp. 55–62, 1997.