Robust Statistics: Why do we use the norms we do? Henrik Aanæs, IMM, DTU, haa@imm.dtu.dk. A good general reference is: Robust Statistics: Theory and Methods, by Maronna, Martin and Yohai, Wiley Series in Probability and Statistics.
Idea of Robust Statistics: to fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers. This should be done without a pre-processing segmentation of the data. We thus model our data set as consisting of inliers, which follow some distribution, and outliers, which do not. Outliers can be interesting too!
Line Example (figure).
Robust Statistics in Computer Vision: Image Smoothing. Image by Frederico D'Almeida.
Robust Statistics in Computer Vision: Optical Flow. MIT BCS Perceptual Science Group; demo by John Y. A. Wang.
Robust Statistics in Computer Vision: Tracking via View Geometry.
Gaussian/Normal Distribution: The Distribution We Usually Use. Nice properties: the Central Limit Theorem; it induces the two-norm; it leads to linear computations. But: it is fiercely influenced by outliers, and empirical distributions often have 'fatter' tails.
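As a reminder of why the Gaussian induces the two-norm, here is a short maximum-likelihood sketch (the residuals r_i are assumed i.i.d. Gaussian; this derivation is standard, not copied from the slides):

```latex
% Assuming i.i.d. Gaussian residuals r_i with variance sigma^2,
% maximizing the likelihood means minimizing the negative log-likelihood:
\begin{align}
  -\log \prod_i \frac{1}{\sqrt{2\pi}\,\sigma}
      \exp\!\left(-\frac{r_i^2}{2\sigma^2}\right)
  \;=\; \sum_i \frac{r_i^2}{2\sigma^2} + \text{const}.
\end{align}
% The maximum-likelihood estimate is therefore the least-squares (two-norm) fit,
% which is linear for linear models -- but one huge r_i can dominate the sum.
```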
Gaussians Are Just Models Too (alternative title of this talk).
Error or ρ-functions: Converting from Model-Data Deviation to an Objective Function.
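To make the conversion concrete, a minimal sketch in standard M-estimation notation; the residuals r_i(θ) and parameters θ are generic placeholders, not symbols from the slides:

```latex
% The M-estimate replaces the squared residual by a rho-function:
\begin{align}
  \hat{\theta} \;=\; \arg\min_{\theta} \sum_i \rho\bigl(r_i(\theta)\bigr).
\end{align}
% rho(r) = r^2 recovers least squares; a rho that grows more slowly
% for large |r| limits the influence of outliers.
```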
Typical ρ-functions: Where the Robustness in Practice Comes From. General idea: down-weight outliers, i.e. ρ(x) should be 'smaller' for large |x| than the quadratic.
• 2-norm: induced by the Gaussian; very non-robust; the 'standard' choice.
• 1-norm: quite robust; convex; corresponds to the median.
• Huber norm: a mixture of the 1- and 2-norm; convex; has nice theoretical properties.
• Truncated quadratic: discards outliers; works like the Gaussian for inliers; has a discontinuous derivative.
• Bi-squared: discards outliers; smooth.
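A minimal sketch of these five ρ-functions in Python/NumPy; the thresholds k and c are illustrative tuning constants (the conventional 95%-efficiency choices for Huber and bi-squared), not values taken from the slides:

```python
import numpy as np

def rho_l2(r):
    """2-norm: induced by the Gaussian, non-robust."""
    return r**2

def rho_l1(r):
    """1-norm: convex, corresponds to the median."""
    return np.abs(r)

def rho_huber(r, k=1.345):
    """Huber: quadratic for |r| <= k, linear beyond."""
    a = np.abs(r)
    return np.where(a <= k, 0.5 * r**2, k * (a - 0.5 * k))

def rho_truncated_quadratic(r, k=2.0):
    """Truncated quadratic: quadratic for inliers, constant for outliers."""
    return np.minimum(r**2, k**2)

def rho_bisquare(r, c=4.685):
    """Tukey's bi-squared: smooth, constant beyond |r| > c."""
    w = np.clip(r / c, -1.0, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - w**2)**3)
```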
Quantifying Robustness: A Peek at Tools for Analysis. Bias vs. Variance.
Quantifying Robustness: A Peek at Tools for Analysis. (A measure related to the variance on the previous slide.)
Quantifying Robustness: You want to be robust over a range of models.
Quantifying Robustness: A Peek at Tools for Analysis. Other (similar) measures:
• Breakdown point: how many outliers an estimator can handle and still give 'reasonable' results.
• Asymptotic bias: what bias an outlier imposes.
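A small numerical illustration of the breakdown point idea, with hypothetical numbers chosen only to make it concrete (the mean breaks down under a single arbitrarily bad point, whereas the median tolerates up to 50% contamination):

```python
import numpy as np

rng = np.random.default_rng(0)
inliers = rng.normal(loc=5.0, scale=1.0, size=95)   # bulk of the data around 5
outliers = np.full(5, 1000.0)                       # 5% gross outliers
data = np.concatenate([inliers, outliers])

print("mean:  ", data.mean())      # dragged far away from 5 by the outliers
print("median:", np.median(data))  # still close to 5
```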
Back to Images: here we have multiple 'models'. The aim is still to fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers, and without a pre-processing segmentation of the data.
Optimization Methods. Typical approach: • Find an initial estimate. • Use non-linear optimization and/or the EM algorithm. NB: in this course we have seen, and will see, other methods, e.g. with guaranteed convergence.
Hough Transform: One of the oldest robust methods in 'vision', often used for an initial estimate. The curse of dimensionality is the problem. Example from MATLAB help.
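A minimal sketch of a line Hough transform in Python/NumPy; the (ρ, θ) parameterization and bin counts are illustrative choices, not taken from the MATLAB example on the slide:

```python
import numpy as np

def hough_lines(points, n_theta=180, n_rho=200):
    """Accumulate votes for lines x*cos(theta) + y*sin(theta) = rho."""
    pts = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    max_rho = np.hypot(*np.abs(pts).max(axis=0)) + 1e-9
    rho_edges = np.linspace(-max_rho, max_rho, n_rho + 1)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for x, y in pts:
        rho = x * np.cos(thetas) + y * np.sin(thetas)   # one rho per theta
        bins = np.digitize(rho, rho_edges) - 1
        acc[bins, np.arange(n_theta)] += 1              # one vote per theta
    return acc, thetas, rho_edges

# The strongest bin corresponds to the line supported by most points;
# outliers scatter their votes and barely perturb that peak.
```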
RanSaC: Sampling in Hough Space, Better for Higher Dimensions. RANdom SAmple Consensus (RANSAC). Iterate: 1. Draw a minimal sample. 2. Fit a model. 3. Evaluate the model by consensus. In a Hough setting, steps 1 and 2 correspond to finding a 'good' bin in Hough space, and step 3 corresponds to calculating its value. (Run RanDemo.m.)
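A minimal RANSAC sketch for line fitting in Python/NumPy; the threshold and iteration count are illustrative, and this is not the RanDemo.m from the slide:

```python
import numpy as np

def ransac_line(points, n_iter=500, thresh=1.0, rng=None):
    """Fit a line a*x + b*y + c = 0 by RANSAC; return params and inlier mask."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(points, dtype=float)
    best_inliers, best_line = None, None
    for _ in range(n_iter):
        # 1. Draw a minimal sample (two points define a line).
        p, q = pts[rng.choice(len(pts), size=2, replace=False)]
        # 2. Fit the model: line through p and q with unit normal (a, b).
        a, b = q[1] - p[1], p[0] - q[0]
        norm = np.hypot(a, b)
        if norm < 1e-12:
            continue
        a, b = a / norm, b / norm
        c = -(a * p[0] + b * p[1])
        # 3. Evaluate by consensus: count points within `thresh` of the line.
        dist = np.abs(pts @ np.array([a, b]) + c)
        inliers = dist < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_line = inliers, (a, b, c)
    return best_line, best_inliers
```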
RANSAC: How Many Iterations? We need to sample only inliers to 'succeed'. Naïve scheme: try all combinations, i.e. all N!/(N-s)! ordered draws; e.g. for 100 points and a sample size of 7 this is 8.0678e+13 trials. Preferred stopping scheme:
• Stop when there is, e.g., a 99% chance of having drawn an all-inlier sample.
• The chance of drawing an inlier is w = N_in/N; use the consensus set of the best fit so far as an estimate of N_in.
• See e.g. Hartley and Zisserman, "Multiple View Geometry".
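The stopping rule referenced above (as in Hartley and Zisserman) can be written out: with inlier fraction w = N_in/N, minimal sample size s, and a desired confidence p (e.g. 0.99) of drawing at least one all-inlier sample,

```latex
% Probability that one minimal sample is all inliers: w^s.
% Probability that N samples each contain at least one outlier: (1 - w^s)^N.
% Requiring this failure probability to be at most 1 - p gives
\begin{align}
  N \;\geq\; \frac{\log(1 - p)}{\log(1 - w^{s})}.
\end{align}
% Since w is unknown, it is re-estimated from the largest consensus set so far.
```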
Iteratively Reweighted Least Squares (IRLS): An EM-type or chicken-and-egg optimization.
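A minimal IRLS sketch in Python/NumPy for a linear model, using Huber weights; the weight function, scale estimate, and tolerance are illustrative assumptions, not the lecture's implementation:

```python
import numpy as np

def irls(X, y, k=1.345, n_iter=50, tol=1e-8):
    """Robust linear fit of min sum rho(y - X @ beta) by reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # initial (non-robust) estimate
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12     # robust scale via the MAD
        a = np.abs(r / scale)
        w = np.where(a <= k, 1.0, k / a)                  # Huber weights: small for outliers
        # Weighted least squares: the "chicken" step given the "egg" (the weights).
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        if np.linalg.norm(beta_new - beta) < tol:
            break
        beta = beta_new
    return beta
```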