Robust Statistics: Why do we use the norms we do? Henrik Aanæs, IMM, DTU, haa@imm.dtu.dk. A good general reference is: Robust Statistics: Theory and Methods, by Maronna, Martin and Yohai, Wiley Series in Probability and Statistics.
Idea of Robust Statistics: to fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers. This should be done without a pre-processing segmentation of the data. We thus model our data set as consisting of inliers, which follow some distribution, and outliers, which do not. Outliers can be interesting too!
Line Example (figure).
Robust Statistics in Computer Vision: Image Smoothing. Image by Frederico D'Almeida.
Robust Statistics in Computer Vision: Optical Flow. MIT BCS Perceptual Science Group; demo by John Y. A. Wang.
Robust Statistics in Computer Vision: Tracking via View Geometry.
Gaussian/Normal Distribution: The Distribution We Usually Use. Nice properties: the Central Limit Theorem; it induces the two-norm; it leads to linear computations. But: it is fiercely influenced by outliers, and empirical distributions often have 'fatter' tails.
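As a reminder of why the Gaussian induces the two-norm, here is a short maximum-likelihood sketch (the residuals r_i are assumed i.i.d. Gaussian; this derivation is standard, not copied from the slides):

```latex
% Assuming i.i.d. Gaussian residuals r_i with variance sigma^2,
% maximizing the likelihood means minimizing the negative log-likelihood:
\begin{align}
  -\log \prod_i \frac{1}{\sqrt{2\pi}\,\sigma}
      \exp\!\left(-\frac{r_i^2}{2\sigma^2}\right)
  \;=\; \sum_i \frac{r_i^2}{2\sigma^2} + \text{const}.
\end{align}
% The maximum-likelihood estimate is therefore the least-squares (two-norm) fit,
% which is linear for linear models -- but one huge r_i can dominate the sum.
```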
Gaussians Are Just Models Too (alternative title of this talk).
Error or ρ-functions: Converting from Model-Data Deviation to an Objective Function.
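To make the conversion concrete, a minimal sketch in standard M-estimation notation; the residuals r_i(θ) and parameters θ are generic placeholders, not symbols from the slides:

```latex
% The M-estimate replaces the squared residual by a rho-function:
\begin{align}
  \hat{\theta} \;=\; \arg\min_{\theta} \sum_i \rho\bigl(r_i(\theta)\bigr).
\end{align}
% rho(r) = r^2 recovers least squares; a rho that grows more slowly
% for large |r| limits the influence of outliers.
```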
Typical ρ-functions: Where the Robustness in Practice Comes From. General idea: down-weight outliers, i.e. ρ(x) should be 'smaller' for large |x| than the quadratic.
• 2-norm: induced by the Gaussian; very non-robust; the 'standard' choice.
• 1-norm: quite robust; convex; corresponds to the median.
• Huber norm: a mixture of the 1- and 2-norm; convex; has nice theoretical properties.
• Truncated quadratic: discards outliers; works like the Gaussian for inliers; has a discontinuous derivative.
• Bi-squared: discards outliers; smooth.
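A minimal sketch of these five ρ-functions in Python/NumPy; the thresholds k and c are illustrative tuning constants (the conventional 95%-efficiency choices for Huber and bi-squared), not values taken from the slides:

```python
import numpy as np

def rho_l2(r):
    """2-norm: induced by the Gaussian, non-robust."""
    return r**2

def rho_l1(r):
    """1-norm: convex, corresponds to the median."""
    return np.abs(r)

def rho_huber(r, k=1.345):
    """Huber: quadratic for |r| <= k, linear beyond."""
    a = np.abs(r)
    return np.where(a <= k, 0.5 * r**2, k * (a - 0.5 * k))

def rho_truncated_quadratic(r, k=2.0):
    """Truncated quadratic: quadratic for inliers, constant for outliers."""
    return np.minimum(r**2, k**2)

def rho_bisquare(r, c=4.685):
    """Tukey's bi-squared: smooth, constant beyond |r| > c."""
    w = np.clip(r / c, -1.0, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - w**2)**3)
```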
Quantifying Robustness: A Peek at Tools for Analysis. Bias vs. Variance.
Quantifying Robustness: A Peek at Tools for Analysis. (A measure related to the variance on the previous slide.)
Quantifying Robustness: You want to be robust over a range of models.
Quantifying Robustness: A Peek at Tools for Analysis. Other (similar) measures:
• Breakdown point: how many outliers an estimator can handle and still give 'reasonable' results.
• Asymptotic bias: what bias an outlier imposes.
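A small numerical illustration of the breakdown point idea, with hypothetical numbers chosen only to make it concrete (the mean breaks down under a single arbitrarily bad point, whereas the median tolerates up to 50% contamination):

```python
import numpy as np

rng = np.random.default_rng(0)
inliers = rng.normal(loc=5.0, scale=1.0, size=95)   # bulk of the data around 5
outliers = np.full(5, 1000.0)                       # 5% gross outliers
data = np.concatenate([inliers, outliers])

print("mean:  ", data.mean())      # dragged far away from 5 by the outliers
print("median:", np.median(data))  # still close to 5
```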
Back to Images: here we have multiple 'models'. The aim is still to fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers, and without a pre-processing segmentation of the data.
Optimization Methods. Typical approach: • Find an initial estimate. • Use non-linear optimization and/or the EM algorithm. NB: in this course we have seen, and will see, other methods, e.g. with guaranteed convergence.
Hough Transform: One of the oldest robust methods in 'vision', often used for an initial estimate. The curse of dimensionality is the problem. Example from MATLAB help.
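A minimal sketch of a line Hough transform in Python/NumPy; the (ρ, θ) parameterization and bin counts are illustrative choices, not taken from the MATLAB example on the slide:

```python
import numpy as np

def hough_lines(points, n_theta=180, n_rho=200):
    """Accumulate votes for lines x*cos(theta) + y*sin(theta) = rho."""
    pts = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    max_rho = np.hypot(*np.abs(pts).max(axis=0)) + 1e-9
    rho_edges = np.linspace(-max_rho, max_rho, n_rho + 1)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for x, y in pts:
        rho = x * np.cos(thetas) + y * np.sin(thetas)   # one rho per theta
        bins = np.digitize(rho, rho_edges) - 1
        acc[bins, np.arange(n_theta)] += 1              # one vote per theta
    return acc, thetas, rho_edges

# The strongest bin corresponds to the line supported by most points;
# outliers scatter their votes and barely perturb that peak.
```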
RanSaC: Sampling in Hough Space, Better for Higher Dimensions. RANdom SAmple Consensus (RANSAC). Iterate: 1. Draw a minimal sample. 2. Fit a model. 3. Evaluate the model by consensus. In a Hough setting, steps 1 and 2 correspond to finding a 'good' bin in Hough space, and step 3 corresponds to calculating its value. (Run RanDemo.m.)
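A minimal RANSAC sketch for line fitting in Python/NumPy; the threshold and iteration count are illustrative, and this is not the RanDemo.m from the slide:

```python
import numpy as np

def ransac_line(points, n_iter=500, thresh=1.0, rng=None):
    """Fit a line a*x + b*y + c = 0 by RANSAC; return params and inlier mask."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(points, dtype=float)
    best_inliers, best_line = None, None
    for _ in range(n_iter):
        # 1. Draw a minimal sample (two points define a line).
        p, q = pts[rng.choice(len(pts), size=2, replace=False)]
        # 2. Fit the model: line through p and q with unit normal (a, b).
        a, b = q[1] - p[1], p[0] - q[0]
        norm = np.hypot(a, b)
        if norm < 1e-12:
            continue
        a, b = a / norm, b / norm
        c = -(a * p[0] + b * p[1])
        # 3. Evaluate by consensus: count points within `thresh` of the line.
        dist = np.abs(pts @ np.array([a, b]) + c)
        inliers = dist < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_line = inliers, (a, b, c)
    return best_line, best_inliers
```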
RANSAC: How Many Iterations? We need to sample only inliers to 'succeed'. Naïve scheme: try all combinations, i.e. all N!/(N-s)! ordered draws; e.g. for 100 points and a sample size of 7 this is 8.0678e+13 trials. Preferred stopping scheme:
• Stop when there is, e.g., a 99% chance of having drawn an all-inlier sample.
• The chance of drawing an inlier is w = N_in/N; use the consensus set of the best fit so far as an estimate of N_in.
• See e.g. Hartley and Zisserman, "Multiple View Geometry".
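The stopping rule referenced above (as in Hartley and Zisserman) can be written out: with inlier fraction w = N_in/N, minimal sample size s, and a desired confidence p (e.g. 0.99) of drawing at least one all-inlier sample,

```latex
% Probability that one minimal sample is all inliers: w^s.
% Probability that N samples each contain at least one outlier: (1 - w^s)^N.
% Requiring this failure probability to be at most 1 - p gives
\begin{align}
  N \;\geq\; \frac{\log(1 - p)}{\log(1 - w^{s})}.
\end{align}
% Since w is unknown, it is re-estimated from the largest consensus set so far.
```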
Iteratively Reweighted Least Squares (IRLS): An EM-type or chicken-and-egg optimization.
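A minimal IRLS sketch in Python/NumPy for a linear model, using Huber weights; the weight function, scale estimate, and tolerance are illustrative assumptions, not the lecture's implementation:

```python
import numpy as np

def irls(X, y, k=1.345, n_iter=50, tol=1e-8):
    """Robust linear fit of min sum rho(y - X @ beta) by reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # initial (non-robust) estimate
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12     # robust scale via the MAD
        a = np.abs(r / scale)
        w = np.where(a <= k, 1.0, k / a)                  # Huber weights: small for outliers
        # Weighted least squares: the "chicken" step given the "egg" (the weights).
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        if np.linalg.norm(beta_new - beta) < tol:
            break
        beta = beta_new
    return beta
```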