Distribution and Outliers
Screening (Significant Effects)
Hadlum vs Hadlum: a univariate example that illustrates deviation from a normal pattern.
Figure: normal duration of pregnancy, percentage (n = 13634) against duration of pregnancy. Source: Barnett (1978) Appl. Statist. 27, 242-250
Comparison of Hadlum Jr. to the normal pattern. Figure: normal duration, percentage (n = 13634), with Hadlum Jr. marked.
Deviation (residual) = observed value - predicted value, i.e. $y - \hat{y}$, where $y$ is the measurement and $\hat{y}$ is the model prediction. Deviations are used for model validation.
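The deviation defined on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `residuals` and the numbers are hypothetical and do not come from the slides.

```python
def residuals(observed, predicted):
    """Return deviation = observed value - predicted value for each pair."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical measurements (y) and model predictions (y-hat), for illustration only
measured = [10.0, 12.5, 15.0]
model = [10.5, 12.0, 15.0]
deviations = residuals(measured, model)
print(deviations)
```

Large deviations relative to the rest are the candidates for outliers or for a model that does not fit.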
Normal population, cumulative plots: traditional graph paper compared with normal distribution paper
Normal plot: 1) Sort the observations in increasing order. 2) Let each observation represent an equal percentage interval of the normal distribution. If the observations are normally distributed, they plot as a straight line in the normal plot! Deviation from a straight line implies outlying observations or a non-normal distribution.
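The two steps of the normal plot can be sketched as follows, using only the Python standard library. The plotting position (i - 0.5)/n, the function names, the correlation as a measure of straightness, and the data are all assumptions made for this illustration; the slides do not specify them.

```python
from statistics import NormalDist, mean


def normal_plot_points(observations):
    """Return (standard normal quantile, observation) pairs for a normal plot."""
    ordered = sorted(observations)  # step 1: increasing order
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, value in enumerate(ordered, start=1):
        # step 2: each observation covers an interval of width 1/n;
        # the interval midpoint (i - 0.5)/n is an assumed plotting position
        p = (i - 0.5) / n
        points.append((std_normal.inv_cdf(p), value))
    return points


def straightness(points):
    """Pearson correlation of the plot coordinates (1.0 = perfectly straight)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: exact normal quantiles, then the same data with one outlier
clean = [NormalDist(100, 10).inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
with_outlier = clean[:-1] + [200.0]

r_clean = straightness(normal_plot_points(clean))
r_outlier = straightness(normal_plot_points(with_outlier))
print(r_clean, r_outlier)
```

Plotting the returned pairs gives the normal plot itself; the correlation is only a rough numerical stand-in for judging straightness by eye.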
Skulls from a cemetery. Figure annotation: maximum. Source: Karl Pearson (1931) Tables for Statisticians and Biometricians, Biometric Lab., London
Is the largest skull from a Maori? Hypothesis: the Maoris have less skull capacity than the whites, so the largest skull is a contaminant, from a shipwrecked sailor or missionary?
Probability plot of skull capacity
Example: P. Garrigues, R. De Sury, M. L. Angelin, J. Bellocq, J. L. Oudin, M. Ewald, Geochimica et Cosmochimica Acta, 52 (1988) 375-384
Data ? ?
Robust regression? Two outliers. A useful tool to avoid thinking? A sloppy data analyst can find relief in robust regression.
Result of “pooled” regression: r = 0.995
Observation: r = 0.865. Two phenomena influence the ratio (predictor). No prediction possible!
Parallel displacement - perfect result for the one who wants to be “straight-lined”