Model: {p(x|θ)}. Truth: t(x). Geometrization of Inference. Embedding in Hilbert Space. Fisher Information metric automagically induced on the tangent bundle! The Volume Form as Prior.
Geometrization of Inference. Model: {p(x|θ)}. Truth: t(x).
Embedding in Hilbert Space. Fisher Information metric automagically induced on the tangent bundle!
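A minimal numerical sketch of the claim above, for the simplest model (one Bernoulli observation). It assumes the embedding is p ↦ 2√p, a common convention chosen here so that the induced metric equals the Fisher information with no extra constant; the slide does not state its normalization. The function names are illustrative only.

```python
import math

def embed(theta):
    """Map Bernoulli(theta) to the point 2*sqrt(p) in R^2.

    Assumption: the factor 2 is a normalization choice, not taken from the slides.
    """
    return (2.0 * math.sqrt(1.0 - theta), 2.0 * math.sqrt(theta))

def induced_metric(theta, h=1e-5):
    """Squared Euclidean length of d(embed)/d(theta), by central differences."""
    lo, hi = embed(theta - h), embed(theta + h)
    return sum(((b - a) / (2.0 * h)) ** 2 for a, b in zip(lo, hi))

def fisher_information(theta):
    """Closed-form Fisher information of one Bernoulli(theta) observation."""
    return 1.0 / (theta * (1.0 - theta))

if __name__ == "__main__":
    for theta in (0.2, 0.5, 0.8):
        print(theta, induced_metric(theta), fisher_information(theta))
```

The two printed columns should agree up to finite-difference error: the ordinary inner product of the ambient space, restricted to tangent vectors of the embedded curve, reproduces the Fisher information.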
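The Volume Form as Prior. A hypothesis space M is said to be regular when (M, g) is a smooth orientable Riemannian manifold. A k-dim regular M has volume form: $dV = \sqrt{\det g(\theta)}\, d\theta^{1} \wedge \cdots \wedge d\theta^{k}$. In arbitrary (orientation-preserving) theta coordinates the volume of (M, g) is: $\mathrm{Vol}(M) = \int_{M} \sqrt{\det g(\theta)}\, d\theta$.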
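As an illustration of a volume form used as a prior, here is a small sketch for the one-parameter Bernoulli model, where g(θ) = 1/(θ(1−θ)) is the Fisher information. The total volume is π, so the normalized volume prior is the Beta(1/2, 1/2) density. The model choice and the function names are assumptions made for the example, not taken from the slides.

```python
import math

def sqrt_det_g(theta):
    """sqrt(det g) for Bernoulli(theta); g is 1x1, so det g = 1/(theta*(1-theta))."""
    return 1.0 / math.sqrt(theta * (1.0 - theta))

def volume(lo, hi, steps=10_000):
    """Riemannian volume of the segment lo <= theta <= hi, by the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(sqrt_det_g(lo + (i + 0.5) * h) for i in range(steps))

def exact_volume(lo, hi):
    """Closed form: an antiderivative of sqrt_det_g is 2*asin(sqrt(theta))."""
    return 2.0 * (math.asin(math.sqrt(hi)) - math.asin(math.sqrt(lo)))

def volume_prior(theta):
    """Volume form divided by the total volume exact_volume(0, 1) = pi."""
    return sqrt_det_g(theta) / math.pi

if __name__ == "__main__":
    print(volume(0.1, 0.9), exact_volume(0.1, 0.9))
    print(volume_prior(0.5))
```

The prior puts more mass near θ = 0 and θ = 1, where nearby distributions are easier to tell apart and the manifold therefore has more volume per unit of θ.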
Model Volume prior (for c = 0.1)
n=100   n=500   n=100   n=1000
No learning
FLAT:    0.025 +/- 0.020   -0.032 +/- 0.016   0.048 +/- 0.24   0.039 +/- 0.24
VOLUME:  0.025 +/- 0.0084  -0.011 +/- 0.007
n=10000   n=1000
Ex: Simple Logistic Regression (Racine’s data)

Dose x (log g/ml)   No. of animals n   No. of deaths y
-0.863              5                  0
-0.296              5                  1
-0.053              5                  3
 0.727              5                  5

Observations independent.
log(odds of death) = a + b x, where logit(p) = log(p/(1-p)).
Need: Ignorance Prior on (a, b).
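The model above can be sketched in a few lines. The code below evaluates the binomial log-likelihood for the four dose groups and an unnormalized volume prior √det I(a, b), where I is the Fisher information matrix of the logistic model. The slides' ignorance prior may belong to a more general family; using the plain volume (Jeffreys) prior here is an assumption made for illustration, and all function names are hypothetical.

```python
import math

# (dose x, animals n, deaths y), copied from the table above
DATA = [(-0.863, 5, 0), (-0.296, 5, 1), (-0.053, 5, 3), (0.727, 5, 5)]

def death_prob(a, b, x):
    """Inverse logit of a + b*x, written to avoid overflow."""
    eta = a + b * x
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def log_likelihood(a, b):
    """Binomial log-likelihood, dropping the constant binomial coefficients."""
    total = 0.0
    for x, n, y in DATA:
        eta = a + b * x
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += y * eta - n * log1pexp
    return total

def fisher_information(a, b):
    """Entries (I_aa, I_ab, I_bb) of the 2x2 Fisher information matrix."""
    i_aa = i_ab = i_bb = 0.0
    for x, n, _ in DATA:
        p = death_prob(a, b, x)
        w = n * p * (1.0 - p)
        i_aa += w
        i_ab += w * x
        i_bb += w * x * x
    return i_aa, i_ab, i_bb

def volume_prior_unnormalized(a, b):
    """sqrt(det I(a, b)); assumption: plain volume prior, not the slides' exact prior."""
    i_aa, i_ab, i_bb = fisher_information(a, b)
    return math.sqrt(max(i_aa * i_bb - i_ab * i_ab, 0.0))

if __name__ == "__main__":
    print(log_likelihood(0.0, 0.0), volume_prior_unnormalized(0.0, 0.0))
```

An unnormalized log-posterior is then `log_likelihood(a, b) + math.log(volume_prior_unnormalized(a, b))`, which could be handed to any MCMC sampler.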
Ignorance for Logistic Regression (Racine’s data)
MCMC: 250k samples
mean a = 0.12, sd = 3.7
mean b = 0.63, sd = 10.0
corr[a, b] = 0.51
Volumes of Bitnets. Bitnets = DAGs of bits.
Worse < BIC < AIC < CIC < Best. % of correct segmentations vs. N, based on 100 reps for each N. Params chosen at random each time.
.jpg   .aiff   .txt   .gz
The Iliad: BOOK I. Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. Many a brave soul did it send hurrying down to Hades, and many a hero did it yield a prey to dogs and vultures, for so were the counsels of Jove fulfilled from the day on which the son of Atreus, king of men, and great Achilles, first fell out with one another…
+ + 01100…0 + CIC
MDL: bold pragmatism. Forget about the data being generated by a probability distribution. This is just a CODING GAME!! The best model is the one providing the shortest code for the observed data. Data is all there is!
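The coding game can be played on a toy example. The sketch below compares two ways of encoding a binary sequence: a fair-coin code (1 bit per symbol, no parameters) and a two-part Bernoulli code (data encoded with the fitted frequency, plus a charge for stating the parameter). The ½·log2(n) parameter cost is a common asymptotic approximation adopted here as an assumption; it is not the CIC criterion from these slides.

```python
import math

def fair_coin_bits(bits):
    """Code length under the parameter-free fair-coin model: 1 bit per symbol."""
    return float(len(bits))

def bernoulli_two_part_bits(bits):
    """Two-part code: data given the fitted theta, plus 0.5*log2(n) bits for theta.

    Assumption: the 0.5*log2(n) parameter cost is the usual asymptotic choice.
    Expects a non-empty sequence of 0s and 1s.
    """
    n, k = len(bits), sum(bits)
    theta = k / n
    data_bits = 0.0
    for count, prob in ((k, theta), (n - k, 1.0 - theta)):
        if count > 0:
            data_bits -= count * math.log2(prob)
    return data_bits + 0.5 * math.log2(n)

def best_model(bits):
    """Return the name of the model giving the shortest total code."""
    scores = {
        "fair coin": fair_coin_bits(bits),
        "bernoulli": bernoulli_two_part_bits(bits),
    }
    return min(scores, key=scores.get)

if __name__ == "__main__":
    print(best_model([1] * 90 + [0] * 10))
    print(best_model([0, 1] * 50))
```

A heavily biased sequence is cheaper to encode with the fitted Bernoulli model, while a balanced one is not worth the extra parameter bits.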
There Is a Problem. The shortest description length of a sequence is NON-COMPUTABLE!! It can only be approximated with MODELS.
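One practical reading of this: any real compressor is a model, and its output length is a computable upper bound on the description length. A minimal sketch using Python's standard `zlib` module as the stand-in model (that choice is an assumption made for illustration):

```python
import os
import zlib

def compressed_bits(data: bytes) -> int:
    """Length in bits of the zlib (DEFLATE) encoding of data.

    This only bounds the shortest description length from above;
    a better model of the data could give a shorter code.
    """
    return 8 * len(zlib.compress(data, 9))

if __name__ == "__main__":
    regular = b"01" * 5000       # highly regular: short description
    noisy = os.urandom(10000)    # random bytes: essentially incompressible
    print(compressed_bits(regular), compressed_bits(noisy))
```

The regular sequence compresses to a small fraction of its raw size, while the random bytes do not compress at all.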
Data and Theory are Entangled. There is no data in the vacuum. Data is a logical proposition with truth values only relative to a given domain of discourse. A sequence 0110011110… is NOT DATA, just as the number 2.4 is not data unless it is understood as “the result of such and such experiment is 2.4”. Data is theory laden. Theory is data laden. IMHO
I have a brain. I observe x. I want to understand x. I need to predict future x’. How?
No immaculate observation. No theoretical vacuum. No fact without fiction. No data without theory. data ↔ Theory
Why? Data is a logical proposition in a domain of discourse.
Data must have meaning. By meaning I mean Theory.
Theory = explanation = compressing code = Probability distribution
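The link between "compressing code" and "probability distribution" can be made concrete: a distribution p yields code lengths ⌈−log2 p(x)⌉, and any set of code lengths satisfying the Kraft inequality yields a distribution proportional to 2^(−length). A small sketch, with illustrative function names that are not from the slides:

```python
import math

def code_lengths(dist):
    """Shannon code lengths ceil(-log2 p) for each symbol with p > 0."""
    return {s: math.ceil(-math.log2(p)) for s, p in dist.items() if p > 0}

def kraft_sum(lengths):
    """Sum of 2^(-length); a prefix code with these lengths exists iff this is <= 1."""
    return sum(2.0 ** -length for length in lengths.values())

def implied_distribution(lengths):
    """Distribution proportional to 2^(-length), recovered from the code."""
    z = kraft_sum(lengths)
    return {s: 2.0 ** -length / z for s, length in lengths.items()}

if __name__ == "__main__":
    dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    lengths = code_lengths(dist)
    print(lengths, kraft_sum(lengths), implied_distribution(lengths))
```

For a dyadic distribution such as this one the round trip is exact; otherwise the recovered distribution is only an approximation of the original.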
obs. hidden
The Statistical Manifold: data manifold, finite measures
Ignorance = Independence & Uniformity. Spread, concentrate.
Max (Ignorance) s.t. Whatever is known (I forgot to mention that Bayes Theorem follows from this as a special case and in 2 very different ways!)
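A familiar special case of "maximize ignorance subject to what is known" is maximum Shannon entropy on a finite set with a known mean. The sketch below takes Shannon entropy as the measure of ignorance, which is an assumption; the slides' ignorance functional may be more general. The solution has the form p_i ∝ exp(λ·x_i), and λ is found by bisection.

```python
import math

def maxent_with_mean(values, target_mean, tol=1e-12):
    """Max-entropy distribution on `values` whose mean equals `target_mean`.

    Assumption: Shannon entropy stands in for "ignorance".
    Requires min(values) < target_mean < max(values), with the solution's
    multiplier lying in [-50, 50].
    """
    def dist(lam):
        shift = max(lam * v for v in values)   # subtract the max for stability
        weights = [math.exp(lam * v - shift) for v in values]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(p * v for p, v in zip(dist(lam), values))

    lo, hi = -50.0, 50.0                        # the mean is increasing in lam
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

if __name__ == "__main__":
    faces = [1, 2, 3, 4, 5, 6]
    print(maxent_with_mean(faces, 3.5))   # nothing informative known: uniform
    print(maxent_with_mean(faces, 4.5))   # known mean tilts the distribution
```

When the constraint carries no information (mean 3.5 for a die), the result is uniform; a different known mean tilts the distribution exponentially and no more than the constraint requires.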
What’s New?
• Objective proc. for transforming prior info into prior distributions.
• A new understanding of Data, Prior, and Likelihood.
• Optimality of scalar field Conjugate Priors for Exp. Fam.
• The discovery of Antidata and virtual data.
• Optimality of Priors with tails following power laws.
• Evaporation of the Bayesian/Freq. divide.
• A dent at the Mind/Body problem.
• A justification of Perelman’s Action. (That proved Thurston’s Geometrization Conjecture.)
• A Geometric Theory of Ignorance.
• The solution to a 260-year-old problem: Objective Quantification of Ignorance in Statistical Inference.