
Geometrization of Inference



  1. Geometrization of Inference. Model: {p(x|θ)}. Truth: t(x).

  2. Embedding in Hilbert Space. The Fisher information metric is automagically induced on the tangent bundle!
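The slide's claim can be checked numerically for a one-parameter model. A minimal sketch (my own illustration, not from the slides): embed Bernoulli(θ) in Hilbert space via the standard square-root map p ↦ 2√p, and compare the metric induced by the embedding with the closed-form Fisher information 1/(θ(1-θ)).

```python
import math

def embed(theta):
    # Hilbert-space embedding of Bernoulli(theta): p -> 2*sqrt(p),
    # here just the two-point vector (2*sqrt(theta), 2*sqrt(1-theta))
    return (2 * math.sqrt(theta), 2 * math.sqrt(1 - theta))

def induced_metric(theta, h=1e-6):
    # squared norm of d(embedding)/dtheta, via central differences
    a, b = embed(theta + h)
    c, d = embed(theta - h)
    da, db = (a - c) / (2 * h), (b - d) / (2 * h)
    return da * da + db * db

def fisher_info(theta):
    # closed-form Fisher information of one Bernoulli trial
    return 1.0 / (theta * (1 - theta))

theta = 0.3
print(induced_metric(theta), fisher_info(theta))  # both ≈ 4.7619
```

The agreement is exact in the limit h → 0: the pullback of the Hilbert-space inner product along the square-root embedding is the Fisher metric.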

  3. The Volume Form as Prior. A hypothesis space M is said to be regular when (M, g) is a smooth orientable Riemannian manifold. A k-dimensional regular M has volume form dV = √(det g(θ)) dθ¹ ∧ … ∧ dθᵏ. In arbitrary (orientation-preserving) θ coordinates, the volume of (M, g) is vol(M, g) = ∫ √(det g(θ)) dθ¹ … dθᵏ.
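As a concrete instance (my own sketch, not from the slides): for the Bernoulli model the Fisher metric is g(θ) = 1/(θ(1-θ)), so the volume density is 1/√(θ(1-θ)), the total volume of the model is π, and the normalized volume form is the Jeffreys prior Beta(1/2, 1/2).

```python
import math

def sqrt_det_g(theta):
    # volume density sqrt(det g) for the Bernoulli model,
    # with Fisher metric g(theta) = 1/(theta*(1-theta))
    return 1.0 / math.sqrt(theta * (1 - theta))

# total volume of the model by the midpoint rule (midpoints avoid
# the integrable singularities at 0 and 1); closed form is pi
n = 200000
vol = sum(sqrt_det_g((i + 0.5) / n) for i in range(n)) / n
print(vol)  # ≈ pi
```

Dividing the density by this volume gives the (proper) volume prior; for Bernoulli that is exactly Jeffreys' Beta(1/2, 1/2).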

  4. [Figure: the model and its volume prior, for c = 0.1]

  5. [Table: estimates (mean ± sd) under FLAT vs. VOLUME priors for n = 100, 500, 1000, 10000, with a "no learning" baseline. FLAT: 0.025 ± 0.020, -0.032 ± 0.016, 0.048 ± 0.24, 0.039 ± 0.24. VOLUME: 0.025 ± 0.0084, -0.011 ± 0.007.]

  6. Ex: Simple Logistic Regression (Racine's data, independent observations). log(odds of death) = a + b·x, where logit(p) = log[p/(1-p)]. Need: an ignorance prior on (a, b).

     Dose x (log g/ml) | No. of animals n | No. of deaths y
     -0.863            | 5                | 0
     -0.296            | 5                | 1
     -0.053            | 5                | 3
      0.727            | 5                | 5
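Before worrying about the prior, the maximum-likelihood fit of (a, b) for this data can be sketched with Newton's method (my own illustration; only the data values are taken from the slide):

```python
import math

# Racine's bioassay data: dose x (log g/ml), n animals, y deaths
xs = [-0.863, -0.296, -0.053, 0.727]
ns = [5, 5, 5, 5]
ys = [0, 1, 3, 5]

def fit_logistic(xs, ns, ys, iters=50):
    """Maximum-likelihood fit of logit(p) = a + b*x by Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        ga = gb = 0.0            # gradient of the log-likelihood
        haa = hab = hbb = 0.0    # negated Hessian entries
        for x, n, y in zip(xs, ns, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            r = y - n * p
            ga += r
            gb += r * x
            w = n * p * (1 - p)
            haa += w
            hab += w * x
            hbb += w * x * x
        det = haa * hbb - hab * hab
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return a, b

a_hat, b_hat = fit_logistic(xs, ns, ys)
print(a_hat, b_hat)  # MLE of intercept a and slope b (b > 0: dose kills)
```

The death rates rise monotonically with dose and are not perfectly separated, so the MLE is finite and Newton's method (equivalently, iteratively reweighted least squares) converges quickly.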

  7. Ignorance prior for Logistic Regression (Racine's data). MCMC: 250k samples. mean a = 0.12, sd = 3.7; mean b = 0.63, sd = 10.0; corr[a, b] = 0.51.
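The slide reports results under the ignorance (volume) prior. As a stand-in sketch (my own, not the author's sampler), here is a random-walk Metropolis sampler for the same posterior under a flat prior on (a, b), so the numbers will not match the slide's:

```python
import math, random

xs = [-0.863, -0.296, -0.053, 0.727]
ns = [5, 5, 5, 5]
ys = [0, 1, 3, 5]

def loglik(a, b):
    """Binomial log-likelihood for logit(p) = a + b*x."""
    s = 0.0
    for x, n, y in zip(xs, ns, ys):
        eta = a + b * x
        # log p = eta - log(1+e^eta); log(1-p) = -log(1+e^eta)
        s += y * eta - n * math.log(1.0 + math.exp(eta))
    return s

def metropolis(steps=20000, seed=0):
    """Random-walk Metropolis under a flat prior on (a, b)."""
    rng = random.Random(seed)
    a, b, ll = 0.0, 0.0, loglik(0.0, 0.0)
    chain = []
    for _ in range(steps):
        a2, b2 = a + rng.gauss(0, 0.8), b + rng.gauss(0, 2.0)
        ll2 = loglik(a2, b2)
        if rng.random() < math.exp(min(0.0, ll2 - ll)):
            a, b, ll = a2, b2, ll2
        chain.append((a, b))
    return chain[steps // 2:]   # discard the first half as burn-in

chain = metropolis()
ma = sum(a for a, _ in chain) / len(chain)
mb = sum(b for _, b in chain) / len(chain)
print(ma, mb)  # posterior means of a and b under the flat prior
```

Swapping in the volume prior only changes the acceptance ratio: add the log prior density difference to ll2 - ll.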

  8. Volumes of Bitnets (= DAGs of bits)

  9. Worse < BIC < AIC < CIC < Best. [Plot: % of correct segmentations vs. N, based on 100 reps for each N, with parameters drawn at random each time.]

  10. [Figure: files of different types (.jpg, .aiff, .txt, .gz) plus a raw bit string 01100…0, all fed to CIC.] The Iliad, BOOK I: "Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. Many a brave soul did it send hurrying down to Hades, and many a hero did it yield a prey to dogs and vultures, for so were the counsels of Jove fulfilled from the day on which the son of Atreus, king of men, and great Achilles, first fell out with one another…"

  11. MDL: bold pragmatism. Forget about the data being generated by a probability distribution. This is just a CODING GAME!! The best model is the one providing the shortest code for the observed data. Data is all there is!
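The coding game can be made concrete with a two-part code for binary strings under the Bernoulli model class (a minimal sketch of my own; the (1/2)·log2 n parameter cost is the usual MDL choice):

```python
import math

def two_part_codelength(bits):
    """Two-part MDL code length (in bits) for a binary string:
    cost of the Bernoulli parameter (~ (1/2) log2 n) plus
    -log2 of the likelihood at the maximum-likelihood parameter."""
    n = len(bits)
    k = sum(bits)
    p = k / n
    if p in (0.0, 1.0):
        data_bits = 0.0  # degenerate model codes the string for free
    else:
        data_bits = -(k * math.log2(p) + (n - k) * math.log2(1 - p))
    return 0.5 * math.log2(n) + data_bits

biased = [1] * 90 + [0] * 10        # mostly ones: compressible
fair = [i % 2 for i in range(100)]  # balanced counts: incompressible
                                    # for THIS model class
print(two_part_codelength(biased))  # well under 100 bits
print(two_part_codelength(fair))    # ~100 bits plus the parameter cost
```

Note the alternating string is perfectly regular, but a Bernoulli coder cannot see that: shortest-code comparisons are always relative to a model class.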

  12. There Is a Problem (Есть Проблема). The shortest description length of a sequence is NON-COMPUTABLE!! It can only be approximated with MODELS.

  13. Data and Theory are Entangled. There is no data in a vacuum. Data is a logical proposition, with truth values only relative to a given domain of discourse. A sequence 0110011110… is NOT DATA, just as the number 2.4 is not data unless it is understood as "the result of such and such experiment is 2.4". Data is theory laden. Theory is data laden. IMHO.

  14. I have a brain. I obs. x. I want to understand x. I need to predict future x'. How?

  15. No immaculate obs. No theoretical vacuum. No fact w/o fiction. No data w/o theory. [Figure: data inside Theory.]

  16. Why? Data is a logical proposition in a domain of discourse.

  17. Data must have meaning. By "meaning" I mean: Theory.

  18. Theory = explanation = compressing code = probability distribution.
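The last equality is the Kraft-McMillan correspondence: a probability distribution defines idealized code lengths L(s) = -log2 p(s), and conversely any prefix code defines a (sub)distribution. A small check of my own:

```python
import math

# A distribution over symbols IS a code: L(s) = -log2 p(s)
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = {s: -math.log2(q) for s, q in p.items()}
print(lengths)  # a: 1, b: 2, c: 3, d: 3 bits

# Kraft inequality: sum 2^(-L) <= 1 guarantees a prefix code
# with these lengths exists; here it holds with equality
kraft = sum(2 ** -L for L in lengths.values())
print(kraft)  # 1.0

# expected code length equals the entropy for this dyadic p
H = -sum(q * math.log2(q) for q in p.values())
avg = sum(q * lengths[s] for s, q in p.items())
print(H, avg)  # both 1.75 bits
```

So "best compressor" and "best probabilistic model" are the same object, which is what licenses the MDL moves above.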

  19. [Figure: observed vs. hidden variables]

  20. The Statistical Manifold: from the data manifold to finite measures.

  21. Sufficient map

  22. Canonical Example:

  23. Ignorance = Independence & Uniformity. [Figure: spread vs. concentrate]

  24. Max(Ignorance) s.t. whatever is known. (I forgot to mention that Bayes' Theorem follows from this as a special case, and in 2 very different ways!)
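The recipe "maximize ignorance (entropy) subject to whatever is known" in the classic die example (my own sketch): the maximum-entropy distribution on {1,…,6} with a prescribed mean is exponential-family, p_i ∝ exp(λ·x_i), and the multiplier λ can be found by bisection.

```python
import math

xs = list(range(1, 7))  # faces of a die

def maxent_mean(target, lo=-10.0, hi=10.0):
    """Maximum-entropy distribution on xs subject to a given mean.
    The solution has the form p_i ∝ exp(lam * x_i); since the mean
    is increasing in lam, solve for lam by bisection."""
    def mean(lam):
        ws = [math.exp(lam * x) for x in xs]
        z = sum(ws)
        return sum(w * x for w, x in zip(ws, xs)) / z
    for _ in range(100):
        mid = (lo + hi) / 2
        if mean(mid) < target:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    ws = [math.exp(lam * x) for x in xs]
    z = sum(ws)
    return [w / z for w in ws]

p = maxent_mean(4.5)  # a "loaded die" constrained to mean 4.5
print(p)              # probabilities tilted toward the high faces
print(sum(x * q for x, q in zip(xs, p)))  # ≈ 4.5
```

With no constraint at all, λ = 0 and the recipe returns the uniform distribution, which is the "spread" picture of ignorance from the previous slide.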

  25. What's New? • An objective procedure for transforming prior information into prior distributions. • A new understanding of Data, Prior, and Likelihood. • Optimality of scalar field Conjugate Priors for Exponential Families. • The discovery of antidata and virtual data. • Optimality of priors with tails following power laws. • Evaporation of the Bayesian/Frequentist divide. • A dent in the Mind/Body problem. • A justification of Perelman's Action (which proved Thurston's Geometrization Conjecture). • A Geometric Theory of Ignorance. • The solution to a 260-year-old problem: the objective quantification of ignorance in statistical inference.

  26. Spinning Ed!
