euler calculus & data
robert ghrist, university of pennsylvania, depts. of mathematics & electrical/systems engineering
machine learning summer school: june 2009
euler calculus
$\chi = \sum_k (-1)^k \, \#\{k\text{-cells}\}$
$\chi = \sum_k (-1)^k \operatorname{rank} H_k$
[figure: five example spaces, with χ = 7, χ = 3, χ = 2, χ = 3, χ = 2]
$\chi(A \cup B) = \chi(A) + \chi(B) - \chi(A \cap B)$
[diagram: $\int h \, d\chi$ at the centre of four areas]
geometry: blaschke, hadwiger, rota, chen
topology: kashiwara, macpherson, schapira, viro
probability: adler, taylor
networks
integration
consider the sheaf of constructible functions: CF(X) = Z-valued functions whose level sets are locally finite and "tame"
tools: the axiomatic approach to tameness comes from the work on o-minimal structures
collections $\{S_n\}_{n=1,2,\ldots}$ of boolean algebras of sets in $R^n$, closed under projections, products, ...
elements of $\{S_n\}_{n=1,2,\ldots}$ are called "definable" or "tame" sets
results:
all definable sets are triangulable & have a well-defined euler characteristic
all functions in CF(X) are of the form $h = \sum_i c_i \mathbf{1}_{U_i}$ for $U_i$ definable
all functions in CF(X) are integrable with respect to euler characteristic
explicit definition (euler integral): $\int h \, d\chi = \int \big( \sum_i c_i \mathbf{1}_{U_i} \big) \, d\chi = \sum_i \int c_i \mathbf{1}_{U_i} \, d\chi = \sum_i c_i \, \chi(U_i)$
integration [schapira, 1980's; via kashiwara, macpherson, 1970's]
a map $F : X \to Y$ induces a pushforward $F_* : CF(X) \to CF(Y)$ on sheaves of constructible functions; this is the correct way to understand dχ
in the case where Y is a point, CF(Y) = Z, and the pushforward is a homomorphism $\int d\chi : CF(X) \to CF(pt) = Z$ which respects all the gluings implicit in sheaves...
corollary [schapira, viro; 1980's]: fubini theorem. pushing forward along F and then integrating over Y agrees with integrating over X: $\int_X h \, d\chi = \int_Y F_* h \, d\chi$
sheaf-theoretic constructions also give natural convolution operators, duality, integral transforms, ...
problem
a network of "minimal" sensors returns target counts without IDs
how many targets are there?
[figure: sensor field shaded by local target count = 0, 1, 2, 3, 4]
counting
let W = "target space" = space where a finite # of targets live
let X = "sensor space" = space which parameterizes sensors
target i is detected on a target support $U_i$ in X
the sensor field $h : X \to Z$ on X returns $h(x) = \#\{ i : x \text{ lies in } U_i \}$
theorem [BG]: assuming target supports with uniform $\chi(U_i) = N \neq 0$, # targets $= \frac{1}{N} \int_X h \, d\chi$
trivial proof: $\int h \, d\chi = \int \big( \sum_i \mathbf{1}_{U_i} \big) \, d\chi = \sum_i \int \mathbf{1}_{U_i} \, d\chi = \sum_i \chi(U_i) = N \cdot \#\{i\}$
amazingly, one needs no convexity, no leray ("good cover") condition, etc. this is a purely topological result.
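The counting theorem can be checked on a small grid. The sketch below is an illustration, not code from the talk: it assumes target supports are closed rectangles with integer corners (so N = 1), and it evaluates the integral as an alternating sum of h over the open cells (vertices, edges, squares) of the integer grid, since an open k-cell has euler characteristic $(-1)^k$. The names `sensor_field` and `euler_integral` are hypothetical.

```python
from typing import Callable, List, Tuple

# Closed rectangle [x0, x1] x [y0, y1] with integer corners (an assumed target support).
Rect = Tuple[int, int, int, int]


def sensor_field(supports: List[Rect]) -> Callable[[float, float], int]:
    """Return h, where h(x, y) = number of supports containing the point (x, y)."""
    def h(x: float, y: float) -> int:
        return sum(x0 <= x <= x1 and y0 <= y <= y1 for x0, x1, y0, y1 in supports)
    return h


def euler_integral(h: Callable[[float, float], int], size: int) -> int:
    """Integrate h over [0, size]^2 as the sum of (-1)^dim * h over open grid cells.

    Doubled coordinates index the cells: an even coordinate is a grid vertex,
    an odd coordinate is the midpoint of an open unit interval.
    """
    total = 0
    for a in range(2 * size + 1):
        for b in range(2 * size + 1):
            dim = (a % 2) + (b % 2)  # 0 = vertex, 1 = open edge, 2 = open square
            total += (-1) ** dim * h(a / 2, b / 2)
    return total


# Three targets whose supports overlap or touch; no IDs are used anywhere.
supports = [(0, 3, 0, 3), (2, 5, 2, 5), (3, 6, 0, 1)]
h = sensor_field(supports)
count = euler_integral(h, size=6)
print(count)
```

Because the integral is additive over the supports, the result does not depend on how the rectangles overlap.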
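computation
for h in CF(X), integrals with respect to dχ are computable via
level sets: $\int h \, d\chi = \sum_{s=0}^{\infty} s \, \chi(\{ h = s \})$
upper excursion sets: $\int h \, d\chi = \sum_{s=0}^{\infty} \big( \chi(\{ h > s \}) - \chi(\{ h < -s \}) \big)$
weighted euler index: $\int h \, d\chi = \sum_{V} h(V) \, \chi(V)$, where the V are the "chambers" of h = components of level sets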
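A small worked example, chosen here for illustration and not taken from the slides: on the real line take $h = \mathbf{1}_{[0,2]} + \mathbf{1}_{[1,3]}$. A closed interval has χ = 1 and a half-open interval has χ = 1 - 1 = 0 (a point plus an open interval), so the definition and the two excursion-style formulas agree:

```latex
\begin{align*}
\int h \, d\chi &= \chi([0,2]) + \chi([1,3]) = 1 + 1 = 2
  && \text{(definition)} \\
\sum_{s} s \, \chi\{h = s\} &= 1 \cdot \chi\big([0,1) \cup (2,3]\big) + 2 \cdot \chi([1,2])
  = 1 \cdot 0 + 2 \cdot 1 = 2
  && \text{(level sets)} \\
\sum_{s \ge 0} \chi\{h > s\} &= \chi([0,3]) + \chi([1,2]) = 1 + 1 = 2
  && \text{(upper excursion sets)}
\end{align*}
```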
example
$\int h \, d\chi = \sum_{s=0}^{\infty} \chi\{ h(x) > s \}$
h > 3 : χ = 2
h > 2 : χ = 3
h > 1 : χ = 3
h > 0 : χ = -1
net integral = 2 + 3 + 3 - 1 = 7
example
$\int h \, d\chi = \sum_{s=0}^{\infty} \chi\{ h(x) > s \} = \sum_{V} h(V) \, \chi(V)$
h = 4 : Σ χ(V) = 2
h = 3 : Σ χ(V) = 1
h = 2 : Σ χ(V) = 0
h = 1 : Σ χ(V) = -4
net integral = 4(1+1) + 3(1+1-1) + 2(1+1+1-1-1-1) + 1(1-1-1-1-2) = 7
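The excursion-set formula is easy to run on pixel data. The sketch below is an assumption-laden illustration, not the figure from the slides: h is a nonnegative integer field on unit pixels, each excursion set $\{h > s\}$ is read as a union of closed unit squares, and χ is computed as vertices minus edges plus faces. This reading is exact only when supports overlap in whole pixels rather than merely touching along an edge. The names `euler_characteristic` and `euler_integral` are hypothetical.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]  # lower-left corner (i, j) of a closed unit square


def euler_characteristic(pixels: Set[Pixel]) -> int:
    """chi of a union of closed unit squares: #vertices - #edges + #faces."""
    vertices, edges = set(), set()
    for i, j in pixels:
        for di in (0, 1):
            for dj in (0, 1):
                vertices.add((i + di, j + dj))
        # Bottom, top, left and right edges; shared edges get identical keys.
        edges.update({("h", i, j), ("h", i, j + 1), ("v", i, j), ("v", i + 1, j)})
    return len(vertices) - len(edges) + len(pixels)


def euler_integral(h: Dict[Pixel, int]) -> int:
    """Sum chi{h > s} over s = 0, 1, ..., max(h) - 1 for a nonnegative field h."""
    top = max(h.values(), default=0)
    return sum(
        euler_characteristic({p for p, v in h.items() if v > s})
        for s in range(top)
    )


# Two square supports (3 x 3 pixels each) overlapping in the single pixel (2, 2).
h: Dict[Pixel, int] = {}
for x0, x1, y0, y1 in [(0, 3, 0, 3), (2, 5, 2, 5)]:
    for i in range(x0, x1):
        for j in range(y0, y1):
            h[(i, j)] = h.get((i, j), 0) + 1

total = euler_integral(h)
print(total)
```

Here $\{h > 0\}$ is the contractible union of the two squares and $\{h > 1\}$ is the single shared pixel, so each contributes χ = 1.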
waves
consider a sensor modality which counts each wavefront and increments an internal counter: used to count # events
[figure: sensors reporting "3 booms…", "whuh?", "2 booms…"]
the resulting target impacts are still nullhomotopic (no echoing)
accurate event counts are obtained via an ad hoc network of acoustic sensors with no clocks, no synchronization, and no localization
wheels
consider sensors which count passing vehicles and increment an internal counter: acoustic sensors embedded in roads…
such target impacts may not be contractible…
theorem [BG]: if sensors read h = the total number of time intervals in which some vehicle is nearby, then # vehicles $= \int h \, d\chi$
wheels
supports are the projected image of a contractible subset in space-time
recall: $F : X \to Y$ induces $F_* : CF(X) \to CF(Y)$, and $\int d\chi$ is the pushforward to a point, $CF(pt) = Z$
$\int_X h(x) \, d\chi(x) = \int_Y F_* h(y) \, d\chi(y)$, where $F_* h(y) = \int_{F^{-1}(y)} h(x) \, d\chi(x)$
let X = domain × time; let Y = domain; let F = temporal projection map
then $F_* h(y)$ = total # of (compact) time intervals on which some vehicle is at/near point y = sensor reading at y
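The projection argument can be traced on a toy road. This is a sketch under stated assumptions, not the talk's data: the road is one-dimensional, each vehicle's space-time support is a contractible union of closed rectangles with integer corners, and all names (`sensor_reading`, `count_vehicles`) are hypothetical. The reading at a position is the euler characteristic of the time fibre over it, which equals the number of compact time intervals; integrating the readings over the road then recovers the vehicle count.

```python
from typing import List, Tuple

# Closed rectangle [x0, x1] x [t0, t1] in space-time, integer corners.
Rect = Tuple[int, int, int, int]
Vehicle = List[Rect]  # pieces whose union is assumed contractible


def h(x: float, t: float, vehicles: List[Vehicle]) -> int:
    """Number of vehicles whose space-time support contains (x, t)."""
    return sum(
        any(x0 <= x <= x1 and t0 <= t <= t1 for x0, x1, t0, t1 in pieces)
        for pieces in vehicles
    )


def sensor_reading(x: float, vehicles: List[Vehicle], t_max: int) -> int:
    """Pushforward F_*h at position x: euler integral of h along the time fibre."""
    # Even b is a time vertex, odd b is the midpoint of an open time interval.
    return sum((-1) ** (b % 2) * h(x, b / 2, vehicles) for b in range(2 * t_max + 1))


def count_vehicles(vehicles: List[Vehicle], x_max: int, t_max: int) -> int:
    """Euler integral of the sensor readings over the road [0, x_max]."""
    return sum(
        (-1) ** (a % 2) * sensor_reading(a / 2, vehicles, t_max)
        for a in range(2 * x_max + 1)
    )


# Vehicle 1 drives along the road, lingers at the far end, and drives back
# (a C-shaped support); vehicle 2 makes one brief stop on [1, 2].
vehicles = [
    [(0, 4, 0, 1), (3, 4, 1, 2), (0, 4, 2, 3)],
    [(1, 2, 4, 5)],
]
n = count_vehicles(vehicles, x_max=4, t_max=5)
print(n)
```

Positions in the middle of the road hear vehicle 1 twice, so their readings overcount, yet the integral of the readings still returns the number of vehicles.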
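ad hoc networks
theorem [BG]: if the function $h : R^2 \to N$ is sampled over a network in a way that correctly samples the connectivity of upper and lower excursion sets, then the exact value of the euler integral of h is
$\int h \, d\chi = \sum_{s=1}^{\infty} \big( \#\mathrm{comp}\{ h \ge s \} - \#\mathrm{comp}\{ h < s \} + 1 \big)$
this is a simple application of alexander duality, using $\chi = \sum_k (-1)^k \dim H_k$:
$\int h \, d\chi = \sum_{s=1}^{\infty} \chi\{ h \ge s \} = \sum_{s=1}^{\infty} \big( b_0\{ h \ge s \} - b_1\{ h \ge s \} \big) = \sum_{s=1}^{\infty} \big( b_0\{ h \ge s \} - \tilde{b}_0\{ h < s \} \big) = \sum_{s=1}^{\infty} \big( b_0\{ h \ge s \} - b_0\{ h < s \} + 1 \big)$
($\tilde{b}_k$ = reduced betti number)
this works in the ad hoc setting: clustering gives fast computation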
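The component-counting formula needs nothing beyond connectivity, which a flood fill can supply. The sketch below makes assumptions not in the slides: the network is a regular pixel grid, upper excursion sets are closed (8-neighbour connectivity), lower excursion sets are open (4-neighbour connectivity), and a border of zeros stands in for the unbounded outside region. The names `count_components` and `euler_integral_b0` are hypothetical.

```python
from collections import deque
from typing import List


def count_components(mask: List[List[bool]], diagonal: bool) -> int:
    """Count connected components of the True cells by breadth-first flood fill."""
    rows, cols = len(mask), len(mask[0])
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if diagonal:
        steps += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def euler_integral_b0(h: List[List[int]]) -> int:
    """Sum over s >= 1 of #comp{h >= s} - #comp{h < s} + 1 for a nonnegative field."""
    cols = len(h[0])
    # Zero border so that the outside of the sampled region is one component.
    padded = [[0] * (cols + 2)] + [[0] + list(row) + [0] for row in h] + [[0] * (cols + 2)]
    total = 0
    for s in range(1, max(max(row) for row in padded) + 1):
        upper = [[v >= s for v in row] for row in padded]
        lower = [[v < s for v in row] for row in padded]
        total += count_components(upper, True) - count_components(lower, False) + 1
    return total


# Two 3 x 3 square supports overlapping in one pixel.
field = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 2, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
result = euler_integral_b0(field)
print(result)
```

A ring-shaped support, by contrast, encloses a second component of $\{h < 1\}$ and so contributes 1 - 2 + 1 = 0, matching χ of an annulus.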
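real-valued integrands
it's helpful to have a well-defined integration theory for R-valued integrands:
Def(X) = R-valued functions whose graphs are "tame" (definable in an o-minimal structure)
take a riemann-sum approach:
$\int h \lfloor d\chi \rfloor = \lim_{n \to \infty} \frac{1}{n} \int \lfloor n h \rfloor \, d\chi$
$\int h \lceil d\chi \rceil = \lim_{n \to \infty} \frac{1}{n} \int \lceil n h \rceil \, d\chi$
unfortunately, $\int \_ \lfloor d\chi \rfloor$ & $\int \_ \lceil d\chi \rceil$ are no longer homomorphisms Def(X) → R
however, $\int \_ \lfloor d\chi \rfloor$ & $\int \_ \lceil d\chi \rceil$ have an interpretation in the o-minimal category
lemma: if h is affine on an open k-simplex, then $\int h \lfloor d\chi \rfloor = (-1)^k \inf(h)$ and $\int h \lceil d\chi \rceil = (-1)^k \sup(h)$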
real-valued integrands
intuition: the two measures correspond to the stratified morse indices of the graph of h in Def(X) with respect to two graph axis directions: $I^*, I_* : Def(X) \to CF(X)$
theorem [BG]: for h in Def(X), $\int h \lfloor d\chi \rfloor = \int h \, I^* h \, d\chi$ and $\int h \lceil d\chi \rceil = \int h \, I_* h \, d\chi$
corollary [BG]: if h : X → R is morse on an n-manifold, then, with μ = morse index,
$\int h \lfloor d\chi \rfloor = \sum_{p \in crit(h)} (-1)^{n - \mu(p)} h(p)$ and $\int h \lceil d\chi \rceil = \sum_{p \in crit(h)} (-1)^{\mu(p)} h(p)$
corollary [BG]: if h is univariate, then $\int h \lfloor d\chi \rfloor = totvar(h)/2 = - \int h \lceil d\chi \rceil$
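The univariate corollary can be checked numerically from the affine-simplex lemma. The sketch below assumes, beyond what the slides state, that h is continuous and piecewise linear, given by its values at consecutive breakpoints, and that it vanishes at both end breakpoints and outside them. Each breakpoint is a 0-simplex contributing h, each open interval is a 1-simplex contributing minus the inf (floor measure) or minus the sup (ceiling measure), and the two unbounded outer intervals contribute 0. The function names are hypothetical.

```python
from typing import Sequence


def floor_integral(values: Sequence[float]) -> float:
    """Integral of h against the floor measure: vertices minus inf over open intervals."""
    return sum(values) - sum(min(a, b) for a, b in zip(values, values[1:]))


def ceil_integral(values: Sequence[float]) -> float:
    """Integral of h against the ceiling measure: vertices minus sup over open intervals."""
    return sum(values) - sum(max(a, b) for a, b in zip(values, values[1:]))


def total_variation(values: Sequence[float]) -> float:
    """Total variation of the piecewise-linear interpolant."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


# Values of h at consecutive breakpoints; h is 0 at both ends and outside.
samples = [0.0, 1.0, 0.5, 2.0, 0.0]
lower = floor_integral(samples)
upper = ceil_integral(samples)
half_tv = total_variation(samples) / 2
print(lower, upper, half_tv)
```

The two measures differ in sign here, which is one way to see that neither integral is additive in h.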
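real-valued integrands
lebesgue:
$\int h \lfloor d\chi \rfloor = \int_0^{\infty} \big( \chi\{ h \ge s \} - \chi\{ h < -s \} \big) \, ds$
$\int h \lceil d\chi \rceil = \int_0^{\infty} \big( \chi\{ h > s \} - \chi\{ h \le -s \} \big) \, ds$
$\int h \lfloor d\chi \rfloor = \lim_{\varepsilon \to 0^+} \frac{1}{\varepsilon} \int_R s \, \chi\{ s \le h < s + \varepsilon \} \, ds$
$\int h \lceil d\chi \rceil = \lim_{\varepsilon \to 0^+} \frac{1}{\varepsilon} \int_R s \, \chi\{ s < h \le s + \varepsilon \} \, ds$
(blaschke, hadwiger, ...)
morse:
$\int h \lfloor d\chi \rfloor = \sum_{p \in crit(h)} (-1)^{n - \mu(p)} h(p)$
$\int h \lceil d\chi \rceil = \sum_{p \in crit(h)} (-1)^{\mu(p)} h(p)$
duality:
$\int h \lfloor d\chi \rfloor = - \int (-h) \lceil d\chi \rceil$
$(Dh)(x) = \lim_{\varepsilon \to 0^+} \int h \, \mathbf{1}_{B(\varepsilon, x)} \, d\chi$, with $D(Dh) = h$
fubini:
$\int_X h \lfloor d\chi \rfloor = \int_Y \Big( \int_{\{F(x) = y\}} h(x) \lfloor d\chi \rfloor (x) \Big) \lfloor d\chi \rfloor (y)$ for $F : X \to Y$ with $h \circ F = h$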
incomplete data
consider the following relative problem: given h on the complement of a hole D, estimate $\int h \, d\chi$ over the entire domain
theorem [BG]: for $h : R^2 \to Z$ a sum of indicator functions over homotopically trivial supports, none of which lies entirely within a contractible hole D,
$\int_{R^2} h^{+} \, d\chi \le \int_{R^2} h \, d\chi \le \int_{R^2} h^{-} \, d\chi$
$h^{+}$ = fill in D with the maximum of h on ∂D
$h^{-}$ = fill in D with the minimum of h on ∂D
reminder: f < g does not imply that $\int f \, d\chi < \int g \, d\chi$ ...in this case the opposite occurs…
incomplete data
but what to choose in between the upper and lower bounds?
claim: a harmonic extension over a hole is a "best guess"...
theorem [BG]: for $h : R^2 \to Z$ a sum of indicator functions over homotopically trivial supports, none of which lies entirely within a contractible hole D,
$\int_{R^2} h^{+} \, d\chi \le \int_{R^2} f \, d\chi \le \int_{R^2} h^{-} \, d\chi$
for f any "harmonic" extension of h over D (weighted average of h rel ∂D), where $h^{+}$ and $h^{-}$ fill in D with the maximum and the minimum of h on ∂D
the proof is surprisingly easy using morse theory: a "harmonic" extension has no local maxima or minima within D...
# maxima on ∂D - # saddles in D = χ(D) = 1
the integral over D is the heights of the maxima minus the heights of the saddles
expected values
in practice, harmonic extensions lead to non-integer target counts
[figure example: $\int h \, d\chi = 1 + 1 - c$]
this is an "expected" target count
weights for the laplacian can be chosen based on confidence of data points
toward a general theory of expected integrals
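inversion
[diagram: a kernel S relating target space W and sensor space X]
h = integral transform of $\mathbf{1}_T$ with kernel S
$\int_X h \, d\chi = N \int_W \mathbf{1}_T \, d\chi = N \cdot \#T$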
open questions
how to correct "side lobes" and energy loss in integral transforms?
what is the appropriate integration theory for multi-modal and logical-valued data?
how to efficiently compute integral transforms given discrete (sparse) data?
…and, well, numerical analysis in general
closing credits…
research sponsored by: darpa (stomp program), national science foundation, office of naval research
primary collaborator: yuliy baryshnikov, bell labs
professional support: university of pennsylvania, a. mitchell
java code: david lipsky, u illinois, urbana; a.j. friend, stanford; naveen kasthuri, penn