
Information Theoretic Approaches to Data Association and Fusion in Sensor Networks

John Fisher, Alexander Ihler, Jason Williams, Alan Willsky (MIT CSAIL/LIDS); Haixiao Cai, Sanjeev Kulkarni, Sergio Verdu (Princeton University). SensorWeb MURI Review Meeting, September 22, 2003.


Presentation Transcript


  1. Information Theoretic Approaches to Data Association and Fusion in Sensor Networks. John Fisher, Alexander Ihler, Jason Williams, Alan Willsky (MIT CSAIL/LIDS); Haixiao Cai, Sanjeev Kulkarni, Sergio Verdu (Princeton University). SensorWeb MURI Review Meeting, September 22, 2003.

  2. Problem/Motivation • Large number of simple, myopic sensors. • Need to perform local fusion to support global inference (Battlespace Awareness). • Critical need to understand statistical relationships between sensor outputs in the face of many modes of uncertainty (sensors, scene, geometry, etc.).

  3. Challenges • Uncertainty in scene and sensor geometry • Complex, dynamic environment • Uncalibrated, multi-modal sensors • Unknown joint sensor statistics • Need fast, low-complexity algorithms

  4. Activity and Accomplishments • Research • Application of data association method to multi-modal (A/V) correspondence problem. • A/V is a surrogate for other modalities primarily because we can easily collect this data (vs. IR, EM, etc.). • Extensions and empirical results to multi-modal feature-aided tracking. • Generalization of data association to triangulated graphs. • Improved K-L Divergence/MI estimators. • New developments on applied information-theoretic sensor management.

  5. Activity and Accomplishments • Tech Transition • ARL visits • Student (Ihler) on-site at ARL • Plans to transition Data Association method to DARPA’s CTS program (Ft. Belvoir installation) • Publications • 4 conference publications • IPSN (2) • ICME (invited) • ICASSP (invited) • 1 journal submission • accepted pending 2nd review • 3 Sensor Network workshop panels • ARO, NSF, SAMSI

  6. A Common Thread • Fusion and correspondence are difficult given the types of sensor uncertainties we are facing. • Various information theoretic measures and the need to estimate them arise naturally in such problems. • Exploiting sensor data subject to a common excitation provides a mechanism for estimating such quantities.

  7. Overview • Estimating Information Theoretic Measures from Sensor Data (MIT, Princeton) • Applications • Data Association, Multi-modal Tracking, Inferring Group Interactions, Sensor Management • Future Directions • Information driven sensor fusion

  8. Data Association (last year) • Measurements: separated signals, direction of arrival. • One signal, two sensors: localize. • More than two signals, two sensors: ambiguous. (Figure: sensors A and B observing signals A1, A2 and B1, B2.)

  9. Association as a Hypothesis Test • Assuming independent sources, the hypotheses take the form sketched below. • Asymptotic comparison of known models to those estimated from a single realization.
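For concreteness (the slide's equations are not reproduced in the transcript), the two-sensor, two-source hypotheses are commonly written as factorizations that pair each sensor-A measurement with the sensor-B measurement attributed to the same source; the notation here is an assumption:

$$
H_1:\; p(a_1, a_2, b_1, b_2) = p_{A_1 B_1}(a_1, b_1)\, p_{A_2 B_2}(a_2, b_2),
\qquad
H_2:\; p(a_1, a_2, b_1, b_2) = p_{A_1 B_2}(a_1, b_2)\, p_{A_2 B_1}(a_2, b_1).
$$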

  10. Asymptotics of Likelihood Ratio • Decomposes into two sets of terms: statistical dependencies (groupings) and differences in model parameterizations.

  11. Asymptotics of Likelihood Ratio • If we estimate from a single realization: statistical dependence terms remain; model divergences go away (see the sketch below).
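A hedged sketch of the decomposition referenced on slides 10 and 11 (the exact expression is not preserved in the transcript): the normalized log-likelihood ratio separates into mutual-information terms, which capture the competing groupings, and model-divergence terms, which vanish when the densities are estimated from the single realization being tested:

$$
\frac{1}{N}\log\Lambda_N \;\longrightarrow\;
\big[I(A_1;B_1) + I(A_2;B_2)\big] - \big[I(A_1;B_2) + I(A_2;B_1)\big] \;+\; \Delta_{\text{model}},
$$

where the bracketed terms are the statistical dependencies (groupings) and $\Delta_{\text{model}}$ collects divergences between the hypothesized and true model parameterizations; estimating from the single realization removes $\Delta_{\text{model}}$, leaving only the dependence terms to separate the hypotheses.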

  12. High Dimensional Data • Learn low-dimensional auxiliary variables which summarize statistical dependency of measurements
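A minimal sketch of the idea on slide 12, under the simplifying assumption of a Gaussian mutual-information proxy, so that learning one-dimensional auxiliary variables reduces to maximizing the canonical correlation between linear projections (the original work learns nonparametric statistics; all names and choices below are illustrative only):

```python
import numpy as np

def gaussian_mi_proxy(x, y):
    """MI of two jointly Gaussian scalars: -0.5 * log(1 - rho^2)."""
    rho = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - rho**2)

def learn_auxiliary_variables(A, B, reg=1e-6):
    """Find projections u, v so the scalars A@u and B@v are maximally
    dependent under the Gaussian proxy (this is just CCA)."""
    A = A - A.mean(0)
    B = B - B.mean(0)
    n = A.shape[0]
    Caa = A.T @ A / n + reg * np.eye(A.shape[1])
    Cbb = B.T @ B / n + reg * np.eye(B.shape[1])
    Cab = A.T @ B / n
    # Whiten each block, then take the top singular pair of the cross-covariance.
    Wa = np.linalg.inv(np.linalg.cholesky(Caa))
    Wb = np.linalg.inv(np.linalg.cholesky(Cbb))
    U, s, Vt = np.linalg.svd(Wa @ Cab @ Wb.T)
    u = Wa.T @ U[:, 0]
    v = Wb.T @ Vt[0, :]
    return u, v

# Toy usage: two 20-dimensional measurement streams sharing one latent source.
rng = np.random.default_rng(0)
z = rng.standard_normal(500)
A = np.outer(z, rng.standard_normal(20)) + 0.5 * rng.standard_normal((500, 20))
B = np.outer(z, rng.standard_normal(20)) + 0.5 * rng.standard_normal((500, 20))
u, v = learn_auxiliary_variables(A, B)
print("proxy MI of learned scalars:", gaussian_mi_proxy(A @ u, B @ v))
```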

  13. AV Association/Correspondence • New since last year: direct application of the 2 sensors/multiple sources case. • Unknown joint statistics, high-dimensional data, varying scene parameters. • Surrogate for multi-modal sensors. (Figure: consistent vs. inconsistent audio-video pairs.)

  14. AV Association/Correspondence • Association matrix for 8 subjects. (Figure: matrix of association scores; values such as 0.68, 0.61, 0.19, 0.20.)

  15. AV Association/Correspondence • Association matrix for 8 subjects (continued).

  16. General Structure Tests • Generalization to hypothesis tests over graphical structures. • How are observations related to each other? (Figure: candidate graph structures compared pairwise.)

  17. General Structure Tests • Intersection sets: groupings on which the hypotheses agree. (Figure: hypotheses H1 vs. H2.)

  18. General Structure Tests • Asymptotics have a decomposition similar to the one in the 2-variable case (via the intersection sets); see the sketch below.
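A schematic version of that decomposition (hedged; the slide's exact expression is not preserved), patterned on the two-variable case above: the normalized log-likelihood ratio converges to a sum over the intersection sets of the difference in dependence (multi-information) each hypothesis assigns to that set, plus model-divergence terms:

$$
\frac{1}{N}\log\Lambda_N \;\longrightarrow\; \sum_{C \in \mathcal{C}} \Big[ I_{H_1}(C) - I_{H_2}(C) \Big] \;+\; \Delta_{\text{model}},
$$

where $\mathcal{C}$ is the collection of intersection sets and, as before, $\Delta_{\text{model}}$ vanishes when the densities are estimated from the single realization under test.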

  19. General Structure Tests • Extension of the previous work on data association is straightforward for such tests. • Estimation from a single realization incurs a reduction in separability only in terms of the model difference terms. • The “curse of dimensionality” (with respect to density estimation) arises in 2 ways: • Individual measurements may be of high dimension • Could still design low dimensional auxiliary variables • The number of variables in a group • New results provide a solution

  20. General Structure Tests • The test implies potentially 6 joint densities, but is simplified by looking at the intersection sets. (Figure: hypotheses H1 vs. H2.)

  21. General Structure Tests • With high-dimensional variables, learning auxiliary variables reduces dimensionality in one aspect. • But we would still have to estimate a 3-dimensional density. • This only gets worse with larger groupings.

  22. K-L Divergence with Permutations • Simple idea which mitigates many of the dimensionality issues. • Exploits the fact that the structures are distinguished by their groupings of variables. • Key Ideas: • Permuting sample order between groupings maintains the statistical dependency structure. • D(X||Y) >= D(f(X)||f(Y)) • This has the advantage that we can design a single (possibly vector-valued) function of all variables rather than one function for each variable. • Currently doing comparative analysis (bias, variance) with the previous approach.
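A minimal numerical sketch of the permutation idea (illustrative only; the function names and the crude histogram divergence estimator are assumptions, not the authors' implementation): permuting sample order between groupings destroys cross-group dependence while preserving the marginals, and by D(X||Y) >= D(f(X)||f(Y)) any scalar statistic f yields a lower bound on the divergence between the competing groupings.

```python
import numpy as np

def kl_1d(a, b, bins=30):
    """Crude histogram estimate of D(P_a || P_b) for scalar samples."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    pa, _ = np.histogram(a, bins=bins, range=(lo, hi))
    pb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    pa = (pa + 1e-3) / (pa + 1e-3).sum()   # smooth to avoid log(0)
    pb = (pb + 1e-3) / (pb + 1e-3).sum()
    return np.sum(pa * np.log(pa / pb))

rng = np.random.default_rng(1)
n = 2000
x = rng.standard_normal(n)
y = x + 0.3 * rng.standard_normal(n)       # (x, y) are strongly dependent

# One grouping keeps the observed pairing; the competing grouping pairs x with
# a permuted y, which has the same marginals but no cross-group dependence.
perm = rng.permutation(n)
f = lambda u, v: u * v                     # a single scalar statistic of the pair
d_lower = kl_1d(f(x, y), f(x, y[perm]))
print("lower bound on D(joint || product of marginals):", d_lower)
```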

  23. K-L Divergence with Permutations (Figure: the permutation construction and the statistic f.)

  24. More General Structures • Analysis has been extended to comparisons between triangulated graphs. • Can be expressed as sums and differences of product terms. • Admits a wide class of Markov processes.

  25. Modeling Group Interactions • Object 3 tries to interpose itself between objects 1 and 2. • The graph describes the state (position) dependency structure.

  26. Modeling Group Interactions

  27. Previous Work and Current Efforts (Princeton) • Developed fast algorithms based on block sorting for entropy and divergence estimation for discrete sources. • Simulations and text data show excellent results. • Have provided analysis of methods showing universal consistency. • Have recently investigated estimation of mutual information. • Currently analyzing performance for hidden Markov sources. • Investigating extensions to continuous alphabet sources. • Applications to various types of data.

  28. A “Distilled” Problem • The Problem: How to estimate the entropy, divergence, and mutual information of two sources based only on one realization from each source? • Assumption: Both are finite-alphabet, finite-memory, stationary sources. • Our goal: Want good estimates, fast convergence, and reasonable computational complexity.

  29. Two Approaches to Estimating Mutual Information • Estimate mutual information via entropy: I(X;Y) = H(X) + H(Y) – H(X,Y). • Estimate mutual information via divergence: I(X;Y) = D(p_XY || p_X p_Y). • We use our entropy and divergence estimators based on the Burrows-Wheeler block-sorting transform.
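For illustration of the first identity only (this is a naive plug-in estimate, not the BWT-based estimator described on the slide; the helper names are assumptions), the entropy route computes empirical entropies of X, Y, and the pair, and combines them:

```python
import numpy as np
from collections import Counter

def empirical_entropy(symbols):
    """Plug-in entropy (in nats) of a discrete sample."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def plugin_mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), each term estimated by plug-in."""
    joint = list(zip(xs, ys))
    return empirical_entropy(xs) + empirical_entropy(ys) - empirical_entropy(joint)

# Toy usage: Y is a noisy copy of X over a 4-letter alphabet.
rng = np.random.default_rng(2)
xs = rng.integers(0, 4, size=5000)
ys = np.where(rng.random(5000) < 0.8, xs, rng.integers(0, 4, size=5000))
print("plug-in I(X;Y) in nats:", plugin_mutual_information(xs, ys))
```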

  30. Estimating Mutual Information • Analysis and simulations show that both approaches converge to the true value. • The entropy approach appears better than the divergence approach. • The divergence approach does not use the fact that the second distribution, p_X p_Y, is a product of two marginal distributions.

  31. Hidden Markov Processes • X is the underlying Markov chain. • Y is a deterministic mapping of X, or Y is X observed through a Discrete Memoryless Channel; then Y is a Hidden Markov Process. • Hidden Markov Processes are useful in a wide range of applications.

  32. Entropy of HMP • To get the mutual information between the input and output of a DMC, we need the entropy of the output, which is an HMP if the input is Markov. • The entropy of an HMP can be approximated by an upper bound and a lower bound (see below). • These bounds can be calculated recursively.
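The bounds themselves are standard (e.g., Cover and Thomas): conditioning additionally on the initial hidden state gives a lower bound, conditioning on the observations alone gives an upper bound, and both converge to the entropy rate as d grows. The notation below is a hedged restatement rather than a copy of the slide:

$$
H(Y_d \mid Y_{d-1}, \dots, Y_1, X_1) \;\le\; H(\mathcal{Y}) \;\le\; H(Y_d \mid Y_{d-1}, \dots, Y_1).
$$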

  33. Estimating entropy of HMP

  34. MSE of Our Estimators • The MSE of our entropy estimator and of our mutual information estimator for i.i.d. sources satisfies explicit convergence bounds. • We have convergence results for the divergence estimator and for Markov sources and stationary ergodic sources.

  35. MSE of Entropy Estimator for HMP • We can prove that H(Yd|Yd-1,…,Y1) converges to H(Y) exponentially fast w.r.t. d, provided the Hidden Markov Process's mapping is such that some output symbol is produced by exactly one underlying state. • We want to further establish the convergence rate of our entropy estimator for HMP.

  36. Association vs. the Generative Model • The MI fusion approach is equivalent to learning a latent variable model of the audio-video measurements. • Random variables: parameters, appearance bases. • Simultaneously learn statistics of joint audio/video variables and parameters as the statistic of association (consistent with the theory).

  37. Incorporating Motion Parameters • Extension of multi-modal fusion to include nuisance parameters. • Audio is an indirect pointer to the object of interest. • Combine the motion model (nuisance parameters) with the audio-video appearance model.

  38. Incorporating Motion Parameters (Figure: example frames and average images, without and with the motion model.)

  39. Information Theoretic Sensor Management • Following Zhao, Shin, and Reich (2002); Chu, Haussecker, and Zhao (2002); and Ertin, Fisher, and Potter (2003), we've started extending IT approaches to sensor management. • Specifically, consider the case where a subset of measurements over time has been incorporated into the belief state. • When is it better to incorporate a measurement from the past versus a new measurement? • How can we efficiently choose a set of measurements (avoiding the greedy approach)?
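As a hedged illustration of the information-driven selection question raised here (not the authors' method), a single greedy step under a linear-Gaussian model picks the candidate measurement whose mutual information with the state is largest; the Gaussian assumption and all names below are for illustration only:

```python
import numpy as np

def information_gain(P, H, R):
    """I(state; measurement) for x ~ N(0, P), z = H x + v, v ~ N(0, R):
    0.5 * [log det(H P H^T + R) - log det(R)]."""
    S = H @ P @ H.T + R
    return 0.5 * (np.linalg.slogdet(S)[1] - np.linalg.slogdet(R)[1])

def greedy_select(P, candidates):
    """Pick the candidate (H, R) with the largest mutual information
    given the current state covariance P."""
    gains = [information_gain(P, H, R) for H, R in candidates]
    return int(np.argmax(gains)), gains

# Toy usage: 2-D state, three candidate sensors with different geometry/noise.
P = np.diag([4.0, 1.0])                             # current state covariance
candidates = [
    (np.array([[1.0, 0.0]]), np.array([[0.5]])),    # observes the uncertain axis
    (np.array([[0.0, 1.0]]), np.array([[0.5]])),    # observes the certain axis
    (np.array([[1.0, 1.0]]), np.array([[2.0]])),    # noisy combined view
]
best, gains = greedy_select(P, candidates)
print("gains:", np.round(gains, 3), "-> choose sensor", best)
```

Choosing a whole set of measurements non-greedily, as the slide asks, requires searching over subsets rather than repeating this one-step rule.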

  40. Summary • Applied association method to multi-modal data • New MI/K-L divergence estimators based on permutation approach • Mitigates dimensionality issues, avoids some of the combinatorics. • Extended approach to triangulated graphs. • New estimators for information measures (entropy, divergence, mutual information) based on BWT (block sorting). • Doesn’t require knowledge of distribution or parameters of the sources. • Efficient algorithm, good estimates, fast convergence. • Significantly outperforms other algorithms tested. • Investigating use in several applications including as component for correspondence and fusion algorithms.
