Uncertainty and Information Integration in Biomedical Applications
Claudia Plant, Research Group for Bioimaging, TU München

Outline
1) Motivation: massive increase of data
2) Integration and Uncertainty
• Neurosciences: fMRI and EEG data
• Proteomics: Peptide Profiling
3) Conclusion
Motivation: Data Explosion in Medicine and Life Sciences
The amount of scientific data doubles each year (Szalay and Gray, Nature 2006).
BMBF Project: Understanding Resting-state Brain Activity
• The brain's metabolism at rest is not significantly reduced compared with task performance.
• Other regions become active during rest, the so-called resting-state networks.
• Goal of this project: understand the function of resting-state networks; compare healthy persons and subjects with functional brain disorders.
• Methods: fMRI, EEG
• Challenge for data mining: massive data sets, uncertainty, information integration
fMRI Imaging: Spatial Aspect
• Voxel (volumetric pixel)
• Field of view (FOV): e.g., 19.2 cm
• Matrix size: e.g., 64 x 64
• In-plane resolution: e.g., 192 mm / 64 = 3 mm
• Slice thickness: e.g., 6 mm
• Number of slices: e.g., 10
• Resulting voxel size: 3 mm x 3 mm x 6 mm
[Figure: sagittal slice and in-plane slice with voxel dimensions]
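The voxel arithmetic on this slide can be checked with a few lines of Python. This is only a sketch using the slide's example values; the function name `voxel_geometry` and its parameter names are made up for illustration.

```python
def voxel_geometry(fov_mm, matrix_size, slice_thickness_mm):
    """Return (in-plane resolution in mm, voxel volume in mm^3).

    Hypothetical helper: assumes a square field of view and a square
    acquisition matrix, as in the slide's example.
    """
    in_plane_mm = fov_mm / matrix_size                     # 192 mm / 64 = 3 mm
    voxel_volume_mm3 = in_plane_mm ** 2 * slice_thickness_mm
    return in_plane_mm, voxel_volume_mm3


# Example values from the slide: FOV 19.2 cm, 64 x 64 matrix, 6 mm slices.
in_plane, volume = voxel_geometry(fov_mm=192.0, matrix_size=64, slice_thickness_mm=6.0)
print(f"in-plane resolution: {in_plane} mm, voxel volume: {volume} mm^3")
```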
fMRI Imaging: Temporal Aspect
With a spatial resolution of 3 x 3 x 6 mm, the brain comprises approximately 80,000 voxels.
Temporal resolution: up to some hundreds of time points.
EEG/MEG
• fMRI: high spatial, low temporal resolution
• EEG/MEG: high temporal (milliseconds), low spatial resolution
Can we combine the benefits of the two modalities?
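The Cocktail Party Problem
Each electrode or voxel records a mixture of brain processes, with modality-specific uncertainty:
• Electrode (EEG): Space: (x +/- e1, y +/- e1, z +/- e1), Time: t +/- e2, with e1 >>> e2
• Voxel (fMRI): Space: (x +/- e3, y +/- e3, z +/- e3), Time: t +/- e4, with e3 << e4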
For a Single Type of Microphone: ICA
Successfully applied for spatial and temporal de-mixing of fMRI and EEG data.
• V. D. Calhoun, T. Adali, M. Stevens, K. A. Kiehl, and J. J. Pekar, Semi-Blind ICA of FMRI: A Method for Utilizing Hypothesis-Derived Time Courses in a Spatial ICA Analysis, NeuroImage, vol. 25, pp. 527-538, 2005.
• V. D. Calhoun, J. J. Pekar, and G. D. Pearlson, Alcohol Intoxication Effects on Simulated Driving: Exploring Alcohol-Dose Effects on Brain Activation Using Functional MRI, Neuropsychopharmacology, vol. 29, pp. 2097-2107, 2004.
Temporal ICA with FastICA
Example: temporal ICA of two signals $u = (u_1, \dots, u_n)$ and $v = (v_1, \dots, v_n)$.
1) Centering and whitening (de-correlate and standardize): $u_w = \Lambda^{-1/2} V^T (u - m)$
2) Fixed-point iteration: $w_i \leftarrow E\{u_w \, g(w_i^T u_w)\} - E\{g'(w_i^T u_w)\} \, w_i$, followed by normalization of $w_i$
3) Convergence: $M = V \Lambda^{-1/2} W$, $S = X M^{-1}$
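The three steps can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in the talk: the function names `whiten` and `fast_ica`, the choice g = tanh, and the deflation scheme (extracting one component at a time) are assumptions. The code also stores signals as rows, so its matrices are transposed relative to the slide's notation.

```python
import numpy as np


def whiten(X):
    """Center and whiten X (one signal per row, one sample per column)."""
    m = X.mean(axis=1, keepdims=True)
    Xc = X - m
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, V = np.linalg.eigh(cov)          # cov = V diag(eigvals) V^T
    K = np.diag(eigvals ** -0.5) @ V.T        # whitening matrix Lambda^{-1/2} V^T
    return K @ Xc, K


def fast_ica(X, n_iter=200, tol=1e-8, seed=0):
    """Deflation-based FastICA sketch with g = tanh (assumed nonlinearity).

    Returns the estimated sources and the unmixing matrix for centered data.
    """
    rng = np.random.default_rng(seed)
    Z, K = whiten(X)
    n = Z.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        w = rng.standard_normal(n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            wz = w @ Z
            g = np.tanh(wz)
            g_prime = 1.0 - g ** 2
            # Fixed-point update: E{z g(w^T z)} - E{g'(w^T z)} w
            w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
            # Keep the new direction orthogonal to components already found
            w_new -= W[:i].T @ (W[:i] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        W[i] = w
    return W @ Z, W @ K
```

Calling `fast_ica(X)` on a matrix of mixed signals returns components that are recovered only up to order, sign, and scale, which is the usual ambiguity of ICA.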
Results of Spatial ICA on Task-fMRI
Experiment: the subject presses a button as soon as she sees a red light.
Spatial ICA: X = S M (S: independent components (IC), M: time series)
• IC1: visual cortex
• IC2: basal regions
The red time series of IC1 precedes the green time series of IC2.
Existing Approaches to Joint ICA
1) Scale EEG and fMRI data to a common resolution and perform the usual ICA.
Problem: Information loss!
V. D. Calhoun, T. Adali, N. R. Giuliani, J. J. Pekar, K. A. Kiehl, and G. D. Pearlson, Method for multimodal analysis of independent source differences in schizophrenia: combining gray matter structural and auditory oddball functional data, HBM, vol. 27, pp. 47-62, 2006.
Existing Approaches to Joint ICA
2) Perform ICA on each modality (EEG, fMRI) separately.
Problem: How to interpret the result?
3) Parallel ICA: change the objective function of ICA to find similar components in both modalities.
Problem: The objective function now has two different goals. How to weight them? Parametrization is difficult. Perhaps use concepts of information theory for this? -> Later
Our Idea: Probabilistic ICA
How to represent uncertain measurements? Represent each object (x, y, z, t) as a probability density function (PDF) and perform joint ICA on these PDFs.
Probabilistic ICA Combined with Information-theoretic Clustering
• The classical ICA model assumes a global mixing matrix A. This assumption does not always hold, especially for data from different modalities.
• Do not force integration by parameters; let the data decide.
• Combine ICA with clustering!
OCI: Outlier-robust Clustering using Independent Components (SIGMOD 2008)
• Parameter-free clustering
• Non-Gaussian clusters
• Noise
…so far only for certain data (that is, data without uncertainty).
Relationship between PDFs and Data Compression
Suppose we know the mixing matrix and have two candidate PDFs for coordinate z_i.
Information theory: we want to transmit the data, and sender and receiver both know the correct PDF. The minimum description length is determined by this PDF.
We do not know the correct PDF. Try both!
[Figure: candidate PDFs over the data, with regions labeled "too many bits", "too few bits", and "good fit"]
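The "try both" idea can be illustrated with a small sketch. It relies on the standard result that an optimal code spends about -log2 p(x) bits on a value x when the PDF p is known. The two candidate families (Gaussian and Laplacian), the maximum-likelihood fits, and all function names are assumptions chosen for illustration; the slide does not say which PDFs are compared. The constant cost of discretizing continuous values is the same for both candidates and is ignored.

```python
import numpy as np


def gaussian_bits(x):
    """Coding cost in bits of x under a maximum-likelihood Gaussian fit."""
    mu, sigma = x.mean(), x.std()
    log_pdf = -0.5 * np.log(2 * np.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
    return -np.sum(log_pdf) / np.log(2)


def laplace_bits(x):
    """Coding cost in bits of x under a maximum-likelihood Laplacian fit."""
    mu = np.median(x)
    b = np.mean(np.abs(x - mu))
    log_pdf = -np.log(2 * b) - np.abs(x - mu) / b
    return -np.sum(log_pdf) / np.log(2)


def best_pdf(x):
    """Try both candidate PDFs and keep the one with the lower coding cost."""
    costs = {"gaussian": gaussian_bits(x), "laplace": laplace_bits(x)}
    return min(costs, key=costs.get), costs


# Usage: heavy-tailed data should be cheaper to code with the Laplacian.
rng = np.random.default_rng(0)
name, costs = best_pdf(rng.laplace(size=5000))
print(name, {k: round(v) for k, v in costs.items()})
```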
ICA and Data Compression
• ICA yields a mixing matrix whose directions have minimal entropy -> efficient coding.
• Apply the FastICA algorithm at the cluster level.
• ICA minimizes entropy -> reduces uncertainty -> reduces compression cost.
[Figure panels: centering, whitening, after 1 iteration, after 4 iterations; x before and x after]
Data Integration and Information Theory
Concepts of information theory provide a means to measure how different the information from different sources is. If the information is similar, it can be compressed effectively together. Therefore, information-theoretic clustering is a parameter-free approach to data integration.
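A toy sketch of "similar information compresses well together": compare the cost of coding two samples with one shared model against the cost of coding them with two separate models. The Gaussian model, the parameter cost of (k/2) log2 n bits for k parameters, and the names `coding_cost_bits` and `saved_bits` are assumptions made for illustration, not the method of the talk.

```python
import numpy as np


def coding_cost_bits(x, n_params=2):
    """Bits to code x with a fitted Gaussian plus a simple parameter cost."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    log_pdf = -0.5 * np.log(2 * np.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
    data_bits = -np.sum(log_pdf) / np.log(2)
    param_bits = 0.5 * n_params * np.log2(x.size)   # assumed parameter cost
    return data_bits + param_bits


def saved_bits(a, b):
    """Positive if coding a and b together is cheaper than coding them apart."""
    separate = coding_cost_bits(a) + coding_cost_bits(b)
    joint = coding_cost_bits(np.concatenate([a, b]))
    return separate - joint


rng = np.random.default_rng(0)
similar = saved_bits(rng.normal(0, 1, 1000), rng.normal(0, 1, 1000))
different = saved_bits(rng.normal(0, 1, 1000), rng.normal(5, 1, 1000))
print(f"similar sources save {similar:.1f} bits, different sources save {different:.1f} bits")
```

No threshold or weighting parameter is needed here: the sign of the saving alone decides whether the two sources should be merged.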
Conclusion
• Integrative mining of uncertain data is a challenging task of emerging importance in many applications.
• We discussed an example from the neurosciences and some possible ideas, but there are many, many other applications and ideas.
• This is a very interesting problem specification for basic research in data mining.
• Have fun!