Visualization and analysis of large data collections: a case study applied to confocal microscopy data
Wim de Leeuw, Swammerdam Institute for Life Sciences, Amsterdam
Pernette Verschure, Swammerdam Institute for Life Sciences, Amsterdam
Robert van Liere, Center for Mathematics and Computer Science, Amsterdam
Motivation (1)
• Context: cell biology experiments
• Phenomenon captured using digital microscopy
• Experiment characteristics:
  • Biological diversity
  • Not all biological parameters can be controlled
  • Many measurements needed
Motivation (2)
• Visualization and analysis of collections of data sets
  • High variability
  • Non-trivial information extraction (e.g. segmentation)
  • Noise
• Visualization modes: interactive vs. batch
  • Interactive control and feedback vs. static parameter settings
  • Time-consuming vs. multiple data sets processed simultaneously
• Aim: combine the advantages of interactive and batch visualization
Agenda
• Biological problem
  • Chromatin structure and gene control
• Visualization problem
  • Data collection description
  • Analysis with visual summaries
Chromatin Structure and Gene Control
• Chromatin structure
  • Low level: DNA, nucleosomes, 30 nm fiber
  • High level: fiber folding
• Gene control
  • Regulation of gene activity
• Biological research question: the relation between chromatin structure and gene control
  • Is there a relation, what is it, when does it hold, etc.
Experiment
• Question: what is the influence of heterochromatin protein 1 (HP1) on chromatin structure?
• Approach:
  • Prepare a collection of cells with a specific region
  • Control group: target GFP to the region
  • HP1 group: target GFP/HP1 to the region
  • Observe the regions with a confocal microscope
• Data analysis question: identify and quantify the differences between the control and HP1 groups
Collection of data sets
• 60 data sets (30 control group, 30 HP1 group)
• Each data set: 512 x 512 x 32
• Sample images: control group (left), HP1 group (right)
• Data analysis questions:
  • Accurately detect the region of interest
  • Quantify region attributes (size, roughness, roundness, etc.)
  • What are the attribute differences between the control and HP1 groups?
Interactive Visualization of Collection
• Advantages
  • Control over visualization tools and parameters
    • Segmentation
    • Attribute computations
  • Direct feedback
• Disadvantages
  • Laborious
  • Error prone
Batch processing of collection
• Advantages
  • All sets are processed automatically
  • A priori parameter settings
• Disadvantage
  • No feedback on the process
Visual Summaries
• Definition: a user-defined, compact visual representation of the data during (batch) processing
• Governing idea: the visual summary is used to visualize the steps in the batch process
• Examples: [figures]
• General strategy:
  • Interactive setup (determine parameters, attributes, etc.)
  • Batch processing using the setup
  • Information visualization with visual summaries
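The general strategy (interactive setup, then batch processing that keeps a visual summary per data set) can be sketched as follows. This is a minimal illustration and not the Argos implementation: the names `Setup`, `Result`, `max_projection` and `run_batch` are hypothetical, a single intensity threshold stands in for the interactively chosen parameters, and a maximum-intensity projection stands in for the visual summary.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    """Parameters fixed once during the interactive phase (hypothetical)."""
    threshold: float


@dataclass
class Result:
    """Per-data-set batch output: one attribute plus a compact visual summary."""
    name: str
    region_size: int   # number of voxels above the threshold
    summary: list      # 2D maximum-intensity projection, summary[y][x]


def max_projection(volume):
    """Collapse volume[z][y][x] into a 2D image by taking the maximum over z."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth)) for x in range(width)]
            for y in range(height)]


def process(name, volume, setup):
    """Apply the fixed setup to one data set and keep its visual summary."""
    size = sum(1 for plane in volume for row in plane for v in row
               if v > setup.threshold)
    return Result(name, size, max_projection(volume))


def run_batch(datasets, setup):
    """Process every data set with the same a priori parameter settings."""
    return [process(name, volume, setup) for name, volume in datasets.items()]
```

Keeping the summary next to the attributes is what lets a later plot of `region_size` link back to an image of the data set that produced each point.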
Discriminating groups
• Red: HP1 sets, green: control
• Region granularity vs. number of spots in region
• Granularity attribute
  • Average intensity gradient of region
• Plot tells us:
  • Large variation, some outliers
  • HP1 and control seem different
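The granularity attribute, defined above as the average intensity gradient of the region, could be computed along these lines. This is a sketch under stated assumptions: the gradient is estimated with central differences, voxel spacing is taken as 1 in every direction (confocal stacks usually have a coarser z spacing), voxels on the volume border are skipped, and the region is given as a boolean mask. The function names are hypothetical.

```python
import math


def gradient_magnitude(volume, z, y, x):
    """Central-difference gradient magnitude at an interior voxel of volume[z][y][x]."""
    gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / 2.0
    gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / 2.0
    gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / 2.0
    return math.sqrt(gx * gx + gy * gy + gz * gz)


def granularity(volume, mask):
    """Mean gradient magnitude over interior voxels where mask[z][y][x] is true.

    Assumes unit voxel spacing; border voxels are ignored because the
    central difference needs a neighbour on both sides.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    total, count = 0.0, 0
    for z in range(1, depth - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                if mask[z][y][x]:
                    total += gradient_magnitude(volume, z, y, x)
                    count += 1
    return total / count if count else 0.0
```

A smooth region gives a value near zero, while a region broken into many bright spots gives a larger value.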
Large variation, some outliers
• Brush / link outliers
• Investigate visual summary
• Problems with data set
  • Corrupt data
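On the slides, outliers are picked by brushing in the plot. As a complement, candidate outliers could be pre-selected programmatically so that their visual summaries are inspected first. The sketch below uses the common interquartile-range fence rule, which is an assumption of this example and not a method stated in the talk; `flag_outliers` and its parameter `k` are hypothetical names.

```python
import statistics


def flag_outliers(attribute_by_dataset, k=1.5):
    """Return the names of data sets whose attribute lies outside the IQR fences.

    The fence rule (first quartile - k * IQR, third quartile + k * IQR) is an
    assumption of this sketch. Needs at least two data sets.
    """
    values = list(attribute_by_dataset.values())
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(name for name, value in attribute_by_dataset.items()
                  if value < low or value > high)
```

Each returned name identifies a data set whose visual summary is worth a look before deciding whether the data are corrupt.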
HP1 and control seem different
• Further analysis
  • Histograms
  • Box plots
  • Statistical tests
    • Wilcoxon
• The Wilcoxon test tells us that there is indeed a significant difference
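The slide names the Wilcoxon test without further detail. For two independent groups such as control and HP1, the rank-sum variant is the natural fit, so the sketch below assumes that variant. It uses the normal approximation without a tie correction; for small groups an exact test from a statistics package is the safer choice. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (w, z, p): the rank sum of group a, its z-score, and the p-value.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return w, z, p
```

Calling `rank_sum_test(control_values, hp1_values)` on the per-data-set granularity values returns a p-value that is compared against the chosen significance level.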
Lessons learned
• Showing a significant difference in granularity vs. number of spots tells us that HP1 affects the structure of chromatin
  • The effect is that chromatin is condensed into a number of compact regions
  • Biologically significant result; two papers published
• Strategy for analysis of collections of confocal data sets
  • Interactive visualization and batch processing are both needed
  • Information visualization is used for the analysis of batch output
  • Visual summaries are used to link back to the original data set or to previous steps in the batch process
• The strategy has been implemented as the Argos system
Generality
• Argos has been used for the analysis of an experiment consisting of 2500+ confocal data sets
• Argos has been used for the analysis of microarray data