Statistical Techniques for Temporal Microarray Data Analysis

Ritesh Krishna, Department of Computer Science
WPCCS, July 1, 2008

Why should you listen to my talk?
• Systems biology is everybody’s playground in this room: image processing, algorithms, parallel processing, etc.
• Importance of systems biology in today’s context:
  – Agriculture
  – Energy sources (biofuels)
  – Gene therapy
  – Waste clean-up

Use of Computational Techniques
• Massive data generated by molecular biology experiments
• Need to analyse output files produced in various formats, facilitate storage of bulk data, support quick and precise retrieval, and, most importantly, understand the behaviour and patterns in the data

How are these experiments performed?
• Major revolution in the world of molecular biology
• No limitation of one gene per experiment
• Possible to monitor expression levels of thousands of genes simultaneously

An example: Arabidopsis thaliana
• Popular in plant biology as a model plant
• One of the smallest plant genomes
• First plant genome to be sequenced
Present Study
• The present study is about understanding the leaf senescence process in Arabidopsis.
• Senescence refers to the biological processes of a living organism approaching an advanced age; in plants it is brought on by age and stress.
• It is a programmed event that responds to a wide range of external and internal signals and is tightly regulated by different genes and proteins.

Experimental Design
[Figure labels: Dye; Laser; Quantitative Data. Total 16 replicates.]

Issues with data
• Biological variation vs. technical variation
• Technical variation: sample bias, dye bias, slide bias, variation in experimental conditions, scanning and imaging errors, human errors
• Massive dataset with ~31,000 genes
• Goal is to understand the functioning of certain sets of genes (needle in the haystack)

Step one – Clean the raw data using Normalization
• Assess the different sources of technical bias
• Remove the correlations between replicates so that they are independent of each other
• Fit a multivariate error model: the residuals associated with genes follow a normal distribution with mean zero and constant variance
• Propose statistical tests for evaluating the effects of normalization
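The normalization idea above can be illustrated with a minimal sketch. It is not the model used in the study: it assumes a simple additive model on log intensities (gene effect + slide effect + noise), uses synthetic data, and the function name `remove_slide_effects` is hypothetical. It estimates one technical offset per slide, subtracts it, and returns residuals whose means can then be checked against the "mean zero" assumption.

```python
import numpy as np

def remove_slide_effects(log_intensity):
    """Subtract a per-slide offset from a genes x slides matrix of log intensities.

    Assumed additive model (an illustration, not the study's error model):
        y[g, s] = gene[g] + slide[s] + noise[g, s]
    """
    # Centre each gene so that only slide effects and noise remain.
    gene_mean = log_intensity.mean(axis=1, keepdims=True)
    # The average of the centred values in each column estimates that slide's offset.
    slide_effect = (log_intensity - gene_mean).mean(axis=0, keepdims=True)
    normalized = log_intensity - slide_effect
    # Residuals: what is left after removing both gene and slide effects.
    residuals = normalized - normalized.mean(axis=1, keepdims=True)
    return normalized, slide_effect.ravel(), residuals

# Synthetic example: 200 genes on 16 slides (16 matches the replicate count on the
# design slide; all values here are simulated, not real measurements).
rng = np.random.default_rng(0)
n_genes, n_slides = 200, 16
true_gene = rng.normal(8.0, 1.5, size=(n_genes, 1))
true_slide = rng.normal(0.0, 0.5, size=(1, n_slides))
noise = rng.normal(0.0, 0.1, size=(n_genes, n_slides))
data = true_gene + true_slide + noise

normalized, slide_effect, residuals = remove_slide_effects(data)
print("slide means before:", np.round(data.mean(axis=0), 2))
print("slide means after: ", np.round(normalized.mean(axis=0), 2))
print("residual std:", round(float(residuals.std()), 3))
```

After the correction every slide has the same mean, and the residuals are centred on zero for each gene and each slide. A real analysis would also model dye and sample effects and test the residuals for constant variance.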

Step two – Clustering
• Reduce the data dimension
• Genes with similar expression profiles sit in the same cluster
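As a rough illustration of the clustering step, the sketch below groups synthetic temporal profiles with a small k-means routine written in NumPy. The talk does not say which clustering algorithm was used, so k-means, the z-scoring of each profile, the farthest-point initialisation, and the simulated "rising" and "falling" genes are all assumptions made for the example.

```python
import numpy as np

def kmeans(profiles, k, n_iter=50):
    """Plain k-means on a genes x timepoints matrix (illustrative, not optimised)."""
    # Deterministic farthest-point initialisation: start from the first profile,
    # then repeatedly add the profile farthest from the centres chosen so far.
    centers = [profiles[0]]
    for _ in range(1, k):
        d = np.min([np.sum((profiles - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(profiles[np.argmax(d)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Assign each gene to its nearest centre (squared Euclidean distance).
        dist = ((profiles[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centre as the mean of its members; keep empty clusters fixed.
        new_centers = np.array([
            profiles[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic time courses: 10 genes rising over time, 10 genes falling.
rng = np.random.default_rng(1)
time = np.linspace(0.0, 1.0, 8)
rising = time + rng.normal(0.0, 0.05, size=(10, time.size))
falling = (1.0 - time) + rng.normal(0.0, 0.05, size=(10, time.size))
expression = np.vstack([rising, falling])

# z-score each gene so that clusters reflect profile shape rather than magnitude.
z = (expression - expression.mean(axis=1, keepdims=True)) / expression.std(axis=1, keepdims=True)

labels, centers = kmeans(z, k=2)
print(labels)
```

Each cluster centre then acts as a single representative profile for its genes, which is one way the data dimension is reduced before network inference.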

Step three – Causal Network inference

Circadian Circuit
[Figure: network of ELF4, TOC1, LFY, CCA1]

[Figure: network of ERS2, ERS1, ETR2, ETR1, CTR1, EIN2, EIN6, EIN4, EIN3, EIL2, EIL1, EIL4, EIL3, EIL5, ERF1, PDF1.2]

More information
• Affymetrix Inc. (http://www.affymetrix.com/index.affx)
• Agilent Technologies (http://www.chem.agilent.com)
• Gibson G (2003) Microarray Analysis. PLoS Biol 1(1): e15
