Bayesian Processing of cDNA Microarray Images Neil Lawrence Machine Learning Group Department of Computer Science The University of Sheffield
Overview • cDNA Microarray Review • Bayesian Model for Image Processing (Bioinformatics, In press) • cDNA Summary • Tracking • Toy application • Real application • Tracking Summary
cDNA Microarrays • Genes -> RNA -> protein (Central Dogma). • The proteins made determine the role of the cell. • Gene expression • we want to measure which genes are expressing themselves. • important in cell differentiation & interaction networks. • Collaborators: • Biologists: Pen Rashbass, Matthew Holley, Alireza Fazeli. Sheffield Medicine & Biomedical Sciences. • Comp Sci: Marta Milo, Mahesan Niranjan.
Microarray Slides • Slides consist of genetic material (DNA) placed by a robot. • Slides are then • hybridised with fluorescently labelled mutant and wild-type DNA samples. • scanned by a laser system. • Mutant is one colour, wild-type another.
Microarray Slides • Slides consist of genetic material placed by a robot. [Figure: robot places DNA in solution onto the slide; different solutions correspond to different spots.]
Microarray Slides • Expression ratio • the ratio of red/green intensity. • Software packages allow • manual placement of grids/spots. • extraction of mean/median pixel intensities. • We wish to • automate placement of spots. • extract ratios with uncertainty measures.
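A minimal sketch of the kind of quantity being extracted: a median-based log red/green ratio over a spot. The array names `red`, `green` and the boolean `spot_mask` are hypothetical stand-ins for scanner output, not anything named on the slides.

```python
import numpy as np

def log_expression_ratio(red, green, spot_mask):
    """Median-based log2 red/green ratio over the pixels inside a spot."""
    r = np.median(red[spot_mask])    # foreground intensity, red channel
    g = np.median(green[spot_mask])  # foreground intensity, green channel
    return np.log2(r / g)
```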
The Model • Bayes' rule: posterior p(x | z) ∝ p(z | x) p(x), i.e. likelihood × prior. • z is the measurements (e.g. pixel intensities). • x is the latent variable (e.g. a state space of interest).
The Model • x – x, y position and radii of the spot. • prior – taken from the known grid layout. • likelihood – expected pixel values given spot shape and location. • R_x – region specified by x. • p(z_i | F) and p(z_i | B) – foreground and background models.
Model Inference • We wish to extract gene expression ratios. • Ratio value depends on spot location. • Can also find variance etc. • How do we get p(x|z)?
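Spelling out the last two slides as equations (the pixel-independence factorisation is an assumption of mine; the slides only name the foreground and background models):

```latex
p(x \mid z) = \frac{p(z \mid x)\, p(x)}{p(z)},
\qquad
p(z \mid x) = \prod_{i \in R_x} p(z_i \mid F) \prod_{i \notin R_x} p(z_i \mid B).
```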
Importance Sampling • Obtain samples {x_s}_{s=1}^S from a proposal r(x). • Associate weights w_s = p(x_s) p(z | x_s) / r(x_s) with the samples to obtain estimates of moments under the posterior.
Importance Sampling • If the proposal is the prior, r(x) = p(x), • the weights become the likelihood: w_s = p(z | x_s). • Demo of approach:
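A minimal runnable sketch of this vanilla scheme, using the prior as proposal. The function names `sample_prior` and `log_likelihood` are hypothetical user-supplied inputs:

```python
import numpy as np

def importance_sample(sample_prior, log_likelihood, z, S=1000):
    """Estimate posterior moments of x given data z.

    With the prior as proposal, r(x) = p(x), the importance weight
    of each sample reduces to its likelihood p(z | x_s).
    """
    xs = np.array([sample_prior() for _ in range(S)])    # x_s ~ p(x)
    logw = np.array([log_likelihood(z, x) for x in xs])  # log p(z | x_s)
    w = np.exp(logw - logw.max())                        # stabilise
    w /= w.sum()                                         # normalise
    mean = w @ xs                                        # <x>
    second = (w[:, None] * xs).T @ xs                    # <x x^T>
    return mean, second
```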
Approach Demo • msrcOne.m
Fixes • `Anneal’ the likelihood values • either raise it to the power of a fraction (don’t like this!) – see the sketch after this slide • or use simulated annealing (annealing schedule). • Take more samples • may need very many more. • Our approach: • a hierarchical Bayesian model (likelihood unchanged; extra structure on the prior).
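The "raise it to the power of a fraction" fix corresponds to tempering the importance weights. A tiny self-contained illustration, with a toy temperature T and toy log-likelihood values of my own choosing:

```python
import numpy as np

# Tempering the likelihood: raising weights to a fractional power 1/T
# (T > 1) flattens the weight distribution when a handful of samples
# would otherwise carry all of the mass.
logw = np.array([-410.0, -400.0, -405.0])  # toy log-likelihood values
T = 10.0
w = np.exp((logw - logw.max()) / T)
w /= w.sum()
```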
The New Model • As before, z is the measurements and x is the latent variable. • p(μ) and p(P) are hyper-priors – priors on the parameters μ and P of the prior over x. • [Graphical model: μ and P feed the prior on x; the likelihood links x to z.]
The Plan • The prior is now p(x | μ, P), governed by the hyper-priors p(μ) and p(P). • We will use the hyper-priors to form `hyper-posteriors’. • These `hyper-posteriors’ will govern the proposal. • The proposal will then adapt to z.
New Approach – Adaptive Proposal • Likelihood: p(z | x), as before. • Proposal: p(x | ⟨μ⟩, ⟨P⟩⁻¹), built from expectations under the `hyper-posteriors’.
Model Specification • We take a Gaussian hyper-prior on μ and a Wishart hyper-prior on P (see the sketch below). • Given the Gaussian prior p(x | μ, P) = N(x | μ, P⁻¹), the marginalised prior is intractable: • a `multi-variate Student-t with a Gaussian prior on its mean.’
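A minimal sketch of this hierarchy, assuming independent Gaussian and Wishart hyper-priors with hypothetical hyper-parameters μ₀, Σ₀, ν, V (the slide's actual settings are not shown):

```latex
\begin{align*}
p(x \mid \mu, P) &= \mathcal{N}\!\left(x \mid \mu,\, P^{-1}\right)\\
p(\mu) &= \mathcal{N}\!\left(\mu \mid \mu_0,\, \Sigma_0\right)\\
p(P)   &= \mathcal{W}\!\left(P \mid \nu,\, V\right)\\
p(x)   &= \iint \mathcal{N}\!\left(x \mid \mu, P^{-1}\right)
          p(\mu)\, p(P)\; \mathrm{d}\mu\, \mathrm{d}P
\end{align*}
```

Marginalising P alone gives a multivariate Student-t centred on μ; the remaining integral over the Gaussian μ has no closed form, hence "a Student-t with a Gaussian prior on its mean".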
Variational Inference • We have chosen distributions from the conjugate exponential family. • This means we can use variational inference to approximate the posterior. • Variational inference: assume the posterior factorises, q(x, μ, P) = q(x) q(μ) q(P).
Variational Inference • What are the best forms for these factors? • Answer: • q(μ) is Gaussian – relies on ⟨P⟩ and ⟨x⟩. • q(P) is Wishart – relies on ⟨μ⟩, ⟨μμᵀ⟩, ⟨xxᵀ⟩ and ⟨x⟩. • q(x) however is more difficult. Notation: ⟨·⟩ denotes expectation.
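As a worked instance, under the independent Gaussian/Wishart hyper-priors sketched above, the standard conjugate update for q(μ) is (a sketch, with μ₀, Σ₀ the assumed hyper-parameters):

```latex
q(\mu) = \mathcal{N}\!\left(\mu \mid m,\, S\right),
\qquad
S = \left(\Sigma_0^{-1} + \langle P \rangle\right)^{-1},
\qquad
m = S\left(\Sigma_0^{-1}\mu_0 + \langle P \rangle \langle x \rangle\right),
```

which indeed involves only ⟨P⟩ and ⟨x⟩.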
Variational Inference • q(x) ∝ p(z | x) N(x | ⟨μ⟩, ⟨P⟩⁻¹) – relies on ⟨P⟩ and ⟨μ⟩. • Finding the constant of proportionality is now intractable. • Turn to sampling to compute ⟨x⟩ and ⟨xxᵀ⟩.
VIS • Variational importance sampler: • estimate moments ⟨x⟩ and ⟨xxᵀ⟩ by importance sampling. • Use N(x | ⟨μ⟩, ⟨P⟩⁻¹) as the proposal. • Comparison with vanilla method: • proposal distribution mean and covariance reflect the data. • Iterative method: alternate between sampling these moments and updating the `hyper-posteriors’ q(μ) and q(P) – see the sketch after this slide.
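A minimal runnable sketch of this loop in Python/NumPy, under the independent Gaussian/Wishart hyper-priors assumed earlier. `log_likelihood`, `mu0`, `Sigma0`, `nu`, `V` are hypothetical inputs, and the Wishart convention ⟨P⟩ = νV is assumed; this is a sketch of the scheme, not the authors' implementation:

```python
import numpy as np

def variational_importance_sampler(z, log_likelihood, mu0, Sigma0, nu, V,
                                   S=500, n_iters=10, rng=None):
    """Sketch of the VIS loop: alternate (1) importance sampling of <x>,
    <x x^T> with proposal N(<mu>, <P>^-1) and (2) conjugate updates of
    q(mu) (Gaussian) and q(P) (Wishart)."""
    rng = rng or np.random.default_rng(0)
    E_mu, E_P = np.asarray(mu0, float), nu * V   # Wishart mean <P> = nu V
    for _ in range(n_iters):
        # (1) Proposal reflects the current hyper-posterior beliefs about x.
        xs = rng.multivariate_normal(E_mu, np.linalg.inv(E_P), size=S)
        # With this proposal the Gaussian prior factor cancels, so the
        # importance weights reduce to the likelihood p(z | x_s).
        logw = np.array([log_likelihood(z, x) for x in xs])
        w = np.exp(logw - logw.max())
        w /= w.sum()
        Ex = w @ xs                         # <x>
        Exx = (w[:, None] * xs).T @ xs      # <x x^T>
        # (2) Conjugate Gaussian update of q(mu) = N(m, S_mu).
        S_mu = np.linalg.inv(np.linalg.inv(Sigma0) + E_P)
        E_mu = S_mu @ (np.linalg.solve(Sigma0, mu0) + E_P @ Ex)
        # Conjugate Wishart update of q(P), via <(x - mu)(x - mu)^T>.
        E_mumu = S_mu + np.outer(E_mu, E_mu)
        C = Exx - np.outer(Ex, E_mu) - np.outer(E_mu, Ex) + E_mumu
        E_P = (nu + 1) * np.linalg.inv(np.linalg.inv(V) + C)
    return E_mu, E_P, Ex, Exx
```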
Computational Requirements • Computation is dominated by likelihood evaluation. • Vanilla importance sampling: • cost rises with S. • Variational importance sampling: • cost rises with S × the number of iterations. • The approach is worthwhile if accuracy improves more quickly with iterations than with the number of samples.
Back to Demo • msrcTwo.m
Numerical Results • [Table: mean squared error for duplicate-to-duplicate prediction.]
Simple Downstream Analysis • Aphakia – a developmental defect affecting the lens of the eye. • Samples of mutant and wild-type mice from E10.5, E11.5 and E12.5. • Here results are plotted on a 2-D graph. • x-axis is E10.5 – E11.5 • y-axis is E11.5 – E12.5
cDNA Summary • Automatic processing of cDNA images. • Improved consistency of image processing. • Used uncertainty in the combination of replicates. • Future work: • propagate uncertainty to more complex downstream analyses. • automate the initial rough grid layout.
Dynamic Model • The variational importance sampler is a general methodology. • Here we apply it in tracking. • Joint work with: • Jaco Vermaak, Cambridge Engineering • Patrick Perez, Microsoft Research • Presented at CVPR 03
Dynamic Model • [Graphical model unrolled over time: hyper-parameters μ_t, P_t and μ_{t+1}, P_{t+1}; latent states x_t, x_{t+1}; observations z_t, z_{t+1}.]
Dynamic Model • `Prior’ at step t+1 is taken as p(x_{t+1} | μ_{t+1}, P_{t+1}), with the `hyper-posteriors’ from step t carried forward as the hyper-priors – a sketch of this loop follows.
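A hedged sketch of how this tracking loop could look in code, reusing the `variational_importance_sampler` sketched earlier. The motion-noise inflation `Q` and holding ν, V fixed are simplifications of my own, not details given on the slides:

```python
import numpy as np

def track(frames, log_likelihood, mu0, Sigma0, nu, V, Q=None):
    """Toy tracking loop: the hyper-posterior over (mu, P) from step t,
    optionally diffused by motion noise Q, becomes the hyper-prior at t+1."""
    states = []
    for z in frames:
        mu0, E_P, Ex, _ = variational_importance_sampler(
            z, log_likelihood, mu0, Sigma0, nu, V)
        states.append(Ex)                  # tracked state estimate at t
        Sigma0 = np.linalg.inv(E_P)        # carry uncertainty forward
        if Q is not None:
            Sigma0 = Sigma0 + Q            # inflate for unmodelled motion
    return states
```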
Overall Summary • Bayesian Inference on images: • No need to artificially adjust likelihood. • Considerably better performance. • Developments: • Multi-modality through mixtures.