Bayesian Processing of cDNA Microarray Images Neil Lawrence Machine Learning Group Department of Computer Science The University of Sheffield
Overview • cDNA Microarray Review • Bayesian Model for Image Processing (Bioinformatics, In press) • cDNA Summary • Tracking • Toy application • Real application • Tracking Summary
cDNA Microarrays • Genes -> RNA -> protein (Central Dogma). • The proteins made determine the role of the cell. • Gene expression • we want to measure which genes are expressing themselves. • important in cell differentiation & interaction networks. • Collaborators: • Biologists: Pen Rashbass, Matthew Holley, Alireza Fazeli. Sheffield Medicine & Biomedical Sciences. • Comp Sci: Marta Milo, Mahesan Niranjan.
Microarray Slides • Slides consist of genetic material (DNA) placed by a robot. • Slides are then • hybridised with fluorescently labelled mutant and wild-type DNA samples. • scanned by a laser system. • Mutant is one colour, wild-type another.
Microarray Slides • Slides consist of genetic material placed by a robot. [Figure: robot places DNA in solution onto the slide; different solutions correspond to different spots.]
Microarray Slides • Expression ratio • the ratio of red/green intensity. • Software packages allow • manual placement of grids/spots. • extraction of mean/median pixel intensities. • We wish to • automate placement of spots. • extract ratios with uncertainty measures.
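A minimal sketch of the kind of quantity being extracted: a median-based log red/green ratio over a spot. The array names `red`, `green` and the boolean `spot_mask` are hypothetical stand-ins for scanner output, not anything named on the slides.

```python
import numpy as np

def log_expression_ratio(red, green, spot_mask):
    """Median-based log2 red/green ratio over the pixels inside a spot."""
    r = np.median(red[spot_mask])    # foreground intensity, red channel
    g = np.median(green[spot_mask])  # foreground intensity, green channel
    return np.log2(r / g)
```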
The Model • Bayes' rule: posterior p(x | z) ∝ p(z | x) p(x), i.e. likelihood × prior. • z is the measurements (e.g. pixel intensities). • x is the latent variable (e.g. a state space of interest).
The Model • x – x, y position and radii of the spot. • prior – taken from the known grid layout. • likelihood – expected pixel values given spot shape and location. • R_x – region specified by x. • p(z_i | F) and p(z_i | B) – foreground and background models.
Model Inference • We wish to extract gene expression ratios. • Ratio value depends on spot location. • Can also find variance etc. • How do we get p(x|z)?
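Spelling out the last two slides as equations (the pixel-independence factorisation is an assumption of mine; the slides only name the foreground and background models):

```latex
p(x \mid z) = \frac{p(z \mid x)\, p(x)}{p(z)},
\qquad
p(z \mid x) = \prod_{i \in R_x} p(z_i \mid F) \prod_{i \notin R_x} p(z_i \mid B).
```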
Importance Sampling • Obtain samples {x_s}_{s=1}^S from a proposal r(x). • Associate weights w_s = p(x_s) p(z | x_s) / r(x_s) with the samples to obtain estimates of moments under the posterior.
Importance Sampling • If the proposal is the prior, r(x) = p(x), • the weights become the likelihood: w_s = p(z | x_s). • Demo of approach:
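A minimal runnable sketch of this vanilla scheme, using the prior as proposal. The function names `sample_prior` and `log_likelihood` are hypothetical user-supplied inputs:

```python
import numpy as np

def importance_sample(sample_prior, log_likelihood, z, S=1000):
    """Estimate posterior moments of x given data z.

    With the prior as proposal, r(x) = p(x), the importance weight
    of each sample reduces to its likelihood p(z | x_s).
    """
    xs = np.array([sample_prior() for _ in range(S)])    # x_s ~ p(x)
    logw = np.array([log_likelihood(z, x) for x in xs])  # log p(z | x_s)
    w = np.exp(logw - logw.max())                        # stabilise
    w /= w.sum()                                         # normalise
    mean = w @ xs                                        # <x>
    second = (w[:, None] * xs).T @ xs                    # <x x^T>
    return mean, second
```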
Approach Demo • msrcOne.m
Fixes • `Anneal’ the likelihood values • either raise it to the power of a fraction (don’t like this!) – see the sketch after this slide • or use simulated annealing (annealing schedule). • Take more samples • may need very many more. • Our approach: • a hierarchical Bayesian model (likelihood unchanged; extra structure on the prior).
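The "raise it to the power of a fraction" fix corresponds to tempering the importance weights. A tiny self-contained illustration, with a toy temperature T and toy log-likelihood values of my own choosing:

```python
import numpy as np

# Tempering the likelihood: raising weights to a fractional power 1/T
# (T > 1) flattens the weight distribution when a handful of samples
# would otherwise carry all of the mass.
logw = np.array([-410.0, -400.0, -405.0])  # toy log-likelihood values
T = 10.0
w = np.exp((logw - logw.max()) / T)
w /= w.sum()
```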
The New Model • As before, z is the measurements and x is the latent variable. • p(μ) and p(P) are hyper-priors – priors on the parameters μ and P of the prior over x. • [Graphical model: μ and P feed the prior on x; the likelihood links x to z.]
The Plan • The prior is now p(x | μ, P), governed by the hyper-priors p(μ) and p(P). • We will use the hyper-priors to form `hyper-posteriors’. • These `hyper-posteriors’ will govern the proposal. • The proposal will then adapt to z.
New Approach – Adaptive Proposal • Likelihood: p(z | x), as before. • Proposal: p(x | ⟨μ⟩, ⟨P⟩⁻¹), built from expectations under the `hyper-posteriors’.
Model Specification • We take a Gaussian hyper-prior on μ and a Wishart hyper-prior on P (see the sketch below). • Given the Gaussian prior p(x | μ, P) = N(x | μ, P⁻¹), the marginalised prior is intractable: • a `multi-variate Student-t with a Gaussian prior on its mean.’
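A minimal sketch of this hierarchy, assuming independent Gaussian and Wishart hyper-priors with hypothetical hyper-parameters μ₀, Σ₀, ν, V (the slide's actual settings are not shown):

```latex
\begin{align*}
p(x \mid \mu, P) &= \mathcal{N}\!\left(x \mid \mu,\, P^{-1}\right)\\
p(\mu) &= \mathcal{N}\!\left(\mu \mid \mu_0,\, \Sigma_0\right)\\
p(P)   &= \mathcal{W}\!\left(P \mid \nu,\, V\right)\\
p(x)   &= \iint \mathcal{N}\!\left(x \mid \mu, P^{-1}\right)
          p(\mu)\, p(P)\; \mathrm{d}\mu\, \mathrm{d}P
\end{align*}
```

Marginalising P alone gives a multivariate Student-t centred on μ; the remaining integral over the Gaussian μ has no closed form, hence "a Student-t with a Gaussian prior on its mean".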
Variational Inference • We have chosen distributions from the conjugate exponential family. • This means we can use variational inference to approximate the posterior. • Variational inference: assume the posterior factorises, q(x, μ, P) = q(x) q(μ) q(P).
Variational Inference • What are the best forms for these factors? • Answer: • q(μ) is Gaussian – relies on ⟨P⟩ and ⟨x⟩. • q(P) is Wishart – relies on ⟨μ⟩, ⟨μμᵀ⟩, ⟨xxᵀ⟩ and ⟨x⟩. • q(x) however is more difficult. Notation: ⟨·⟩ denotes expectation.
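As a worked instance, under the independent Gaussian/Wishart hyper-priors sketched above, the standard conjugate update for q(μ) is (a sketch, with μ₀, Σ₀ the assumed hyper-parameters):

```latex
q(\mu) = \mathcal{N}\!\left(\mu \mid m,\, S\right),
\qquad
S = \left(\Sigma_0^{-1} + \langle P \rangle\right)^{-1},
\qquad
m = S\left(\Sigma_0^{-1}\mu_0 + \langle P \rangle \langle x \rangle\right),
```

which indeed involves only ⟨P⟩ and ⟨x⟩.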
Variational Inference • q(x) ∝ p(z | x) N(x | ⟨μ⟩, ⟨P⟩⁻¹) – relies on ⟨P⟩ and ⟨μ⟩. • Finding the constant of proportionality is now intractable. • Turn to sampling to compute ⟨x⟩ and ⟨xxᵀ⟩.
VIS • Variational importance sampler: • estimate moments ⟨x⟩ and ⟨xxᵀ⟩ by importance sampling. • Use N(x | ⟨μ⟩, ⟨P⟩⁻¹) as the proposal. • Comparison with vanilla method: • proposal distribution mean and covariance reflect the data. • Iterative method: alternate between sampling these moments and updating the `hyper-posteriors’ q(μ) and q(P) – see the sketch after this slide.
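A minimal runnable sketch of this loop in Python/NumPy, under the independent Gaussian/Wishart hyper-priors assumed earlier. `log_likelihood`, `mu0`, `Sigma0`, `nu`, `V` are hypothetical inputs, and the Wishart convention ⟨P⟩ = νV is assumed; this is a sketch of the scheme, not the authors' implementation:

```python
import numpy as np

def variational_importance_sampler(z, log_likelihood, mu0, Sigma0, nu, V,
                                   S=500, n_iters=10, rng=None):
    """Sketch of the VIS loop: alternate (1) importance sampling of <x>,
    <x x^T> with proposal N(<mu>, <P>^-1) and (2) conjugate updates of
    q(mu) (Gaussian) and q(P) (Wishart)."""
    rng = rng or np.random.default_rng(0)
    E_mu, E_P = np.asarray(mu0, float), nu * V   # Wishart mean <P> = nu V
    for _ in range(n_iters):
        # (1) Proposal reflects the current hyper-posterior beliefs about x.
        xs = rng.multivariate_normal(E_mu, np.linalg.inv(E_P), size=S)
        # With this proposal the Gaussian prior factor cancels, so the
        # importance weights reduce to the likelihood p(z | x_s).
        logw = np.array([log_likelihood(z, x) for x in xs])
        w = np.exp(logw - logw.max())
        w /= w.sum()
        Ex = w @ xs                         # <x>
        Exx = (w[:, None] * xs).T @ xs      # <x x^T>
        # (2) Conjugate Gaussian update of q(mu) = N(m, S_mu).
        S_mu = np.linalg.inv(np.linalg.inv(Sigma0) + E_P)
        E_mu = S_mu @ (np.linalg.solve(Sigma0, mu0) + E_P @ Ex)
        # Conjugate Wishart update of q(P), via <(x - mu)(x - mu)^T>.
        E_mumu = S_mu + np.outer(E_mu, E_mu)
        C = Exx - np.outer(Ex, E_mu) - np.outer(E_mu, Ex) + E_mumu
        E_P = (nu + 1) * np.linalg.inv(np.linalg.inv(V) + C)
    return E_mu, E_P, Ex, Exx
```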
Computational Requirements • Computation is dominated by likelihood evaluation. • Vanilla importance sampling: • cost rises with S. • Variational importance sampling: • cost rises with S × the number of iterations. • The approach is worthwhile if accuracy improves more quickly with iterations than with the number of samples.
Back to Demo • msrcTwo.m
Numerical Results • [Table: mean squared error for duplicate-to-duplicate prediction.]
Simple Downstream Analysis • Aphakia – a developmental defect affecting the lens of the eye. • Samples of mutant and wild-type mice from E10.5, E11.5 and E12.5. • Here results are plotted on a 2-D graph. • x-axis is E10.5 – E11.5 • y-axis is E11.5 – E12.5
cDNA Summary • Automatic processing of cDNA images. • Improved consistency of image processing. • Used uncertainty in the combination of replicates. • Future work: • propagate uncertainty to more complex downstream analyses. • automate the initial rough grid layout.
Dynamic Model • The variational importance sampler is a general methodology. • Here we apply it in tracking. • Joint work with: • Jaco Vermaak, Cambridge Engineering • Patrick Perez, Microsoft Research • Presented at CVPR 03
Dynamic Model • [Graphical model unrolled over time: hyper-parameters μ_t, P_t and μ_{t+1}, P_{t+1}; latent states x_t, x_{t+1}; observations z_t, z_{t+1}.]
Dynamic Model • `Prior’ at step t+1 is taken as p(x_{t+1} | μ_{t+1}, P_{t+1}), with the `hyper-posteriors’ from step t carried forward as the hyper-priors – a sketch of this loop follows.
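A hedged sketch of how this tracking loop could look in code, reusing the `variational_importance_sampler` sketched earlier. The motion-noise inflation `Q` and holding ν, V fixed are simplifications of my own, not details given on the slides:

```python
import numpy as np

def track(frames, log_likelihood, mu0, Sigma0, nu, V, Q=None):
    """Toy tracking loop: the hyper-posterior over (mu, P) from step t,
    optionally diffused by motion noise Q, becomes the hyper-prior at t+1."""
    states = []
    for z in frames:
        mu0, E_P, Ex, _ = variational_importance_sampler(
            z, log_likelihood, mu0, Sigma0, nu, V)
        states.append(Ex)                  # tracked state estimate at t
        Sigma0 = np.linalg.inv(E_P)        # carry uncertainty forward
        if Q is not None:
            Sigma0 = Sigma0 + Q            # inflate for unmodelled motion
    return states
```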
Overall Summary • Bayesian Inference on images: • No need to artificially adjust likelihood. • Considerably better performance. • Developments: • Multi-modality through mixtures.