
Tom Kepler, Santa Fe Institute: Normalization and Analysis of DNA Microarray Data by Self-Consistency and Local Regression


Presentation Transcript


  1. Tom Kepler, Santa Fe Institute. Normalization and Analysis of DNA Microarray Data by Self-Consistency and Local Regression. kepler@santafe.edu

  2. Rat mesothelioma cells, control. Rat mesothelioma cells, treated with KBrO2.

  3. Normalization. Method to be improved:
     • Assume that some genes will not change under the treatment under investigation.
     • Identify these core genes in advance of the experiment.
     • Normalize all genes against these core genes, assuming they do not change.

  4. Normalization. New method:
     1. Assume that some genes will not change under the treatment under investigation.
     2. Choose these core genes arbitrarily.
     3. Normalize (provisionally) all genes against these core genes, assuming they do not change.
     4. Determine which genes do not change under this normalization.
     5. Make this set the new core. If this core differs from the previous core, go to step 3; otherwise, done.
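
The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes one log ratio (treated minus control) per gene, a single additive normalization offset, and a fixed cutoff `threshold` for judging a gene "unchanged". The function name and all parameters are hypothetical.

```python
def self_consistent_core(log_ratios, initial_core, threshold, max_iter=100):
    """Iterate normalization and core selection until the core set is stable.

    log_ratios   -- one log ratio per gene (assumed input format)
    initial_core -- indices of the arbitrarily chosen starting core (step 2)
    threshold    -- assumed cutoff on |normalized log ratio| for "unchanged"
    """
    core = set(initial_core)
    for _ in range(max_iter):
        # Step 3: provisional normalization against the current core.
        offset = sum(log_ratios[k] for k in core) / len(core)
        normalized = [r - offset for r in log_ratios]
        # Step 4: genes that do not change under this normalization.
        new_core = {k for k, r in enumerate(normalized) if abs(r) < threshold}
        if not new_core:
            raise ValueError("no gene is within the threshold; core is empty")
        # Step 5: stop when the core no longer changes.
        if new_core == core:
            return offset, core
        core = new_core
    raise RuntimeError("core set did not stabilize")


# Four genes share a common shift of about 0.5; two genes truly change.
ratios = [0.5, 0.6, 0.4, 0.5, 3.0, -2.0]
offset, core = self_consistent_core(ratios, range(len(ratios)), threshold=0.5)
```

Starting from all genes as the core, the two strongly shifted genes are dropped after the first pass and the core stays fixed on the second.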

  5. Error Model. I = spot intensity; [mRNA] = concentration of specific mRNA; c = normalization constant.

  6. Error Model. I = spot intensity; [mRNA] = concentration of specific mRNA; c = normalization constant; the error term is lognormal and multiplicative.

  7. Error Model. I = spot intensity; [mRNA] = concentration of specific mRNA; c = normalization constant; the error term is lognormal and multiplicative. Indices: i = treatment group; j = replicate within treatment; k = spot (gene).

  8. Y = log spot intensity. Its components: mean log concentration of specific mRNA; treatment effect (on the concentration of specific mRNA); normalization constant; normal additive error. Indices: i = treatment group; j = replicate within treatment; k = spot (gene).
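
The step from the multiplicative error model to the additive model for Y is a log transform, sketched below. The slide equations did not survive extraction, so this is a reconstruction under assumptions: the symbols \eta_{ijk} and \mu_k are invented here for the lognormal error and the mean log concentration, the index placement (c per array ij, concentration per treatment and gene ik) is assumed, and a_{ij} and d_{ik} follow the letters a and d used on later slides.

```latex
% Multiplicative model on the intensity scale (assumed form):
\[
  I_{ijk} = c_{ij}\,[\mathrm{mRNA}]_{ik}\,\eta_{ijk}
\]
% Taking logs turns products into sums:
\[
  Y_{ijk} = \log I_{ijk}
          = \underbrace{\log c_{ij}}_{a_{ij}}
          + \underbrace{\log [\mathrm{mRNA}]_{ik}}_{\mu_k + d_{ik}}
          + \underbrace{\log \eta_{ijk}}_{\text{normal additive error}}
\]
```

A lognormal multiplicative error on I therefore becomes a normal additive error on Y, which is what makes least-squares estimation natural on the log scale.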

  9. Model: Identifiability constraints: Estimate by ordinary least squares:

  10. Model: Identifiability constraints: But note: cannot distinguish between a (normalization) and d (treatment effect)

  11. Self-consistency: The weight w_k(d) is small if the kth gene is judged to be changed, and close to one if it is judged to be unchanged. The procedure is iterative.
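
A weighted version of the iteration might look like the sketch below. The slide does not give the weight function, so the Gaussian-shaped `soft_weight` is purely a stand-in with the stated behavior (near 1 for small apparent change, near 0 for large). The single-offset setup, the function names, and the `scale` parameter are likewise assumptions.

```python
import math


def soft_weight(d, scale):
    # Hypothetical weight: close to 1 when |d| is small, close to 0 when large.
    return math.exp(-0.5 * (d / scale) ** 2)


def self_consistent_offset(log_ratios, scale, tol=1e-10, max_iter=200):
    """Alternate between weighting genes and re-estimating the offset."""
    offset = sum(log_ratios) / len(log_ratios)  # start from equal weights
    for _ in range(max_iter):
        # Weight each gene by how unchanged it looks under the current offset.
        weights = [soft_weight(r - offset, scale) for r in log_ratios]
        new_offset = (
            sum(w * r for w, r in zip(weights, log_ratios)) / sum(weights)
        )
        if abs(new_offset - offset) < tol:
            return new_offset, weights
        offset = new_offset
    raise RuntimeError("offset did not converge")


# Four unchanged genes at 0.5 and one strongly changed gene at 3.0.
offset, weights = self_consistent_offset([0.5, 0.5, 0.5, 0.5, 3.0], scale=0.5)
```

Unlike a hard core set, the smooth weights let borderline genes contribute partially, so the estimate does not jump when a gene crosses a cutoff.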

  12. Failure of Model

  13. Generalized Model. The normalization a_ij(x_k) and the heteroscedasticity function g_ij(x_k) are slowly varying functions of the intensity, x. Estimate by local regression.

  14. Local Regression data

  15. Predict the value at x = 50: weight the nearby data, then fit a linear regression.

  16. Predict whole function similarly
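
The two prediction steps above can be sketched as a weighted straight-line fit repeated at each target point. The slides do not name a kernel or a bandwidth, so the tricube kernel and the `bandwidth` parameter below are assumptions, and the function names are hypothetical.

```python
def tricube(u):
    # Assumed kernel: 1 at the target point, falling to 0 at |u| >= 1.
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_predict(xs, ys, x0, bandwidth):
    """Predict y at x0 from a weighted linear regression on nearby points."""
    ws = [tricube((x - x0) / bandwidth) for x in xs]
    sw = sum(ws)
    if sw == 0.0:
        raise ValueError("no data within one bandwidth of x0")
    # Weighted means of x and y.
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    # Weighted least-squares slope; fall back to a flat fit if x has no spread.
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return ybar + slope * (x0 - xbar)


def local_linear_curve(xs, ys, grid, bandwidth):
    # "Predict whole function similarly": repeat the local fit at each point.
    return [local_linear_predict(xs, ys, g, bandwidth) for g in grid]


xs = [i * 0.5 for i in range(201)]   # 0, 0.5, ..., 100
ys = [2.0 * x + 1.0 for x in xs]     # exactly linear test data
y_at_50 = local_linear_predict(xs, ys, 50.0, bandwidth=20.0)
```

On exactly linear data the local fit reproduces the line, which is a quick sanity check before applying it to noisy intensities.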

  17. Compare to known true function

  18. Simulation-based Validation 1. Reproduce observed bias.

  19. Simulation-based Validation 2. Reproduce observed heteroscedasticity.

  20. Test based on z statistic:

  21. Choice of significance level a. Expected number of false positives: E(false positives) = a N, where N is the number of spots tested. But the minimum detectable difference increases as a gets smaller.

  22. a        E(fp)   min diff   min ratio
      0.05     250     0.916      2.5
      0.01     50      1.09       3
      0.001    5       1.29       3.6
      0.0001   0.5     1.61       5
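
Two columns of this table can be checked with simple arithmetic. The sketch below assumes N = 5000 tests, which is inferred from the first row (250 / 0.05) rather than stated on the slide, and assumes the "min diff" column is a difference of natural logs, so that the ratio is its exponential. The function names are hypothetical.

```python
import math


def expected_false_positives(alpha, n_tests):
    # E(false positives) = a N, as on the preceding slide.
    return alpha * n_tests


def log_diff_to_ratio(min_diff):
    # Assumes natural logs: a log-scale difference corresponds to exp(diff)-fold.
    return math.exp(min_diff)


N_TESTS = 5000  # inferred from the table, not given in the slides
rows = [(0.05, 0.916), (0.01, 1.09), (0.001, 1.29), (0.0001, 1.61)]
for alpha, min_diff in rows:
    fp = expected_false_positives(alpha, N_TESTS)
    ratio = log_diff_to_ratio(min_diff)
    print(f"a={alpha:<7} E(fp)={fp:<6g} min diff={min_diff:<6} min ratio={ratio:.1f}")
```

The "min diff" values themselves depend on the error variance and test details that the slides do not give, so they are taken from the table rather than computed.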

  23. Validation of method against simulated data. 3. Hypothesis testing: simulated from the stated model. [Figure labels: bias; "-fold change"; proportion changed spots; "rate false pos." = mean observed / expected]

  24. Simulated data: mis-specified model — multiplicative + additive noise

  25. Validation of method against simulated data. 4. Hypothesis testing: simulated from the "wrong" model (additive + multiplicative noise). [Figure labels: bias; "-fold change"; proportion changed spots]

  26. Acknowledgments: Lynn Crosby, North Carolina State University; Kevin Morgan, Strategic Toxicological Sciences, GlaxoWellcome

  27. Santa Fe Institute, www.santafe.edu. Postdoctoral fellowships available (apply before the end of the year). kepler@santafe.edu
