1. DMH Data Normalization Methods
Daniel Tse
Lab of Dr. Tim Huang; Mentor: Dr. Dustin Potter
8.23.06
2. Epigenetics
Changes to a gene’s expression without changes to the gene itself
DNA methylation
Histone modification
when chromatin is methylated, it is in a closed configuration, blocking the regulatory proteins necessary for transcription
when histones are acetylated, the chromatin remains open
Both work together to control gene expression
3. Epigenetic Research
Looking at CpG islands
dense regions of cytosine and guanine near transcription sites where there is the highest amount of methylation
Identification of hyper/hypo methylated genes
in tumorigenesis, many genes are aberrantly methylated or demethylated, such as genes responsible for tumor suppression, apoptosis, or senescence
Demethylating agents, methyltransferases, deacetylation inhibitors
4. DMH
5. Biases in Two-Color Arrays
Slide
inconsistencies within the slide
slide color biasing intensity readings
Scanning
aberrations or inconsistencies between slides
differences in machines
Dye
at low intensities, green dye is favored (hypo methylation)
at high intensities, red dye is favored (hyper methylation)
Impact
All can produce outliers and irregular distributions that stem from non-biological factors
7. Normalization
Increased bias and confounding variables for high-throughput analysis
little work has been done to adapt normalization techniques to DMH
Normalization techniques are needed to accurately interpret the results
remove aberrations and outliers due to non-biological factors
reveal the patterns and outliers of actual biological factors
Work done in R
open-source statistical language based on S
Two main categories for study
intra-slide normalization (within a slide)
inter-slide normalization (between slides)
8. Logarithmic Transformation
common technique used for two-color arrays
log transformations often convert data to a more normal distribution
a normal distribution is assumed by many statistical methods
M = log2(Cy5 / Cy3)
A = 0.5 * log2(Cy5 * Cy3)
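As a minimal sketch of the two formulas above, the following Python computes M and A for paired red (Cy5) and green (Cy3) intensities. The function name `ma_transform` and the sample intensities are illustrative assumptions, not part of the original work (which was done in R).

```python
import math

def ma_transform(cy5, cy3):
    """Return (M, A) lists for paired Cy5 (red) and Cy3 (green) intensities."""
    m_values, a_values = [], []
    for red, green in zip(cy5, cy3):
        if red <= 0 or green <= 0:
            # log2 is undefined for non-positive intensities
            raise ValueError("intensities must be positive")
        m_values.append(math.log2(red / green))        # log ratio
        a_values.append(0.5 * math.log2(red * green))  # mean log intensity
    return m_values, a_values

# Hypothetical intensities chosen as powers of two for easy checking.
cy5 = [1024.0, 512.0, 256.0]
cy3 = [256.0, 512.0, 1024.0]
m, a = ma_transform(cy5, cy3)
print(m)  # positive M: red favored; negative M: green favored
print(a)
```

M is symmetric around 0, so a two-fold excess of red and a two-fold excess of green give values of equal size and opposite sign.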
9. Loess
10. Loess-based Normalizations
intensity-dependent non-linear normalization
a loess curve is fit to the M vs A data
the predicted loess value is subtracted from each M value, decreasing the standard deviation and placing the mean log ratio at 0
rank invariant non-linear normalization
a loess curve is fit only to signals that are rank invariant with respect to Cy5 and Cy3
reduces the influence of outliers when fitting the loess curve
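The intensity-dependent scheme can be sketched as follows. This is a simplified stand-in for a full loess implementation: a local linear fit with tricube weights and no robustness iterations, so it will not match R's `loess` exactly. The names `loess_fit` and `loess_normalize`, the `span` value, and the example data are all assumptions made for illustration.

```python
import math

def loess_fit(x, y, span=0.5):
    """Local linear regression with tricube weights; returns the fit at each x."""
    n = len(x)
    k = max(2, int(math.ceil(span * n)))  # number of neighbours in each window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def loess_normalize(m_values, a_values, span=0.5):
    """Subtract the fitted intensity-dependent trend from each M value."""
    trend = loess_fit(a_values, m_values, span)
    return [m - t for m, t in zip(m_values, trend)]

# Hypothetical data with a pure dye bias: green favored at low A, red at high A.
a_values = [6 + 0.5 * i for i in range(20)]
m_values = [0.3 * (a - 10) for a in a_values]
normalized = loess_normalize(m_values, a_values)
print(max(abs(v) for v in normalized))  # close to zero once the trend is removed
```

For the rank-invariant variant, the same fit would be computed from the rank-invariant subset of probes only and then applied to all probes; that selection step is not shown here.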
11. Loess-based Normalizations
Cyclic loess
take two arrays a and b
calculate M = log2(intensity(a) / intensity(b)) for each probe
also: A = 0.5 * log2(intensity(a) * intensity(b))
a loess curve is fit to these new M and A values
the original M and A values are then adjusted using the fitted curve
this is repeated for every pair-wise combination of arrays
the resulting M and A values are averaged across the pair-wise combinations to give the final adjustments
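The pair-wise procedure above might look like the following single pass. The slides do not say exactly how the pair-wise results are averaged, so this sketch assumes a common choice: each pair's fitted trend is split between the two arrays and weighted by 1 / (number of arrays). The smoother is a simplified local linear fit with tricube weights, not R's `loess`, and all names here are hypothetical.

```python
import math
from itertools import combinations

def loess_fit(x, y, span=0.75):
    """Simplified loess: local linear fit with tricube weights, no robustness step."""
    n = len(x)
    k = max(2, int(math.ceil(span * n)))
    fitted = []
    for x0 in x:
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        fitted.append(my + (sxy / sxx if sxx > 0 else 0.0) * (x0 - mx))
    return fitted

def cyclic_loess(arrays, span=0.75):
    """One pass over all array pairs; returns normalized log2 intensities."""
    logs = [[math.log2(v) for v in arr] for arr in arrays]
    n_arrays, n_probes = len(logs), len(logs[0])
    adjust = [[0.0] * n_probes for _ in range(n_arrays)]
    for i, j in combinations(range(n_arrays), 2):
        m = [p - q for p, q in zip(logs[i], logs[j])]          # log2(a / b)
        a = [0.5 * (p + q) for p, q in zip(logs[i], logs[j])]  # 0.5 * log2(a * b)
        trend = loess_fit(a, m, span)
        for probe, t in enumerate(trend):
            # Assumed weighting: share the trend between both arrays, scaled by 1/n.
            adjust[i][probe] -= t / n_arrays
            adjust[j][probe] += t / n_arrays
    return [[lg + adj for lg, adj in zip(row, adj_row)]
            for row, adj_row in zip(logs, adjust)]

# Hypothetical example: three arrays that differ only by a constant scale factor.
base = [2 ** (6 + 0.5 * k) for k in range(12)]
arrays = [base, [2 * v for v in base], [4 * v for v in base]]
normalized = cyclic_loess(arrays)
```

In practice the pass is often repeated until the adjustments become small. The number of pairs grows quadratically with the number of arrays, which is one reason the method is computationally expensive.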
12. Regional Loess
13. Global Mean
14. Regression Normalization
15. Other Between-slide Normalizations
Interquantile normalization
forces all samples to have very similar distributions
the green samples are normalized by a ranked-mean method
the red samples are then normalized using the linear regression model in a similar fashion to regression normalization
Scaling methods
a baseline array is chosen
the other arrays in the set of experiments are normalized to the baseline array
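Two of the ideas above can be sketched briefly: a ranked-mean (quantile-style) normalization that forces samples toward the same distribution, and a simple scaling of each array to a chosen baseline. Both functions and their names are illustrative assumptions. The regression step for the red samples is not shown, ties are broken arbitrarily, and scaling by the mean is only one of several possible choices.

```python
def ranked_mean_normalize(samples):
    """Replace each value by the mean, across samples, of the values at its rank."""
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[r] for s in sorted_samples) / len(samples) for r in range(n)]
    result = []
    for s in samples:
        order = sorted(range(n), key=lambda idx: s[idx])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        result.append(out)
    return result

def scale_to_baseline(samples, baseline_index=0):
    """Scale every array so that its mean matches the baseline array's mean."""
    baseline = samples[baseline_index]
    base_mean = sum(baseline) / len(baseline)
    scaled = []
    for s in samples:
        factor = base_mean / (sum(s) / len(s))
        scaled.append([v * factor for v in s])
    return scaled

# Hypothetical intensities for two samples measured on three probes.
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(ranked_mean_normalize(samples))  # -> [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
print(scale_to_baseline(samples))
```

After the ranked-mean step every sample contains exactly the same set of values, only in a different probe order, which is what "very similar distributions" amounts to in this simplified form.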
16. Complexity Issues - CPU economy
17. Global Mean
18. Loess
19. Cyclic Loess
20. Cyclic Loess
21. Standard Deviation Changes
22. Outliers
23. Outliers
24. Discussion
Loess normalization
effective in reducing the dye biases observed in raw data
reduces the standard deviation effectively
reduces existing outliers and may reveal biological outliers
does not induce excessive negative frequencies
Rank invariant loess normalization is the quickest computationally, with results similar to those of the other loess normalization schemes
25. Future Study
Use in other data sets
Continued analysis with sequential normalization
Spike-in data
Looking for additional normalization schemes
26. Acknowledgements
Mentor: Dr. Dustin Potter
Supervisors: Dr. Tim Huang, Dr. Pearlly Yan
MBI Summer Undergraduate Program