
Quality control issues

Explore sources of error in microarray experiments, including chip quality, reproducibility, and hybridization quality. Learn methods to evaluate and enhance data quality through self-self hybridization and replicate experiments. Understand spot intensity distributions and variance in spot intensity to improve data quality.

darricka

Presentation Transcript


  1. Quality control issues: overview • Chip quality • Hybridisation quality • Reproducibility NERC/Manchester Array Course

  2. Chip Quality • Sources of error: • Print tips • PCR reactions • Humidity • Contamination • …..

  3. Error types • Basic problems • Slide background • Batch to batch variation

  4. Chip quality • How to get information on chip quality? • Monitor number of flagged spots • Eyeball the TIF files • Self-self hybridisation • Replicates

  5. Microarray data quality

  6. Flagged data is usually poor quality • Flagged data not included (left) and included (right)

  7. Microarray data quality • Sources of data quality information • Self-self hybridisation microarray chips • On a self-self hybridisation chip, the gene expression levels of two aliquots of the same sample are measured on one chip. It follows logically that: • The absolute gene expression levels measured for the two samples should be the same. • The difference in expression level for any gene between the two samples should be zero. • If the measurements from the two channels are not equal, measurement error exists in the experiment.

  8. Microarray data quality • Sources of data quality information • Self-self hybridisation microarray chips • Let the measurements from the two channels of a slide be $y_{jR}$ and $y_{jG}$, $j = 1, 2, \dots, N$ ($N$ = number of genes) • If $y_{jR} = y_{jG}$ for all $j$: there is no measurement error between the two channels • If $y_{jR} - y_{jG}$ varies around zero: random error exists between the two channels • If $y_{jR} - y_{jG}$ varies around a non-zero value: both random and systematic errors exist between the channels • Self-self chips are the most useful source of information on the data quality of a microarray experiment
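
The three cases on this slide can be told apart from the mean and spread of the between-channel differences. The following is a minimal sketch, not the course's own procedure: it assumes the differences are taken on a log2 scale, and the function name `channel_error_summary` and the intensity values are made up for illustration.

```python
import math
import statistics


def channel_error_summary(red, green):
    """Summarise between-channel differences for one self-self chip.

    red, green: positive spot intensities y_jR and y_jG for the same spots.
    """
    # Log-scale difference for each spot: M_j = log2(y_jR) - log2(y_jG).
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    mean_m = statistics.fmean(m)
    var_m = statistics.variance(m)
    # Standard error of the mean difference, used to judge whether the
    # mean is far enough from zero to suggest a systematic error.
    se = math.sqrt(var_m / len(m))
    return {"mean": mean_m, "variance": var_m, "standard_error": se}


# Hypothetical intensities: green reads roughly 20% lower than red.
red = [120.0, 450.0, 980.0, 2300.0, 75.0, 610.0]
green = [100.0, 370.0, 800.0, 1900.0, 64.0, 500.0]
summary = channel_error_summary(red, green)
print(summary)

# Identical channels give zero mean and zero variance (no measurement error).
no_error = channel_error_summary(red, red)
print(no_error)
```

A mean close to zero with non-zero variance corresponds to random error only; a mean that sits several standard errors away from zero points to a systematic error on top of the random one.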

  9. Microarray data quality • Data quality information from replicate experiments • Replicate experiments are the main source of information about the reproducibility of an experiment. Taking two replicates as an example, the measurements are: • $x_{1,j}, x_{2,j}, y_{1,j}, y_{2,j}$, where $j = 1, 2, \dots, N$; $x$ and $y$ denote the two different samples, and 1 and 2 the first and second experiments • Different profiles are available for selection: • $x_{1,j} - x_{2,j}$: channel intensity (difference of the same sample X on different slides) • $y_{1,j} - y_{2,j}$: channel intensity (difference of the same sample Y on different slides) • $x_{1,j} y_{1,j} - x_{2,j} y_{2,j}$: point intensity • $x_{1,j}/y_{1,j} - x_{2,j}/y_{2,j}$: ratio • A log scale is usually employed
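
As a rough illustration of these four profiles, the sketch below computes them on a log2 scale for two replicate slides. The choice of log2, the function name `replicate_profiles` and the example intensities are assumptions made for this example only; on the log scale the product becomes a sum of logs and the ratio becomes a difference of logs.

```python
import math


def replicate_profiles(x1, y1, x2, y2):
    """Between-slide differences on a log2 scale for two replicate slides.

    x1, y1: intensities of samples X and Y on slide 1.
    x2, y2: intensities of samples X and Y on slide 2.
    """
    profiles = {"x_channel": [], "y_channel": [],
                "point_intensity": [], "log_ratio": []}
    for a1, b1, a2, b2 in zip(x1, y1, x2, y2):
        lx1, ly1, lx2, ly2 = (math.log2(v) for v in (a1, b1, a2, b2))
        profiles["x_channel"].append(lx1 - lx2)
        profiles["y_channel"].append(ly1 - ly2)
        # log(x * y) = log(x) + log(y)
        profiles["point_intensity"].append((lx1 + ly1) - (lx2 + ly2))
        # log(x / y) = log(x) - log(y)
        profiles["log_ratio"].append((lx1 - ly1) - (lx2 - ly2))
    return profiles


# Hypothetical data: slide 2 is uniformly twice as bright as slide 1.
x1 = [100.0, 400.0, 1500.0]
y1 = [120.0, 200.0, 3000.0]
x2 = [2 * v for v in x1]
y2 = [2 * v for v in y1]
profiles = replicate_profiles(x1, y1, x2, y2)
print(profiles)
```

In this constructed case the channel profiles shift by one log2 unit and the point-intensity profile by two, while the log-ratio profile stays at zero, because a slide-wide brightness factor cancels in the ratio.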

  10. Microarray data quality - between channels: Examples of data quality profile extracted from a self-self hybridisation chip

  11. Microarray data quality - between channels: Examples of data quality profile extracted from a self-self hybridisation chip

  12. Microarray data quality - between slides: Examples of data quality profile extracted from two self-self hybridisation chips • X = log(Ra/Rb), mean(x) = 0.068, Var(x) = 0.600

  13. Microarray data quality - between slides: Examples of data quality profile extracted from two self-self hybridisation chips • X = Aa - Ab, mean(x) = 0.076, Var(x) = 0.326

  14. Microarray data quality - between slides: Examples of data quality profile extracted from two self-self hybridisation chips • X = log(Ra/Rb), mean(x) = 0.017, Var(x) = 0.308

  15. Microarray data quality: Examples of data quality profile extracted from two self-self hybridisation chips

  16. Microarray data quality: Examples of data quality profile extracted from two replicated Ref.-Treatment hybridisation chips

  17. Microarray data quality • Conclusions • Both systematic error and random noise are observed between the channels of a microarray chip. • Both systematic error and random error are observed between slides • Log(ratio) is the least noisy data

  18. Hybridisation quality • Is there enough cDNA? • Has the labelling worked? • Has it worked as expected for that species?

  19. Spot intensity distributions • Is there a generic form? • Can an understanding of the generic form help in QC?

  20. Spot Intensity Distribution - 1 • Asymmetric, Heavy Tail. • Most spots have small intensity. Few have high intensity.

  21. Spot Intensity Distribution - 2 • Logged data distribution is symmetric. • Logged data approximated by a Normal in central region.

  22. Example Data Sets

  23. Spot Intensity Distribution - 3 • Characterize width of distribution by variance $\sigma^2$ • $\sigma$ unaffected by simple normalization schemes, e.g. mean or median centering of log values. • Study variation of $\sigma^2$ between samples and between species
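
The claim that centering leaves the width unchanged follows from the fact that subtracting a constant from every log value does not change the variance. A small sketch with made-up intensities (the values and the helper name `centred` are illustrative assumptions):

```python
import math
import statistics


def centred(values, centre):
    """Subtract a constant (for example the mean or median) from every value."""
    return [v - centre for v in values]


# Hypothetical spot intensities with a heavy right tail.
intensities = [35.0, 80.0, 120.0, 400.0, 950.0, 5200.0]
logs = [math.log(v) for v in intensities]

var_raw = statistics.variance(logs)
var_mean_centred = statistics.variance(centred(logs, statistics.fmean(logs)))
var_median_centred = statistics.variance(centred(logs, statistics.median(logs)))

# All three variances agree up to floating-point rounding.
print(var_raw, var_mean_centred, var_median_centred)
```

Because the width survives these normalisations, it can be compared between samples without first agreeing on a centering scheme.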

  24. Var(log spot intensity) • $\sigma^2$ increasing with genome size (no. of genes) • Is this trend truly biological?

  25. Characterising the distribution • Microarray data obeys Benford's law • $P(D) = \log_{10}(1 + D^{-1})$, where $D$ is the leading digit (1 to 9)
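
One way to quantify the fit to Benford's law is to compare observed first-digit counts with the expected probabilities. The sketch below uses a chi-square statistic; that choice of statistic, the function names and the synthetic test data are assumptions for illustration, and the course practical may use a different measure.

```python
import math
from collections import Counter


def benford_expected():
    """Benford probabilities P(D) = log10(1 + 1/D) for D = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Leading significant digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '4.5e+02'.
    return int(f"{abs(value):.16e}"[0])


def benford_chi_square(values):
    """Chi-square distance between observed first digits and Benford's law."""
    values = [v for v in values if v != 0]  # zero has no leading digit
    counts = Counter(first_digit(v) for v in values)
    n = len(values)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p)
               for d, p in benford_expected().items())


# Synthetic stand-ins for spot intensities (not real microarray data):
# powers of two spread evenly on a log scale, integers 100..999 do not.
benford_like = [float(2 ** k) for k in range(100)]
uniform_like = [float(v) for v in range(100, 1000)]

chi_benford = benford_chi_square(benford_like)
chi_uniform = benford_chi_square(uniform_like)
print(chi_benford, chi_uniform)
```

A small statistic indicates first digits close to Benford's law; a chip whose intensities give a much larger value than its peers would be a candidate for closer inspection.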

  26. Statistics • Calculate fit to Benford’s law • Monitor the distribution width • (see Practical)

  27. Experimental design • Controlling variation • Has the experiment worked? • Optimising the design

  28. Experimental design: microarray experiments and their aims • Microarray experiments have multiple sources of variation, including the variation of interest (biological) and other variation (non-biological) • Microarray experiments aim to identify and measure biological variation: • Normal vs. abnormal • General condition vs. extreme condition • Controls vs. treatments • Treatment vs. a different treatment • Or time series

  29. Variation in microarray experiment • Microarray data is variable • The variation may arise from the experimental process through: • Extraction of samples; • Chip printing; • Dye; • Hybridisation; • Image processing; • Background handling.

  30. Variation in microarray experiments: factors related to accuracy and reproducibility of microarray data • Each step of a microarray experiment usually involves a number of factors that affect the accuracy and reproducibility of the experiment. They can be classified into five categories: • Human; • Equipment; • Samples and chips; • Method, procedure, specification; • Environment.

  31. Task • To maximise the appropriate biologically relevant data in a cost-effective way • Issues • How many repeats? • Dye flips • To pool or not to pool? • What is the optimal design of the hybridisations?

  32. Experiment (1) • Response to control variables

  33. The experiment must be stable • Set up experiment • Collect data • Assess data • If stable continue • If not, why not? .. Correct and continue

  34. Experiment 2 • Controlling for noise

  35. Controlling for noise • Look at noise as a function of expression level • Self-hybridisations • Reverse labelling • Same sample, different preps • Different samples, different preps

  36. How many repeats? • $t = (m_1 - m_2)/\sqrt{2s^2/N}$, therefore $(m_1 - m_2) = t\sqrt{2s^2/N}$ • Plug in values to get significant fold changes • $t = 3$, $N = 2$, $s^2 = 0.3$: a 5-fold change is significant • $t = 3$, $N = 4$, $s^2 = 0.3$: a 3.2-fold change is significant • $t = 3$, $N = 6$, $s^2 = 0.3$: a 2.5-fold change is significant
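
The arithmetic on this slide can be reproduced directly. The sketch below assumes that $m_1$, $m_2$ and $s^2$ are on a natural-log scale, since that is the base that reproduces the slide's 5-fold and 3.2-fold figures; the function name is hypothetical.

```python
import math


def min_significant_fold_change(t, n, s2):
    """Smallest fold change reaching a given t value with n repeats per group.

    Rearranges t = (m1 - m2) / sqrt(2 * s2 / n) for the difference in mean
    log intensity, then converts it back to a fold change. Natural logs are
    an assumption made here, not something the slide states.
    """
    diff = t * math.sqrt(2 * s2 / n)
    return math.exp(diff)


for n in (2, 4, 6):
    fold = min_significant_fold_change(t=3, n=n, s2=0.3)
    print(f"N={n}: {fold:.2f}-fold")
```

Under that assumption the formula gives roughly 5.2, 3.2 and 2.6 for N = 2, 4 and 6, in line with the rounded figures quoted on the slide. The gain from each extra pair of repeats shrinks because the detectable difference only falls as $1/\sqrt{N}$.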

  37. Pooling • An issue of cost • Analysis modified • Average log does not equal log average
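
The last bullet matters because a pooled sample measures something close to the log of the average expression, whereas hybridising individuals separately and averaging on the log scale gives the average of the logs. A tiny numerical illustration with made-up expression levels:

```python
import math

# Hypothetical expression levels of one gene in three individuals.
samples = [100.0, 400.0, 1600.0]

# Pooling mixes the material first, so the array sees the average level.
log_of_average = math.log2(sum(samples) / len(samples))  # log2(700), about 9.45

# Separate arrays are logged first and averaged afterwards.
average_of_logs = sum(math.log2(v) for v in samples) / len(samples)  # log2(400), about 8.64

print(log_of_average, average_of_logs)
```

The log of the average is never smaller than the average of the logs (Jensen's inequality), and the gap grows with the variability between individuals, which is why the analysis has to be modified for pooled designs.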

  38. Experimental design • Y follows • $Z = (y_1 + y_2 + \dots + y_n)/n$ follows • $\mu_z$ and $\sigma_z$ specified by:

  39. Mixed tissue or single cell?

  40. Has the experiment worked? • Use principal component analysis to cluster experiments
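
To make the idea concrete, here is a dependency-free sketch that projects each array onto the first principal component, so that arrays from the same condition should receive similar scores. It uses power iteration on the array-by-array Gram matrix rather than a library PCA routine; that choice, the function name and the toy data are assumptions for illustration, and a real analysis would normally use an established statistics package.

```python
import math


def first_pc_scores(experiments, iterations=200):
    """Scores of each experiment on the first principal component.

    experiments: list of equal-length lists of log expression values,
    one list per array.
    """
    n = len(experiments)
    n_genes = len(experiments[0])
    # Centre every gene across the experiments.
    means = [sum(e[g] for e in experiments) / n for g in range(n_genes)]
    centred = [[e[g] - means[g] for g in range(n_genes)] for e in experiments]
    # Gram matrix of inner products between centred experiments.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centred]
            for ri in centred]
    # Power iteration converges to the leading eigenvector of the Gram matrix.
    u = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        v = [sum(gram[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return [0.0] * n  # no variation between experiments
        u = [x / norm for x in v]
    eigenvalue = sum(u[i] * sum(gram[i][j] * u[j] for j in range(n))
                     for i in range(n))
    # PC scores are the eigenvector scaled by the singular value.
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [x * scale for x in u]


# Toy log expression values for five genes: two controls, two treated arrays.
arrays = [
    [1.0, 2.0, 3.0, 4.0, 5.0],   # control 1
    [1.1, 2.1, 2.9, 4.0, 5.1],   # control 2
    [3.0, 2.0, 1.0, 4.1, 5.0],   # treated 1
    [3.1, 1.9, 1.1, 3.9, 5.0],   # treated 2
]
scores = first_pc_scores(arrays)
print(scores)
```

If the experiment has worked, replicates of the same condition sit together along the leading components and the conditions separate; an array that lands with the wrong group is worth checking for labelling or hybridisation problems.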

  41. Experimental design: basics of experimental design • Examples: • (a) Reference – non-reference • (b) Loop • (c) Reference – non-reference plus ref-ref • (d) Modified loop

  42. Basic principles • Make sure the system is behaving • Biological repeats are best • The most important comparisons should be performed on the same chip • Dye-flips are very useful
