A NOVEL APPROACH TO IMPROVE THE NOISE IN DETECTING COPY NUMBER VARIATIONS USING OLIGONUCLEOTIDE

A NOVEL APPROACH TO IMPROVE THE NOISE IN DETECTING COPY NUMBER VARIATIONS USING OLIGONUCLEOTIDE MICROARRAYS. 12 November 2008. Noushin Farnoud, Marco Marra, Jan Friedman, Stephane Flibotte, Allen Delaney. Canada’s Michael Smith Genome Sciences Centre.

Presentation Transcript


  1. A NOVEL APPROACH TO IMPROVE THE NOISE IN DETECTING COPY NUMBER VARIATIONS USING OLIGONUCLEOTIDE MICROARRAYS 12 November 2008 Noushin Farnoud, Marco Marra, Jan Friedman, Stephane Flibotte, Allen Delaney Canada’s Michael Smith Genome Sciences Centre

  2. Outline • What are Copy Number Variations (CNVs)? • Why is it important to study copy number variations? • How can we study CNVs? • What are the issues associated with studying CNVs? • How can we deal with them?

  3. What is Copy Number Variation (CNV)? • The DNA copy number of a genomic region is the number of copies of that region present in the genome. • In humans the normal copy number is two for the autosomes. However, recent discoveries have revealed that many DNA segments, ranging in size from kilobases to megabases, can vary in copy number. • These DNA copy number variations (CNVs) result from genomic events that cause discrete gains and losses of contiguous segments of the genome.

  4. Why is it important to study CNVs? • CNVs are common in cancer and other diseases. For example, a review paper by Charles Lee lists 17 conditions of the nervous system alone, including Parkinson’s Disease and Alzheimer’s Disease, that can result from copy number variation (Neuron, Oct 06). • CNVs are also common in normal individuals and contribute to our uniqueness. These changes can also influence susceptibility to disease. • Since CNVs often encompass genes, they can play important roles both in characterizing human disease and in discovering drug response targets. • Understanding the mechanisms of CNV formation may also help us better understand human genome evolution.

  5. How can we detect CNVs? • Two-color arrays (reference and patient) • One-color arrays

  6. Main issue of oligonucleotide microarrays [Figure: log2 ratio of intensity (y-axis, -4 to 4) plotted against position in Mb (x-axis, 0 to 250)] Although high-density microarrays provide genome-wide data on copy number, they are often associated with a substantial amount of noise that can degrade the performance of the analyses. * http://dsgweb.wustl.edu/qunyuan

  7. How can we reduce this noise? Hypothesis: Can we reduce oligonucleotide microarray noise by analyzing individual oligonucleotide probes? Each SNP probe set has: • # oligonucleotide probes (10K array): 647,080 oligos • # oligonucleotide probes (100K array): 4,648,160 oligos • # oligonucleotide probes (500K array): 12,013,632 oligos

  8. Therefore… • We can conclude that a major source of the noise is the differing behavior of the individual oligonucleotide probes within a SNP probe set. • This indicates that averaging all perfect-match (PM) oligos is not a good approximation of the information content of a SNP.

  9. Novel Algorithm: Oligonucleotide Probe-level Analysis of Signal intensities (OPAS) • Clusters the individual oligos in each SNP probe set. • Applies null-hypothesis testing: estimates the likelihood (p-value) that each cluster of oligos has a log-ratio intensity = 0, > 0, or < 0. • Based on these p-values and ML classification algorithms, identifies the “most significant cluster of oligos”. The other cluster(s) of oligos are treated as noise and excluded from the analysis.
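The clustering idea on this slide can be illustrated with a minimal sketch. It is not the OPAS implementation: the slides do not say how clusters are formed or ranked at this point, so the sketch assumes a plain two-cluster k-means on one probe set's log2 ratios and, as a placeholder for the p-value-based selection, keeps the larger cluster. The function name `two_means_1d` and the example values are hypothetical.

```python
from statistics import mean

def two_means_1d(values, iters=50):
    """Split 1-D values into two clusters with plain k-means (k = 2)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [list(values)]          # all probes agree: a single cluster
    for _ in range(iters):
        near_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        near_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo, new_hi = mean(near_lo), mean(near_hi)
        if (new_lo, new_hi) == (lo, hi):
            break                      # centers stopped moving
        lo, hi = new_lo, new_hi
    return [near_lo, near_hi]

# Hypothetical log2 ratios for the PM oligos of one SNP probe set:
# five probes suggest a loss, three behave differently.
probe_log_ratios = [-0.95, -1.05, -1.0, -0.9, -1.1, 0.3, 0.5, 0.4]

clusters = two_means_1d(probe_log_ratios)
dominant = max(clusters, key=len)      # placeholder selection rule (assumption)

print(round(mean(probe_log_ratios), 3))  # average of all probes: -0.475
print(round(mean(dominant), 3))          # average of the kept cluster: -1.0
```

In this toy case, averaging every probe gives about -0.475, while the retained cluster gives -1.0, which shows how a few discordant probes can blur a probe-set summary.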

  10. Example of Reducing SNP Noise with OPAS [Figure: before and after]

  11. How does OPAS Affect CN analysis?

  12. What's next? • The next generation of DNA microarray-based technologies will allow equal detection of large and small CNVs. • Also on the horizon are new DNA sequencing technologies that enable rapid (and ultimately inexpensive) 'personalized' genome sequencing projects. • Together, these technologies will capture almost all the variation in a genome.

  13. Acknowledgments Funding GSC • Marco Marra • Stephane Flibotte • Allen Delaney • Irene Li • Hong Qian • Robert Holt • Sussana Chan BC Children’s & Women’s Hospital • Jan Friedman • Patrice Eydoux Contact: nfarnoud@bcgsc.ca

  14. Advantages of Array CGH

  15. OPAS workflow • Input: Test Array (normalized log2 raw intensity) and Ref Set (pool of normal parents) • Log2 Ratio • Clustering: cluster the PM oligos (using a fuzzy clustering approach) • Likelihood Estimation: apply a series of null-hypothesis tests to determine the likelihoods PHs(cluster = 0), PHs(cluster < -0.5), PHs(cluster > +0.6) • Classification: each SNP is classified as deleted, normal, or amplified by comparing the P’s of its constituent clusters of PM oligos • Post-processing of the results
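One possible reading of the likelihood-estimation and classification steps is sketched below. The cutoffs -0.5 and +0.6 come from the slide; everything else is an assumption made for illustration: the one-sided z-tests under a normal approximation, the significance level `ALPHA`, and the function name `classify_cluster`. The slides do not state which tests OPAS actually uses.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

DEL_CUTOFF = -0.5   # deletion cutoff shown on the slide
AMP_CUTOFF = 0.6    # amplification cutoff shown on the slide
ALPHA = 0.05        # significance level: an assumption, not from the slides

def classify_cluster(log_ratios, alpha=ALPHA):
    """Label a cluster of probe log2 ratios as deleted, normal, or amplified.

    Uses one-sided z-tests on the cluster mean (normal approximation).
    Requires at least two values so a standard deviation can be computed.
    """
    m = mean(log_ratios)
    # Standard error of the mean; the floor avoids division by zero.
    se = max(stdev(log_ratios) / sqrt(len(log_ratios)), 1e-9)
    z = NormalDist()
    # H0: mean >= -0.5. A small p-value supports "mean is below -0.5".
    p_del = z.cdf((m - DEL_CUTOFF) / se)
    # H0: mean <= +0.6. A small p-value supports "mean is above +0.6".
    p_amp = 1 - z.cdf((m - AMP_CUTOFF) / se)
    if p_del < alpha:
        return "deleted"
    if p_amp < alpha:
        return "amplified"
    return "normal"

print(classify_cluster([-1.0, -0.9, -1.1, -0.95, -1.05]))  # deleted
print(classify_cluster([0.05, -0.1, 0.0, 0.1, -0.05]))     # normal
print(classify_cluster([1.0, 0.9, 1.1, 0.95, 1.05]))       # amplified
```

With only a handful of probes per cluster, a t-test or a permutation test would be a more careful choice than the normal approximation used here.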

  16. What is Copy Number • Introduction - What is a SNP? - What is a SNP array? Array Design + Target Preparation • Applications of SNP arrays (other than genotyping) - Copy number analysis • Genotyping using SNP arrays - Generations of methodologies - Properties of SNP arrays

  17. Schematic Representation of DNA Copy Number Change Normal cell CN=2 deletion amplification CN=0 CN=1 CN=3 CN=4
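The copy-number states on this slide map onto the log2-ratio scale used elsewhere in the deck. The sketch below computes the idealized log2 ratio of a test sample against a two-copy reference; the function name `expected_log2_ratio` is hypothetical, and real array measurements are noisy and will not land exactly on these values.

```python
from math import inf, log2

def expected_log2_ratio(copy_number, reference_copies=2):
    """Idealized log2(test / reference) for a given copy number."""
    if copy_number == 0:
        return -inf                    # homozygous deletion: no signal at all
    return log2(copy_number / reference_copies)

for cn in (0, 1, 2, 3, 4):
    print(cn, expected_log2_ratio(cn))
```

Under this idealization a single-copy loss sits at -1, the normal state at 0, a single-copy gain near 0.585, and four copies at 1.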

  18. Background (1): What are SNPs? [Figure: aligned DNA sequences that differ at a single base position] Single Nucleotide Polymorphism (SNP) • Definition: SNPs are variations in single base pairs that are randomly dispersed throughout the genome

  19. Major conclusions so far* … • There is considerable variation in the numbers and types of candidate CNVs detected by different analysis approaches. • Multiple programs are needed to find all real aberrations in a test set. • The frequency of false positive deletions is substantial, but it can be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity. * Friedman et al., AJHG 2006; Baross et al., BMC Bioinformatics 2007; Delaney et al., in progress
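The last point can be made concrete with a small sketch. A region reduced to one copy cannot carry heterozygous SNPs, so heterozygous (AB) genotype calls inside a candidate deletion argue against it. The function name `supports_deletion`, the genotype labels, and the 5% tolerance for genotyping errors are assumptions for illustration, not values taken from the cited papers.

```python
def supports_deletion(genotypes, max_het_fraction=0.05):
    """Return True if genotype calls are consistent with loss of heterozygosity.

    genotypes: calls for the SNPs inside a candidate deletion, e.g.
    "AA", "AB", "BB", or "NoCall" (labels assumed for this sketch).
    """
    called = [g for g in genotypes if g in ("AA", "AB", "BB")]
    if not called:
        return False                   # no genotype evidence either way
    het = sum(1 for g in called if g == "AB")
    return het / len(called) <= max_het_fraction

print(supports_deletion(["AA", "BB", "AA", "NoCall", "BB"]))  # True
print(supports_deletion(["AA", "AB", "AB", "BB"]))            # False
```

A check like this only filters candidates. A run of homozygous calls can also occur without any deletion, so it confirms consistency rather than proving a loss.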

  20. Profile of SNP probe sets [Figure panels: deleted SNPs; SNPs in a ‘normal’ region]

  21. Three Generations of Affymetrix SNP Arrays

  22. Background: Structure of the Affy SNP array Each SNP probe set has: • 57 oligonucleotide probes (10K array): 647,080 oligos • 40 oligonucleotide probes (100K array): 4,648,160 oligos • 20 oligonucleotide probes (500K array): 12,013,632 oligos
