Copy Number Variation

Eleanor Feingold, University of Pittsburgh, March 2012.

Presentation Transcript


  1. Copy Number Variation. Eleanor Feingold, University of Pittsburgh, March 2012

  2. What do we mean by “copy number variation”? (figure labels: GCTCATATATATTTG; kb - Mb (gene or gene region))

  3. Copy number variation in a gene or gene region (figure panels: “normal”; deletion; duplication of one gene; duplication of several genes; duplication of part of a gene)

  4. Classical copy number study types. Cancer genetics. What: Find chromosomal segments (usually large ones) that are duplicated and/or deleted in tumor cell lines. Why: Learn something about cancer biology, or implications for treatment and prognosis. Clinical pediatrics. What: Detect inherited or de novo deletions in individuals. Why: “Diagnose” birth defects.

  5. And now: Genetic association studies for CNVs. 1) Collect cases and controls. 2) “Genotype” everyone at a CNV (figure: an integer copy number shown for each individual). 3) Test genotype/phenotype association.
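
Step 3 on this slide could be sketched as below. This is only one possible test (a permutation test on the difference in mean copy number), chosen here for illustration; the slides do not say which test to use. The function name `permutation_test` and the copy numbers are made up.

```python
import random

def permutation_test(case_cn, control_cn, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in mean copy number."""
    rng = random.Random(seed)
    n_cases = len(case_cn)
    n_controls = len(control_cn)
    observed = abs(sum(case_cn) / n_cases - sum(control_cn) / n_controls)
    pooled = list(case_cn) + list(control_cn)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the case/control labels and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_cases]) / n_cases
                   - sum(pooled[n_cases:]) / n_controls)
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical integer copy-number genotypes (illustrative only).
cases = [3, 4, 2, 3, 5, 3, 4, 2, 3, 4]
controls = [2, 1, 2, 2, 3, 2, 1, 2, 2, 2]
p_value = permutation_test(cases, controls)
print(f"permutation p-value: {p_value:.4f}")
```

A permutation test makes no distributional assumption about the copy numbers, which seems convenient when genotypes range from 0 to many copies. It does assume that cases and controls were measured comparably.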

  6. How do we assay copy number variation?

  7. Generation 1 - Array CGH. What: Microarray of clones (e.g. BACs), usually on a glass slide. Competitive hybridization of test and reference samples. Measure fluorescence ratio clone by clone. Limitations: Large clones. Sparse coverage. High noise due to spotting process.
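
The clone-by-clone fluorescence ratio is commonly summarized on a log2 scale. The sketch below shows the idealized values, assuming a reference sample with two copies and a perfectly linear signal. Real arrays are noisier and typically give compressed ratios. The function names are illustrative, not from any array CGH package.

```python
import math

def log2_ratio(test_intensity, reference_intensity):
    """Log2 ratio of test to reference fluorescence for one clone."""
    return math.log2(test_intensity / reference_intensity)

def expected_log2_ratio(copy_number, reference_copies=2):
    """Idealized log2 ratio for a given copy number in the test sample."""
    if copy_number == 0:
        return float("-inf")  # homozygous deletion: no test signal at all
    return math.log2(copy_number / reference_copies)

for cn in range(0, 5):
    print(cn, expected_log2_ratio(cn))
```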

  8. Generation 2 - SNP chips. What: High-throughput SNP genotyping platforms (e.g. Affymetrix, Illumina). Disadvantages: Technology was never intended for measuring copy number. SNPs on chip selected to avoid CNV regions by design. Advantage: Hundreds of thousands of points of info.

  9. Generation 3 - SNP chips with CNV markers (Affy 6.0, Illumina 1M). Advantages: SNPs in known CNV regions are now included. Also have “non-polymorphic SNPs” (SNs?). Illumina 1M markers in 10K regions of various types and sizes. Affymetrix: 200K probes in 5K known large CNV regions; 700K probes “evenly spaced along the genome”.

  10. Generation 4 - (Illumina 2.5M, 5M). Changes: Got rid of the non-polymorphic markers. Special coverage of CNV regions??? Are these better or worse for CNVs than the previous generation?

  11. What data do these technologies give us, and how do we use it?

  12. Standard genotyping. Genotype information is in the angle (relative intensity of the two alleles). Copy number information is in the distance from the origin (total intensity). (figure: genotype clusters BB, AB, AA)
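
The angle/distance decomposition on this slide can be sketched as a polar transform of the two allele intensities. This is a minimal illustration: the scaling of the angle to [0, 1] and the cutoffs in `call_genotype` are arbitrary assumptions, and genotyping platforms define and normalize these quantities in their own ways.

```python
import math

def to_polar(a_intensity, b_intensity):
    """Return (distance from origin, angle scaled to [0, 1])."""
    r = math.hypot(a_intensity, b_intensity)
    # Angle 0 means pure A signal, 1 means pure B signal.
    theta = math.atan2(b_intensity, a_intensity) * 2 / math.pi
    return r, theta

def call_genotype(theta, low=1 / 3, high=2 / 3):
    """Toy genotype call from the angle; cutoffs are hypothetical."""
    if theta < low:
        return "AA"
    if theta > high:
        return "BB"
    return "AB"

r, theta = to_polar(1.0, 1.0)
print(r, theta, call_genotype(theta))
```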

  13. In theory (figure: a separate cluster for each genotype: null, A, B, AA, AB, BB, AAA, AAB, ABB, BBB)
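
The cluster labels on this slide are the unordered allele combinations for each copy number from 0 to 3. A small sketch that enumerates them (the function name `genotypes` is illustrative):

```python
from itertools import combinations_with_replacement

def genotypes(copy_number):
    """All unordered A/B allele combinations for a given copy number."""
    if copy_number == 0:
        return ["null"]
    return ["".join(g) for g in combinations_with_replacement("AB", copy_number)]

for cn in range(4):
    print(cn, genotypes(cn))
```

A locus with n copies has n + 1 possible genotypes, so the number of clusters that must be separated grows with copy number.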

  14. But when you look at the data … (figure: trisomic (Down Syndrome) and disomic samples; cluster labels: “AAA and AA”, AAB, AB, ABB, “BBB and BB”)

  15. All SNPs on chromosome 21 (figure: total intensity for disomic vs. trisomic samples)

  16. In theory (figure: a separate cluster for each genotype: null, A, B, AA, AB, BB, AAA, AAB, ABB, BBB)

  17. In practice (figure labels: A, B, null)

  18. So how are copy numbers called? Look for runs of SNPs that are high or low in intensity. Many available algorithms, e.g. HMM, CBS, change-point.
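
The idea of looking for runs can be illustrated with a deliberately naive threshold-based caller. This is not an HMM, CBS, or change-point method, only a toy version of the same goal. The cutoffs, the minimum run length, and the input values are all hypothetical.

```python
def call_runs(values, low=-0.3, high=0.3, min_snps=3):
    """Return (start, end, state) for runs of high or low intensity.

    `values` are per-SNP intensity values centered on 0 and ordered
    along the chromosome; `end` is exclusive.
    """
    def state(v):
        if v >= high:
            return "gain"
        if v <= low:
            return "loss"
        return "normal"

    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current run at the end of the data or on a state change.
        if i == len(values) or state(values[i]) != state(values[start]):
            s = state(values[start])
            if s != "normal" and i - start >= min_snps:
                runs.append((start, i, s))
            start = i
    return runs

intensities = [0.0, 0.1, -0.5, -0.6, -0.4, 0.0, 0.5, 0.1, 0.4, 0.6, 0.5, 0.0]
print(call_runs(intensities))
# → [(2, 5, 'loss'), (8, 11, 'gain')]
```

The single high SNP at index 6 is ignored because it does not form a run. The model-based algorithms handle that trade-off between noise and short events in a more principled way.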

  19. Basic picture

  20. Komura et al., Genome Research, 2006

  21. More complex examples (cancer genetics). Peiffer et al., Genome Research, 2006

  22. Amplification (figure: total intensity track and angle (genotype info) track with AA, AB, BB bands)

  23. Deletion (figure: two regions labeled “deletion”)

  24. Extra copy of whole chromosome (figure annotations: total intensity high over whole chromosome; 3 genotype groups)

  25. LOH: No copy number change, but a region of homozygosity (LOH)
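
The patterns on the last few slides (amplification, deletion, LOH) combine the two signals: total intensity and genotype. A rough sketch of that logic for one region follows, assuming a mean intensity value centered on 0 and the fraction of heterozygous calls in the region as inputs. The cutoffs are invented for illustration and would need tuning on real data.

```python
def classify_region(mean_intensity, het_fraction,
                    intensity_cutoff=0.2, het_cutoff=0.05):
    """Toy classification of a region from intensity and heterozygosity."""
    if mean_intensity >= intensity_cutoff:
        return "amplification"
    if mean_intensity <= -intensity_cutoff:
        return "deletion"
    # Normal total intensity but almost no heterozygous calls.
    if het_fraction <= het_cutoff:
        return "LOH without copy number change"
    return "no change detected"

print(classify_region(0.5, 0.30))
print(classify_region(-0.6, 0.00))
print(classify_region(0.0, 0.01))
print(classify_region(0.0, 0.35))
```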

  26. Basic picture. Wang et al., Genome Research, 2007

  27. Chromosome 9

  28. A few statistical issues to think about … (there’s still a lot to do)

  29. Many run-calling algorithms are oriented towards clinical applications. Many CNV detection algorithms are very conservative, aiming for a zero false positive rate. Most use normalization methods that assume a large reference population is not available. Many use models that make assumptions about what kinds of variation are likely (e.g. cancer).

  30. Family data should be modeled together. CNV “calls” will be much more accurate if you use the whole family, but the model you use should depend on whether you are expecting de novo mutations or not. For some diseases you’ll expect associations with de novo changes. For others you might expect inherited variants.
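
The inherited vs. de novo distinction can be made concrete with a simple check on a trio. Note that this works on hard copy-number calls made separately for each person. It is therefore a post hoc check, not the joint family model the slide recommends. The function name and labels are illustrative.

```python
def classify_child_deletion(father_cn, mother_cn, child_cn, normal_cn=2):
    """Label a child's deletion call using the parents' copy numbers."""
    if child_cn >= normal_cn:
        return "no deletion in child"
    if father_cn < normal_cn or mother_cn < normal_cn:
        return "possibly inherited"
    return "candidate de novo"

print(classify_child_deletion(2, 2, 1))
print(classify_child_deletion(1, 2, 1))
print(classify_child_deletion(2, 2, 2))
```

Because a missed deletion in a parent makes an inherited variant look de novo, independent calls like these will tend to overstate de novo events, which is one reason to model the family together.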

  31. How do we group CNVs for association testing? (figure: several CNVs in one region, four labeled “deletion” and one labeled “duplication”)
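
One possible answer to the grouping question, shown only as a sketch, is to merge overlapping CNVs of the same type into a single region. The slide poses this as an open problem, so other choices (for example grouping by gene, or pooling deletions and duplications) are equally plausible. Coordinates and the function name are hypothetical.

```python
def group_overlapping(cnvs):
    """Merge overlapping CNVs of the same kind into regions.

    `cnvs` is a list of (start, end, kind) tuples on one chromosome.
    """
    regions = []
    # Sort by kind first so each kind is merged independently.
    for start, end, kind in sorted(cnvs, key=lambda c: (c[2], c[0], c[1])):
        last = regions[-1] if regions else None
        if last and last["kind"] == kind and start <= last["end"]:
            last["end"] = max(last["end"], end)
            last["count"] += 1
        else:
            regions.append({"kind": kind, "start": start, "end": end, "count": 1})
    return regions

calls = [
    (100, 200, "deletion"),
    (150, 300, "deletion"),
    (400, 500, "deletion"),
    (450, 600, "deletion"),
    (120, 180, "duplication"),
]
for region in group_overlapping(calls):
    print(region)
```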

  32. Separate methods for deletions? Deletions are easier to detect than other changes. Deletions are likely to have simpler biological effects.

  33. The most important one … The technology is still NOT intended for reliably and comparably measuring total intensity! Total intensity numbers are very sensitive to DNA source, sample handling, etc., so extreme measures must be taken to ensure that cases and controls are comparable.
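
A basic comparability check along these lines is to compare per-sample average total intensity between cases and controls before any CNV calling. The sketch below computes a standardized mean difference. The input values are made up, and what counts as too large a difference is a judgment call not addressed in the slides.

```python
from statistics import mean, stdev

def standardized_mean_difference(case_vals, control_vals):
    """Difference in group means divided by the pooled standard deviation."""
    n1, n2 = len(case_vals), len(control_vals)
    pooled_var = ((n1 - 1) * stdev(case_vals) ** 2
                  + (n2 - 1) * stdev(control_vals) ** 2) / (n1 + n2 - 2)
    return (mean(case_vals) - mean(control_vals)) / pooled_var ** 0.5

# Hypothetical per-sample mean total intensities.
case_intensity = [1.02, 0.98, 1.05, 1.10, 0.97]
control_intensity = [0.90, 0.88, 0.95, 0.93, 0.91]
print(standardized_mean_difference(case_intensity, control_intensity))
```

A large value here suggests that apparent copy number differences may reflect DNA source or sample handling, not biology.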
