
Which arrays should be used?

High Density Oligo Arrays for Single Feature Polymorphism Genotyping and Mapping. Justin Borevitz, Ecology & Evolution, University of Chicago. http://naturalvariation.org


Presentation Transcript


  1. High Density Oligo Arrays for Single Feature Polymorphism Genotyping and Mapping • Justin Borevitz • Ecology & Evolution • University of Chicago • http://naturalvariation.org

  2. Which arrays should be used? • Spotted arrays: Arizona, 29,000 70-mers • ATH1, Affymetrix expression GeneChip: 202,806 unique 25 bp oligonucleotide features • AtTILE1, universal whole-genome array: one feature every ~35 bp, >3 million PM features • Re-sequencing array: 120M*8bp • 20 accessions, Perlegen • Max Planck (Weigel), USC (Nordborg) GeneChip

  3. Universal Whole Genome Array • RNA • DNA • Chromatin immunoprecipitation (ChIP-chip) • Gene discovery • Gene model correction • Non-coding/micro-RNA • Antisense transcription • Methylation • Transcriptome atlas • Expression levels • Tissue specificity • Polymorphism: SFP discovery/genotyping • Comparative Genome Hybridization (CGH) • Insertions/deletions • Alternative splicing • ~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced

  4. ChipViewer: Mapping of transcriptional units of the ORFeome • At1g09750, from the 2000v annotation (MIPS) to the latest AGI annotation

  5. Improved Genome Annotation [Figure: ORFa and ORFb along the chromosome (bp); tracks labelled Transcriptome Atlas, start, AAAAA, deletion, M, SFP, SNP, conservation]

  6. Talk Outline • Single Feature Polymorphisms (SFPs) • Barley SFPs • Uses of SFPs • Haplotype analysis • Expression

  7. Potential Deletions

  8. Spatial Correction • Spatial artifacts • Improved reproducibility • Next: quantile normalization
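Slide 8 names quantile normalization as the step after spatial correction but gives no detail. The sketch below shows the standard technique on plain Python lists, assuming one list of feature intensities per chip. The function name and the toy values are illustrative and not from the talk, and the spatial correction itself is not covered.

```python
def quantile_normalize(arrays):
    """Force every chip to share the same intensity distribution.

    arrays: list of equal-length lists, one list of feature intensities per chip.
    Ties are broken by feature order, a simplification.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all chips must have the same number of features")
    # Reference distribution: mean of the k-th smallest value across chips.
    sorted_chips = [sorted(a) for a in arrays]
    reference = [sum(chip[k] for chip in sorted_chips) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # feature indices, low to high
        out = [0.0] * n
        for rank, feature in enumerate(order):
            out[feature] = reference[rank]  # replace value by reference quantile
        normalized.append(out)
    return normalized


# Toy example: three chips, four features each (made-up numbers).
chips = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
norm = quantile_normalize(chips)
```

After normalization each chip holds the same set of values, and only the ranking of features within a chip is preserved.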

  9. False Discovery and Sensitivity • Cereon may be a sequencing error • TIGR match is a match • 3/4 Cvi markers were also confirmed in PHYB • PM only

     [Figure: percentages 90%, 80%, 70%, 41%, 53%, 85%, 90%, 80%, 70%, 67%, 85%, 100%]

     GeneChip SAM threshold: 5% FDR (Cereon marker accuracy: 100%)
                         Total    SFPs   nonSFPs   Sensitivity
     GeneChip                     3806     89118
     Sequence              817     121       696
     Polymorphic           340     117       223   34%
     Non-polymorphic       477       4       473
     False discovery rate: 3%
     Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

     GeneChip SAM threshold: 18% FDR (Cereon marker accuracy: 100%)
                         Total    SFPs   nonSFPs   Sensitivity
     GeneChip                    10627     82297
     Sequence              817     223       594
     Polymorphic           340     195       145   57%
     Non-polymorphic       477      28       449
     False discovery rate: 13%
     Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59
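The sensitivity, false discovery rate and chi-square values on slide 9 can be recomputed from the four counts in each table. The sketch below assumes the test is a Pearson chi-square on the 2x2 table without continuity correction, which is an assumption, although it matches the reported statistics. The function name is illustrative.

```python
def confusion_summary(tp, fn, fp, tn):
    """Sensitivity, false discovery rate and Pearson chi-square for a 2x2 table.

    tp: polymorphic features called SFP      fn: polymorphic features not called
    fp: non-polymorphic features called SFP  tn: non-polymorphic features not called
    """
    sensitivity = tp / (tp + fn)
    fdr = fp / (tp + fp)
    n = tp + fn + fp + tn
    # Pearson chi-square for a 2x2 table, no continuity correction (assumption).
    chi2 = n * (tp * tn - fn * fp) ** 2 / (
        (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    )
    return sensitivity, fdr, chi2


# Counts taken from the two tables on the slide.
strict = confusion_summary(tp=117, fn=223, fp=4, tn=473)    # 5% FDR threshold
relaxed = confusion_summary(tp=195, fn=145, fp=28, tn=449)  # 18% FDR threshold

for label, (sens, fdr, chi2) in (("5% FDR", strict), ("18% FDR", relaxed)):
    print(f"{label}: sensitivity={sens:.2f}, FDR={fdr:.2f}, chi-square={chi2:.2f}")
```

Relaxing the threshold raises sensitivity from about 34% to about 57%, and the observed false discovery rate rises from about 3% to about 13%.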

  10. Effect of SNP position • 340 candidate polymorphisms [Figure legend: false negative, true positive]

  11. Complex Genomes? • Signal to Noise with Large Genomes • RNA, less complex, but differential expression

  12. Barley SFPs

  13. Barley SFPs: RNA, 2 genotypes, 18 replicates

  14. False Discovery Rate: RNA

  15. Barley SFPs: genomic DNA, 3 genotypes, 3 replicates

  16. False Discovery Rate: DNA

  17. Sequence Verification of SFPs

  18. Position of SNP

  19. Barley SFPs per probeset

  20. Uses of SFPs • Recombination Events • Mapping Mendelian mutations • Mapping QTL • Deletions • Haplotyping

  21. Chip genotyping of a Recombinant Inbred Line • 29 kb interval • Discovery: 6 replicates x $500 / 12,000 SFPs = $0.25 per SFP • Typing: 1 replicate x $500 / 12,000 SFPs = $0.041 per SFP
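The per-SFP costs on slide 21 follow from dividing the total chip cost by the number of SFPs scored. A small sketch of that arithmetic, with an illustrative function name:

```python
def cost_per_sfp(replicates, cost_per_chip, n_sfps):
    """Chip cost spread evenly over the SFPs scored (all values in dollars)."""
    return replicates * cost_per_chip / n_sfps


discovery = cost_per_sfp(replicates=6, cost_per_chip=500, n_sfps=12_000)  # 0.25
typing = cost_per_sfp(replicates=1, cost_per_chip=500, n_sfps=12_000)     # about 0.0417
```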

  22. Map bibb • 100 bibb mutant plants • 100 wt plants

  23. bibb mapping • Bulk segregant mapping using chip hybridization (ChipMap) • bibb maps to chromosome 2 near ASYMMETRIC LEAVES1 (AS1)

  24. BIBB = ASYMMETRIC LEAVES1 • AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM • Sequenced AS1 coding region from bib-1 … found a g -> a change that would introduce a stop codon in the MYB domain • bib-1: W49* • as-101: Q107* [Figure labels: as1, bibb, as1-101, MYB]

  25. Array Mapping [Figure: panels chr1, chr2, chr3, chr4, chr5] (Hazen et al., Plant Physiology 2005)

  26. eXtreme Array Mapping • 15 tallest RILs pooled vs 15 shortest RILs pooled

  27. Chromosome 2: RED2 QTL [Figure: LOD (0 to 16) against position (0 to 100 cM) on chromosome 2; Composite Interval Mapping and eXtreme Array Mapping curves; RED2 QTL, 12 cM] • Allele frequencies determined by SFP genotyping • Thresholds set by simulations • Red light QTL RED2 from 100 Kas/Col RILs (Wolyn et al., Genetics 2004)
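The slides say that eXtreme Array Mapping compares allele frequencies between two pools of 15 RILs and that thresholds are set by simulations, without giving the procedure. The sketch below is one simple way to do this and is not the author's method. It assumes homozygous RIL genotypes coded 0/1, an unlinked locus where each line carries either parental allele with probability 0.5, and a per-locus rather than genome-wide threshold. All names and example pools are hypothetical.

```python
import random


def pool_frequency_difference(high_pool, low_pool):
    """Difference in allele frequency between two pools (genotypes coded 0/1)."""
    return sum(high_pool) / len(high_pool) - sum(low_pool) / len(low_pool)


def null_threshold(pool_size=15, n_sim=10_000, quantile=0.95, seed=0):
    """Quantile of |frequency difference| when genotype is unrelated to phenotype."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sim):
        # Each simulated RIL draws one of the two parental alleles at random.
        high = [rng.randint(0, 1) for _ in range(pool_size)]
        low = [rng.randint(0, 1) for _ in range(pool_size)]
        diffs.append(abs(pool_frequency_difference(high, low)))
    diffs.sort()
    return diffs[min(int(quantile * n_sim), n_sim - 1)]


threshold = null_threshold()
# Hypothetical linked locus: 13/15 tall lines vs 3/15 short lines carry allele 1.
linked = pool_frequency_difference([1] * 13 + [0] * 2, [1] * 3 + [0] * 12)
print(f"threshold={threshold:.3f}, linked locus difference={linked:.3f}")
```

A genome-wide threshold would instead take the maximum difference across all loci in each simulated experiment, which needs a model of linkage along the chromosomes.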

  28. eXtreme Array Mapping BurC F2

  29. QTL: Lz x Ler F2 • XAM: Lz x Col F2 (Werner et al., Genetics 2005)

  30. eXtreme Array Fine Mapping • RED2 QTL interval: ~2 Mb, ~8 cM, >400 SFPs • Col (low) x Kas (high) • Select recombinants by PCR: >200 from >1250 plants [Figure: genotype classes (Kas, Col, het) at flanking markers mark1 and mark2, with approximate plant counts in slide order: ~2, ~268, ~43, ~43, ~43, ~539, ~539, ~268, ~2, ~43]

  31. Potential Deletions • >500 potential deletions • 45 confirmed by Ler sequence • 23 (of 114) transposons • Disease Resistance (R) gene clusters • Single R gene deletions • Genes involved in secondary metabolism • Unknown genes

  32. Potential Deletions Suggest Candidate Genes • FLOWERING1 QTL • Flowering time QTL caused by a natural deletion in FLM (Werner et al., PNAS 2005) [Figure: FLM natural deletion; x-axis Chr1 (bp); label MAF1]

  33. Het Fast Neutron deletions • FKF1: 80 kb deletion, CHR1 • cry2: 10 kb deletion, CHR1

  34. Array Haplotyping • What about diversity/selection across the genome? • A genome-wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ • LD decay, haplotype block size • Deep population structure? • Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
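Slide 34 lists θw, π and Tajima's D as genome-wide targets. The sketch below computes the textbook versions of these statistics from a 0/1 matrix of SFP calls, where rows are accessions and columns are features. The function name and toy matrix are illustrative. SFP calls are ascertained against the reference the array was designed from, so values computed this way are not directly comparable to sequence-based estimates.

```python
from math import sqrt


def diversity_stats(haplotypes):
    """Return (theta_w, pi, tajimas_d) as totals over the sites supplied.

    haplotypes: list of equal-length 0/1 lists, one per inbred accession.
    tajimas_d is None when no site is segregating.
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 accessions")
    counts = [sum(col) for col in zip(*haplotypes)]
    segregating = [k for k in counts if 0 < k < n]
    s = len(segregating)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = s / a1  # Watterson's estimator
    # Mean number of pairwise differences, summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in segregating)
    if s == 0:
        return theta_w, pi, None
    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, pi, d


# Toy window: 4 accessions, 4 features (made-up calls).
calls = [[0, 0, 1, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
theta_w, pi, tajimas_d = diversity_stats(calls)
```

Applying the function to the features that fall in each 50 kb window gives a windowed scan along a chromosome.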

  35. Array Haplotyping • Inbred lines • Low effective recombination due to partial selfing • Extensive LD blocks [Figure: chromosome 1, ~500 kb; accessions Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd]

  36. Distribution of T-stats [Figure: null (permutation) vs actual distributions; labels: 208,729; 32,427 calls; Not Col; Col; NA; duplications; 12,250 SFPs]
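Slide 36 compares the actual distribution of t-statistics with a permutation null. The sketch below shows the general idea under stated assumptions and is not the SAM software named on slide 9. It assumes a Welch t-statistic per feature, an exhaustive relabelling of replicates for the null, and a fixed threshold. All names and intensities are hypothetical.

```python
from itertools import combinations
from statistics import mean, variance


def t_stat(a, b):
    """Welch t-statistic; the small constant guards against zero variance."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / (se + 1e-9)


def permutation_fdr(features, n_ref, threshold):
    """Count features called at |t| >= threshold and estimate the FDR.

    features: per-feature intensity lists; the first n_ref values of each are
    reference replicates, the rest are replicates of the other accession.
    """
    n = len(features[0])
    actual = sum(abs(t_stat(f[:n_ref], f[n_ref:])) >= threshold for f in features)
    observed = set(range(n_ref))
    null_counts = []
    for ref_idx in combinations(range(n), n_ref):
        ref = set(ref_idx)
        if ref == observed:
            continue  # skip the true labelling
        count = 0
        for f in features:
            a = [f[i] for i in range(n) if i in ref]
            b = [f[i] for i in range(n) if i not in ref]
            count += abs(t_stat(a, b)) >= threshold
        null_counts.append(count)
    expected_false = mean(null_counts)
    fdr = expected_false / actual if actual else 0.0
    return actual, fdr


# Made-up log intensities: 3 reference replicates, then 3 accession replicates.
features = [
    [10.1, 10.0, 9.9, 7.0, 7.2, 6.9],  # strong signal drop, SFP-like
    [8.0, 8.3, 7.9, 8.1, 8.2, 7.8],    # no difference
    [9.0, 9.4, 8.8, 9.1, 8.9, 9.3],    # no difference
]
calls, fdr = permutation_fdr(features, n_ref=3, threshold=8.0)
```

The mirror image of the true labelling stays in the null set, which makes the estimate slightly conservative when both groups have the same number of replicates.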

  37. Sequence confirmation of SFPs

  38. SFPs for reverse genetics • 14 accessions • 30,950 SFPs • http://naturalvariation.org/sfp

  39. Chromosome Wide Diversity

  40. Diversity, 50 kb windows

  41. Tajima’s D-like statistic, 50 kb windows [Figure labels: RPS4, unknown]

  42. R genes vs bHLH

  43. Consider SFPs during expression • Remove SFPs • Allele specific expression

  44. Differences may be due to expression or hybridization
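Slides 43 and 44 suggest removing SFP probes before comparing expression, because a probe that hybridizes poorly can look like lower expression. The sketch below illustrates the idea with a plain mean over log intensities. It is a simplification, not a full expression summary method, and the names and values are hypothetical.

```python
from statistics import mean


def probeset_signal(intensities, sfp_probes=frozenset()):
    """Mean log intensity of a probe set, ignoring probes flagged as SFPs."""
    kept = [v for i, v in enumerate(intensities) if i not in sfp_probes]
    if not kept:
        raise ValueError("every probe in the probe set is flagged as an SFP")
    return mean(kept)


# Made-up probe set: probe 2 hybridizes poorly in Ler, as an SFP would.
col = [9.0, 9.2, 8.9, 9.1]
ler = [9.1, 9.0, 6.5, 9.2]

diff_all = probeset_signal(ler) - probeset_signal(col)               # apparent drop
diff_masked = probeset_signal(ler, {2}) - probeset_signal(col, {2})  # drop disappears
```

In this toy case the apparent expression difference comes entirely from the one SFP probe, which is the confound slide 44 describes.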
