Genomics of Natural Variation in Arabidopsis thaliana. Justin Borevitz, Salk Institute, naturalvariation.org
Talk Outline
• Natural Variation in Light Response
• PHYTOCHROME A/QTL mapping
• Fine Mapping/ Gene Expression candidates
• Single Feature Polymorphisms
• Deletion/ Candidate genes
• Bulk Segregant/ eXtreme Mapping
• Haplotype analysis
Light Affects the Entire Plant Life Cycle [Figure labels: de-etiolation, hypocotyl]
Quantitative Trait Loci EPI1 EPI2
Epistasis scan. BQTL http://hacuna.ucsd.edu/bqtl. 43,956 pair-wise tests; 163 markers and 133 intervals. Permutation threshold p < 0.05 (5000 permutations). [Figure: pair-wise scan with Chr1 to Chr5 on both axes]
Near-Isogenic Lines for LIGHT1 (Ler / Cvi #3)
Marker (Mb): SM184 5.50, SM50 5.87, SM35 6.34, SM106 7.01, G2395 7.30, SNP377 7.44, CER45 7.60, MAT7787 7.79, MSAT7964 7.96, TH1 8.13, 8.29, SEQ8298 8.65, SM40, SNP65 9.32
Genes labelled in the interval: RVE7, GI
Line       Plants   Phenotype (mm)
Ler        6        5.8
81N-J      2        5.8
17A-A/J    4        5.1
114        3        5.9
124        3        5.7
189        3        5.8
194        3        5.0
Downstream players: differences may be due to expression or hybridization
PAG1 down-regulated in Cvi. PALE GREEN1 knockout has long hypocotyl in red light
Abundant Genetic Variation in Light Response
• Quickly map in new crosses (XAM still to come!)
• QTL map to novel loci and candidate genes
• New crosses find major loci and new loci
• Gene Expression
• NILs, pools of extreme RILs or F2s
• Identify candidate genes at QTL (linked)
• or downstream effects of QTL (unlinked)
What is Array Genotyping?
• Affymetrix expression GeneChips contain 202,806 unique 25bp oligonucleotides.
• 11 features per probe set for 21,546 genes
• New arrays have even more
• Genomic DNA is randomly labeled with biotin, product ~50bp.
• 3 independent biological replicates compared to the reference strain Col GeneChip
Spatial Correction: improved reproducibility. Next: Quantile Normalization. [Figure: Spatial Artifacts]
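The slide names quantile normalization as the next step but gives no detail. As a minimal sketch of the general technique (not necessarily the implementation used for these arrays), each hybridization is forced onto a common intensity distribution by replacing every value with the across-array mean for its rank. The function name and the list-of-lists input are assumptions made for illustration, and ties are broken by feature order rather than averaged.

```python
def quantile_normalize(arrays):
    """Quantile-normalize replicate hybridizations.

    arrays: list of equal-length lists of intensities, one list per array
    (hypothetical input layout; feature i is the same probe in every list).
    Returns new lists in which every array shares the same distribution.
    """
    n_features = len(arrays[0])
    if any(len(a) != n_features for a in arrays):
        raise ValueError("all arrays must have the same number of features")

    # Mean of the k-th smallest intensity across arrays, for every rank k.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_features)
    ]

    normalized = []
    for a in arrays:
        # Feature indices ordered by increasing intensity within this array.
        order = sorted(range(n_features), key=lambda i: a[i])
        out = [0.0] * n_features
        for rank, feature in enumerate(order):
            out[feature] = rank_means[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization the arrays differ only in which feature holds which rank, so between-array intensity differences at a feature are easier to compare.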
physical location known Genetic Markers in genes
False Discovery and Sensitivity (PM only)
Cereon may be a sequencing error; TIGR match is a match. 3/4 Cvi markers were also confirmed in PHYB.
[Figure values: 90%, 80%, 70%: 41%, 53%, 85%; 90%, 80%, 70%: 67%, 85%, 100%]

SAM threshold: 5% FDR
                    Total    SFPs    nonSFPs
GeneChip                     3806    89118     Cereon marker accuracy: 100%
Sequence            817      121     696
Polymorphic         340      117     223       Sensitivity: 34%
Non-polymorphic     477      4       473       False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

SAM threshold: 18% FDR
                    Total    SFPs    nonSFPs
GeneChip                     10627   82297     Cereon marker accuracy: 100%
Sequence            817      223     594
Polymorphic         340      195     145       Sensitivity: 57%
Non-polymorphic     477      28      449       False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59
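The sensitivity, false discovery rate, and chi-square figures on this slide can be recomputed from the four sequence-verified counts at each threshold. The sketch below assumes the test is a plain Pearson chi-square on the 2x2 table without continuity correction; that assumption is not stated on the slide, but it gives values that agree with the reported 177.34 and 265.13. The function name is hypothetical.

```python
def sfp_call_summary(poly_called, poly_missed, nonpoly_called, nonpoly_missed):
    """Summarize SFP calls against sequenced features.

    poly_called / poly_missed: polymorphic features called SFP / nonSFP.
    nonpoly_called / nonpoly_missed: non-polymorphic features called SFP / nonSFP.
    Returns (sensitivity, false_discovery_rate, chi_square).
    """
    sensitivity = poly_called / (poly_called + poly_missed)
    false_discovery_rate = nonpoly_called / (poly_called + nonpoly_called)

    # Pearson chi-square for independence, assuming no continuity correction.
    table = [[poly_called, poly_missed], [nonpoly_called, nonpoly_missed]]
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected
    return sensitivity, false_discovery_rate, chi_square


if __name__ == "__main__":
    # Counts taken from the 5% FDR and 18% FDR tables on the slide.
    for counts in [(117, 223, 4, 473), (195, 145, 28, 449)]:
        sens, fdr, chi2 = sfp_call_summary(*counts)
        print(f"sensitivity={sens:.0%} FDR={fdr:.0%} chisq={chi2:.2f}")
```

Relaxing the SAM threshold from 5% to 18% FDR raises sensitivity from 34% to 57% while the verified false discovery rate rises from 3% to 13%.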
Chip genotyping of a Recombinant Inbred Line. 29kb interval. Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP. Typing: 1 replicate × $500 / 12,000 SFPs = $0.041 per SFP.
Potential Deletions: >500 potential deletions; 45 confirmed by Ler sequence; 23 (of 114) transposons; Disease Resistance (R) gene clusters; single R gene deletions; genes involved in secondary metabolism; unknown genes
Potential Deletions Suggest Candidate Genes. FLOWERING1 QTL: flowering time QTL caused by a natural deletion in MAF1. [Figure: MAF1 natural deletion; x-axis Chr1 (bp)]
Het Fast Neutron deletions FKF1 80kb deletion CHR1 cry2 10kb deletion CHR1
Map bibb: 100 bibb mutant plants vs 100 wt plants
bibb mapping. Bulk segregant mapping using chip hybridization (ChipMap). bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1 (AS1)
BIBB = ASYMMETRIC LEAVES1. AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64cM. Sequenced AS1 coding region from bib-1 …found g -> a change that would introduce a stop codon in the MYB domain. bib-1 W49*; as1-101 Q107*. [Figure labels: as1, bibb, as1-101, MYB]
eXtreme Array Mapping 15 tallest RILs pooled vs 15 shortest RILs pooled
Red light QTL RED2 from 100 Kas/Col RILs. Allele frequencies determined by SFP genotyping. Thresholds set by simulations. [Figure: Chromosome 2, LOD against position (cM); Composite Interval Mapping compared with eXtreme Array Mapping; RED2 QTL, 12cM]
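The signal that eXtreme Array Mapping looks for can be illustrated with a small sketch: pool the lines at each phenotypic extreme and compare parental allele frequencies marker by marker. This is an illustration under stated assumptions, not the method on the slides. It uses known per-line genotypes (a hypothetical 0/1 encoding) where the real experiment estimates pool allele frequencies from array hybridization, and it omits the simulation-based thresholds. All names are made up for the example.

```python
def pool_allele_frequency_difference(genotypes, phenotypes, pool_size):
    """Per-marker allele frequency in the tallest pool minus the shortest pool.

    genotypes: dict of line name -> list of 0/1 calls per marker
               (hypothetical encoding, 1 = allele from one chosen parent).
    phenotypes: dict of line name -> trait value (e.g. hypocotyl length).
    pool_size: number of lines taken from each extreme.
    """
    if 2 * pool_size > len(phenotypes):
        raise ValueError("pools would overlap; reduce pool_size")

    ranked = sorted(phenotypes, key=phenotypes.get)  # shortest first
    shortest = ranked[:pool_size]
    tallest = ranked[-pool_size:]
    n_markers = len(genotypes[ranked[0]])

    def frequency(pool, marker):
        return sum(genotypes[line][marker] for line in pool) / len(pool)

    return [frequency(tallest, m) - frequency(shortest, m) for m in range(n_markers)]


if __name__ == "__main__":
    # Toy data: the middle marker tracks the phenotype perfectly.
    toy_genotypes = {
        "r1": [0, 0, 1], "r2": [1, 0, 0], "r3": [0, 0, 1],
        "r4": [1, 1, 0], "r5": [0, 1, 1], "r6": [1, 1, 0],
    }
    toy_phenotypes = {"r1": 4.0, "r2": 4.2, "r3": 4.1, "r4": 6.0, "r5": 6.3, "r6": 6.1}
    print(pool_allele_frequency_difference(toy_genotypes, toy_phenotypes, 2))
```

Markers linked to a QTL show frequency differences near +1 or -1 between pools, while unlinked markers stay near 0; with 15 lines per pool, as on the previous slide, sampling noise at unlinked markers is what a threshold has to account for.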
Single Feature Polymorphisms
• Improve with replicates (easy)
• Improved statistical models
• Genotyping
• Precisely define recombination breakpoints
• Fine mapping
• Gene conversion
• Potential Deletions
• Candidate genes/ induced mutations
• Bulk segregant Mapping
• eXtreme Array Mapping, F2s etc
Array Haplotyping
• What about Diversity/selection across the genome?
• A genome wide estimate of population genetics parameters, θw, π, Tajima’s D, ρ
• LD decay, Haplotype block size
• Deep population structure?
• Col, Lz, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2
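For readers unfamiliar with the parameters listed above, the following sketch computes Watterson's θw, nucleotide diversity π, and Tajima's D from a set of aligned biallelic haplotypes using the standard textbook formulas. Treating SFP calls as 0/1 sites is an assumption made here for illustration; array-based calls are ascertained against the Col reference and miss some polymorphisms, so real estimates would need correction. The function name and input layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def diversity_estimates(haplotypes):
    """Return (S, theta_w, pi, tajimas_d) for equal-length 0/1 haplotype strings.

    Values are totals over the region, not per-site. tajimas_d is None
    when there are no segregating sites.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])

    # S: number of sites at which the sample is not monomorphic.
    seg_sites = sum(
        1 for j in range(n_sites) if len({h[j] for h in haplotypes}) > 1
    )

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = seg_sites / a1

    # pi: mean number of pairwise differences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(x != y for x, y in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)

    if seg_sites == 0:
        return seg_sites, theta_w, pi, None

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajimas_d = (pi - theta_w) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))
    return seg_sites, theta_w, pi, tajimas_d


if __name__ == "__main__":
    # Four toy haplotypes in which every variant is carried by a single haplotype.
    print(diversity_estimates(["0000", "1000", "0100", "0011"]))
```

In the toy sample every variant is a singleton, so π falls below θw and D is negative, the pattern expected under an excess of rare alleles.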
Array Haplotyping: Chromosome 1, ~500kb; Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd. Inbred lines. Low effective recombination due to partial selfing. Extensive LD blocks.
Distribution of T-stats: null (permutation) vs actual. [Figure labels: Not Col, Col, NA, duplications]
SFPs for reverse genetics. 14 Accessions, 30,950 SFPs. http://naturalvariation.org/sfp
Remember to think about hybridization polymorphism in RNA analysis (affy or cDNA)
• Keep in mind that DNA can be used on many arrays
• Example for mapping ESTs
• Haplotyping
• Diversity/Selection
• Association Mapping
• Population Genomics (hybrid zones)
Universal Whole Genome Array: RNA, DNA. Chromatin Immunoprecipitation (ChIP chip); Gene Discovery; Gene model correction; Non-coding/ micro-RNA; Antisense transcription; Methylation; Transcriptome Atlas; Expression levels; Tissue specificity; Polymorphism SFPs (Discovery/Genotyping); Comparative Genome Hybridization (CGH); Insertion/Deletions; Alternative Splicing. ~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced
ChipViewer: Mapping of transcriptional units of ORFeome. From 2000v At1g09750 (MIPS) to the latest AGI At1g09750. [Figure panels: 2000v Annotation (MIPS); the latest AGI Annotation]
Improved Genome Annotation. [Figure: ORFa and ORFb with start and AAAAA marks; tracks for Transcriptome Atlas, deletion, M marks, SFP/SNP, conservation; x-axis Chromosome (bp)]
Future Projects: DNA. Haplotype Map – Linkage Disequilibrium, Gene Family (R genes). Association Studies – Whole Genome Arrays, 192 Accessions, > 200,000 SFPs (~600bp resolution). Confirm Associations in specific crosses with eXtreme Array Mapping.
Future Projects: RNA. True natural variation in gene expression: polymorphism accounted for, alternative splicing. Cis regulatory variation/ Imprinting: reciprocal F1s, 3 replicates. Transcriptome QTL Map – VanC Advanced Intercross Recombinant Inbred Lines. How many loci control the variation in gene transcription? Candidate TF and binding sites?
Future work with Natural Variation
• VanC advanced intercross RIL population
• Backcross collections
NaturalVariation.org
Salk: Jon Werner, Todd Mockler, Sarah Liljegren, Olivier Loudet, Huaming Chen, Joanne Chory, Detlef Weigel, Joseph Ecker
UC San Diego: Charles Berry
Scripps: Sam Hazen, Elizabeth Winzeler
UC Davis: Julin Maloof
University of Guelph, Canada: Dave Wolyn
Sainsbury Laboratory: Jonathan Jones
USC: Magnus Nordborg, Tina Hu
Syngenta: Hur-Song Chang, Tong Zhu
Helen Hay Whitney Foundation