Genomic Approaches to Understand Natural Genetic Variation
Justin Borevitz
Ecology & Evolution, University of Chicago
http://naturalvariation.org/
Major Issues in Breeding Complex Traits
• High throughput Phenotyping
• Physiological dissection of 1000s of correlated traits
• Biological Variation
• Multiple genes under major QTL
• High Density markers
• High throughput seedling screens
• Linkage Drag
• Environmental Interaction (GxE)
• Good for optimizing local varieties
• Epistasis (GxG)
• Magnify minor QTL in local backgrounds
• Multi species ecological interactions
• “extended phenotype”
Genomic Breeding Path
Steps: Experimental Design, Mapping population, Marker Identification, Genotyping, Phenotyping, QTL Analysis, Fine Mapping, Candidate gene Polymorphisms (gene expression, loss of function), QTL gene Confirmation
Genomics path
Steps: Experimental Design, Mapping population, Phenotyping, QTL Analysis, Fine Mapping, QTL gene Confirmation
Borevitz and Chory, COPB 2003
Unite Genetic and Physical Map
• Shotgun genomic or 454 reads
• ESTs / cDNAs / BAC ends
• 1000s of contigs
• Genotype mapping population on arrays
• Create very high density genetic map
• Known position of genes/contigs allows QTL candidate gene identification
• Control hybridization variation for gene expression
Talk Outline
• Germplasm Diversity
• Population structure, Haplotype Mapping set
• SNP/Tiling microarrays
• Very High Density Markers
• Mapping Extreme Bulk Segregant
• Haplotype Mapping
• Copy Number Polymorphisms / Deletions
Local Population Structure: common haplotypes
149 non-singleton SNPs, >6000 accessions (Global, Midwest, and UK)
Megan Dunning, Yan Li
Diversity within and between populations 17 Major Haplotypes 80 Major Haplotypes
Universal Whole Genome Array (DNA and RNA)
• Gene/Exon Discovery
• Gene model correction
• Non-coding / micro-RNA
• Chromatin Immunoprecipitation (ChIP chip)
• Alternative Splicing
• Methylation
• Antisense transcription
• Polymorphism: SFP Discovery/Genotyping
• Transcriptome Atlas: expression levels, tissue specificity
• Comparative Genome Hybridization (CGH): Insertion/Deletions, Copy Number Polymorphisms
• RNA Immunoprecipitation (RIP chip)
• Allele Specific Expression
Control for hybridization/genetic polymorphisms to understand TRUE expression variation
Improved Genome Annotation
[Figure: annotation tracks along Chromosome (bp): ORFa, ORFb, Transcriptome Atlas, start, AAAAA, deletion, M marks, SFP and SNP positions, conservation]
Which arrays should be used?
• BAC array
• cDNA array
• Long oligo array
Which arrays should be used?
• Gene array
• Exon array
• Tiling array: 35bp tile, 25mers, 10bp gaps
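The tiling geometry quoted on this slide (one 25-mer per 35bp tile, leaving a 10bp gap) can be illustrated with a minimal sketch. The function name and the 200bp example region are illustrative assumptions, not part of the original array design files.

```python
def tiling_probe_starts(region_length, probe_length=25, tile_step=35):
    """Return 0-based start positions of probes tiled across a region.

    One probe of `probe_length` bases is placed every `tile_step` bases,
    so consecutive probes are separated by tile_step - probe_length bases.
    """
    last_start = region_length - probe_length
    return list(range(0, last_start + 1, tile_step))


# Hypothetical 200bp region, used only to show the arithmetic.
starts = tiling_probe_starts(200)
gaps = [b - (a + 25) for a, b in zip(starts, starts[1:])]
covered_fraction = len(starts) * 25 / 200
```

With these numbers roughly 25 of every 35 bases sit under a probe, which is why a polymorphism can fall in a gap and go undetected by that probe set.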
Which arrays should be used?
• SNP array
• How about multiple species? Microbial communities? (Pst, Psm, Psy, Psx, Agro, Xanthomonas, H parasitica, 15 viruses)
• Resequencing array
• Tiling/SNP array 2007: 250k SNPs, 1.6M tiling probes
SFP detection on tiling arrays
Delta   p0     FALSE   Called   FDR
1.00    0.95   18865   160145   11.2%
1.25    0.95   10477   132390    7.5%
1.50    0.95    6545   115042    5.4%
1.75    0.95    4484   102385    4.2%
2.00    0.95    3298    92027    3.4%
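The FDR column in this table is numerically consistent with FDR = p0 × FALSE / Called. The sketch below reproduces that arithmetic. Reading FALSE as the number of calls expected under permutation and p0 as the estimated proportion of non-polymorphic features is an assumption here; the slide does not define the columns.

```python
def estimated_fdr(p0, false_calls, total_calls):
    """Estimated false discovery rate: p0 * false calls / total calls."""
    return p0 * false_calls / total_calls


# (Delta, p0, FALSE, Called) rows copied from the slide.
rows = [
    (1.00, 0.95, 18865, 160145),
    (1.25, 0.95, 10477, 132390),
    (1.50, 0.95, 6545, 115042),
    (1.75, 0.95, 4484, 102385),
    (2.00, 0.95, 3298, 92027),
]

fdr_percent = [
    round(100 * estimated_fdr(p0, false_calls, called), 1)
    for _delta, p0, false_calls, called in rows
]
```

Raising Delta trades calls for accuracy: the number of called SFPs falls from 160145 to 92027 while the estimated FDR falls from 11.2% to 3.4%.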
Chip genotyping of a Recombinant Inbred Line Van x Col RIL 23
100 bibb mutant plants Map bibb 100 wt mutant plants
bibb mapping
Bulk segregant mapping using chip hybridization (ChipMap)
bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1 (AS1)
BIBB = ASYMMETRIC LEAVES1
AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64cM
Sequenced AS1 coding region from bib-1 …found g -> a change that would introduce a stop codon in the MYB domain
[Figure: as1 and bibb plants; AS1 protein diagram with the MYB domain, marking bib-1 W49* and as1-101 Q107*]
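A g -> a change producing W49* fits the standard genetic code: tryptophan is encoded only by TGG, and a G to A substitution at either G gives a stop codon. The sketch below enumerates those cases. The slide does not say which of the two positions changed in bib-1, and the function name is illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def g_to_a_stop_mutations(codon):
    """List single G->A substitutions of `codon` that yield a stop codon.

    Returns (1-based codon position, mutant codon) pairs.
    """
    hits = []
    for i, base in enumerate(codon):
        if base == "G":
            mutant = codon[:i] + "A" + codon[i + 1:]
            if mutant in STOP_CODONS:
                hits.append((i + 1, mutant))
    return hits


# Tryptophan (W) is encoded only by TGG.
trp_hits = g_to_a_stop_mutations("TGG")
```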
Array Mapping chr1 chr2 chr3 chr4 chr5 Hazen et al Plant Physiology (2005)
eXtreme Array Mapping 15 tallest RILs pooled vs 15 shortest RILs pooled
RED2 QTL on Chromosome 2: Composite Interval Mapping vs. eXtreme Array Mapping
[Figure: LOD (0 to 16) against position on Chromosome 2 (0 to 100 cM) for Composite Interval Mapping and eXtreme Array Mapping; RED2 QTL, 12cM]
Allele frequencies determined by SFP genotyping. Thresholds set by simulations.
Red light QTL RED2 from 100 Kas/Col RILs (Wolyn et al Genetics 2004)
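One way to picture "thresholds set by simulations" for pooled extremes is a null simulation of the allele frequency difference between two pools of 15 lines drawn from 100 RILs. This is a minimal sketch under stated assumptions, not the procedure used for the slide: a single unlinked marker, each RIL homozygous for either parent with probability 0.5, no array measurement noise, and hypothetical function names. A genome-wide threshold would also need to account for testing many linked markers.

```python
import random


def null_pool_differences(n_rils=100, pool_size=15, n_sims=5000, seed=1):
    """Simulate |freq(high pool) - freq(low pool)| at a marker with no QTL.

    Under the null, genotype is independent of phenotype, so the two
    extreme pools are modeled as disjoint random samples of the RILs.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        genotypes = [rng.randint(0, 1) for _ in range(n_rils)]  # 1 = Kas allele
        chosen = rng.sample(range(n_rils), 2 * pool_size)
        high_pool, low_pool = chosen[:pool_size], chosen[pool_size:]
        freq_high = sum(genotypes[i] for i in high_pool) / pool_size
        freq_low = sum(genotypes[i] for i in low_pool) / pool_size
        diffs.append(abs(freq_high - freq_low))
    return diffs


def empirical_threshold(diffs, alpha):
    """Return the (1 - alpha) empirical quantile of the simulated differences."""
    ordered = sorted(diffs)
    index = min(len(ordered) - 1, int((1 - alpha) * len(ordered)))
    return ordered[index]


null_diffs = null_pool_differences()
threshold_05 = empirical_threshold(null_diffs, alpha=0.05)
threshold_001 = empirical_threshold(null_diffs, alpha=0.001)
```

An observed pool difference above the chosen threshold would then be treated as evidence for a linked QTL at that marker.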
QTL: Lz x Ler F2
XAM: Lz x Col F2 (Werner et al Genetics 2006)
eXtreme Array Fine Mapping
RED2 QTL interval: ~2Mb, ~8cM, >400 SFPs, flanked by mark1 and mark2; Col (Low) X Kas (High)
Approximate plants per two-marker genotype class:
• Kas/Kas ~268, Col/Col ~268
• het/het ~539
• het at one marker, homozygous at the other: ~43 each
• Kas/Col and Col/Kas: ~2 each
Select recombinants by PCR: >200 from >1250 plants
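The approximate genotype-class counts on this slide (~268, ~539, ~43, ~2) match the standard F2 expectations for two linked markers. The sketch below computes them. The recombination fraction r = 0.074 and the population size of 1250 are assumptions chosen because they reproduce the slide's rounded numbers; the slide itself states only ~8cM and >1250 plants.

```python
def f2_two_marker_expected(n_plants, r):
    """Expected F2 plants per genotype class at two markers.

    r is the recombination fraction between the markers. Each gamete is
    parental with probability (1 - r) / 2 per type and recombinant with
    probability r / 2 per type.
    """
    nr = 1.0 - r
    return {
        # Kas/Kas or Col/Col at both markers (2 such classes)
        "parental_homozygote_each": n_plants * nr ** 2 / 4,
        # heterozygous at both markers (1 class)
        "double_heterozygote": n_plants * (nr ** 2 + r ** 2) / 2,
        # heterozygous at one marker, homozygous at the other (4 classes)
        "single_recombinant_each": n_plants * r * nr / 2,
        # Kas/Kas at one marker, Col/Col at the other (2 classes)
        "double_recombinant_homozygote_each": n_plants * r ** 2 / 4,
    }


expected = f2_two_marker_expected(n_plants=1250, r=0.074)
rounded = {name: round(count) for name, count in expected.items()}
```

Only plants carrying a recombinant chromosome within the interval are informative for fine mapping, which is why they are picked out by PCR at the flanking markers before array genotyping.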
SFP haplotype diversity
[Figure, panels A and B: SFP haplotypes across ~500kb of Chromosome 1 in Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd]
SFP d-statistic
[Figure: d-statistic distribution, null (permutation) vs. actual. Labels: 208,729; 32,427 calls; Not Col; Col; NA; duplications; 12,250 SFPs]
[Figure, panels A to D: 3 pollen allergen like proteins; RPS4 R gene cluster]
Diversity and Selection: R-genes
[Figure, panels A and B: Selection and Diversity; histogram of frequency (0 to 70) over bins of a Tajima's D like statistic, from (-1,-0.8] to (0.6,0.8]]
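The slide does not define its "Tajima's D like statistic" for SFP data. For orientation only, here is a sketch of the standard Tajima's D computed from a 0/1 polymorphism matrix (rows are accessions, columns are sites). It is a stand-in for the textbook statistic, not the statistic plotted on the slide, and the toy matrices are invented for illustration.

```python
import math


def tajimas_d(haplotypes):
    """Standard Tajima's D from equal-length 0/1 rows, one row per accession.

    Returns None when there are fewer than 4 accessions or no segregating
    sites, where the statistic is undefined or degenerate.
    """
    n = len(haplotypes)
    if n < 4:
        return None
    counts = [sum(column) for column in zip(*haplotypes)]
    segregating = [k for k in counts if 0 < k < n]
    s = len(segregating)
    if s == 0:
        return None
    # Mean number of pairwise differences across all pairs of accessions.
    pi = sum(k * (n - k) for k in segregating) / (n * (n - 1) / 2)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy data for 8 accessions and 5 sites (invented, not from the slides).
rare_variants = [[1 if i == j else 0 for j in range(5)] for i in range(8)]
balanced_variants = [[1] * 5 if i < 4 else [0] * 5 for i in range(8)]
d_rare = tajimas_d(rare_variants)
d_balanced = tajimas_d(balanced_variants)
```

An excess of rare variants gives a negative value and an excess of intermediate-frequency variants gives a positive value, which is the contrast the histogram bins on the slide are organized around.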
Potential Deletions
• >500 potential deletions
• 45 confirmed by Ler sequence
• 23 (of 114) transposons
• Disease Resistance (R) gene clusters
• Single R gene deletions
• Genes involved in Secondary metabolism
• Unknown genes
Natural Copy Variation on Tiling Arrays Segregating self seed from wild ME isolate (Early – Late)
FLM natural deletion
Potential Deletions Suggest Candidate Genes
[Figure: FLOWERING1 QTL along Chr1 (bp), with FLM marked]
Flowering Time QTL caused by a natural deletion in FLM (Werner et al PNAS 2005)
Het Fast Neutron deletions FKF1 80kb deletion CHR1 cry2 10kb deletion CHR1
NaturalVariation.org
USC: Magnus Nordborg, Paul Marjoram
Max Planck: Detlef Weigel
Scripps: Sam Hazen
University of Michigan: Sebastian Zoellner
University of Chicago: Xu Zhang, Yan Li, Peter Roycewicz, Evadne Smith, Megan Dunning, Joy Bergelson
Michigan State: Shinhan Shiu
Purdue: Ivan Baxter