SNP/Tiling arrays for very high density marker based breeding and QTL candidate gene identification. Justin Borevitz, Ecology & Evolution, University of Chicago. http://naturalvariation.org/
Major Issues in Breeding Complex Traits • High throughput Phenotyping • Physiological dissection of 1000s of correlated traits • Biological Variation • Multiple genes under major QTL • High Density markers • High throughput seedling screens • Linkage Drag • Environmental Interaction (GxE) • Good for optimizing local varieties • Epistasis (GxG) • Magnify minor QTL in local backgrounds • Multi species ecological interactions • “extended phenotype”
Genomic Breeding Path [Flow diagram: Experimental Design, Mapping population, Marker Identification, Genotyping, Phenotyping, QTL Analysis, Fine Mapping, Candidate gene (Polymorphisms, gene expression, loss of function), QTL gene Confirmation. A second panel, "Genomics path", lists Experimental Design, Mapping population, Phenotyping, QTL Analysis, Fine Mapping, QTL gene Confirmation.] Borevitz and Chory, COPB 2003
Talk Outline • Phenotyping in multiple environments • Seasonal Variation in the Lab • Germplasm Diversity • Population structure, Haplotype Mapping set • SNP/Tiling microarrays • Very High Density Markers • Mapping Extreme Bulk Segregant • Expression, splicing, and allelic variation • Ecological context • Arabidopsis and Aquilegia
Begin with regions spanning the Native Geographic range: Lund, Sweden and Tossa de Mar, Spain (Nordborg et al PLoS Biology 2005; Li et al PLoS ONE 2007)
Seasons in the Growth Chamber (Sweden, Spain) • Changing Day length • Cycle Light Intensity • Cycle Light Colors • Cycle Temperature. Geneva Scientific/ Percival
Solar Calc II • Kurt Spokas • Version 2.0a June 2006 • USDA-ARS Website Midwest Area (Morris,MN) • http://www.ars.usda.gov/mwa/ncscrl
Flowering time QTL, Kas/Col RILs [Figure: panels Sweden 1, Sweden 2, Spain 1, Spain 2; axis "Number of RILs"; parental lines Col-gl1 and Kas1 marked; loci labeled FLM and FRI]
Kas/Col flowering time QTL GxE: Chr1 FLM, Chr4 FRI
Global and Local Population Structure Olivier Loudet
Local Population Structure: common haplotypes. 144 non-singleton SNPs, >2000 accessions (Global, Midwest, and UK). Megan Dunning, Yan Li
Diversity within and between populations 80 Major Haplotypes
Universal Whole Genome Array. DNA: Chromatin Immunoprecipitation (ChIP chip); Methylation; Polymorphism (SFPs) Discovery/Genotyping; Comparative Genome Hybridization (CGH): Insertion/Deletions, Copy Number Polymorphisms. RNA: Gene/Exon Discovery, Gene model correction, Non-coding/micro-RNA; Alternative Splicing; Antisense transcription; Transcriptome Atlas: Expression levels, Tissue specificity; RNA Immunoprecipitation (RIP chip); Allele Specific Expression. Control for hybridization/genetic polymorphisms to understand TRUE expression variation
Improved Genome Annotation [Figure: tracks along Chromosome (bp) labeled ORFa, ORFb, Transcriptome Atlas, start, AAAAA, deletion, M, SFP, SNP, conservation]
Which arrays should be used? BAC array, cDNA array, Long oligo array
Which arrays should be used? Gene array, Exon array, Tiling array (35bp tile, 25mers, 10bp gaps)
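As a rough illustration of the tiling geometry named on this slide (one 25-mer probe per 35bp tile, leaving 10bp uncovered between consecutive probes), the sketch below lists probe coordinates along a sequence. The function name `tile_probes` and the choice to start at position 0 are assumptions made for illustration, not details of the actual array design.

```python
def tile_probes(seq_len, probe_len=25, step=35):
    """Return (start, end) pairs, 0-based half-open, for probes tiled along a sequence.

    Defaults follow the slide: 25-mers spaced every 35bp, so consecutive
    probes are separated by a 10bp gap. Starting at position 0 is an assumption.
    """
    probes = []
    start = 0
    # Only keep probes that fit entirely inside the sequence.
    while start + probe_len <= seq_len:
        probes.append((start, start + probe_len))
        start += step
    return probes


if __name__ == "__main__":
    # Hypothetical 120bp region, used only to show the spacing.
    for start, end in tile_probes(120):
        print(start, end)
```

With these defaults each probe ends 10bp before the next one begins, which is the gap quoted on the slide.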
Which arrays should be used? SNP array; Resequencing array; Tiling/SNP array (2007): 250k SNPs, 1.6M tiling probes. How about multiple species? Microbial communities? Pst, Psm, Psy, Psx, Agro, Xanthomonas, H parasitica, 15 viruses
Global Allele Specific Expression: 65,000 SNPs transcribed (accession pairs); 12,000 genes with >= 1 SNP; 6,000 with >= 2 SNPs. Zhang, X., Richards, E., Borevitz, J. Current Opinion in Plant Biology (2007)
SFP detection on tiling arrays
Delta   p0     FALSE   Called   FDR
1.00    0.95   18865   160145   11.2%
1.25    0.95   10477   132390   7.5%
1.50    0.95   6545    115042   5.4%
1.75    0.95   4484    102385   4.2%
2.00    0.95   3298    92027    3.4%
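The slide does not state how the FDR column was computed, but the numbers are consistent with FDR = p0 × FALSE / Called, where p0 is the assumed fraction of truly non-polymorphic features. The sketch below recomputes the column under that assumption; the function name `estimated_fdr` and the reading of FALSE as the number of calls expected under the null are assumptions, not statements from the talk.

```python
def estimated_fdr(false_calls, called, p0):
    """Estimated false discovery rate, assuming FDR = p0 * FALSE / Called."""
    return p0 * false_calls / called


# (Delta, p0, FALSE, Called) exactly as listed on the slide.
ROWS = [
    (1.00, 0.95, 18865, 160145),
    (1.25, 0.95, 10477, 132390),
    (1.50, 0.95, 6545, 115042),
    (1.75, 0.95, 4484, 102385),
    (2.00, 0.95, 3298, 92027),
]

if __name__ == "__main__":
    for delta, p0, false_calls, called in ROWS:
        fdr = estimated_fdr(false_calls, called, p0)
        print(f"Delta {delta:.2f}: FDR {100 * fdr:.1f}%")
```

Raising the Delta threshold reduces both the number of SFPs called and the estimated share of false calls among them.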
Chip genotyping of a Recombinant Inbred Line 29kb interval
Map bibb: 100 bibb mutant plants; 100 wt mutant plants
bibb mapping (ChipMap): Bulk segregant mapping using chip hybridization. bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1 (AS1)
BIBB = ASYMMETRIC LEAVES1. AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64cM. Sequenced AS1 coding region from bib-1 … found g -> a change that would introduce a stop codon in the MYB domain. Alleles: bib-1 W49*, as1-101 Q107* [Figure labels: as1, bibb, as1-101, MYB]
Array Mapping [Figure: panels chr1, chr2, chr3, chr4, chr5] Hazen et al Plant Physiology (2005)
eXtreme Array Mapping 15 tallest RILs pooled vs 15 shortest RILs pooled
[Figure: Chromosome 2, LOD (0 to 16) against position (0 to 100 cM), comparing Composite Interval Mapping and eXtreme Array Mapping; RED2 QTL, 12cM] Allele frequencies determined by SFP genotyping. Thresholds set by simulations. Red light QTL RED2 from 100 Kas/Col RILs (Wolyn et al Genetics 2004)
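The idea behind pooling the extremes can be sketched as a per-marker allele frequency contrast between the tallest and shortest lines. In the talk the pools are hybridized to arrays and allele frequencies are estimated from SFP signal; the sketch below instead assumes individual 0/1 genotypes are already known, purely to show the contrast being estimated. The function name, the 0 = Col / 1 = Kas coding, and the toy data are all hypothetical.

```python
def pool_allele_freq_diff(genotypes, phenotypes, n_extreme=15):
    """Per-marker allele frequency difference between the two extreme pools.

    genotypes: one list per line, holding 0 (Col) or 1 (Kas) at each marker.
    phenotypes: one trait value per line (e.g. hypocotyl length).
    Returns freq(Kas) in the high pool minus freq(Kas) in the low pool.
    """
    order = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    low, high = order[:n_extreme], order[-n_extreme:]
    diffs = []
    for m in range(len(genotypes[0])):
        f_high = sum(genotypes[i][m] for i in high) / n_extreme
        f_low = sum(genotypes[i][m] for i in low) / n_extreme
        diffs.append(f_high - f_low)
    return diffs


if __name__ == "__main__":
    # Toy data: six lines, two markers; only marker 0 tracks the phenotype.
    toy_genotypes = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1]]
    toy_phenotypes = [10, 9, 1, 2, 5, 4]
    print(pool_allele_freq_diff(toy_genotypes, toy_phenotypes, n_extreme=2))
```

A marker linked to the QTL shows a large difference between pools, while unlinked markers stay near zero; significance thresholds would still have to come from simulation, as the slide notes.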
QTL: Lz x Ler F2. XAM: Lz x Col F2 (Werner et al Genetics 2006)
eXtreme Array Fine Mapping: RED2 QTL interval of ~2Mb (~8cM) with >400 SFPs, flanked by mark1 and mark2; Col (Low) X Kas (High). Select recombinants by PCR: >200 from >1250 plants.
Approximate number of plants per two-marker genotype class:
Col/Col ~268    Kas/Kas ~268    het/het ~539
Col/het, het/Col, Kas/het, het/Kas: ~43 each
Col/Kas, Kas/Col: ~2 each
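The class counts on this slide are consistent with what an F2 of about 1250 plants would be expected to show for two markers 8cM apart, if map distance is converted to recombination fraction with the Haldane map function. The slide does not say how the numbers were derived, so the sketch below is an assumption-laden reconstruction; the function names and dictionary keys are hypothetical.

```python
import math


def haldane_r(d_morgans):
    """Recombination fraction for a map distance in Morgans (Haldane map function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def f2_class_counts(n_plants, d_cm):
    """Expected plants per two-marker genotype class in an F2 population."""
    r = haldane_r(d_cm / 100.0)
    return {
        # Col/Col or Kas/Kas at both markers (each class)
        "parental_hom": n_plants * (1 - r) ** 2 / 4,
        # homozygous at one marker, het at the other (each of four classes)
        "single_recomb": n_plants * r * (1 - r) / 2,
        # het at both markers
        "double_het": n_plants * ((1 - r) ** 2 + r ** 2) / 2,
        # Col at one marker, Kas at the other (each of two classes)
        "double_recomb_hom": n_plants * r ** 2 / 4,
    }


if __name__ == "__main__":
    for name, count in f2_class_counts(1250, 8.0).items():
        print(f"{name}: ~{round(count)}")
```

Only the plants that are homozygous at one marker and heterozygous or oppositely homozygous at the other carry a visible crossover inside the interval, which is why a PCR screen with the two flanking markers can pick recombinants out of the full population.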
Unite Genetic and Physical Map • Shotgun genomic or 454 reads • ESTs/ cDNAs/ BAC ends • 1000s of contigs • Genotype mapping population on arrays • Create very high density genetic map • Known position of genes/contigs allows QTL candidate gene identification • Control hybridization variation for gene expression
Potential Deletions • >500 potential deletions • 45 confirmed by Ler sequence • 23 (of 114) transposons • Disease Resistance (R) gene clusters • Single R gene deletions • Genes involved in Secondary metabolism • Unknown genes
Potential Deletions Suggest Candidate Genes [Figure: FLOWERING1 QTL along Chr1 (bp); labels FLM natural deletion, MAF1] Flowering Time QTL caused by a natural deletion in FLM (Werner et al PNAS 2005)
Het Fast Neutron deletions: FKF1 80kb deletion (CHR1); cry2 10kb deletion (CHR1)
Ecological and Evolutionary context • Abiotic conditions • Light, temperature, humidity • Soil, water • Biotic conditions • Pathogens and pollinators • Conspecifics, grasses, shrubs, trees. Industrial Agriculture -> Sustainable EcoAgriculture. Green, Super Hybrids!
Seasonal Variation Matt Horton Megan Dunning
Aquilegia (Columbines) Recent adaptive radiation, 350Mb genome
Aquilegia (Columbine), NSF Genome Complexity • Microarray floral development • QTL candidates • Physical Map (BAC tiling path) • Physical assignment of ESTs • QTL for pollinator preference • ~400 RILs, map abiotic stress • QTL fine mapping/ LD mapping • Develop transformation techniques • VIGS • Whole Genome Sequencing (JGI 2007). Scott Hodges (UCSB), Elena Kramer (Harvard), Magnus Nordborg (USC), Justin Borevitz (U Chicago), Jeff Tompkins (Clemson)
NaturalVariation.org. USC: Magnus Nordborg, Paul Marjoram. Max Planck: Detlef Weigel. Scripps: Sam Hazen. University of Michigan: Sebastian Zoellner. University of Chicago: Xu Zhang, Yan Li, Peter Roycewicz, Evadne Smith, Megan Dunning, Joy Bergelson. Michigan State: Shinhan Shiu. Purdue: Ivan Baxter