Genome-Wide SNP Genotyping in Grape – What is Next?
Part of National Genetic Trait Index Project, CRIS# 1907-21000-030-00D
USDA-ARS Geneva, Cornell, Davis, Cold Spring Harbor
Acknowledgement: Sean Myles/Doreen Ware
Team
• Edward Buckler and Sean Myles – Genomics and statistical analysis
• Doreen Ware, Jer-Ming Chia, Bonnie Hurwitz – Bioinformatics
• Charles Simon, Gan-Yuan Zhong, Mallikarjuna Aradhya, Bernard Prins – Germplasm
Genus Vitis
• Contains over 60 species, mostly found in temperate regions of the Northern Hemisphere (both Old and New World distributions) – ~3500 accessions
• Vitis vinifera is the most important domesticated species, cultivated for table grapes and wine making (~1300 accessions)
• The wild grape Vitis sylvestris is considered the progenitor of the domesticated grape
• Highly heterozygous, with low linkage disequilibrium (LD extends only ~200 bp)
Genetic Diversity in the Domesticated Grape
[Figure: genetic diversity illustrated by variation in cluster density, cluster size, berry shape, and berry size]
Objectives
1. Grape as a Model Crop for National Genetic Trait Index (NGTI)
2. Characterization of Molecular Diversity – Functional Variability
3. Genome-wide Association Mapping
4. Identify Markers Associated With Economic Traits
5. Develop Strategies for Marker Assisted Breeding – Juvenile Selection in Perennial/Tree Crops
Steps
• Step 1: SNP Discovery – Next-Generation Sequencing to sample diversity
  • DNA preparation, sequencing method and analysis of sequencing reads for variation
  • Characterization of SNPs: position, allele support, and coverage
  • 10k SNP array development
• Step 2: Genotype and Assemble Data for Analysis
• Step 3: Phenotyping
Step 1: Discovery of genetic variants (SNPs)
• Diverse samples
  • 10 cultivated Vitis varieties (Vitis vinifera)
  • 6 wild Vitis species
• Genome complexity reduction
  • Digestion with HpaII restriction enzyme
• Illumina/Solexa sequencing
  • Sequencing by synthesis
  • 60 million sequences
  • Total: 2 billion base pairs of sequence
• Discovery of >1 million SNPs
• Make data available
  • Integrate SNP data into public grape genome browser
SNP Discovery Panel
• Goal: Capture recent variation in the genus Vitis
• RRLs constructed from 10 domesticated cultivars and 6 wild species
• Domesticated cultivars: Ehrenfelser, French Colombard, Gewurztraminer, Kadarka, Malvasia, Muscat of Alexandria, Pinot Noir, Plavac Mali, Thompson Seedless, White Riesling
• Wild species: Vitis amurensis, Vitis cinerea, Vitis labrusca, Vitis palmata, Vitis rotundifolia, Vitis sylvestris
• Reference genome: inbred Pinot Noir
Library Construction Protocol: Reducing the Complexity of the Genome
1. DNA extraction
2. Whole genome amplification*
3. Genome complexity reduction: restriction enzyme digest
4. Addition of 'A' base to 3' ends
5. Ligation of Solexa adaptors
6. Size selection from gel: 100-600 bp
7. Solexa Genome Analyzer
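The complexity-reduction and size-selection steps can be illustrated with an in silico digest. This is a minimal sketch, not the project's protocol: it assumes the HpaII recognition site CCGG (cut after the first C), ignores HpaII's sensitivity to methylation, and runs on a made-up toy sequence. The function names `digest` and `size_select` are hypothetical.

```python
import re

HPAII_SITE = "CCGG"  # HpaII recognition sequence
CUT_OFFSET = 1       # HpaII cuts after the first C (C^CGG)


def digest(sequence, site=HPAII_SITE, cut_offset=CUT_OFFSET):
    """Cut `sequence` at every occurrence of `site` and return the fragments."""
    seq = sequence.upper()
    # A lookahead pattern reports every site start, including overlapping ones.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len=100, max_len=600):
    """Keep fragments in the gel size window (100-600 bp on the slide)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence (assumption, not real grape DNA): three HpaII sites
# separated by AT-repeat filler of different lengths.
toy = "AT" * 25 + "CCGG" + "AT" * 75 + "CCGG" + "AT" * 350 + "CCGG" + "AT" * 10

fragments = digest(toy)
selected = size_select(fragments)
print([len(f) for f in fragments])
print([len(f) for f in selected])
```

Only fragments inside the size window would be carried forward to sequencing, which is how the digest reduces the fraction of the genome sampled.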
Next-Generation Sequence Analysis Workflow
[Flowchart: Image files from Solexa GA → Base calling (Firecrest, Bustard) → Sequence and base quality → Read mapping by ungapped alignment → Mapped to genome? (YES: alignments; NO: gapped alignment) → Alignments and data storage → Alignment consensus and quality → Variation discovery → Filters → Called SNPs → Variation data accessibility]
Deciphering Genetic Diversity From High-Throughput Sequencing
Overview of the Solexa SNP pipeline
• 56 million reads (1.8 billion bp) are aligned to the reference genome
  • The divergence within V. vinifera and with other Vitis is so great that we need to develop other algorithms to map the reads
• 1.1 million regions of the genome have potential SNPs, which are statistically evaluated for genotypic basis
• 50,000 high probability SNPs are identified
• Empirically validating a small subset of the data
• With improved algorithms and increased knowledge of grape diversity, we may be able to extract 100,000s of SNPs
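The narrowing from candidate regions to high-probability SNPs can be pictured as a filter on coverage and allele support at each position. The sketch below is an assumption-laden illustration, not the statistical evaluation used in the pipeline: the function `call_snp`, its thresholds, and the toy pileup are all hypothetical.

```python
from collections import Counter


def call_snp(ref_base, read_bases, min_coverage=4, min_alt_reads=2,
             min_alt_fraction=0.2):
    """Return a candidate SNP record for one site, or None if it fails a filter.

    `read_bases` is a string of the bases observed in reads covering the site.
    All thresholds are illustrative assumptions.
    """
    ref = ref_base.upper()
    counts = Counter(b for b in read_bases.upper() if b in "ACGT")
    coverage = sum(counts.values())
    if coverage < min_coverage:
        return None  # too few reads to say anything
    alt_counts = {b: n for b, n in counts.items() if b != ref}
    if not alt_counts:
        return None  # every read matches the reference
    alt, support = max(alt_counts.items(), key=lambda kv: kv[1])
    if support < min_alt_reads or support / coverage < min_alt_fraction:
        return None  # alternate allele is too weakly supported
    return {"ref": ref, "alt": alt, "coverage": coverage, "alt_support": support}


# Toy pileup (assumption): (chromosome, position) -> (reference base, read bases)
pileup = {
    ("chr1", 1042): ("A", "AAAGGGAA"),  # 3 of 8 reads support G
    ("chr1", 2077): ("C", "CCCCCCT"),   # a single T read, likely an error
    ("chr2", 15): ("T", "TG"),          # coverage too low
}

calls = {site: call_snp(ref, bases) for site, (ref, bases) in pileup.items()}
for site, call in calls.items():
    print(site, call)
```

Each retained record carries the three properties listed for SNP characterization in Step 1: position (the dictionary key), allele support, and coverage.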
Mapping statistics of reads from each germplasm sample to the reference Vitis genome
10K SNPs: Consequence within Genomic Sequence
• SNP consequence data obtained by integrating SNP calls with the genome annotation through Ensembl
• Selected 10K SNPs are enriched for genic SNPs
• In contrast, 46% of the genome is genic space and 41% is repetitive/transposable elements
Step 2: Genotyping the grape germplasm repository
• SNP selection
  • Choose 10,000 high quality SNPs from the 500,000 Solexa SNPs
• 10K SNP chip
  • Production of custom 10,000 (8898) SNP genotyping array
• Genotype the germplasm repository
  • 1200 cultivated accessions (Vitis vinifera)
  • 1000 wild accessions
  • 21 million genotypes
• Analyses
  • Establish core germplasm collection
  • Identify synonyms and homonyms
  • Association mapping
  • Estimate population genetic parameters
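One way to flag putative synonyms (the same genotype held under different names) is pairwise identity by state (IBS) across the array SNPs. The sketch below assumes genotypes coded as 0, 1, or 2 copies of one allele, with `None` for missing calls. The accession labels, the 0.99 threshold, and the function names are hypothetical, and the project's actual method may differ.

```python
from itertools import combinations


def ibs_similarity(g1, g2):
    """Mean proportion of alleles shared identical-by-state over SNPs called in both."""
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs with a missing call in either accession
        shared += 2 - abs(a - b)  # 2, 1, or 0 alleles shared at this SNP
        compared += 1
    return shared / (2 * compared) if compared else float("nan")


def putative_synonyms(genotypes, threshold=0.99):
    """Return accession pairs whose IBS similarity is at or above `threshold`."""
    pairs = []
    for (n1, g1), (n2, g2) in combinations(genotypes.items(), 2):
        if ibs_similarity(g1, g2) >= threshold:  # NaN compares False, so it is skipped
            pairs.append((n1, n2))
    return pairs


# Toy genotypes (assumption): three accessions scored at six SNPs.
genotypes = {
    "accession_1": [0, 1, 2, 1, 0, 2],
    "accession_2": [0, 1, 2, 1, None, 2],  # identical to accession_1 where called
    "accession_3": [2, 1, 0, 0, 1, 2],
}

print(putative_synonyms(genotypes))
```

In practice the threshold would need to allow for genotyping error, and homonyms would show up as the opposite case: accessions sharing a name but with low IBS similarity.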
PCA of array-scored SNPs shows clustering of the different germplasm
[Figure: PCA plot with labeled groups including Eurasian wild Vitis and American wild Vitis]
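The kind of clustering shown in the PCA plot can be reproduced on simulated data. This is a minimal sketch assuming NumPy is available and genotypes are coded 0/1/2 with no missing calls. The two simulated groups and their allele frequencies are invented for illustration and do not represent the grape data.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA of a samples x SNPs genotype matrix via SVD of the column-centered data."""
    G = np.asarray(genotypes, dtype=float)
    centered = G - G.mean(axis=0)  # center each SNP at its mean genotype
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates on the PCs
    explained = S**2 / np.sum(S**2)                  # fraction of variance per PC
    return scores, explained[:n_components]


# Simulated data (assumption): two groups of 10 samples at 200 SNPs,
# with very different allele frequencies in each group.
rng = np.random.default_rng(0)
group_a = rng.binomial(2, 0.1, size=(10, 200))
group_b = rng.binomial(2, 0.9, size=(10, 200))
G = np.vstack([group_a, group_b])

scores, explained = genotype_pca(G)
print(scores[:, 0].round(1))  # PC1 coordinates: the two groups fall on opposite sides
print(explained.round(2))
```

With real array data, missing genotypes would have to be imputed or masked before the SVD, and SNPs are often scaled by their expected variance as well.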
MDS plot: Vitis vinifera, 6907 SNPs
• Error or biologically interesting?
Phenotyping Economic Traits/Key Secondary Metabolites of Grapes
• Phenotyping the USDA-ARS Vitis collections will be the next critical step for maximizing the value of the current genotyping effort
• A pilot project has been initiated for phenotyping key secondary metabolites of the Vitis collections from both Davis, CA and Geneva, NY
• About 400 V. vinifera and 200 North American collections will be phenotyped for 50 different phenolics, including anthocyanins
[Figure: HPLC-DAD chromatograms at 525 nm, 365 nm, and 280 nm, profiling anthocyanins (525 nm) and other phenolics in grapes]