200 likes | 209 Views
The Gramene Genetic Diversity database specializes in storing and analyzing large-scale SNP and indel genotype datasets and their accompanying phenotype measurements. Discover associations between genes and traits in maize, rice, Arabidopsis, wheat, and sorghum.
E N D
Introduction to the Gramene Genetic Diversity module 5/2010 Build #31
The Gramene Genetic Diversity database module specializes in storage of data sets that study genetic variation and genotype-phenotype association in populations of plants.
Our focus: • Storage of large-scale SNP and indel genotype datasets and accompanying phenotype measurements • Facilitate discovery of associations between genes and traits • Species of primary interest: maize, rice, Arabidopsis. We also house some important wheat and sorghum data Maize Rice Arabidopsis Sorghum Wheat
To find the Genetic Diversity module:- Go to: www.gramene.org, and click on one of the Diversity links- Or, navigate go there directly: www.gramene.org.diversity Click on “Diversity” link Click on “Diversity” thumbnail
What’s on the Diversity main page Browse data Access latest Diversity datasets Large-scale SNP datasets Browse through SSR, RFLP and other smaller scale diversity or QTL studies here
What’s on the Diversity main page Search data Quick search for germplasm or markers SNP Query Tool for mining SNP data
What’s on the Diversity main page Downloading data Download subsets of SNP data by using the SNP Query tool (more about this in a few slides) Download SNP data in hapmap, plink and flapjack format. Click on the DOWNLOAD link and we’ll take a look at the Download Data page on the next slide… All download options, including full Diversity MySQL db dumps Launch Diversity datasets live in TASSEL. Download graphs and results from analyses
SNP download page www.gramene.org/diversity/download_data.html • The large-scale SNP datasets are offered for download in hapmap, plink (.map and .ped files) and flapjack format. The plink files are loadable in the PLINK analysis program (pngu.mgh.harvard.edu/~purcell/plink) and the flapjack project files are loadable in Flapjack visualization tool (bioinf.scri.ac.uk/flapjack).
Genotype • The Diversity module stores all of its data in MySQL using the GDPDM database model. • GDPDM links genotype, phenotype, germplasm, field and environment information in one resource. • GDPDM v 4.0 is optimized for efficient handling of large scale SNP data by using BLOBs (binary database objects) • For full GDPDM documentation, go here: http://www.maizegenetics.net/gdpdm/documentation_list.html Field/Plant information Germplasm Phenotype
SNP QUERY Web-based tool for mining SNP datasets in Gramene Genetic Diversity
Features of the ‘SNP Data’ area of Diversity: SNP QUERY TOOL Click SEARCH link to go to SNP Query
Select a species Select a dataset to load
Select subset of plants in the experiment by holding down the SHIFT, ALT, or APPLE-CMD key while clicking. The default (selecting none) will return data for all accessions. Select chromosome Optionally enter start and stop coordinate for range of interest Select output format. ‘Text’ and ‘HTML’ will return the results in the browser, ‘File download’ will return results as a downloadable csv text file. Finally, click Submit
Here’s what the HTML results of SNP Query look like: Hyperlinked to Gramene Genomes Ensembl browser You can save the results to your hard disk as text (some browsers allow you to save in HTML), or you can redo your query and select File Download option
TASSEL Java program for evaluating trait associations, evolutionary patterns, and linkage disequilibrium
TASSEL Launch Diversity SNP datasets with Java web-start Click on the LAUNCH TASSEL link to access the web-start links
TASSEL Launch Diversity SNP datasets with Java web-start www.gramene.org/diversity/tassel_launch.html When you click on one of the hyperlinks, each of which represent one chromosome’s worth of SNP data, a dialog box should appear asking you if you want to open Java Web Start to open the file – click OK. TASSEL will take a minute or two to launch…
Once TASSEL launches, here is what you should see: Notice the dataset tag ‘chr1’ appears in the left bar. Click on ‘chr1’ to see the data….
When you click ‘chr1’ in left Data column, the data will be displayed in main window:
With TASSEL you can perform many analyses. Please see TASSEL documentation for much more information: maizegenetics.net/tassel PCA of genotypes Phylogeny LD Diversity sliding window LD v Distance