Resources at HapMap.Org Tutorial Marcela K. Tello-Ruiz Cold Spring Harbor Laboratory
HapMap Phase II Dataset • Release #21a, January 2007 (NCBI build 35) • 3.8 M genotyped SNPs => 1 SNP/700 bp • [Figure: # polymorphic SNPs/kb in consensus dataset] Source: International HapMap Consortium (2007). Nature 449:851-861
Goals of This Tutorial This tutorial will show you how to: • Find HapMap SNPs near a gene or region of interest (ROI) • View patterns of LD in the ROI • Select tag SNPs in the ROI • Download information on the SNPs in ROI for use in Haploview • Add custom tracks of association data • Create publication-quality images • Generate customized extracts of the entire data set • Download the entire data set in bulk
Finding HapMap SNPs in a Region of Interest • Find the TCF7L2 gene • Identify the characterized SNPs in the region • View the patterns of LD (NCBI b35) • Pick tag SNPs (NCBI b35) • Download the region in Haploview format • Upload your own annotations & superimpose on the HapMap • Make a customized image for publication • View GWA hits & OMIM annotations in the region (NCBI b36)
HapMap Glossary • LD (linkage disequilibrium): For a pair of SNP alleles, a measure of deviation from random association (LD persists where there has been little recombination between the SNPs). Measured by D’, r², LOD • Phased haplotypes: Estimated arrangement of SNP alleles along each chromosome. Alleles transmitted from the mother form the maternal haplotype, while those from the father form the paternal haplotype. • Tag SNPs: Minimum set of SNPs needed to identify a haplotype. r² = 1 indicates that two SNPs are redundant, so either one “tags” the other. • Questions? help@hapmap.org
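The LD measures named in the glossary can be illustrated with a small calculation. The sketch below is only an illustration of the standard definitions of D, D’ and r² for two biallelic SNPs; the function name `ld_stats` and the input layout (one two-character string per phased chromosome) are assumptions made for this example, not part of the HapMap tools.

```python
from collections import Counter


def ld_stats(haplotypes):
    """Return (D, D', r^2) for two biallelic SNPs.

    haplotypes: list of 2-character strings, one per phased chromosome,
    e.g. "AG" means allele A at SNP 1 and allele G at SNP 2.
    (Hypothetical input layout chosen for this sketch.)
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    a, b = haplotypes[0][0], haplotypes[0][1]  # arbitrary reference alleles

    p_a = sum(c for h, c in counts.items() if h[0] == a) / n
    p_b = sum(c for h, c in counts.items() if h[1] == b) / n
    p_ab = counts[a + b] / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")

    # D: deviation of the observed haplotype frequency from random association
    d = p_ab - p_a * p_b
    # D' rescales |D| by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / denom
    return d, d_prime, r2


# Made-up example: only three of the four possible haplotypes are observed,
# so D' is 1 while r^2 stays below 1 (neither SNP fully "tags" the other).
d, d_prime, r2 = ld_stats(["AG"] * 4 + ["AT"] * 2 + ["CT"] * 4)
print(round(d_prime, 3), round(r2, 3))
```

The example highlights why the glossary ties tag SNPs to r² rather than D’: two SNPs can show complete D’ and still not be interchangeable.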
1: Surf to the HapMap Browser 1a. Go to www.hapmap.org 1b. Select “HapMap Genome Browser B35” • NCBI b35: full dataset (includes LD patterns) • NCBI b36: latest, new tracks (e.g., GWA hits)
2: Search for TCF7L2 Search for a gene name, a chromosome band, or a phrase like “insulin receptor” 2. Type search term – “TCF7L2”
3: Examine Region Chromosome-wide summary data is shown in the overview. Region view puts your ROI in genomic context. This exonic region has many typed SNPs. Click on the ruler to re-center the image. Default tracks show HapMap genotyped SNPs, refGenes with exon/intron splicing patterns, etc.
3: Examine Region (cont) Use the Scroll/Zoom buttons and menu to change position & magnification As you zoom in further, the display changes to include more detail
4: Turn on LD & Haplotype Tracks 4a: Scroll down to the “Tracks” section. Turn on the LD Plot and Haplotype Display tracks. 4b: Press “Update Image” These sections allow you to adjust the display and to superimpose your own data on the HapMap
5: View variation patterns Triangle plot shows LD values using r² or D’/LOD scores in one or more HapMap populations. Phased haplotype track shows all 120 chromosomes with alleles colored yellow and blue
7: Adjust Track Settings (on the spot) 7a. Click on question mark preceding track name 7b. Adjust population and display settings & press “Configure”
7: Adjust Track Settings (cont) Select the analysis track to adjust and press “Configure”
8: Turn on Tag SNP Track 8: Activate the “tag SNP Picker” and press “Update Image”
9: Adjust tag SNP picker Tag SNPs are selected on the fly as you navigate around the genome. 9a: Click on the question mark behind “tag SNP Picker”. Alternatively, you may select “Annotate tag SNP Picker” and press “Configure…”
9: Adjust tag SNP picker (cont) Select population Select tagging algorithm and parameters [optional] upload list of SNPs to be included, excluded, or design scores 9b: Press “Configure” to save changes
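To give a feel for what a pairwise tagging algorithm does, here is a minimal greedy sketch. It is not the tag SNP picker's or Tagger's actual implementation; the function name, the `r2` lookup table and the 0.8 threshold are all assumptions chosen for illustration.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Pick tag SNPs so every SNP has r^2 >= threshold with at least one tag.

    snps: list of SNP identifiers.
    r2:   dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2
          (hypothetical layout; missing pairs are treated as r^2 = 0).
    """
    def covers(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    untagged = set(snps)
    tags = []
    while untagged:
        # Greedy step: take the SNP that captures the most untagged SNPs
        # (ties are broken by the order of `snps`).
        best = max(snps, key=lambda s: sum(covers(s, t) for t in untagged))
        tags.append(best)
        untagged -= {t for t in untagged if covers(best, t)}
    return tags


# Made-up pairwise r^2 values for four SNPs
pairwise = {
    frozenset(("snp1", "snp2")): 1.0,
    frozenset(("snp2", "snp3")): 0.9,
    frozenset(("snp3", "snp4")): 0.3,
}
print(greedy_tag_snps(["snp1", "snp2", "snp3", "snp4"], pairwise))
# → ['snp2', 'snp4']
```

Raising the threshold yields more tags; lowering it yields fewer tags that capture the remaining SNPs less precisely.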
10: Generate Reports 10: Select the desired “Download” option and press “Go” or “Configure” • Available Downloads: • Individual Genotypes • Population Allele & Genotype frequencies • Pairwise LD values • Tag SNPs
10: Generate Reports (cont) The Genotype download format can be saved to disk or loaded directly into Haploview
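If you save the genotype download to disk, it can also be read with a short script. The sketch below assumes a whitespace-delimited layout with a header line starting with `rs#`, eleven metadata columns (SNP id first, position fourth), then one two-letter genotype per sample with `NN` for missing calls. That layout is an assumption for this example, so check it against the header of your own file; the sample rows are made up.

```python
from collections import Counter

N_META = 11  # assumed number of metadata columns before the sample columns

# Made-up rows in the assumed layout (not real HapMap data)
SAMPLE = """\
rs# alleles chrom pos strand assembly# center protLSID assayLSID panelLSID QCcode S1 S2 S3 S4
snp1 C/T chr10 1000 + b35 center1 prot1 assay1 panel1 QC+ CC CT CC NN
snp2 A/G chr10 2000 + b35 center1 prot1 assay1 panel1 QC+ AG AG GG AA
"""


def parse_genotypes(lines):
    """Yield (snp_id, position, genotypes) for each data row."""
    for line in lines:
        if not line.strip() or line.startswith("rs#"):  # skip blanks and header
            continue
        fields = line.split()
        yield fields[0], int(fields[3]), fields[N_META:]


def minor_allele_freq(genotypes):
    """Minor allele frequency of a biallelic SNP, ignoring missing ('N') alleles."""
    counts = Counter(allele for g in genotypes for allele in g if allele != "N")
    total = sum(counts.values())
    if total == 0:
        return None  # no called genotypes
    if len(counts) < 2:
        return 0.0   # monomorphic in these samples
    return min(counts.values()) / total


for snp_id, pos, genotypes in parse_genotypes(SAMPLE.splitlines()):
    print(snp_id, pos, minor_allele_freq(genotypes))
```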
10: Generate Reports (cont) The tag SNP download is the same as you get from TAGGER …
11: Create your own tracks Example: • Interested in T2DM genetics • Create file with custom annotations from http://www.broad.mit.edu/diabetes and superimpose on the HapMap 11: Upload example file: TCF7L2_annotations.txt Detailed help on the format is under the “Help” link
11: Create your own tracks (cont) Formatted data for the T2DM association results (score is -log10 of the p-value). Some SNPs were typed (known platform) and others were imputed; format data for both typed & imputed SNPs. Save as a text file!
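The score conversion mentioned above is easy to script. This sketch only shows how p-values become -log10 scores and how typed and imputed SNPs can be kept apart; the tab-delimited column layout and the function name are hypothetical and are not the browser's upload format, which is described under the “Help” link.

```python
import math


def association_rows(results):
    """Turn (snp_id, position, p_value, source) tuples into tab-delimited rows.

    source is "typed" or "imputed". The output columns are a hypothetical
    layout for illustration, not the browser's annotation format.
    """
    rows = []
    for snp_id, position, p_value, source in results:
        if not 0 < p_value <= 1:
            raise ValueError(f"p-value out of range for {snp_id}: {p_value}")
        score = -math.log10(p_value)  # larger score = stronger association
        rows.append(f"{snp_id}\t{position}\t{score:.2f}\t{source}")
    return rows


# Made-up association results
for row in association_rows([("snpA", 100, 1e-8, "typed"),
                             ("snpB", 250, 0.05, "imputed")]):
    print(row)
```

A p-value of 1e-8 becomes a score of 8.00 and a p-value of 0.05 becomes 1.30, so the strongest associations plot as the tallest features.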
11: Create your own tracks (cont) Make edits on your own browser window by clicking on “Edit File…”
12: Create Image for Publication Click on the +/- sign to hide/show a section 12a. Click on “High-res Image” Mouse over a track until a cross appears. Click on track name to drag track up or down.
12: Image for Publication (cont) 12b. Click on “View SVG Image in new browser window” 12c. Save the generated file with the “.svg” extension. You can view the file in Firefox, but use other programs (Adobe Illustrator or Inkscape) to convert it to other formats and/or edit it
12: Image for Publication (cont) Inkscape is free and lets you edit and convert to other formats (many journals prefer EPS)
13: View GWA hits 13a. Go to www.hapmap.org 13b. Select “HapMap Genome Browser B36”
13: View GWA hits (cont) 13c. Type search term - “FTO” Default tracks for B36 include GWA hits, OMIM predicted associations, and Reactome pathways
14: Read PubMed abstracts for GWA hits 14a: Mouse over a GWA hit to learn more about the association 14b: Click on the GWA hit to see the study’s PubMed abstract
Use HapMart to Generate Extracts of the HapMap Dataset Find all HapMap characterized SNPs that: • Have a MAF > 0.20 in the Yoruban population panel (YRI) • Cause a nonsynonymous amino acid change
1. Go to hapmart.hapmap.org, or from www.hapmap.org click on “HapMart”
2. Select data source and population of interest Use the schema menu to select the dataset. 2a. Choose Yoruba population or “All Populations” 2b. Press “Next”
3. Select the desired filters 3a. Check “Allele Frequency Filter” and select MAF >= 0.2 3b. Select “SNPs found in Exons – non synonymous coding SNPs” 3c. Press “Next”
4. Select output fields 4a. Choose among several pages of fields 4b. Select the fields to include in the report. Options at the bottom let you select text or Excel format. 4c. Press “Export” The summary shows active filters and # SNPs to be output
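The HapMart query built in steps 1 to 4 amounts to two filters applied together. As a rough sketch of that logic outside HapMart, the function below filters in-memory records; the record fields (`snp`, `consequence`, `maf`) and the function name are hypothetical and do not correspond to HapMart's field names.

```python
def select_snps(records, population="YRI", min_maf=0.2):
    """Return ids of nonsynonymous SNPs with MAF >= min_maf in one population.

    records: list of dicts with hypothetical keys
      "snp"          SNP identifier
      "consequence"  e.g. "nonsynonymous", "synonymous", "intronic"
      "maf"          dict mapping population panel -> minor allele frequency
    """
    return [
        r["snp"]
        for r in records
        if r["consequence"] == "nonsynonymous"
        and r["maf"].get(population, 0.0) >= min_maf
    ]


# Made-up records
records = [
    {"snp": "snp1", "consequence": "nonsynonymous", "maf": {"YRI": 0.35, "CEU": 0.10}},
    {"snp": "snp2", "consequence": "nonsynonymous", "maf": {"YRI": 0.05}},
    {"snp": "snp3", "consequence": "synonymous", "maf": {"YRI": 0.40}},
]
print(select_snps(records))  # → ['snp1']
```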
Bulk downloads: Download the Complete Data • Download the entire HapMap data set to your own computer
1. Surf to www.hapmap.org From www.hapmap.org, click on “Bulk Data Download”, or click directly on “Data”
2. Choose the Data Type Select “Genotypes”. • Available Data Types: • Raw genotypes & frequencies • Analytic results • Protocols & assay design • Your own copy of the HapMap Browser • HapMap Samples * Data also available via FTP: ftp://www.hapmap.org
3. Choose the dataset of interest Select latest build, fwd_strand orientation, and “non-redundant”. • fwd_strand => same as NCBI reference assembly • rs_strand => same as in dbSNP • Available Genotype Datasets: • Non-redundant: QC+ filtered & redundant data removed • Filtered-redundant: QC+ filtered; duplicated data not removed • Unfiltered-redundant: Includes assays that failed QC
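The difference between the two orientations matters when combining files: alleles reported on the reverse strand are the complements of the forward-strand alleles. The sketch below shows that conversion under the assumption that each record carries a `+` or `-` strand flag relative to the reference assembly; the function name is hypothetical.

```python
# Complement table for the four bases; "N" (missing) is left unchanged
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_forward_strand(genotype, strand):
    """Return a two-letter genotype expressed on the forward strand.

    strand: "+" if the genotype is already on the forward strand,
            "-" if it is reported on the reverse strand (assumed flag).
    """
    if strand == "+":
        return genotype
    if strand == "-":
        return genotype.translate(COMPLEMENT)
    raise ValueError(f"unknown strand: {strand!r}")


print(to_forward_strand("AG", "-"))  # → TC
print(to_forward_strand("CT", "+"))  # → CT
```

Note that A/T and C/G SNPs look the same after complementing, so for those the strand flag is the only way to tell the orientations apart.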
Further Information • HapMap Publications & Guidelines http://hapmap.cshl.org/publications.html.en • Past tutorials & user’s guide to HapMap.org http://www.hapmap.org/tutorials.html.en • Questions? help@hapmap.org
HapMap DCC Present Members (CSHL) Lincoln Stein Marcela K. Tello-Ruiz Lalitha Krishnan Zhenyuan Lu HapMap DCC Former Members Albert Vernon Smith Gudmundur Thorisson Fiona Cunningham