HapMap data for genome-wide disease association studies ~ cases from SNP Research Center, RIKEN ~

Toshihiro Tanaka
SNP Research Center, RIKEN

Millennium SNP projects in Japan (April 2000 – March 2005)
I. Infrastructure
   a) Collection of gene-based SNPs: 190,000 variations identified in two years
   b) High-throughput genotyping system: low-cost, semi-automated, using the Invader assay
II. Application
   Identification of genes of medical importance
   - Disease-associated genes
   - Genes defining drug sensitivity

Two-step genotyping strategy for genome-wide approach
1. Genotype a small number of samples (100 ~ 200) for a large set of SNPs (100,000 ~ 250,000).
2. Set a p-value threshold (0.01) for taking further steps.
3. Loci that pass the threshold are examined further by expanding the sample scale.
A candidate gene approach is also used.
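As a rough illustration of why the second step is needed, the sketch below estimates how many SNPs would pass the first-stage threshold by chance alone. It assumes every SNP is null and that p-values are uniformly distributed and independent, which is a simplification not stated on the slide; the function name is hypothetical.

```python
def expected_null_hits(n_snps: int, p_threshold: float) -> float:
    """Expected number of SNPs passing the threshold by chance alone.

    Assumption (not from the slides): all SNPs are null and their
    p-values are uniform on [0, 1].
    """
    return n_snps * p_threshold


# SNP set sizes and threshold taken from the slide.
for n_snps in (100_000, 250_000):
    hits = expected_null_hits(n_snps, 0.01)
    print(f"{n_snps} SNPs at p < 0.01: about {hits:.0f} chance hits")
```

Under these assumptions, on the order of 1,000 to 2,500 loci would go forward to the expanded sample set without any true association, which is why the first stage acts as a screen and not as a final test.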

SNP Research Center, RIKEN
Laboratory for Cardiovascular Diseases
- lymphotoxin-a (Nature Genetics, 2002)
- galectin-2 (Nature, 2004)
Laboratory for Rheumatic Diseases
- PADI4 (Nature Genetics, 2002)
- SLC22A4 (Nature Genetics, 2002)
- FCRL3 (Nature Genetics, 2005)
Laboratory for Bone & Joint Diseases
- asporin (Nature Genetics, 2005)
- CILP (Nature Genetics, 2005)
- CALM1 (Hum Mol Genet, 2005)
Laboratory for Diabetic Nephropathy
- SLC12A3 (Diabetes, 2003)
- WNT5B (Am J Hum Genet, 2004)
Laboratory for Allergic Diseases
- CLCA1 (Genes and Immunity, 2004)
- DAP3 (J Hum Genet, 2004)
- IFNA (Hum Genet, 2004)
- ADAM33 (Clin Exp Allergy, 2004)

Purpose
To assess the practical usefulness of HapMap data for disease association studies.
Question: Could we have identified disease-associated loci/SNPs if we had used SNP data and software from the HapMap home page to select the SNPs to be genotyped in the first-stage screening?

Question, in other words...
Imagine a researcher who wishes to identify loci associated with a certain disease by a GWA study, without knowledge of any previous association reports.
- He/she selects the SNPs to be genotyped using HapMap data and the Haploview software.
- He/she examines 500 patients and 500 controls.
- He/she sets the threshold p-value at 0.01.
Could he/she detect the loci that we previously reported, even when the associated SNPs are hidden from the HapMap data?

Study protocol
1. Obtain genotyping data around the disease-associated loci from the HapMap home page.
2. Select tag SNPs using the Haploview software (block-by-block basis, and Tagger).
   * All the disease-associated SNPs were in the database; they were treated as untyped (hidden SNPs).
   * Default Haploview settings were used in most conditions.
3. Genotype the selected tag SNPs and perform association analysis on ~500 case and ~500 control samples.

LGALS2 locus (candidate gene approach)
Association result: p = 4.5 × 10^-6, OR = 1.23, n = ~2,000
Tagged SNPs
- Block-by-block basis: 8, 9, 10
- Tagger (r2 > 0.8): 9, 10
- Tagger (r2 = 1): 9, 10, 11, 12
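The r2 thresholds above refer to pairwise linkage disequilibrium between a tag SNP and the SNPs it stands in for. The following is a minimal sketch of the standard r2 and D' definitions for two biallelic SNPs, computed from haplotype and allele frequencies. It is not Haploview's or Tagger's implementation, and the frequencies in the example are hypothetical values chosen for illustration, not data from the slides.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (r2, D') for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


def is_tagged(r2: float, threshold: float = 0.8) -> bool:
    """True if a hidden SNP is captured by a tag SNP at the given r2 threshold."""
    return r2 > threshold


# Hypothetical frequencies: allele B occurs only on haplotypes that carry allele A.
r2, d_prime = ld_stats(p_ab=0.325, p_a=0.35, p_b=0.325)
print(f"r2 = {r2:.3f}, D' = {d_prime:.3f}, tagged at r2 > 0.8: {is_tagged(r2)}")
```

The example shows why the two measures can disagree: D' reaches 1 whenever one of the four possible haplotypes is absent, while r2 stays below 1 unless the two allele frequencies also match.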

Association analyses (comparison of allele frequency)
Tag SNPs: SNP8, SNP9, SNP10, SNP11, SNP12
Results, in the order given on the slide:
- P = 0.0023, OR = 1.35, r2 = 0.832, D' = 0.956
- P = 0.015, OR = 1.25, r2 = 0.587, D' = 0.978
- P = 0.0038, OR = 1.32, r2 = 0.867, D' = 0.978
- P = 0.0092, OR = 1.29, r2 = 0.863, D' = 0.931
- P = 0.0020, OR = 1.32, r2 = 0.616, D' = 0.973
SNP14 (disease-associated SNP, MAF = 35.0%): P = 0.00036, OR = 1.41
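A comparison of allele frequencies between cases and controls is commonly carried out as a 2 x 2 chi-square test on allele counts. The sketch below shows one way to compute the test statistic, p-value, and odds ratio. The slides do not state which test was used, so the choice of an uncorrected Pearson chi-square test is an assumption, and the allele counts in the example are hypothetical, not the study data.

```python
import math
from typing import Tuple


def allelic_test(case_minor: int, case_major: int,
                 ctrl_minor: int, ctrl_major: int) -> Tuple[float, float, float]:
    """Pearson chi-square test (1 df, no continuity correction) on allele counts.

    Returns (chi2, p_value, odds_ratio).
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail of chi-square is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 500 cases and 500 controls give 1,000 alleles per group.
chi2, p_value, odds_ratio = allelic_test(432, 568, 350, 650)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, OR = {odds_ratio:.2f}")
```

Each individual contributes two alleles, so the counts in each group sum to twice the number of samples.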

Haplotype association

LTA locus (HLA region, genome-wide approach)
Association result: p = 1.3 × 10^-4, n = ~1,000
Association result: p = 3.3 × 10^-6, n = ~1,000
r2 = 0.866, D' = 1

Association analysis
SNP18 (disease-associated SNP, MAF = 34.1%)
SNP9 (MAF = 32.5%)
P = 0.0015, OR = 1.35
P = 0.00033, OR = 1.40
r2 = 0.90, D' = 0.99

Newly identified locus for one common disease (candidate gene approach)
Association result: p = 3.3 × 10^-7, n = ~3,000
100 kb
No haplotype block
No related SNP

Sample scale and cut-off p value
[Figure: p value versus number of samples, for minor allele frequency = 0.35, with curves for OR = 1.41 and OR = 1.35]
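The relationship between sample size, effect size, and the chance of passing a p-value cut-off can be approximated with a standard two-proportion power calculation. The sketch below is not the calculation behind the slide's figure. It assumes that the minor allele frequency of 0.35 applies to controls, that the test is a two-sided allelic test, and that the normal approximation holds. The function name and the sample sizes are illustrative.

```python
import math
from statistics import NormalDist


def approx_power(maf: float, odds_ratio: float,
                 n_cases: int, n_controls: int, alpha: float = 0.01) -> float:
    """Approximate power of a two-sided allelic test at level alpha.

    Assumptions (not from the slides): maf is the control allele frequency,
    and the normal approximation to the difference in proportions is adequate.
    """
    p0 = maf
    # Case allele frequency implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    # Each individual contributes two alleles.
    se = math.sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z_effect = abs(p1 - p0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The contribution of the opposite tail is negligible and is ignored here.
    return NormalDist().cdf(z_effect - z_crit)


for odds_ratio in (1.41, 1.35):
    for n in (100, 200, 500, 1000):
        power = approx_power(0.35, odds_ratio, n, n)
        print(f"OR = {odds_ratio}, n = {n} per group: power ~ {power:.2f}")
```

Under these assumptions, power rises steeply with the number of samples, and the smaller odds ratio needs noticeably more samples to reach the same power at the 0.01 cut-off.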

Summary
- All the disease-associated SNPs were in the database. This reflects partly good luck and partly the good quality of the database.
- If they are treated as untyped (hidden) SNPs, we lose some of the disease-associated loci, depending on their haplotype structure.
- Detecting them requires examining a sufficient number of samples and setting an appropriate p-value threshold, both of which should naturally take the cost of the study into account.
