This report covers MESA Genetics' activities, including updates on MESA Family and Candidate Gene Genotyping. It summarizes genotyping results, data quality, duplicate analyses, and proposes efficient genotyping strategies. The report also addresses the creation of a subcommittee for evaluating and approving individual gene studies.
MESA Genetics Activities Report to MESA Steering Committee Jerome I. Rotter, MD February 8th, 2006
MESA Genetics Activities • Update on MESA Family – includes Large Scale MESA Candidate Gene Analyses • Individual candidate gene genotyping • Larger scale candidate gene genotyping - NHLBI’s Candidate-gene Association REsource (CARE) • Scale of MESA Genetics Projects • Human Subjects / Field Center Issues
MESA Family Current Priorities • Recruitment and Enrollment • Candidate Gene Analyses • Analyses of the 1st half of the family cohort
MESA Candidate Gene Genotyping Study • Conduct an association study of 2,880 MESA participants (parent study) • 720 randomly selected from each ethnic group of Caucasian-, African-, Mexican-, and Chinese-Americans • Include the well-phenotyped “MESA 1000” • Genotype 1536 SNPs in candidate genes proposed by MESA Investigators
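The sampling plan above (720 per ethnic group, 2,880 in total) can be sketched as a stratified random draw. This is a minimal illustration, not the procedure MESA actually used: the participant IDs, the group sizes, the `select_participants` helper, and the choice to force the "MESA 1000" into the sample before filling the rest at random are all assumptions made for the example.

```python
import random

def select_participants(ids_by_group, per_group=720, must_include=frozenset(), seed=0):
    """Draw `per_group` participants from each group.

    Assumption: IDs in `must_include` (e.g. a well-phenotyped subset) are
    taken first, and the remainder of each group's quota is filled at random.
    """
    rng = random.Random(seed)
    selected = {}
    for group, ids in ids_by_group.items():
        forced = [i for i in ids if i in must_include]
        if len(forced) > per_group:
            raise ValueError(f"{group}: more forced IDs than the quota allows")
        pool = [i for i in ids if i not in must_include]
        selected[group] = forced + rng.sample(pool, per_group - len(forced))
    return selected

# Hypothetical cohort: 1,500 made-up IDs per group.
groups = ["Caucasian", "African", "Mexican", "Chinese"]
cohort = {g: [f"{g[:3]}-{n:04d}" for n in range(1500)] for g in groups}

# Hypothetical "must include" subset: the first 250 IDs of each group.
well_phenotyped = frozenset(i for g in groups for i in cohort[g][:250])

sample = select_participants(cohort, per_group=720, must_include=well_phenotyped)
total = sum(len(ids) for ids in sample.values())
print(total)
```

Fixing the seed makes the draw reproducible, which matters if the selected list has to be regenerated for plating.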
MESA Illumina Marker Panel Marker Set (# SNPs) • Cardiovascular Candidate Genes: 1439 • Ancestry Informative Markers (AIMs): 97 • Mean SNPs/gene: 11.3 • Max SNPs/gene: 49 (VWF) • Min SNPs/gene: 1 (VEGFB, MRPL10, KCNH2, FLJ3116, DMGDH, C4orf9) • Total SNPs assayed by Illumina: 1536
MESA Candidate Gene Data Overview • 1440/1536 SNPs were successfully genotyped • 119 genes represented • 97 ethnic-specific markers • Only 6 DNA samples did not genotype.
MESA Candidate Gene: Summary of Genotyping Results Illumina reports: • Data quality was very high • DNA success rate was unprecedented (for such a large project) • Illumina was pleased with the locus conversion rate (higher than predicted) • Excellent DNA quality (aided in achieving the high locus success rate)
Analysis of Duplicates: Genotyping Quality • Spike each of 32 (96-well) DNA plates with 2 duplicate samples = 64 in total • Illumina notified of duplicates (non-blinded) • Add an extra plate (#33) of 4 ethnicities x 23 duplicates = 92 duplicates • Illumina not notified that these were duplicates (blinded) • Total: 64 + 92 = 156 duplicate pairs of DNAs
Analysis of Duplicates: Illumina Non-Blinded • 64 x 1440 – 45 – 31 = 92,084 genotype pairs (less those missing 1 or 2 genotypes) • Remove and do not count: 1 pair of clearly non-identical samples, mismatched at 628 / 1439 SNPs typed (lab error?) • One other single pair of genotypes does not match • Non-blinded pair concordance rate = (92,084 – 1439* – 1) / (92,084 – 1439*) > 99.998% *1439, not 1440: 1 missing SNP genotype.
Analysis of Duplicates: Illumina Blinded • 92 x 1440 – 138 – 32 = 132,310 genotype pairs (less those missing 1 or 2 genotypes) • Remove and do not count: 2 pairs of clearly non-identical samples, mismatched at 717 / 1439 and 717 / 1438* SNPs (lab error?) • 4 other genotype pair mismatches • Blinded pair concordance rate = (132,310 – 1438* – 1439 – 4) / (132,310 – 1438* – 1439) > 99.996% *1438, not 1440: 2 missing SNP genotypes.
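The concordance arithmetic on the two duplicate slides can be reproduced with a short calculation. The sketch below uses only the counts stated on the slides; the `concordance` helper and variable names are illustrative, not part of the original analysis.

```python
def concordance(pairs_typed, excluded_pair_snps, mismatches):
    """Fraction of compared genotype pairs that agree.

    pairs_typed        -- genotype pairs with both genotypes present
    excluded_pair_snps -- SNP counts of sample pairs dropped as non-identical
    mismatches         -- discordant genotype pairs among those kept
    """
    compared = pairs_typed - sum(excluded_pair_snps)
    return (compared - mismatches) / compared

# Non-blinded: 64 duplicate pairs x 1440 SNPs, minus pairs with missing genotypes.
non_blinded_typed = 64 * 1440 - 45 - 31
non_blinded = concordance(non_blinded_typed, [1439], mismatches=1)

# Blinded: 92 duplicate pairs x 1440 SNPs, minus pairs with missing genotypes.
blinded_typed = 92 * 1440 - 138 - 32
blinded = concordance(blinded_typed, [1438, 1439], mismatches=4)

print(f"non-blinded: {non_blinded_typed} pairs typed, concordance {non_blinded:.5%}")
print(f"blinded:     {blinded_typed} pairs typed, concordance {blinded:.5%}")
```

Both rates exceed the thresholds quoted on the slides (99.998% and 99.996%) once the clearly non-identical sample pairs are excluded.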
Candidate Gene Data Use and Publication Approach • Moratorium on candidate gene manuscript proposals (~3 months) • MESA and MESA Family Study investigators surveyed concerning gene and phenotype interests • If several investigators are interested in the same gene and phenotype, the P&P Committee will encourage the development of writing groups: • To delineate the hypotheses and models • To recommend number of manuscripts • To recommend authors • Should a candidate gene have no ‘champion’, it will be assigned to the Investigator who recommended it
Individual MESA Candidate Gene Genotyping: The Problem Issues: • Need to prioritize • Transferring samples is inefficient: a. Inefficient use of samples b. Significant work for central laboratory/genotyping center • Efficiencies of combining genotyping, e.g. multiplexing • Investigators often need guidance: a. on how to comprehensively interrogate a gene b. on which samples should be utilized to answer the question
Individual MESA Candidate Gene Genotyping Proposal Proposal: Create a subcommittee of the MESA Genetics Committee to evaluate and approve individual candidate gene studies Proposed subcommittee: Mike Tsai (Minnesota, Chair), Don Bowden (Wake Forest), Kent Taylor (Cedars-Sinai) Prior Discussion/Buy-in: Genotyping groups, MESA Lab Committee representatives, Ancillary Study Committee representatives, Program Officers
Large-scale genotyping of NHLBI cohorts: CARE = Candidate-gene Association REsource • Subjects • 8 NHLBI Cohorts • ~50,000 subjects • Genotyping • 1700 genes • ~8-10 markers per gene • Steering Committee • Formed 1/25/2006 • Selects genes and markers • Selects phenotypes
Large-scale genotyping of NHLBI cohorts: CARE = Candidate-gene Association REsource 8 NHLBI Cohorts • Atherosclerosis Risk in Communities (ARIC) • Coronary Artery Risk Development in Young Adults Study (CARDIA) • Cardiovascular Health Study (CHS) • Cooperative Study of Sickle Cell Disease • Framingham Heart Study • Jackson Heart Study • Multi-Ethnic Study of Atherosclerosis (MESA) • Sleep Heart Health Study
Scale of MESA Genetics Projects * Under discussion
Contrasting MESA Genetics Projects vs. Large Scale Sequencing Characteristics of MESA Studies • All studies being conducted at the moment (or being considered) are association studies • Association studies ask whether common variations in or near a gene are associated with a trait/disease • Therefore information is obtained only regarding the traits under study, not for other diseases or traits Alternate Approach • Large scale sequencing completely interrogates specific genes • If these genes are associated with any known disease, then finding variants/mutations has clinical implications
Recent NHLBI RFI • RFI seeking comment on proposed creation and release of limited access datasets (LAD) which are to include large amounts of genotype and phenotype data. • Cleaned genotype data to be delivered to NHLBI within 6 months of creation • Data immediately available to LAD participants at that time. • One year moratorium on publication submission for LAD participants.
Issues Raised by NHLBI RFI Data cleaning • Is 6 months adequate for large scale data cleaning? Is 12 months sufficient for: • Knowledgeable study Investigators to conduct well thought out analyses • Sufficient academic reward for overall conduct of the study Confidentiality • Did the study subjects knowingly consent to a forensic-equivalent genetic dataset? • Subject concerns • IRB concerns Negative impact on future cohort studies
Possible Responses to NHLBI RFI • NHLBI encourages comments (due by March 8th) • Individual MESA and other Cardiovascular scientists should weigh in • When doing so, acknowledge the potential conflict of interest and/or emphasize scientific and human subject issues • Besides individual responses, propose a MESA Steering Committee response (drafted by MESA Genetics and appropriately revised by MESA Steering Committee)
Analyses in MESA Family First Half Cohort: What to do until the genes come? • Heritability analysis • Bivariate analysis
Simple correlation & Bivariate analysis in MESA Family, Example [Figure: a sib pair (Sib1, Sib2), each individual measured for CAC and IMT; panels labeled “Simple correlation” and “Bivariate analysis”]
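One common way to read the contrast on this slide is that a simple correlation relates CAC and IMT within the same individual, while a bivariate family analysis asks whether one sib's CAC tracks the other sib's IMT. The sketch below illustrates that reading only; it is an assumption, not the method used in MESA Family, where bivariate analysis would typically be a variance-components model rather than a raw cross-sib correlation. All values are invented.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical sib pairs: (sib1 CAC, sib1 IMT, sib2 CAC, sib2 IMT).
sib_pairs = [
    (0.0, 0.62, 12.0, 0.70),
    (35.0, 0.81, 20.0, 0.77),
    (110.0, 0.95, 60.0, 0.88),
    (5.0, 0.66, 0.0, 0.60),
    (240.0, 1.02, 150.0, 0.97),
    (18.0, 0.74, 45.0, 0.83),
]

# Simple correlation: CAC vs IMT within each individual (both sibs pooled).
cac_all = [p[0] for p in sib_pairs] + [p[2] for p in sib_pairs]
imt_all = [p[1] for p in sib_pairs] + [p[3] for p in sib_pairs]
within_individual = pearson(cac_all, imt_all)

# Cross-sib, cross-trait: Sib1's CAC vs Sib2's IMT.
cross_sib = pearson([p[0] for p in sib_pairs], [p[3] for p in sib_pairs])

print(f"within-individual CAC~IMT: {within_individual:.3f}")
print(f"cross-sib CAC(sib1)~IMT(sib2): {cross_sib:.3f}")
```

A cross-sib, cross-trait correlation that is clearly above zero would hint that familial factors are shared between the two traits, which is the question a formal bivariate analysis addresses.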
MESA Candidate Gene Selection • 267 candidate genes were initially proposed by MESA investigators, collated by DCC, and then ranked by MESA investigators • Further input by MESA laboratory and MESA Family Study Genetics Committee (294 genes) • List of 294 genes sent to Illumina • list of 60,293 SNPs returned with feasibility scores • tag SNPs identified, total 1,439 • List became final when SNP selection was complete
Individual MESA Candidate Gene Genotyping Proposal (continued) Procedure: • A proposal would come to the Ancillary Committee and be routed to MESA Genetics, then to the subcommittee • Proposals allowed twice a year • All genotyping will be done by “MESA-related” genotyping labs, e.g. Minnesota, Cedars-Sinai, Wake Forest, Vermont Analysis: Subcommittee to decide: • Which site, e.g. Cedars-Sinai, Minnesota, etc. • Which samples: 720 (one ethnic group); a multiple of 720, up to 2880; 1000 (MESA 1000); or all • Work with proposer as to number of markers required
Individual MESA Candidate Gene Genotyping Proposal (continued) Ranking: Subcommittee or whole committee would rank proposals • highly meritorious and rank within • meritorious and rank within • not approved Funding: There will be 2 categories of approved proposals • One: Proposals that will be done “gratis” (supported by genotyping lab and/or MESA funds); size/amount to be determined • Two: Proposals that will be done with budget provided (e.g. supplies, technician time) • Details to be worked out by subcommittee
Genome Wide Association Studies • SNPs and the Extent of Atherosclerosis (SEA), David Herrington, PI • Genome wide association will be done in PDAY samples • 1st confirmation will utilize PDAY samples • 2nd confirmation will use MESA samples
Genome Wide Association Studies • NIH RFA (to be released, due date 4/20/06) • MESA, with its 6000+ individuals, 4 ethnic groups, and quantitative phenotypes, is a very attractive venue for such studies • Ideally such a study would involve 3 stages • Stage 1: genome wide: 2000 subjects (all 4 ethnic groups) • From 250,000 to 500,000 markers • Stage 2: 1st confirmation large scale: 2000 subjects • From 20,000 to genome wide • Stage 3: 2nd confirmation: 2000 subjects • From 500 to 2,500 markers • However, the resources proposed to be provided by the RFA appear to be too limited for this scope
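The three-stage design above implies a genotyping workload that can be tallied directly from the slide's numbers. The sketch below is only a bookkeeping illustration: the `Stage` class and helper names are hypothetical, and the upper bound for Stage 2 is left as `None` because the slide gives it only as "genome wide" without a number.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Stage:
    name: str
    subjects: int
    min_markers: int
    max_markers: Optional[int]  # None: slide says "genome wide", no figure given

stages = [
    Stage("Stage 1: genome wide", 2000, 250_000, 500_000),
    Stage("Stage 2: 1st confirmation", 2000, 20_000, None),
    Stage("Stage 3: 2nd confirmation", 2000, 500, 2_500),
]

def min_genotypes(stage: Stage) -> int:
    """Lower bound on genotypes generated in a stage (subjects x markers)."""
    return stage.subjects * stage.min_markers

def max_genotypes(stage: Stage) -> Optional[int]:
    """Upper bound on genotypes, or None when the slide gives no number."""
    if stage.max_markers is None:
        return None
    return stage.subjects * stage.max_markers

total_subjects = sum(s.subjects for s in stages)
print(f"total subjects across stages: {total_subjects}")
for s in stages:
    print(f"{s.name}: at least {min_genotypes(s):,} genotypes, at most {max_genotypes(s)}")
```

The three stages together use 6,000 subjects, in line with the "6000+ individuals" cited for MESA on the same slide.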
MESA Large Scale Collaborations Queries: • Dallas Heart • Framingham Offspring • Rochester Family Heart (ECAC) Concept: Devise arrangements for joint testing, or for confirming results seen in one population in the other
MESA Large Scale Collaborations • Example • Epidemiology of Coronary Artery Calcification (ECAC) • Formerly Rochester Family Heart Study • Pat Peyser, University of Michigan, PI • Scientific Questions • The genetic basis of progression of CAC • ECAC has follow-up data, average 8 yrs (up to 13 yrs) • ECAC would like to interact (i.e. could MESA confirm candidate gene associations they will identify) • Issues for MESA • What does MESA want to do regarding this scientific question? • What interaction, if any, do we want with another group? • If yes, how would we want to structure it?