MESA Family Study Genetics Committee Report. Friday Morning September 7, 2007 8:20-8:45 am. Candidate Gene Progress Panel 2 (ILMN2 typing). Progress. MESA CG Panel 2 typed by Illumina, 1536 SNPs CG2 data QC completed (CC+UVA) Same process as CG1, sample-based + SNP-based QC
MESA Family Study Genetics Committee Report • Friday Morning, September 7, 2007, 8:20-8:45 am
Progress • MESA CG Panel 2 typed by Illumina, 1536 SNPs • CG2 data QC completed (CC+UVA) • Same process as CG1, sample-based + SNP-based QC • Annotation of SNPs, gene loci rechecked and updated • Updates to HapMap data in SNP tables • New Release of Data to Gene Pages (2007.07) • Updated CG1 + new CG2 = integrated Gene List
Study Design • Same samples as MESA CG Panel 1 • 2880 DNA samples were drawn equally from four major ethnic groups (720 samples in each): • EUA European American • AFA African American • HIS Hispanic American • CHN Chinese American • 1536 SNPs picked in candidate genes + additional AIMs
CG Panel 2: Sample QC Summary (values given as CG2 / CG1) • Total Samples: 2944 / 3036 • Sample Failures: 8 / 6 • Expected Samples to Delete (QC duplicate): 156 / 156 • Actual Samples Deleted: 179 / 183 • Breakdown of deleted samples (single column in the source): Duplicate 168; Duplicate + Unresolved Gender Error 4; Triplicate 3; Triplicate + Unresolved Gender Error 1; Identical Twins 1; Unresolved Gender Error 6 • Remaining Samples: 2757 / 2847
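The sample counts on this slide can be reconciled with simple arithmetic. The sketch below assumes that remaining samples equal total samples minus sample failures minus actual samples deleted; that relationship and all variable names are illustrative assumptions, not part of the MESA QC pipeline.

```python
# Counts copied from the Sample QC Summary slide; dictionary keys are illustrative.
sample_qc = {
    "CG2": {"total": 2944, "failures": 8, "deleted": 179, "remaining": 2757},
    "CG1": {"total": 3036, "failures": 6, "deleted": 183, "remaining": 2847},
}


def remaining_samples(total, failures, deleted):
    """Assumed relationship: remaining = total - failures - deleted."""
    return total - failures - deleted


for panel, counts in sample_qc.items():
    computed = remaining_samples(
        counts["total"], counts["failures"], counts["deleted"]
    )
    # Report whether the computed value agrees with the slide.
    print(panel, computed, computed == counts["remaining"])

# Deletion reasons listed in the single-column breakdown on the slide.
deletion_breakdown = {
    "duplicate": 168,
    "duplicate_plus_unresolved_gender_error": 4,
    "triplicate": 3,
    "triplicate_plus_unresolved_gender_error": 1,
    "identical_twins": 1,
    "unresolved_gender_error": 6,
}
breakdown_total = sum(deletion_breakdown.values())
print("breakdown total:", breakdown_total)
```

Under that assumption both panels reconcile with the reported remaining samples. The breakdown rows sum to 183, which equals the CG1 figure for actual samples deleted; the slide itself does not state which panel the breakdown refers to.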
CG Panel 2: SNP QC & Missing Data • 18 SNPs with >5% missing data in CG2: rs737497, rs4147567, rs7526440, rs6436094, rs518116, rs12266012, rs6591258, rs1079598, rs11216137, rs929610, rs4630, rs6004034, rs140313, rs6519497, rs2844010, rs140316, rs140317, rs5751789 • 24 multi-allelic/CNV SNPs identified by ILMN were dropped from the data set (Dropped = 1 in Gene Pages table), including 16 of the 18 with >5% missing • 2 markers with >5% missing were retained after ILMN review because they are not in the multi-allelic SNP list (rs929610, rs12266012). All missing genotypes were converted to homozygotes • ILMN manually reviewed and verified 9 cluster plots for SNPs with 3% ≤ missing < 5%
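A minimal sketch of how the two missing-data thresholds on this slide could be applied per SNP. The function names, the category labels, and the toy genotype calls are hypothetical; the "XX" missing code is borrowed from the _cnf file convention mentioned elsewhere in this report. The slide does not say how a rate of exactly 5% is handled, so placing it in the review band is an assumption.

```python
def missing_rate(genotypes, missing_code="XX"):
    """Fraction of genotype calls that equal the missing code."""
    if not genotypes:
        return 0.0
    n_missing = sum(1 for g in genotypes if g == missing_code)
    return n_missing / len(genotypes)


def classify_snp(rate):
    """Bin a SNP by its missing rate using the thresholds named on the slide."""
    if rate > 0.05:
        return "over_5pct_missing"      # candidate for review or dropping
    if rate >= 0.03:
        return "review_cluster_plot"    # 3% <= missing < 5% band (5% itself: assumption)
    return "pass"


# Toy data with hypothetical SNP identifiers (not real MESA genotypes).
toy_calls = {
    "snp_a": ["AA", "AB", "XX", "XX"],  # 50% missing
    "snp_b": ["AA"] * 96 + ["XX"] * 4,  # 4% missing
    "snp_c": ["AB"] * 100,              # 0% missing
}

for snp, calls in toy_calls.items():
    print(snp, classify_snp(missing_rate(calls)))
```

This only bins SNPs by missing rate. The decision to drop or retain a flagged SNP on the slide also depended on ILMN's manual review and the multi-allelic SNP list, which this sketch does not model.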
CG Panel 2: SNP QC • 3 Alleles: • A / B / - • Could be: • Single base deletion for this SNP • SNP is in a CNV deletion region
CG Panel 1+2: Summary (values given as CG1 / CG2 / TOTAL)
SNPs • Picked: 1536 / 1535* / 3071 • Typed: 1440 / 1467 / 2907 • Typed, Not Dropped: 1440 / 1442 / 2882 • * One duplicate SNP
AIMs • Picked: 97 / 112 / 209 • Typed: 96 / 106 / 202 • Typed, Not Dropped: 96 / 103 / 199 (-3 multiallelic/possible CNV)
Annotated Genes (non AIM-ETHNIC), Unique Genes per Panel • Typed, Not Dropped SNPs §: 119 / 123 / 230 • Genes w/no data: 0 / 1 / 1 (GSTT1) • § Includes any gene that has 1+ typed, not dropped SNP; genes typed in both CG1 & CG2 are counted only once in TOTAL
CG Panel 1 + CG Panel 2 • Data set is 2847 individuals x 2882 SNPs • Individuals that were successful in CG1 but failed in CG2 are retained in the gene data files • Their non-typed CG2 SNPs are indicated as missing (XX in _cnf file, 0 0 in _ped file) • CG1+CG2 SNPs are integrated into a single gene map • Flags indicate whether the SNP was typed in CG1 or CG2
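The missing-genotype convention above can be illustrated with a short sketch that writes the genotype fields of one individual against an integrated SNP order, emitting "0 0" wherever a SNP has no call. The function name, the SNP identifiers, and the input layout are hypothetical, and the leading identifier columns of a real _ped record are omitted.

```python
def ped_genotype_fields(snp_order, genotypes):
    """Build space-separated allele pairs in map order; SNPs without a call get '0 0'."""
    fields = []
    for snp in snp_order:
        call = genotypes.get(snp)
        if call is None:
            fields.append("0 0")  # missing code for the _ped file, as stated on the slide
        else:
            fields.append(f"{call[0]} {call[1]}")
    return " ".join(fields)


# Hypothetical integrated map: snp1 and snp3 stand for CG1 SNPs, snp2 for a CG2 SNP.
integrated_order = ["snp1", "snp2", "snp3"]

# An individual typed successfully in CG1 but not in CG2 has no call for snp2.
cg1_only_individual = {"snp1": ("A", "G"), "snp3": ("C", "C")}

print(ped_genotype_fields(integrated_order, cg1_only_individual))
```

Keeping such individuals with explicit missing codes, rather than dropping them, preserves the rectangular individuals-by-SNPs layout described on the slide.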
Major New Features • Statistics for typing in CG1, CG2 panels • Updated HapMap data + NCBI Map details • Phenotype classes • Demographics restructured • Retinal Photography • Pericardial Fat • Exposures • Lipid Metabolism - 5 replicates with imputed LDL, HDL to address effects of medication on measured LDL
Methods Document • Methods Document v1.0 is available for CG papers and proposals • Describes SNP selection and QC process
Ancestry Informative Markers (AIMs) • Multiple approaches • Principal components • Partial least squares • Other • Each approach has impact on implementation • Use in existing and future manuscripts
Progress: Linkage Study • DNAs have been isolated and arrayed • Release of Infinium HumanLinkage-12 by Illumina • 6090 SNPs per sample • 12 samples/chip • Chips have been manufactured and received • Automated processing operational in lab
When the Data Arrives • Data Q/C led by UVA (Mychaleckyj) • Initial automated analyses • Genome-wide linkage scan strategy • Rapid analyses and resolution of priority phenotypes • Distribution of results, identification of working groups (manuscripts) • Speed is of the essence as GWAS nears • Manuscript proposals, more work for P&P
MESA Family Study Publications & Presentations Report • Friday Morning, September 7, 2007, 8:45-9:45 am
MESA Genetics P&P Committee • Lots of work!!! • Manuscript proposals • Approved (29) • Under revision (3) • Pen drafts (1; G001 Bielinski) • Abstracts • 5 submitted • Notify Coordinating Center re: abstract outcome • Send final copy of abstract to Coordinating Center • Allow time for review prior to conference deadlines! • Genetics abstracts: sharonf@u.washington.edu
Candidate-wide Association: the Need for Speed • Rapid data generation has created a “need for speed” in reporting candidate gene associations from MESA • Current genome-wide discoveries • Ongoing genome-wide efforts in SHARE & STAMPEED • CARE (MESA is now involved) • SHARE (MESA involvement proposed) • Other investigators will be working on our own MESA data sometime next year
Candidate-wide Association: A Scientific Opportunity • Global analyses of a dataset present the opportunity to identify previously unknown relationships between genes & phenotypes • It is appropriate to apply to our large-scale candidate gene study • Look at the analyses already completed more globally • Employ novel statistical approaches that consider multiple genes, multiple phenotypes, and multiple interactions
Candidate-wide Association: A Proposal Endorsed by the MESA Steering Committee • Modify restrictions on candidate-wide association (formerly “data mining”) paper proposals • Introduce “Committee-based candidate gene” (CBCG) screening and distribution
MESA Candidate-wide Association • Encourage candidate-wide association proposals by removing time limitations • Screen and distribute top “hits” for development of paper proposals by interested investigators