Analyses of Complete Genomics Sequencing Data. Maxwell Lee, National Cancer Institute, Center for Cancer Research, Laboratory of Population Genetics. May 23, 2011.
Structural Variations
highConfidenceJunctionsBeta*.tsv
cgatools junctiondiff
exclude repeats and >= 100 kb
86 translocations (tumor-specific)
both ends within named genes
23 translocations (12 in CC0996 and 11 in E8413)
Designed PCR assays for 21 translocations
10 gave PCR products, and the junctions were confirmed by Sanger sequencing
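The "both ends within named genes" step on the Structural Variations slide can be sketched as a small filter over a junctions table. This is only an illustration: the column names (`Id`, `LeftChr`, `RightChr`, `LeftGenes`, `RightGenes`) and the toy rows are assumptions made for the example, not values from the slides, and the repeat and 100 kb filters are left out because the slide does not spell out how they were applied.

```python
import csv
import io

# Toy junction rows. Column names and values are illustrative assumptions,
# not data from the presentation.
TOY_TSV = (
    "Id\tLeftChr\tRightChr\tLeftGenes\tRightGenes\n"
    "1\tchr1\tchr19\tGENE_A\tGENE_B\n"
    "2\tchr10\tchr1\tGENE_C\t\n"
    "3\tchr11\tchr11\tGENE_D\tGENE_E\n"
    "4\tchr2\tchr7\t\t\n"
)


def is_translocation(row):
    """Treat a junction as a translocation when its two ends are on different chromosomes."""
    return row["LeftChr"] != row["RightChr"]


def both_ends_in_genes(row):
    """True when both junction ends carry a non-empty gene annotation."""
    return bool(row["LeftGenes"].strip()) and bool(row["RightGenes"].strip())


def select_gene_gene_translocations(tsv_text):
    """Return the Ids of translocations whose two ends both fall in named genes."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["Id"]
        for row in reader
        if is_translocation(row) and both_ends_in_genes(row)
    ]


selected = select_gene_gene_translocations(TOY_TSV)
print(selected)  # ['1']
```

On a real junctions file the header names should be checked against the file itself before reusing this filter.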
PCRs of Structural Variations (figure; lane labels: T, N, B, W)
A Model of Structural Variations (figure)
chr 1: SPATA17 and SIKE1; positions 217947233, 115323202, 115321907
Steps shown: 1st inversion; 2nd inversion plus deletion
Junctions labeled: SV 5103, SV 5104
The Structure of Fusion Gene FOXJ3-ATCAY (figure)
Chr 1, Chr 19
RNA: FOXJ3-ATCAY, exon2 and exon3
Sequence shown: MGLYGQACPSVTSLRPLPEETGVELLGSPVEDT
The Structure of Fusion Gene FGFR2-PLA2G2A in CC0996T (figure)
Chr 10, Chr 1
RNA: FGFR2-PLA2G2A; alternative splicing creates an in-frame fusion
NM_000141, NM_000300; exon6 and exon5
Sequence shown: HTYHLDVVGLLELWDKSPNRVSPHSCCVTHDCCY
The Structure of Fusion Gene FGFR2-PLA2G2A in CC1220T (figure)
Chr 10, Chr 1
RNA: FGFR2-PLA2G2A
NM_000141, NM_000300; exon4 and exon6
Sequence shown: EDFVSENSNNKTKQDSCRSQ
The Structure of Fusion Gene MUC5B-AP2A2 in CC0996T (figure)
Chr 11, Chr 11
RNA: MUC5B-AP2A2; exon29 skipped
NM_002458, NM_012305; exon28 and exon15; fs
Sequence shown: VPTAENCQSCLRLLRRQ
The Numbers of SVs in CC (chart; axis label: the numbers of SV interchr)
The Intra-chromosome Strand Consistent in CC (chart; axis label: the numbers of SV consistent)
Sanger Sequencing of Mutations
41 genes were PCRed/sequenced in DNA from blood, normal, and tumor
39 assays worked
16 variants not detected in tumor
  2 from variant in multiple tumors
23 variants detected in tumor
  8 variants are heterozygous in B&N
    6 from TplusNT
  15 variants validated as somatic mutations
Comparison of CGI and Bambino Mutation Calls
evidenceDnbs*
cgatools evidence2sam
sam and bam files
bambino
SNP report (single sample)
blood, normal, tumor
tumor only
Improving Mutation Call by Combining CGI with Bambino Call
15/39 (38%)
15/26 (58%)
p-value = 0.0003551
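The slide does not say which test produced the p-value. One reading that fits the reported numbers is a two-sided Fisher's exact test, assuming that all 15 validated variants fall among the 26 combined calls and that the remaining 13 assays contributed none. The sketch below is written under those assumptions; the function and variable names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Counts from the slide: 15/39 validated overall, 15/26 validated among combined calls.
validated_total, assays_total = 15, 39
validated_combined, calls_combined = 15, 26

# Assumed 2x2 layout: rows = in combined call set or not, columns = validated or not.
table = (
    validated_combined,
    calls_combined - validated_combined,
    validated_total - validated_combined,
    (assays_total - calls_combined) - (validated_total - validated_combined),
)

p_value = fisher_exact_two_sided(*table)
print(table)              # (15, 11, 0, 13)
print(f"{p_value:.7f}")   # 0.0003551
```

Under this layout the computed value agrees with the p-value on the slide, which supports the reading but does not confirm that this was the test actually used.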
The Numbers of Missense Variants Validated by RNA-seq Data (chart; axis label: the numbers of mutations)