VarDetect: a nucleotide sequence variation exploratory tool
Chumpol Ngamphiw (1), Supasak Kulawonganunchai (2), Anunchai Assawamakin (3), Ekachai Jenwitheesuk (1) and Sissades Tongsima (1)
1 Genome Institute, National Center for Genetic Engineering and Biotechnology, Thailand
2 Department of Computer Science, School of Engineering and Technology, Asian Institute of Technology, Thailand
3 Division of Medical Genetics, Siriraj Hospital, Mahidol University, Thailand
Outline
• Nucleotide sequence variation
• Common sequencing artifacts
• VarDetect: algorithms overview
• Experimental results
• Conclusions
Nucleotide sequence variation http://urgi.versailles.inra.fr/projects/GnpSNP/general_documentation.php
Common sequencing artifacts http://seqcore.brcf.med.umich.edu/doc/dnaseq/trouble/badseq.html
VarDetect: algorithms
• Reading nucleotide traces
  - Base-calling
  - Re-sampling
• Alignment of input sequences to the reference sequence
  - Pre-alignment
  - Alignment enhancement
• SNP identification
  - CodeMap
Reading nucleotide traces: base-calling
• Chromatogram trace: base-calling
• Base-calling with BioJava
Reading nucleotide traces: intensity ratio
• Calculate the peak intensity ratio Qv - Qo (δ) to increase the confidence of SNP detection
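The δ = Qv - Qo check can be illustrated with a small sketch. The slides do not say how the vicinity value is computed, so this example assumes that Qo is the quality value at the candidate base and Qv is the mean quality of a few flanking bases. The function name `quality_delta` and the `flank` parameter are hypothetical, and this is not VarDetect's implementation.

```python
from statistics import mean


def quality_delta(qualities, index, flank=2):
    """Return delta = Qv - Qo for the base at `index`.

    Assumption (not from the slides): Qo is the quality value at `index`,
    and Qv is the mean quality of up to `flank` bases on each side.
    """
    if not 0 <= index < len(qualities):
        raise IndexError("index outside the read")
    left = qualities[max(0, index - flank):index]
    right = qualities[index + 1:index + 1 + flank]
    neighbours = list(left) + list(right)
    if not neighbours:
        raise ValueError("no flanking bases to compare against")
    q_v = mean(neighbours)      # vicinity quality
    q_o = qualities[index]      # observed quality
    return q_v - q_o


# Hypothetical per-base quality values; position 2 dips below its neighbours.
example = [40, 42, 20, 41, 39]
delta = quality_delta(example, 2)
```

With these made-up values, a large positive δ marks a base whose quality drops well below that of its neighbours. How such a value should be thresholded is a tuning decision that the slides do not specify.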
Reading nucleotide traces: partition and re-sampling
• Partitioning and Re-sampling (PnR) technique
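The slides name the PnR technique without giving its steps, so the following is only a generic sketch of the idea: split a trace into windows, resample each window on a finer grid, and look for turning points (the tops of bell-shaped peaks). Linear interpolation stands in for whatever resampling VarDetect actually uses. All function names, the window size, and the resampling factor are assumptions.

```python
def resample(window, factor):
    """Linearly interpolate so each original interval yields `factor` points."""
    if len(window) < 2:
        return list(window)
    out = []
    for a, b in zip(window, window[1:]):
        out.extend(a + (b - a) * k / factor for k in range(factor))
    out.append(window[-1])
    return out


def turning_points(values):
    """Indices of local maxima: rising before the point, not rising after it."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] >= values[i + 1]]


def peaks_by_partition(trace, size, factor=4):
    """Find peak positions (in original sample units), window by window.

    Each window is extended by one sample on either side so that a peak
    sitting on a window boundary is still seen with both of its slopes.
    """
    peaks = []
    for start in range(0, len(trace), size):
        lo = max(0, start - 1)
        hi = min(len(trace), start + size + 1)
        fine = resample(trace[lo:hi], factor)
        for i in turning_points(fine):
            pos = lo + i / factor
            if start <= pos < start + size:   # keep peaks owned by this window
                peaks.append(pos)
    return peaks


# Hypothetical single-channel trace with two bell-shaped peaks.
trace = [0, 2, 5, 2, 0, 1, 4, 9, 4, 1]
found = peaks_by_partition(trace, size=5, factor=4)
```

Because the interpolation here is linear, peaks stay at the original sample positions. A smoother interpolant would be needed to refine peak locations between samples.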
Base-call parameters setting in VarDetect
• Pooled DNA: possible biallelic pattern
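To make the "possible biallelic pattern" concrete, the sketch below enumerates the six unordered pairs of the four bases and labels each with its standard IUPAC ambiguity code. Whether VarDetect reports pooled-sample calls with these codes is not stated on the slide; the function name is hypothetical.

```python
from itertools import combinations

# Standard IUPAC ambiguity codes for two-base mixtures.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def biallelic_patterns(bases="ACGT"):
    """Return (allele1, allele2, IUPAC code) for every unordered base pair."""
    return [(a, b, IUPAC_TWO_BASE[frozenset((a, b))])
            for a, b in combinations(bases, 2)]


patterns = biallelic_patterns()
codes = [code for _, _, code in patterns]
```

In a pooled sample the two alleles need not appear in equal proportion, so the relative peak heights at such a position can differ from the roughly even split expected for a single heterozygous individual.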
Alignment
• Pre-alignment & alignment enhancement
SNP identification: CodeMap
• CodeMap analysis
VarDetect: main graphical user interface http://www.biotec.or.th/GI/tools/vardetect
Experimental results
- Tocharoentanaphol C, et al.: Evaluation of resequencing on number of tag SNPs of 13 atherosclerosis-related genes in Thai population. J Hum Genet 2008, 53:74–86.
- Thailand SNP discovery project: http://www.biotec.or.th/thaisnp
Conclusions
• We presented a novel, platform-independent (Java) algorithm that interprets fluorescence-based chromatograms automatically.
• Three main heuristic procedures are employed:
  - Turning point (bell shape) detection (PnR algorithm).
  - Increasing SNP detection confidence by checking the difference between vicinity and observed quality values (Qv - Qo).
  - CodeMap, introduced to detect SNP and indel patterns.
• VarDetect offers the most features, including the ability to detect SNPs from pooled DNA samples.
• VarDetect uses an XML-annotated reference sequence to cross-check SNP discovery results within the tool, without external applications.
• VarDetect's heuristics minimize both false positive and false negative errors, reducing the effort needed to detect and validate SNPs and making it the tool of choice for automatic SNP detection.
Acknowledgements
• Dr. Mazazumi Takahashi, Centre National de Genotypage (CNG), France
• Dr. Philip Shaw
• Dr. Prasit Palittapolgarnpim
• Dr. Chintana Tocharoentanaphol, Chulabhorn Research Institute
• Dr. Chanin Limwongse, Siriraj Hospital, Mahidol University
• Thailand SNP discovery project
• National Center for Genetic Engineering and Biotechnology (BIOTEC)
Thank You For Your Attention