Brudno lab: A WHIRLWIND TOUR
Marc Fiume
Department of Computer Science, University of Toronto
Outline
1. what we do, our tools
2. Savant Genome Browser
Savant Genome Browser - http://compbio.cs.toronto.edu/savant/
WHAT WE DO
What we do
• main focus: genomic analysis using output from high-throughput sequencing (HTS) machines
• high throughput: sequence billions of nucleotides per week
• poor data quality: “reads” are shorter; error profiles are poorly understood
HTS Pipeline
What to do with all these reads?
1. Assembly
• ASSEMBLY: reconstruct the donor’s genome
• “HapSembler”: specialized for highly polymorphic species
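
To make "reconstruct the donor's genome" concrete, here is a minimal sketch of greedy overlap-based assembly on toy reads. It is not HapSembler's algorithm; the function names, the `min_len` threshold, and the example reads are all assumptions for illustration, and the sketch ignores sequencing errors, reverse complements, and polymorphism.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Toy reads (hypothetical) sampled from one short sequence.
print(greedy_assemble(["ACGTAC", "TACGGA", "GGATTC"]))
```

Greedy merging is the simplest strategy to show; it can misassemble repeats, which is one reason real assemblers use overlap or de Bruijn graphs instead.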
2. Alignment
• ALIGNMENT: find region in a “reference” genome that matches closely with each read; suggests similar origin from “donor”
• “SHRiMP”: Short Read Mapping Package
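
The idea of finding the reference region that "matches closely" with a read can be sketched as a brute-force scan that scores every reference window by its number of mismatches. This is an illustration only, not how SHRiMP works; `map_read`, `max_mismatches`, and the toy sequences are assumed names and values, and gaps (indels) are not handled.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, reference: str, max_mismatches: int = 2):
    """Return (position, mismatches) of the best gapless match, or None."""
    best_pos, best_dist = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        dist = hamming(read, reference[pos:pos + len(read)])
        if dist < best_dist:  # keep the first window with the fewest mismatches
            best_pos, best_dist = pos, dist
    return None if best_pos is None else (best_pos, best_dist)


reference = "ACGTACGGATTCAGT"          # hypothetical reference
print(map_read("GGATTC", reference))   # exact match
print(map_read("GGCTTC", reference))   # one mismatch (possible SNP or error)
print(map_read("TTTTTT", reference))   # no window within the threshold
```

A scan like this costs time proportional to reference length times read length per read, which is impractical for billions of bases; practical mappers index the reference or the reads first.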
3. Genetic Variation Discovery
• GENETIC VARIATION DISCOVERY: find differences between two genomes
• between donor and reference
• between two samples (e.g. tumour vs. normal)
• “VARiD”, “MODiL”, and “CNVer”
Genetic Variation
• Single Nucleotide Polymorphism (SNP): genomes have different nucleotides at corresponding positions
• VARiD – VARiation IDentification
• Insertions and Deletions (Indels): genomes have additional sequence put in or sequence taken out at corresponding locations
• MODiL – Mixtures of Distributions Indel Locator
• Copy Number Variation (CNV): genomes have a different number of copies of the same sequence
• CNVer
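
As a rough illustration of the SNP case above, the following sketch piles up gapless read alignments over a reference and reports positions where a non-reference base dominates. It is a simple counting heuristic, not VARiD's method; `call_snps`, the `min_depth` and `min_fraction` thresholds, and the toy data are assumptions, and all alignments are assumed to lie inside the reference.

```python
from collections import Counter


def call_snps(reference: str, alignments, min_depth: int = 3,
              min_fraction: float = 0.8):
    """Report (position, reference_base, observed_base) for candidate SNPs.

    `alignments` is an iterable of (start, read) pairs, where `read` aligns
    to the reference without gaps beginning at 0-based position `start`.
    """
    pileup = [Counter() for _ in reference]
    for start, read in alignments:
        for offset, base in enumerate(read):
            pileup[start + offset][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to trust a call at this position
        base, n = counts.most_common(1)[0]
        if base != reference[pos] and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base))
    return snps


reference = "ACGTACGGATTC"  # hypothetical reference
alignments = [(2, "GTTCGG"), (3, "TTCGGA"), (4, "TCGGAT")]  # toy donor reads
print(call_snps(reference, alignments))
```

The depth and fraction thresholds stand in for the statistical models real callers use to separate true variants from sequencing errors.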
Our Bioinformatics Tools
• COMPRESSION
• ASSEMBLY (HapSembler)
• READ MAPPING (SHRiMP)
• VISUALIZATION (SAVANT)
• SNP DETECTION (VARiD)
• INDEL DETECTION (MODiL)
• CNV DETECTION (CNVer)
SAVANT GENOME BROWSER
Genome Browsing, the old way
Challenge presented by HTS datasets
• genomic data is generated in high volumes
• HTS machines generate billions of bases per run
• interpretation and analysis challenge
• typical pipeline employs many separate tools for computation and visualization
Tools for HTS data analysis
• substantial disconnect between the processes of computational analysis and visualization
Tools for Genomic Data Analysis
• substantial disconnect between the processes of computational analysis and visualization
ASIDE: Cytoscape?
• platform for visual analysis of networks
• extensive plugin framework
Bader Lab
Savant Genome Browser
• platform for integrated visual analysis of genomic data
• feature-rich genome browser
• computationally extensible via plugin framework
FEATURE demonstration
• INTERFACE
• HTS READ ALIGNMENTS
• EXAMPLE PLUGIN: SNP FINDER
Power of visual analytics
• task: find the correct parameter for a command-line tool
Plugin Framework
• unlocks the potential for performing visual analytics
• beneficial for both users and tool developers
• tool developers: simple platform for development and dissemination of work
• plugin development is easy
• API contains over a hundred prebuilt functions (e.g. get track data, add bookmarks, draw custom graphics, etc.)
Conclusions
• Savant is a platform for integrated visualization and analysis of genomic data
• stand-alone genome browser
• novel features: e.g. table view, visualization modes, data selection, etc.
• computationally extensible through plugin framework
• makes interpretation and analysis of genomic data easier and more efficient
Acknowledgements
Orion, Vanessa, Joe, Nilgun, Paul, Vera, Recep, Andrew, Vlad, Mike Brudno, Yue, Marc, Misko, Yoni
Thanks!