PHYTOPHTHORA GENOME SEQUENCING: A case study
Santhosh J. Eapen
sjeapen@spices.res.in
Phytophthora Whole Genome Sequencing
The sequencing platform was the Illumina Genome Analyzer. Base calling, alignment, and variant analysis were done with CASAVA v1.7 (short for "Consensus Assessment of Sequence And VAriation"). Maq was used for reference-guided assembly and variant detection. The JGI P. capsici genome served as the reference genome.
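For readers unfamiliar with Maq, its documented reference-guided workflow can be sketched as a list of commands. This is only an illustration under assumptions: the file names and the `maq_commands` helper are hypothetical, the options actually used in this study are not stated on the slide, and the mapping step is included only because `maq assemble` works on Maq's own `.map` alignment format.

```python
from typing import List


def maq_commands(ref_fasta: str, reads_fastq: str, prefix: str) -> List[List[str]]:
    """Build (but do not run) the basic Maq steps for one read file.

    All output file names are derived from `prefix` and are hypothetical.
    """
    ref_bfa = f"{prefix}.ref.bfa"
    reads_bfq = f"{prefix}.reads.bfq"
    aln_map = f"{prefix}.map"
    consensus = f"{prefix}.cns"
    return [
        # Convert the reference FASTA to Maq's binary format.
        ["maq", "fasta2bfa", ref_fasta, ref_bfa],
        # Convert the reads (Sanger-scaled FASTQ) to Maq's binary format.
        ["maq", "fastq2bfq", reads_fastq, reads_bfq],
        # Align the reads to the reference.
        ["maq", "map", aln_map, ref_bfa, reads_bfq],
        # Build the consensus from the alignment.
        ["maq", "assemble", consensus, ref_bfa, aln_map],
        # Extract SNP calls from the consensus (written to standard output).
        ["maq", "cns2snp", consensus],
    ]


# Print the commands for inspection; nothing is executed here.
for cmd in maq_commands("pcapsici_ref.fasta", "reads.fastq", "pcap"):
    print(" ".join(cmd))
```

Building the commands as lists keeps the sketch safe to run without Maq installed; each list could later be passed to `subprocess.run` once the paths are real.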
Alignment status and reports
• Number of reference scaffolds : 917
• Length of reference sequences excluding gaps : 56042007
• Length of gaps in the reference sequences : 8005190
• Length of non-gap regions covered by reads : 22593594
• GC% : 50.4
• Total Reads : 15849154
• Reads Aligned : 48.8738
• Total Genome Size : 64022747
• Genome Covered : 28234853
• %Coverage : 44.1013
• Average Read Depth : 1.50491
• Average depth across all non-gap regions : 11.284
• Average depth across 24 bp unique regions : 1.565
• % Coverage at 1X : 54.8897
• Single Nucleotide Variants at 3X cutoff : 330410
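As a quick sanity check, the coverage percentage can be re-derived from the raw counts in the report. The sketch below uses only numbers reported on this slide; the variable and function names are illustrative and not part of the original analysis.

```python
# Values copied from the alignment report (all lengths in bp).
TOTAL_GENOME_SIZE = 64_022_747
GENOME_COVERED = 28_234_853
NON_GAP_LENGTH = 56_042_007
GAP_LENGTH = 8_005_190


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


coverage_pct = percent(GENOME_COVERED, TOTAL_GENOME_SIZE)
gap_pct = percent(GAP_LENGTH, TOTAL_GENOME_SIZE)
# Difference between (non-gap + gap) and the reported total genome size.
length_mismatch = NON_GAP_LENGTH + GAP_LENGTH - TOTAL_GENOME_SIZE

print(f"Coverage: {coverage_pct:.4f}%")
print(f"Gaps: {gap_pct:.2f}% of the reference")
print(f"Non-gap + gap minus total: {length_mismatch} bp")
```

The recomputed coverage agrees with the reported 44.1013. The non-gap and gap lengths as listed sum to 24,450 bp more than the reported total genome size, which is worth checking against the original report.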
Base composition and genome size of P. capsici
Total genome size = 64022747 bp (64 Mb)
Structural Annotation
• Structural annotation was carried out with AUGUSTUS (version 2.5.5), using Magnaporthe_grisea as the species model
• However, a model trained on oomycete genomes still has to be developed to obtain accurate results
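AUGUSTUS writes its predictions in a GFF-style, tab-separated format, so a first look at the result can be as simple as counting predicted genes per strand. The following is a minimal sketch, assuming standard 9-column records; the `genes_per_strand` helper and the example records are hypothetical and made up for illustration, not taken from the P. capsici output.

```python
from collections import Counter
from typing import Iterable


def genes_per_strand(gff_lines: Iterable[str]) -> Counter:
    """Count 'gene' features on each strand in GFF-formatted lines."""
    counts: Counter = Counter()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete 9-column GFF record
        feature, strand = fields[2], fields[6]
        if feature == "gene":
            counts[strand] += 1
    return counts


# Hypothetical records, invented purely to exercise the function.
example = [
    "# example comment line",
    "scaffold_1\tAUGUSTUS\tgene\t100\t900\t0.85\t+\t.\tg1",
    "scaffold_1\tAUGUSTUS\tCDS\t100\t900\t0.85\t+\t0\tg1.t1",
    "scaffold_2\tAUGUSTUS\tgene\t50\t700\t0.91\t-\t.\tg2",
]
print(genes_per_strand(example))
```

On a real file, the same function can be fed an open file handle, since it only iterates over lines.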
Functional Annotation Result
• Functional annotation of the negative strand is complete
Comparison of P. capsici with P. capsici (JGI), P. infestans, P. ramorum & P. sojae
GenomeView - next-generation stand-alone genome browser
• Visualizes and manipulates large volumes of genomics data
• Browses high volumes of aligned short-read data, with dynamic navigation and semantic zooming, from the whole-genome level down to the single nucleotide
• Enables visualization of whole-genome alignments of dozens of genomes relative to a reference sequence
• Handles thousands of annotation features and millions of mapped short reads
Future Plans
• To assign putative functions to the remaining genes
• To provide a genome-wide comparison with other sequenced Phytophthora species
• More genomes to be sequenced
Data, data, everywhere but ... is it knowledge?
• Five oomycete genome sequences are available and several more are on the way
• The rate of new sequence generation is accelerating extraordinarily with next-generation technologies
• Even today, the ability to generate high-throughput sequencing and transcriptomic data is outstripping the ability to transform the data into knowledge
• Automated data processing pipelines are not a substitute for human insight
Life in a data-rich environment
Every experimental biologist needs to be a computational biologist too
Some concluding remarks
• Trust but verify
• Beware of gene prediction tools!
• Always use more than one gene prediction tool and more than one genome when possible
• This is an active area of bioinformatics research, so keep up with the new literature
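One simple way to act on the "more than one gene prediction tool" advice is to flag which predictions from one tool are supported by an overlapping prediction from a second tool. The sketch below is only one possible approach under stated assumptions: genes are reduced to (scaffold, start, end, strand) tuples, any same-strand overlap counts as support, and all names and coordinates are hypothetical.

```python
from typing import List, Tuple

# (scaffold, start, end, strand), with 1-based inclusive coordinates.
Gene = Tuple[str, int, int, str]


def overlaps(a: Gene, b: Gene) -> bool:
    """True if two genes share a scaffold and strand and their intervals overlap."""
    return a[0] == b[0] and a[3] == b[3] and a[1] <= b[2] and b[1] <= a[2]


def split_by_support(tool_a: List[Gene], tool_b: List[Gene]) -> Tuple[List[Gene], List[Gene]]:
    """Split tool A's genes into those with and without an overlapping tool B gene."""
    supported: List[Gene] = []
    unsupported: List[Gene] = []
    for gene in tool_a:
        if any(overlaps(gene, other) for other in tool_b):
            supported.append(gene)
        else:
            unsupported.append(gene)
    return supported, unsupported


# Hypothetical predictions from two tools, invented for illustration.
tool_a = [("scaffold_1", 100, 900, "+"), ("scaffold_1", 2000, 2600, "-")]
tool_b = [("scaffold_1", 150, 950, "+"), ("scaffold_2", 2000, 2600, "-")]

supported, unsupported = split_by_support(tool_a, tool_b)
print("Supported by both tools:", supported)
print("Needs a closer look:", unsupported)
```

Predictions that only one tool reports are not necessarily wrong; they are simply the ones most worth verifying by hand.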
Other factors
• Changing technology
• New and disappearing companies?
• Changing price structure
• Cost of machine
• Cost of operation (reagents/people)
• Service from the company
• 1 machine vs (2 or 3 machines) vs 40 machines
• Changing software and processing
What have we learned?
• Sequencing technologies are changing fast
• Allowing new biology to be performed, new questions to be asked
• Understand the difference between some of the technologies