UK NGS Sequencing Update, July 2009
Dr Gerard Bishop - Division of Biology
Dr Sarah Butcher - Centre for Bioinformatics
Overview
• 7-9 kb mate-pair library and SOLiD sequencing.
• To discuss:
  • “When will we know when we are finished?”
  • “What is the coverage of the NGS genome?”
  • Transcriptome sequencing?
  • What expertise and resources we have.
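As a starting point for the coverage question above, here is a minimal sketch of the standard back-of-envelope calculation: mean depth c = N x L / G, and the Poisson (Lander-Waterman style) estimate 1 - e^(-c) for the fraction of bases covered at least once. The read count, read length and genome size below are hypothetical placeholders, not figures from this project, and the model assumes reads are placed uniformly at random.

```python
import math


def mean_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Mean sequencing depth c = N * L / G."""
    return num_reads * read_length / genome_size


def fraction_covered(depth: float) -> float:
    """Poisson estimate of the fraction of bases covered at least once.

    Assumes reads land uniformly at random, which real libraries only
    approximate (repeats and GC bias are ignored).
    """
    return 1.0 - math.exp(-depth)


# Hypothetical example values, chosen only to illustrate the arithmetic.
depth = mean_depth(num_reads=2_000_000, read_length=50, genome_size=10_000_000)
print(f"mean depth: {depth:.1f}x")
print(f"expected fraction of genome covered: {fraction_covered(depth):.6f}")
```

Under these assumptions, depth alone does not answer "are we finished?": repeats and uneven sampling leave gaps that this simple model does not capture.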
7-9 kb mate-pair library and SOLiD sequencing
• Estimated consumable cost for ePCR and the run: ~€16-18K.
• Once the library is delivered to Imperial, we will be able to put it into Imperial’s pipeline (we will update on sequence delivery time after this meeting).
EU-SOL Staff Status in the Current Period
• Bioinformatician currently working on the AGP/PGP viewer for T6.1.10, “Bring online a web based Accession Golden Path (AGP) and Pseudo-AGP viewer”.
• New bioinformatician starting full-time on 1 September 2009.
Amended Work Plan W6.1 This Period
• T6.1.13: Provide assistance and expertise in the analysis and integration of data arising from ultra-high-throughput sequencing platforms, including extensive new data arising from WP5.4, and provide coordination efforts with WP1 (11 man-months available).
Support for Ultra High Throughput Sequencing
• Illumina GA2 and 2x ABI SOLiD 3 platforms in-house
• Scalable high-performance SAN disk storage and data management (~100 TB now)
• Computational power: new 120-core cluster and shared-memory machine (128 GB, 16 cores), access to >128 GB RAM Ultix, other clusters
• Continually updated software selection for all platforms, including vendor pipelines and open-source programs
• Specific analysis expertise: >2 years for 454/GA2
Support for Ultra High Throughput Sequencing
• Analyses ongoing for multiple projects (de novo assembly, transcript profiling, full-length cDNA assembly, SNP profiling, ChIP-seq, etc.), e.g. the Blumeria graminis f. sp. hordei genome sequencing project using Sanger (3730), Illumina GA2, Roche 454 FLX and Titanium reads - www.blugen.org
• Bespoke pipelines and data visualisation methods being addressed
• Collaborative efforts (e.g. with sequence providers) ongoing
Modelling de novo Assembly Methods
• Assessment of positional error rates in real 454 and Illumina data
• Model read fragments from a sequenced eukaryote genome
• Assemble with different algorithms and assess the effects of:
  • Coverage depth
  • Read length
  • Error rate
(Abbott et al., in preparation)
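The read-modelling step above could be sketched roughly as follows. This is an illustrative simplification, not the method of Abbott et al.: the function name `simulate_reads` and all parameter values are hypothetical, reads are sampled uniformly from one strand, and a single uniform substitution rate stands in for the positional error profiles measured from real 454 and Illumina data.

```python
import random


def simulate_reads(genome, read_length, depth, error_rate, seed=0):
    """Sample fixed-length reads from a reference at a target mean depth.

    Returns a list of (start_position, read_sequence) tuples. Each base is
    substituted with probability `error_rate` (uniform along the read, an
    assumption; real platforms show position-dependent error).
    """
    rng = random.Random(seed)
    n_reads = int(depth * len(genome) / read_length)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_length + 1)
        fragment = genome[start:start + read_length]
        read = "".join(
            # Replace the base with a different one when an error is drawn.
            rng.choice([b for b in "ACGT" if b != base])
            if rng.random() < error_rate
            else base
            for base in fragment
        )
        reads.append((start, read))
    return reads


# Hypothetical toy reference; a real study would load a sequenced genome.
toy_rng = random.Random(42)
toy_genome = "".join(toy_rng.choice("ACGT") for _ in range(10_000))

# Vary depth, read length and error rate to produce inputs for assemblers.
for depth, read_length, error_rate in [(5, 36, 0.01), (20, 36, 0.01), (20, 250, 0.005)]:
    reads = simulate_reads(toy_genome, read_length, depth, error_rate)
    print(depth, read_length, error_rate, len(reads))
```

Keeping the true start position alongside each read makes it possible to score an assembly against the known reference afterwards.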