Sequencing technology and assembly
Sanger sequencing • Sanger sequencing with radioactivity • High throughput Sanger sequencing with fluorescence
Roche/454 sequencing • Yield: 500,000,000 bp • Cost: $5,000 • Time: ~1 min per bp • Read length: 450 bp → 1 kb
Illumina sequencing • Yield: 8,000,000,000 – 80,000,000,000 bp • Time: ~1 hour per bp • Read length: ~150 bp • Cost: • Sample Extraction, $14.00/sample • Automated Sample Library, $90.00/sample • MiSeq (2x250), 1 lane 8-10Gb/lane, $1,700.00/sample • MiSeq (2x300), 1 lane, 10-12Gb/lane, $2,100.00/sample • HiSeq2500 (2x150), 1 lane, ~40Gb/lane, $2,500.00/lane • HiSeq2500 (2x250), 1 lane, ~65Gb/lane, $3,500.00/lane
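To compare the lane prices above on a common scale, the arithmetic can be sketched as cost per gigabase. This is a minimal sketch that uses only the yields and prices listed on the slide; the names `LANES`, `cost_per_gb`, and `cost_table` are illustrative assumptions, and the "~" yields are treated as exact values.

```python
# (platform, low yield in Gb, high yield in Gb, listed price in USD),
# copied from the slide; "~40Gb" and "~65Gb" are treated as exact here.
LANES = [
    ("MiSeq 2x250", 8, 10, 1700.0),
    ("MiSeq 2x300", 10, 12, 2100.0),
    ("HiSeq2500 2x150", 40, 40, 2500.0),
    ("HiSeq2500 2x250", 65, 65, 3500.0),
]


def cost_per_gb(price_usd, yield_gb):
    """Listed price divided by yield in gigabases."""
    return price_usd / yield_gb


def cost_table(lanes=LANES):
    """Return (platform, best-case $/Gb, worst-case $/Gb) for each lane."""
    rows = []
    for name, low_gb, high_gb, price in lanes:
        best = cost_per_gb(price, high_gb)   # highest yield gives lowest cost
        worst = cost_per_gb(price, low_gb)   # lowest yield gives highest cost
        rows.append((name, best, worst))
    return rows


if __name__ == "__main__":
    for name, best, worst in cost_table():
        print(f"{name}: ${best:.2f} to ${worst:.2f} per Gb")
```

Under these assumptions the HiSeq lanes come out several times cheaper per gigabase than the MiSeq lanes, at the price of a larger minimum spend per lane.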
Ion Torrent • Yield: 50,000,000 bp • Time: 2 hours • Read length: 500bp • <1 min per bp • Cost: $500
PacBio • Long reads (5-10kb) • High error, but read 150x coverage • Library prep: $600 • Sequencing: $300
MinION • Quick sample prep • Long reads (~50kb) • High error • $150 per run
Base calling • Need to be sure which base you have identified • Depends on the technology • Each machine includes its own base-calling software • Phred is a historical package developed at U. Washington • Phred scores encode the probability that the base call is wrong: Q = -10 x log10(P)
Quality values • Phred 10: 1 in 10 (10^-1) chance that the base is wrong • Phred 20: 1 in 100 (10^-2) chance that the base is wrong • Phred 30: 1 in 1,000 (10^-3) chance that the base is wrong • Phred 40: 1 in 10,000 (10^-4) chance that the base is wrong • Phred 99: the base is correct! • FASTQ quality characters are the Phred score + 33, converted to an ASCII character
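The relationship between Phred scores, error probabilities, and FASTQ quality characters can be sketched in a few lines of Python. This is a minimal illustration assuming the +33 offset stated above; the function names are hypothetical and not taken from Phred or any other package.

```python
import math

PHRED_OFFSET = 33  # FASTQ offset stated on the slide (score + 33)


def phred_to_error_prob(q):
    """Probability that a base with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an error probability p: -10 * log10(p)."""
    return -10 * math.log10(p)


def decode_quality(qual):
    """Convert a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - PHRED_OFFSET for ch in qual]


def encode_quality(scores):
    """Convert a list of integer Phred scores into a FASTQ quality string."""
    return "".join(chr(q + PHRED_OFFSET) for q in scores)


if __name__ == "__main__":
    for q in decode_quality("I5+!"):
        print(q, phred_to_error_prob(q))
```

For example, the quality character `I` has ASCII code 73, so it decodes to Phred 40, a 1 in 10,000 chance that the base is wrong.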
Homopolymeric errors • Homopolymeric runs: signal is not linear with run length • Not clear if a run is 5 or 6 bases
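To see where a read is exposed to this kind of error, one can list the homopolymeric runs in a sequence. The sketch below is a hypothetical helper (`homopolymer_runs` and its `min_len` threshold are assumptions for illustration, not part of any sequencing software).

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for each run of identical bases
    that is at least min_len long. Positions are 0-based."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = sum(1 for _ in group)  # number of identical bases in this run
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    print(homopolymer_runs("ACGTTTTTAGGGC"))
```

Runs of five or more bases, such as the `TTTTT` in this example, are the ones where a length call of 5 versus 6 becomes ambiguous.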
Errors • Different technologies have different error rates: • Pyrosequencing/Ion Torrent – homopolymeric tracts • Illumina – substitution errors • PacBio – machines cannot keep up with the biology • MinION – noise coming through the membrane