Some new sequencing technologies. Molecular Inversion Probes. Illumina Genotype Arrays. Single Molecule Array for Genotyping—Solexa. Nanopore Sequencing. http://www.mcb.harvard.edu/branton/index.htm. Pyrosequencing on a chip. Mostafa Ronaghi, Stanford Genome Technologies Center
Nanopore Sequencing http://www.mcb.harvard.edu/branton/index.htm
Pyrosequencing on a chip • Mostafa Ronaghi, Stanford Genome Technologies Center • 454 Life Sciences
Technologies available today
• Illumina
  • 550,000-SNP array: $300-500 in bulk
• 454
  • 200 bp reads, 100 Mbp total sequence in 1 run, $8K
  • 500 bp reads at much higher throughput coming soon
• Solexa
  • 1 Gbp of sequence coming in paired 35 bp reads
  • 1 day, approx. $10K per run
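As a rough comparison, the run figures quoted above can be converted to cost per megabase. This is a minimal sketch: the dictionary layout and the helper name `cost_per_mbp` are illustrative, and the Solexa figures were projections ("coming", "approx."), so the result is only an order-of-magnitude estimate.

```python
# Yield and cost per run as quoted on the slide (Solexa values are approximate).
runs = {
    "454": {"yield_mbp": 100, "cost_usd": 8_000},
    "Solexa": {"yield_mbp": 1_000, "cost_usd": 10_000},
}


def cost_per_mbp(run):
    """Dollars per megabase of raw sequence for one run."""
    return run["cost_usd"] / run["yield_mbp"]


for name, run in runs.items():
    print(f"{name}: ${cost_per_mbp(run):.0f} per Mbp")
# 454: $80 per Mbp
# Solexa: $10 per Mbp
```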
Short read sequencing protocol
• Random, high-coverage clone library (CovG = 7-10x)
• Low coverage of each clone by reads (CovR = 1-2x)
Assembly quality Read length = 200 bp, Error rate = 1%, Net coverage = 20.0x
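The two coverage figures in the protocol combine into a net coverage. The sketch below assumes net coverage is simply the product CovG × CovR (for example 10x × 2x = 20x, matching the figure quoted above), and uses the standard Poisson idealisation to estimate what fraction of a single clone is left untouched by reads. Both function names are illustrative, and neither formula is stated on the slides.

```python
import math


def net_coverage(cov_clones, cov_reads):
    """Assumed model: each genome position lies in ~cov_clones clones,
    and each clone is covered ~cov_reads times by reads."""
    return cov_clones * cov_reads


def expected_uncovered_fraction(coverage):
    """Poisson idealisation: probability that a base is hit by zero reads."""
    return math.exp(-coverage)


print(net_coverage(10, 2))                        # 20
print(f"{expected_uncovered_fraction(2):.3f}")    # share of one clone missed at CovR = 2x
```

Under this idealisation a single clone at 1-2x read coverage has sizable gaps, which is why the protocol pairs it with a high-coverage clone library.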
Evolution at the DNA level
• Sequence edits: mutation, deletion
  …ACGGTGCAGTTACCA…
  …AC----CAGTCCACCA…
• Rearrangements: inversion, translocation, duplication
Evolutionary Rates
[Figure: outcomes in the next generation, labeled "OK", "X", and "Still OK?"]
Genome Evolution – Macro Events • Inversions • Deletions • Duplications
Synteny maps Comparison of human and mouse
Building synteny maps
Recommended local aligners
• BLASTZ
  • Most accurate, especially for genes
  • Chains local alignments
• WU-BLAST
  • Good tradeoff of efficiency/sensitivity
  • Best command-line options
• BLAT
  • Fast, less sensitive
  • Good for
    • comparing very similar sequences
    • finding a rough homology map
Index-based local alignment
• Dictionary: all words of length k (~10)
• Alignment initiated between words with alignment score ≥ T (typically T = k)
• Alignment: ungapped extensions until the score falls below a statistical threshold
• Output: all local alignments with score > statistical threshold
[Figure: the query is scanned against the DB]
Question: Using an idea from overlap detection, is there a better way to find all local alignments between two genomes?
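The seed-and-extend idea can be sketched in a few lines. This is not the code of any particular aligner: it covers only the T = k case (seeds are exact k-mer matches), it replaces the statistical threshold with a fixed score cutoff, and it stops extension with a simple "x-drop" rule. The function names, scoring values, and the `xdrop` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(db, k):
    """Dictionary step: map every length-k word in db to its start positions."""
    index = defaultdict(list)
    for i in range(len(db) - k + 1):
        index[db[i:i + k]].append(i)
    return index


def extend_ungapped(query, db, qi, di, k, match=1, mismatch=-1, xdrop=3):
    """Extend an exact k-mer seed left and right without gaps.

    In each direction, stop once the running score falls xdrop below the
    best score seen, and keep only the best-scoring prefix of the extension.
    Returns (score, query_start, db_start, length).
    """
    def walk(q, d, step):
        best = score = best_len = n = 0
        while 0 <= q < len(query) and 0 <= d < len(db):
            score += match if query[q] == db[d] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score >= xdrop:
                break
            q += step
            d += step
        return best, best_len

    right_score, right_len = walk(qi + k, di + k, +1)
    left_score, left_len = walk(qi - 1, di - 1, -1)
    score = k * match + left_score + right_score
    return score, qi - left_len, di - left_len, k + left_len + right_len


def local_alignments(query, db, k=4, threshold=6):
    """Scan the query against the index and report extensions scoring above threshold."""
    index = build_index(db, k)
    hits = set()  # several seeds on one diagonal extend to the same alignment
    for qi in range(len(query) - k + 1):
        for di in index.get(query[qi:qi + k], ()):
            hit = extend_ungapped(query, db, qi, di, k)
            if hit[0] > threshold:
                hits.add(hit)
    return sorted(hits, reverse=True)


print(local_alignments("TTACGTACGGAA", "CCACGTACGGCC"))
# [(8, 2, 2, 8)]
```

The two toy sequences share the 8-base word ACGTACGG at position 2 in both, so every seed on that diagonal extends to the same alignment and the set collapses them into one hit.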
Chaining local alignments
• Find local alignments
• Chain them with longest increasing subsequence (L.I.S.), O(N log N)
• Restricted DP
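The chaining step can be illustrated with the classic O(N log N) longest-increasing-subsequence routine. The sketch makes simplifying assumptions: each local alignment is reduced to a single (x, y) anchor point, all anchors count equally, and a valid chain must increase strictly in both coordinates. Practical chainers weight anchors by alignment score, which this does not do. The function name `chain_anchors` is illustrative.

```python
from bisect import bisect_left


def chain_anchors(anchors):
    """Return a longest subset of (x, y) anchors increasing strictly in both x and y."""
    # Sort by x; break ties by descending y so two anchors sharing an x
    # can never both appear in the same strictly increasing chain.
    pts = sorted(anchors, key=lambda p: (p[0], -p[1]))
    tails = []                 # tails[j]: smallest final y of any chain of length j + 1
    tail_idx = []              # index in pts of the anchor ending that chain
    prev = [None] * len(pts)   # back-pointers for recovering the chain
    for i, (_, y) in enumerate(pts):
        j = bisect_left(tails, y)   # binary search gives the log N factor
        if j == len(tails):
            tails.append(y)
            tail_idx.append(i)
        else:
            tails[j] = y
            tail_idx[j] = i
        prev[i] = tail_idx[j - 1] if j > 0 else None
    chain = []
    i = tail_idx[-1] if tail_idx else None
    while i is not None:
        chain.append(pts[i])
        i = prev[i]
    return chain[::-1]


anchors = [(1, 10), (2, 3), (3, 4), (4, 12), (5, 5), (6, 6)]
print(chain_anchors(anchors))
# [(2, 3), (3, 4), (5, 5), (6, 6)]
```

Anchors (1, 10) and (4, 12) are dropped because keeping either would break the increasing order in y. The regions between consecutive chained anchors are where a restricted DP would then be run.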
Progressive Alignment
• When the evolutionary tree is known:
  • Align closest first, in the order of the tree
  • In each step, align two sequences x, y, or profiles px, py, to generate a new alignment with associated profile presult
Weighted version:
  • Tree edges have weights, proportional to the divergence in that edge
  • New profile is a weighted average of the two old profiles
[Figure: tree with leaves x, y, z, w]
Example
Profile: (A, C, G, T, -)
px = (0.8, 0.2, 0, 0, 0)
py = (0.6, 0, 0, 0, 0.4)
s(px, py) = 0.8*0.6*s(A, A) + 0.2*0.6*s(C, A) + 0.8*0.4*s(A, -) + 0.2*0.4*s(C, -)
Result: pxy = (0.7, 0.1, 0, 0, 0.2)
s(px, -) = 0.8*1.0*s(A, -) + 0.2*1.0*s(C, -)
Result: px- = (0.4, 0.1, 0, 0, 0.5)
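The profile arithmetic in the example can be checked with a short script. The slide leaves the pairwise score s(a, b) unspecified, so the match, mismatch, and gap values below are placeholders chosen only for illustration. The merge uses equal weights, which is what the two "Result" lines in the example imply. All function names are hypothetical.

```python
ALPHABET = "ACGT-"


def s(a, b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Placeholder pairwise score; the actual scoring scheme is not given on the slide."""
    if a == "-" and b == "-":
        return 0.0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def profile_score(px, py):
    """Expected pairwise score between two profile columns:
    the sum over all symbol pairs (a, b) of px[a] * py[b] * s(a, b)."""
    return sum(
        px[i] * py[j] * s(a, b)
        for i, a in enumerate(ALPHABET)
        for j, b in enumerate(ALPHABET)
    )


def merge_profiles(px, py, wx=0.5, wy=0.5):
    """Weighted average of two profile columns (equal weights by default)."""
    return tuple(wx * a + wy * b for a, b in zip(px, py))


GAP_COLUMN = (0, 0, 0, 0, 1.0)   # aligning a column against a gap
px = (0.8, 0.2, 0, 0, 0)
py = (0.6, 0, 0, 0, 0.4)

print(profile_score(px, py))          # depends on the placeholder scores above
print(merge_profiles(px, py))         # compare with pxy on the slide
print(merge_profiles(px, GAP_COLUMN)) # compare with px- on the slide
```

For the weighted version, `wx` and `wy` would be derived from the tree edge weights instead of being fixed at 0.5.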
Threaded Blockset Aligner
[Figure labels: HMR – CD; Restricted Area; Profile Alignment; Human–Cow]
Reconstructing the Ancestral Mammalian Genome Human: C C Baboon: C G Dog: G C or G Cat: C
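One simple way to infer an ancestral base from leaf bases is the bottom-up pass of Fitch parsimony. The slide does not say which method was used, so this is only an illustration of the general idea. The tree shape ((Human, Baboon), (Dog, Cat)) and the reading of the leaf bases as Human C, Baboon C, Dog G, Cat C are assumptions taken from the flattened figure text, and `fitch_sets` is a hypothetical name.

```python
def fitch_sets(tree, leaf_base):
    """Bottom-up Fitch pass.

    tree is either a leaf name (str) or a (left, right) tuple.
    Returns (candidate bases at this node, substitutions needed below it).
    """
    if isinstance(tree, str):
        return {leaf_base[tree]}, 0
    left_set, left_cost = fitch_sets(tree[0], leaf_base)
    right_set, right_cost = fitch_sets(tree[1], leaf_base)
    common = left_set & right_set
    if common:
        # Children agree on at least one base: no extra substitution.
        return common, left_cost + right_cost
    # Children disagree: keep both options and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Assumed topology and leaf bases (see the note above).
tree = (("Human", "Baboon"), ("Dog", "Cat"))
leaf_base = {"Human": "C", "Baboon": "C", "Dog": "G", "Cat": "C"}

print(fitch_sets(("Dog", "Cat"), leaf_base))   # ambiguous ancestor: C or G
print(fitch_sets(tree, leaf_base))             # root resolves to C with one substitution
```

Under these assumptions the Dog/Cat ancestor is ambiguous between C and G, while the primate side resolves the root to C.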