Introduction to Bioinformatics: Dot Plots
Dot Plots
• One of the simplest and oldest methods for sequence alignment
• Visualizes regions of similarity between two sequences
• Assign one sequence to the horizontal axis
• Assign the other to the vertical axis
• Place a dot in every cell where the horizontal and vertical symbols match
• Diagonal lines indicate adjacent regions of identity
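The construction above can be sketched in a few lines of Python. This is a minimal illustration rather than anything from the original slides: the function names `dot_matrix` and `render`, and the `*` / `.` symbols, are assumptions chosen for this example.

```python
def dot_matrix(horizontal, vertical):
    """Return a grid where grid[i][j] is True when vertical[i] == horizontal[j]."""
    return [[v == h for h in horizontal] for v in vertical]


def render(horizontal, vertical):
    """Draw the dot plot as text: '*' marks a match, '.' marks a mismatch."""
    grid = dot_matrix(horizontal, vertical)
    lines = ["  " + " ".join(horizontal)]  # header row: the horizontal sequence
    for symbol, row in zip(vertical, grid):
        lines.append(symbol + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Hypothetical toy input, chosen only to keep the printed plot small.
print(render("GATTACA", "GATCA"))
```

Each row of the output corresponds to one symbol of the vertical sequence, so runs of `*` that step down and to the right are the diagonals of interest.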
Simple Example
• Construct a simple dot plot for GCTGAA and GCGAA
• One sequence goes horizontally, the other vertically
• Mark the boxes where the horizontal and vertical symbols match
• Look for diagonal(s)
• Alignment:
GCTGAA
GC-GAA
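Looking for diagonals can also be done programmatically. The sketch below scans every diagonal of the plot for GCTGAA (horizontal) and GCGAA (vertical) and reports runs of consecutive matches. The name `diagonal_runs`, the `min_length` parameter, and the 0-based `(row, column, length)` tuples are assumptions made for this illustration.

```python
def diagonal_runs(horizontal, vertical, min_length=2):
    """Return (row, col, length) for each run of consecutive diagonal matches."""
    runs = []
    rows, cols = len(vertical), len(horizontal)
    # offset = col - row identifies one diagonal of the plot
    for offset in range(-(rows - 1), cols):
        i = max(0, -offset)
        j = i + offset
        run_length = 0
        while i < rows and j < cols:
            if vertical[i] == horizontal[j]:
                run_length += 1
            else:
                if run_length >= min_length:
                    runs.append((i - run_length, j - run_length, run_length))
                run_length = 0
            i += 1
            j += 1
        # a run may reach the edge of the plot
        if run_length >= min_length:
            runs.append((i - run_length, j - run_length, run_length))
    return runs


print(diagonal_runs("GCTGAA", "GCGAA"))  # [(0, 0, 2), (2, 3, 3)]
```

The two runs correspond to GC at the start of both sequences and GAA at their ends. The second run sits one column to the right of the first, which is how a dot plot shows a single-symbol gap.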
Another Example
• Construct a simple dot plot for GCTAGTCAGATCTGACGCTA and GATGGTCACATCTGCCGC
• A long stretch of nearly identical residues is revealed, starting at the fifth nucleotide of each sequence (GTCA-ATCTG-CGC, where "-" marks a mismatched position).
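The stretch named above can be checked directly by walking along the diagonal that starts at the fifth nucleotide of each sequence. This sketch assumes the run-together sequence string splits into GCTAGTCAGATCTGACGCTA (20 nucleotides) and GATGGTCACATCTGCCGC (18 nucleotides), the split that agrees with the "fifth nucleotide" statement. The helper name `diagonal_consensus` is hypothetical.

```python
def diagonal_consensus(seq_a, seq_b, start_a, start_b):
    """Walk one diagonal; keep matching symbols and write '-' for mismatches."""
    pairs = zip(seq_a[start_a:], seq_b[start_b:])  # stops at the shorter remainder
    return "".join(a if a == b else "-" for a, b in pairs)


seq_a = "GCTAGTCAGATCTGACGCTA"  # assumed first sequence
seq_b = "GATGGTCACATCTGCCGC"    # assumed second sequence

# The fifth nucleotide is index 4 when counting from zero.
print(diagonal_consensus(seq_a, seq_b, 4, 4))  # GTCA-ATCTG-CGC
```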
Sliding Window and Cutoff
• Problem
  • The plot becomes noisy when comparing large, similar sequences
• Solution
  • Sliding window (size = w)
  • Cutoff (value = v)
  • Consider w nucleotides at a time
  • When a window contains at least v matches, place a dot in the cell where the window starts
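A minimal sketch of the window-and-cutoff rule follows. It assumes the window runs along the diagonal (position k of one window is compared with position k of the other) and that windows extending past the end of either sequence are skipped; the slides do not specify either detail. The function name `windowed_dots` is hypothetical.

```python
def windowed_dots(horizontal, vertical, w, v):
    """Return the set of (row, col) cells where a window of size w has >= v matches."""
    dots = set()
    for i in range(len(vertical) - w + 1):
        for j in range(len(horizontal) - w + 1):
            # count matches between the two length-w windows starting at (i, j)
            matches = sum(vertical[i + k] == horizontal[j + k] for k in range(w))
            if matches >= v:
                dots.add((i, j))  # the dot goes where the window starts
    return dots


# One mismatch in a window of four still clears a cutoff of three...
print(windowed_dots("ACGT", "ACTT", w=4, v=3))  # {(0, 0)}
# ...but not a cutoff of four.
print(windowed_dots("ACGT", "ACTT", w=4, v=4))  # set()
```

Setting w = 1 and v = 1 reproduces the unfiltered dot plot, so the plain plot is a special case of this rule.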
Example
• Same example with w = 4 and v = 3
• Compare to the previous plot. You make the call!
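To make the comparison concrete, the sketch below draws the filtered plot as text for w = 4 and v = 3. It assumes the previous example's sequences are GCTAGTCAGATCTGACGCTA and GATGGTCACATCTGCCGC, and that windows running past the end of a sequence produce no dot; `windowed_plot` is a hypothetical helper name.

```python
def windowed_plot(horizontal, vertical, w=4, v=3):
    """Return one text row per vertical symbol: '*' where the window passes the cutoff."""
    rows = []
    for i in range(len(vertical)):
        row = ""
        for j in range(len(horizontal)):
            fits = i + w <= len(vertical) and j + w <= len(horizontal)
            hits = sum(vertical[i + k] == horizontal[j + k] for k in range(w)) if fits else 0
            row += "*" if hits >= v else "."
        rows.append(row)
    return rows


seq_a = "GCTAGTCAGATCTGACGCTA"  # assumed horizontal sequence
seq_b = "GATGGTCACATCTGCCGC"    # assumed vertical sequence

print("w = 4, v = 3")
print("\n".join(windowed_plot(seq_a, seq_b)))
print("unfiltered (w = 1, v = 1)")
print("\n".join(windowed_plot(seq_a, seq_b, w=1, v=1)))
```

Printing both grids one after the other makes it easy to judge how much off-diagonal noise the window removes.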
Worksheet • w = 4 and v = 3
What else can it do (and how)?
• Gaps
• Inverse subsequence
• Repeats
• Palindrome
• Genome rearrangement
• Exon identification
• RNA structure prediction
• Nice tool for conceptualizing sequence-related algorithms
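As one illustration of the "how", inverted matches and palindromes can be exposed by plotting a sequence against a reverse complement, which turns them into ordinary diagonals. The slides do not say which method they intend, so this is only one plausible approach; the names `reverse_complement` and `main_diagonal` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base (A<->T, C<->G) and reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def main_diagonal(horizontal, vertical):
    """Mark matches along the main diagonal: '*' for a match, '.' otherwise."""
    return "".join("*" if h == v else "." for h, v in zip(horizontal, vertical))


site = "GAATTC"  # EcoRI recognition site, a well-known palindromic sequence
print(main_diagonal(site, reverse_complement(site)))  # ******

other = "GCTGAA"  # not palindromic, so its diagonal has gaps
print(main_diagonal(other, reverse_complement(other)))
```

A sequence that equals its own reverse complement fills the whole main diagonal, while a non-palindromic sequence does not.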