Working with Sequencing Results: An Introduction to Sequence Assembly, Primer Design and Editing
Automated Fluorescent DNA Sequencing

Reaction Components (Mix, then Cycle)
• Template DNA: sequence to be determined.
• Primer: a short single-stranded DNA fragment, usually 16-25 bases long.
• Deoxy-NTP (dATP, dTTP, dCTP, dGTP): building blocks for DNA.
• Dideoxy-NTP (ddATP, ddTTP, ddCTP, ddGTP): terminators.
• DNA polymerase: an enzyme that catalyzes DNA extension.
• Suitable buffer.

Double-stranded template
• -GCGTCATCTATCGGTAGCTTAACCGTAGGCTAATCGTAGCATCTGCAT-
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGACGTA-

Denaturing, 95°C (strand separation)
• -GCGTCATCTATCGGTAGCTTAACCGTAGGCTAATCGTAGCATCTGCAT-
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGACGTA-

Annealing, 50°C (primer anneals to one of the DNA strands)
• -GCGTCATCTATCGGTAGCTTAACCGTAGGCTAATCGTAGCATCTGCAT-
• CGCAGTAGATAGCCATC

Primer extension, 60°C
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGACGTA
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGACGT
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGACG
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGAC
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGA
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAG
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTA
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGT
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCG
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATC
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCAT
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCA
• -CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGC
Final Sequencing Result
• Computer software interprets light bands, makes base calls and outputs an electropherogram
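The idea behind reading a sequence from terminated fragments can be sketched in a few lines of Python. This is a simplified illustration, not how base-calling software actually works: it assumes exactly one fragment of every length and a perfectly clean signal, and the function names `extension_fragments` and `read_bases` are hypothetical. The primer and extension product are the ones shown in the cycle above.

```python
import random

def extension_fragments(primer, product):
    """Return one chain-terminated fragment for every length beyond the primer."""
    if not product.startswith(primer):
        raise ValueError("extension product must begin with the primer")
    return [product[:n] for n in range(len(primer) + 1, len(product) + 1)]

def read_bases(fragments):
    """Order fragments by size (as electrophoresis does) and read each terminal base."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))

primer = "CGCAGTAGATAGCCATC"
product = "CGCAGTAGATAGCCATCGAATTGGCATCCGATTAGCATCGTAGACGTA"

fragments = extension_fragments(primer, product)
random.Random(0).shuffle(fragments)   # fragments are mixed together in the tube
called = read_bases(fragments)        # sorting by length recovers the order
print(called)
```

Because each fragment ends in a labeled terminator, sorting by length and reading the last base of each fragment recovers the sequence downstream of the primer.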
Sequencing Problems

Homopolymers
• Long run of T's causes polymerase slippage
• Use alternate primer: 25 T's with an anchor base

Secondary Structure
• High GC content forms structures that interfere with polymerase
• Use alternate sequencing chemistry

Double Sequence
• Multiple templates or priming sites
• Reprep template
• Select or design different primer
I Have My Sequence … What Now?
• Sequencing results are only data
• Several steps remain before this data is useful
• Evaluate quality of data
• Small scale project: individual electropherograms
• Large scale project: statistical / computer assessment
• Evaluate completeness of data
• Full coverage / Depth of coverage
• Sequence assembly
• Primer design
• Determine biological significance
• BLAST/ORF/Sequence Motifs
Tools

Assembly Software
• Used to assemble anywhere from two to tens of thousands of individual sequencing runs into contigs representing the entire sequence of interest

Oligo Design Software
• Used to analyze potential oligonucleotides (primers) for suitability as sequencing primers, PCR primers, or probes
Sequence Assembly: Overlapping Sequence Is The Key To Accurate Assembly
[Figure: sequence fragments labeled A through F, shown without and with overlapping regions]
• With no overlapping data, these sequences are impossible to put in the correct order
• These sequences have areas of overlapping data, allowing for proper assembly and giving you the order of the entire sequence
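A minimal sketch of why overlap matters, using a simple greedy merge in Python. This is an illustrative toy, not the algorithm used by any particular assembly program: it assumes error-free reads that are all in the same orientation, and the names `overlap` and `greedy_assemble` are hypothetical. The reads are cut from the template sequence shown earlier.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no overlapping data: the order cannot be determined
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

target = "GCGTCATCTATCGGTAGCTTAACCGTAGGC"
overlapping = [target[16:30], target[0:12], target[8:20]]   # reads share 4 bases
disjoint = [target[0:8], target[12:20]]                     # reads share nothing

contigs = greedy_assemble(overlapping)
unordered = greedy_assemble(disjoint)
print(contigs)
print(unordered)
```

With overlapping reads the sketch returns a single contig; with disjoint reads it returns the reads unmerged, because nothing indicates their order.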
Assembly Example
[Figure: annotated screenshot of an assembly project]
• List of all of the sequencing runs in the project
• Graphical representation of the sequences in the project showing relative position and orientation
• Consensus sequence, built from all of the sequences below
• Indication of the number of sequences covering a particular region
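The coverage indicator can be thought of as a per-position count of reads. Below is a small Python sketch under stated assumptions: reads are given as hypothetical (start, end) positions on the consensus, with `end` exclusive, and the function name `coverage_depth` is made up for illustration.

```python
def coverage_depth(length, reads):
    """Count how many reads cover each position of a consensus of the given length."""
    depth = [0] * length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, length)):
            depth[pos] += 1
    return depth

# Two hypothetical reads on a 10-base consensus, overlapping at positions 4 and 5.
depth = coverage_depth(10, [(0, 6), (4, 10)])
uncovered = [pos for pos, d in enumerate(depth) if d == 0]
print(depth)
print(uncovered)
```

Positions with a depth of zero are gaps in coverage; positions with a depth of one have no second read to confirm the base call.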
Primer Design
1. 1.6 kb insert to be sequenced
2. 1st sequencing run covers about 900 bases
3. This region remains unknown
4. Design a new primer, allowing for overlap
5. Next sequencing reaction covers unknown region
• Primers are used for a variety of purposes, including extending sequencing from a known region into an unknown one
• This technique is known as primer walking
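The arithmetic of primer walking can be sketched as follows. The 1.6 kb insert and the roughly 900-base read come from the example above; the 100-base overlap is an assumed value chosen only for illustration, and `plan_walk` is a hypothetical helper. The sketch ignores the few bases immediately after each primer that are usually not readable.

```python
def plan_walk(insert_len, read_len, overlap):
    """Return (start, end) spans of successive reads needed to cover the insert."""
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and shorter than a read")
    spans = []
    start = 0
    while True:
        end = min(start + read_len, insert_len)
        spans.append((start, end))
        if end >= insert_len:
            return spans
        # Place the next primer inside known sequence so the reads overlap.
        start = end - overlap

spans = plan_walk(insert_len=1600, read_len=900, overlap=100)
print(spans)
```

Under these assumptions, a second read primed about 100 bases before the end of the first one is enough to finish the insert.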
Primer Design
• Primers must be highly specific and bind optimally to the intended target
• Certain design criteria displayed in Oligo 6 will assist in analysis
• To get started, enter a primer sequence into the Edit Sequence window and Accept the primer
• Example primer: GTATGTCGATGGACAAGTG
Primer Design

Melting Temperature Criteria
• Between 50-60°C
• 54-58°C ideal
• Varies with base composition
• 18-24 bases usually in this range

Duplex Formation Criteria
• 3' dimer ∆G > -3 kcal/mol
• Overall dimer ∆G > -10 kcal/mol
• Hairpin Tm < primer Tm
• Ideally less than room temperature

Internal Stability Criteria
• Represents binding strength of different parts of the oligo
• Generally higher in the middle, tailing off at the end
• Each one will be somewhat different
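For a quick sanity check of the melting temperature criterion, the Wallace rule, Tm ≈ 2(A+T) + 4(G+C), gives a rough estimate for short oligos. This is a swapped-in approximation, not the calculation oligo design software performs, so its numbers should not be expected to match such software exactly. The helper names and the default ranges (taken from the criteria above) are assumptions for illustration.

```python
def wallace_tm(primer):
    """Rough melting temperature in degrees C using the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

def check_primer(primer, tm_range=(50, 60), length_range=(18, 24)):
    """Report length and estimated Tm, and whether each falls in the given range."""
    tm = wallace_tm(primer)
    return {
        "length": len(primer),
        "tm": tm,
        "length_ok": length_range[0] <= len(primer) <= length_range[1],
        "tm_ok": tm_range[0] <= tm <= tm_range[1],
    }

report = check_primer("GTATGTCGATGGACAAGTG")
print(report)
```

For the example primer from the previous slide, this estimate comes out at 56°C for a 19-base oligo. Dimer and hairpin ∆G values need a thermodynamic model and are not covered by this sketch.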
Primer Design: Small Changes
• Region: CCCATCCCCAGCTCCTACGGGTCG
• Oligo 1: CCCAGCTCCTACGGG
• Oligo 2: ATCCCCAGCTCCTAC
• Small changes in the location of a primer can cause major changes in its characteristics
• Moving the oligo back 3 bases makes a very poor primer into a good one
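The 3-base shift between the two oligos can be confirmed by locating each one in the region. This short Python sketch only reports positions and 3' ends; it does not evaluate why one oligo performs better than the other, and the variable names are hypothetical.

```python
region = "CCCATCCCCAGCTCCTACGGGTCG"
first = "CCCAGCTCCTACGGG"
second = "ATCCCCAGCTCCTAC"

# str.find returns the 0-based offset of the first match, or -1 if absent.
first_start = region.find(first)
second_start = region.find(second)
shift = first_start - second_start

print(first_start, second_start, shift)
print(first[-3:], second[-3:])   # the 3' ends of the two oligos differ
```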
Data Quality and Project Editing
• Once all the sequencing is completed and assembled, some final quality assessment and manual editing may be needed
• The amount and type of editing will depend on the intended use of the sequence
• To find areas that need editing, look for conflicts between the individual runs and the consensus
Data Quality and Project Editing
[Figure annotations]
• The red color of this base shows a disagreement
• The lower case T indicates uncertainty
• The consensus line in this region shows 2 areas that should be looked at more closely
• Each of these positions should be evaluated using all of the electropherogram data for this region
• If there is enough information, these can be edited manually
Data Quality and Project Editing
[Figure: consensus with the electropherograms for Sequence 1, Sequence 2 and Sequence 3]
• Click and drag to highlight the questionable base
• 3 T's called for this poorly formed peak
• 2 T's called for these clean sharp peaks
• The extra T called in sequence 1 has caused a gap to be inserted in sequences 2 and 3
• The disagreement between the added gaps and the extra T causes the lower case T to be displayed in the consensus
• The 3 electropherograms are open with the base calls below the trace information
• Since there are 2 high quality sequences and a single poor quality one, the data can be edited with a fair amount of certainty
Data Quality and Project Editing
[Figure annotations]
• Consensus updated
• Extra T removed
• Gap characters removed
• With the T highlighted in the consensus, press Delete
• In general, at least 2 good sequences are needed to override one bad one
• The more good sequences are present, the more certain the edit is
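The way a single extra base call produces a lower-case consensus character can be imitated with a short Python sketch. The rule used here (most common non-gap base per column, shown in lower case whenever the column is not unanimous) is an assumption made to mimic the display described above, not the documented behavior of any assembly program. The reads are hypothetical, must already be aligned to equal length, and `consensus` is an illustrative name.

```python
from collections import Counter

def consensus(aligned_reads, gap="-"):
    """Column-wise consensus; lower case marks columns where the reads disagree."""
    out = []
    for column in zip(*aligned_reads):
        bases = [b for b in column if b != gap]
        if not bases:
            continue  # a column of nothing but gaps contributes no base
        base = Counter(bases).most_common(1)[0][0]
        unanimous = len(set(column)) == 1
        out.append(base.upper() if unanimous else base.lower())
    return "".join(out)

# Sequence 1 carries an extra T, which forces a gap into sequences 2 and 3.
before = consensus(["GATTTCA", "GATT-CA", "GATT-CA"])
# After deleting the extra T, the gap characters are no longer needed.
after = consensus(["GATTCA", "GATTCA", "GATTCA"])
print(before, after)
```

Before the edit the questionable column shows up as a lower-case t; after the edit all three reads agree and the consensus is clean.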
Data Quality and Project Editing
[Figure annotations]
• N was called at this position, but looking at the electropherogram an A peak can be seen
• A was called in the other 2 runs
• With the N highlighted, type A
• Base is now called correctly
• Disagreement in consensus is resolved
• In this example a base call in one of the runs needs to be edited instead of the consensus
• Be careful and think when editing: a wrong edit can cause major problems (frame shifts, etc.)
• Never edit your only copy
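To see why a careless edit can cause a frame shift, consider how a coding sequence is read in triplets. The Python sketch below uses a made-up 12-base sequence and a hypothetical `codons` helper; it only splits the sequence into codons and does not translate them.

```python
def codons(sequence):
    """Split a sequence into consecutive triplets, starting at the first base."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]

original = "ATGGCTTTTGAA"                 # hypothetical coding sequence
edited = original[:6] + original[7:]      # one T deleted by mistake

print(codons(original))
print(codons(edited))
```

Every codon downstream of the deleted base changes, which is why edits should be made on a copy and checked against the electropherograms.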
Computer Exercise All instructions are in the manual. Follow the directions carefully. This exercise has been adapted from a larger one so some of the numbering may be out of order or certain steps omitted. If you have any questions please ask the instructor. Fill in your worksheet as you go.