Tomato Finishing Workshop. April 2008 Specific Problems and Examples. Other Strategies for Finishing Large Repeats. High Repeat Content Clones and Alternative Libraries Large Insert Libraries and TILs Reading coverage histogram in gap4 and diploid plot BLAST & Pre-analysis Scaffolds
Tomato Finishing Workshop April 2008 Specific Problems and Examples
Other Strategies for Finishing Large Repeats • High Repeat Content Clones and Alternative Libraries • Large Insert Libraries and TILs • Reading coverage histogram in gap4 and diploid plot • BLAST & Pre-analysis • Scaffolds • Size using restriction digests and dotter
Gaps and Assembly Problems Caused by Repeats • Varying complexity of repeats depending on: repeat unit, size of unit, copy number, direct or inverted copies, how conserved • Lower complexity, e.g. di-nucleotide runs • Higher complexity, e.g. LTRs • Importance of visualising repeat sequence to assess repeat type • Use restriction digests that cut infrequently in the repeat • Alter phrap parameters for more stringent assembly • Alternative library sizes if necessary
Orchid Functionality • View Template List: select templates from different plates • View Read Pairs: show only good read pairs between or within contigs • Show only bad read pairs: select between too large, too small, wrong orientation • Select bad read pairs across a given reading > save to file
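The good/bad read-pair categories listed above can be illustrated with a minimal sketch. This is not Orchid's implementation: the `ReadPair` record, the `classify_pair` function, the use of outer read ends as positions, and the default 4-6 Kb insert range are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical record for the two reads of one template."""
    template: str
    fwd_contig: str
    fwd_pos: int       # outer end of the forward read in its contig
    fwd_strand: str    # '+' or '-'
    rev_contig: str
    rev_pos: int       # outer end of the reverse read in its contig
    rev_strand: str


def classify_pair(pair, min_insert=4000, max_insert=6000):
    """Label a pair 'good', 'too_small', 'too_large', 'wrong_orientation'
    or 'spanning' (reads placed in different contigs).

    The insert range is an assumed library size, not a fixed rule.
    """
    if pair.fwd_contig != pair.rev_contig:
        return "spanning"
    # Within one contig the reads should face each other:
    # leftmost read on '+', rightmost read on '-'.
    left, right = sorted([(pair.fwd_pos, pair.fwd_strand),
                          (pair.rev_pos, pair.rev_strand)])
    if (left[1], right[1]) != ("+", "-"):
        return "wrong_orientation"
    separation = right[0] - left[0]
    if separation < min_insert:
        return "too_small"
    if separation > max_insert:
        return "too_large"
    return "good"
```

Pairs labelled `too_small`, `too_large` or `wrong_orientation` are the ones worth saving to a file for inspection, since they often point at a collapsed or misassembled repeat.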
Unspanned gap example Original 4-6Kb library
Unspanned gap example 6-9Kb Library
Restriction Fragment for SIL Candidate Corresponds to position of unspanned gap
Sequencing Chemistries and Additives used in Finishing • 4:1 mix ratio of AB Big Dye Terminator : AB dGTP Terminator • used for general finishing reactions, not problem specific • AB dGTP Terminator • used for di-nucleotide runs and inverted repeats • Additive A (SequenceRx Enhancer Solution A - Invitrogen) • Dimethyl sulfoxide (DMSO) • Additive A+DMSO+dGTP • used for mono-nucleotide runs, inverted repeats • Sequence Finishing Kit (SFK) (TempliPhi - Amersham) • used to increase DNA yield • useful for structural problems caused by inverted repeats • GC regions
TA Repeats Standard BDT chemistry continues to call TA even after the di-nucleotide run has ended, as is shown in the cut-off data: Figure.1
dGTP BDT 4:1 dGTP
In addition to di-nucleotide runs, other simple repeats can also be difficult to sequence through (Figure 6: dGTP) • In simple repeats the problem is often directional, for example: • GGAGGA sequences better in the forward direction • CCTCCT sequences better in the reverse direction
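The two example motifs are the same repeat seen from opposite strands, which a short sketch can make explicit. The "read the G-richer strand" rule below is a heuristic generalised from the two examples on this slide, not a stated rule, and both function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def preferred_direction(repeat_unit):
    """Suggest a read direction for a simple repeat.

    Heuristic (an assumption): read the strand on which the repeat
    unit carries more G than C.
    """
    unit = repeat_unit.upper()
    g, c = unit.count("G"), unit.count("C")
    if g > c:
        return "forward"
    if c > g:
        return "reverse"
    return "either"


# A GGAGGA array reads as a CCTCCT array on the opposite strand.
top = "GGAGGA" * 2
bottom = reverse_complement(top)
```

Here `bottom` is `TCCTCCTCCTCC`, which contains `CCTCCT`, so a reverse read across a CCTCCT array is reading GGAGGA on the other strand.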
Mononucleotide runs • The problem frequently occurs with C or G runs and is often a directional issue (Fig. 2)
Panels: 4:1 dGTP; dGTP + additive A + DMSO • To resolve G/C mono runs we use dGTP + A + DMSO as the standard minimum attempt • Directional problem: the G strand sequences more easily than the C strand • Reactions are ordered accordingly
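Ordering reactions by strand could be sketched as a scan of the consensus for G or C mono runs. The function name and the minimum run length of 8 are illustrative assumptions. The direction rule follows the slide: read the strand that presents the run as G.

```python
import re


def find_gc_mono_runs(seq, min_len=8):
    """Return (start, end, base, suggested_direction) for each G or C run.

    min_len is an assumed threshold. A G run on the given strand is read
    forward; a C run is read from the reverse direction, where it appears
    as a G run.
    """
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_len, min_len))
    hits = []
    for match in pattern.finditer(seq.upper()):
        base = match.group()[0]
        direction = "forward" if base == "G" else "reverse"
        hits.append((match.start(), match.end(), base, direction))
    return hits
```

Coordinates are 0-based and half-open, so each hit can be sliced straight out of the consensus when placing a walking oligo.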
Inverted Repeat Example Characteristic drop in sequence visible in one direction Opposite strand shows no drop in quality but digests suggest data missing
SFK example – small inverted repeat Dotter is used to check for any repeats at problem region Problem visible in more than one digest Break contig and check for read pair coverage
Inverted Repeat Example Resolved with different chemistries and SFK Original subclone Original subclone + A+DMSO+dGTP SFK with 4:1
Inverted Repeat Example Resolved with different chemistries and SFK Original subclone SFK SFK + dGTP + oligo
SFK example – small inverted repeat Using SFK then sequencing with oligo walks from both directions resolved the repeat Restriction Digests now confirm the assembly
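Confirming an assembly against restriction digests amounts to comparing predicted fragment sizes with the observed ones. The sketch below is a minimal in silico digest, not the workshop's tooling. EcoRI (G^AATTC) is used purely as an illustrative enzyme, and the function names and the 5% size tolerance are assumptions.

```python
def digest_fragment_sizes(seq, site="GAATTC", cut_offset=1, circular=False):
    """Predict fragment sizes (largest first) for one enzyme.

    cut_offset is the cut position within the site (1 for G^AATTC).
    Simplification: in circular mode a site spanning the sequence
    origin is not detected.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    if not cuts:
        return [len(seq)]
    if circular:
        sizes = [b - a for a, b in zip(cuts, cuts[1:])]
        # The last fragment wraps around the origin.
        sizes.append(len(seq) - cuts[-1] + cuts[0])
    else:
        bounds = [0] + cuts + [len(seq)]
        sizes = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(sizes, reverse=True)


def sizes_agree(predicted, observed, tolerance=0.05):
    """True if both lists have the same number of bands and each pair of
    sizes matches within a fractional tolerance (assumed value)."""
    if len(predicted) != len(observed):
        return False
    return all(abs(p - o) <= tolerance * o
               for p, o in zip(sorted(predicted), sorted(observed)))
```

A missing or extra predicted band, or a band of the wrong size at the repeat, suggests the consensus there is still collapsed or expanded.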
Transposon Insertion Library (TIL) Double Stranded Sequencing Vector (pUC) Normal sequencing from either end of insert Read pairs ~4-6Kb apart Inserted sequence (BAC) Transposon randomly inserts across entire plasmid Sequence outwards from transposon insertion site
TIL Read pairs overlap by 9bp duplication site Transposon Insertion Library (TIL) Double Stranded Sequencing Vector (pUC) Sequence outwards from transposon insertion site Inserted sequence (BAC) Transposon randomly inserts across entire plasmid
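Because the two reads from one insertion site share the 9 bp duplication, they can be joined into a single stretch of sequence. The sketch below assumes that transposon and primer sequence has already been trimmed, that both reads are in the same (clone) orientation, and that the shared bases match exactly. The function name is hypothetical.

```python
def join_til_reads(left_read, right_read, dup_len=9):
    """Join two reads that run outwards from one transposon insertion.

    left_read  : read extending leftwards, given in clone orientation,
                 so it ends with the duplicated bases.
    right_read : read extending rightwards, so it starts with them.
    Returns the merged sequence, or None if the duplication does not
    match exactly (real reads may need a tolerant comparison).
    """
    left, right = left_read.upper(), right_read.upper()
    if len(left) < dup_len or len(right) < dup_len:
        return None
    if left[-dup_len:] != right[:dup_len]:
        return None
    # Keep the duplicated bases once.
    return left + right[dup_len:]
```

A `None` result flags a pair whose ends do not share the expected duplication, for example reads from two different insertion events.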