
DNA Sequencing: Present Status and Future Challenges

Elaine Mardis, Washington University Genome Sequencing Center


Presentation Transcript


  1. DNA Sequencing: Present Status and Future Challenges
     Elaine Mardis, Washington University Genome Sequencing Center

  2. Genome Sequence: Present Workflow
     [Flowchart; boxes as labeled on the slide]
     • Genomic DNA
     • BAC/fosmid library
     • Plasmid library (3 kb)
     • Production sequencing pipeline
     • Dual end sequencing
     • Restriction digest fingerprinting
     • Physical map generation
     • WGS assembly using the ARACHNE algorithm to generate contigs and supercontigs
     • Concordance
     • Finishing

  3. BAC Fingerprinting: Gel-based Fragment Separation
     • HindIII restriction digestion
     • 96 samples, 25 marker lanes (marker every fifth lane)
     • 1% agarose; 8 hours, 140 volts @ 14°C
     • Size labels on the gel image: 29,950 bp and 560 bp
     Marra et al., Genome Res., 7, 1072-1084 (1997)

  4. Contig assembly: physical map
     [Figure: clones labeled A to G, with asterisk marks]
     • Software (Image or Bandleader) is used to identify overlapping clones with common restriction fragments and to assemble them into a contig (FPC)
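
The overlap test described on this slide can be sketched as follows. This is a minimal illustration, not the algorithm used by Image, Bandleader or FPC: it simply counts restriction fragments whose sizes agree within a tolerance. The function names, the 7 bp tolerance, the three-fragment threshold and the example fragment sizes are all assumptions made for the example; production tools use a statistical score rather than a raw count.

```python
from bisect import bisect_left


def shared_fragments(sizes_a, sizes_b, tolerance_bp=7):
    """Count fragments of clone A that pair with a distinct fragment of clone B.

    Two fragments are treated as the same band when their sizes differ by at
    most tolerance_bp (a hypothetical value chosen for illustration).
    """
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        # Smallest unused B fragment that is not below the tolerance window.
        i = bisect_left(remaining, size - tolerance_bp)
        if i < len(remaining) and remaining[i] <= size + tolerance_bp:
            remaining.pop(i)  # each B fragment may be matched only once
            shared += 1
    return shared


def overlapping_pairs(fingerprints, min_shared=3, tolerance_bp=7):
    """Return (clone_a, clone_b, shared_count) for pairs that look overlapping."""
    names = sorted(fingerprints)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            n = shared_fragments(fingerprints[a], fingerprints[b], tolerance_bp)
            if n >= min_shared:
                pairs.append((a, b, n))
    return pairs


# Invented fragment sizes (bp) for three clones.
fingerprints = {
    "A": [560, 1200, 3400, 8100],
    "B": [1203, 3398, 8105, 15000],
    "C": [15004, 22000, 29950],
}
print(overlapping_pairs(fingerprints))
```

With these invented sizes, clones A and B share three fragments and are reported as overlapping, while B and C share only one and are not.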

  5. Sequence data assembly: Supercontig creation and gap filling
     (A) A supercontig is constructed by successively linking pairs of contigs that share at least two forward-reverse links. Here, three contigs are joined into one supercontig.
     (B) ARACHNE attempts to fill gaps by using paths of contigs. The first gap in the supercontig shown here is filled with one contig, and the second gap is filled by a path consisting of two contigs.
     Genome Research 12: 177-189 (2002)
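
The linking rule in (A) can be illustrated with a small sketch. It is not ARACHNE's implementation: it ignores contig orientation, ordering, insert-size constraints and the order in which ARACHNE prioritizes joins, and only groups contigs that are connected by at least two forward-reverse links. The function name and the input format (one contig pair per read-pair link) are assumptions made for the example.

```python
from collections import Counter


def build_supercontigs(contigs, links, min_links=2):
    """Group contigs joined by at least min_links forward-reverse links.

    contigs: iterable of contig names.
    links:   iterable of (contig_x, contig_y) tuples, one per read pair whose
             two ends landed in different contigs (hypothetical input format).
    """
    parent = {c: c for c in contigs}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Count links per unordered contig pair, skipping pairs within one contig.
    link_counts = Counter(frozenset(pair) for pair in links if pair[0] != pair[1])

    for pair, count in link_counts.items():
        if count >= min_links:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = {}
    for c in contigs:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(group) for group in groups.values())


contigs = ["c1", "c2", "c3", "c4"]
links = [
    ("c1", "c2"), ("c2", "c1"),  # two links: c1 and c2 are joined
    ("c2", "c3"), ("c2", "c3"),  # two links: c3 joins the same supercontig
    ("c3", "c4"),                # a single link is not enough
]
print(build_supercontigs(contigs, links))
```

As on the slide, three contigs end up in one supercontig; the fourth stays separate because only one link supports it.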

  6. Whole genome map assembly
     [Figure label: Genome map]
     Edit contigs and align to map. Gaps between clones can be filled with other clones, such as fosmids, or by generating PCR products from BAC clones or genomic DNA.

  7. Current GSC Production Workflow
     Pipeline steps and instruments:
     • picking: Qpix
     • prepping: PlateTrak & DNATraks
     • sequencing: Biomek FX
     • detection: PE 3700/3730
     • data transfer
     Notes:
     • Each process is documented by barcode entry into our Oracle database
     • QC checks are used to assay quality at each step in the pipeline

  8. Qpix picking robot

  9. PlateTrak 1 & 2 Robots

  10. Biomek FX robot

  11. ABI 3700 Sequencer
     • Enhanced sensitivity relative to gel-based systems
     • Capillary-based separation of samples eliminates gel pouring, gel loading, lane tracking
     • Requires large volumes of buffer and polymer per run
     • Moving parts (robot, sheath flow) increase required maintenance and impact downtime
     • Sheath flow detection limits sensitivity; the laser illumination scheme causes beam dispersion across the sheath flow

  12. New generation instrument: ABI 3730xl DNA Analyzer
     • In-capillary detection by fixed laser eliminates L>R “fade” and sheath flow, improves sensitivity
     • Direct load from reaction plate eliminates robotic volume transfer, decreases minimal load volume
     • Increased plate capacity, decreased buffer/polymer demand and automated plate handling decrease operator intervention

  13. Improved results with lower template input

  14. Issue: Large clone end sequencing
     Due to lower sensitivity, end-sequencing of BAC and fosmid clones was not robust on the 3700. To achieve reliable results, we have utilized the ABI 3100s in a specialty group approach:
     • requires 1/4x BDT reactions
     • requires ~100 cycles in the thermal cycler
     • lower throughput capability
     However, the increasing emphasis on large clone linkage for the WGS approach requires higher throughput and lower cost for these templates.

  15. High-throughput sequencing (c. 2002)
     • GSC produces 2.6 M reads monthly
     • Plasmid template preps by robotic SPRI
     • Sequencing reactions in 384 well/Biomek FX
     • Loading 120 ABI 3700s
     • Combining WGS plasmid, fosmid and BAC end reads with a physical map reference is becoming the strategy of choice for de novo genome sequencing
     • Our recent introduction of 30 x 3730 instruments will increase read capacity to 3.2 M reads monthly, and allow us to end sequence large clone types such as fosmids and BACs more efficiently and cheaply.
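
A back-of-the-envelope reading of the throughput figures on this slide is sketched below. Only the totals (2.6 M and 3.2 M reads monthly, 120 ABI 3700s, 30 ABI 3730s) come from the slide. The per-instrument numbers are derived under two assumptions that the slide does not state: that all 2.6 M reads come from the 120 ABI 3700s, and that the 3730s add capacity on top of the existing 3700s rather than replacing any of them.

```python
def reads_per_instrument(monthly_reads, instruments):
    """Average monthly reads per instrument."""
    return monthly_reads / instruments


# Figures quoted on the slide.
current_monthly_reads = 2_600_000
projected_monthly_reads = 3_200_000
n_3700 = 120
n_3730 = 30

# Assumption: all current reads are produced on the 3700s.
per_3700 = reads_per_instrument(current_monthly_reads, n_3700)

# Assumption: the 3730s are purely additive to the existing fleet.
added_reads = projected_monthly_reads - current_monthly_reads
per_3730 = reads_per_instrument(added_reads, n_3730)

percent_increase = 100 * added_reads / current_monthly_reads

print(f"Reads per 3700 per month: {per_3700:,.0f}")
print(f"Added reads per 3730 per month: {per_3730:,.0f}")
print(f"Projected capacity increase: {percent_increase:.1f}%")
```

If the 3730s instead replace some 3700s, the per-instrument yield of a 3730 would be higher than this estimate, so the result should be read only as a lower bound under the stated assumptions.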

  16. What are the future challenges to high-throughput genome sequencing?
     1. Most cost decreases have been incremental, rather than monumental. Large cost decreases will require a revolutionary approach to detection, perhaps not based on light.
     2. There is a fundamental disconnect between the sample size produced by current prepping and sequencing processes, and the sensitivity of current instrumentation for detection/analysis.
     3. There is a need for additional fluor combinations to enable reaction multiplexing.

  17. What are the current trends in DNA sequencing?
     Re-sequencing of the human genome is becoming a key approach toward understanding certain diseases:
     • Characterizing the genetic differences between affected vs. unaffected individuals
     • Characterizing the genetic differences between diseased vs. normal cells
     • Developing diagnostic/prognostic assays for disease

  18. What are the technical challenges of re-sequencing human samples?
     • Limited quantities of samples
     • Large sample numbers with multiple analyses
     • Critical need to avoid sample mix-ups/QA
     • Ultimately: instrumentation and methods that reduce cost per reaction to well below current costs and require little/no hands-on sample manipulation
     • Informatics tools to assemble and analyze data intelligently and correctly (!)
     • Database tools/features to combine different data types in a meaningful way that aids interpretation

  19. General approach
     • Annotated human sequence from Ensembl
     • Design exon- and/or intron-specific PCR primers
     • PCR amplification
     • DNA sequencing: lowered emphasis on read length, increased emphasis on speed of fragment separation and analysis

  20. Re-sequencing: Data pipeline
     • Sequence: sequence each end
     • Phred: base-calling; quality determination
     • Phrap: sequence alignment; final quality determination of the PCR fragment
     • PolyPhred: mutation/polymorphism detection
     • Consed: sequence viewing; mutation/polymorphism tagging
     • Analysis
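
The role that base-call quality plays in this pipeline can be sketched as follows. The relation between a Phred quality score Q and the estimated base-call error probability P is Q = -10 log10(P). Everything else here is an assumption made for illustration: the function names, the quality threshold of 30, and the example sequences. The sketch compares an already aligned, gap-free sample read against a reference and is not the method PolyPhred uses.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the estimated base-call error probability."""
    return 10 ** (-q / 10)


def candidate_differences(reference, sample, qualities, min_quality=30):
    """List positions where a confidently called sample base differs from the reference.

    Assumes the sample read is already aligned to the reference without gaps,
    so all three inputs have the same length (a simplifying assumption).
    Returns (position, reference_base, sample_base, quality) tuples.
    """
    if not (len(reference) == len(sample) == len(qualities)):
        raise ValueError("reference, sample and qualities must have equal length")
    hits = []
    for pos, (ref, base, q) in enumerate(zip(reference, sample, qualities)):
        # Low-quality mismatches are more likely base-calling errors than variants.
        if ref != base and q >= min_quality:
            hits.append((pos, ref, base, q))
    return hits


# Invented example: two mismatches, only one of them with a confident base call.
reference = "ACGTACGT"
sample = "ACGAACTT"
qualities = [40, 40, 40, 35, 40, 40, 12, 40]

print(phred_to_error_probability(30))
print(candidate_differences(reference, sample, qualities))
```

Here the mismatch at position 3 (quality 35) is kept as a candidate, while the mismatch at position 6 (quality 12) is discarded as a likely base-calling error.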

  21. Laboratory Workflow
     • Web interface to database
     • ORACLE database: mutation data, laboratory tracking data, gene feature data
     • Interactive visual tools
     • Data quality checking
     (Courtesy of D. Nickerson)

  22. Challenges for Re-Sequencing Data Analysis
     1. Need improved signal processing software for traces
        - better background subtraction to eliminate false positives in detecting sequence differences
     2. Need improved software for detecting differences between aligned sequences
        - less manual review of traces and alignments
        - more analytical view of results/output
     3. Statistical packages that help make sense of re-sequencing data in the context of genetics, probability, mutation rates, prognosis/outcome, etc.

  23. Trace data examination

  24. Data Organization and Visualization
     The Vg software tool is used to cluster and visualize data from re-sequencing of the same genomic regions of multiple individuals.

  25. Acknowledgements
     GSC:
     • Matt Hickenbotham
     • Jim Eldred
     • Darren O’Brien
     • Tom Erb
     • Joe Strong
     • Lisa Cook
     • Donald Williams
     • Nathan Sander
     • Josh Conyers
     • Todd Carter
     • Lliam Christy
     • Rick Wilson
     • John McPherson
     • Bob Waterston
     • Pat Minx
     University of Washington:
     • Debbie Nickerson
