
Mining hidden information from your 454 data using modular and database oriented methods




  1. Mining hidden information from your 454 data using modular and database oriented methods Joachim De Schrijver

  2. Overview
     • Short introduction on 454 sequencing
     • Variant Identification pipeline
     • Possibilities of a DB oriented pipeline
     • Examples
       • Coverage
       • Improving PCR
       • Fast Q assessment
       • Homopolymers

  3. Introduction (i)
     • Roche/454 GS-FLX sequencing:
       • Pyrosequencing
       • ± 400,000 reads/run
       • Average length: 200-250 bp
     • Applications:
       • Resequencing: Variant identification
       • De novo (genome) sequencing: Assembly of new regions, plasmids or entire genomes
     • Standard software:
       • Variants: Amplicon Variant Analyzer (AVA)
       • Assembly: Standard 454 assembler

  4. Introduction (ii)
     • Standard software
       • + Easy to use
       • + Reproducible results on similar datasets
       • + GUI (graphical user interface)
       • - No answer for ‘non-standard’ questions
         • Methylation experiments
         • Different types of experiments grouped together
         • …
       • - What about ‘hidden’ information?
         • Homopolymer error rates
         • Quality score ~ length of sequenced read
         • ‘Multirun’ information
         • …

  5. Variant Identification Pipeline (i)
     • Modular and database oriented pipeline
     • Modular:
       • Efficient planning
       • Scalable
     • Database (DB):
       • No loss of data
       • Grouping several runs together

  6. Variant Identification Pipeline (ii)
     • Basic idea: Data is processed and stored in the DB. Results (reports) are calculated ‘on the fly’ from the DB data.
       • Fast & efficient
       • Calculations only happen once
       • Everybody can access the database without risk of data modification
       • Reporting is independent of the data processing
     • Paper: De Schrijver et al. 2009. Analysing 454 sequences with a modular and database oriented Variant Identification Pipeline
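
The "process once, report on the fly" idea from slide 6 can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the actual VIP implementation: the `reads` table, its columns, and the toy rows are all hypothetical.

```python
import sqlite3

# Hypothetical toy data: (read_id, run, lane, mid, amplicon, length, mapped)
ROWS = [
    ("r1", "run01", 1, "MID1", "ampA", 240, 1),
    ("r2", "run01", 1, "MID2", "ampA", 95, 1),
    ("r3", "run01", 2, "MID1", "ampB", 230, 0),
    ("r4", "run02", 1, "MID1", "ampA", 250, 1),
]


def build_db(rows):
    """Processing step: store the processed reads once in an in-memory DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reads ("
        " read_id TEXT PRIMARY KEY, run TEXT, lane INTEGER,"
        " mid TEXT, amplicon TEXT, length INTEGER, mapped INTEGER)"
    )
    conn.executemany("INSERT INTO reads VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def reads_per_run(conn):
    """Report: number of stored reads per run (several runs share one DB)."""
    query = "SELECT run, COUNT(*) FROM reads GROUP BY run ORDER BY run"
    return conn.execute(query).fetchall()


def mapped_per_mid(conn):
    """Report: (MID, mapped reads, total reads), computed only when asked."""
    query = (
        "SELECT mid, SUM(mapped), COUNT(*) FROM reads "
        "GROUP BY mid ORDER BY mid"
    )
    return conn.execute(query).fetchall()


conn = build_db(ROWS)
print(reads_per_run(conn))
print(mapped_per_mid(conn))
```

Both reports only read from the stored table, so a new report can be added without re-running the processing step.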

  7. Possibilities of a DB oriented pipeline
     • VIP originally developed for variant identification
     • Now being used in:
       • Amplicon resequencing
       • De novo shotgun
       • Methylation
       • ~ Solexa experiments
     • ‘Hidden’ data can be extracted using intelligent querying strategies
     • Results per lane / multiplex MID / run…

  8. Example: Detailed coverage
     • Coverage can be calculated per
       • Lane
       • MID
       • Amplicon
       • Base position
     • Assessment of errors (PCR dropouts vs. human errors)
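
As a rough sketch of how coverage at the levels listed on slide 8 could be queried from a database, the example below uses Python's sqlite3 module. The `alignments` table, its columns, and the toy coordinates are assumptions made for illustration, and coverage is simplified to a read count.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alignments ("
    " read_id TEXT, lane INTEGER, mid TEXT, amplicon TEXT,"
    " start_pos INTEGER, end_pos INTEGER)"
)
# Hypothetical mapped reads with 1-based, inclusive amplicon coordinates.
conn.executemany(
    "INSERT INTO alignments VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("r1", 1, "MID1", "ampA", 1, 5),
        ("r2", 1, "MID1", "ampA", 3, 8),
        ("r3", 1, "MID2", "ampA", 1, 8),
        ("r4", 2, "MID1", "ampB", 1, 4),
    ],
)


def coverage_by(conn, column):
    """Number of mapped reads per lane, MID or amplicon."""
    if column not in {"lane", "mid", "amplicon"}:
        raise ValueError(f"unsupported grouping column: {column}")
    # The column name is checked against the whitelist above before use.
    query = (
        f"SELECT {column}, COUNT(*) FROM alignments "
        f"GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


def coverage_per_base(conn, amplicon):
    """Read depth at every covered base position of one amplicon."""
    depth = Counter()
    rows = conn.execute(
        "SELECT start_pos, end_pos FROM alignments WHERE amplicon = ?",
        (amplicon,),
    )
    for start, end in rows:
        for pos in range(start, end + 1):
            depth[pos] += 1
    return dict(sorted(depth.items()))


print(coverage_by(conn, "lane"))
print(coverage_by(conn, "mid"))
print(coverage_per_base(conn, "ampA"))
```

An amplicon with no reads in one lane or MID but normal depth elsewhere is the kind of pattern that could help separate a PCR dropout from a handling error, although the slide does not say how that distinction was made.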

  9. Example: Improving PCR
     • Amplicon resequencing experiment
     • Goal: Variant identification
     • Length distributions
       • Mapped
       • Unmapped
       • ‘Short’ mapped
     • Additional length separation + improved PCR
     • Result: Improved efficiency
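
The length distributions on slide 9 could be tabulated along these lines. This is a sketch only: the 100 bp cutoff for a ‘short’ mapped read, the 50 bp bin width, and the toy reads are hypothetical values, not figures from the presentation.

```python
from collections import Counter, defaultdict

SHORT_CUTOFF = 100  # hypothetical threshold (bp) for a 'short' mapped read
BIN_WIDTH = 50      # hypothetical histogram bin width (bp)


def classify(length, mapped):
    """Assign a read to one of the three classes named on the slide."""
    if not mapped:
        return "unmapped"
    return "short mapped" if length < SHORT_CUTOFF else "mapped"


def length_distributions(reads):
    """Histogram of read lengths per class, keyed by the bin's start (bp)."""
    hist = defaultdict(Counter)
    for length, mapped in reads:
        bin_start = (length // BIN_WIDTH) * BIN_WIDTH
        hist[classify(length, mapped)][bin_start] += 1
    return {cls: dict(sorted(bins.items())) for cls, bins in hist.items()}


# Hypothetical toy reads: (length in bp, mapped?)
READS = [(240, True), (230, True), (95, True), (60, True), (45, False), (210, False)]

for cls, bins in length_distributions(READS).items():
    print(cls, bins)
```

A large ‘short mapped’ or short unmapped peak would suggest short by-products that consume reads, which matches the slide's remedy of an additional length separation step.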

  10. Example: Homopolymers
     • Can the length of a homopolymer be assessed using the Q score?
     • Yes, when homopolymer length < 6 bp
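
The slide does not say how the relation between Q score and homopolymer length was measured. One simple way to start looking at it is to tabulate the mean quality score of the bases in each homopolymer run, grouped by run length. The function names and the toy read below are hypothetical.

```python
from collections import defaultdict
from itertools import groupby


def homopolymer_runs(seq):
    """Yield (base, start index, run length) for each run of identical bases."""
    pos = 0
    for base, group in groupby(seq):
        length = len(list(group))
        yield base, pos, length
        pos += length


def mean_q_by_run_length(reads, min_len=2):
    """Mean per-base quality score, grouped by homopolymer run length."""
    totals = defaultdict(lambda: [0, 0])  # run length -> [sum of Q, base count]
    for seq, quals in reads:
        for _base, start, length in homopolymer_runs(seq):
            if length >= min_len:
                run_quals = quals[start:start + length]
                totals[length][0] += sum(run_quals)
                totals[length][1] += len(run_quals)
    return {n: q_sum / count for n, (q_sum, count) in sorted(totals.items())}


# Hypothetical toy read: a sequence and its per-base quality scores.
READS = [("AAACGG", [30, 28, 20, 35, 33, 25])]

print(list(homopolymer_runs("AAACGG")))
print(mean_q_by_run_length(READS))
```

With real data, the same table could be built over all reads in the database to check whether the quality score still distinguishes run lengths at and above 6 bp.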

  11. Example: Q assessment
     • Fast assessment of the quality of a run
     • [Figure: two panels, ‘Lab work OK’ and ‘Errors in lab work’]

  12. Acknowledgements
     • Biobix – UGent: Wim Van Criekinge, Tim De Meyer, Geert Trooskens, Tom Vandekerkhove, Leander Van Neste, Gerben Mensschaert
     • CMG – UZ Gent: Jo Vandesompele, Jan Hellemans, Filip Pattyn, Steve Lefever, Kim Deleeneer, Jean-Pierre Renard
     • NXT-GNT: Paul Coucke, Sofie Bekaert, Filip Van Nieuwerburgh, Dieter Deforce, Wim Van Criekinge, Jo Vandesompele

  13. Questions? Joachim.deschrijver@ugent.be
