Comprehensive Guide to Structural Variation Detection Using NGS Technology

Learn about detecting structural variations via NGS technology, software methods, and exercises for practical implementation.

Presentation Transcript


  1. Structural Variation Detection Using NGS technology Ke Lin 23rd Feb, 2012

  2. Content •  Introduction •  Methods and software used for SV detection  •  Exercises 

  3. What is Structural Variation? • variation in the structure of chromosomes within one species • FISH (fluorescence in situ hybridization) can be used to detect and localize the presence or absence of specific DNA sequences Introduction

  4. What is Structural Variation? • affects a region of DNA and includes inversions, balanced translocations, and genomic imbalances (copy number variants, CNVs) • approximately 1 kb or greater in size • many SVs are associated with genetic diseases Introduction

  5. What can NGS do to detect SV? • assumption: a reference genome of the species is available • re-sequencing of other individuals of the species at shallow genome coverage (< 30X) • paired-end sequencing Introduction

  10. Method 1: local (de novo) assembly, then alignment of the assembled sequences to the reference genome • accurate but costly • the genomes of individuals within one species should be quite similar at the sequence level Methods used for SV detection

  11. Method 2: map reads to the reference genome and deduce SVs by comparing the mapped distance of each read pair with the expected insert size • less accurate but much cheaper • many methods have been developed • downstream analysis can help to increase the accuracy Methods used for SV detection
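
The insert-size idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular tool: the function names, the 3-standard-deviation cutoff, and the assumption that each pair is already summarized as a single mapped span are all hypothetical choices made for the example.

```python
from statistics import mean, stdev


def estimate_library(spans):
    """Estimate the expected insert size (mean, standard deviation) from mapped spans."""
    return mean(spans), stdev(spans)


def classify_pair(observed_span, expected_mean, expected_sd, n_sd=3.0):
    """Classify one read pair by how far its mapped span deviates from the library.

    n_sd is an assumed cutoff; real tools choose and tune their own thresholds.
    """
    upper = expected_mean + n_sd * expected_sd
    lower = expected_mean - n_sd * expected_sd
    if observed_span > upper:
        # Mates map farther apart on the reference than the fragment is long:
        # the sample is missing sequence that the reference has.
        return "deletion"
    if observed_span < lower:
        # Mates map closer together than expected:
        # the sample carries extra sequence between them.
        return "insertion"
    return "concordant"


if __name__ == "__main__":
    library_mean, library_sd = estimate_library([290, 300, 310])
    for span in (305, 900, 150):
        print(span, classify_pair(span, library_mean, library_sd))
```

A single discordant pair is weak evidence; in practice several pairs supporting the same event would be required before reporting it.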

  13. Signatures used for SV discovery • PEM (Paired End Mapping) • both reads of a pair have to map to the reference • reads need to align without gaps Methods used for SV detection

  15. Signatures used for SV discovery • DOC (Depth Of Coverage) • does not reveal where the additional copies are located • cannot detect insertions of novel sequence Methods used for SV detection
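
A depth-of-coverage signature can be illustrated with fixed windows and a simple ratio against the median depth. This is only a sketch under assumed thresholds (1.5 for gain, 0.5 for loss) and hypothetical function names; it is a plain ratio test, not the method that cnD implements.

```python
from statistics import median


def window_depths(per_base_depth, window_size):
    """Mean depth per non-overlapping window (a trailing partial window is dropped)."""
    n_windows = len(per_base_depth) // window_size
    return [
        sum(per_base_depth[i * window_size:(i + 1) * window_size]) / window_size
        for i in range(n_windows)
    ]


def call_windows(depths, gain_ratio=1.5, loss_ratio=0.5):
    """Label each window by its depth relative to the median window depth."""
    baseline = median(depths)
    calls = []
    for depth in depths:
        ratio = depth / baseline if baseline > 0 else 0.0
        if ratio >= gain_ratio:
            calls.append("Gain")
        elif ratio <= loss_ratio:
            calls.append("Loss")
        else:
            calls.append("Normal")
    return calls


if __name__ == "__main__":
    # Made-up per-base depths: a low-coverage stretch and a high-coverage stretch.
    depth = [20] * 40 + [2] * 10 + [20] * 20 + [45] * 10 + [20] * 20
    print(call_windows(window_depths(depth, window_size=10)))
```

The sketch also shows the limitation named on the slide: a "Gain" window says that extra copies exist, but nothing about where in the genome they sit.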

  17. Signatures used for SV discovery • Split reads • the size of the gaps that can be introduced is limited (only a few base pairs are allowed) • novel sequence insertions, reconstructed by local assembly of hanging reads, will be incomplete if the inserted sequence is substantially larger than the insert size Methods used for SV detection
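
The split-read signature can be sketched as follows: a read that spans a deletion aligns as two pieces, a prefix and a suffix, separated on the reference by the deleted segment. The example below is a toy version that assumes exact matches and searches the whole reference with plain string search; real tools typically restrict the search to a region near the mapped mate and tolerate mismatches. The function name and the `min_anchor` parameter are hypothetical.

```python
def find_deletion_by_split(read, reference, min_anchor=5):
    """Place a read as prefix + suffix around a deleted reference segment.

    Returns (start, end) of the deleted reference interval [start, end),
    or None if no split placement is found. Exact matching only.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        left = reference.find(prefix)
        if left == -1:
            continue
        prefix_end = left + len(prefix)
        # The suffix must start strictly after the prefix ends,
        # otherwise the read is simply contiguous with the reference.
        right = reference.find(suffix, prefix_end + 1)
        if right != -1:
            return prefix_end, right
    return None


if __name__ == "__main__":
    # Made-up reference: left flank, deleted segment, right flank.
    reference = "ACGTTGCAAG" + "GCTTAACCGC" + "TTTACGGATCCA"
    read = "GTTGCAAG" + "TTTACGGA"  # read from a sample lacking the middle segment
    print(find_deletion_by_split(read, reference))
```

When the bases on both sides of a breakpoint are identical, several split positions are equally valid, and this sketch simply reports the first one it finds.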

  18. PEM • BreakDancer
      Input: BWA mapping output in BAM format
      Command: bam2cfg.pl -g -h bamfile1 bamfile2 .. > configure_file
      Output: configuration file for the next step
      Software for each method used for SV detection

  21. PEM • BreakDancer
      Input: configuration file
      Command: breakdancer_max -h -g int.bed -o chromosome cfg_file > output
      Output: tab-delimited file
      Software for each method used for SV detection

  22. PEM • BreakDancer output columns:
      1. Chromosome 1
      2. Position 1
      3. Orientation 1
      4. Chromosome 2
      5. Position 2
      6. Orientation 2
      7. Type of SV
      8. Size of SV
      9. Confidence score
      10. Total number of supporting read pairs
      11. Total number of supporting read pairs from each bam/library
      12. Estimated allele frequency (if -h)
      13 - end. Copy number for each bam/library
      Software for each method used for SV detection
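
The column layout above can be read into records with a short parser. This is a sketch under assumptions that should be checked against the actual output file: that header lines start with `#`, that deletions are labelled `DEL` in the type column, and that only the first ten columns are needed. The `SVCall` type, the function name, and the sample rows are made up for illustration.

```python
from typing import NamedTuple


class SVCall(NamedTuple):
    chr1: str
    pos1: int
    orient1: str
    chr2: str
    pos2: int
    orient2: str
    sv_type: str
    size: int
    score: float
    num_reads: int


def parse_breakdancer(lines):
    """Parse the first ten tab-delimited columns of each non-header line."""
    calls = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and (assumed) '#' header lines
        f = line.rstrip("\n").split("\t")
        calls.append(SVCall(f[0], int(f[1]), f[2], f[3], int(f[4]), f[5],
                            f[6], int(f[7]), float(f[8]), int(f[9])))
    return calls


if __name__ == "__main__":
    # Made-up rows that follow the column order listed on the slide.
    sample = [
        "#Chr1\tPos1\tOrientation1\tChr2\tPos2\tOrientation2\tType\tSize\tScore\tnum_Reads",
        "1\t10000\t10+0-\t1\t12000\t0+10-\tDEL\t1900\t99\t10",
        "2\t500\t5+5-\t2\t900\t5+5-\tINV\t400\t60\t5",
    ]
    calls = parse_breakdancer(sample)
    # Keep deletions whose two breakpoints both lie on chromosome 1.
    print([c for c in calls if c.sv_type == "DEL" and c.chr1 == "1" and c.chr2 == "1"])
```

Filtering on the type and chromosome columns in this way is one possible route to a per-chromosome list of deletions.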

  23. DOC • cnD
      Input: BWA mapping output in BAM format
      Command: samtools pileup -c bamfile | pileup2win.pl > output_file
      Output: windows file for the next step
      Software for each method used for SV detection

  24. DOC • cnD
      Input: windows file
      Commands:
        cnD.x86-64 --prefix=lib_name --nohet windows_file1
        cat lib*_viterbi.txt > viterbi.txt
        metaCaller.pl --threshold=value viterbi.txt > metacalls.txt
        extractCNChanges.pl metacalls.txt > output
      Output: tab-delimited file with the columns chr, start pos, end pos, Gain/Loss
      Software for each method used for SV detection
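
The four-column result (chr, start pos, end pos, Gain/Loss) is simple to filter. The sketch below assumes exactly four tab-separated fields per line and the literal labels `Gain` and `Loss` in the last column; both are assumptions taken from the slide's description rather than from the cnD documentation, and the function name and sample rows are hypothetical.

```python
def losses_on_chromosome(lines, chromosome):
    """Return (start, end) for every 'Loss' region on the given chromosome."""
    regions = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        chrom, start, end, state = fields
        if chrom == chromosome and state == "Loss":
            regions.append((int(start), int(end)))
    return regions


if __name__ == "__main__":
    # Made-up rows in the column order given on the slide.
    sample = [
        "1\t10000\t12000\tLoss",
        "1\t50000\t58000\tGain",
        "2\t3000\t4500\tLoss",
    ]
    print(losses_on_chromosome(sample, "1"))
```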

  25. Split reads • Pindel
      Input: configuration file
      Command: pindel_x86_64 -f ref.fasta -i cfg_file -c ALL -o name
      Output: files with indicative names: D = deletion, SI = short insertion, INV = inversion, TD = tandem duplication, LI = large insertion, BP = unassigned
      Software for each method used for SV detection

  26. Downstream analysis after SV detection • local assembly of SV regions • annotation of novel insertions • fine-tuning of potentially changed gene models

  28. Exercises: Find all deletions in chromosome 1 using BreakDancer. Then try to do the same using cnD (gene loss) and Pindel, respectively. The input files can be found in: /mnt/geninf15/work/bif_course_2012/SV/exercises/ The documentation of each program can be found in: /mnt/geninf15/work/bif_course_2012/SV/DOC/
