Sequencing Technologies and Human Genetic Variation

Sequencing Technologies and Human Genetic Variation. By Alfonso Farrugio, Hieu Nguyen, and Antony Vydrin. Overview: Introduction; Simulating genomic variation and sequencing; Analyzing and comparing different sequencing technologies; Algorithms for detecting human genetic variation.


Presentation Transcript


  1. Sequencing Technologies and Human Genetic Variation By Alfonso Farrugio, Hieu Nguyen, and Antony Vydrin

  2. Overview: Introduction; Simulating genomic variation and sequencing; Analyzing and comparing different sequencing technologies; Algorithms for detecting human genetic variation

  3. Introduction. Different people carry different mutations in their genomes. A recent study (Nature 453, 56-64, 1 May 2008) compared 8 human genomes and found 1,695 structural variants.

  4. Whole-genome shotgun sequencing allows fast and relatively cheap sequencing of human genomes. New technologies are being developed to detect human genomic variation accurately, and most of them use short paired reads. How long should the reads be to best detect human genomic variation? What algorithms can be used to detect variations in a new individual's genome?

  5. Simulating Genomic Variation. A program takes a human genome and adds randomly distributed inversions, insertions, deletions, and SNPs. The user controls the number of mutations of each type and their mean lengths. To simplify, no two mutations can overlap each other (SNPs are the exception).

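  6. [Diagram: inversions, insertions, and deletions are placed on the original genome, giving an “intermediate” mutated genome]

  7. [Diagram: the deletions are subtracted from the “intermediate” mutated genome, giving a second “intermediate” mutated genome]

  8. [Diagram: SNPs are added to the “intermediate” mutated genome, giving the output mutated genome]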
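The mutation simulator on slides 5 to 8 can be sketched roughly as follows. This is a minimal illustration, not the authors' program: the function names, the exponential length distribution, the random inserted sequence, and the single-pass application of events (rather than separate "mark" and "subtract deletions" stages) are all assumptions made for the example.

```python
import random

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """An inversion reads as the reverse complement on the forward strand."""
    return "".join(COMPLEMENT.get(b, "N") for b in reversed(seq))


def overlaps(a, b):
    """True if intervals a and b (start, end) collide; equal starts also collide."""
    return (a[0] < b[1] and b[0] < a[1]) or a[0] == b[0]


def place_events(genome_len, counts, mean_lens, rng, max_tries=100_000):
    """Choose non-overlapping (start, end, kind, length) events on the original genome."""
    events = []
    for kind, count in counts.items():
        placed, tries = 0, 0
        while placed < count:
            tries += 1
            if tries > max_tries:
                raise RuntimeError("genome too small for the requested mutations")
            # Assumption: lengths are exponentially distributed around the mean.
            length = min(genome_len, max(1, round(rng.expovariate(1 / mean_lens[kind]))))
            # An insertion occupies a single point on the original genome.
            span = 0 if kind == "insertion" else length
            start = rng.randrange(genome_len - span + 1)
            cand = (start, start + span)
            if any(overlaps(cand, (s, e)) for s, e, _, _ in events):
                continue  # reject and redraw so that no two mutations overlap
            events.append((cand[0], cand[1], kind, length))
            placed += 1
    return sorted(events)


def mutate(genome, counts, mean_lens, n_snps, seed=0):
    """Return (mutated genome, list of structural events)."""
    rng = random.Random(seed)
    events = place_events(len(genome), counts, mean_lens, rng)
    out, pos = [], 0
    for start, end, kind, length in events:
        out.append(genome[pos:start])  # unchanged sequence before the event
        if kind == "inversion":
            out.append(reverse_complement(genome[start:end]))
        elif kind == "insertion":
            out.append("".join(rng.choice(BASES) for _ in range(length)))
        # A deletion contributes nothing: the interval is simply skipped.
        pos = end
    out.append(genome[pos:])
    mutated = list("".join(out))
    # SNPs are applied last and may land inside other mutations.
    for i in rng.sample(range(len(mutated)), min(n_snps, len(mutated))):
        mutated[i] = rng.choice([b for b in BASES if b != mutated[i]])
    return "".join(mutated), events
```

A call such as `mutate(genome, {"inversion": 2, "insertion": 3, "deletion": 2}, {"inversion": 10, "insertion": 5, "deletion": 8}, n_snps=4)` returns the mutated sequence together with the list of events, which can serve as ground truth when evaluating a detection algorithm.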

  9. Simulating Genomic Sequencing. A program takes a human genome and creates paired reads, writing the read pairs to a file. All reads have the same length, and the separation between the two reads in a pair is drawn from a normal distribution. The program can simulate sequencing errors when creating the paired reads.

  10. Simulating Genomic Sequencing. The user can control the total number of reads, the read length, the mean read separation, and the sequencing error rate.

  11. Genome to be sequenced: choose uniformly distributed random locations.

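  12. Genome to be sequenced: create a read pair at each location and choose a random direction for each read. The read length L is a constant, while the separation d is random (normally distributed). [Diagram: two reads of length L separated by d1, with the read direction marked]

  13. [Diagram: two more read pairs, each with reads of length L, separated by d2 and d3, with the read direction marked]

  14. Resulting paired reads. [Diagram: three read pairs with reads of length L and separations d1, d2, and d3]

  15. Paired reads with simulated sequencing errors. [Diagram: the same three read pairs with separations d1, d2, and d3]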
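The read simulator on slides 9 to 15 might look roughly like the sketch below. It assumes substitution-only sequencing errors, an independently chosen orientation for each read, and a user-supplied standard deviation for the separation; none of these details are given in the slides, and the function and parameter names are hypothetical.

```python
import random

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT.get(b, "N") for b in reversed(seq))


def add_errors(read, error_rate, rng):
    """Replace each base by a different base with probability error_rate."""
    out = []
    for b in read:
        if rng.random() < error_rate:
            out.append(rng.choice([x for x in BASES if x != b]))
        else:
            out.append(b)
    return "".join(out)


def simulate_read_pairs(genome, n_pairs, read_len, mean_sep, sd_sep,
                        error_rate=0.0, seed=0):
    """Return a list of (left_read, right_read, separation) tuples."""
    if 2 * read_len + mean_sep > len(genome):
        raise ValueError("genome too short for this read length and separation")
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < n_pairs:
        # Separation d is normally distributed; negative draws are clamped to 0.
        d = max(0, round(rng.gauss(mean_sep, sd_sep)))
        span = 2 * read_len + d
        if span > len(genome):
            continue  # pair would not fit on the genome; redraw
        # Uniformly distributed location for the pair.
        start = rng.randrange(len(genome) - span + 1)
        left = genome[start:start + read_len]
        right = genome[start + read_len + d:start + span]
        # Assumption: each read's direction is chosen independently.
        if rng.random() < 0.5:
            left = reverse_complement(left)
        if rng.random() < 0.5:
            right = reverse_complement(right)
        pairs.append((add_errors(left, error_rate, rng),
                      add_errors(right, error_rate, rng), d))
    return pairs
```

Returning the true separation `d` alongside each pair is a convenience for testing; a real sequencer reports only the reads, so a detection algorithm would have to infer the separation from the mapped positions.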

  16. program runtime 1 ~ window size


  18. [Plot panels: 50 insertions, 100 insertions, 500 insertions]