Sequence Alignment in DNA

Sequence Alignment in DNA. Presented by: Lalchand Gaurav Jain. Under the guidance of: Prof. Kolin Paul. Agenda: Application Domain & Objective; General Alignment Procedure; Scope of Parallelism in BWT; Selection Sort and Quick Sort Implementation; BWT Implementation on GPU.

Presentation Transcript


  1. Sequence Alignment in DNA • Presented by: Lalchand Gaurav Jain • Under the guidance of: Prof. Kolin Paul

  2. Agenda • Application Domain & Objective • General Alignment Procedure • Scope of Parallelism in BWT • Selection Sort and Quick Sort Implementation • BWT Implementation on GPU • Comparative Study

  3. Time-Line • Application Domain & Objective • General Alignment Procedure • Scope of Parallelism in BWT • Selection Sort and Quick Sort Implementation • BWT Implementation on GPU • Comparative Study

  8. Application Domain & Objective • Analyzing gene expression • Mapping variations between individuals • Mapping homologous proteins • Assembling the genome of an organism • Objective: to present an efficient implementation, especially a parallel one, for searching for short sequences in DNA.

  9. Basic Alignment Procedure • Genome: Indexing with the BWT algorithm (to be parallelized); intermediate size: 10^18 • BWT: Bwt[i] = Ref[SA[i] - 1] {3 GB} • Suffix Array: 15 GB for the human genome {3 billion * 4 B + 3 GB genome} • Reads: Mapper • Searching: parallelized, O(log G) • Output: {Location, Occurrence}
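The 15 GB figure on this slide can be reproduced with a few lines of arithmetic. This is only a sketch: the constant and function names are made up for illustration, and it assumes decimal gigabytes, one byte per base, and one 4-byte index per suffix (3 billion fits in an unsigned 32-bit integer).

```python
# Hypothetical names; the numbers are the ones quoted on the slide.
GENOME_BASES = 3_000_000_000   # about 3 billion bases in the human genome
BYTES_PER_INDEX = 4            # one 32-bit suffix array entry per base
BYTES_PER_BASE = 1             # assumption: one byte per stored base


def index_memory_gb(bases=GENOME_BASES):
    """Return (suffix array GB, genome GB, total GB) in decimal gigabytes."""
    suffix_array_bytes = bases * BYTES_PER_INDEX
    genome_bytes = bases * BYTES_PER_BASE
    total_bytes = suffix_array_bytes + genome_bytes
    return suffix_array_bytes / 1e9, genome_bytes / 1e9, total_bytes / 1e9


print(index_memory_gb())  # (12.0, 3.0, 15.0)
```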

  10. Scope of Parallelism in BWT • With the BWT, a string of length w can be found in O(w) time. • The BWT is closely related to the suffix array (SA): the lexicographically sorted list of all suffixes of the genome. • Bwt[i] = ref[SA[i] - 1] {Bwt[i] = $ when SA[i] is the first position of ref}
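The O(w) search claim refers to backward search over the BWT: the pattern is consumed from its last character to its first, and each character narrows an interval of suffix array rows. The sketch below is illustrative only and not the presenters' code. The function name `count_occurrences` is an assumption, and the rank query is done naively with `str.count`, so this version is slower than O(w); a real index would precompute the occurrence table so each step is O(1).

```python
def count_occurrences(bwt, pattern):
    """Count occurrences of pattern in the text whose BWT is given.

    Assumes the text ended with a unique '$' that sorts before every
    other symbol (true for '$' versus A, C, G, T in ASCII).
    """
    # first_row[c] = number of symbols in the text strictly smaller than c.
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1
    first_row = {}
    total = 0
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]

    def occ(ch, i):
        # Naive rank query: occurrences of ch in bwt[0:i].
        return bwt.count(ch, 0, i)

    lo, hi = 0, len(bwt)  # half-open interval of suffix array rows
    for ch in reversed(pattern):  # one interval update per pattern symbol
        if ch not in first_row:
            return 0
        lo = first_row[ch] + occ(ch, lo)
        hi = first_row[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


bwt = "AT$ACG"  # BWT of "ACGTA$"
print(count_occurrences(bwt, "A"))   # 2
print(count_occurrences(bwt, "CG"))  # 1
print(count_occurrences(bwt, "GA"))  # 0
```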

  11. Initial Step - 1 • Implementation of BWT using Selection Sort • OpenMP

  12. Selection Sort - OpenMP

  13. Initial Step - 2 • Implementation of BWT using Selection Sort • OpenMP • Implementation of BWT using Quick Sort • OpenMP

  14. Quick Sort - OpenMP

  15. Initial Step - 3 • Implementation of BWT using Selection Sort • OpenMP • Implementation of BWT using Quick Sort • OpenMP • Implementing BWT on GPU • Bitonic sort

  16. Why Bitonic? • A bitonic sequence is the concatenation of two sub-sequences sorted in opposite directions, or a cyclic shift of such a sequence • Implemented by comparator networks • Works in place • No communication • Naturally suitable for SIMD architectures: each thread executes the same code on different data • O(log^2 n) time and O(n log^2 n) work
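The comparator network behind these properties can be sketched sequentially. The version below is a plain Python simulation, not GPU code, and the function name `bitonic_sort` is an assumption. Every iteration of the inner `for` loop touches a disjoint pair of elements, which is why each (k, j) step could be handed to one thread per element. The function also counts the steps, which come to log2(n) * (log2(n) + 1) / 2.

```python
def bitonic_sort(values):
    """Sort values in place with a bitonic network; return the step count.

    The length must be a power of two.
    """
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    steps = 0
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            # All comparisons in this loop are independent of each other.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    out_of_order = (values[i] > values[partner]) if ascending \
                        else (values[i] < values[partner])
                    if out_of_order:
                        values[i], values[partner] = values[partner], values[i]
            steps += 1
            j //= 2
        k *= 2
    return steps


data = [7, 3, 6, 0, 5, 1, 4, 2]
steps = bitonic_sort(data)
print(data, steps)  # [0, 1, 2, 3, 4, 5, 6, 7] 6
```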

  17. Burrows-Wheeler Transform • Basic String Sorting Algorithm • Input: A C G T A $ • Indices: 0 1 2 3 4 5 • Sorted indices (suffix array): 5 4 0 1 2 3 • Output: A T $ A C G
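The example on this slide can be checked with a naive sketch that sorts suffix start positions directly and then reads the character before each one. The helper names `suffix_array` and `bwt` are illustrative assumptions, and the sort uses Python's built-in `sorted` rather than any of the parallel sorts discussed in the talk.

```python
def suffix_array(ref):
    """Suffix start positions in lexicographic order of their suffixes."""
    return sorted(range(len(ref)), key=lambda i: ref[i:])


def bwt(ref, sa):
    """Bwt[i] = ref[SA[i] - 1]; index -1 wraps to the final '$'."""
    return "".join(ref[i - 1] for i in sa)


ref = "ACGTA$"
sa = suffix_array(ref)
print(sa)                      # [5, 4, 0, 1, 2, 3]
print(" ".join(bwt(ref, sa)))  # A T $ A C G
```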

  18. Steps Performed • Copy the genome from host to device memory • Build an indices array pointing into the reference string • Compare suffixes based on the indices array • Swap indices accordingly • Sorts n elements in O(log^2 n) kernel calls, each of O(1) time and O(n) work • One more step derives the BWT from the suffix array: Bwt[i] = ref[SA[i] - 1] {Bwt[i] = $ when SA[i] is the first position of ref}
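The steps above can be sketched on the CPU as follows. This is a hedged Python simulation rather than the CUDA implementation: `bitonic_sort_step` stands in for one kernel call, its `for` loop stands in for one thread per index, and all function names are assumptions. The reference is assumed to end with a unique `$` and to have a power-of-two length.

```python
def suffix_greater(ref, a, b):
    """Compare the suffixes starting at positions a and b."""
    return ref[a:] > ref[b:]


def bitonic_sort_step(ref, indices, j, k):
    """One compare-and-swap pass; stands in for a single kernel call."""
    for i in range(len(indices)):      # on a GPU: one thread per i
        partner = i ^ j
        if partner > i:
            ascending = (i & k) == 0
            # Suffixes are distinct, so "not greater" means "smaller".
            if suffix_greater(ref, indices[i], indices[partner]) == ascending:
                indices[i], indices[partner] = indices[partner], indices[i]


def bwt_via_bitonic(ref):
    """Return (suffix array, BWT) for ref using the bitonic network."""
    n = len(ref)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    indices = list(range(n))           # indices array into the reference
    k = 2
    while k <= n:                      # host loop issuing the step calls
        j = k // 2
        while j >= 1:
            bitonic_sort_step(ref, indices, j, k)
            j //= 2
        k *= 2
    # Final step: Bwt[i] = ref[SA[i] - 1]; index -1 wraps to '$'.
    return indices, "".join(ref[s - 1] for s in indices)


sa, transformed = bwt_via_bitonic("ACGTACG$")
print(sa, transformed)
```

Only the indices are moved; the reference string itself is never rearranged, which matches the "swap indices accordingly" step.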

  19. CPU – GPU Interaction (BWT) • Diagram labels: Initialise_indices_array • O(log2G) • Searching • Genome • Bitonic_sort_step • Cuda_Memcpy & kernel call • CPU • Bitonic • Suffix Array • Suffix_compare • Suffix -> BWT

  20. Evaluation: BWT with Bitonic Sort

  21. Comparison between Expected (GPU) and Exact result • ((Quick_Sort_time) * 2) / 240

  23. Thanks

  24. Future Work • Run in limited-memory environments: compute in parts • Use the memory hierarchy of the GPU: sort keys are cached in registers or shared memory • Long runs of repeated characters: store a position indicating the end of the run • Can only sort sequences whose length is a power of 2: a sequence of length 2^k + 1 must grow to 2^(k+1), by padding with the largest symbol
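The padding idea in the last bullet can be sketched as below. This is an illustration under assumptions: the helper names are hypothetical, the keys are plain integers rather than suffixes, and the caller supplies a `largest` value that is at least as large as every real key so the padding sorts to the end and can be cut off afterwards.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (returns 1 for n <= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def pad_to_power_of_two(keys, largest):
    """Return a copy of keys extended with `largest` to a power-of-two length."""
    target = next_power_of_two(len(keys))
    return list(keys) + [largest] * (target - len(keys))


keys = [3, 1, 2, 5, 4]                        # length 2^2 + 1 = 5
padded = pad_to_power_of_two(keys, largest=9)
print(len(padded))                            # 8, i.e. 2^3
print(sorted(padded)[:len(keys)])             # [1, 2, 3, 4, 5]
```

In the worst case, a length of 2^k + 1, this nearly doubles the input, which is presumably why the slide lists it as a limitation.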
