Parallel Implementation of BWT

Parallel Implementation of BWT
Presented by: Lalchand Gaurav Jain
Under the guidance of: Prof. Kolin Paul

Agenda
• Application domain & objective
• Use of BWT in sequence assembly
• BWT implementation on GPU
• BWT implementation for larger genome
• Comparative study

Application Domain & Objective
• Analyzing gene expression
• Mapping variations between individuals
• Mapping homologous proteins
• Assembling the genome of an organism
Objective: to present an efficient implementation of BWT for larger genomes.

Use of BWT in Sequence Assembly
[Diagram labels: Genome; Indexing; BWT Algorithm; Intermediate size: 10^18; Assembly Process; SGA; Contigs]
• BWT: Bwt[i] = ref[SA[i]-1] {3 GB}
• Suffix array: 15 GB for human genome {3 billion * 4 B + 3 GB genome}
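The 15 GB figure above can be reproduced with a few lines of arithmetic. This is only a sketch: the function name `suffix_array_memory_bytes` and its parameters are hypothetical, and it assumes one byte per base, 4-byte suffix indices, and decimal gigabytes (10^9 bytes).

```python
def suffix_array_memory_bytes(num_bases, bytes_per_index=4, bytes_per_base=1):
    """Memory for a suffix array plus the genome it indexes (hypothetical helper)."""
    suffix_array = num_bases * bytes_per_index  # one index per suffix
    genome = num_bases * bytes_per_base         # the reference text itself
    return suffix_array + genome


HUMAN_GENOME_BASES = 3_000_000_000  # roughly 3 billion bases, as on the slide
total = suffix_array_memory_bytes(HUMAN_GENOME_BASES)
print(total / 10**9, "GB")  # 12 GB of indices + 3 GB of genome
```

A 4-byte unsigned index can address up to 4,294,967,295 positions, which is enough for 3 billion suffixes.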

Burrows-Wheeler Transform
Input: A C G T A $
Indices: 0 1 2 3 4 5
Suffix array (indices of the sorted suffixes): 5 4 0 1 2 3
Output: A T $ A C G
• Bwt[i] = ref[SA[i]-1] {Bwt[i] = $ when SA[i] = 0}
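The example above can be checked with a minimal serial sketch. The helper names `suffix_array` and `bwt_from_sa` are hypothetical. The suffix array is built here by naive sorting of all suffixes, not by the GPU method of the presentation, and `$` is assumed to sort before every base (true for ASCII).

```python
def suffix_array(ref):
    """Indices of all suffixes of ref in lexicographic order (naive sort)."""
    return sorted(range(len(ref)), key=lambda i: ref[i:])


def bwt_from_sa(ref, sa):
    """Bwt[i] = ref[SA[i]-1], with Bwt[i] = '$' when SA[i] = 0."""
    return "".join("$" if s == 0 else ref[s - 1] for s in sa)


ref = "ACGTA$"
sa = suffix_array(ref)
print(sa)                    # [5, 4, 0, 1, 2, 3]
print(bwt_from_sa(ref, sa))  # AT$ACG
```

Slicing `ref[i:]` copies each suffix, so this is only suitable for small inputs.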

Work Done
• Implemented BWT on GPU
• Bitonic sort
• Implemented BWT for larger genome
• In multipass (GPU and CPU)

Why Bitonic?
• A bitonic sequence is a concatenation of two sub-sequences sorted in opposite directions, or a cyclic shift of such a sequence
• Implemented by comparator networks
• Works in place
• No communication
• Naturally suitable for SIMD architectures
• Each thread executes the same code on different data
• O(log^2 n) time and O(n log^2 n) work
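The comparator network can be sketched serially as follows. This is an illustration under assumptions, not the presentation's GPU kernel: the loop over `i` stands in for one GPU thread per element, the function names are hypothetical, and the input length is assumed to be a power of two.

```python
def bitonic_sort_step(values, j, k):
    """One comparator stage: element i is compared with its partner i XOR j."""
    for i in range(len(values)):  # on a GPU, each i would be a separate thread
        partner = i ^ j
        if partner > i:  # handle each pair once
            ascending = (i & k) == 0  # sort direction of this block of size k
            if (values[i] > values[partner]) == ascending:
                values[i], values[partner] = values[partner], values[i]


def bitonic_sort(values):
    """Sort values in place; len(values) must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:  # size of the bitonic sequences being merged
        j = k // 2
        while j > 0:  # comparator distance within this merge
            bitonic_sort_step(values, j, k)
            j //= 2
        k *= 2


data = [3, 7, 4, 8, 6, 2, 1, 5]
bitonic_sort(data)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The pairs touched within one step are disjoint, so the comparisons of a step can run in any order or in parallel. There are log n (log n + 1) / 2 steps of n/2 comparators each, which gives the O(log^2 n) time and O(n log^2 n) work quoted above.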

BWT Procedure for Larger Genome
[Flowchart labels: Genome; 2*CHUNK; Read & store (CPU); Bitonic_sort_step; Calculate Gt array; Merge Suffix array (CPU); Suffix -> BWT; Calculate Gap array; Suffix array (CPU)]
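The chunk-then-merge idea behind the multipass procedure can be illustrated with a much-simplified serial sketch. It does not compute the Gt or gap arrays named in the flowchart. As a plainly swapped-in simplification, each chunk's suffixes are sorted independently (standing in for the GPU sort) and the sorted runs are merged on the CPU by direct suffix comparison. The names `chunked_suffix_array` and `chunk_size` are hypothetical.

```python
import heapq


def chunked_suffix_array(text, chunk_size):
    """Sort the suffixes of each chunk separately, then merge the sorted runs."""
    def suffix(i):
        return text[i:]  # full suffix, so comparisons cross chunk borders

    runs = []
    for start in range(0, len(text), chunk_size):
        indices = range(start, min(start + chunk_size, len(text)))
        runs.append(sorted(indices, key=suffix))  # per-chunk sort
    return list(heapq.merge(*runs, key=suffix))   # CPU-side merge


ref = "ACGTA$"
sa = chunked_suffix_array(ref, chunk_size=2)
print(sa)  # [5, 4, 0, 1, 2, 3]
bwt = "".join("$" if s == 0 else ref[s - 1] for s in sa)
print(bwt)  # AT$ACG
```

Only one chunk needs to be sorted at a time, which is what lets the sort stage fit into limited GPU memory. A practical implementation would avoid materializing full suffixes during the merge.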

Comparison between Parallel BWT (GPU) and Serial BWT (CPU)
• Serial BWT: does not work for large files

Comparison between Parallel BWT (GPU) and Parallel BWT (CPU)

Evaluation for larger Genome


Thanks
