Co-processors for speeding up drug design algorithms

Advait Jain, Priyanka Jindal, Pulkit Gambhir

Objective: To design FPGA-based hardware accelerators for speeding up the energy minimization process.

Bottleneck Functions
• Eval energy
• Diff energy
• Eval Energy for step
• Iterate over list of bonds {O(N) elements}
• Iterate over list of angles {O(N) elements}
• Iterate over list of dihedrals {O(N) elements}
• Iterate over list of non-bonded pairs {O(N^2) elements}
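To see why the non-bonded list dominates the other three, the sketch below counts unordered atom pairs for a few molecule sizes. The helper name `all_pairs` is hypothetical, and the count is only an upper bound: it assumes every pair of atoms is a non-bonded pair, which the slides do not state.

```python
def all_pairs(n_atoms: int) -> int:
    """Number of unordered atom pairs, n * (n - 1) / 2 (an upper bound
    on the non-bonded list length, assuming no pair is excluded)."""
    return n_atoms * (n_atoms - 1) // 2


if __name__ == "__main__":
    for n in (100, 1000, 2008):
        # The pair count grows quadratically while bonds, angles and
        # dihedrals grow only linearly with the number of atoms.
        print(n, all_pairs(n))
```

For a molecule of size 2008 this bound is slightly above the VanderList length reported later in the slides, presumably because some pairs are excluded from the non-bonded list.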

Mathematical operations
• EvalEnergy_for_step()
• DiffEnergy()

Non-bonded List
Node structure
• Float A, B, C (4*3 bytes)
• Int a1, a2
A and B are functions of the radius and epsilon of the atoms.
• 192 distinct (A, B) pairs (an index fits in 1 byte)
C is a function of the charges q1 and q2 of the atoms.
• 471,282 distinct Cs (an index fits in 3 bytes)
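A minimal sketch of how such a node might be consumed during energy evaluation. The slides do not give the energy formula, so the 12-6 Lennard-Jones plus Coulomb form below is an assumption, as are the names `NonBondedNode` and `nonbonded_energy`.

```python
import math
from dataclasses import dataclass


@dataclass
class NonBondedNode:
    a1: int    # index of the first atom
    a2: int    # index of the second atom
    A: float   # repulsive coefficient (depends on radius and epsilon)
    B: float   # attractive coefficient (depends on radius and epsilon)
    C: float   # electrostatic coefficient (depends on q1 and q2)


def nonbonded_energy(nodes, coords):
    """Sum the pair energies over the non-bonded list.

    ASSUMPTION: each pair contributes A/r^12 - B/r^6 + C/r; the
    slides only say what A, B and C depend on.
    """
    total = 0.0
    for node in nodes:
        r = math.dist(coords[node.a1], coords[node.a2])
        total += node.A / r**12 - node.B / r**6 + node.C / r
    return total
```

Every node carries its own copy of A, B and C, so the same coefficient values are stored many times across the list.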

New Data Structure
New node structure
• Int a1, a2 (indices into the 3D coordinates of atoms)
• Unsigned common_index (3 bytes index the vector of distinct Cs, 1 byte indexes the vector of distinct (A, B) pairs)
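One way to hold both indices in a single 32-bit value is sketched below. The bit layout (C index in the upper 3 bytes, (A, B) index in the low byte) and the function names are assumptions for illustration; the slides only give the 3-byte and 1-byte widths.

```python
C_INDEX_BITS = 24   # 3 bytes for the index into the distinct-C vector
AB_INDEX_BITS = 8   # 1 byte for the index into the distinct-(A, B) vector


def pack_common_index(c_index: int, ab_index: int) -> int:
    """Pack both indices into one unsigned 32-bit value (hypothetical layout)."""
    if not 0 <= c_index < (1 << C_INDEX_BITS):
        raise ValueError("c_index does not fit in 3 bytes")
    if not 0 <= ab_index < (1 << AB_INDEX_BITS):
        raise ValueError("ab_index does not fit in 1 byte")
    return (c_index << AB_INDEX_BITS) | ab_index


def unpack_common_index(common_index: int) -> tuple[int, int]:
    """Recover (c_index, ab_index) from a packed value."""
    return common_index >> AB_INDEX_BITS, common_index & ((1 << AB_INDEX_BITS) - 1)
```

The widths are sufficient for the counts quoted in the slides: 471,282 distinct Cs is below 2^24, and 192 distinct (A, B) pairs is below 2^8.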

Generating the proposed Data Structure
• Given q1, q2, calculate C
• Insert (C, C_index) into the hash table (key: C, data: C_index)
• Set Node.common_index = C_index (corresponding to C)
• Repeat for all non-bonded pairs
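The steps above can be sketched with a Python dictionary standing in for the hash table. The function name `build_c_index` is hypothetical, and the default `coulomb` callable (a plain product of the charges) is a placeholder assumption, since the slides say only that C is a function of q1 and q2.

```python
def build_c_index(charge_pairs, coulomb=lambda q1, q2: q1 * q2):
    """Assign one index per distinct C value.

    Returns (c_to_index, per_pair_index):
      c_to_index     maps each distinct C to its C_index (the hash table)
      per_pair_index holds the C_index for every non-bonded pair, in order
    """
    c_to_index = {}
    per_pair_index = []
    for q1, q2 in charge_pairs:
        c = coulomb(q1, q2)
        # A new C gets the next free index; a repeated C reuses its index.
        c_index = c_to_index.setdefault(c, len(c_to_index))
        per_pair_index.append(c_index)
    return c_to_index, per_pair_index
```

Keys are compared by exact floating-point equality here, so two C values that differ only by rounding would be treated as distinct.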

Hash Table to distinct C vector
• Hash table: (C, C_index)
• Vector of Distinct Cs: (C_index, C)
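A minimal sketch of this inversion, assuming the hash table maps each distinct C to a dense index in the range 0 to len - 1. The function name `distinct_c_vector` is hypothetical.

```python
def distinct_c_vector(c_to_index):
    """Turn a {C: C_index} table into a list where vector[C_index] == C."""
    vector = [0.0] * len(c_to_index)
    for c, c_index in c_to_index.items():
        vector[c_index] = c
    return vector
```

After this step the hash table is only needed while building the list; energy evaluation can look up C by position in the vector.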

Result of new data structure
Molecule Size: 2008
• VanderList: 2,008,417
• AB_Vander list: 136
• C_Vanderlist: 21,651
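A rough footprint estimate from these counts is sketched below. It assumes 4-byte integers, 4-byte floats, no padding, and that the three numbers are element counts of the non-bonded list, the distinct (A, B) table and the distinct C table; none of this is confirmed by the slides beyond the 4-byte float size.

```python
FLOAT_BYTES = 4  # from the slides: Float A, B, C takes 4*3 bytes
INT_BYTES = 4    # ASSUMPTION: 4-byte int and unsigned

PAIRS = 2_008_417      # VanderList
DISTINCT_AB = 136      # AB_Vander list
DISTINCT_C = 21_651    # C_Vanderlist

# Old node: a1, a2 plus A, B, C stored in every node.
old_bytes = PAIRS * (2 * INT_BYTES + 3 * FLOAT_BYTES)

# New node: a1, a2 plus one packed common_index; coefficients live in
# two small shared tables.
new_bytes = (PAIRS * (2 * INT_BYTES + INT_BYTES)
             + DISTINCT_AB * 2 * FLOAT_BYTES
             + DISTINCT_C * FLOAT_BYTES)

if __name__ == "__main__":
    print(f"old: {old_bytes:,} bytes")
    print(f"new: {new_bytes:,} bytes")
    print(f"ratio: {new_bytes / old_bytes:.2f}")
```

Under these assumptions the shared tables add well under 100 KB, so nearly all of the saving comes from the smaller per-pair node.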

Cache Profiling (old vs new)
D1 misses
• Old: 3,158,603,092 (3,152,690,177 rd + 5,912,915 wr)
• New: 2,872,958,414 (2,868,217,925 rd + 4,740,489 wr)
L2d misses
• Old: 1,270,584,560 (1,266,933,599 rd + 3,650,961 wr)
• New: 503,167,419 (500,920,315 rd + 2,247,104 wr)
L2 misses
• Old: 1,270,606,180 (1,266,955,219 rd + 3,650,961 wr)
• New: 503,188,994 (500,941,890 rd + 2,247,104 wr)
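The relative reduction in misses can be worked out directly from these totals. The sketch assumes the first figure of each pair is the old data structure and the second is the new one, as the slide title suggests; the helper name `reduction_percent` is hypothetical.

```python
def reduction_percent(old: int, new: int) -> float:
    """Percentage by which `new` is lower than `old`."""
    return 100.0 * (old - new) / old


MISSES = {
    "D1":  (3_158_603_092, 2_872_958_414),
    "L2d": (1_270_584_560, 503_167_419),
    "L2":  (1_270_606_180, 503_188_994),
}

if __name__ == "__main__":
    for level, (old, new) in MISSES.items():
        print(f"{level}: {reduction_percent(old, new):.1f}% fewer misses")
```

On these numbers the first-level data cache misses drop by roughly 9%, while the L2 misses drop by roughly 60%.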

Bottleneck Functions
• Eval energy
• Diff energy
• Eval Energy for step
• Iterate over list of bonds {O(N) elements}
• Iterate over list of angles {O(N) elements}
• Iterate over list of dihedrals {O(N) elements}
• Iterate over list of non-bonded pairs {O(N^2) elements}

Split Up code

Ongoing Work
• Multiple threads operating on the Non-bonded list together.
• Floating point precision requirement.
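A minimal sketch of how several threads might share one list of per-pair energy terms: the list is split into contiguous chunks, each worker sums its chunk, and the partial sums are combined. The names `split_chunks` and `parallel_sum` are hypothetical, and this is not the authors' design. Python threads are used only to show the partitioning; they will not give a real speed-up for this kind of arithmetic.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def split_chunks(items, n_chunks):
    """Split `items` into at most `n_chunks` contiguous slices."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(terms, n_threads=4):
    """Sum per-pair energy terms using one partial sum per chunk."""
    parts = split_chunks(terms, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # math.fsum tracks rounding error, so the result depends far
        # less on how the list happens to be partitioned.
        partials = list(pool.map(math.fsum, parts))
    return math.fsum(partials)
```

The partitioning also bears on the precision question: with ordinary floating-point addition, changing the number of chunks changes the order of additions and can change the low-order bits of the total.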

Tentative Schedule
• Software Profiling (August)
  • No. of calls
  • Cache misses
  • Effect of parameters
• Control Flow Analysis (August - September)
  • Flow Diagram
  • Data parallelism
  • Floating point precision requirement
• Exploring H/W Options (September - October)
  • Platform Selection
  • S/W H/W Partitioning
• Implementation (October onwards)
  • Analysis
