Designing hardware accelerators on FPGA for energy minimization in drug design algorithms. Analyzing bottlenecks and proposing new data structures for efficient processing. Ongoing work includes code optimization and exploring hardware options.
Co-processors for speeding up drug design algorithms
Advait Jain, Priyanka Jindal, Pulkit Gambhir
Objective
To design FPGA-based hardware accelerators that speed up the energy minimization process.
Bottleneck Functions
• Eval energy
• Diff energy
Eval Energy for step
• Iterate over list of bonds {O(N) elements}
• Iterate over list of angles {O(N) elements}
• Iterate over list of dihedrals {O(N) elements}
• Iterate over list of non-bonded pairs {O(N^2) elements}
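To see why the non-bonded loop dominates the other three, a rough count helps. The sketch below assumes every unordered atom pair is a candidate non-bonded pair, which is only an upper bound and not necessarily how the list is actually built; the function name is hypothetical.

```python
def max_nonbonded_pairs(n_atoms: int) -> int:
    """Upper bound on unordered atom pairs: n choose 2, which grows as O(N^2)."""
    return n_atoms * (n_atoms - 1) // 2


if __name__ == "__main__":
    # Pair counts grow quadratically with the number of atoms.
    for n in (100, 1000, 2008):
        print(n, max_nonbonded_pairs(n))
```

For the 2008-atom molecule reported in the results slide, this bound is 2,015,028 pairs, close to the 2,008,417 entries listed for VanderList.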
Mathematical operations
• EvalEnergy_for_step()
• DiffEnergy()
Non-bonded List
Node structure
• Float A, B, C (4*3 bytes)
• Int a1, a2
A and B are a function of the radius and epsilon of the atoms: 192 distinct (A, B) pairs (1 byte is enough to index them).
C is a function of the charges q1 and q2 of the atoms: 471,282 distinct Cs (3 bytes are enough to index them).
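The byte counts on this slide can be checked with a small sketch. It assumes 32-bit floats and 32-bit ints with no padding, and mirrors the node with `ctypes` purely to measure its size; the class name `OldNode` and the helper `index_bytes` are hypothetical, not names from the original code.

```python
import ctypes


class OldNode(ctypes.Structure):
    """Layout of the original non-bonded node (assumed 32-bit fields)."""
    _fields_ = [
        ("A", ctypes.c_float),
        ("B", ctypes.c_float),
        ("C", ctypes.c_float),
        ("a1", ctypes.c_int32),
        ("a2", ctypes.c_int32),
    ]


def index_bytes(distinct_count: int) -> int:
    """Whole bytes needed to index `distinct_count` distinct values (0-based)."""
    bits = max(1, (distinct_count - 1).bit_length())
    return (bits + 7) // 8


if __name__ == "__main__":
    print("old node size:", ctypes.sizeof(OldNode), "bytes")
    print("bytes to index 192 (A, B) pairs:", index_bytes(192))
    print("bytes to index 471,282 Cs:", index_bytes(471_282))
```

Under these assumptions each node stores 12 bytes of coefficients, while a 1-byte and a 3-byte index would be enough to name the same values.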
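New Data Structure
New node structure
• Int a1, a2
• Unsigned common_index: 3 bytes index the vector of distinct Cs, 1 byte indexes the vector of distinct (A, B) pairs
Shared data
• 3D coordinates of atoms
• Vector of distinct Cs
• Vector of distinct (A, B) pairs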
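One way to read the 3 + 1 split of common_index is as a packed 32-bit word. The sketch below is an assumption about the bit layout (C index in the upper 24 bits, (A, B) index in the lower 8 bits); the slide does not say which part sits where, and the names `NewNode`, `pack_common_index`, and `unpack_common_index` are hypothetical.

```python
import ctypes

C_BITS = 24   # assumed: 3 bytes for the index into the distinct-C vector
AB_BITS = 8   # assumed: 1 byte for the index into the distinct-(A, B) vector


class NewNode(ctypes.Structure):
    """Compact node: two atom indices plus one packed lookup index."""
    _fields_ = [
        ("a1", ctypes.c_int32),
        ("a2", ctypes.c_int32),
        ("common_index", ctypes.c_uint32),
    ]


def pack_common_index(c_index: int, ab_index: int) -> int:
    """Combine both table indices into one unsigned 32-bit value."""
    if not 0 <= c_index < (1 << C_BITS):
        raise ValueError("c_index does not fit in 3 bytes")
    if not 0 <= ab_index < (1 << AB_BITS):
        raise ValueError("ab_index does not fit in 1 byte")
    return (c_index << AB_BITS) | ab_index


def unpack_common_index(common_index: int) -> tuple:
    """Return (c_index, ab_index) from a packed value."""
    return common_index >> AB_BITS, common_index & ((1 << AB_BITS) - 1)


if __name__ == "__main__":
    node = NewNode(a1=3, a2=17, common_index=pack_common_index(21_650, 135))
    print(ctypes.sizeof(NewNode), unpack_common_index(node.common_index))
```

With this layout the energy loop would recover A, B, and C by two table lookups per node instead of reading three floats from the node itself.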
Generating the proposed Data Structure
For each non-bonded pair:
• Given q1, q2, calculate C
• Insert (C, C_index) into the hash table (Key: C, Data: C_index)
• Set Node.common_index = C_index (corresponding to C)
Repeat for all non-bonded pairs.
Hash Table to distinct C vector
• Hash table: (C, C_index)
• Vector of distinct Cs: (C_index, C)
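The two slides above can be sketched in a few lines, with a Python dict standing in for the hash table. This is an illustration under stated assumptions, not the authors' implementation: `build_c_table` is a hypothetical name, and the formula for C is passed in by the caller because the slides only say that C depends on q1 and q2.

```python
def build_c_table(charge_pairs, calc_c):
    """Deduplicate C values over all non-bonded pairs.

    charge_pairs: iterable of (q1, q2), one per non-bonded pair.
    calc_c: caller-supplied function computing C from q1 and q2.
    Returns (distinct_cs, node_c_indices), where distinct_cs[i] is the C
    value with index i and node_c_indices[k] is the index for pair k.
    """
    c_to_index = {}        # hash table, key: C, data: C_index
    node_c_indices = []
    for q1, q2 in charge_pairs:
        c = calc_c(q1, q2)
        # A new C gets the next free index; a known C reuses its index.
        c_index = c_to_index.setdefault(c, len(c_to_index))
        node_c_indices.append(c_index)

    # Invert the hash table into a vector addressed by C_index.
    distinct_cs = [0.0] * len(c_to_index)
    for c, c_index in c_to_index.items():
        distinct_cs[c_index] = c
    return distinct_cs, node_c_indices


if __name__ == "__main__":
    pairs = [(1.0, 2.0), (2.0, 1.0), (0.5, 0.5), (1.0, 2.0)]
    # The product q1 * q2 is only a placeholder for the real C formula.
    print(build_c_table(pairs, lambda q1, q2: q1 * q2))
```

Keys are compared by exact floating-point equality here, so two C values that differ only by rounding would get separate entries.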
Result of new data structure
Molecule Size: 2008
• VanderList: 2,008,417
• AB_Vander list: 136
• C_Vanderlist: 21,651
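These counts allow a rough estimate of the memory footprint before and after the change. The arithmetic below assumes 4-byte floats and ints, no padding, a node of (a1, a2, common_index) in the new layout, and one 4-byte float per distinct C plus two per distinct (A, B) pair; none of these sizes are stated outright in the slides.

```python
N_PAIRS = 2_008_417   # VanderList entries (from the slide)
N_AB = 136            # distinct (A, B) pairs (from the slide)
N_C = 21_651          # distinct C values (from the slide)

OLD_NODE_BYTES = 3 * 4 + 2 * 4   # Float A, B, C + Int a1, a2 (assumed sizes)
NEW_NODE_BYTES = 2 * 4 + 4       # Int a1, a2 + Unsigned common_index (assumed)


def old_bytes() -> int:
    """Footprint of the original list: coefficients stored in every node."""
    return N_PAIRS * OLD_NODE_BYTES


def new_bytes() -> int:
    """Footprint of the compact list plus the two shared lookup vectors."""
    return N_PAIRS * NEW_NODE_BYTES + N_AB * 2 * 4 + N_C * 4


if __name__ == "__main__":
    # Both counts fit the index widths: 1 byte for (A, B), 3 bytes for C.
    assert N_AB <= 2 ** 8 and N_C <= 2 ** 24
    print("old:", old_bytes(), "bytes")
    print("new:", new_bytes(), "bytes")
    print("ratio: %.2f" % (new_bytes() / old_bytes()))
```

Under these assumptions the list shrinks from about 40.2 MB to about 24.2 MB, and the lookup vectors add well under 100 KB.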
Cache Profiling (old vs new)
D1 misses
• Old: 3,158,603,092 (3,152,690,177 rd + 5,912,915 wr)
• New: 2,872,958,414 (2,868,217,925 rd + 4,740,489 wr)
L2d misses
• Old: 1,270,584,560 (1,266,933,599 rd + 3,650,961 wr)
• New: 503,167,419 (500,920,315 rd + 2,247,104 wr)
L2 misses
• Old: 1,270,606,180 (1,266,955,219 rd + 3,650,961 wr)
• New: 503,188,994 (500,941,890 rd + 2,247,104 wr)
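The relative improvement is easier to read as percentages. The sketch below assumes the first figure of each pair on the slide is the old data structure and the second is the new one, following the order in the slide title; the names are hypothetical.

```python
# (old, new) total miss counts copied from the slide.
MISSES = {
    "D1": (3_158_603_092, 2_872_958_414),
    "L2d": (1_270_584_560, 503_167_419),
    "L2": (1_270_606_180, 503_188_994),
}


def reduction_percent(old: int, new: int) -> float:
    """Percentage drop in misses going from the old to the new structure."""
    return 100.0 * (old - new) / old


if __name__ == "__main__":
    for level, (old, new) in MISSES.items():
        print("%-4s %.2f%% fewer misses" % (level, reduction_percent(old, new)))
```

On these numbers D1 misses fall by about 9%, while L2 misses fall by about 60%.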
Ongoing Work
• Multiple threads operating on the Non-bonded list together.
• Floating point precision requirement.
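A minimal sketch of the first item, assuming the non-bonded list can be split into contiguous chunks whose partial energies are summed independently and then combined. The names `chunked` and `parallel_energy` are hypothetical, and the per-node energy term is supplied by the caller because the slides do not give its formula.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, n_chunks):
    """Split items into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    start = 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        yield items[start:end]
        start = end


def parallel_energy(nodes, node_energy, n_threads=4):
    """Sum node_energy over all nodes, one partial sum per thread."""
    def partial(chunk):
        return sum(node_energy(node) for node in chunk)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Each worker reads only its own slice, so no locking is needed.
        return sum(pool.map(partial, chunked(nodes, n_threads)))


if __name__ == "__main__":
    nodes = list(range(10))
    # Placeholder energy term; the real one would use A, B, C and distance.
    print(parallel_energy(nodes, lambda node: node * 0.5, n_threads=3))
```

In CPython the global interpreter lock limits the speed-up of pure-Python work, so this shows only the partitioning pattern. Because floating-point addition is not associative, summing per chunk can change the rounding of the total, which ties this item to the precision question in the second bullet.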
Tentative Schedule
• Software Profiling (August)
  • No. of calls
  • Cache misses
  • Effect of parameters
• Control Flow Analysis (August - September)
  • Flow Diagram
  • Data parallelism
  • Floating point precision requirement
• Exploring H/W Options (September - October)
  • Platform Selection
  • S/W H/W Partitioning
• Implementation (October onwards)
  • Analysis