This work optimizes address generation for loop execution on DSPs with an auto-increment/decrement addressing architecture. It covers data ordering and address register allocation under loop-execution and architectural constraints. Experimental results show improvements in execution cycles compared to TI’s compiler.
Addressing Optimization for Loop Execution Targeting DSP with Auto-Increment/Decrement Architecture
Wei-Kai Cheng, Youn-Long Lin*
Computer & Communications Research Laboratories
*CS Department, NTHU, Taiwan
Overview
• Features:
  • Auto-Increment/Decrement for Address Generation
  • Constraints for Loop Execution
• Optimization Methods:
  • Multi-Phase Data Ordering
  • Graph-Based Address Register Allocation
  • Block Access Graph
New Constraints
• Loop Execution
  • Data Ordering Constraint
  • Address Register Allocation Constraint
• Architectural Constraint
  • Different arrays are stored in disjoint memory space
  • Multiple auto-increment/decrement ranges in the instruction set architecture
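To make the auto-increment/decrement range constraint concrete, the sketch below counts how many explicit address-register updates a loop body would need when a single address register walks a list of memory offsets. It is an illustrative cost model, not the authors' formulation: the function name `inserted_updates` and the parameters `max_step` (the largest step the addressing mode is assumed to apply for free) and `iteration_shift` (how far the first address moves between iterations) are assumptions introduced here.

```python
def inserted_updates(addresses, max_step=1, iteration_shift=0):
    """Count transitions that auto-increment/decrement cannot cover.

    addresses:       offsets visited by ONE address register, in access order
    max_step:        assumed largest |delta| the addressing mode applies for free
    iteration_shift: assumed shift of the first address in the next iteration
    """
    if not addresses:
        return 0
    # Transitions inside one iteration of the loop body.
    transitions = list(zip(addresses, addresses[1:]))
    # Loop back-edge: last access of this iteration to first access of the next.
    transitions.append((addresses[-1], addresses[0] + iteration_shift))
    # Each out-of-range transition is charged one explicit update instruction.
    return sum(1 for src, dst in transitions if abs(dst - src) > max_step)


if __name__ == "__main__":
    print(inserted_updates([0, 1, 2, 5], max_step=1))  # narrow range
    print(inserted_updates([0, 1, 2, 5], max_step=3))  # wider range
```

In this model a wider range can absorb a jump inside the loop body, while the jump across the loop back-edge is charged separately; that back-edge term is what distinguishes loop execution from a straight-line access sequence.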
Approach
• Split the access sequence into data lists
  • Array
  • Iteration Stride
• Data Ordering
• Address Register Allocation
  • Data Lists Merging or Splitting
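The first step of the approach, splitting the access sequence by array and iteration stride, can be sketched as follows. The slide does not give the exact partitioning rule, so this assumes each access has the form `array[stride * i + offset]` and that accesses sharing the same `(array, stride)` pair go into one data list. The `Access` record and `split_into_data_lists` are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical record for one access of the form array[stride * i + offset].
Access = namedtuple("Access", ["array", "stride", "offset"])


def split_into_data_lists(sequence):
    """Group accesses by (array, stride), keeping access order in each list."""
    data_lists = {}  # dicts keep insertion order in Python 3.7+
    for access in sequence:
        key = (access.array, access.stride)
        data_lists.setdefault(key, []).append(access)
    return list(data_lists.values())


if __name__ == "__main__":
    loop_body = [
        Access("a", 1, 0),   # a[i]
        Access("b", 1, 0),   # b[i]
        Access("a", 1, 1),   # a[i + 1]
        Access("a", 2, 0),   # a[2 * i]
    ]
    for data_list in split_into_data_lists(loop_body):
        print(data_list)
```

Keying on the array reflects the constraint that different arrays occupy disjoint memory space, so one address register cannot move between them with a small auto-increment step.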
Address Register Allocation
• # data lists > # address registers:
  • data list merging
• # data lists < # address registers:
  • data list splitting
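The two cases above can be illustrated with a small sketch that adjusts the number of data lists until it matches the number of address registers. This is not the graph-based allocation named in the overview: it swaps in a simple greedy heuristic (merge the two shortest lists, split the longest list in half) purely to show the merge and split operations. The function name `fit_to_registers` and the `(position, address)` item format are assumptions.

```python
def fit_to_registers(data_lists, num_registers):
    """Merge or split data lists so that each address register gets one list.

    Each item is assumed to be a (position, address) tuple, where position is
    the item's index in the original access sequence.
    """
    if num_registers < 1:
        raise ValueError("need at least one address register")
    lists = [sorted(d) for d in data_lists if d]

    # More lists than registers: merge the two shortest (assumed heuristic).
    while len(lists) > num_registers:
        lists.sort(key=len)
        first, second = lists.pop(0), lists.pop(0)
        # Re-sort by position so the merged list follows the access order.
        lists.append(sorted(first + second))

    # Fewer lists than registers: split the longest in half (assumed heuristic).
    while lists and len(lists) < num_registers:
        lists.sort(key=len)
        longest = lists.pop()
        if len(longest) < 2:
            lists.append(longest)  # nothing left to split
            break
        mid = len(longest) // 2
        lists.extend([longest[:mid], longest[mid:]])
    return lists


if __name__ == "__main__":
    three_lists = [[(0, 10)], [(1, 20)], [(2, 30), (3, 31)]]
    print(fit_to_registers(three_lists, 2))   # merging case
    one_list = [[(0, 0), (1, 1), (2, 8), (3, 9)]]
    print(fit_to_registers(one_list, 2))      # splitting case
```

A real allocator would pick which lists to merge or where to split based on the resulting address-update cost rather than on list length; length is used here only to keep the example short.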
Experimental Results * number of data lists and data ordering applied
Experimental Results (Cont.)
* ratio over TI’s compiler in terms of inserted instructions
Legend: T = TI’s compiler; O = our algorithm; o = data ordering; a = address register allocation
Experimental Results (Cont.)
* ratio over TI’s compiler in terms of execution cycles
Legend: T = TI’s compiler; O = our algorithm; o = data ordering; a = address register allocation
Conclusions
• Data ordering is not very effective for loop execution
• Data list splitting is more important than data list merging