QR Decomposition Algorithms on Various Architectures • Vito Dai, Brian Limketkai • CS 252 Project – Spring 2000
Outline • QR decomposition • Application • Algorithm • Parallelism • Implementation • Trimedia VLIW • Berkeley VIRAM • Performance results
Application of QR • Communication systems • Beam forming • Adaptive filtering Courtesy: DERA
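QR Array – Givens Rotation [Figure: QR array with inputs y, n1, n2, n3, n4. One cell type takes (xin, y), produces the angle θ and outputs (xout, 0); the other takes (xin, yin) together with θ and outputs (xout, yout).]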
CORDIC Algorithm [Figure: cell with input (xin, y), angle θ, and output (xout, 0).]
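A shift-and-add CORDIC that drives the second input to zero could look like the sketch below. It is an assumption-laden illustration rather than the deck's code: the Q16 fixed-point format, the 16-iteration count, and the function name `cordic_vector` are all chosen here. The local names `xshift`, `yshift`, `itanval`, and `cond` echo the loop fragment on the Parallelism slide, but their roles in this sketch are guessed.

```c
#include <assert.h>
#include <stdint.h>

#define CORDIC_ITERS 16

/* atan(2^-i) in radians, Q16 fixed point (value * 65536, rounded). */
static const int32_t atan_q16[CORDIC_ITERS] = {
    51472, 30386, 16055, 8150, 4091, 2047, 1024, 512,
    256, 128, 64, 32, 16, 8, 4, 2
};

/* Vectoring mode: rotate (x_in, y_in) until y is close to 0.
 * x_out is the vector length scaled by the CORDIC gain (about 1.6468),
 * angle is the accumulated rotation in Q16 radians.
 * Relies on >> being an arithmetic shift for negative values, which is
 * implementation-defined in C but holds on common compilers. */
void cordic_vector(int32_t x_in, int32_t y_in,
                   int32_t *x_out, int32_t *y_out, int32_t *angle)
{
    int32_t x = x_in, y = y_in, z = 0;

    assert(x_in >= 0);              /* basic CORDIC converges for the right half-plane */
    for (int i = 0; i < CORDIC_ITERS; i++) {
        int32_t xshift = x >> i;    /* both shifts use the values from before this step */
        int32_t yshift = y >> i;
        int32_t itanval = atan_q16[i];
        int cond = (y >= 0);        /* rotation direction for this step */

        if (cond) { x += yshift; y -= xshift; z += itanval; }
        else      { x -= yshift; y += xshift; z -= itanval; }
    }
    *x_out = x;
    *y_out = y;
    *angle = z;
}
```

Each iteration needs only shifts, adds, and a table lookup, which is why the same step maps onto simple hardware cells.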
Parallelism • High-level parallelism • Low-level parallelism • Medium-level parallelism [Code figure, truncated in extraction: for( i = 0; … xshift = … yshift = … itanval = … cond …]
Architectural Study • Parallel architectures • MIPS baseline • Trimedia VLIW • Berkeley VIRAM • Results
Trimedia VLIW • 5 instruction slots • Predicated instructions – 2.7 • Automated loop unrolling – 4.1 • 2-way rotate – 4.7 • Example instruction word: IF RCOND iadd R2V R1VSHIFT->R2V, IF RCOND isub R1V R2VSHIFT->R1V, IF RNCOND isub R2V R1VSHIFT->R2V, IF RNCOND iadd R1V R2VSHIFT->R1V, nop;
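The instruction word above fills the five slots with four guarded operations and a nop, so no branch is needed to pick the rotation direction. A C rendering of that one step might look like this sketch. The struct `pair_t` and the function name are hypothetical, and computing the shifted operands just before the guarded operations is an assumption about code the slide does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two working registers R1V and R2V. */
typedef struct { int32_t r1v, r2v; } pair_t;

/* One rotation step written as four guarded operations.
 * All four read the register values from before the step, as operations
 * issued in the same VLIW instruction word would. */
pair_t rotate_step_predicated(pair_t in, int shift, int rcond)
{
    assert(shift >= 0 && shift < 31);
    int32_t r1vshift = in.r1v >> shift;  /* assumed to come from an earlier instruction */
    int32_t r2vshift = in.r2v >> shift;
    int rncond = !rcond;                 /* complementary guard */
    pair_t out = in;

    if (rcond)  out.r2v = in.r2v + r1vshift;  /* IF RCOND  iadd R2V R1VSHIFT->R2V */
    if (rcond)  out.r1v = in.r1v - r2vshift;  /* IF RCOND  isub R1V R2VSHIFT->R1V */
    if (rncond) out.r2v = in.r2v - r1vshift;  /* IF RNCOND isub R2V R1VSHIFT->R2V */
    if (rncond) out.r1v = in.r1v + r2vshift;  /* IF RNCOND iadd R1V R2VSHIFT->R1V */
    return out;
}
```

Exactly two of the four operations take effect for any value of `rcond`, so the result matches an if/else while keeping the instruction stream straight-line.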
Berkeley VIRAM • N-way rotate
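One way to read "N-way rotate" is the same shift-and-add rotation step applied to N (x, y) pairs at once, one pair per vector element. The scalar C loop below sketches that data-parallel shape. It is not VIRAM code: the function name is hypothetical, and sharing a single direction flag `cond` across all elements is an assumption (a per-element mask would be the alternative).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Apply one rotation step to n independent (x[k], y[k]) pairs.
 * Iterations do not depend on each other, so the loop body corresponds
 * to a handful of vector shift/add operations over n elements.
 * Relies on >> being an arithmetic shift for negative values. */
void rotate_step_nway(int32_t *x, int32_t *y, size_t n, int shift, int cond)
{
    assert(shift >= 0 && shift < 31);
    const int32_t sign = cond ? 1 : -1;   /* shared rotation direction */

    for (size_t k = 0; k < n; k++) {
        int32_t xshift = x[k] >> shift;   /* read both old values first */
        int32_t yshift = y[k] >> shift;
        x[k] -= sign * yshift;
        y[k] += sign * xshift;
    }
}
```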