
Para-CORDIC: Parallel CORDIC Rotation Algorithm and Architecture

Explore the innovative Para-CORDIC system parallelizing CORDIC rotation for optimized performance in various applications. Learn about basic concepts, proposed methods, bottleneck solutions, comparisons, and real-world applications.


Presentation Transcript


  1. Para-CORDIC: Parallel CORDIC Rotation Algorithm and Architecture (IEEE T-CAS I, Vol. 51, No. 8, pp. 1515-1524, Aug. 2004) Tso-Bing Juang, Ph.D VLSI Design LAB, Dept. CSE, NSYSU tsobing@cse.nsysu.edu.tw

  2. My Research – Computer Arithmetic • Applications of arithmetic components • DSP (Digital Signal Processing) • 3-D graphics • Computer communications, etc. • Topics of arithmetic [Ercegovac 2004]: • Addition/Subtraction • Multiplication/Division • Floating-point operations • CORDIC (COordinate Rotation DIgital Computer)

  3. My Publications (1999-2005)

  4. Academic Honors • Best thesis award, Xerox Co. Ltd, 1995 • Attended the Midwest Symposium on Circuits and Systems (MWSCAS), supported by NSC, 1999 • First prize (FPGA), National Intellectual Property Contest, 2000 • First prize, Full Custom Design Contest, 2001 • Attended the Asia-Pacific Conference on Circuits and Systems (APCCAS), supported by MOE, 2002 • Marquis Who's Who in Science and Engineering, 2005-2006 edition, 2005 • Marquis Who's Who in the World, 2006

  5. Outline • Basic Concept of CORDIC • Bottleneck of CORDIC Rotation • Proposed Methods • Previous Methods • Comparisons • Applications • Conclusions

  6. 1. Basic Concept of CORDIC

  7. What is CORDIC? • CORDIC (COordinate Rotation DIgital Computer) • Rotate vector (1,0) by φ to get (cos φ, sin φ) • Can evaluate many arithmetic functions • Rotation realized by shift-add operations • Convergence method (iterative) • About n iterations for n-bit accuracy

  8. Conventional CORDIC Rotation • In each iteration, x and y undergo one micro-rotation whose direction is set by the sign of z

  9. CORDIC Functions

  10. Pre-computation of tan(α_i) • Find α_i such that tan(α_i) = 2^(-i) (or, α_i = tan^(-1)(2^(-i))) • Any angle φ can be written as φ = ±α_0 ± α_1 ± … ± α_n as long as -99.7° ≤ φ ≤ 99.7° (which covers -90°..90°)
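The pre-computed angle table can be sketched in a few lines of Python. This is an illustrative floating-point sketch, not the hardware table from the slides; the function names `micro_rotation_angles` and `convergence_range_deg` are made up for this example. Summing the angles gives the largest angle reachable with all-positive signs, which comes out slightly under 100° and is in line with the roughly 99.7° bound quoted above.

```python
import math

def micro_rotation_angles(n):
    """Return [alpha_0, ..., alpha_n] in radians, where tan(alpha_i) = 2^-i."""
    return [math.atan(2.0 ** -i) for i in range(n + 1)]

def convergence_range_deg(n):
    """Largest reachable angle in degrees: the sum of all alpha_i (all signs +1)."""
    return math.degrees(sum(micro_rotation_angles(n)))

if __name__ == "__main__":
    # Print the first few table entries in degrees.
    for i, alpha in enumerate(micro_rotation_angles(5)):
        print(f"alpha_{i} = {math.degrees(alpha):.4f} deg")
    print(f"reachable range for n = 24: +/- {convergence_range_deg(24):.2f} deg")
```

In hardware these values would be stored as fixed-point constants; floating point is used here only to keep the sketch short.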

  11. Conventional CORDIC Rotation • Algorithm (z is the current angle) • "At each step, try to make z approach zero" • Initialize x_0 = K = 0.607253, y_0 = 0, z_0 = θ • For i = 0 to n • σ_i = 1 when z_i ≥ 0, else -1 [i.e., σ_i = sign(z_i)] • x_{i+1} = x_i - σ_i 2^(-i) y_i • y_{i+1} = y_i + σ_i 2^(-i) x_i • z_{i+1} = z_i - σ_i α_i • End For • Result: x_{n+1} = cos(θ), y_{n+1} = sin(θ) • Precision: n bits
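The iteration on this slide can be written as a minimal floating-point sketch in Python. It assumes the angle is given in radians and lies within the convergence range; the names `cordic_gain` and `cordic_rotate` are illustrative, and the multiplications by 2^(-i) stand in for the shifts a hardware datapath would use.

```python
import math

def cordic_gain(n):
    """Scaling constant K = product over i = 0..n of 1 / sqrt(1 + 2^(-2i))."""
    k = 1.0
    for i in range(n + 1):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

def cordic_rotate(theta, n=24):
    """Approximate (cos(theta), sin(theta)) with micro-rotations i = 0..n."""
    x, y, z = cordic_gain(n), 0.0, theta    # x_0 = K, y_0 = 0, z_0 = theta
    for i in range(n + 1):
        sigma = 1 if z >= 0 else -1         # rotation direction from the sign of z
        shift = 2.0 ** -i                   # a right shift by i bits in hardware
        x, y = x - sigma * shift * y, y + sigma * shift * x
        z -= sigma * math.atan(shift)       # subtract the pre-computed alpha_i
    return x, y

if __name__ == "__main__":
    c, s = cordic_rotate(math.radians(30.0))
    print(c, s)
```

Note that each pass through the loop needs the sign of the z value produced by the previous pass, so the directions σ_i are found one after another.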

  12. Example (z0==30=0.1000001102)

  13. CORDIC Hardware

  14. Three Important Factors of CORDIC • Large additions/subtractions • Scaling factor (constant vs. non-constant) • Sequential execution

  15. Research Topics about CORDIC • Redundant CORDIC architecture • Error analysis of CORDIC • Application of CORDIC architectures • CORDIC algorithm with non-constant scaling factors • Parallel CORDIC architecture

  16. 2. Bottleneck of CORDIC Rotation

  17. Conventional CORDIC Rotation (Revisited) • Sequential determination of σ_i based on z_i

  18. Sequential CORDIC Rotation Architecture • The actual speed bottleneck lies in the sequential determination of the value of σ_i

  19. 3. Proposed Methods

  20. How to parallelize? • Using each bit of input angle to determine σ_i • Remove the bottleneck (B: bit accuracy) • In the first m-1 iterations → sequential • In other iterations → parallel
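The slide does not say how m is chosen. One standard rationale, given here as an assumption rather than as the paper's definition, is the small-angle approximation: once tan^(-1)(2^(-i)) differs from 2^(-i) by less than 2^(-B), the micro-rotation angle is indistinguishable from a single bit weight of the input angle, so the directions of the later iterations can be read from the angle bits. The hypothetical helper below finds the first such index in floating point.

```python
import math

def first_parallel_iteration(B):
    """Smallest i such that |atan(2^-i) - 2^-i| < 2^-B.

    Uses double-precision floats, so it is only meaningful for moderate B
    (well below the 53-bit significand of a double).
    """
    i = 0
    while abs(math.atan(2.0 ** -i) - 2.0 ** -i) >= 2.0 ** -B:
        i += 1
    return i

if __name__ == "__main__":
    print(first_parallel_iteration(24))  # prints 8
```

Since atan(x) is approximately x - x^3/3 for small x, the threshold sits at roughly B/3, which is why only about the first third of the iterations remains sequential under this assumption.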

  21. Our Proposed Techniques (for example, B=24) • MAR (Micro-rotation to Angle Recoding) • Obtain the combinations of tan^(-1) terms for each 2^(-i), i = 1 to m-1 • BBR (Binary to Bipolar Recoding) • Obtain the polarity {-1,+1} of each binary {1,0} weight of input angle → hardware free

  22. Example (B=24) • Phase 1 • Three extra micro-rotation stages are required • Phase 2
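The binary-to-bipolar idea can be checked with exact arithmetic. The sketch below uses the identity b = (r + 1)/2 with r = 2b - 1, which turns a B-bit fraction Σ b_i 2^(-i) into a constant offset (1/2 - 2^(-(B+1))) plus Σ r_i 2^(-(i+1)) with every r_i in {-1,+1}. The indexing and function names are assumptions made for this example and may differ from the paper's formulation.

```python
from fractions import Fraction

def binary_to_bipolar(bits):
    """Recode fraction bits b_1..b_B (MSB first) into digits r_i = 2*b_i - 1.

    Returns (r, offset) such that
        sum(b_i * 2^-i) == offset + sum(r_i * 2^-(i+1)).
    """
    B = len(bits)
    r = [2 * b - 1 for b in bits]                 # 1 -> +1, 0 -> -1
    offset = Fraction(1, 2) - Fraction(1, 2 ** (B + 1))
    return r, offset

def value_binary(bits):
    """Exact value of the binary fraction 0.b_1 b_2 ... b_B."""
    return sum(Fraction(b, 2 ** (k + 1)) for k, b in enumerate(bits))

def value_bipolar(r, offset):
    """Exact value of the recoded form: offset plus half-weight signed digits."""
    return offset + sum(Fraction(d, 2 ** (k + 2)) for k, d in enumerate(r))

if __name__ == "__main__":
    bits = [1, 0, 0, 0, 0, 0, 1, 1, 0]
    r, offset = binary_to_bipolar(bits)
    print(r, offset, value_binary(bits) == value_bipolar(r, offset))
```

Because each r_i is just a reinterpretation of the corresponding bit, the recoding itself needs no logic, which matches the "hardware free" remark on the slide.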

  23. Architecture of a 24-b CORDIC –based SIN/COS Generator

  24. Algorithm of MAR

  25. Our MAR Results

  26. Our MAR Results

  27. Para-CORDIC Architecture -1/2

  28. Para-CORDIC Architecture -2/2 [figure labels: σ_1, S(1), S(5), S(8), R(1), R(i)]

  29. Carry-save Adder-Based Realization for Micro-Rotation Stages • A 4:2 compressor is exploited to produce the carry-save form (a sum and a carry)
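The carry-save form can be illustrated with a bit-level model on Python integers. This is a behavioural sketch for non-negative operands, not the gate-level 4:2 compressor cell used in the design; here the 4:2 compressor is modelled as two chained 3:2 compressors, and the function names are illustrative.

```python
def compress_3_2(a, b, c):
    """3:2 compressor (full adders in parallel): returns (sum, carry) with
    sum + carry == a + b + c, without propagating carries across bit positions."""
    s = a ^ b ^ c                                   # per-bit sum
    carry = ((a & b) | (a & c) | (b & c)) << 1      # per-bit majority, weight 2
    return s, carry

def compress_4_2(a, b, c, d):
    """4:2 compressor modelled as two chained 3:2 compressors."""
    s1, c1 = compress_3_2(a, b, c)
    return compress_3_2(s1, c1, d)

if __name__ == "__main__":
    s, c = compress_4_2(13, 7, 22, 9)
    print(s, c, s + c)   # s + c equals 13 + 7 + 22 + 9 = 51
```

Only the final conversion from (sum, carry) to a single number needs a carry-propagate adder, which is why keeping intermediate results in carry-save form shortens the delay of each stage.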

  30. Evaluation of the Z Datapath • Delay is: • Area is:

  31. The delay of Z Datapath

  32. Merged Rotations of the Second Half Iterations • Delay savings

  33. 4. Previous Methods

  34. Comments on Previously Proposed CORDIC Rotation - 1/4 • [Wang 1997]: IEEE T-Computers • The first m-1 iterations are sequential • Area saving

  35. Comments on Previously Proposed CORDIC Rotation - 2/4 • [Phatak 1998]: IEEE T-Computers • Double hardware to perform clockwise/counterclockwise rotations • Area cost is high (signed-digit realization of X/Y/Z iterations)

  36. Comments on Previously Proposed CORDIC Rotation - 3/4 • [Kwak 2000]: Proc. MWSCAS • Complicated logic circuits to generate the first m-1 rotation directions

  37. Comments on Previously Proposed CORDIC Rotation - 4/4 • [Kuhlmann 2002]: EURASIP • Using ROM to generate the first m-1 directions

  38. Our Proposed Para-CORDIC • The delay and the area costs of Para-CORDIC are: and

  39. 5. Comparisons

  40. Latency Comparisons

  41. Area Comparisons

  42. 6. Applications

  43. ROM-based Implementations for sine/cosine generation • When x_1 and y_1 are constant (x_1 = K, y_1 = 0, x_{B+1} = cos(θ), y_{B+1} = sin(θ)) • Can reduce the extra micro-rotation stages
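A rough way to see why a constant starting vector makes a ROM attractive: the outcome of the leading rotations then depends only on the leading angle bits, so it can be tabulated. The sketch below is not the paper's ROM scheme. It is a simplified floating-point illustration under stated assumptions: the angle θ is in radians in [0, 1), the leading k fraction bits index a hypothetical table of pre-scaled (cos, sin) pairs, and conventional micro-rotations i = k..n finish the job.

```python
import math

def build_rom(k, n):
    """Table of 2^k pre-scaled (cos, sin) pairs for the coarse angles j * 2^-k.

    Entries are pre-multiplied by the gain of the remaining micro-rotations
    i = k..n, so no separate scaling step is needed afterwards.
    """
    gain = 1.0
    for i in range(k, n + 1):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    step = 2.0 ** -k
    return [(gain * math.cos(j * step), gain * math.sin(j * step))
            for j in range(2 ** k)]

def rom_cordic(theta, k, n, rom):
    """Approximate (cos(theta), sin(theta)) for 0 <= theta < 1 (radians)."""
    j = int(theta * 2 ** k)            # leading k fraction bits of the angle
    x, y = rom[j]                      # coarse rotation fetched from the table
    z = theta - j * 2.0 ** -k          # residual angle, 0 <= z < 2^-k
    for i in range(k, n + 1):          # only the remaining micro-rotations
        sigma = 1 if z >= 0 else -1
        shift = 2.0 ** -i
        x, y = x - sigma * shift * y, y + sigma * shift * x
        z -= sigma * math.atan(shift)
    return x, y

if __name__ == "__main__":
    rom = build_rom(4, 24)
    print(rom_cordic(math.radians(30.0), 4, 24, rom))
```

The trade-off is table size against the number of rotation stages: each additional address bit doubles the ROM but removes one leading micro-rotation.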

  44. Optimal Number of ROM Entries

  45. Optimal Number of ROM Entries

  46. 7. Conclusions

  47. Summary • Parallel CORDIC rotation (Para-CORDIC) • Improves on the original sequential execution of CORDIC rotation • Complete proof of the proposed theorems • Submission information • 2003/7/11 submitted • 2004/4/21 fully accepted • 2004/8 published • Better latency/area

  48. Future Work • Physical implementation of Para-CORDIC • Dealing with negative numbers when performing carry-save addition • Floating-point representation of data • Reduced micro-rotation stages in MAR • Parallel CORDIC Vectoring Methods • Must deal with two concurrent variables

  49. Low-Error Fixed-Width Carry-Free Multipliers Design (to appear in IEEE T-CAS II, 2005)

  50. Definition • An n × n fixed-width multiplier • Has n most significant product bits • Needs a small compensation circuit to generate error compensation value (ECV) • ECV • Constant • Fixed • Simple implementation, large errors • Adaptive • Variable • Complex implementation, lower errors
