
Generating Fast Multipliers using Clever Circuits

This talk by Mary Sheeran (Chalmers University of Technology) presents her paper on generating fast multipliers using clever circuits. It uses a functional language to describe hardware circuits, a style that emphasises connection patterns and lets users write circuit generators. Clever circuits control the presence or absence of components and the shape of the circuit wiring. The talk covers the structure of the multiplier, in particular the reduction tree, and shows how clever circuits adapt the wiring to gate and wire delays. It closes with results, questions about formal verification, and future work.


Presentation Transcript


  1. Generating Fast Multipliers using Clever Circuits. Mary Sheeran, Chalmers University of Technology. Research funded by SRC in an Intel-custom project, and by Vetenskapsrådet.

  2. Using a functional language to describe hardware: gives a style of circuit description and analysis; emphasises connection patterns; the user writes circuit generators.

  3. Interleave: ilv f = unriffle ->- two f ->- riffle (diagram: two copies of f)

  4. Butterfly (diagram: two copies of bfly (n-1) circ)

  5. Defining Butterfly
       bfly 0 circ = id
       bfly n circ = ilvN (n-1) circ ->- two (bfly (n-1) circ)
     Here two (bfly (n-1) circ) is two copies of the smaller butterfly circuit.
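The butterfly definition above can be tried out in plain Haskell by modelling circuits as functions on lists. This is only a sketch under assumptions: the list-based versions of `two`, `riffle`, `unriffle` and `ilvN`, and the two-input comparator `sort2`, are illustrative stand-ins written for this note, not the definitions used in the talk's own library.

```haskell
-- Plain-Haskell model: a "circuit" is a function from a list of inputs
-- to a list of outputs. All helper definitions here are assumptions.
infixr 5 ->-

-- Serial composition: feed the outputs of f into g.
(->-) :: (a -> b) -> (b -> c) -> a -> c
f ->- g = g . f

halveList :: [a] -> ([a], [a])
halveList xs = splitAt (length xs `div` 2) xs

-- Apply f separately to the first and second half of the inputs.
two :: ([a] -> [b]) -> [a] -> [b]
two f xs = f l ++ f r
  where (l, r) = halveList xs

-- Even-indexed elements first, then odd-indexed elements.
unriffle :: [a] -> [a]
unriffle xs = evens xs ++ odds xs
  where
    evens (y:_:ys) = y : evens ys
    evens ys       = ys
    odds (_:y:ys)  = y : odds ys
    odds _         = []

-- Inverse of unriffle on even-length lists: interleave the two halves.
riffle :: [a] -> [a]
riffle xs = interleave l r
  where
    (l, r) = halveList xs
    interleave (a:as') (b:bs') = a : b : interleave as' bs'
    interleave as' []          = as'
    interleave [] bs'          = bs'

-- Interleave two copies of f, as on slide 3.
ilv :: ([a] -> [b]) -> [a] -> [b]
ilv f = unriffle ->- two f ->- riffle

-- n nested interleavings.
ilvN :: Int -> ([a] -> [a]) -> [a] -> [a]
ilvN 0 f = f
ilvN n f = ilv (ilvN (n - 1) f)

-- The butterfly from slide 5.
bfly :: Int -> ([a] -> [a]) -> [a] -> [a]
bfly 0 _    = id
bfly n circ = ilvN (n - 1) circ ->- two (bfly (n - 1) circ)

-- Hypothetical two-input comparator used as the butterfly component.
sort2 :: Ord a => [a] -> [a]
sort2 [a, b] = [min a b, max a b]
sort2 xs     = xs
```

With `sort2` as the component, `bfly 3 sort2` behaves as a bitonic merger on eight inputs: for example, it maps `[1,3,5,7,8,6,4,2]` to `[1,2,3,4,5,6,7,8]`.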

  6. Butterfly Layout on an FPGA

  7. BUT: high-performance data paths are in reality NOT regular! They start out regular and become less so as design proceeds, ending with analogue design of each instance of each cell! “It’s all in the wires”

  8. Shadow Values: info. about what is bigger/smaller, updated by components (dynamic). Only necessary sub-sorters included. (Diagram labels: gen. bfly, bafter, 98 comparators.)

  9. Clever Circuits decide what component to be, based on shadow values produced when a particular component is used. Try it and see during generation. (Diagram labels: in1, a1.)

  10. Clever circuits give control over: presence or absence of components (Charme03); shape of circuit wiring (this paper); circuit topology (next paper).

  11. Multiplication: 11010 × 01001. Partial-product rows: 11010, 00000, 00000, 11010, 00000. Product: 0011101010.
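The worked example on this slide (26 × 9 = 234 in decimal) can be reproduced with a few lines of Haskell. This is a minimal sketch: the names `Bits`, `toInt`, `partialRows` and `sumRows` are made up for illustration, and bits are written msb first as on the slide.

```haskell
-- Bits are 0/1 integers, most significant bit first (as on the slide).
type Bits = [Int]

-- Value of an msb-first bit list.
toInt :: Bits -> Int
toInt = foldl (\acc b -> 2 * acc + b) 0

-- One partial-product row per multiplier bit, taking the multiplier
-- bits lsb first, so that row i has weight 2^i.
partialRows :: Bits -> Bits -> [Bits]
partialRows as bs = [ map (* b) as | b <- reverse bs ]

-- Add up the rows, shifting row i left by i places.
sumRows :: [Bits] -> Int
sumRows rows = sum [ toInt r * 2 ^ i | (i, r) <- zip [0 :: Int ..] rows ]
```

For the slide's operands, `partialRows [1,1,0,1,0] [0,1,0,0,1]` gives the rows 11010, 00000, 00000, 11010, 00000, and `sumRows` of those rows is 234, which is also `toInt [0,0,1,1,1,0,1,0,1,0]`.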

  12. Multiplication (array labelled msb). Rows: 11010, 00000, 00000, 11010, 00000.

  13. Multiplication (array labelled lsb). Rows: 01011, 00000, 00000, 01011, 00000.

  14. Structure of multiplier

  15. multBin comps (as,bs) = p1:ss
        where
          ([p1]:[p2,p3]:ps) = prods_by_weight (as,bs)
          is = redArray comps ps
          ss = binaryAdder ([p2,p3]:is)

      redArray comps ps = is
        where
          (is,[]) = row (compress comps) ([],ps)

  16. Reduction tree for multiplier (diagram labels: 5, 4, 4, 3, 3, carries, 2, Fast Adder)

  17. Will concentrate on the reduction tree (a row of compress cells). Assume partial products already generated (e.g. using and gates). May also include recoding to reduce size of tree (cf. Booth).
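To make "partial products already generated using and gates" concrete, here is a small sketch of grouping the and-gate outputs by weight, so that column w holds every product a_i AND b_j with i + j = w. The name `prodsByWeight` echoes `prods_by_weight` from slide 15, but this body is an assumption written for illustration (operands are lsb first), not the author's definition.

```haskell
-- Operands are lists of bits, least significant bit first.
-- Column w collects every partial product a_i && b_j with i + j == w.
prodsByWeight :: [Bool] -> [Bool] -> [[Bool]]
prodsByWeight as bs =
  [ [ a && b
    | (i, a) <- zip [0 ..] as
    , (j, b) <- zip [0 ..] bs
    , i + j == w ]
  | w <- [0 .. length as + length bs - 2] ]

-- Number of bits in each column: the shape the reduction tree must compress.
columnHeights :: [[Bool]] -> [Int]
columnHeights = map length

-- Integer value represented by the columns (each True bit in column w
-- contributes 2^w).
columnValue :: [[Bool]] -> Int
columnValue cols =
  sum [ 2 ^ w * length (filter id col) | (w, col) <- zip [0 :: Int ..] cols ]
```

For two 5-bit operands the column heights come out as 1, 2, 3, 4, 5, 4, 3, 2, 1. The first two columns have one and two entries, which is the shape the pattern `([p1]:[p2,p3]:ps)` on slide 15 expects.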

  18. Compress (diff = 2): f-cell (diagram labels: n, n-2, 2)

  19. Compress with diff > 2 and diff < 2: wcell and hcell (diagram labels: k, k, k+2, k-1)

  20. compress comps (as,bs)
        | (diff > 2)  = (compress comps |- hcell comps) (as,bs)
        | (diff == 2) = column (fcell comps) (as,bs)
        | (diff < 2)  = (compress comps -| wcell comps) (as,bs)
        where diff = length bs - length as

  21. A possible fcell: fullAdd (diagram labels: c, s). halfAdd cells are similar. Gives the standard array multiplier. Not great!

  22. fullAdd: only need to vary the wiring! Make it explicit. (Diagram labels: c, s, iC, s3, cc, iS.)

  23. Dadda-like: fullAdd (diagram labels: c, s) with toEnd (a,as) = as++[a]. Excellent log-depth reduction tree, but known for irregularity and difficult layout.

  24. Picture by Henrik Eriksson, Chalmers

  25. Delay model for half adder
        halfAddI (as, bs, ac, bc) [a,b] = [s,cout]
          where
            s    = max (as+a) (bs+b)
            cout = max (ac+a) (bc+b)
      as is the delay between the a input and the sum output, etc.
        hI as = halfAddI (10,10,5,5) as
        fI as = fullAddI (20,20,10,10,10,10) as
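The delay model above can be run directly as ordinary Haskell. In the sketch below, `halfAddI` follows the slide. The slide does not show `fullAddI`, so its body and the meaning of its six parameters (input-to-sum delays for a, b, c, then input-to-carry delays for a, b, c) are assumptions made for illustration only.

```haskell
-- Delays are plain integers in arbitrary units.
type Delay = Int

-- Half-adder delay model, as on the slide: each output becomes available
-- at the latest arrival time over its inputs, plus the input-to-output delay.
halfAddI :: (Delay, Delay, Delay, Delay) -> [Delay] -> [Delay]
halfAddI (as, bs, ac, bc) [a, b] = [s, cout]
  where
    s    = max (as + a) (bs + b)
    cout = max (ac + a) (bc + b)
halfAddI _ _ = error "halfAddI: expected exactly two inputs"

-- ASSUMPTION: full-adder model in the same style. The parameter order
-- (a->s, b->s, c->s, a->cout, b->cout, c->cout) is a guess, not from the talk.
fullAddI :: (Delay, Delay, Delay, Delay, Delay, Delay) -> [Delay] -> [Delay]
fullAddI (as, bs, cs, ac, bc, cc) [a, b, c] = [s, cout]
  where
    s    = maximum [as + a, bs + b, cs + c]
    cout = maximum [ac + a, bc + b, cc + c]
fullAddI _ _ = error "fullAddI: expected exactly three inputs"

-- The concrete cells from the slide.
hI, fI :: [Delay] -> [Delay]
hI = halfAddI (10, 10, 5, 5)
fI = fullAddI (20, 20, 10, 10, 10, 10)
```

With all inputs arriving at time 0, `hI [0,0]` yields `[10,5]`: the sum is ready after 10 units and the carry after 5. Under the assumed parameter order, a late third input dominates both outputs of the full adder, e.g. `fI [0,0,15]` yields `[25,25]`.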

  26. Checking gate delay
        dDadG n = simulate (redArray (hI,fI,toEnd,toEnd,id,sep2,sep3)) (ppzs n)
      Slide annotations: comps, tuple of building blocks; gate delay models; wiring cells (allow inclusion of wiring delay).
        Main> dDadG 16
        [[0,10],[5,20],[20,30],[30,40],[40,50],[50,50],[50,60],[60,70],[70,70],
         [70,70],[70,80],[70,80],[80,90],[90,90],[90,90],[90,90],[90,90],[90,90],
         [80,90],[80,80],[70,80],[70,80],[70,70],[60,70],[60,60],[50,60],[50,50],
         [40,20],[0,20]]

  27. Promising, but we can do better! Choose what wiring cells to use dynamically, during circuit generation, rather than in advance. Base the choice on the delay behaviour of both wires and components.

  28. cleverInsert (diagram: fullAdd with cleverInsert wiring cells; labels s3, c, cc, s). Idea: harden the wiring during circuit generation using clever circuits. Shadow values estimate delay through wires and cells.

  29. cswap ((a,x),(b,y)) = if (x>y) then ((b,y),(a,x)) else ((a,x),(b,y))

  30. cleverInsert = row cswap ->- apr
      Forms the necessary wiring based on context (delays on shadow wires).
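One way to read the cleverInsert definition is as a single insertion step of insertion sort, keyed on the shadow delay carried alongside each wire. The sketch below assumes list-based definitions of `row` and `apr` written for this note, and uses `String` wire names and `Int` delays purely as illustrative values.

```haskell
infixr 5 ->-

-- Serial composition, as used on the slides.
(->-) :: (a -> b) -> (b -> c) -> a -> c
f ->- g = g . f

-- Conditional swap from slide 29: each wire is paired with a shadow
-- delay; the pair with the smaller delay comes out first.
cswap :: Ord d => ((a, d), (a, d)) -> ((a, d), (a, d))
cswap ((a, x), (b, y)) =
  if x > y then ((b, y), (a, x)) else ((a, x), (b, y))

-- ASSUMPTION: a row threads a carry value through a chain of cells,
-- left to right; each cell maps (carry in, input) to (output, carry out).
row :: ((c, a) -> (b, c)) -> (c, [a]) -> ([b], c)
row _ (c, [])     = ([], c)
row f (c, a : as) = (b : bs, c2)
  where
    (b, c1)  = f (c, a)
    (bs, c2) = row f (c1, as)

-- ASSUMPTION: apr appends the final carry on the right of the outputs.
apr :: ([a], a) -> [a]
apr (xs, x) = xs ++ [x]

-- Insert one wire into a list of wires already ordered by delay.
cleverInsert :: Ord d => ((a, d), [(a, d)]) -> [(a, d)]
cleverInsert = row cswap ->- apr
```

Under these assumptions, `cleverInsert (("new", 25), [("a", 10), ("b", 20), ("c", 30)])` places the new wire between "b" and "c". The fixed wiring cell `toEnd`, by contrast, always puts it last regardless of delay.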

  31. Structure of circuit generator remains unchanged
        adapt (hAdd, fAdd, cc) (d,pds)
          = mmark pds ->-
            redArray (hAdd // hIB, fAdd // fIB, cInsert, cInsert, cc // cross d, sep2, sep3)
            ->- unmark
      (Slide annotations: Haskell level, circuit level.)

  32. Result (multiplication): Simple parameterised description of a fast adaptive multiplier. Like the Three Dimensional Method (TDM), except that wire length, and not only gate delay, is taken into account in choosing which connections to make. Promises to perform well (better than modified Dadda and TDM).

  33. Result (multiplication): Adaptation to the incoming delay profile can be arranged (clever circuits again). Can also easily adapt the description to take account of limitations on cross-cell tracks (see paper). Much remains to be done (e.g. insertion of buffers, fine delay modelling, transistor sizing, other layouts, the rest of the multiplier...).

  34. Result (general): Non-standard interpretation used after generation (as we have long done) and now also to guide synthesis. Circuit generators are short and sweet and LOOK LIKE circuit descriptions. High degree of parameterisation. Application areas? Module generation for full custom / SoC / FPGA. Ideas are completely compatible with Intel’s IDV system (see talk by Greg Spirakis at this conference).

  35. Result (general): Clever circuits are a good idiom. Can control choice of components, wiring and topology. Greatly increase the expressive power of the connection patterns approach. Gives a way to allow non-functional properties to influence design (even early on). Vital as we move to deep sub-micron. Separation of concerns is becoming less and less possible.

  36. Formal Verification?? Have verified small-sized versions of multipliers (Bjesse, Synopsys). Should verify generators (see Hunt’s seminal work). Investigating generation of FOL for verification of Haskell programs (Cover project at Chalmers).

  37. What next? Want to go the whole hog and generate layouts for high performance arithmetic circuits from Wired. Need help with the formal verification of generators. And it is time to return to refinement.
