
26 September 2000 James C. Lyke



Presentation Transcript


  1. Cellular Automata Based Reconfigurable Systems as a Transitional Approach to Gigascale Electronic Architectures 26 September 2000 James C. Lyke

  2. Outline • Introduction • Background • Description of reconfigurable cellular automata (RCA) arrays • Summary of current status

  3. Introduction • Constraints common to all molecular systems • Limited interconnection fan-in/fan-out • No effective lithography approach capable of adequate throughput • Design intolerance to random defects

  4. The smaller the devices, the bigger the problems • Building the devices: very hard • Building the architectures: not easy • How do you harness 10^12 simple devices? • Design capture, synthesis, verification? • How do you wire them together? • How do you assemble and package them? • How do you test finished devices? • How do you address yield issues? • How do you rectify design errors in “gigascale” designs?

  5. Trends in architectures • Interconnection growth • Increased use of programmable logic devices in digital design • Field programmable gate array (FPGA) devices

  6. Pad-limiting due to terminal count explosion

  7. Factors contributing to explosion in interconnections with diminishing scale • Non-scalability of resistance (R~L/A) • Packing considerations force minimum average length of interconnections to increase • Dimensionality of design • Hierarchy of design

  8. The challenge of interconnect • Rent’s rule establishes the growth of terminals (interface signals) as a function of gate count [1] • An empirical explanation: T = A·G^p, where T is the terminal count, A the terminals per sub-module, G the gate count, and p Rent’s exponent (0<p<1) • [1] Bakoglu, Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley, Reading MA, 1990
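Rent's rule as stated on this slide can be evaluated directly. The following is a minimal sketch; the function name `rent_terminals` and the sample values of A, G, and p are illustrative assumptions, not figures from the presentation.

```python
def rent_terminals(gates: float, a: float, p: float) -> float:
    """Estimate terminal count T = A * G**p (Rent's rule).

    gates: gate count G
    a:     terminals per sub-module A
    p:     Rent's exponent; the end points 0 and 1 are accepted as limiting cases
    """
    if gates <= 0 or a <= 0:
        raise ValueError("gate count and A must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("Rent's exponent is expected to lie in [0, 1]")
    return a * gates ** p


if __name__ == "__main__":
    # Hypothetical values, chosen only to show how quickly T grows with p.
    for p in (0.5, 0.8):
        print(f"p={p}: T ~ {rent_terminals(1e6, 2.5, p):.0f} terminals")
```

Even with everything else fixed, moving p from 0.5 toward 1 changes the terminal count by orders of magnitude at large gate counts, which is the growth the slide is warning about.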

  9. Complex integrated circuits usually have … [Figure: curves labeled p=0, p=0.5, p=0.8, p=1]

  10. Nanoscale Interconnect • In order to be manageable at large scales of complexity, exponents of Rent’s rule must be consistent with dimensionality • Two-dimensional (planar) systems: p<(1/2) • Three-dimensional systems: p<(2/3) • Rent’s rule is a statistical observation and a guideline, but must be used with care • A complex design may have different Rent’s exponents at different hierarchical levels and regions of design

  11. Field Programmable Gate Arrays (FPGAs)

  12. Requirements for Complex Digital Design • Ability to form arbitrary arrangement of: • Logic • Memory • Interconnect • Field programmable gate arrays (FPGAs) emulate complex systems and allow these arrangements to be programmed

  13. Structure of RAM-based FPGA [Figure: array of configurable logic blocks (CLBs) joined by unspecified interconnection]

  14. Adding USER memory to LUT [Figure: LUT with inputs a, b, c and output f, followed by a D flip-flop (D, Q); annotation: “Short either, but not both”]

  15. Routing in FPGA Devices [Figure labels: pass transistor, memory bit]

  16. Simple example • Design problem: F = A AND B, G = C AND D

  17. Typical FPGA (corner of XC3020) • Many details suppressed • Source: Xilinx datasheet

  18. Binary Cellular Automata: A lattice of computing points • Lattice of uniformly spaced point sites in 1, 2, or 3 dimensions • Each point has a value of {0} or {1} • Value of each site updated at discrete time intervals • Updates are computed as a function of local neighborhood only
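The four properties listed on this slide can be sketched as a one-dimensional binary cellular automaton. This is a minimal sketch under stated assumptions: a neighborhood of three sites, wrap-around edges, and the common 8-bit rule-number encoding. None of these choices is specified by the slide.

```python
def ca_step(cells: list[int], rule: int) -> list[int]:
    """Apply one synchronous update to a 1-D binary cellular automaton.

    cells: site values, each 0 or 1
    rule:  integer 0..255; bit (4*left + 2*centre + right) gives the next value
           (this encoding is an assumption made for the sketch)
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[(i - 1) % n]      # wrap-around boundary (assumption)
        centre = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        nxt.append((rule >> index) & 1)
    return nxt


if __name__ == "__main__":
    row = [0, 0, 0, 1, 0, 0, 0]
    for _ in range(4):                 # discrete time steps
        print("".join(map(str, row)))
        row = ca_step(row, 90)         # rule 90: next value = left XOR right
```

Every new value depends only on the three neighboring sites from the previous step, so the whole array can be updated in parallel.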

  19. Conversion of 1-D CA into a 2-D spatial computation structure

  20. Cellular Automata for Molecular Electronics • Cell behavior normally fixed and homogeneous across entire array • Turing complete • Normally perfect (mathematical abstraction) • Not practical for molecular implementation due to defects • Recover from this by relaxing first assumption • Allow rules at any site to be chosen from a set that is “Boolean complete”

  21. Redefine CA sites as look-up tables (LUTs) • A 3-input LUT (LUT-3) can implement all cellular automata rules of neighborhood 3 [Figure: LUT3 with inputs A, B, C]
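A 3-input LUT holds 8 configuration bits, so it can be set to any of the 2^8 = 256 Boolean functions of three inputs, which are exactly the possible neighborhood-3 rules. The sketch below illustrates this and lets each site carry its own configuration. The bit ordering, the ring topology, and the names `lut3`, `distinct_functions`, and `rca_step` are assumptions made for illustration.

```python
from itertools import product


def lut3(config: int, a: int, b: int, c: int) -> int:
    """Evaluate a 3-input LUT whose 8-bit truth table is packed into `config`.

    Bit (4*a + 2*b + c) of `config` is the output (bit order is an assumption).
    """
    return (config >> ((a << 2) | (b << 1) | c)) & 1


def distinct_functions() -> int:
    """Count the distinct truth tables reachable over all 256 configurations."""
    inputs = list(product((0, 1), repeat=3))
    return len({tuple(lut3(cfg, *x) for x in inputs) for cfg in range(256)})


def rca_step(cells: list[int], configs: list[int]) -> list[int]:
    """Update a ring of cells where site i uses its own LUT configuration."""
    n = len(cells)
    return [
        lut3(configs[i], cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
        for i in range(n)
    ]


if __name__ == "__main__":
    print(distinct_functions())                 # every 3-input function is reachable
    # 0x80 = 3-input AND, 0x96 = 3-input parity, 0x00 = constant 0
    print(rca_step([1, 1, 1, 0], [0x80, 0x96, 0x80, 0x00]))
```

Giving every site the same `config` recovers an ordinary homogeneous cellular automaton. Giving sites different values is the per-site rule selection that a reconfigurable array relies on.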

  22. Boolean functions as CA rules

  23. Reconfigurable Cellular Array (RCA)

  24. Reconfigurable cellular automata: advantages for molecular architecture • Periodic structure • Amenable to chemical self-assembly • Reconfigurability: logical behavior of each cell independently and repeatedly programmable after fabrication • Low interconnection demand • Defect tolerance

  25. Equivalence (functional isomorphisms) between CA and random logic forms [Figure: panels (a) to (e) showing 3LUT cells with inputs A, B, C, D and output F]

  26. Defect tolerance: before and after

  27. How template choice affects implementation of 4-input majority gate on three different RCA templates [Figure: three implementations, labeled size: 21, size: 12, size: 20]

  28. 3LUT tile of single cell type [Figure: array of cells, all labeled A]

  29. 2-LUT system based on two cell types

  30. Example of more complex architectures using RCA tiles • Equivalent representations [Figure labels: combinational logic, register array, x, h, h(t), Y(t), Y(x,h)]

  31. Another RCA Architecture (Example) • Creates feedback path necessary for general clock-mode sequential behavior [Figure: clocked register files (bit arrays) alternating with m x n tiles of LUTs]

  32. Detail of bit array between tiles

  33. Configuration of RCA • LUT memories implemented as shift registers (2-phase clock) • Multiple configuration chains for large devices • Not a fault-tolerant method of bitstream distribution
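The configuration scheme on this slide can be sketched as one long shift register that is later read out as per-cell LUT memories. This is a simplified model: the 2-phase clock is not represented, and the function name `load_chain`, the `stuck_stage` parameter, and the stuck-at-0 defect model are assumptions introduced for illustration.

```python
from typing import Optional


def load_chain(bitstream: list[int], n_cells: int, bits_per_lut: int = 8,
               stuck_stage: Optional[int] = None) -> list[list[int]]:
    """Shift `bitstream` serially into a chain of LUT memories.

    One bit enters at stage 0 per clock and every stored bit moves one stage
    down the chain. `stuck_stage`, if given, models a hypothetical stage that
    is stuck at 0. Returns the stored bits grouped per cell.
    """
    total = n_cells * bits_per_lut
    if len(bitstream) != total:
        raise ValueError("bitstream length must equal n_cells * bits_per_lut")
    reg = [0] * total
    for bit in bitstream:
        reg = [bit] + reg[:-1]          # one shift per clock
        if stuck_stage is not None:
            reg[stuck_stage] = 0        # the defective stage always holds 0
    return [reg[i * bits_per_lut:(i + 1) * bits_per_lut] for i in range(n_cells)]


if __name__ == "__main__":
    stream = [1, 1, 1, 1]
    print(load_chain(stream, n_cells=2, bits_per_lut=2))                 # healthy chain
    print(load_chain(stream, n_cells=2, bits_per_lut=2, stuck_stage=1))  # one bad stage
```

In the second call a single defective stage blanks every bit downstream of it, which is one way to see why a serial chain is not a fault-tolerant way to distribute the bitstream. Splitting the load across several shorter chains limits the damage but does not remove it.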

  34. A computation result has a limited range of propagation • A side effect: “cones of influence” • Dead zones • Inputs: cannot be reached • Outputs: no results can be used
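The limited propagation range can be made concrete for a nearest-neighbor array. The sketch below assumes a planar template in which each cell reads the three cells directly behind it (one column left, same column, one column right) and the array edges do not wrap. The slide does not fix a template, so treat this geometry and the function names as hypothetical.

```python
def cone_of_influence(input_col: int, depth: int, width: int) -> range:
    """Columns that a signal entering at `input_col` can reach after `depth` rows.

    Assumes each row advances a signal by at most one column left or right
    and that the array edges do not wrap.
    """
    if depth < 0 or not 0 <= input_col < width:
        raise ValueError("depth must be >= 0 and input_col inside the array")
    lo = max(0, input_col - depth)
    hi = min(width - 1, input_col + depth)
    return range(lo, hi + 1)


def can_interact(col_a: int, col_b: int, depth: int, width: int) -> bool:
    """True if some cell at `depth` lies in the cones of both inputs."""
    a = cone_of_influence(col_a, depth, width)
    b = cone_of_influence(col_b, depth, width)
    return max(a.start, b.start) < min(a.stop, b.stop)


if __name__ == "__main__":
    print(list(cone_of_influence(0, 2, 10)))   # columns reachable from column 0
    print(can_interact(0, 9, 2, 10))           # inputs too far apart to combine
```

Cells outside every input's cone are dead zones in the slide's sense: no input can reach them, so no useful result can be produced there.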

  35. Routing heuristics for RCA structures may require simple modifications • (left) netlist example to be routed • (center) results (incorrect) from typical FPGA routing tool (error node in red) • (right) corrected results for RCA • Requires definition of a node resolution function (node in yellow)

  36. Training neural nets to design circuits: Results of experimental neural-net based design tools, demonstrating combined “heuristics” (simultaneous technology mapping, placement, and routing)

  37. An n-input look-up table is adequately modeled by a perceptron network with n neurons in its hidden layer • Based on analysis of Vapnik-Chervonenkis dimension • Proved with brute-force simulation for the n = 3 case

  38. 3LUT tile and neural network model

  39. How neural nets are trained to design circuits • Abstract a neural net model • Use truth table as the training set (= test set) • Build back-propagation system around tile to train (adjust weights of neurons) • Train / re-train with randomized version of training set until convergence occurs (if it occurs)
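The training loop described above can be sketched for the smallest possible case: a single LUT modeled as a perceptron network with n hidden neurons, trained by back-propagation on its own truth table. This is not the presentation's tool, which trains a whole tile. The sigmoid activation, learning rate, weight initialization, epoch limit, and the name `train_lut_model` are all assumptions, and convergence is not guaranteed.

```python
import math
import random
from itertools import product


def train_lut_model(truth_table, n_inputs=3, lr=0.5, max_epochs=5000, seed=0):
    """Train an n-input, n-hidden-neuron network on a LUT truth table.

    truth_table[i] is the target for input bits (a, b, c) with i = 4a + 2b + c.
    Returns (converged, epochs_used, forward), where forward(x) -> (hidden, output).
    """
    rng = random.Random(seed)
    h = n_inputs                                   # hidden layer size = n
    w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)] for _ in range(h)]
    w2 = [rng.uniform(-1, 1) for _ in range(h + 1)]  # last weight is the bias
    samples = list(zip(product((0, 1), repeat=n_inputs), truth_table))

    def sig(z):
        return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))

    def forward(x):
        hid = [sig(sum(w * v for w, v in zip(row, (*x, 1)))) for row in w1]
        out = sig(sum(w * v for w, v in zip(w2, (*hid, 1))))
        return hid, out

    for epoch in range(1, max_epochs + 1):
        rng.shuffle(samples)                       # randomized training set
        for x, target in samples:
            hid, out = forward(x)
            d_out = (out - target) * out * (1 - out)
            d_hid = [d_out * w2[j] * hid[j] * (1 - hid[j]) for j in range(h)]
            for j, v in enumerate((*hid, 1)):
                w2[j] -= lr * d_out * v            # adjust output weights
            for j in range(h):
                for i, v in enumerate((*x, 1)):
                    w1[j][i] -= lr * d_hid[j] * v  # adjust hidden weights
        # The training set is also the test set: stop once every row is right.
        if all(round(forward(x)[1]) == t for x, t in samples):
            return True, epoch, forward
    return False, max_epochs, forward


if __name__ == "__main__":
    converged, epochs, _ = train_lut_model([0, 0, 0, 0, 0, 0, 0, 1])  # 3-input AND
    print(converged, epochs)
```

Because the truth table is both training set and test set, the stopping condition is simply that every row is reproduced. If that never happens within `max_epochs`, the function reports failure, matching the slide's "if it occurs" caveat.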

  40. Neural network circuit designer [Figure labels: tally, compare, offset]

  41. NN-produced results for 2-bit multiplier • Designs are not optimal, but they work • Could be improved with post-processing to remove nonsense constructs

  42. Comparison of Conventional FPGAs to Reconfigurable Cellular Arrays (RCAs) • Similarities • Both use LUTs • Both are software configured with serial bitstreams • Differences • RCAs have no programmable routing • RCAs support only nearest neighbor connections • RCAs have a much simpler (periodic) structure

  43. Benchmark rationale • The design of FPGA architectures and their ability to express architectures is empirical • Benchmark suites exist (e.g. MCNC, PREP) to permit comparison of FPGA architectures and algorithms • Comparative findings from benchmarking are the best currently known way of establishing a yardstick, given that most steps in CAD are NP-complete and optimality cannot be proved in general

  44. Other issue #1 • Departure from regularity • Small world (0<p<1 fraction of connections dislocated from lattice) • Semi-structured (Most LUTs point in the right direction) • Amorphous / random structure • Challenge: find O(N) algorithms for “structure discovery” • Question: Can we establish statistical evidence that semi-structured / amorphous cellular networks are adequate as media for hosting complex designs?

  45. Other issue #2 • Signal delivery from “outside world” • X-Y signal grid of 100 microns • Molecular (fractal?) distribution network may be required to combat signal starvation at nanoscale network level [Figure labels: 1 cm² chip; 100 µm; ~840 molecular gates; signal terminal from “outside world”]

  46. Summary • Reconfigurable cellular arrays are promising as a molecular-scale architecture • Interconnect, defect tolerance, self-assembly • Templates can be tuned to a specific molecular concept • Even as an abstract approach, some important loose ends need to be dealt with • Configuration bitstream • Hierarchical assembly specifics • Proof that media is competitive with a standard FPGA approach if it could be scaled to molecular levels
