Arguably. The Imminent Practicality of Reversible Computing. Michael P. Frank University of Florida Departments of CISE and ECE mpf@cise.ufl.edu Talk at IBM Research Yorktown Heights, New York August 28, 2003. Abstract.
Arguably The Imminent Practicality of Reversible Computing. Michael P. Frank, University of Florida, Departments of CISE and ECE, mpf@cise.ufl.edu. Talk at IBM Research, Yorktown Heights, New York, August 28, 2003
Abstract • The practicality of reversible computing (RC), even for the long-term, has historically been very controversial. • But, numerous deal-breakers conjectured for RC have already succumbed to the steady march of engineering progress. • The remaining few research-level engineering problems with RC do not appear to be fundamentally insoluble. • E.g., I discuss some simple design concepts for addressing the leakage and power supply problems. • A comprehensive numeric analysis forecasts orders-of-magnitude cost-efficiency benefits from RC growing over the next few decades, starting very soon! • I conclude with a technology R&D plan to achieve this goal.
My Background w.r.t. RC • First heard of RC from K. Eric Drexler’s early nanotech writings & class taught @ Stanford in ’88 • KED had studied issues in reversible nanomechanical logics • Designed one of the first universal DNA computers @ MIT LCS in ’94-’95 • Found that the thermochemistry of our process required the machine design to be reversible! • Ph.D. on adiabatic VLSI and RC theory under Tom Knight & Norm Margolus @ MIT AI/LCS, ’96-99 • Group produced numerous RC-related innovations • Head of “Reversible & Quantum Computing Group” at U. of Fla., Coll. of Eng., CISE/ECE depts., ’99-present • Emphasis on engineering needed to make RC practical
Important Contributions to Date • Found & fixed an adiabaticity bug in SCRL • 1st fully-adiabatic sequential CMOS logic style • First proof of asymptotic speedups from RC • Required novel physically-based models of computing • Architecture / detail design of 1st truly adiabatic CPUs • Primary designer, FlatTop, major contribs. to Pendulum • Simpler, more efficient adiabatic logic/memory styles • 6-tick SCRL, 4-rail 2LAL, simple SRAM/DRAM • Increasingly sophisticated cost-efficiency analyses • Taking algorithmic/adiabatic overheads into account • Design concepts for addressing remaining challenges
Moore’s Law vs. the Fundamental Physical Limits of Computing
ITRS Feature Size Projections (from 1999 roadmap). Figure: feature size in nanometers (log scale, 0.1 to 1000) vs. year of first product shipment (1995 to 2050), with curves for uP chan L, DRAM 1/2 p, min Tox, and max Tox, a “We are here” marker, and reference markers for bacterium, virus, XFET, protein molecule, DNA molecule thickness, and atom.
(½CV² gate energy calculated from ITRS ’99 geometry/voltage data)
Fundamental Physical Limits of Computing. Diagram with three linked columns: • Thoroughly Confirmed Physical Theories: Theory of Relativity, Quantum Theory • Implied Universal Facts: Speed-of-Light Limit, Uncertainty Principle, Definition of Energy, Reversibility, 2nd Law of Thermodynamics, Adiabatic Theorem, Gravity • Affected Quantities in Information Processing: Communications Latency, Information Capacity, Information Bandwidth, Memory Access Times, Processing Rate, Energy Loss per Operation
Landauer’s 1961 principle from basic quantum theory • Before bit erasure: N distinct states s0 … sN−1 with the bit at 0, plus N distinct states s′0 … s′N−1 with the bit at 1 (2N distinct states in all). • Unitary (1-1) evolution. • After bit erasure: 2N distinct states s″0 … s″2N−1, all with the bit at 0. • Increase in entropy: ΔS = log 2 = k ln 2. Energy lost to heat: T ΔS = kT ln 2
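As a numeric illustration of the kT ln 2 figure on this slide, the following minimal sketch evaluates the bound for a given temperature. The function name and the 300 K room-temperature value are assumptions chosen for illustration; they do not appear in the talk.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_heat_joules(temperature_kelvin: float, bits_erased: int = 1) -> float:
    """Minimum heat (J) released when `bits_erased` known bits are erased.

    Each erased bit raises entropy by k ln 2, so the heat is T * k ln 2 per bit.
    """
    return bits_erased * BOLTZMANN_K * temperature_kelvin * math.log(2)


# Assumed operating temperature of 300 K (roughly room temperature).
per_bit = landauer_heat_joules(300.0)
print(f"kT ln 2 at 300 K is about {per_bit:.3e} J per erased bit")
```

At 300 K this comes to roughly 2.87e-21 J per erased bit, which is the scale against which the per-operation energies discussed in the talk should be compared.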
Adiabatic Cost-Efficiency Benefits. Scenario: $1,000/3-years, 100-Watt conventional computer, vs. reversible computers w. same capacity. Figure: bit-operations per US dollar for best-case reversible computing, worst-case reversible computing, and conventional irreversible computing, with annotated gaps of ~100,000× and ~1,000×. All curves would →0 if leakage not reduced.
Quantum Computing • Relies on coherent, global superposition states • Required for speedups of quantum algorithms, but… • Cause difficulties in scaling physical implementations • Invokes externally-modulated Hamiltonian • Low total system energy dissipation is not necessarily guaranteed, if dissipation in control system is included • Known speedups for only a few problems so far… • Cryptanalysis, quantum simulations, unstructured search, a small handful of others. Progress is hard… • QC might not ever have very much impact on the majority of general-purpose computing.
Reversible Computing • Requires only an approximate, local coherence of ‘pointer’ states & direct transitions between them • Ordinary signal-restoration plus classical error correction techniques suffice; fewer scaling problems • Emphasis is on low entropy generation due to quantum evolution that is locally mostly coherent • Requires we also pay attention to dissipation in the timing system, integrate it into the system model. • Benefits nearly all general-purpose computing • Except fully-serial, or very loosely-coupled parallel, when the cost of free energy itself is also negligible.
Some Loss-Inducing Interactions For ordinary voltage-coded electronics: • Interactions whose dissipation scales with speed: • Parasitic EM emission from dynamic (C,L) reactances • Scattering of ballistic electrons from lattice imperfections, causing Ohmic resistance • Interactions having different scaling laws: • Interference from outside EM sources • Thermally-activated leakage of electrons over potential energy barriers • Quantum tunneling of electrons through narrow barriers (sub-Fermi wavelength) • Losses due to intentional treatment of known physical information as entropy (bit erasure) (Slide callout: “Focus of most of the work on adiabatics to date.”)
Some Ways to Reduce Losses • EM interference / emission: Add shielding, use high-Q MEMS/NEMS oscillators • Scattering/resistance: Ballistic FETs, superconductors • Thermal leakage: avoid low VT and/or high temps • Tunneling: thick tunnel barriers, high-κ dielectrics, conductors w. low Fermi-level/high electron affinity, vacuum-gap barriers? • Intentional bit erasure: reduce voltages, use mostly-reversible adiabatic logic designs
Adiabatic Circuits and Reversible Computing Commonly Encountered Myths, Fallacies, and Pitfalls (in the Hennessy-Patterson tradition)
Myths about Adiabatic Circuits & Reversible Computing • “Someone proved that computing with <<kT free-energy loss per bit-operation is impossible.” • “Physics isn’t reversible.” • “An energy-efficient adiabatic clock/power supply is impossible to build.” • “True adiabaticity doesn’t require reversible logic.” • “Sequential logic can’t be done adiabatically.” • “Adiabatic circuits require many clock/power rails and/or voltage levels.” • “Adiabatic design is necessarily difficult.”
Fallacies about Adiabatic Circuits and Reversible Computing • “Since speed scales with energy dissipation in adiabatic circuits, they aren’t good for high-performance computing.” • “If I tried and failed to invent an efficient adiabatic logic, it must be impossible.” • “The algorithmic overheads of reversible computing mean it can never be cost-effective.” • “Since leakage gets worse in nanoscale devices, adiabatics is doomed.”
Pitfalls in Adiabatic Circuits and Reversible Computing • Using diodes in the charge-return path. • Forgetting to obey one of the transistor rules. • Using traditional models of computational complexity. • Restricting oneself to an asymptotically inefficient design style. • Assuming that the best reversible and irreversible algorithms are similar. • Failing to optimize the degree of reversibility of a design. • Ignoring charge leakage in low-power/adiabatic design.
Adiabatic/Reversible Computing Basic Models and Concepts
Bistable Potential-Energy Wells • Consider any system having an adjustable, bistable potential energy surface (PES) in its configuration space. • The two stable states form a natural bit. • One state represents 0, the other 1. • Consider now the P.E. well having two adjustable parameters: • (1) Height of the potential energy barrier relative to the well bottom • (2) Relative height of the left and right states in the well (bias) (Landauer ’61) (Figure: double-well potential with the two states labeled 0 and 1.)
Possible Parameter Settings • We will distinguish six qualitatively different settings of the well parameters, as follows… (Figure: grid of well shapes arranged by barrier height and direction of bias force.)
One Mechanical Implementation (Figure: state knob, barrier wedge, and two springs supplying leftward and rightward bias; two panels show the barrier up and the barrier down.)
Possible Adiabatic Transitions (Ignoring superposition states.) • Catalog of all the possible transitions in these wells, adiabatic & not... (Figure: well configurations arranged by barrier height and direction of bias force, with “0” states, “1” states, the state labeled N, and leak transitions marked.)
Ordinary Irreversible Logics • Principle of operation: Lower a barrier, or not, based on input. Series/parallel combinations of barriers do logic. Major dissipation in at least one of the possible transitions. • Amplifies input signals. • Example: Ordinary CMOS logics (Figure: input changes, barrier lowered; output irreversibly changed to 0.)
Ordinary Irreversible Memory • Lower a barrier, dissipating stored information. Apply an input bias. Raise the barrier to latch the new information into place. Remove input bias. • Example: ordinary DRAM (Figure: transitions among the 0, 1, and N well states, with steps labeled (1) to (3) and arrows marked “Input 0”, “Input 1”, “Barrier up”, and “Retract input”. Callout: “Dissipation here can be made as low as kT ln 2.”)
Input-Bias Clocked-Barrier Logic • Cycle of operation: • (1) Data input applies bias • Add forces to do logic • (2) Clock signal raises barrier • (3) Data input bias removed • (4) Can reset latch reversibly given copy of contents. • Can amplify/restore input signal in the barrier-raising step. • Examples: Adiabatic QDCA, SCRL latch, Rod logic latch, PQ logic, Buckled logic (Figure: transitions among the 0, 1, and N well states, labeled with steps (1) to (4).)
Input-Barrier, Clocked-Bias Retractile • Barrier signal amplified. • Must reset output prior to changing input. • Combinational logic only! • Cycle of operation: • (1) Inputs raise or lower barriers • Do logic w. series/parallel barriers • (2) Clock applies bias force, which changes state, or not • Examples: Hall’s logic, SCRL gates, Rod logic interlocks (Figure: (1) input sets barrier height; (2) clocked force applied.)
Input-Barrier, Clocked-Bias Latching • Cycle of operation: • Input conditionally lowers barrier • Do logic w. series/parallel barriers • Clock applies bias force; conditional bit flip • Input removed, raising the barrier & locking in the state-change • Clock bias can retract • Examples: Mike’s 4-cycle 2-level adiabatic CMOS logic (2LAL) (Figure: transitions among the 0, 1, and N well states, labeled with steps (1) to (4).)
Full Classical-Mechanical Model. Claim: The following components are sufficient for a complete, scalable, parallel, pipelinable, linear-time, stable, classical reversible computing system: (a) Ballistically rotating flywheel driving linear motion. (b) Scalable mesh to synchronize local flywheel phases in 3-D. (c) Sinusoidal to flat-topped waveform shape converter. (d) Non-amplifying signal inverter (NOT gate). (e) Non-amplifying OR/AND gate. (f) Signal amplifier/latch. Primary drawback: Slow propagation speed of mechanical (phonon) signals. cf. Drexler ’92 (Figure: sketches of components (a) to (f), including a part labeled “Sleeve”.)
Common Mistakes to Avoid In Adiabatic Design
Common Mistakes to Avoid: • Don’t use diodes in charge-return path! • Built-in voltage drop kills adiabaticity • Don’t disobey adiabatic transistor rules by: • Turning on transistor with voltage across it • Turning off transistor with current thru it! • This one is often neglected • Use mostly-reversible logic! • Optimize degree of reversibility for application • Don’t over-constrain the design family! • Asymptotically efficient circuits should be possible
Adiabatic Rules for Transistors • Rule 1: Never turn on a transistor if it has a nonzero voltage across it! • I.e., between its source & drain terminals. • Why: This erases info. & causes ½CV² dissipation. • Rule 2: Never apply a nonzero voltage across a transistor even during any on↔off transition! • Why: When partially turned on, the transistor has relatively low R, gets high P = V²/R dissipation. • Corollary: Never turn off a transistor if it has a nonzero current going through it! • Why: As R gradually increases, the V = IR voltage drop will build, and then rule 2 will be violated.
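The rules on this slide can be read as checks applied at each switching event. The sketch below is one hypothetical way to express them; the `SwitchEvent` type, the `rule_violations` function, and the tolerance values are assumptions made for illustration, not part of any real circuit tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SwitchEvent:
    action: str   # "turn_on" or "turn_off"
    v_ds: float   # volts between source and drain at the moment of switching
    i_ds: float   # amps flowing through the channel at the moment of switching


def rule_violations(event: SwitchEvent,
                    v_tol: float = 1e-3,
                    i_tol: float = 1e-9) -> List[str]:
    """Return the adiabatic transistor rules that a switching event breaks.

    v_tol and i_tol are assumed thresholds for treating a value as zero.
    """
    problems = []
    if abs(event.v_ds) > v_tol:
        if event.action == "turn_on":
            problems.append("rule 1: turned on with nonzero voltage across it")
        problems.append("rule 2: nonzero voltage during an on/off transition")
    if event.action == "turn_off" and abs(event.i_ds) > i_tol:
        problems.append("corollary: turned off with nonzero current through it")
    return problems


print(rule_violations(SwitchEvent("turn_on", v_ds=1.0, i_ds=0.0)))
print(rule_violations(SwitchEvent("turn_off", v_ds=0.0, i_ds=1e-6)))
print(rule_violations(SwitchEvent("turn_on", v_ds=0.0, i_ds=0.0)))
```

A checker of this shape could be run over a simulated waveform trace to flag the neglected turn-off case, which the talk notes is often overlooked.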
Adiabatic Rules, continued… • Transistor Rule 3: Never suddenly change the voltage applied across any on transistor. • Why: So transition will be more reversible; dissipation will approach CV²(RC/t), not ½CV². Adiabatic rules for other components: • Diodes: Don’t use them at all! • There is always a built-in voltage drop across them! • Resistors: Avoid moderate network resistances, if poss. • e.g. stay away from range >10 kΩ and <1 MΩ • Capacitors: Minimize, reliability permitting. • Note: Dissipation scales with C²!
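To make the comparison in Rule 3 concrete, the sketch below evaluates both expressions from this slide: ½CV² for abrupt switching and CV²(RC/t) for a gradual transition of duration t. The function names and the component values (1 fF, 1 V, 10 kΩ, 1 ns) are assumptions picked only for illustration, and the adiabatic expression is treated as valid only when t is much longer than RC.

```python
def conventional_switching_energy(c_farads: float, v_volts: float) -> float:
    """Energy dissipated by abruptly charging a node: (1/2) C V^2."""
    return 0.5 * c_farads * v_volts ** 2


def adiabatic_switching_energy(c_farads: float, v_volts: float,
                               r_ohms: float, t_seconds: float) -> float:
    """Approximate dissipation for a gradual ramp: C V^2 (RC / t).

    Only meaningful when the ramp time is much longer than RC.
    """
    rc = r_ohms * c_farads
    if t_seconds <= rc:
        raise ValueError("ramp time must exceed RC for the adiabatic estimate")
    return c_farads * v_volts ** 2 * (rc / t_seconds)


# Assumed example values: 1 fF load, 1 V swing, 10 kOhm path, 1 ns ramp.
C, V, R, T = 1e-15, 1.0, 1e4, 1e-9
e_conv = conventional_switching_energy(C, V)
e_adia = adiabatic_switching_energy(C, V, R, T)
print(f"conventional: {e_conv:.2e} J, adiabatic: {e_adia:.2e} J, "
      f"ratio: {e_conv / e_adia:.0f}x")
```

With these assumed values RC/t is 0.01, so the adiabatic estimate is about 50 times smaller than ½CV². Because CV²(RC/t) equals C²V²R/t, doubling C at fixed R, V, and t quadruples the loss, which is the C² scaling noted for capacitors.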
Transistor Rules Summarized. Legal adiabatic transitions in green; dissipative states and transitions in red. (For n- or p-FETs.) (Figure: state-transition diagram whose nodes are labeled with the transistor state (on/off) and terminal voltage levels (high/low).)
SCRL: Split-level Charge Recovery Logic. The First Pipelined Fully-Adiabatic CMOS Logic (Younis & Knight, MIT, ’94)
Transformation of local state:
Retractile Logic w. SCRL gates • Simple combinational logic of any depth N: • Requires N timing phases • Non-pipelined • No sequential reuse of HW (even worse) • We need sequential logic! (Figure: timing diagram with time on the horizontal axis.)
Simple Reversible CMOS Latch • Uses a standard CMOS transmission gate • Sequence of operation: (1) input initially matches latch contents (output) (2) input changes → output changes (3) latch closes (4) input removed • State table (in, out): before input: (a, a); input arrived: (a, a) or (b, b); input removed: (a, a) or (a, b) (Figure: transmission gate with control signal P between in and out.)
Resetting a Reversible Latch • Can reversibly unlatch data as follows: (exactly the reverse of the latching process) • (1) Data value d stored on memory node M. • (2) Present an exact copy of d on input. • (3) Open the latch (connecting input to M). • No dissipation since voltage levels match • (4) Retract the copy of d from the input. • Retracts copy stored in latch also.
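The latching and unlatching sequences can be traced with a toy model. The sketch below is a deliberately simplified, assumed abstraction (a single memory node, a transmission gate that is either open or closed, and a neutral input level of 0); the class and function names are hypothetical. It counts a dissipative event whenever the gate is opened across mismatched levels.

```python
class ReversibleLatch:
    """Toy model of a transmission-gate latch with memory node M."""

    def __init__(self, neutral: int = 0):
        self.neutral = neutral          # assumed level of a retracted input
        self.input = neutral
        self.node = neutral             # value on memory node M
        self.is_open = True             # open gate connects input to M
        self.dissipative_events = 0

    def set_input(self, value: int) -> None:
        self.input = value
        if self.is_open:
            self.node = value           # M follows the input while connected

    def close(self) -> None:
        self.is_open = False            # isolates M, holding its value

    def open(self) -> None:
        if self.input != self.node:
            self.dissipative_events += 1  # gate turned on across a voltage
            self.node = self.input
        self.is_open = True


def latch_value(latch: ReversibleLatch, d: int) -> None:
    latch.set_input(d)                  # input changes, output changes
    latch.close()                       # latch closes
    latch.set_input(latch.neutral)      # input removed; M keeps d


def unlatch_value(latch: ReversibleLatch, copy_of_d: int) -> None:
    latch.set_input(copy_of_d)          # present an exact copy of d on input
    latch.open()                        # levels match, so no dissipation
    latch.set_input(latch.neutral)      # retracting the copy also resets M


latch = ReversibleLatch()
latch_value(latch, 1)
unlatch_value(latch, 1)
print(latch.node, latch.dissipative_events)
```

Unlatching with a wrong copy (for example `unlatch_value(latch, 0)` after latching 1) opens the gate across mismatched levels and registers one dissipative event, which is why the reset needs an exact copy of the stored value.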