CSE 599 Lecture 3: Digital Computing

CSE 599 Lecture 3: Digital Computing • In the previous lectures, we examined: • Theory of Computation • Turing Machines and Automata • Computability and Decidability • Time and Space Complexity • Today: Theory and Implementation of Digital Computers • Guest Lecture by Prof. Chris Diorio on silicon integrated-circuit technology • Digital logic • Digital computer organization and design • Moore’s law and technology scaling

History of Digital Computing • ~1850: George Boole invents Boolean algebra • Maps logical propositions to symbols • Allows us to manipulate logic statements using mathematics • 1936: Alan Turing develops the formalism of Turing Machines • 1945: John von Neumann proposes the stored computer program concept • 1946: ENIAC: 18,000 tubes, several hundred multiplications per minute • 1947: Shockley, Brattain, and Bardeen invent the transistor • 1956: Harris introduces the first logic gate • 1972: Intel introduces the 4004 microprocessor • Present: <0.2 m feature sizes; processors with >20-million transistors

The mathematics: Boolean algebra • A Boolean algebra consists of • A set of elements B • Binary operations {+ , •} • A unary operation { ' } • And the following axioms: 1. The set B contains at least two elements, a, b, such that a  b 2. Closure: a + b is in B a • b is in B 3. Commutative: a + b = b + a a • b = b • a 4. Associative: a + (b + c) = (a + b) + c a • (b • c) = (a • b) • c 5. Identity: a + 0 = a a • 1 = a 6. Distributive: a + (b•c)=(a + b)•(a + c) a•(b + c)=(a•b) + (a•c) 7. Complementarity: a + a' = 1 a • a' = 0

Binary logic is a Boolean algebra • Substitute • {0, 1} for B • OR for +, AND for • • NOT for ' • All the axioms hold for binary logic • Definitions • Boolean function: Maps inputs from the set {0,1} to the set {0,1} • Boolean expression: An algebraic statement of Boolean variables and operators

What is digital hardware? • Physical quantities (voltages) represent logical values • If (0V < voltage < 0.8V), then symbol is a “0” • If (2.0V < voltage < 5V), then symbol is a “1” • Physical devices compute logical functions of their inputs • E.g. AND, OR, NOT • Set of n wires allow binary integers from 0 to 2n - 1 • How do we compute using digital hardware?

Lowest Level: Transistors • Transistors implement switches e.g. NOT, NAND, etc.

A B AND Z = A and B A OR Z = A or B B Switches allow digital logic • Map problems (e.g. addition) to logical expressions • Map logical expressions to switching devices

R Q X Y Z0 0 10 1 01 0 0 1 1 0 X Q' S Z Y Digital logic allows computation • A NOR gate: • NOR or NAND each form a complete operator • Can form any Boolean expression using either of them • Using only NOR gates and wire, you can build a general purpose digital computer • E.g. A one-bit memory (flip-flop)

Why do digital computers work like this? • There is no compelling theoretical reason. • Nothing from physics or chemistry, information theory, or CS • The reason is mere expediency • We build computers this way because we can. • All the technology “fits”

The Digital Computing Hierarchy • A hierarchical approach allows general-purpose digital computing: • Transistors  switches  gates  combinational and sequential logic  finite-state behavior  register-transfer behavior  …

Logic in digital computer design • Digital logic: Circuit elements coding binary symbols • Transistor switches have 2 simple states (on/off) • Encode binary symbols implicitly • Combinational logic: Circuits without memory • Logic devices act as Boolean primitives • Example: a NOR gate • Allow arithmetic operators such as ADD to be constructed • Sequential logic: Circuits with memory • Feedback stores logic values • Example: a flip-flop (also known as a latch) • Allows registers and memory to be implemented

Inputs Outputs System Combinational versus sequential systems • Combinational systems are memoryless • The outputs depend only on the present inputs • Sequential systems have memory • The outputs depend on the present inputs and on the previous inputs Inputs Outputs System

X Y0 01 1 X Y0 11 0 X Y Z0 0 00 1 01 0 0 1 1 1 X Z Y X Y X Y Z0 0 00 1 11 0 1 1 1 1 X Y X Z Y Combinational logic gates • AND X • Y • OR X + Y • Buffer X • NOT

X Y Z0 0 10 1 11 0 1 1 1 0 X Z Y X Y Z0 0 10 1 01 0 0 1 1 0 X Z Y X Y Z0 0 00 1 11 0 1 1 1 0 X Z Y Combinational logic gates (cont.) • NAND • NOR • XOR

Complete operators • Can implement any logic function using only NOR or only NAND • E.g. Logical inversion (NOT) • NOR with both inputs tied together gives NOT • Noninverting functions • Example: (X or Y) = not (X nor Y) • In the above, use “not” constructed from a “nor” gate • Can implement NAND and NOR from each other • Example: X nand Y = not ((not X) nor (not Y)) X Y X nor Y0 0 11 1 0

Mapping Boolean expressions to logic gates • Example:

A binary decoder circuit • Input: 2-digit binary number; Output: turn on 1 of 4 wires • Truth Table:

A binary decoder circuit • Input: 2-digit binary number AB; Output: 1 of 4 wires • Circuit:

A multiplexer circuit • Goal: Select one of 4 input lines and pass the information on that line to the single output line • Circuit: Uses binary decoder plus an OR gate

A multiplexer circuit

Exercise: An Adder Circuit • Design a circuit for adding two binary numbers • First, write the truth table (input bits A and B, output bits SUM and CARRY) • Construct circuit using logic gates

An Adder Circuit • Truth table: • Circuit: • Pick gates that match the two outputs SUM = A xor B CARRY = A • B (i.e. A and B)

A Full Adder • Suppose you want to add 2 n-bit numbers • Can you do this by using the previous 1-bit adder with two inputs and two outputs?

A Full Adder • No, you need a 1-bit adder with three inputs: A, B and the CARRY bit from the previous digit • Then, to add 2 n-bit numbers, you can chain n 1-bit adders together, with the CARRY output of one adder feeding into the next adder

A Full Adder • Truth Table: • SUM = ? • CARRY = ?

A Full Adder • Truth Table: • SUM = (A xor B) xor C • CARRY = (A • B) + (A • C) + (B • C)

An Aside: Reversible logic gates • Most Boolean gates are not reversible: Cannot construct input from output (exceptions: NOT and buffer) • Destroying information consumes energy – we will address this later when discussing thermodynamics and quantum computers • Two reversible gates: controlled not (CN) and controlled controlled not (CCN). A B C A’ B’ C’0 0 0 0 0 0 0 0 1 0 0 10 1 0 0 1 0 0 1 1 0 1 1 1 0 0 1 0 0 1 0 1 1 0 11 1 0 1 1 1 1 1 1 1 1 0 A B A’ B’0 0 0 00 1 0 11 0 1 1 1 1 1 0 CCN is complete: we can form any Boolean function using only CCN gates: e.g. AND if C = 0

Sequential logic • The devices • Flip-flops • Shift registers • Finite state machines • The concepts • Sequential systems have memory • The memory of a system is its state • Sequential systems employ feedback • Present inputs affect future outputs

RS Flip-Flops • Inputs: Set and Reset, Output: 2 stored bits that are complementary. Example: Using NOR gates S Not(Q) Q R

The D flip-flop • At the clock edge, the D flip-flop takes D to Q • Internal feedback holds Q until next clock edge Clock is a periodic signal

Shift registers • Chain of D flip-flops: Stores sequences of bits • Assume ABC stores some binary number xyz initially • Stores 1 bit per clock cycle: ABC = xyz, 0yz, 10z, 010

Finite state machines (FSMs) • Consists of combinational logic and storage elements • Localized feedback loops • Sequential logic allows control of sequential algorithms CombinationalLogic Inputs Outputs State Inputs State Outputs Storage Elements

outputlogic Outputs Inputs next statelogic Next State Current State Generalized FSM model • State variables (state vector) describes circuit state • We store state variables in memory (registers) • Combinational logic computes next state and outputs • Next state is a function of current state and inputs

Synchronous design using a clock • Digital designs are almost always synchronous • All voltages change at particular instants in time • At a rising clock edge • The computation is paced by the clock • The clock hides transient behavior • The clock forces the circuit to a known state at regular intervals • Error-free sequencing of our algorithms The circuit transitions to one among a finite number of states at every clock edge

Computer organization and design • Computer design is an application of digital logic design • Combinational and sequential logic • Computer = Central processing unit + memory subsystem • Central processing unit (CPU) = datapath + control • Datapath = functional units + registers • Functional units = ALU, adders, multipliers, dividers, etc. • Registers = program counter, shifters, storage registers • Control = finite state machine • Instructions (fetch, decode, execute) tell the FSM what to do

address Memory System Processor read/write data central processing unit (CPU) control signals Control Data Path data conditions instruction unit: instruction fetch and interpretation FSM execution unit: functional units registers Computer structure

The processing unit • First topic: The datapath • Functional units operate on data • ALU, adders, multipliers, ROM lookup tables, etc. • Registers store and shift data and addresses • Program counter, shifters, registers • Second topic: The controller (control FSM) • Finite state machine coordinates the processor’s operations • Instructions (fetch, decode, execute) tell the FSM what to do • Inputs = machine instruction, datapath conditions • Outputs = register-transfer control signals, ALU op codes

OE Q7 Q6 Q5 Q4 Q3 Q2 Q1 Q0 LD D7 D6 D5 D4 D3 D2 D1 D0 CLK Datapath: Registers • A collection of synchronous D flip-flops • Load selectively using LD • Read using OE (output enable) 8 bit register

Datapath: Register files • Collections of registers • Two-dimensional array of flip-flops • An address indexes a particular word • Can have separate read and write addresses • Can read and write simultaneously • Example: 8 by 8 register file • Uses 64 D flip-flops or eight 8-bit registers (as in previous slide) • Can store 8 words of 8 bits each

A B 16 16 Operation 16 N S Z Datapath: ALU • General-purpose arithmetic logic unit • Input: data and operation (derived from an op-code) • Output: result and status • Built from combinational logic like our ADDER circuit Data Result and status

Controlling the datapath: The control FSM • Top level state diagram • Reset • Fetch instruction • Decode • Execute • 3 classes of instructions • Branch • Load/store • Register-to-register operation • Different sequence of states for each instruction type (PC = program counter) Reset Init InitializeMachine FetchInstr. Load/Store Reg- Reg Branch Register-to-Register Branch Taken BranchNot Taken Incr.PC

Inside the control FSM • Standard state-machine elements • State registers • Next-state combinational logic • Output combinational logic (datapath/control signaling) • “Control" registers • Instruction register (IR) • Program counter (PC) • Inputs/outputs • Outputs control datapath • Inputs from datapath may alter program flow (e.g. branch if (x-y) = 0)

load path 16 REG AC rd wr storepath 16 16 data Data Memory (16-bit words) OP addr N 16 Z 16 IR PC data Inst Memory (8-bit words) 16 16 ControlFSM OP addr 16 Instructions versus Data: Harvard architecture • Instructions and data stored in two separate memories • OP from control FSM specifies ALU operation

Communication: Buses • Real processors have multiple buses • Trade communication bandwidth versus hardware complexity • Control FSM coordinates data transfer between registers

The Key Points • Digital computers are built from simple logic devices • NOR, NAND, or other logic gates built from switches, which are built from transistors, which are built on silicon wafers • Hierarchy allows us to build complex computers • Datapath comprises combinational circuits and registers • Controller comprises finite state machines • With NORs and wire, you can build the entire internet, with every computer on it!

So, where is digital computing headed? • Technology has scaled exponentially the past few decades in accordance with Moore’s law • Chip complexity (transistor density) has doubled every 1.5 years, as “feature” sizes on a chip keep decreasing Graph: Transistor density versus minimum feature size (feature size = width of wire on a chip)

Clock speed has scaled exponentially • Clock frequencies have doubled every ~3 years Graph: Clock speed versus minimum feature size From Sasaki, Multimedia: Future and impact for semiconductor scaling, IEDM, 1997

Drivers of semiconductor scaling • Shrinking feature dimensions reduces energy consumption, physical size, and interconnect length • Energy consumption and physical size • Power dissipation dominates chip design • Power dissipation and size drive computer infrastructure • Fans, heat sinks, etc. to get rid of heat • More chips  bigger boards • Interconnect (wire) • Wire parasitics can dominate chip speed • Resistance, capacitance, inductance, delay • Increased noise (switching, coupling) and delay

But, there are problems… • Approaching physical, practical, and economic limits. • Photolithography: etching circuits on silicon wafers • Component sizes (~ 0.1 m) getting close to the wavelength of light used for etching (mercury, pulsed excimer laser, x-rays (?)…) • Tunneling effects: tiny distances between wires cause electrons to leak across wire, corrupting the circuit… • Clock speed so fast, signals can only travel a fraction of a mm in one cycle – can’t reach all components… • Component sizes at atomic scale – quantum laws take effect • Economics: Fab lines too expensive, transistors too cheap…

The end of scaling? • Reasonable projections: We will be able to engineer devices down to 0.03µm feature sizes • ~10 more years of scaling • Projected transistor density at a 0.03µm: 5 million / mm2 • A 15mm×15mm die can have ~ 1 billion transistors • Issue 1: Power loss increases • Issue 2: Building the interconnect becomes hard • Projected clock rate at 0.03µm: 40GHz • Signals can travel only 4mm in one clock period: can’t reach other components? • More details in the handouts…

CSE 599 Lecture 3: Digital Computing