Explore the inception and evolution of digital computing, from Boolean algebra to contemporary processors. Learn about hardware fundamentals, logic gates, and the hierarchical design approach. Delve into the workings of combinational and sequential systems in digital computer design.
CSE 599 Lecture 3: Digital Computing • In the previous lectures, we examined: • Theory of Computation • Turing Machines and Automata • Computability and Decidability • Time and Space Complexity • Today: Theory and Implementation of Digital Computers • Guest Lecture by Prof. Chris Diorio on silicon integrated-circuit technology • Digital logic • Digital computer organization and design • Moore’s law and technology scaling
History of Digital Computing • ~1850: George Boole invents Boolean algebra • Maps logical propositions to symbols • Allows us to manipulate logic statements using mathematics • 1936: Alan Turing develops the formalism of Turing Machines • 1945: John von Neumann proposes the stored-program concept • 1946: ENIAC: 18,000 tubes, several hundred multiplications per second • 1947: Shockley, Brattain, and Bardeen invent the transistor • 1956: Harris introduces the first logic gate • 1971: Intel introduces the 4004 microprocessor • Present: <0.2 µm feature sizes; processors with >20 million transistors
The mathematics: Boolean algebra • A Boolean algebra consists of • A set of elements B • Binary operations {+ , •} • A unary operation { ' } • And the following axioms: 1. The set B contains at least two elements, a, b, such that a ≠ b 2. Closure: a + b is in B; a • b is in B 3. Commutative: a + b = b + a; a • b = b • a 4. Associative: a + (b + c) = (a + b) + c; a • (b • c) = (a • b) • c 5. Identity: a + 0 = a; a • 1 = a 6. Distributive: a + (b • c) = (a + b) • (a + c); a • (b + c) = (a • b) + (a • c) 7. Complementarity: a + a' = 1; a • a' = 0
Binary logic is a Boolean algebra • Substitute • {0, 1} for B • OR for +, AND for • • NOT for ' • All the axioms hold for binary logic • Definitions • Boolean function: Maps inputs from the set {0,1} to the set {0,1} • Boolean expression: An algebraic statement of Boolean variables and operators
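The claim that all the axioms hold for binary logic can be checked exhaustively, since B has only two elements. The following Python sketch does this; the names OR, AND, NOT, and axioms_hold are illustrative choices and do not come from the lecture.

```python
from itertools import product

B = (0, 1)  # the two elements of binary logic


def OR(a, b):
    return a | b


def AND(a, b):
    return a & b


def NOT(a):
    return 1 - a


def axioms_hold():
    """Check axioms 2 to 7 for every combination of a, b, c in {0, 1}."""
    for a, b, c in product(B, repeat=3):
        checks = [
            OR(a, b) in B and AND(a, b) in B,             # closure
            OR(a, b) == OR(b, a),                         # commutative (+)
            AND(a, b) == AND(b, a),                       # commutative (•)
            OR(a, OR(b, c)) == OR(OR(a, b), c),           # associative (+)
            AND(a, AND(b, c)) == AND(AND(a, b), c),       # associative (•)
            OR(a, 0) == a and AND(a, 1) == a,             # identity
            OR(a, AND(b, c)) == AND(OR(a, b), OR(a, c)),  # distributive
            AND(a, OR(b, c)) == OR(AND(a, b), AND(a, c)),
            OR(a, NOT(a)) == 1 and AND(a, NOT(a)) == 0,   # complementarity
        ]
        if not all(checks):
            return False
    return True


print(axioms_hold())
```

Axiom 1 needs no loop: B contains 0 and 1, which are distinct.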
What is digital hardware? • Physical quantities (voltages) represent logical values • If (0V < voltage < 0.8V), then symbol is a “0” • If (2.0V < voltage < 5V), then symbol is a “1” • Physical devices compute logical functions of their inputs • E.g. AND, OR, NOT • A set of n wires can represent binary integers from 0 to 2^n - 1 • How do we compute using digital hardware?
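The voltage-to-symbol mapping and the "n wires" claim can be sketched in a few lines of Python. The function names to_symbol and wires_to_int are hypothetical, and the choice to read the first wire as the most significant bit is an assumption made for illustration. The thresholds use the same strict inequalities as the slide, so voltages outside both bands (including exactly 0 V) are treated as undefined.

```python
def to_symbol(voltage):
    """Return 0 or 1 for a valid logic level, None for any other voltage."""
    if 0.0 < voltage < 0.8:
        return 0
    if 2.0 < voltage < 5.0:
        return 1
    return None  # between or outside the two bands: no defined symbol


def wires_to_int(voltages):
    """Read a list of wire voltages (most significant wire first) as an integer."""
    value = 0
    for v in voltages:
        bit = to_symbol(v)
        if bit is None:
            raise ValueError(f"{v} V is not a valid logic level")
        value = (value << 1) | bit
    return value


# Three wires carrying 1, 0, 1 encode the integer 5.
print(wires_to_int([3.3, 0.2, 3.3]))
# Four wires all at "1" give the largest 4-wire value, 2**4 - 1.
print(wires_to_int([4.0, 4.0, 4.0, 4.0]))
```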
Lowest Level: Transistors • Transistors implement switches e.g. NOT, NAND, etc.
Switches allow digital logic • Map problems (e.g. addition) to logical expressions • Map logical expressions to switching devices [Figure: switch circuits for AND (Z = A and B) and OR (Z = A or B)]
Digital logic allows computation • A NOR gate: Z = X nor Y; truth table (X Y | Z): 0 0 | 1, 0 1 | 0, 1 0 | 0, 1 1 | 0 • NOR or NAND each form a complete operator • Can form any Boolean expression using either of them • Using only NOR gates and wire, you can build a general purpose digital computer • E.g. A one-bit memory (flip-flop) [Figure: flip-flop built from NOR gates, with inputs R and S and outputs Q and Q']
Why do digital computers work like this? • There is no compelling theoretical reason. • Nothing from physics or chemistry, information theory, or CS • The reason is mere expediency • We build computers this way because we can. • All the technology “fits”
The Digital Computing Hierarchy • A hierarchical approach allows general-purpose digital computing: • Transistors → switches → gates → combinational and sequential logic → finite-state behavior → register-transfer behavior → …
Logic in digital computer design • Digital logic: Circuit elements coding binary symbols • Transistor switches have 2 simple states (on/off) • Encode binary symbols implicitly • Combinational logic: Circuits without memory • Logic devices act as Boolean primitives • Example: a NOR gate • Allow arithmetic operators such as ADD to be constructed • Sequential logic: Circuits with memory • Feedback stores logic values • Example: a flip-flop (also known as a latch) • Allows registers and memory to be implemented
Combinational versus sequential systems • Combinational systems are memoryless • The outputs depend only on the present inputs • Sequential systems have memory • The outputs depend on the present inputs and on the previous inputs [Figures: a System block with Inputs and Outputs, one for each kind of system]
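Combinational logic gates • AND: Z = X • Y; truth table (X Y | Z): 0 0 | 0, 0 1 | 0, 1 0 | 0, 1 1 | 1 • OR: Z = X + Y; truth table (X Y | Z): 0 0 | 0, 0 1 | 1, 1 0 | 1, 1 1 | 1 • Buffer: Y = X; truth table (X | Y): 0 | 0, 1 | 1 • NOT: Y = X'; truth table (X | Y): 0 | 1, 1 | 0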
Combinational logic gates (cont.) • NAND: truth table (X Y | Z): 0 0 | 1, 0 1 | 1, 1 0 | 1, 1 1 | 0 • NOR: truth table (X Y | Z): 0 0 | 1, 0 1 | 0, 1 0 | 0, 1 1 | 0 • XOR: truth table (X Y | Z): 0 0 | 0, 0 1 | 1, 1 0 | 1, 1 1 | 0
Complete operators • Can implement any logic function using only NOR or only NAND • E.g. Logical inversion (NOT) • NOR with both inputs tied together gives NOT; truth table (X Y | X nor Y): 0 0 | 1, 1 1 | 0 • Noninverting functions • Example: (X or Y) = not (X nor Y) • In the above, use “not” constructed from a “nor” gate • Can implement NAND and NOR from each other • Example: X nand Y = not ((not X) nor (not Y))
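The constructions on this slide can be verified by building every gate from a single nor function. In the Python sketch below, the names not_, or_, and_, and nand are illustrative; and_ is an extra construction (NOR of the inverted inputs) that follows from the same identities but is not stated on the slide.

```python
from itertools import product


def nor(x, y):
    """The only primitive gate used in this sketch."""
    return 1 - (x | y)


def not_(x):
    return nor(x, x)                      # both inputs tied together


def or_(x, y):
    return not_(nor(x, y))                # (X or Y) = not (X nor Y)


def and_(x, y):
    return nor(not_(x), not_(y))          # De Morgan: X and Y = (not X) nor (not Y)


def nand(x, y):
    return not_(nor(not_(x), not_(y)))    # X nand Y = not ((not X) nor (not Y))


# Compare each NOR-only construction with Python's built-in bit operators.
for x, y in product((0, 1), repeat=2):
    assert not_(x) == 1 - x
    assert or_(x, y) == (x | y)
    assert and_(x, y) == (x & y)
    assert nand(x, y) == 1 - (x & y)
print("all NOR-only constructions match")
```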
Mapping Boolean expressions to logic gates • Example:
A binary decoder circuit • Input: 2-digit binary number; Output: turn on 1 of 4 wires • Truth Table:
A binary decoder circuit • Input: 2-digit binary number AB; Output: 1 of 4 wires • Circuit:
A multiplexer circuit • Goal: Select one of 4 input lines and pass the information on that line to the single output line • Circuit: Uses binary decoder plus an OR gate
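A minimal Python sketch of the decoder and of the multiplexer built on top of it follows. The function names are hypothetical, and the output ordering (wire index = 2A + B, with A as the high bit) is an assumption, since the original truth table and circuit figures are not available here.

```python
def decoder_2to4(a, b):
    """Turn on exactly one of 4 output wires, selected by the 2-bit number AB."""
    na, nb = 1 - a, 1 - b
    # Each output is an AND of A (or not A) with B (or not B).
    return [na & nb, na & b, a & nb, a & b]


def mux_4to1(inputs, a, b):
    """Pass inputs[AB] to the output: AND each line with its select wire, then OR."""
    select = decoder_2to4(a, b)
    out = 0
    for line, sel in zip(inputs, select):
        out |= line & sel
    return out


print(decoder_2to4(1, 0))            # AB = 10 selects wire 2
print(mux_4to1([0, 1, 1, 0], 0, 1))  # AB = 01 passes input line 1
```

Only one decoder output is ever 1, so the OR gate sees at most one active AND term, which is why a single OR suffices to merge the four lines.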
Exercise: An Adder Circuit • Design a circuit for adding two binary numbers • First, write the truth table (input bits A and B, output bits SUM and CARRY) • Construct circuit using logic gates
An Adder Circuit • Truth table (A B | SUM CARRY): 0 0 | 0 0, 0 1 | 1 0, 1 0 | 1 0, 1 1 | 0 1 • Circuit: pick gates that match the two outputs • SUM = A xor B • CARRY = A • B (i.e. A and B)
A Full Adder • Suppose you want to add 2 n-bit numbers • Can you do this by using the previous 1-bit adder with two inputs and two outputs?
A Full Adder • No, you need a 1-bit adder with three inputs: A, B and the CARRY bit from the previous digit • Then, to add 2 n-bit numbers, you can chain n 1-bit adders together, with the CARRY output of one adder feeding into the next adder
A Full Adder • Truth Table: • SUM = ? • CARRY = ?
A Full Adder • Truth Table: • SUM = (A xor B) xor C • CARRY = (A • B) + (A • C) + (B • C)
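The full-adder equations, and the chaining of n one-bit adders described earlier, can be sketched in Python as follows. The helper names (full_adder, ripple_add, to_bits, from_bits) and the least-significant-bit-first list layout are assumptions made for this illustration.

```python
def full_adder(a, b, c):
    """One-bit adder with carry-in C; returns (SUM, CARRY)."""
    s = (a ^ b) ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def ripple_add(x_bits, y_bits):
    """Add two equal-length bit lists (least significant bit first)."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        # The CARRY output of one adder feeds the next adder in the chain.
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry


def to_bits(value, n):
    """Little-endian list of the n low bits of value."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


bits, carry = ripple_add(to_bits(6, 3), to_bits(3, 3))
print(from_bits(bits), carry)  # 3-bit sum and the carry out of the top bit
```

For 6 + 3 with 3-bit inputs, the sum bits hold 1 and the final carry is 1, which together represent 1 + 8 = 9.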
An Aside: Reversible logic gates • Most Boolean gates are not reversible: the input cannot be reconstructed from the output (exceptions: NOT and buffer) • Destroying information consumes energy; we will address this later when discussing thermodynamics and quantum computers • Two reversible gates: controlled not (CN) and controlled controlled not (CCN); in the tables below, primes mark the outputs • CN truth table (A B | A’ B’): 0 0 | 0 0, 0 1 | 0 1, 1 0 | 1 1, 1 1 | 1 0 • CCN truth table (A B C | A’ B’ C’): 0 0 0 | 0 0 0, 0 0 1 | 0 0 1, 0 1 0 | 0 1 0, 0 1 1 | 0 1 1, 1 0 0 | 1 0 0, 1 0 1 | 1 0 1, 1 1 0 | 1 1 1, 1 1 1 | 1 1 0 • CCN is complete: we can form any Boolean function using only CCN gates, e.g. AND if C = 0
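Both reversible gates are short enough to write out and test directly. In this Python sketch the function names cn and ccn are illustrative. The checks confirm that each gate undoes itself when applied twice, and that CCN computes AND on its third output when C = 0.

```python
from itertools import product


def cn(a, b):
    """Controlled NOT: B is flipped when A is 1; A passes through unchanged."""
    return a, b ^ a


def ccn(a, b, c):
    """Controlled-controlled NOT: C is flipped when A and B are both 1."""
    return a, b, c ^ (a & b)


# Reversibility: applying a gate to its own output restores the input.
for bits in product((0, 1), repeat=2):
    assert cn(*cn(*bits)) == bits
for bits in product((0, 1), repeat=3):
    assert ccn(*ccn(*bits)) == bits

# With C = 0, the third output of CCN is A and B.
for a, b in product((0, 1), repeat=2):
    assert ccn(a, b, 0)[2] == (a & b)

print(ccn(1, 1, 0))
```

Setting A = B = 1 turns the third output into not C, so the same gate also provides inversion.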
Sequential logic • The devices • Flip-flops • Shift registers • Finite state machines • The concepts • Sequential systems have memory • The memory of a system is its state • Sequential systems employ feedback • Present inputs affect future outputs
RS Flip-Flops • Inputs: Set (S) and Reset (R) • Output: two stored bits that are complementary, Q and Not(Q) • Example: using NOR gates [Figure: two cross-coupled NOR gates with inputs S and R and outputs Q and Not(Q)]
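The feedback in a NOR-based RS flip-flop can be imitated in Python by re-evaluating the two gates until the outputs stop changing. This is a sketch under stated assumptions: the wiring Q = NOR(R, Not(Q)) and Not(Q) = NOR(S, Q) is the standard cross-coupled arrangement, and the function name sr_latch and the fixed iteration limit are illustrative. The S = R = 1 input combination is left out because the outputs are then no longer complementary.

```python
def nor(x, y):
    return 1 - (x | y)


def sr_latch(s, r, q, q_bar):
    """Settle a cross-coupled NOR latch, starting from stored state (q, q_bar)."""
    for _ in range(4):  # a few passes are enough for the feedback to settle
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


state = (0, 1)                      # start with Q = 0
state = sr_latch(1, 0, *state)      # Set
print(state)
state = sr_latch(0, 0, *state)      # inputs released: the stored bit is held
print(state)
state = sr_latch(0, 1, *state)      # Reset
print(state)
```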
The D flip-flop • At the clock edge, the D flip-flop copies D to Q • Internal feedback holds Q until the next clock edge • The clock is a periodic signal
Shift registers • Chain of D flip-flops: Stores sequences of bits • Assume ABC stores some binary number xyz initially • Shifts in 1 bit per clock cycle; each stored bit moves one flip-flop along the chain • Example with input bits 0, 1, 0: ABC = xyz, 0xy, 10x, 010
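A three-stage shift register can be modeled as a list that moves one position per clock edge. In this Python sketch the function name clock_shift is hypothetical, and the direction of shifting (new bits enter at A and move toward C) is an assumption consistent with a chain A → B → C of D flip-flops.

```python
def clock_shift(register, bit_in):
    """One clock edge: A takes the input bit, B takes old A, C takes old B."""
    return [bit_in] + register[:-1]


state = ["x", "y", "z"]   # unknown initial contents of A, B, C
history = [state]
for bit in (0, 1, 0):     # one input bit per clock cycle
    state = clock_shift(state, bit)
    history.append(state)

for snapshot in history:
    print(snapshot)
```

After three clock edges every original bit has been shifted out, and the register holds only the three input bits.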
Finite state machines (FSMs) • Consists of combinational logic and storage elements • Localized feedback loops • Sequential logic allows control of sequential algorithms [Figure: a Combinational Logic block with Inputs and Outputs, connected to Storage Elements through State Inputs and State Outputs]
Generalized FSM model • State variables (state vector) describe circuit state • We store state variables in memory (registers) • Combinational logic computes next state and outputs • Next state is a function of current state and inputs [Figure: Inputs and Current State feed the next-state logic and the output logic, which produce Next State and Outputs]
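The generalized model maps onto a small Python class: two pure functions stand in for the combinational logic, and one attribute stands in for the state register. The class name FSM, its method names, and the running-parity example are all illustrative assumptions. In this sketch the output depends on both the current state and the inputs.

```python
class FSM:
    """Generalized FSM: combinational logic plus one state register."""

    def __init__(self, next_state_logic, output_logic, initial_state):
        self.next_state_logic = next_state_logic
        self.output_logic = output_logic
        self.state = initial_state          # contents of the state register

    def clock(self, inputs):
        """One clock edge: compute outputs, then load the next state."""
        out = self.output_logic(self.state, inputs)
        self.state = self.next_state_logic(self.state, inputs)
        return out


# Example machine: running parity of the 1s seen so far on a 1-bit input.
parity = FSM(
    next_state_logic=lambda state, bit: state ^ bit,
    output_logic=lambda state, bit: state ^ bit,
    initial_state=0,
)

outputs = [parity.clock(bit) for bit in (1, 1, 0, 1)]
print(outputs)
```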
Synchronous design using a clock • Digital designs are almost always synchronous • All voltages change at particular instants in time • At a rising clock edge • The computation is paced by the clock • The clock hides transient behavior • The clock forces the circuit to a known state at regular intervals • Error-free sequencing of our algorithms • The circuit transitions to one among a finite number of states at every clock edge
Computer organization and design • Computer design is an application of digital logic design • Combinational and sequential logic • Computer = Central processing unit + memory subsystem • Central processing unit (CPU) = datapath + control • Datapath = functional units + registers • Functional units = ALU, adders, multipliers, dividers, etc. • Registers = program counter, shifters, storage registers • Control = finite state machine • Instructions (fetch, decode, execute) tell the FSM what to do
Computer structure [Figure: block diagram. The Processor (central processing unit, CPU) contains Control (instruction unit: instruction fetch and interpretation FSM) and Data Path (execution unit: functional units and registers), which exchange control signals and data conditions. The Processor connects to the Memory System through address, read/write, and data lines.]
The processing unit • First topic: The datapath • Functional units operate on data • ALU, adders, multipliers, ROM lookup tables, etc. • Registers store and shift data and addresses • Program counter, shifters, registers • Second topic: The controller (control FSM) • Finite state machine coordinates the processor’s operations • Instructions (fetch, decode, execute) tell the FSM what to do • Inputs = machine instruction, datapath conditions • Outputs = register-transfer control signals, ALU op codes
Datapath: Registers • A collection of synchronous D flip-flops • Load selectively using LD • Read using OE (output enable) [Figure: 8-bit register with data inputs D7 to D0, outputs Q7 to Q0, and control inputs LD, OE, and CLK]
Datapath: Register files • Collections of registers • Two-dimensional array of flip-flops • An address indexes a particular word • Can have separate read and write addresses • Can read and write simultaneously • Example: 8 by 8 register file • Uses 64 D flip-flops or eight 8-bit registers (as in previous slide) • Can store 8 words of 8 bits each
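The 8 by 8 register file can be sketched in Python as a list of words with separate read and write addresses. The class name RegisterFile, the clock method, and the choice that a simultaneous read and write of the same address returns the old word are assumptions made for illustration; a real design could resolve that case differently.

```python
class RegisterFile:
    """Array of registers with separate read and write addresses."""

    def __init__(self, words=8, width=8):
        self.width = width
        self.words = [0] * words

    def clock(self, read_addr, write_addr=None, write_data=0):
        """One clock edge: return the word at read_addr, then apply the write."""
        out = self.words[read_addr]
        if write_addr is not None:
            # Keep only as many bits as one register has flip-flops.
            self.words[write_addr] = write_data & ((1 << self.width) - 1)
        return out


rf = RegisterFile()            # 8 words of 8 bits each
rf.clock(0, write_addr=3, write_data=0x1FF)
print(rf.clock(3))             # the 9th bit of 0x1FF did not fit in 8 bits
```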
Datapath: ALU • General-purpose arithmetic logic unit • Input: data and operation (derived from an op-code) • Output: result and status • Built from combinational logic like our ADDER circuit [Figure: ALU block with 16-bit data inputs A and B, an Operation input, and result and status outputs (labels N, S, Z)]
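A behavioral Python sketch of a 16-bit ALU follows. The operation names (ADD, SUB, AND, OR) and the single zero status bit are hypothetical choices for illustration; the lecture does not list the actual operation set or status outputs in text form.

```python
MASK = 0xFFFF  # results are truncated to 16 bits


def alu(op, a, b):
    """Return (result, zero) for 16-bit inputs a and b."""
    if op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    elif op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    else:
        raise ValueError(f"unknown operation: {op}")
    result &= MASK
    zero = int(result == 0)  # status bit: 1 when the result is all zeros
    return result, zero


print(alu("ADD", 2, 3))
print(alu("SUB", 5, 5))        # equal operands set the zero status bit
print(alu("ADD", 0xFFFF, 1))   # wraps around to 0 in 16 bits
```

A zero status bit of this kind is what lets a controller implement a test such as "branch if (x-y) = 0".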
Controlling the datapath: The control FSM • Top level state diagram • Reset • Fetch instruction • Decode • Execute • 3 classes of instructions • Branch • Load/store • Register-to-register operation • Different sequence of states for each instruction type (PC = program counter) [Figure: state diagram with states Reset, Init (initialize machine), Fetch Instr., Load/Store, Reg-Reg (register-to-register), Branch (with Branch Taken and Branch Not Taken exits), and Incr. PC]
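The top-level state diagram can be written as a next-state function. The Python sketch below is one plausible reading of the diagram, not a transcription of it: the state names, the separate DECODE state, and the assumption that a taken branch returns directly to fetch (because the PC is loaded with the branch target) are all illustrative.

```python
def next_state(state, instr_class=None, branch_taken=False):
    """Next-state logic for a top-level control FSM (state names are illustrative)."""
    if state == "RESET":
        return "INIT"
    if state == "INIT":
        return "FETCH"
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        # The instruction class selects which execute sequence to follow.
        return {"branch": "BRANCH",
                "load_store": "LOAD_STORE",
                "reg_reg": "REG_REG"}[instr_class]
    if state == "BRANCH":
        # Assumed: a taken branch loads the PC, so no increment is needed.
        return "FETCH" if branch_taken else "INCR_PC"
    if state in ("LOAD_STORE", "REG_REG"):
        return "INCR_PC"
    if state == "INCR_PC":
        return "FETCH"
    raise ValueError(f"unknown state: {state}")


def trace(instr_class, branch_taken=False):
    """States visited for one instruction, from FETCH back to the next FETCH."""
    state, visited = "FETCH", ["FETCH"]
    while True:
        state = next_state(state, instr_class, branch_taken)
        visited.append(state)
        if state == "FETCH":
            return visited


print(trace("reg_reg"))
print(trace("branch", branch_taken=True))
```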
Inside the control FSM • Standard state-machine elements • State registers • Next-state combinational logic • Output combinational logic (datapath/control signaling) • “Control” registers • Instruction register (IR) • Program counter (PC) • Inputs/outputs • Outputs control datapath • Inputs from datapath may alter program flow (e.g. branch if (x-y) = 0)
Instructions versus Data: Harvard architecture • Instructions and data stored in two separate memories • OP from control FSM specifies ALU operation [Figure: datapath with a Control FSM, IR, PC, an instruction memory (8-bit words), a data memory (16-bit words), registers REG and AC, 16-bit load and store paths, and rd, wr, addr, data, OP, N, and Z signals]
Communication: Buses • Real processors have multiple buses • Trade communication bandwidth versus hardware complexity • Control FSM coordinates data transfer between registers
The Key Points • Digital computers are built from simple logic devices • NOR, NAND, or other logic gates built from switches, which are built from transistors, which are built on silicon wafers • Hierarchy allows us to build complex computers • Datapath comprises combinational circuits and registers • Controller comprises finite state machines • With NORs and wire, you can build the entire internet, with every computer on it!
So, where is digital computing headed? • Technology has scaled exponentially over the past few decades, in accordance with Moore’s law • Chip complexity (transistor density) has doubled every 1.5 years, as “feature” sizes on a chip keep decreasing [Graph: Transistor density versus minimum feature size (feature size = width of wire on a chip)]
Clock speed has scaled exponentially • Clock frequencies have doubled every ~3 years Graph: Clock speed versus minimum feature size From Sasaki, Multimedia: Future and impact for semiconductor scaling, IEDM, 1997
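To make the two doubling rates concrete, the growth after a given number of years is 2 raised to (years / doubling period). The Python sketch below applies this to the figures quoted on the slides (1.5 years for transistor density, about 3 years for clock frequency); the function name growth_factor is illustrative.

```python
def growth_factor(years, doubling_period_years):
    """Multiplicative growth after `years` of steady doubling."""
    return 2 ** (years / doubling_period_years)


# Transistor density, doubling every 1.5 years:
print(growth_factor(3, 1.5))    # two doublings in 3 years
print(growth_factor(10, 1.5))   # roughly a hundredfold in a decade

# Clock frequency, doubling every ~3 years:
print(growth_factor(9, 3))      # three doublings in 9 years
print(growth_factor(10, 3))     # roughly tenfold in a decade
```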
Drivers of semiconductor scaling • Shrinking feature dimensions reduces energy consumption, physical size, and interconnect length • Energy consumption and physical size • Power dissipation dominates chip design • Power dissipation and size drive computer infrastructure • Fans, heat sinks, etc. to get rid of heat • More chips → bigger boards • Interconnect (wire) • Wire parasitics can dominate chip speed • Resistance, capacitance, inductance, delay • Increased noise (switching, coupling) and delay
But, there are problems… • Approaching physical, practical, and economic limits. • Photolithography: etching circuits on silicon wafers • Component sizes (~0.1 µm) are getting close to the wavelength of light used for etching (mercury, pulsed excimer laser, x-rays (?)…) • Tunneling effects: tiny distances between wires cause electrons to leak from one wire to another, corrupting the circuit… • Clock speed so fast that signals can only travel a fraction of a mm in one cycle, so they can’t reach all components… • Component sizes at atomic scale: quantum laws take effect • Economics: Fab lines too expensive, transistors too cheap…
The end of scaling? • Reasonable projections: We will be able to engineer devices down to 0.03 µm feature sizes • ~10 more years of scaling • Projected transistor density at 0.03 µm: 5 million / mm² • A 15 mm × 15 mm die can have ~1 billion transistors • Issue 1: Power loss increases • Issue 2: Building the interconnect becomes hard • Projected clock rate at 0.03 µm: 40 GHz • Signals can travel only 4 mm in one clock period: can’t reach other components? • More details in the handouts…
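The projections on this slide can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers quoted above; the variable names are illustrative, and the implied signal speed is a derived quantity, not a figure from the lecture.

```python
# Transistor count: density times die area.
density_per_mm2 = 5e6                 # 5 million transistors per mm^2
die_side_mm = 15
transistors = density_per_mm2 * die_side_mm ** 2
print(transistors)                    # about 1.1 billion, i.e. ~1 billion

# Clock period at the projected clock rate.
clock_hz = 40e9                       # 40 GHz
period_s = 1 / clock_hz
print(period_s * 1e12, "ps")          # one period is about 25 picoseconds

# Signal speed implied by "4 mm in one clock period".
reach_mm = 4
implied_speed_m_per_s = (reach_mm * 1e-3) / period_s
print(implied_speed_m_per_s)          # about 1.6e8 m/s
```

The implied speed is roughly half the speed of light in vacuum, so the 4 mm figure corresponds to a fairly optimistic on-chip propagation speed. Against a 15 mm die, even that reach covers only part of the chip in one cycle.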