Chapter 1 An Introduction to Processor Design Pusan National University, Department of Computer Engineering
1.1 Processor Architecture & Organization • All modern general-purpose computers employ the “stored-program concept” • Originated with the IAS computer, designed by von Neumann at the Institute for Advanced Study, Princeton (1946) • First implemented in the ‘Baby’ machine at the University of Manchester, England (1948) • [Figure 1.1] The state in a stored-program digital computer PNU Computer Eng.
1.1 Processor Architecture & Organization • 50 years of development: • Processor performance ↑ • Cost ↓ • Result: cost-effective computers (the principles of operation have not changed much) • Most improvements come from: • Advances in electronics technology • Vacuum tubes -> transistors -> ICs -> VLSI • New architectural insights: • Virtual memory (early 1960s) • Cache memory • Pipelining • RISC
1.2 Abstraction in Hardware Design • Transistors (the elementary component) • Logically act as inverters • Logic gates • CMOS NAND gate (built from 4 transistors) • If A = B = Vdd, output = Vss • If either A or B (or both) = Vss, output = Vdd • => output = NOT(A AND B) • A gate can be described by its transistor circuit, logic symbol, or truth table
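The NAND behaviour described on this slide can be written out as a truth table. The following is a minimal sketch, assuming Vdd is modelled as logic 1 and Vss as logic 0; the names `VDD`, `VSS`, `nand`, and `truth_table` are illustrative and do not come from the slides.

```python
from itertools import product

VDD, VSS = 1, 0  # assumption: logic high and logic low modelled as 1 and 0


def nand(a, b):
    """Output is VSS only when both inputs are VDD; otherwise it is VDD."""
    return VSS if (a == VDD and b == VDD) else VDD


def truth_table():
    """Enumerate (A, B, output) for every input combination."""
    return [(a, b, nand(a, b)) for a, b in product((VSS, VDD), repeat=2)]


if __name__ == "__main__":
    print(" A B | out")
    for a, b, out in truth_table():
        print(f" {a} {b} |  {out}")
```

At this level only the input/output relation matters; nothing in the function depends on how the four transistors are wired, which is the point of the gate abstraction.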
1.2 Abstraction in Hardware Design • The gate abstraction • Simplifies the design of circuits with a great number of transistors • Removes the need to know that the gate is built from transistors • At the functional level, the design is independent of the implementation technology • E.g. field-effect transistors, bipolar transistors, etc. • However, performance differs between technologies • Levels of abstraction • Transistors • Gates, memory cells • Adders, MUXs, decoders, registers • ALUs, shifters, memory blocks • Processors, peripherals, memories • ICs • PCBs • PCs, controllers, mobile phones
1.3 MU0 – a simple processor • A simple form of processor can be built from a few basic components • PC (program counter) • ACC (accumulator) • ALU (arithmetic-logic unit) • IR (instruction register) • Instruction decoder, control logic • The MU0 instruction set • A 16-bit machine with a 12-bit address space (4K x 2 bytes = 8K bytes of memory) • Instructions: 16 bits long (opcode: 4 bits, address field: 12 bits)
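The 16-bit instruction format can be illustrated by splitting a word into its two fields. This is a sketch, assuming the 4-bit opcode occupies the top bits and the 12-bit address the bottom bits; the function names `decode` and `encode` are hypothetical.

```python
from typing import Tuple

ADDRESS_BITS = 12
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1  # 0x0FFF


def decode(word: int) -> Tuple[int, int]:
    """Split a 16-bit instruction word into (opcode, address)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("MU0 instructions are 16 bits long")
    return word >> ADDRESS_BITS, word & ADDRESS_MASK


def encode(opcode: int, address: int) -> int:
    """Pack a 4-bit opcode and a 12-bit address into one 16-bit word."""
    if not 0 <= opcode <= 0xF or not 0 <= address <= ADDRESS_MASK:
        raise ValueError("field out of range")
    return (opcode << ADDRESS_BITS) | address


if __name__ == "__main__":
    opcode, address = decode(0x2ABC)
    print(hex(opcode), hex(address))
    # 2**12 addressable 16-bit words, 2 bytes each
    print((1 << ADDRESS_BITS) * 2, "bytes of memory")
```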
1.3 MU0 – a simple processor • [Table 1.1] The MU0 instruction set
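Table 1.1 itself is not reproduced in this text. The sketch below assumes the MU0 encoding as it is usually given (LDA = 0, STO = 1, ADD = 2, SUB = 3, JMP = 4, JGE = 5, JNE = 6, STP = 7) and should be checked against the table. The helper names `word` and `run` and the sample program are illustrative only.

```python
# Assumed opcode assignment; verify against Table 1.1.
LDA, STO, ADD, SUB, JMP, JGE, JNE, STP = range(8)


def word(opcode, address=0):
    """Pack an opcode and a 12-bit address into a 16-bit instruction."""
    return (opcode << 12) | (address & 0xFFF)


def run(program, max_steps=1000):
    """Interpret an MU0 program; return (ACC, memory) when STP is reached."""
    mem = list(program) + [0] * (4096 - len(program))
    pc, acc = 0, 0
    for _ in range(max_steps):
        ir = mem[pc]                      # fetch
        pc = (pc + 1) & 0xFFF
        op, s = ir >> 12, ir & 0xFFF      # decode
        if op == LDA:
            acc = mem[s]
        elif op == STO:
            mem[s] = acc
        elif op == ADD:
            acc = (acc + mem[s]) & 0xFFFF
        elif op == SUB:
            acc = (acc - mem[s]) & 0xFFFF
        elif op == JMP:
            pc = s
        elif op == JGE:
            if acc < 0x8000:              # sign bit clear: ACC >= 0
                pc = s
        elif op == JNE:
            if acc != 0:
                pc = s
        elif op == STP:
            return acc, mem
        else:
            raise ValueError(f"undefined opcode {op}")
    raise RuntimeError("step limit reached without STP")


if __name__ == "__main__":
    # mem[6] := mem[4] + mem[5]
    program = [word(LDA, 4), word(ADD, 5), word(STO, 6), word(STP), 20, 22, 0]
    acc, mem = run(program)
    print(acc, mem[6])
```

Instructions and data share one memory here, which is the stored-program concept from section 1.1 in miniature.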
1.3 MU0 – a simple processor • Datapath • A register transfer level (RTL) design style based on registers, MUXs, and so on • [Figure 1.5] MU0 datapath example
RTL design • [Figure 1.6] MU0 register transfer level organization • Control signals: • Enables on all registers • Function-select lines to the ALU • Select lines for the two MUXs • Control for a tri-state driver that sends the ACC value to memory • MEMrq (memory request) • RnW (read/not-write control)
1.4 Instruction set design • To build a high-performance processor (beyond the MU0 instruction set), instruction set design is important • 4-address instructions (the most general form) • Ex) add d, s1, s2, next_i; d := s1 + s2 • 3-address instructions • Make the address of the next instruction implicit by using the PC (except for branches) • Ex) add d, s1, s2; d := s1 + s2
1.4 Instruction set design • 2-address instructions • Make the destination register the same as one of the source registers • Ex) add d, s1; d := d + s1 • 1-address instructions • The accumulator (AC) is the implicit destination and one of the sources • Ex) add s1; AC := AC + s1 • 0-address instructions (using a stack) • Ex) add; tos := tos + next on stack
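The 3-, 2-, 1- and 0-address forms of `add` can be compared side by side. The sketch below models storage as a Python dict and the stack as a list; the function names are hypothetical, and the next-instruction field of the 4-address form is omitted because it does not affect the arithmetic.

```python
def add3(m, d, s1, s2):
    """3-address: add d, s1, s2 ; d := s1 + s2"""
    m[d] = m[s1] + m[s2]


def add2(m, d, s1):
    """2-address: add d, s1 ; d := d + s1 (destination doubles as a source)"""
    m[d] = m[d] + m[s1]


def add1(m, s1):
    """1-address: add s1 ; AC := AC + s1 (accumulator is implicit)"""
    m["AC"] = m["AC"] + m[s1]


def add0(stack):
    """0-address: add ; pop two operands, push their sum"""
    stack.append(stack.pop() + stack.pop())


if __name__ == "__main__":
    m = {"d": 0, "s1": 3, "s2": 4, "AC": 10}
    add3(m, "d", "s1", "s2")
    print(m["d"])
    add2(m, "d", "s1")
    print(m["d"])
    add1(m, "s1")
    print(m["AC"])
    stack = [3, 4]
    add0(stack)
    print(stack)
```

The fewer addresses an instruction names, the more state becomes implicit, so a general `d := s1 + s2` needs extra load and store instructions in the 1- and 0-address styles.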
1.4 Instruction set design • Addressing modes • Immediate addressing: the instruction contains the data itself • Absolute addressing: the instruction contains the full address of the data • Indirect addressing: the instruction contains the address of a location that contains the address of the data • Register addressing: the data is in a register • Register indirect addressing • Index addressing • Stack addressing
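The difference between the modes is where the operand is finally read from. The following is a rough sketch, assuming memory and the register bank are modelled as dicts; `fetch_operand` and the mode strings are hypothetical names. Register indirect is taken here to mean that the register holds the address of the data, which the slide does not spell out. Index and stack addressing are left out because the slide gives no details for them.

```python
def fetch_operand(mode, field, regs, mem):
    """Return the operand selected by an instruction's field under `mode`."""
    if mode == "immediate":
        return field                  # the field is the data
    if mode == "absolute":
        return mem[field]             # the field is the data's address
    if mode == "indirect":
        return mem[mem[field]]        # the field addresses a pointer to the data
    if mode == "register":
        return regs[field]            # the field names a register holding the data
    if mode == "register_indirect":
        return mem[regs[field]]       # assumed: the register holds the data's address
    raise ValueError(f"unsupported mode: {mode}")


if __name__ == "__main__":
    mem = {100: 200, 200: 7}
    regs = {1: 100}
    for mode, field in [("immediate", 100), ("absolute", 100),
                        ("indirect", 100), ("register", 1),
                        ("register_indirect", 1)]:
        print(mode, fetch_operand(mode, field, regs, mem))
```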
1.4 Instruction set design • Control flow instructions • Branch, jump • Conditional branch • Subroutine calls & returns • System calls • Branch to an operating system routine • Exceptions • Error handling
1.5 Processor design trade-offs • CISC vs RISC • CISC • Aims to reduce the semantic gap between high-level languages and machine instructions • Single instructions perform complex sequences of operations • Intended to make the compiler’s job easier • RISC • The source of ARM’s middle name (Advanced RISC Machine) • Premise: reducing the semantic gap is not the right way to make an efficient computer • [Table 1.3] Typical dynamic instruction usage
1.5 Processor design trade-offs • Data movement between registers and memory: almost half of all instructions executed • Control flow, such as branches and procedure calls: almost a quarter • Arithmetic operations: only 15% • So complex arithmetic instructions do not help much • The most important techniques for making processors go faster: pipelining and cache memory
1.5 Processor design trade-offs • Pipelines • Fetch: fetch the instruction from memory • Decode: decode the instruction • REG: get operands from the register bank • ALU: combine the operands to form the result or a memory address • MEM: access memory for an operand, if necessary • RES: write the result back to the register bank • [Figure 1.13] Pipelined instruction execution
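The overlap that pipelining provides can be shown by tabulating which instruction occupies which stage in each cycle. This is a sketch of an ideal pipeline only: it assumes one cycle per stage and no stalls, and the names `STAGES`, `schedule`, and `total_cycles` are illustrative.

```python
STAGES = ["Fetch", "Decode", "REG", "ALU", "MEM", "RES"]


def schedule(n_instructions):
    """Map each cycle to the (instruction, stage) pairs active in that cycle."""
    table = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            # instruction i (0-based) enters stage s in cycle i + s + 1
            table.setdefault(i + s + 1, []).append((i + 1, stage))
    return table


def total_cycles(n_instructions):
    """Cycles needed to complete n instructions with no stalls."""
    return len(STAGES) + n_instructions - 1 if n_instructions > 0 else 0


if __name__ == "__main__":
    for cycle, active in sorted(schedule(3).items()):
        print(cycle, active)
    print("pipelined:", total_cycles(3), "cycles")
    print("unpipelined:", 3 * len(STAGES), "cycles")
```

Once the pipeline is full, one instruction completes per cycle, although each individual instruction still takes six cycles from fetch to result.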
1.5 Processor design trade-offs • Pipeline hazards • Read-after-write hazard (a data hazard) • The result of one instruction is used as an operand by the next instruction => the second instruction must stall until the result is available • [Figure 1.14] Read-after-write pipeline hazard
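A read-after-write hazard of the kind described here can be detected by comparing each instruction's source registers with the destination of the instruction before it. The sketch below checks adjacent instructions only, matching the slide's example; a real pipeline may also need to consider instructions further apart. The `Inst` type, `raw_hazards`, and the register names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Inst(NamedTuple):
    dest: str                 # register written by the instruction
    srcs: Tuple[str, ...]     # registers read by the instruction


def raw_hazards(program: List[Inst]) -> List[int]:
    """Indices of instructions that read the register written just before them."""
    return [i for i in range(1, len(program))
            if program[i - 1].dest in program[i].srcs]


if __name__ == "__main__":
    program = [
        Inst("r1", ("r2", "r3")),   # add r1, r2, r3
        Inst("r4", ("r1", "r5")),   # add r4, r1, r5  (reads r1: hazard)
        Inst("r6", ("r7", "r8")),   # add r6, r7, r8  (independent)
    ]
    print(raw_hazards(program))
```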
1.5 Processor design trade-offs • Branch hazard • Instructions following a branch are fetched before the branch target is known • Solutions: • Compute the branch target earlier (if possible) • The target may be computed speculatively • Delayed branch • [Figure 1.15] Pipelined branch behavior • Pipeline efficiency • The deeper the pipeline, the worse these hazard problems get; simple RISC-style instructions pipeline more efficiently
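The claim that deeper pipelines suffer more from branches can be made concrete with a common first-order model, which is not taken from the slides: effective CPI = base CPI + (fraction of instructions that are branches) x (cycles lost per branch). The sketch assumes every branch pays the full penalty, and the penalty values are hypothetical.

```python
def effective_cpi(branch_fraction, penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction when every branch costs `penalty_cycles`."""
    return base_cpi + branch_fraction * penalty_cycles


if __name__ == "__main__":
    # Illustrative inputs: a quarter of instructions are control flow,
    # and the penalty grows with the number of stages before the target is known.
    for penalty in (1, 2, 4):
        print(penalty, effective_cpi(0.25, penalty))
```

Resolving the branch target earlier lowers the penalty term, which is what the listed solutions aim at.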
1.6 RISC • 1980: Patterson’s RISC I project (Berkeley) • RISC I architecture • Fixed (32-bit) instruction size with few formats • Load-store architecture: • Instructions that process data operate only on registers • Separate instructions access memory • A large register bank (thirty-two 32-bit registers) to allow the load-store architecture to operate efficiently • RISC I organization • Hard-wired instruction decode logic • Pipelined execution • Single-cycle execution • RISC I advantages • A smaller die size • A shorter development time • Higher performance (controversial)