CSCI-641/EENG-641 Computer Architecture
Khurram Kazi
Major sources of the slides for this lecture
• http://fourier.eng.hmc.edu/e85/lectures/instruction/node7.html
• http://www.ece.ucdavis.edu/~vojin/CLASSES/EEC180B/Spring99/lab6.pdf
• http://fourier.eng.hmc.edu/e85/lectures/r3000-isa.html
• Digital Design and Computer Architecture by David Money Harris and Sarah L. Harris, Chapter 6: Architecture
• http://www.hirstbrook.com/cod/Chapter2B.pdf
Assembly language
• Assembly language is the human-readable representation of the computer’s native language
• Each instruction specifies both the operation to perform and the operands on which to operate
• In an instruction such as add a, b, c, add is called the mnemonic and indicates what operation to perform
• The operation is performed on b and c, the source operands, and the result is written to a, the destination operand
Design Principle 1: Simplicity favors regularity
• Instructions with a consistent number of operands (in this case, two sources and one destination) are easier to encode and handle in hardware
• More complex high-level code translates into multiple MIPS instructions
Assembly language
• To execute a complex operation, multiple assembly language instructions are performed
Design Principle 2: Make the common case fast
• The MIPS instruction set makes the common case fast by including only simple, commonly used instructions
• The number of instructions is kept small so that the hardware required to decode each instruction and its operands can be simple, small, and fast
• Less frequent, more elaborate operations are performed using sequences of multiple simple instructions
Assembly language: Operands: Registers, Memory, and Constants
• Instructions operate on operands
• The variables a, b, and c are all operands
• Computers operate on 0’s and 1’s, so every operand must be held as binary data in some physical location
• Operands are stored in registers or memory, or they may be constants stored in the instruction itself
• Computers use these different locations to trade off speed against data capacity
• Registers are accessed quickly compared to memory
• Registers hold only a very limited amount of data, whereas memory holds large amounts of data
• The MIPS architecture uses 32 registers, called the register set or register file
Design Principle 3: Smaller is faster
Translating high-level code to assembly
MIPS register set
Registers within MIPS Processor
• Register file (RF): 32 registers ($0 through $31), each holding one 32-bit word (4 bytes)
• $0 always holds zero
• $sp ($29) is the stack pointer (SP), which by convention points to the top item of the stack in memory
• $ra ($31) holds the return address from a subroutine
• The table on the previous slide shows the conventional usage of all 32 registers
Description of Register File
• There are
  • two read data buses, a_dout and b_dout
  • two read address buses, a_addr and b_addr
  • one write data bus, wr_dbus
  • one write address bus, wr_addr
• Each address bus specifies one of the 32 registers for either reading or writing
• The write operation takes place on the rising edge of the clk signal when the wr_en signal is logic 1
• The read operation is not clocked; it is combinational. Thus the value on a_dout is always the contents of the register specified by a_addr
• Similarly, the value on b_dout is always the contents of the register specified by b_addr
• With this register file, you can write one register and read two registers simultaneously. It is also possible to read a single register on both read buses simultaneously
• In essence, it is a 3-port memory element that allows two reads and one write simultaneously
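The register file behavior described on this slide can be sketched as a small behavioral model. This is only an illustration in Python, not the lab's hardware description: the class name RegisterFile and the method names read and rising_edge are hypothetical, and dropping writes to register 0 is an assumption made so that $0 keeps reading as zero.

```python
class RegisterFile:
    """Behavioral sketch of a 32 x 32-bit register file (2 read ports, 1 write port)."""

    NUM_REGS = 32
    WORD_MASK = 0xFFFFFFFF

    def __init__(self):
        self.regs = [0] * self.NUM_REGS

    def read(self, a_addr, b_addr):
        """Combinational read: return (a_dout, b_dout) for the two read addresses."""
        return self.regs[a_addr], self.regs[b_addr]

    def rising_edge(self, wr_en, wr_addr, wr_dbus):
        """Model one rising clk edge: write only when wr_en is 1."""
        # Assumption (not stated on the slide): writes to register 0 are
        # dropped so that $0 keeps reading as zero.
        if wr_en and wr_addr != 0:
            self.regs[wr_addr] = wr_dbus & self.WORD_MASK


rf = RegisterFile()
rf.rising_edge(wr_en=1, wr_addr=16, wr_dbus=0x1234)   # write register 16
rf.rising_edge(wr_en=0, wr_addr=17, wr_dbus=0xFFFF)   # wr_en is low: no write
a_dout, b_dout = rf.read(16, 16)                      # same register on both read ports
print(hex(a_dout), hex(b_dout), rf.read(17, 0))
```

Reads are plain method calls with no clock argument, which mirrors the combinational read ports; only rising_edge changes state.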
Memory
Memory
• Compared to the register file, memory is large and slow
• MIPS uses byte-addressable memory
• The MIPS architecture uses 32-bit memory addresses and 32-bit data words
• Each byte has a unique 32-bit address; a 32-bit data word spans 4 bytes, so word addresses are multiples of 4
• MIPS uses the load word instruction, lw, to read data from memory into a register
• The lw instruction specifies the effective address in memory as the sum of a base address and an offset, e.g.
  lw $s0, 0($0)    # read data word 0 into $s0
  lw $s1, 4($0)    # read data word 1 into $s1
  lw $s2, 0xC($0)  # read data word 3 into $s2
• In each instruction, the constant before the parentheses is the offset and the register in parentheses holds the base address
Memory
• MIPS uses the store word instruction, sw, to write data from a register to memory
• The sw instruction specifies the effective address in memory as the sum of a base address and an offset, e.g.
  sw $s0, 0($0)    # write $s0 to memory data word 0
  sw $s1, 4($0)    # write $s1 to memory data word 1
  sw $s2, 0xC($0)  # write $s2 to memory data word 3
• In each instruction, the constant before the parentheses is the offset and the register in parentheses holds the base address
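The base-plus-offset addressing used by lw and sw can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the helper names (effective_address, word_index, lw, sw), the dictionary-based memory, and the register values are all hypothetical, and offsets are taken as plain integers rather than 16-bit fields.

```python
WORD_BYTES = 4
MASK32 = 0xFFFFFFFF


def effective_address(base, offset):
    """Effective byte address = base register value + offset, wrapped to 32 bits."""
    return (base + offset) & MASK32


def word_index(byte_address):
    """Convert a byte address to a data word number; words must be 4-byte aligned."""
    if byte_address % WORD_BYTES != 0:
        raise ValueError("lw/sw require a word-aligned address")
    return byte_address // WORD_BYTES


memory = {}                                    # word number -> 32-bit value (sparse model)
regs = {"$0": 0, "$s0": 0xABCDEF78, "$s2": 0}  # illustrative register contents


def sw(rt, offset, base):
    """Store word: write register rt to memory at base + offset."""
    memory[word_index(effective_address(regs[base], offset))] = regs[rt]


def lw(rt, offset, base):
    """Load word: read memory at base + offset into register rt."""
    regs[rt] = memory.get(word_index(effective_address(regs[base], offset)), 0)


sw("$s0", 0xC, "$0")   # write $s0 to memory data word 3
lw("$s2", 0xC, "$0")   # read data word 3 into $s2
print(hex(regs["$s2"]))
```

With byte addressing, offset 0xC (12) from base 0 lands on data word 12 / 4 = 3, which matches the comments in the slide's examples.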
Instruction Set of MIPS Processor
• Instruction set: each instruction in the instruction set describes one particular CPU operation
• Each instruction is represented both in assembly language, by its mnemonic, and in machine language (binary), by a 32-bit word subdivided into several fields
• op: the opcode
• rs: short for “register source”
• rt: comes after rs alphabetically and usually indicates the second register source
• rd: short for “register destination”
• shamt: the shift amount, used only by shift instructions
Instruction Set of MIPS Processor: R-type instruction
• Arithmetic/logical instructions in MIPS
• The logical operations are and, or, xor, and nor
• These R-type instructions operate bit-by-bit on two source registers, and the result is written to the destination register
• and is used for masking bits (i.e. forcing unwanted bits to 0)
• or is useful for combining bits from two registers
• MIPS does not provide a NOT instruction; nor can be used for the NOT operation, e.g.
  A NOR $0 = NOT A
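The bitwise behavior of these logical operations can be checked with a short Python sketch. The operand values and the function name nor are chosen purely for illustration; results are masked to 32 bits because Python integers are unbounded.

```python
MASK32 = 0xFFFFFFFF


def nor(a, b):
    """Bitwise NOR of two 32-bit values."""
    return ~(a | b) & MASK32


s1 = 0xFFFF0000   # illustrative register contents
s2 = 0x46A1F0B7

print(f"and: 0x{s1 & s2:08X}")       # masking: keeps only the upper half of s2
print(f"or:  0x{s1 | s2:08X}")       # combining bits from both registers
print(f"xor: 0x{s1 ^ s2:08X}")
print(f"nor: 0x{nor(s1, s2):08X}")
print(f"not: 0x{nor(s2, 0):08X}")    # A NOR $0 = NOT A
```

Because $0 always reads as zero, A NOR $0 reduces to NOT (A OR 0), which is NOT A.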
Instruction Set of MIPS Processor: Machine code for R-type instruction
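As a hedged illustration of R-type machine code, the sketch below packs the six fields of the standard MIPS R-type layout (op, rs, rt, rd, shamt, funct, with widths 6, 5, 5, 5, 5, 6 bits) into a 32-bit word. The function name encode_r_type is hypothetical; the register numbers ($s0 = 16, $s1 = 17, $s2 = 18) and the funct value 32 for add follow the usual MIPS conventions.

```python
def encode_r_type(rs, rt, rd, shamt, funct, op=0):
    """Pack R-type fields into a 32-bit word: op | rs | rt | rd | shamt | funct."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


S0, S1, S2 = 16, 17, 18   # conventional register numbers for $s0, $s1, $s2
ADD_FUNCT = 32            # funct code for add (op is 0 for R-type)

# add $s0, $s1, $s2 : rd = $s0, rs = $s1, rt = $s2
machine_word = encode_r_type(rs=S1, rt=S2, rd=S0, shamt=0, funct=ADD_FUNCT)
print(f"0x{machine_word:08X}")
```

Note that the assembly order (destination first) differs from the field order in the machine word, where rs and rt come before rd.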
Instruction Set of MIPS Processor: I-type instruction
• Immediate-type (I-type) instructions use two register operands and one immediate operand
• The format is similar to that of R-type instructions
• The operation is determined solely by the opcode
• rs and imm are always used as source operands
• rt is used as a destination by some instructions (e.g. lw) but as another source by others (e.g. sw)
Instruction Set of MIPS Processor: Machine code for I-type instructions
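A similar sketch for I-type machine code, assuming the standard layout of a 6-bit op, 5-bit rs, 5-bit rt, and 16-bit immediate. The function name encode_i_type and the chosen example instructions are illustrative; the opcodes (addi = 8, lw = 35, sw = 43) and register numbers are the usual MIPS values.

```python
def encode_i_type(op, rs, rt, imm):
    """Pack I-type fields into a 32-bit word: op | rs | rt | imm (16 bits)."""
    if not 0 <= op < 64:
        raise ValueError("op must fit in 6 bits")
    if not (0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not -(1 << 15) <= imm < (1 << 16):
        raise ValueError("immediate must fit in 16 bits")
    # Negative immediates are stored in 16-bit two's complement form.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


ZERO, T0, T1, T2, S0, S1, S3 = 0, 8, 9, 10, 16, 17, 19   # register numbers
ADDI, LW, SW = 8, 35, 43                                  # opcodes

addi_word = encode_i_type(ADDI, rs=S1, rt=S0, imm=5)      # addi $s0, $s1, 5
addi_neg = encode_i_type(ADDI, rs=S3, rt=T0, imm=-12)     # addi $t0, $s3, -12
lw_word = encode_i_type(LW, rs=ZERO, rt=T2, imm=32)       # lw $t2, 32($0)
sw_word = encode_i_type(SW, rs=T1, rt=S1, imm=4)          # sw $s1, 4($t1)

for word in (addi_word, addi_neg, lw_word, sw_word):
    print(f"0x{word:08X}")
```

In the lw example rt is the destination, while in the sw example rt names the register whose value is stored, so the same field plays both roles depending on the opcode.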
Load word (lw) instruction
• MIPS uses the load word instruction, lw, to read a data word from memory into a register
  lw $s3, 1($0)  # read memory word 1 into $s3
• The lw instruction specifies the effective address in memory as the sum of a base address and an offset
• The base address (written in parentheses in the instruction) is the value held in a register
• The offset is a constant (written before the parentheses)
• Here the base address is $0 and the offset is 1, so the instruction reads from memory address 1. After the instruction executes, $s3 holds 0xF2F1AC07
• This example treats memory as word-addressable for simplicity; in byte-addressable MIPS, word 1 is at byte address 4 and the instruction would be lw $s3, 4($0)
lw instruction
• rt is used as the destination in this instruction
Store word (sw) instruction
• MIPS uses the store word instruction, sw, to write a data word from a register into memory
  sw $s7, 5($0)  # write $s7 to memory word 5
• The base address (written in parentheses in the instruction) is the value stored in the register
• The offset is a constant (written before the parentheses)
• Here the base address is $0 and the offset is 5, so the instruction writes the data in register $s7 to memory word 5
• Keep in mind that the MIPS memory model is byte-addressable, not word-addressable; with byte addressing, word 5 is at byte address 20 and the instruction would be sw $s7, 20($0)
Classical five-stage RISC pipeline
MIPS R3000 Instruction Set Summary
Pipeline stages: 5 stages
• Fetch
  • Reads the instruction from instruction memory
• Decode
  • Reads the source operands from the register file and decodes the instruction to produce the control signals
• Execute
  • Performs a computation with the ALU
• Memory
  • Reads or writes data memory
• Writeback
  • Writes the result to the register file, when applicable
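To see how the five stages overlap, the following Python sketch builds a cycle-by-cycle schedule for an idealized pipeline. It assumes one instruction enters the pipeline per cycle and ignores hazards and stalls entirely; the function name pipeline_schedule is hypothetical.

```python
STAGES = ["Fetch", "Decode", "Execute", "Memory", "Writeback"]


def pipeline_schedule(num_instructions):
    """Return a list where entry c maps instruction index -> stage during cycle c.

    Idealized model: instruction i enters Fetch in cycle i and advances one
    stage per cycle, with no hazards or stalls.
    """
    if num_instructions == 0:
        return []
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        active = {}
        for instr in range(num_instructions):
            stage = cycle - instr
            if 0 <= stage < len(STAGES):
                active[instr] = STAGES[stage]
        schedule.append(active)
    return schedule


for cycle, active in enumerate(pipeline_schedule(3), start=1):
    print(f"cycle {cycle}: {active}")
```

Under these assumptions, n instructions finish in n + 4 cycles rather than 5n, which is where the throughput gain of pipelining comes from.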
Pipelined logic of MIPS
Snippet of MIPS simulation
[Figure: simulation waveform with signals labeled instruction, destination register, immediate, write address, write enable, and write data]
MIPS Microarchitecture
• Single-cycle microarchitecture
  • Executes an entire instruction in one cycle
• Multi-cycle microarchitecture
  • Executes each instruction in a series of shorter cycles
  • Simpler instructions execute in fewer cycles than complicated ones
  • Reuses complicated hardware blocks such as adders
  • Executes only one instruction at a time, over multiple cycles
• Pipelined microarchitecture
  • Applies pipelining to the single-cycle microarchitecture
  • Can execute several instructions simultaneously
  • Improves throughput significantly
  • All high-performance commercial processors use pipelining
MIPS Microarchitecture
• There are many ways to measure the performance of a computer system
• Intel and Advanced Micro Devices (AMD) both sell compatible processors conforming to the IA-32 architecture
• Intel offered higher clock frequencies than its competitors
• The Athlon from AMD, Intel’s main competitor, executed programs faster than Intel’s chips at the same clock frequency, so clock frequency alone does not determine performance
• The only gimmick-free way to measure performance is to measure the execution time of a program of interest to you
• The computer that executes your program fastest has the highest performance
• The next best thing is to measure the total execution time of a collection of programs
• Such collections of programs are called benchmarks, and their execution times are commonly published
• Execution time = (# instructions) × (cycles/instruction) × (seconds/cycle)
• Cycles per instruction (CPI) is the number of clock cycles required to execute an average instruction
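The execution time equation can be applied directly, as in the Python sketch below. The two processors and all of their numbers are made up for illustration; they are not measurements of any real Intel or AMD part. Seconds per cycle is taken as the reciprocal of the clock frequency.

```python
def execution_time(num_instructions, cpi, clock_hz):
    """Execution time = (# instructions) x (cycles/instruction) x (seconds/cycle)."""
    return num_instructions * cpi / clock_hz


PROGRAM_INSTRUCTIONS = 1_000_000_000   # hypothetical instruction count

# Hypothetical processor A: faster clock, but more cycles per instruction.
time_a = execution_time(PROGRAM_INSTRUCTIONS, cpi=1.5, clock_hz=3.0e9)
# Hypothetical processor B: slower clock, but fewer cycles per instruction.
time_b = execution_time(PROGRAM_INSTRUCTIONS, cpi=0.9, clock_hz=2.0e9)

print(f"A: {time_a:.3f} s, B: {time_b:.3f} s")
print("faster:", "A" if time_a < time_b else "B")
```

In this made-up comparison the processor with the lower clock frequency finishes first because its lower CPI more than compensates, which is the point the slide makes about clock frequency.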