This review covers the fundamentals of RISC CPU designs, processor instruction sets, components of a computer system, software-hardware interaction, and designing pipelined processors. Taught by Lecturer 吳安宇 at National Taiwan University, this course delves into instruction set architectures, processor design, memory systems, and the MIPS ISA. Emphasizing the importance of processor innovation and compatibility, the course explores the history of ISAs from the CISC to the RISC to the post-RISC era, highlighting the shift towards cleaner, more efficient architectures for higher performance. Learn how processors, memory, and I/O systems work together, and discover the significance of the MIPS architecture in modern computing, with a practical example from the PlayStation 2's Emotion Engine. Dive into the complexities of ISA design and its impact on software-hardware abstraction, enabling advancements in CPU technology while maintaining software compatibility.
Review of RISC CPU Designs
Lecturer: 吳安宇
Date: 2005/3/4
Computer Architecture
• After this course, you should:
  • Have a firm grasp of processor instruction sets.
  • Recognize the main components of a computer and how they interact.
  • Be able to design a simple pipelined processor.
  • Have the HW knowledge necessary for later courses in the curriculum.
National Taiwan University, Professor 吳安宇
Why should you care?
• It is interesting.
• How do you make a processor that runs at 3 GHz?
What do we cover?
• The course is roughly split into three parts.
  • The first third discusses instruction set architectures.
  • Next, we focus on processor implementations.
  • Finally, we talk about memory systems, I/O, and how to connect it all together.
Instruction set architectures
• An instruction set describes the basic functions that a processor can perform.
• It serves as an interface between hardware and software; programs are sequences of instructions that get executed by hardware.
• Several important issues:
  • The instruction set in CA lacked many features, such as support for function calls. We’ll work with a larger, more realistic processor.
  • We’ll also see more ways in which the instruction set architecture affects the hardware design.
  • We (i.e., you) will do more assembly-language programming too.
Processor design
• The second part of the semester will address two other limitations of the single-cycle processor from CA.
  • Supporting more complex instructions would increase the cycle time.
  • The CPU hardware is not fully utilized, so it runs slower than it could.
• We will focus on pipelining, which is one of the most important ways of speeding up processors.
  • The idea behind pipelining is very simple, but there are many details and special cases that must be handled.
  • Every modern processor uses pipelining.
Memory and I/O
• Memory and I/O are often bottlenecks in modern machines.
  • Processor speeds far outpace memory and I/O speeds (for example, the network).
  • A 4 GHz processor won’t help you browse the web any faster if you’re stuck on a 56 kbps modem.
• The issues associated with memory and I/O (NOT covered in this course):
  • How caches can dramatically improve the speed of memory accesses.
  • How processors, memory and peripheral devices can be connected, and CPU support for I/O communications.
MIPS
• In this class, we’ll use the MIPS instruction set architecture (ISA) to illustrate concepts in assembly language and machine organization.
  • Of course, the concepts are not MIPS-specific.
  • MIPS is just convenient because it is realistic, yet simple (unlike x86, which is CISC).
• MIPS was one of the first RISC ISAs. It is still used in many places today, primarily in embedded systems, like:
  • Various routers from Cisco
  • Game machines like the Nintendo 64 and Sony PlayStation 2 (PS2)
SoC Example: Emotion Engine in PS2
PS2 and IP
• Emotion Engine
  • MIPS R5900-based design (the R3000A-based core in the PS2 is the separate I/O processor)
  • MPEG decoder
  • Vector generator (co-processor)
  • Reaches 6.2 GFLOPS
Instruction Set Architecture (ISA)
• As mentioned earlier, the ISA is the interface between hardware and software.
• The ISA serves as an abstraction layer between the HW and SW.
  • Software doesn’t need to know how the processor is implemented.
  • Any processor that implements the ISA appears equivalent.
• An ISA enables processor innovation without changing software.
  • This software compatibility has made billions of dollars for Intel.
• Before ISAs were fixed across machines, software was rewritten for each new machine.
A little ISA history
• 1964: IBM System/360, the first computer “family”
  • IBM wanted to sell a range of machines that ran the same software
• 1960s, 1970s: Complex Instruction Set Computer (CISC) era
  • Much assembly programming, compiler technology immature
  • Simple machine implementations
  • Complex instructions simplified programming, little impact on design
• 1980s: Reduced Instruction Set Computer (RISC) era
  • Most programming in high-level languages, mature compilers
  • Aggressive machine implementations
  • Simpler, cleaner ISAs facilitated pipelining, higher clock frequencies
• 1990s: Post-RISC era
  • ISA complexity largely relegated to a non-issue
  • CISC and RISC chips use the same techniques (pipelining, superscalar, ...)
  • ISA compatibility outweighs any RISC advantage in general-purpose computing
  • Embedded processors prefer RISC for lower power, cost
Basic MIPS Architecture
• We started with how instruction set architectures (ISAs) abstract away the hardware implementation details, enabling software compatibility across processor generations.
• Today we’ll begin our discussion of the MIPS ISA, which will be our example system for much of this semester.
  • We present the basic instruction set architecture.
  • This also involves some discussion of the CPU hardware.
• This architecture is mostly a superset of the one from CA, so today’s lecture should also serve as a quick review.
MIPS: register-to-register, three address
• MIPS is a register-to-register, or load/store, architecture.
  • The destination and sources must all be registers.
  • Special instructions, which we’ll see later today, are needed to access main memory.
• MIPS uses three-address instructions for data manipulation.
  • Each ALU instruction contains a destination and two sources.
  • For example, an addition instruction (a = b + c) has the form:
    add a, b, c # operation, destination, then the two sources
Register file review
• Here is a block symbol for a general 2^k x n register file.
  • If Write = 1, then D data is stored into D address.
  • You can read from two registers at once, by supplying the A address and B address inputs. The outputs appear as A data and B data.
• Registers are clocked, sequential devices.
  • We can read from the register file at any time.
  • Data is written only on the positive edge of the clock.
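The behavior described on this slide can be sketched as a small Python model. This is only an illustration: the class name RegisterFile and the method names read and clock_edge are hypothetical, and k = 5, n = 32 are chosen to match the MIPS register file discussed next.

```python
class RegisterFile:
    """Toy model of a 2^k x n register file: two read ports, one write port."""

    def __init__(self, k, n):
        self.n = n
        self.regs = [0] * (2 ** k)

    def read(self, a_addr, b_addr):
        # Reads can happen at any time: both ports return their data together.
        return self.regs[a_addr], self.regs[b_addr]

    def clock_edge(self, write, d_addr, d_data):
        # Models one positive clock edge: data is stored only if Write = 1.
        if write == 1:
            self.regs[d_addr] = d_data & ((1 << self.n) - 1)  # keep n bits


rf = RegisterFile(k=5, n=32)                   # 32 registers, 32 bits each
rf.clock_edge(write=1, d_addr=9, d_data=7)     # register 9 now holds 7
rf.clock_edge(write=0, d_addr=10, d_data=99)   # Write = 0: nothing is stored
a_data, b_data = rf.read(9, 10)
```

Here a_data is 7 and b_data stays 0, because the second clock edge arrived with Write = 0.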
MIPS register file
• MIPS processors have 32 registers, each of which holds a 32-bit value.
  • Register addresses are 5 bits long.
  • The data inputs and outputs are 32 bits wide.
• More registers might seem better, but there is a limit to the goodness.
  • It’s more expensive, because of both the registers themselves as well as the decoders and muxes needed to select individual registers.
  • Instruction lengths may be affected, as we’ll see on Friday.
MIPS register names
• MIPS register names begin with a $. There are two naming conventions:
  • By number: $0 $1 $2 … $31
  • By (mostly) two-character names, such as: $a0-$a3 $s0-$s7 $t0-$t9 $sp $ra
• Not all of the registers are equivalent:
  • E.g., register $0 or $zero always contains the value 0 (go ahead, try to change it)
• Other registers have special uses, by convention:
  • E.g., register $sp is used to hold the “stack pointer”
• You have to be a little careful in picking registers for your programs.
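The two naming conventions and the special behavior of $zero can be illustrated with a short Python sketch. The table follows the conventional MIPS register numbering; the helper write_reg and the sample stack address are hypothetical, introduced only for this example.

```python
# Conventional MIPS register numbering: each name is an alias for $0..$31.
NAMES = (["zero", "at", "v0", "v1"] + [f"a{i}" for i in range(4)]
         + [f"t{i}" for i in range(8)] + [f"s{i}" for i in range(8)]
         + ["t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra"])
REG_NUM = {f"${name}": num for num, name in enumerate(NAMES)}

regs = [0] * 32


def write_reg(name, value):
    """Hypothetical helper: write a register by name, keeping 32 bits."""
    num = REG_NUM[name]
    if num != 0:                 # writes to $zero are discarded
        regs[num] = value & 0xFFFFFFFF


write_reg("$zero", 123)          # "go ahead, try to change it"
write_reg("$sp", 0x7FFFFFFC)     # sample stack address, chosen arbitrarily
```

After both calls, register 0 still reads as 0, while register 29 ($sp) holds the sample address.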
Basic arithmetic and logic operations
• The basic integer arithmetic operations include the following: add sub mul div
• And here are a few logical operations: and or xor
• Remember that these all require three register operands; for example:
  add $t0, $t1, $t2 # $t0 = $t1 + $t2
  mul $s1, $s1, $a0 # $s1 = $s1 x $a0
Larger expressions
• More complex arithmetic expressions may require multiple operations at the instruction set level.
  t0 = (t1 + t2) x (t3 - t4)
  add $t0, $t1, $t2 # $t0 contains $t1 + $t2
  sub $s0, $t3, $t4 # temporary value $s0 = $t3 - $t4
  mul $t0, $t0, $s0 # $t0 contains the final product
• Temporary registers may be necessary, since each MIPS instruction can access only two source registers and one destination.
  • In this example, we could re-use $t3 instead of introducing $s0.
  • But be careful not to modify registers that are needed again later.
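To check the decomposition by hand, the three-instruction sequence can be mirrored in Python, one statement per MIPS instruction. The input values are made up for illustration, and the 32-bit mask is a simplification of real register behavior.

```python
MASK = 0xFFFFFFFF   # registers hold 32 bits

# Sample inputs, chosen arbitrarily for this sketch.
regs = {"$t1": 5, "$t2": 3, "$t3": 10, "$t4": 4}

regs["$t0"] = (regs["$t1"] + regs["$t2"]) & MASK   # add $t0, $t1, $t2
regs["$s0"] = (regs["$t3"] - regs["$t4"]) & MASK   # sub $s0, $t3, $t4
regs["$t0"] = (regs["$t0"] * regs["$s0"]) & MASK   # mul $t0, $t0, $s0
```

With these inputs, $t0 ends up as (5 + 3) x (10 - 4) = 48, and $t3 is left untouched because the difference went into the temporary $s0.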
Immediate operands
• The ALU instructions we’ve seen so far expect register operands. How do you get data into registers in the first place?
  • Some MIPS instructions allow you to specify a signed constant, or “immediate” value, for the second source instead of a register. For example, here is the immediate add instruction, addi:
    addi $t0, $t1, 4 # $t0 = $t1 + 4
  • Immediate operands can be used in conjunction with the $zero register to write constants into registers:
    addi $t0, $0, 4 # $t0 = 4
• MIPS is still considered a load/store architecture, because arithmetic operands cannot be from arbitrary memory locations. They must either be registers or constants that are embedded in the instruction.
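Because the immediate is a signed constant, the hardware has to sign-extend it before the addition. A minimal Python sketch follows, assuming the 16-bit immediate field of the I-type format mentioned in the summary slides. The function names are hypothetical, and the sketch ignores the overflow trap that a real addi can raise.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def addi(rs_value, imm16):
    """Model of addi: rs + sign-extended immediate, wrapped to 32 bits."""
    return (rs_value + sign_extend16(imm16)) & 0xFFFFFFFF


t0 = addi(0, 4)          # addi $t0, $0, 4    writes the constant 4
t1 = addi(10, 0xFFFF)    # addi $t1, $t1, -1  with $t1 = 10 beforehand
```

The bit pattern 0xFFFF is read as -1, so the second call yields 9 rather than 10 + 65535.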
We need more space!
• Registers are fast and convenient, but we have only 32 of them, and each one is just 32 bits wide.
  • That’s not enough to hold data structures like large arrays.
  • We also can’t access data elements that are wider than 32 bits.
• We need to add some main memory to the system!
  • RAM is cheaper and denser than registers, so we can add lots of it.
  • But memory is also significantly slower, so registers should be used whenever possible.
• In the past, using registers wisely was the programmer’s job.
  • For example, C has a keyword “register” that marks commonly-used variables which should be kept in the register file if possible.
  • However, modern compilers do a pretty good job of using registers intelligently and minimizing RAM accesses.
Memory review
• Memory sizes are specified much like register files; here is a 2^k x n RAM.
• A chip select input CS enables or “disables” the RAM.
• ADRS specifies the memory location to access.
• WR selects between reading from or writing to the memory.
  • To read from memory, WR should be set to 0. OUT will be the n-bit value stored at ADRS.
  • To write to memory, we set WR = 1. DATA is the n-bit value to store in memory.
MIPS memory
• MIPS memory is byte-addressable, which means that each memory address references an 8-bit quantity.
• The MIPS architecture can support up to 32 address lines.
  • This results in a 2^32 x 8 RAM, which would be 4 GB of memory.
  • Not all actual MIPS machines will have this much!
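The 4 GB figure follows directly from the address width, as this short Python calculation shows (the variable names are only for illustration):

```python
address_lines = 32
bytes_total = 2 ** address_lines     # one byte per address (byte-addressable)
gigabytes = bytes_total // 2 ** 30   # 2^30 bytes per GB in this usage
words_total = bytes_total // 4       # number of 32-bit words that fit
```

So 2^32 addresses of 8 bits each give 4 GB, or 2^30 words of 32 bits.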
Loading and storing bytes
• The MIPS instruction set includes dedicated load and store instructions for accessing memory, much like the CA example processor.
• The main difference is that MIPS uses indexed addressing.
  • The address operand specifies a signed constant and a register.
  • These values are added to generate the effective address.
  • General form: lb $t0, const($a0)
• The MIPS “load byte” instruction lb transfers one byte of data from main memory to a register.
  lb $t0, 20($a0) # $t0 = Memory[$a0 + 20]
• The “store byte” instruction sb transfers the lowest byte of data from a register into main memory.
  sb $t0, 20($a0) # Memory[$a0 + 20] = $t0
Indexed addressing and arrays
• Indexed addressing is good for accessing contiguous locations of memory, like arrays or structures.
  • The constant is the base address of the array or structure.
  • The register indicates the element to access.
• For example, if $a0 contains 0, then lb $t0, 2000($a0) reads the first byte of an array starting at address 2000.
• If $a0 contains 8, then the same instruction would access the ninth byte of the array, at address 2008.
• This is why array indices in C and Java start at 0 and not 1.
Arrays and indexed addressing
• You can also reverse the roles of the constant and register. This can be useful if you know exactly which array or structure elements you need.
  • The register could contain the address of the data structure.
  • The constant would then be the index of the desired element.
• For example, if $a0 contains 2000, then lb $t0, 0($a0) accesses the first byte of an array starting at address 2000.
• Changing the constant to 8, as in lb $t0, 8($a0), would reference the ninth byte of the array, at address 2008.
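Both ways of splitting the address between the constant and the register can be tried out in a small Python model of byte loads and stores. The 4096-byte toy memory, the array contents, and the helper names are assumptions made for this sketch. The sign extension in lb reflects how the real instruction treats the loaded byte.

```python
memory = bytearray(4096)                     # small toy RAM
memory[2000:2010] = bytes(range(100, 110))   # 10-byte array at address 2000


def effective_address(reg_value, const):
    # Indexed addressing: register contents plus the signed constant.
    return (reg_value + const) & 0xFFFFFFFF


def lb(reg_value, const):
    byte = memory[effective_address(reg_value, const)]
    return byte - 256 if byte & 0x80 else byte   # lb sign-extends the byte


def sb(value, reg_value, const):
    # Only the lowest byte of the register value is stored.
    memory[effective_address(reg_value, const)] = value & 0xFF


first = lb(0, 2000)      # lb $t0, 2000($a0) with $a0 = 0
ninth = lb(8, 2000)      # same instruction with $a0 = 8
same = lb(2000, 8)       # lb $t0, 8($a0) with $a0 = 2000
sb(0x1FF, 2000, 0)       # sb $t0, 0($a0): stores only the byte 0xFF
stored = lb(2000, 0)     # reads back as -1 after sign extension
```

Both lb $t0, 2000($a0) with $a0 = 8 and lb $t0, 8($a0) with $a0 = 2000 reach address 2008, so ninth and same hold the same value.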
Loading and storing words
• You can also load or store 32-bit quantities (a complete word instead of just a byte) with the lw and sw instructions.
  lw $t0, 20($a0) # $t0 = Memory[$a0 + 20]
  sw $t0, 20($a0) # Memory[$a0 + 20] = $t0
• Most programming languages support several 32-bit data types.
  • Integers
  • Single-precision floating-point numbers
  • Memory addresses, or pointers
• Unless otherwise stated, we’ll assume words are the basic unit of data.
Memory alignment
• Keep in mind that memory is byte-addressable, so a 32-bit word actually occupies four contiguous locations of main memory.
• The MIPS architecture requires words to be aligned in memory; 32-bit words must start at an address that is divisible by 4.
  • 0, 4, 8 and 12 are valid word addresses.
  • 1, 2, 3, 5, 6, 7, 9, 10 and 11 are not valid word addresses.
  • Unaligned memory accesses result in a bus error, which you may have unfortunately seen before.
• This restriction has relatively little effect on high-level languages and compilers, but it makes things easier and faster for the processor.
The array example revisited
• Remember to be careful with memory addresses when accessing words.
• For instance, assume an array of words begins at address 2000.
  • The first array element is at address 2000.
  • The second word is at address 2004, not 2001.
• Revisiting the earlier example, if $a0 contains 2000, then lw $t0, 0($a0) accesses the first word of the array, but lw $t0, 8($a0) would access the third word of the array, at address 2008.
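Word accesses and the alignment rule can be sketched together in Python. Element i of a word array lives at base + 4 x i. The helper names, the array contents, and the big-endian byte order are assumptions for this example (MIPS hardware can be configured for either byte order). The ValueError stands in for the bus error mentioned earlier.

```python
memory = bytearray(4096)   # small toy RAM


def check_aligned(address):
    if address % 4 != 0:   # word addresses must be divisible by 4
        raise ValueError(f"unaligned word address {address}")


def sw(value, base, offset):
    address = base + offset
    check_aligned(address)
    # Big-endian byte order is assumed here for illustration.
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")


def lw(base, offset):
    address = base + offset
    check_aligned(address)
    return int.from_bytes(memory[address:address + 4], "big")


# Store a three-word array starting at address 2000.
for i, value in enumerate([11, 22, 33]):
    sw(value, 2000, 4 * i)

first = lw(2000, 0)    # lw $t0, 0($a0) with $a0 = 2000
third = lw(2000, 8)    # lw $t0, 8($a0): third word, at address 2008
```

An access such as lw(2000, 1) raises the error, since 2001 is not a valid word address.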
Computing with memory
• So, to compute with memory-based data, you must:
  • Load the data from memory to the register file.
  • Do the computation, leaving the result in a register.
  • Store that value back to memory if needed.
• For example, let’s say that an integer array A starts at address 4096. How can we do the following using MIPS assembly language?
  A[2] = A[1] x A[1]
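One possible answer, sketched as a Python simulation with a candidate MIPS instruction in each comment. The register choices ($t0, $s0), the initial array contents, and the big-endian byte order are assumptions for this sketch, not necessarily the solution shown in lecture.

```python
memory = bytearray(8192)   # toy RAM large enough to hold address 4096
A = 4096                   # base address of the integer array A


def lw(address):
    return int.from_bytes(memory[address:address + 4], "big")


def sw(value, address):
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")


# Sample contents A = [3, 7, 0], chosen arbitrarily.
for i, value in enumerate([3, 7, 0]):
    sw(value, A + 4 * i)

t0 = A                        # addi $t0, $0, 4096   # $t0 = address of A[0]
s0 = lw(t0 + 4)               # lw   $s0, 4($t0)     # load: $s0 = A[1]
s0 = (s0 * s0) & 0xFFFFFFFF   # mul  $s0, $s0, $s0   # compute: A[1] x A[1]
sw(s0, t0 + 8)                # sw   $s0, 8($t0)     # store: A[2] = $s0
```

The offsets 4 and 8 are the byte offsets of A[1] and A[2], since each integer occupies four bytes.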
Basic MIPS Summary
• We introduced the MIPS architecture.
  • The MIPS processor has thirty-two 32-bit registers.
  • Three-address, register-to-register instructions are used.
  • Immediates can be used to load or compute with constants.
  • Loads and stores use indexed addressing to access RAM.
  • Memory is byte-addressable, and words must be aligned.
• In section, we’ll begin discussing control flow.
• In the next lecture, we’ll continue with control flow and some other new instructions that will let us write more interesting programs.
More MIPS Summary
• We saw several additional MIPS features.
  • Assemblers can translate more powerful pseudo-instructions into the simpler instructions actually supported in hardware.
  • Branches and jumps help to implement various high-level control flow structures, like conditional statements and loops.
• We also studied MIPS machine language.
  • All instructions are the same length, 32 bits.
  • The three instruction formats are I-type, R-type and J-type.
  • The 16-bit constant field in I-type instructions is enough for most common situations. In other cases, we can always resort to longer code fragments.
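As an illustration of the fixed 32-bit formats, here is a minimal Python sketch that packs the fields of an R-type and an I-type instruction. The field layout and the opcode and funct values for add and addi follow the standard MIPS32 encoding; the function names are hypothetical.

```python
def encode_r(rs, rt, rd, shamt, funct):
    """R-type: opcode(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(opcode, rs, rt, imm):
    """I-type: opcode(6) | rs(5) | rt(5) | 16-bit constant."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


T0, T1, T2 = 8, 9, 10   # register numbers of $t0, $t1, $t2

add_word = encode_r(rs=T1, rt=T2, rd=T0, shamt=0, funct=0x20)  # add $t0, $t1, $t2
addi_word = encode_i(opcode=0x08, rs=T1, rt=T0, imm=4)         # addi $t0, $t1, 4
```

Formatting the results with hex() gives 0x12a4020 for the add and 0x21280004 for the addi. Both fit in one 32-bit word, and the 5-bit register fields match the 32-register file.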
Functions in MIPS Summary
• We focused on implementing function calls in MIPS.
  • We call functions using jal, passing arguments in registers $a0-$a3.
  • Functions place results in $v0-$v1 and return using jr $ra.
• Managing resources is an important part of function calls.
  • To keep important data from being overwritten, registers are saved according to conventions for caller-save and callee-save registers.
  • Each function call uses stack memory for saving registers, storing local variables and passing extra arguments and return values.
• MIPS programmers must follow many conventions. Nothing prevents a rogue program from overwriting registers or stack memory used by some other function.
• In section, we’ll look at writing recursive functions.
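The register-saving convention can be sketched in Python as a callee that saves $ra and $s0 on the stack and restores them before returning. The register values, the 8-byte frame size, and the function names prologue and epilogue are assumptions for this illustration. The comments show the MIPS instructions each step stands for.

```python
# Sample register contents, chosen arbitrarily for this sketch.
regs = {"$sp": 0x7FFFFFF0, "$ra": 0x00400020, "$s0": 111}
stack = {}   # sparse stack memory: word address -> 32-bit value


def prologue():
    regs["$sp"] -= 8                        # addi $sp, $sp, -8
    stack[regs["$sp"] + 4] = regs["$ra"]    # sw   $ra, 4($sp)
    stack[regs["$sp"]] = regs["$s0"]        # sw   $s0, 0($sp)


def epilogue():
    regs["$s0"] = stack[regs["$sp"]]        # lw   $s0, 0($sp)
    regs["$ra"] = stack[regs["$sp"] + 4]    # lw   $ra, 4($sp)
    regs["$sp"] += 8                        # addi $sp, $sp, 8


before = dict(regs)
prologue()
regs["$s0"] = 999           # the callee uses $s0 for its own work
regs["$ra"] = 0x00400100    # a nested jal would overwrite $ra
epilogue()                  # the function would then return with jr $ra
```

After the epilogue, all three registers match their values from before the call, which is what the caller relies on. Nothing enforces this; it only works because the callee follows the convention.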