Week 4 Lecture slides

Cosc 3P92 Week 4 Lecture slides
"It is the mark of an educated mind to be able to entertain a thought without accepting it." (Aristotle)

Computer Instruction Set
• An instruction has two components: an op-code and operand(s)
• e.g. ADD R0 100 (op-code ADD; operands R0 and 100)
• The operand field may have the following formats: 1) zero-address 2) one-address 3) two-address 4) three-address
• The total number of instructions and the types and formats of the operands determine the length of an instruction.

Computer Instruction Set
• The shorter the instruction, the faster it can be fetched and decoded.
• Shorter instructions are better than longer ones:
  • (i) they take up less space in memory
  • (ii) they are transferred to the CPU faster
• A machine with 2^N instructions requires at least N bits to encode all the op-codes.
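The last bullet can be checked with a few lines of Python. This is a minimal sketch; the function name `opcode_bits` is made up for illustration and is not part of the lecture material.

```python
def opcode_bits(instruction_count):
    """Smallest op-code width (in bits) that gives every instruction a unique code."""
    if instruction_count < 1:
        raise ValueError("need at least one instruction")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    return (instruction_count - 1).bit_length()


if __name__ == "__main__":
    for n in (8, 9, 256):
        print(n, "instructions need", opcode_bits(n), "op-code bits")
```

For a count that is not a power of two, the width rounds up, so some bit patterns stay unused.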

Bits/cell (word)

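Instruction sets
• Byte ordering
  • Big-endian: bytes in a word are ordered left to right, so the most significant byte has the lowest address, e.g. Motorola
  • Little-endian: bytes in a word are ordered right to left, so the least significant byte has the lowest address, e.g. Intel
• Mixing the two creates havoc when transferring data; the byte order of transferred words must be swapped.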
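A short Python sketch of the two byte orders and of the swap needed when a word crosses between them. The 32-bit word size and the helper name `swap32` are assumptions chosen for illustration.

```python
def swap32(word):
    """Reverse the byte order of a 32-bit word (big-endian <-> little-endian)."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")


value = 0x12345678
big = value.to_bytes(4, "big")        # most significant byte at the lowest address
little = value.to_bytes(4, "little")  # least significant byte at the lowest address

if __name__ == "__main__":
    print("big-endian bytes:   ", big.hex(" "))
    print("little-endian bytes:", little.hex(" "))
    print("swapped word:       ", hex(swap32(value)))
```

Swapping twice returns the original word, which is why one swap routine serves both transfer directions.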

Big/Little Endian

Op-code Encoding
• 1. Block-code technique
  • To each of the 2^K instructions a unique binary bit pattern of length K is assigned.
  • A K-to-2^K decoder can then be used to decode all the instructions. For example, a 3-to-8 decoder maps a 3-bit op-code to one of eight outputs (instruction 0 to instruction 7).
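The K-to-2^K decoder can be modelled in software as a function that raises exactly one of 2^K output lines. This is only a behavioural sketch of the hardware block; `block_decode` is a hypothetical name.

```python
def block_decode(opcode, k=3):
    """Model a K-to-2^K decoder: return 2^K output lines with exactly one set to 1."""
    lines = 1 << k
    if not 0 <= opcode < lines:
        raise ValueError("op-code does not fit in k bits")
    return [1 if i == opcode else 0 for i in range(lines)]


if __name__ == "__main__":
    # Op-code 101 (binary) selects the line for instruction 5.
    print(block_decode(0b101))
```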

Op-code Encoding
• 2. Expanding op-code technique
  • Consider a 4+12 bit instruction with a 4-bit op-code and three 4-bit addresses.
  • It can encode at most 16 three-address instructions.
  • If there are only 15 such three-address instructions, then the unused op-code can be used to expand to two-address, one-address or zero-address instructions.
  • Again, this expanded op-code can encode at most 16 two-address instructions. If there are fewer than 16 such instructions, we can expand the op-code further.
• Formats shown in the figure:
  • Op-code | Address 1 | Address 2 | Address 3
  • 1111 | Op-code | Address 1 | Address 2
  • 1111 1111 | Op-code | Address 1
  • 1111 1111 1111 | Op-code
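A minimal sketch of how a decoder could classify a 16-bit instruction under this scheme. It assumes that the all-ones pattern 1111 is the escape code at every level, as the figure suggests; the function name and the returned tuple layout are illustrative only.

```python
FORMATS = ["three-address", "two-address", "one-address", "zero-address"]


def decode_expanding(word):
    """Split a 16-bit instruction into (format, op-code, address fields)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit instruction word")
    nibbles = [(word >> shift) & 0xF for shift in (12, 8, 4, 0)]
    escapes = 0
    # Each leading 1111 nibble means "the real op-code is in the next nibble".
    while escapes < 3 and nibbles[escapes] == 0xF:
        escapes += 1
    return FORMATS[escapes], nibbles[escapes], nibbles[escapes + 1:]


if __name__ == "__main__":
    for word in (0x3ABC, 0xF2AB, 0xFF1C, 0xFFF7):
        print(f"{word:04X} ->", decode_expanding(word))
```

Each escape level gives up one op-code to gain a whole new group of instructions with fewer address fields.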

Opcode Encoding
• Note that the three address fields need not be used to encode a three-address operand; they can be used as a single 12-bit one-address operand.
• Some part of the op-code can specify the instruction format and/or length.
• If there are few two-address instructions, we may attempt to make them shorter instead and use the first two bits to indicate the instruction length, e.g., 10 means two-address and 11 means three-address.

Op-code Encoding
• Huffman encoding
  • Given the probability of occurrence of each instruction, it is possible to encode all the instructions with a minimal number of bits, and with the following property:
  • Fewer bits are used for the most frequently used instructions and more for the least frequently used ones.
• Example (from the Huffman tree figure):

  Instruction   Probability   Code
  HALT          1/16          0000
  JUMP          1/16          0001
  SHIFT         1/16          0010
  NOT           1/16          0011
  AND           1/8           010
  ADD           1/8           011
  STO           1/4           10
  LOAD          1/4           11

Opcode encoding, Huffman codes
• Huffman encoding algorithm:
  • 1. Initialize the leaf nodes, each with the probability of one instruction. All nodes are unmarked.
  • 2. Find the two unmarked nodes with the smallest values and mark them. Add a new unmarked node with a value equal to the sum of the chosen two.
  • 3. Repeat step (2) until all nodes have been marked except the last one, which has a value of 1.
  • 4. The encoding for each instruction is found by tracing the path from the unmarked node (the root) to that instruction.
• The two branches at each node may be labelled 0 and 1 arbitrarily.
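The four steps can be sketched in Python with a priority queue standing in for "find the two unmarked nodes with the smallest values". The function name, the tie-breaking counter and the choice of 0 for the left branch are assumptions; the probabilities are the ones from the Huffman encoding example. Because branch labels are arbitrary, the bit patterns may differ from the slide, but the code lengths do not.

```python
import heapq
from fractions import Fraction
from itertools import count


def huffman_codes(probabilities):
    """Return {instruction: bit string} built with the Huffman algorithm."""
    tie = count()  # unique tie-breaker so the heap never compares tree nodes
    heap = [(p, next(tie), name) for name, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Step 2: take the two smallest nodes and replace them by their sum.
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tie), (left, right)))
    codes = {}

    def walk(node, prefix):
        # Step 4: trace the path from the root down to every leaf.
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


slide_probs = {
    "HALT": Fraction(1, 16), "JUMP": Fraction(1, 16),
    "SHIFT": Fraction(1, 16), "NOT": Fraction(1, 16),
    "AND": Fraction(1, 8), "ADD": Fraction(1, 8),
    "STO": Fraction(1, 4), "LOAD": Fraction(1, 4),
}

if __name__ == "__main__":
    codes = huffman_codes(slide_probs)
    for name, bits in codes.items():
        print(f"{name:5} {bits}")
    average = sum(slide_probs[n] * len(b) for n, b in codes.items())
    print("average op-code length:", average, "bits")
```

With these probabilities the average op-code length comes out below the 3 bits that a block code for eight instructions would need.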

Opcode encoding, Huffman codes
• Advantage:
  • minimal number of bits
• Disadvantages:
  • instructions must be decoded bit by bit (can be slow)
  • to decode, there must be a logical representation of the encoded tree, whose branches are followed as the bits are deciphered
• In practice, most op-code decoding is done in parallel on all op-code bits at once
  • This gives a speed advantage over bit-by-bit decoding
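The bit-by-bit decoding cost can be seen in a small sketch that walks a tree built from the code table of the Huffman example (HALT = 0000 through LOAD = 11). The nested-dictionary tree and the function names are illustrative choices, not something the slides prescribe.

```python
SLIDE_CODES = {
    "HALT": "0000", "JUMP": "0001", "SHIFT": "0010", "NOT": "0011",
    "AND": "010", "ADD": "011", "STO": "10", "LOAD": "11",
}


def build_tree(codes):
    """Turn {name: bit string} into nested dicts keyed by '0' and '1'."""
    root = {}
    for name, bits in codes.items():
        node = root
        for bit in bits[:-1]:
            node = node.setdefault(bit, {})
        node[bits[-1]] = name  # a leaf holds the instruction name
    return root


def decode_stream(bits, tree):
    """Decode a string of '0'/'1' characters one bit at a time."""
    decoded, node = [], tree
    for bit in bits:
        node = node[bit]              # follow one branch per bit
        if isinstance(node, str):     # reached a leaf: one op-code recognised
            decoded.append(node)
            node = tree
    if node is not tree:
        raise ValueError("bit stream ends in the middle of an op-code")
    return decoded


if __name__ == "__main__":
    print(decode_stream("110100000", build_tree(SLIDE_CODES)))
```

Each op-code costs as many sequential steps as it has bits, whereas a block-code decoder looks at all bits at once.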

Addressing modes
• inherent
  • an op-code indicates the address of its operand
    CLI ; clear the interrupt flag
• immediate
  • an instruction contains or immediately precedes its operand value
    ADD #250, R1 % R1 := R1 + 250;
• Absolute/Direct
  • an instruction contains the memory address of its operand
    ADD 250, R1 % R1 := R1 + *(250);
• register
  • an instruction contains the register address of its operand
    ADD R2, R1 % R1 := R1 + R2;

Addressing Modes
• register indirect
  • the register address in an instruction specifies the address of its operand
    ADD @R2, @R1 % *R1 := *R1 + *R2;
• auto-decrement or auto-increment
  • The contents of the register are automatically decremented or incremented before or after the execution of the instruction (here by k, the operand size)
    MOV (R2)+, R1 % R1 := *(R2); R2 := R2 + k;
    MOV -(R2), R1 % R2 := R2 - k; R1 := *(R2);
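The modes listed so far differ only in how the operand is located. The sketch below models that on a toy machine whose registers and memory are Python dictionaries; the mode names, the function `fetch_operand` and the default step k = 1 are assumptions made for illustration.

```python
def fetch_operand(mode, field, regs, mem, k=1):
    """Return the operand value selected by an addressing mode.

    field is the value stored in the instruction: a constant, a memory
    address, or a register name, depending on the mode.
    """
    if mode == "immediate":          # ADD #250, R1
        return field
    if mode == "direct":             # ADD 250, R1
        return mem[field]
    if mode == "register":           # ADD R2, R1
        return regs[field]
    if mode == "register_indirect":  # ADD @R2, ...
        return mem[regs[field]]
    if mode == "auto_increment":     # MOV (R2)+, R1: use, then step forward
        value = mem[regs[field]]
        regs[field] += k
        return value
    if mode == "auto_decrement":     # MOV -(R2), R1: step back, then use
        regs[field] -= k
        return mem[regs[field]]
    raise ValueError(f"unknown addressing mode: {mode}")


if __name__ == "__main__":
    regs = {"R1": 0, "R2": 250}
    mem = {249: 7, 250: 42, 251: 9}
    for mode, field in [("immediate", 250), ("direct", 250),
                        ("register", "R2"), ("register_indirect", "R2"),
                        ("auto_increment", "R2"), ("auto_decrement", "R2")]:
        print(f"{mode:18} -> {fetch_operand(mode, field, regs, mem)}  (R2 = {regs['R2']})")
```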

Addressing Modes
• indexed
  • an offset is added to a register to give the address of the operand
    MOV 2(R2), R1 % R1 := R2[2];
• base-register
  • a displacement is added to an implicit or explicit base register to give the address of the operand
• relative
  • same as base-register mode except that the instruction pointer is used as the base register

Addressing modes
• Indirect addressing mode in general also applies to absolute addresses, not just register addresses; the absolute address is a pointer to the operand.
• The offset added to an index register may be as large as the entire address space. On the other hand, the displacement added to a base register is generally much smaller than the entire address space.
• The automatic modification (i.e., auto-increment or auto-decrement) of an index register is called autoindexing.
• Relative addresses have the advantage that the code is position-independent.
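A small sketch of the effective-address arithmetic for indexed and relative modes, including a check of the position-independence claim. All names and the example addresses are hypothetical.

```python
def indexed_address(offset, register_value):
    """Indexed mode: effective address = offset + contents of the register."""
    return offset + register_value


def relative_address(displacement, pc):
    """Relative mode: effective address = instruction pointer + displacement."""
    return pc + displacement


def branch_target(load_address, branch_position, displacement):
    """Target of a relative branch located branch_position bytes into the code."""
    return relative_address(displacement, load_address + branch_position)


if __name__ == "__main__":
    print("MOV 2(R2), R1 with R2 = 1000 reads address", indexed_address(2, 1000))
    # Load the same code at two different places: the target moves with it.
    for base in (0x1000, 0x8000):
        target = branch_target(base, 0x20, 0x10)
        print(f"loaded at {base:#x}: target is {target - base:#x} bytes into the code")
```

The branch lands at the same position inside the code wherever the code is loaded, which is what makes relative addressing position-independent.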

Instruction Types
• Instructions of most modern computers may be classified into the following six groups:
  • Data transfer (40% of user program instructions): MOV, LOAD
  • Arithmetic: ADD, SUB, DIV, MUL
  • Logical: AND, OR, NOT, SHIFT, ROTATE
  • System-control: Test-And-Set
  • I/O: separate I/O space, input/output

Instruction Types
• Program-control
  • may be classified into the following four groups:
  • Unconditional branch
    BRB NEXT % branch to the label NEXT
  • Conditional branch
    SOBGTR R5, LOOP % repeat until R5=0
    ADBLEQ R5, R6, LOOP % repeat until R5>R6
  • Subroutine call
    CALL SUB % push PC; branch to SUB
    RET % pop PC
  • Interrupt-handling
    TRAP % generate an internal interrupt
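The CALL and RET comments (push PC, pop PC) can be sketched with a toy CPU. It assumes that the PC already points at the instruction after the CALL when it is pushed; the class and its methods are hypothetical.

```python
class TinyCPU:
    """Toy model of subroutine linkage with a program counter and a stack."""

    def __init__(self, pc=0):
        self.pc = pc
        self.stack = []

    def call(self, target):
        # CALL SUB: push the return address (current PC), then branch.
        self.stack.append(self.pc)
        self.pc = target

    def ret(self):
        # RET: pop the saved PC so execution resumes after the CALL.
        self.pc = self.stack.pop()


if __name__ == "__main__":
    cpu = TinyCPU(pc=101)   # PC already advanced past the CALL at address 100
    cpu.call(500)
    print("in subroutine, pc =", cpu.pc, "stack =", cpu.stack)
    cpu.ret()
    print("after return, pc =", cpu.pc)
```

Because return addresses go on a stack, nested calls unwind in the reverse order of the calls.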

Instruction types
• Typical branch instructions
  • test the value of some flags called conditions.
  • Certain instructions cause these flags to be set automatically.
• Linkage registers
  • Used in implementing a subroutine.
  • Typically include the instruction pointer and stack pointer.
• The parameters passed between the caller and the called subroutine are established by programming conventions.
  • Very few computers support parameter-passing mechanisms in the hardware.
• An external interrupt may be regarded as a hardware-generated subroutine call
  • Can happen asynchronously.
  • When it occurs, the current state of the computation must be saved either
    • by the hardware automatically
    • or under program (interrupt-service routine) control.

Examples: Intel Pentium X
• backward compatible with 8088 (16 bit, 8 bit data bus), 8086 (16 bit), 80286 (16 bit, larger addr), 80386 (32 bit), ...
• Based on IA-32 (Intel Architecture, 32-bit)
• 3 operating modes:
  1. real mode - acts like 8088 (unsafe -- can crash)
  2. virtual 8086 - protected
  3. protected - acts like Pentium II + 4 privilege levels too (kernel, user, ...)
• little-endian words
• registers: [5.3]
  • EAX, EBX, ECX, EDX - general purpose, but have special uses (e.g. EAX = arithmetic, ...)
  • ESI, EDI, EBP, ESP - addr registers

P2 Registers (figure callouts)
• EAX, EBX, ECX, EDX (E = Extended): designed for backward compatibility with older Intel CPUs; used for arithmetic
• ESI, EDI: pointers to memory; copying and manipulating strings in memory
• Segment registers: for backward compatibility; P4 uses a flat address space
• EBP: points to base of current stack frame, also called the frame pointer
• ESP: stack pointer
• EIP: PC
• EFLAGS: PSW or Flags

Ultra SPARC III
• Single linear 2^64-byte memory space
• Registers:
  • 32 64-bit general regs, 32 FP regs
  • global var regs: used by all procedures
  • register windows: param. passing done via registers (more later on RISC vs CISC)
  • CWP: current window pointer (register set swapping for procedure calls).

Overview of the UltraSPARC III ISA Level (2) Operation of the UltraSPARC III register windows.

8051
• Runs in 1 mode
  • Single process, directly interfacing h/w
  • No OS
• RAM & ROM are (can be) on chip
• 4 sets of 8 GP registers
  • 2 bits in PSW determine which set is active.
  • Interrupts do not save registers on a stack; instead the context is switched to another register set.
• Registers are directly mapped to memory
  • R0 = 0x0000 in memory etc.
• 128 bit-addressable memory locations
  • Bits correspond nicely to switches and LED outputs
• Some specialized registers
  • Interrupts
  • Timers
  • All mapped to memory 128 to 255.
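The mapping of the four register sets onto low memory can be sketched as follows. It assumes the standard 8051 layout in which the bank-select bits RS1 and RS0 are PSW bits 4 and 3 and bank b occupies addresses 8*b to 8*b + 7; the function name is made up.

```python
def register_address(psw, n):
    """On-chip RAM address of register Rn for the bank selected in the PSW."""
    if not 0 <= n <= 7:
        raise ValueError("the 8051 has registers R0 to R7 in each bank")
    bank = (psw >> 3) & 0b11   # RS1:RS0, assumed to be PSW bits 4 and 3
    return bank * 8 + n


if __name__ == "__main__":
    for psw in (0x00, 0x08, 0x10, 0x18):
        print(f"PSW = {psw:#04x}: R0 is at address {register_address(psw, 0):#04x}")
```

Switching banks changes only two PSW bits, which is why an interrupt handler can get a fresh register set without pushing anything on a stack.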

Overview of the 8051 ISA Level (a) On-chip memory organization for the 8051. (b) Major 8051 registers.

Pentium: Instruction formats

Pentium: Instruction formats
• formats are complex, irregular, with variable-sized fields (due to historical evolution)
• no memory-to-memory instructions
• 8088/286 - 1 byte opcode
• 386 - expanding 1111 -> 2 byte opcode
• Some fields:
  • 2-bit MOD - 4 modes, 8 regs, 8 combination regs
  • 3-bit register fields REG, R/M
  • SIB (scale, index, base) array manipulation codes
  • 1, 2, 4 more bytes for operands, constants
• Not all registers, modes applicable to all instructions: highly non-orthogonal
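The MOD, REG and R/M fields share one byte of the instruction. A minimal sketch of splitting it, assuming the usual IA-32 layout (MOD in bits 7-6, REG in bits 5-3, R/M in bits 2-0); the function name is illustrative.

```python
def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    mod = (byte >> 6) & 0b11    # 2-bit addressing mode
    reg = (byte >> 3) & 0b111   # 3-bit register number
    rm = byte & 0b111           # 3-bit register or memory selector
    return mod, reg, rm


if __name__ == "__main__":
    for byte in (0xD8, 0x44):
        print(f"{byte:#04x} -> mod, reg, rm = {split_modrm(byte)}")
```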

Example: PDP-11 instruction formats
• CISC machine
• 16-bit instruction size
  • possibly 1 or 2 16-bit address words follow
• 8 modes, 8 regs -- regs 6 & 7 are stack pointer, PC
• "orthogonal" addressing -- addressing and opcodes are independent.
• some instructions use expanding opcode
  • x111 -> use longer opcode

Example: UltraSPARC inst. formats • 32-bit instructions; 31 RISC instructions • first 2 bits help decode instruction format • to encode a 32 bit constant, need to do it in 2 separate instructions!
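The two-instruction constant load can be illustrated by splitting the constant the way a SETHI/OR pair would: the upper 22 bits go in the first instruction and the lower 10 bits in the second. This is a sketch of the arithmetic only, not of the instruction encodings, and the function names are hypothetical.

```python
def split_constant(value):
    """Split a 32-bit constant into (upper 22 bits, lower 10 bits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit unsigned constant")
    hi22 = (value >> 10) & 0x3FFFFF   # part carried by the first instruction
    lo10 = value & 0x3FF              # part OR-ed in by the second instruction
    return hi22, lo10


def rebuild_constant(hi22, lo10):
    """What the register holds after both instructions have executed."""
    return (hi22 << 10) | lo10


if __name__ == "__main__":
    hi22, lo10 = split_constant(0x12345678)
    print(f"hi22 = {hi22:#x}, lo10 = {lo10:#x}")
    print(f"rebuilt = {rebuild_constant(hi22, lo10):#x}")
```

A fixed 32-bit instruction cannot hold a 32-bit constant plus an op-code, so the split is a direct consequence of the format.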

Example: Pentium addressing • 8088/286 are very non-orthogonal, and addressing possibilities are arbitrary for different registers

Example: Pentium Addressing
• 386 -- if 16-bit segments are used, the previous scheme applies; if 32-bit segments are used, the following one applies...

Addressing: Pentium • new modes are more regular, general

Addressing: Pentium
• SIB mechanism: [5.27] --> arrays
  • scale = 1, 2, 4, 8
  • multiply the Index register by the scale
  • add the Base register
  • then add an 8- or 32-bit displacement
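The SIB computation amounts to base + index * scale + displacement. The sketch below assumes 32-bit wrap-around and uses made-up register values; `sib_address` is a hypothetical name.

```python
def sib_address(base, index, scale, displacement=0):
    """Effective address for scale-index-base addressing, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


if __name__ == "__main__":
    # Element 3 of an array of 4-byte items that starts 8 bytes past the base.
    print(hex(sib_address(base=0x1000, index=3, scale=4, displacement=8)))
```

The scale matches common element sizes, so one instruction can address element i of an array of 1-, 2-, 4- or 8-byte items.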

Examples of addressing PDP-11 [5.33]
• power comes from ability of addressing modes to treat stack ptr, PC like any other registers
• e.g. mode 6 with PC (reg 7)

Addressing & PDP 11 • orthogonality permits many variations with one opcode

Example: UltraSPARC addressing
• all instructions use immediate or register addressing, except those that address memory
• only 3 instructions address memory: Load, Store, and a multiprocessor synchronization instruction
  • these use indirect addressing
• register: 5 bits tell which register
• 13-bit constants for immediate

Example: 8051 addressing
• 5 modes
  • Inherent (Accumulator)
  • Register
  • Direct
  • Register Indirect
  • Immediate

Discussion of Addressing Modes
• Figure: A comparison of addressing modes.
• Callouts on the figure:
  • Can be used due to limited size of the CPU
  • CPU is not designed for processing arrays; main purpose is I/O and control
  • Not used, since direct addressing is a slow memory reference

Addressing: Discussion
• PDP-11 is clean, simple; some waste
• Pentium: specialized formats, addressing schemes
  • 386 - 32 bit addressing is more general
• RISC (Ultra): simpler instructions, fewer modes
• Compilers will generate required addressing, so a simple scheme will suffice
• Specialized modes and formats make instruction parallelism (pipelining) more difficult

Addressing: Discussion
• Compact instructions
  • Advantages
    • smaller resource usage
    • faster fetch, execution
  • Disadvantages
    • reduced robustness
• Larger instructions
  • Advantages
    • simpler formats
    • less constrained
  • Disadvantages
    • performance
    • waste

The end
