CS/COE0447 Computer Organization & Assembly Language: Chapter 2 Part 2
Continuing from Chapter 2 Part 1. Starts with modified versions of Part 1’s last 5 slides.
But first: errors on green card p. 1 (listed entry → corrected entry)
• lbu I 0/24hex → lbu I 24hex
• lhu I 0/25hex → lhu I 25hex
• lw I 0/23hex → lw I 23hex
• E.g.: machine code for lbu $8,2($7)
• sll and srl: replace “rs” with “rt” (example later)
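The machine code for lbu $8,2($7) can be worked out by packing the I-format fields. The following is a minimal Python sketch, not part of the original slides; the helper name encode_i_format is hypothetical, and the opcode 0x24 is the corrected green card value.

```python
def encode_i_format(opcode, rs, rt, imm):
    """Pack the four I-format fields (6 + 5 + 5 + 16 bits) into one 32-bit word."""
    assert 0 <= opcode < 64 and 0 <= rs < 32 and 0 <= rt < 32
    imm16 = imm & 0xFFFF  # keep the low 16 bits (two's complement for negatives)
    return (opcode << 26) | (rs << 21) | (rt << 16) | imm16

# lbu $8, 2($7): opcode 0x24, base register rs = 7, target rt = 8, offset 2
word = encode_i_format(0x24, 7, 8, 2)
print(f"{word:#010x}")  # 0x90e80002
print(f"{word:032b}")   # the same word, bit by bit
```

Reading the binary output left to right gives the op, rs, rt, and immediate fields in that order.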
ANDI and ORI
• Green card entries
    lui   I   R[rt] = {immediate, 16’b0}
    andi  I   R[rt] = R[rs] & ZeroExtImm   (3)
    (3) ZeroExtImm = {16{1’b0}, immediate}
• In Verilog:
    16'h704f   // a 16-bit hex number
    1'b0       // a 1-bit binary number
• Example
    lui  $t1, 0x7F40
    addi $t2, $t1, 0x777
    andi $t3, $t2, 0x5555
• Above and ori version in lecture
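To trace the lui/addi/andi example by hand, it can help to model each instruction as a small function. The sketch below is an illustration in Python under simplifying assumptions: registers are plain 32-bit values, addi ignores the overflow trap, and the function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF

def zero_ext(imm16):
    """ZeroExtImm: the upper 16 bits are filled with zeros."""
    return imm16 & 0xFFFF

def sign_ext(imm16):
    """SignExtImm: the upper 16 bits copy bit 15 of the immediate."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def lui(imm16):
    return (imm16 & 0xFFFF) << 16

def addi(rs, imm16):
    return (rs + sign_ext(imm16)) & MASK32  # overflow trap not modeled

def andi(rs, imm16):
    return rs & zero_ext(imm16)

def ori(rs, imm16):
    return rs | zero_ext(imm16)

t1 = lui(0x7F40)       # 0x7f400000
t2 = addi(t1, 0x777)   # 0x7f400777
t3 = andi(t2, 0x5555)  # 0x00000555
t4 = ori(t2, 0x5555)   # 0x7f405777 (the ori version)
for name, value in (("$t1", t1), ("$t2", t2), ("$t3", t3), ("ori", t4)):
    print(name, f"{value:#010x}")
```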
Long Immediates (e.g., memory addresses!)
• Sometimes we need a long immediate, e.g., 32 bits
• MIPS requires that we use two instructions
    lui $t0, 0xaa55         # $t0 = 1010101001010101 0000000000000000
    ori $t0, $t0, 0xcc33    # $t0 = 1010101001010101 1100110000110011
• Note: this wouldn’t work if ori used SignExtImm rather than ZeroExtImm! (“or” with 0 == copy; like adding 0 in arithmetic)
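The note about ZeroExtImm can be checked with a short experiment. This Python sketch (an illustration only; load_32bit is a hypothetical helper) builds the constant once with zero extension, as real ori does, and once with sign extension to show what would go wrong.

```python
def zero_ext(imm16):
    return imm16 & 0xFFFF

def sign_ext(imm16):
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def load_32bit(upper16, lower16, extend):
    """Model lui followed by ori, using the given extension rule for ori."""
    reg = (upper16 & 0xFFFF) << 16               # lui: upper half, low half zero
    return (reg | extend(lower16)) & 0xFFFFFFFF  # ori: OR in the low half

good = load_32bit(0xAA55, 0xCC33, zero_ext)
bad = load_32bit(0xAA55, 0xCC33, sign_ext)
print(f"{good:#010x}")  # 0xaa55cc33
print(f"{bad:#010x}")   # 0xffffcc33: the upper half was overwritten by sign bits
```

Because 0xcc33 has bit 15 set, sign extension would OR sixteen 1s into the upper half and destroy the value placed there by lui.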
Loading a memory address
• .data places values in memory starting at 0x10010000. 32 bits are needed to specify memory addresses.
• 1 instruction is impossible: the address would take up the entire instruction!! (no room for the opcode!!)
• la $t0, 0x10010008 is a pseudo instruction: not implemented in the hardware
• The assembler expands it into two real instructions:
    la $t0, 0x10010008   →   lui $1, 4097     # 4097 = 0x1001
                             ori $8, $1, 8    # $8 is $t0
    lw $t1, 0($t0)       # look at green card to understand this addrmode
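The two operands in the expansion of la come from splitting the 32-bit address into halves. A minimal Python sketch of that split follows; the function name expand_la is hypothetical and only illustrates the arithmetic.

```python
def expand_la(address):
    """Split a 32-bit address into the lui and ori immediates."""
    upper = (address >> 16) & 0xFFFF  # goes into lui
    lower = address & 0xFFFF          # goes into ori
    return upper, lower

upper, lower = expand_la(0x10010008)
print(upper, lower)                    # 4097 8
print(f"{(upper << 16) | lower:#x}")   # 0x10010008, the original address
```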
A program • Get sample1.asm from the schedule • Load it into the simulator • Figure out the memory contents, labels • Trace through the code
        .data                    # sample1.asm
a:      .word 3,4
c:      .word 5,6
        .text
        la   $t0,c               # address of c
        la   $t1,k               # address of k
        lw   $s0,0($t0)          # load c[0]
        lw   $s1,4($t1)          # load k[1]
        slt  $s3,$s0,$s1         # if c[0] < k[1], $s3 = 1, else $s3 = 0
        beq  $s3,$0,notless      # if c[0] < k[1] swap their values
        sw   $s0,4($t1)
        sw   $s1,0($t0)
notless:
        .data
k:      .word 0xf,0x11,0x12
Quick Exercise
• From A-57:
  • load immediate
  • li rdest, imm    e.g., li $t0,0xffffffff
  • “Move the immediate imm into register rdest”
  • [nothing is said about sign or 0 extension]
• What type of instruction is this? E.g., is this an R-format instruction? Perhaps an I-format one? … Please explain.
Quick Exercise Answer • In class
Memory Transfer Instructions
• To load/store a word from/to memory:
  • LOAD: move data from memory to register
      lw $t3, 4($t2)     # $t3 ← M[$t2 + 4]
  • STORE: move data from register to memory
      sw $t4, 16($t1)    # M[$t1 + 16] ← $t4
• Support for other data types than 32-bit word is needed
  • 16-bit half-word
    • “short” type in C
    • 16-bit processing is common in signal processing
    • lhu and sh in MIPS
  • 8-bit byte
    • “char” type in C
    • 8-bit processing is common in controller applications
    • lbu and sb
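Machine Code Example
    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }
$a0: pointer to array; $a1: k
    swap: sll $t0, $a1, 2      # $t0 = k * 4 (byte offset of v[k])
          add $t1, $a0, $t0    # $t1 = address of v[k]
          lw  $t3, 0($t1)      # $t3 = v[k]
          lw  $t4, 4($t1)      # $t4 = v[k+1]
          sw  $t4, 0($t1)      # v[k] = old v[k+1]
          sw  $t3, 4($t1)      # v[k+1] = old v[k]
          jr  $ra              # return to caller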
Memory View
• Viewed as a large, single-dimensional 8-bit array with an address (“byte address”)
• A memory address is an index into the array
[Figure: memory drawn as a column of bytes: BYTE 0, BYTE 1, BYTE 2, BYTE 3, BYTE 4, BYTE 5, …]
• Addresses Specify Byte Locations
  • Address of first byte
• Addresses are aligned: addresses are multiples of X, where X is the number of bytes.
  • word (4 bytes) addresses are multiples of 4; half-word (2 bytes) addresses are multiples of 2
• In Class: match up with format of memory shown by simulator
[Figure: byte addresses 0000 through 000F shown in three columns (32-bit words, half words, bytes); the address of each of the four 32-bit words is left as a question: “Addr = ??”]
Example of Memory Allocation
        .data
b2:     .byte 2,3,4
        .align 2
        .word 5,6,7
        .text
        la   $t0,b2
        lbu  $t2,0($t0)    # $t2 = 0x02
        lbu  $t2,1($t0)    # $t2 = 0x03
        lbu  $t2,3($t0)    # $t2 = 0x00 (nothing was stored there)
        lbu  $t2,4($t0)    # $t2 = 0x00 (top byte of the 5 word)
        lbu  $t2,7($t0)    # $t2 = 0x05 (bottom byte of the 5 word)
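The byte values in the comments above can be reproduced by laying the data segment out by hand. This Python sketch is an illustration under two assumptions: the segment starts at a word-aligned address, and words are stored big-endian, which is what the comments on the lbu lines presume. The function name layout_data is hypothetical.

```python
import struct

def layout_data(byte_order=">"):
    """Lay out .byte 2,3,4 / .align 2 / .word 5,6,7 as a flat byte array."""
    mem = bytearray([2, 3, 4])      # .byte 2,3,4
    while len(mem) % 4 != 0:        # .align 2: pad to a multiple of 2**2 bytes
        mem.append(0)
    for word in (5, 6, 7):          # .word 5,6,7
        mem += struct.pack(byte_order + "I", word)
    return mem

mem = layout_data()
for offset in (0, 1, 3, 4, 7):
    print(offset, f"{mem[offset]:#04x}")
```

Calling layout_data("<") models a little-endian machine instead; in that case offset 4 holds 0x05 and offset 7 holds 0x00.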
Byte Ordering
• How should bytes within multi-byte words be ordered in memory?
• Conventions
  • “Big Endian” machines (including MIPS machines as treated here; MIPS hardware can be configured for either byte order)
    • Least significant byte has highest address
    • Example: .data / .word 3 (0x00000003; 03 is the least sig. byte): 03 is in 10010003
  • “Little Endian” machines
    • Least significant byte has lowest address
    • 03 would be in 10010000
Byte Ordering Example
• Big Endian
  • Least significant byte has highest address
• Little Endian
  • Least significant byte has lowest address
• Example
  • Suppose variable x has 4-byte representation 0x01234567
  • Suppose address of x is 0x100

    Address          0x100   0x101   0x102   0x103
    Big Endian        01      23      45      67
    Little Endian     67      45      23      01
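The two layouts of x can be generated mechanically. The following Python sketch is only an illustration; byte_layout is a hypothetical helper that maps each byte address to the byte stored there.

```python
def byte_layout(value, base, order):
    """Return {address: byte} for a 4-byte value stored at `base`."""
    data = value.to_bytes(4, order)  # order is "big" or "little"
    return {base + i: b for i, b in enumerate(data)}

for order in ("big", "little"):
    layout = byte_layout(0x01234567, 0x100, order)
    print(order, " ".join(f"{addr:#x}:{b:02x}" for addr, b in layout.items()))
```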
Memory Organization
• 2^32 bytes with byte addresses from 0 to 2^32 - 1
• 2^30 words with byte addresses 0, 4, 8, …, 2^32 - 4
• 2^31 half-words with byte addresses 0, 2, 4, …, 2^32 - 2
• Suppose addresses were 5 bits.
  • Bytes: 0…31
  • Words: 0,4,8,12,16,20,24,28, i.e., 0,4,…,2^5 - 4
  • Half-words: 0,2,4,…,28,30, i.e., 0,2,…,2^5 - 2
• Unsigned numbers in n bits: 0 to 2^n - 1
  • E.g.: 3 bits: 0 to 7; 4 bits: 0 to 15; 5 bits: 0 to 31
[Figure: memory drawn as a column of words at byte addresses 0, 4, 8, 12, 16, 20, …]
Misalignment Example
• Misaligned accesses (errors!)
  • lw $t1, 3($zero)
  • lhu $t1, 1($zero)
• Alignment issue does not exist for byte accesses
[Figure: byte addresses 0 through 11 grouped into words starting at 0, 4, 8, …]
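The alignment rule amounts to one remainder check: the effective address must be a multiple of the access size. A minimal Python sketch, assuming a hypothetical table of access sizes per mnemonic:

```python
# Access size in bytes for each load/store mnemonic (illustrative table)
ACCESS_SIZE = {"lw": 4, "sw": 4, "lh": 2, "lhu": 2, "sh": 2,
               "lb": 1, "lbu": 1, "sb": 1}

def is_aligned(op, base, offset):
    """True if the effective address base + offset is a multiple of the access size."""
    return (base + offset) % ACCESS_SIZE[op] == 0

print(is_aligned("lw", 0, 3))    # False: 3 is not a multiple of 4
print(is_aligned("lhu", 0, 1))   # False: 1 is not a multiple of 2
print(is_aligned("lbu", 0, 1))   # True: every address is a multiple of 1
```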
Shift Instructions
• Bit-wise logic operations
• <op> <rdestination> <rsource> <shamt>
• Examples
    sll $t0, $s0, 4    # $t0 = $s0 << 4
    srl $s0, $t0, 2    # $s0 = $t0 >> 2
• These are the only shift instructions in the core instruction set
• Green card is wrong for sll and srl; “rs” should be “rt”
Shift Instructions
• Variations in the MIPS-32 instruction set:
  • Shift amount can be in a register (“shamt” field not used)
    • sllv, srlv, srav
  • Shift right arithmetic (SRA) keeps the sign of a number
    • sra $s0, $t0, 4
• Pseudo instructions:
  • Rotate right/left: ror, rol
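        .text
        li   $t0,0x77
        li   $t1,-8
        li   $t2,3
        sll  $t3,$t0,3       # $t3 = 0x000003b8
        sllv $t3,$t0,$t2     # same result; shift amount taken from $t2
        srl  $t3,$t0,2       # $t3 = 0x0000001d
        srl  $t3,$t1,2       # $t3 = 0x3ffffffe (zeros shifted in; sign lost)
        sra  $t3,$t1,2       # $t3 = 0xfffffffe (-2; sign kept)
        li   $s0,0x00000002
        li   $s1,1
        ror  $s0,$s0,$s1     # $s0 = 0x00000001
        ror  $s0,$s0,$s1     # $s0 = 0x80000000
        ror  $s0,$s0,$s1     # $s0 = 0x40000000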
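The difference between srl, sra, and ror is easy to see when each is modeled on 32-bit values. The Python sketch below is an illustration, not how the hardware is built; the function names just mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF

def sll(x, n):
    return (x << n) & MASK32

def srl(x, n):
    """Logical right shift: zeros enter from the left."""
    return (x & MASK32) >> n

def sra(x, n):
    """Arithmetic right shift: copies of the sign bit enter from the left."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32          # reinterpret as a negative Python int
    return (x >> n) & MASK32  # Python's >> on negatives is arithmetic

def ror(x, n):
    """Rotate right: bits shifted out on the right re-enter on the left."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

print(f"{sll(0x77, 3):#010x}")  # 0x000003b8
print(f"{srl(-8, 2):#010x}")    # 0x3ffffffe
print(f"{sra(-8, 2):#010x}")    # 0xfffffffe
s0 = 0x2
for _ in range(3):
    s0 = ror(s0, 1)
    print(f"{s0:#010x}")        # 0x00000001, 0x80000000, 0x40000000
```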
Control
• Decision-making instructions
  • Alter the control flow, i.e., change the “next” instruction to be executed
• MIPS conditional branch instructions
  • bne $t0, $t1, LABEL
  • beq $t0, $t1, LABEL
• Example (assuming i, j, and h are in $s0, $s1, and $s3)
    if (i == j) h = i + j;

           bne $s0, $s1, LABEL
           add $s3, $s0, $s1
    LABEL: …
[Figure: flowchart: (i == j)? YES → h = i + j, then LABEL; NO → LABEL]
Control, cont’d
• MIPS unconditional branch instruction (jump)
  • j LABEL
• Example
  • f, g, and h are in registers $s3, $s4, and $s5
    if (g == h) f = g + h;
    else f = g - h;

          bne $s4, $s5, ELSE
          add $s3, $s4, $s5
          j   EXIT
    ELSE: sub $s3, $s4, $s5
    EXIT: …
[Figure: flowchart: (g == h)? YES → f = g + h; NO → f = g - h; both paths join at EXIT]
# while (save[i] == k)
#     i += 1
        .data
save:   .word 5,5,5,5,7,6,6,7
        .text
        li   $s3,1               # $s3 is i; suppose i is 1
        la   $s6,save            # $s6 is base address of array
        li   $s5,5               # $s5 is k; suppose k is 5
loop:   sll  $t1,$s3,2
        add  $t1,$t1,$s6         # $t1 points to save[i]
        lw   $t0,0($t1)          # $t0 = save[i]
        bne  $t0,$s5,exitloop
        addi $s3,$s3,1           # i += 1
        j    loop
exitloop:
        add  $s6,$s3,$zero
# sum = 0
# for (i = 0; i < n; i++)
#     sum += i
        addi $s0,$zero,0         # $s0 sum = 0
        addi $s1,$zero,5         # $s1 n = 5; arbitrary value
        addi $t0,$zero,0         # $t0 i = 0
loop:   slt  $t1,$t0,$s1         # i < n?
        beq  $t1,$zero,exitloop  # if not, exit
        add  $s0,$s0,$t0         # sum += i
        addi $t0,$t0,1           # i++
        j    loop
exitloop:
        add  $v0,$zero,$s0       # $v0 has the sum
# sum = 0
# for (i = 0; i < n; i++)
#     sum += i
        addi $s0,$zero,0         # $s0 sum = 0
        addi $s1,$zero,5         # $s1 n = 5; arbitrary value
        addi $t0,$zero,0         # $t0 i = 0
loop:   bge  $t0,$s1,exitloop    # exit unless i < n; PSEUDO OP!
        add  $s0,$s0,$t0         # sum += i
        addi $t0,$t0,1           # i++
        j    loop
exitloop:
        add  $v0,$zero,$s0       # $v0 has the sum
Instruction Format
• Address in the instruction is not a 32-bit number: it’s only 16-bit
  • The 16-bit immediate value is in signed, 2’s complement form
• Addressing in branch instructions
  • The 16-bit number in the instruction specifies the number of “instructions” to skip
  • Memory address is obtained by adding this number to the PC
  • Next address = PC + 4 + (sign_extend(16-bit immediate) << 2)
• Example
  • beq $s1, $s2, 100
  • Fields: op = 4 | rs = 17 | rt = 18 | immediate = 25
    0x00400024            bne  $t0,$s5,exitloop
    0x00400028            addi $s3,$s3,1
    0x0040002c            j    loop
    0x00400030  exitloop: add  $s6,$s3,$zero
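Using the addresses listed above, the immediate stored in the bne can be computed from the branch formula. The Python sketch below is an illustration; the helper names branch_offset and branch_target are hypothetical.

```python
def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed (two's complement) integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def branch_target(pc, imm16):
    """Next address = PC + 4 + (sign_extend(immediate) << 2)."""
    return (pc + 4 + (sign_ext16(imm16) << 2)) & 0xFFFFFFFF

def branch_offset(pc, target):
    """The 16-bit field the assembler stores for a branch at `pc` to `target`."""
    diff = target - (pc + 4)
    assert diff % 4 == 0, "branch targets are word aligned"
    return (diff >> 2) & 0xFFFF

imm = branch_offset(0x00400024, 0x00400030)
print(imm)                                      # 2: skip two instructions
print(f"{branch_target(0x00400024, imm):#x}")   # 0x400030
```

The offset counts instructions after the one following the branch, so skipping addi and j gives 2.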
Instruction Format, cont’d
• The address of next instruction is obtained by concatenating with PC
  • Next address = {PC[31:28], IMM[25:0], 00}
  • Address boundaries of 256MB
• Example
  • j 10000
  • Fields: op = 2 | 26-bit address = 2500
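The concatenation for jumps can be sketched the same way. This Python illustration follows the formula on the slide, taking the top four bits from the PC value passed in; encode_j and jump_target are hypothetical helper names.

```python
def encode_j(target):
    """Machine word for `j target`: opcode 2 plus the 26-bit word address."""
    return (2 << 26) | ((target >> 2) & 0x3FFFFFF)

def jump_target(pc, addr26):
    """Next address = {PC[31:28], addr26, 00}."""
    return (pc & 0xF0000000) | ((addr26 & 0x3FFFFFF) << 2)

word = encode_j(10000)
print(word & 0x3FFFFFF)                               # 2500, the address field
print(f"{word:#010x}")                                # 0x080009c4
print(jump_target(0x00400000, word & 0x3FFFFFF))      # 10000
```

Only 28 bits of the target come from the instruction, which is why a jump cannot leave the current 256MB region.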
MIPS Addressing Modes
• Immediate addressing (I-format)
• Register addressing (R-/I-format)
• Base addressing (load/store) [register + offset]
• PC-relative addressing (beq/bne) [PC + 4 + offset]
• Pseudo-direct addressing (j) [concatenation w/ PC]
[Figure: field layouts for each mode: op | rs | rt | immediate; op | rs | rt | rd | shamt | funct; op | rs | rt | offset; op | 26-bit address]
Stack and Frame Pointers
• Stack pointer ($sp)
  • Keeps the address to the top of the stack
  • $29 is reserved for this purpose
  • Stack grows from high address to low
  • Typical stack operations are push/pop
• Procedure frame
  • Contains saved registers and local variables
  • “Activation record”
• Frame pointer ($fp)
  • Points to the first word of a frame
  • Offers a stable reference pointer
  • $30 is reserved for this
  • Some compilers don’t use $fp
“C Program” Down to “Numbers”
    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }
        ↓ compiler
    swap: muli $2, $5, 4
          add  $2, $4, $2
          lw   $15, 0($2)
          lw   $16, 4($2)
          sw   $16, 0($2)
          sw   $15, 4($2)
          jr   $31
        ↓ assembler
    00000000101000010…
    00000000000110000…
    10001100011000100…
    10001100111100100…
    10101100111100100…
    10101100011000100…
    00000011111000000…
To Produce an Executable
• Each source file (.asm/.s) → assembler → object file (.obj/.o)
• Object files (.obj/.o) + library (.lib/.a) → linker → executable (.exe)
An Assembler
• Expands macros
  • Macro is a sequence of operations conveniently defined by a user
  • A single macro can expand to many instructions
• Determines addresses and translates source into binary numbers
  • Start from address 0
  • Record in “symbol table” addresses of labels
  • Resolve branch targets and complete branch instructions’ encoding
  • Record instructions that need to be fixed after linkage
• Packs everything in an object file
• “Two-pass assembler”
  • To handle forward references
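The two passes can be illustrated on a tiny scale. The Python sketch below is a toy under stated assumptions, not a real assembler: every line is one 4-byte instruction (no pseudo instructions), labels end with a colon, and only beq/bne are resolved. The function names are hypothetical. Pass 1 records label addresses; pass 2 uses them to fill in branch offsets, which is what makes the forward reference to exitloop work.

```python
program = [
    "loop: sll $t1,$s3,2",
    "add $t1,$t1,$s6",
    "lw $t0,0($t1)",
    "bne $t0,$s5,exitloop",   # forward reference: exitloop is defined later
    "addi $s3,$s3,1",
    "j loop",
    "exitloop: add $s6,$s3,$zero",
]

def split_label(line):
    """Strip comments; return (label or None, instruction text)."""
    line = line.split("#")[0].strip()
    if ":" in line:
        label, rest = line.split(":", 1)
        return label.strip(), rest.strip()
    return None, line

def build_symbol_table(lines, base=0):
    """Pass 1: assign an address to every label."""
    table, addr = {}, base
    for line in lines:
        label, instr = split_label(line)
        if label is not None:
            table[label] = addr
        if instr:
            addr += 4
    return table

def resolve_branches(lines, table, base=0):
    """Pass 2: compute the word offset for each beq/bne."""
    fixes, addr = [], base
    for line in lines:
        _, instr = split_label(line)
        if not instr:
            continue
        mnemonic, _, operands = instr.partition(" ")
        if mnemonic in ("beq", "bne"):
            target = table[operands.split(",")[-1].strip()]
            fixes.append((addr, mnemonic, (target - (addr + 4)) // 4))
        addr += 4
    return fixes

symbols = build_symbol_table(program)
print(symbols)                              # {'loop': 0, 'exitloop': 24}
print(resolve_branches(program, symbols))   # [(12, 'bne', 2)]
```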
An Object File
• Header
  • Size and position of other pieces of the file
• Text segment
  • Machine codes
• Data segment
  • Binary representation of the data in the source
• Relocation information
  • Identifies instructions and data words that depend on absolute addresses
• Symbol table
  • Keeps addresses of global labels
  • Lists unresolved references
• Debugging information
  • Contains a concise description of the way in which the program was compiled
Assembler Directives
• Guide the assembler in handling the code or data that follows
• .text
  • Tells assembler that code follows
• .data
  • Tells assembler that data follow
• .align
  • Directs aligning the following items
• .globl
  • Tells assembler to treat the following symbol as global
• .asciiz
  • Tells assembler to handle the following as a “string”
Code Example
        .text
        .align 2
        .globl main
main:   subu $sp, $sp, 32
        …
loop:   lw   $t6, 28($sp)
        …
        la   $a0, str
        lw   $a1, 24($sp)
        jal  printf
        …
        jr   $ra
        .data
        .align 0
str:    .asciiz "The sum from 0 … 100 is %d\n"
Macro Example
         .data
int_str: .asciiz "%d"
         .text
         .macro print_int($arg)
         la   $a0, int_str
         move $a1, $arg
         jal  printf
         .end_macro
         …
         print_int($7)
The call print_int($7) expands to:
         la   $a0, int_str
         move $a1, $7
         jal  printf
Procedure Calls
• Argument passing
  • First 4 arguments are passed through $a0~$a3
  • More arguments are passed through stack
• Result passing
  • First 2 results are passed through $v0~$v1
  • More results can be passed through stack
• Stack manipulations can be tricky and error-prone
  • More will be discussed in recitations
Compiler Structure
• A compiler is a piece of software with a variety of functions implemented inside it
• Front-end
  • Deals with high-level language constructs and translates them into a more relevant tree-like or list-format internal representation (IR)
  • Symbols (e.g., a[i+8*j]) are still available
• Back-end
  • Back-end IR more or less corresponds to machine instructions
  • With more steps, IR becomes machine instructions
Compiler Structure, cont’d
• Front-end
  • Scanning
    • Takes the input source program and chops the program into recognizable “tokens”
    • Also known as “lexical analysis”; the “LEX” tool is used
  • Parsing
    • Takes the token stream, checks the syntax, and produces abstract syntax trees
    • “YACC” or “BISON” tools are used
  • Semantic analysis
    • Takes the abstract syntax trees, performs type checking, and builds a symbol table
  • IR generation
    • Similar to assembly language program, but assumes unlimited registers, etc.
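As a rough illustration of the scanning step only, the Python sketch below chops a C-like statement into tokens with regular expressions. It is a toy, not how LEX works internally; the token categories and the keyword list are assumptions chosen for this example.

```python
import re

# Token categories tried in order (illustrative, far from complete C)
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"==|\+=|[-+*/=<>]"),
    ("PUNCT",  r"[()\[\];{}]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"while", "if", "else", "for", "int", "void"}

def scan(source):
    """Return a list of (kind, text) tokens; whitespace is dropped."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens

for token in scan("while (save[i] == k) i += 1;"):
    print(token)
```

A parser would then consume this token stream to check syntax and build the abstract syntax tree.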
Compiler Structure, cont’d
• Back-end
  • Local optimization
    • Optimizations within a basic block
    • Common Sub-expression Elimination (CSE), copy propagation, constant propagation, dead code elimination, …
  • Global optimization
    • Optimizations that deal with multiple basic blocks
  • Loop optimizations
    • Loops are so important that many compiler and architecture optimizations target them
    • Induction variable removal, loop invariant removal, strength reduction, …
  • Register allocation
    • Try to keep as many variables in registers as possible
  • Machine-dependent optimizations
    • Utilize any useful instructions provided by the target machine
Code Example (AST) while (save[i] == k) i+=1;