Introduction to ISAs and Assembly Language
Instruction Set Architecture Classification According to the type of internal storage in a processor the basic types are • Stack • Accumulator • General Purpose register • Register – Memory • Register – Register/ Load – Store • Memory – Memory (obsolete) • Extended accumulator/extended general purpose register
Stack • A Top-of-Stack (TOS) register points to the top input operand, which is combined with the operand below • The first operand is removed from the stack, and the result is stored in the place of the second operand, while TOS is updated
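The stack rule above can be sketched in Python. This is only an illustrative model, not a real ISA: the `StackMachine` class and its `push`/`add` methods are hypothetical names, and a Python list stands in for the hardware stack, with the end of the list playing the role of TOS.

```python
class StackMachine:
    """Toy model of a stack ISA; all names here are illustrative."""

    def __init__(self):
        self.stack = []              # the end of the list is the top of stack (TOS)

    def push(self, value):
        self.stack.append(value)

    def add(self):
        # Remove the top operand, combine it with the operand below,
        # and store the result in the place of the second operand.
        top = self.stack.pop()
        self.stack[-1] = self.stack[-1] + top


m = StackMachine()
m.push(2)
m.push(3)
m.add()
print(m.stack)   # [5]
```

After `add`, the stack is one entry shorter, which is the "TOS is updated" step in the slide.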
Accumulator • One operand is implicitly in the accumulator, the other is in memory, and the result is stored in the accumulator
Register - Memory • One operand is in a register, the other is in memory, and the result is stored in a register
Register – Register/ Load – Store • Both operands and result are stored in registers
Addressing Modes
• Register:            ADD R4, R3           ; Regs[R4] ← Regs[R4] + Regs[R3]
• Immediate:           ADD R4, #3           ; Regs[R4] ← Regs[R4] + 3
• Displacement:        ADD R4, 100(R1)      ; Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
• Register indirect:   ADD R4, (R1)         ; Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
• Indexed:             ADD R3, (R1+R2)      ; Regs[R3] ← Regs[R3] + Mem[Regs[R1] + Regs[R2]]
• Direct or absolute:  ADD R1, (1001)       ; Regs[R1] ← Regs[R1] + Mem[1001]
• Memory indirect:     ADD R1, @(R3)        ; Regs[R1] ← Regs[R1] + Mem[Mem[Regs[R3]]]
• Autoincrement:       ADD R1, (R2)+        ; Regs[R1] ← Regs[R1] + Mem[Regs[R2]], then Regs[R2] ← Regs[R2] + d
• Autodecrement:       ADD R1, -(R2)        ; Regs[R2] ← Regs[R2] - d, then Regs[R1] ← Regs[R1] + Mem[Regs[R2]]
• Scaled:              ADD R1, 100(R2)[R3]  ; Regs[R1] ← Regs[R1] + Mem[100 + Regs[R2] + Regs[R3]*d]
Here d is the size in bytes of the operand being accessed.
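A small Python model can make the operand-fetch rules above concrete. The register and memory contents below are made-up values chosen only so that each mode fetches a different number, and `d = 4` assumes 4-byte operands; none of this comes from the slides.

```python
# Toy register file and memory (values are assumptions for illustration).
regs = {"R1": 8, "R2": 4, "R3": 2}
mem = {2: 70, 4: 77, 8: 11, 12: 22, 70: 33, 108: 44, 112: 55, 1001: 66}
d = 4  # assumed operand size in bytes

# The second operand each addressing mode would fetch.
operand = {
    "register":          regs["R3"],
    "immediate":         3,
    "displacement":      mem[100 + regs["R1"]],
    "register indirect": mem[regs["R1"]],
    "indexed":           mem[regs["R1"] + regs["R2"]],
    "direct":            mem[1001],
    "memory indirect":   mem[mem[regs["R3"]]],
    "scaled":            mem[100 + regs["R2"] + regs["R3"] * d],
}

def autoincrement(regs, mem, r, d):
    value = mem[regs[r]]   # use the address first...
    regs[r] += d           # ...then step the pointer forward
    return value

def autodecrement(regs, mem, r, d):
    regs[r] -= d           # step the pointer back first...
    return mem[regs[r]]    # ...then use the new address

for mode, value in operand.items():
    print(f"{mode:18s} -> {value}")
```

The two functions differ only in whether the register update happens before or after the memory access, which is the whole distinction between the autoincrement and autodecrement modes.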
Instruction encoding • Variable length (x86) • Fixed length (ARM, MIPS, PowerPC) • Hybrid (MIPS16, Thumb, TI TMS320C54x)
Taking orders • A computer does what you tell it to do • Not necessarily what you want it to do... • We give computers orders by means of instructions • Instructions tell the computer what it should be doing, right now: arithmetic, logic, data movement, control
Binary review
Binary representations of numbers consist of only 1's and 0's
    0101₂ = 5₁₀
    1000₂ = 8₁₀
    1111111₂ = 127₁₀
Binary facts for unsigned (always positive) binary numbers:
    2^10 = 1024 = 1K ≈ 1,000
    2^20 = 1,048,576 = 1M ≈ 1,000,000
    2^30 = 1,073,741,824 = 1G ≈ 1,000,000,000
Converting between binary and hex
    Hex  Binary    Hex  Binary    Hex  Binary    Hex  Binary
    0    0000      4    0100      8    1000      C    1100
    1    0001      5    0101      9    1001      D    1101
    2    0010      6    0110      A    1010      E    1110
    3    0011      7    0111      B    1011      F    1111
• Group bits together in groups of 4, starting from the right: 1000110010110010000111₂ --> 10 0011 0010 1100 1000 0111
• Assign the appropriate hex digit to each group: 2 3 2 C 8 7
• Done: 232C87₁₆
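The grouping procedure can be sketched in a few lines of Python. The helper name `bin_to_hex` is hypothetical; it pads on the left so that the groups of four line up from the right, then maps each group to one hex digit.

```python
def bin_to_hex(bits):
    """Convert a binary string to hex by grouping bits in fours from the right."""
    # Pad on the left so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Each 4-bit group becomes exactly one hex digit.
    return "".join("0123456789ABCDEF"[int(g, 2)] for g in groups)


print(bin_to_hex("1000110010110010000111"))   # 232C87
print(int("0101", 2), int("1000", 2), int("1111111", 2))   # 5 8 127
```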
The Translation Process
Computers speak in binary. We don't. Compilers and assemblers translate from one language to another.
    High-level language:   A = B + C
        (compiler)
    Assembly language:     add $1, $2, $3
        (assembler)
    Machine language:      000000 00010 00011 00001 00000 100000
MIPS Instructions
In MIPS, all register-to-register arithmetic instructions have three operands: a destination followed by two sources.
    Operation  Destination, Sources
    add        A, B, C                # A = B + C
    sub        D, A, B                # D = A - B
Operands
    add A, B, C
What are A, B, and C? The operands of arithmetic instructions are always registers.
    add $17, $18, $19    # add contents of registers 18 and 19 and put result in register 17
    sub $19, $19, $18    # subtract $19 - $18 and put the result back in $19
Registering • MIPS has 32 general-purpose registers • $0 through $31 • Each register holds 32 bits • 0 to 2^32 - 1 (about 4 billion) if unsigned • -2^31 to +2^31 - 1 (about -2 billion to +2 billion) if signed • Most registers can hold any value, for any purpose • Exception: $0 is always zero!
Register Naming and Conventions • In MIPS, all registers (except $0) can be used for any purpose desired • However, there are standard use conventions that make it easier to write software
Reflections on Registers • Registers are just "special" memory locations • A small number of registers, as opposed to a huge number of memory locations • Because there are a small number of registers, accessing them is fast • Principle: Smaller is usually faster. • Trade-offs • More registers --> More data in fast memory --> Faster execution • Fewer registers --> Registers are faster --> Faster execution • Compromise: 16 to 32 registers works well
Complicated arithmetic
    F = (A + B) - (C + D)
    $12 = ($8 + $9) - ($10 + $11)
Assume: A is in $8, B is in $9, C is in $10, D is in $11, F is in $12. Note: Typically, the compiler assigns variables to registers.
We don't have a 5-operand add/subtract instruction! Use temporary variables to solve the problem.
    add $13, $8, $9      # $13 <-- A + B
    add $14, $10, $11    # $14 <-- C + D
    sub $12, $13, $14    # F <-- (A+B) - (C+D)
$13 and $14 are temporary variables.
Getting to the bits of it all
We've looked at assembly language (briefly). The CPU wants bits.
R-Type Instruction: add $13, $8, $9
    Field:   Opcode    RS       RT       RD       ShAmt    Function
    Width:   6 bits    5 bits   5 bits   5 bits   5 bits   6 bits
    Value:   0 = Add   8        9        13       0        32 = Add
    Bits:    000000    01000    01001    01101    00000    100000
The assembler produces 00000001000010010110100000100000, 32 bits in total.
Staying Regular
R-Type instructions all have the same format:
    Field:   Opcode    RS       RT       RD       ShAmt    Function
    Width:   6 bits    5 bits   5 bits   5 bits   5 bits   6 bits
Add has an opcode of '0', a function of '32'
    add $13,$8,$9:   00000001000010010110100000100000
Subtract has an opcode of '0', a function of '34'
    sub $13,$8,$9:   00000001000010010110100000100010
The instructions differ only in one bit!
Regularity: Regularity is a key to high performance. Similar functions should be similar in format.
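As a sketch of what the assembler does for R-Type instructions, the six fields can be packed with shifts and ORs. The function name `encode_r` is hypothetical; the field widths and the opcode/function values are the ones given on the slide.

```python
def encode_r(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


add_word = encode_r(rs=8, rt=9, rd=13, shamt=0, funct=32)   # add $13, $8, $9
sub_word = encode_r(rs=8, rt=9, rd=13, shamt=0, funct=34)   # sub $13, $8, $9

print(f"{add_word:032b}")
print(f"{sub_word:032b}")
# XOR leaves a 1 wherever the two encodings differ.
print(bin(add_word ^ sub_word).count("1"))   # 1
```

The XOR at the end checks the "differ only in one bit" claim directly.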
Constants
• Many times, an instruction needs to use a constant value • Multiply by 4 • Add 3 • Instructions with constant data in them are called immediate instructions
    addi $12, $10, 4    # Reg. 12 <-- Reg. 10 + 4   (add immediate)
I-Type instructions all have the same format:
    Field:   Opcode    RS       RT       Immediate Data
    Width:   6 bits    5 bits   5 bits   16 bits
    Value:   8         10       12       4
    Bits:    001000    01010    01100    0000 0000 0000 0100
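The I-Type layout can be packed the same way. This is a minimal sketch with a hypothetical helper name, `encode_i`; masking the immediate to 16 bits is an assumption made here so that negative constants are stored in two's complement.

```python
def encode_i(opcode, rs, rt, imm):
    """Pack the I-type fields (6/5/5/16 bits) into one 32-bit word."""
    # Mask the immediate to 16 bits so negative constants wrap to two's complement.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


word = encode_i(opcode=8, rs=10, rt=12, imm=4)   # addi $12, $10, 4
print(f"{word:032b}")   # 00100001010011000000000000000100
```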
Doing Double Duty
• You desire to copy the value in register $8 to $10 • Called a "move" in computer terms
    move $10, $8        # copy register $8 to $10
• Doesn't exist as a MIPS machine instruction!
    add $10, $8, $0     # adds zero and $8, result in $10
• Does the same thing as a move • Allows the add instruction to serve double duty!
• Many instructions have double/triple functions
    sub $8, $0, $8      # negate $8
    addi $12, $0, 4     # load register $12 with value 4
Thanks for all the Memory! When 32 registers just won't do. Many times (almost all the time, actually), you can't fit all of your data into 32 registers. What do you do? Put some (most) of the data in main memory. Remember: Smaller is faster. • Registers: Small and Fast • Memory: Big and Slow • In MIPS, all operations (i.e. arithmetic) are done on registers only. Memory is used only for storing what won't fit in registers.
Loading and Storing
So you've got some data in memory. Big deal. You need it in a register to do anything useful.
    Address   Contents
    0996      3288
    1000      43
    1004      234
You need to load a register with a value from memory.
    lw $10, 1000($0)    # copy memory location 1000 to $10
Afterwards, $10 has the value 43. lw (Load Word) loads a whole 32-bit word. The value of the register in parentheses ($0) is added to 1000 to form the address; for now, it is zero.
Say you've added 1 to register $10 (now it has the value 44). Now you want to put it back in memory again. You need to store the register's value back to memory.
    sw $10, 1000($0)    # copy $10 to memory location 1000
Aside: Load and Store Architectures
• The only way to communicate with memory is through a LW or SW instruction • If you want to operate on memory, you have to use at least three instructions (usually)
    lw $15, 4500($0)     # load M[4500] into $15
    add $15, $15, $3     # add $3 to $15
    sw $15, 4500($0)     # store $15 back into M[4500]
• It doesn't have to be this way • Contrast this with the Motorola 68000
    ADD D3, 4500         ; add register D3 to M[4500]
• Is the grass greener on the other side? • MIPS: Takes more, simpler instructions... RISC • MC68000: Takes fewer, complex instructions... CISC
Data Structures - Arrays
A single-dimensional array (vector) is a simple linear data structure
    int A[5];    /* integers are 4 bytes each */
    Address   Contents
    2000
    2004      A[0]    <-- start of array (2004 in this example)
    2008      A[1]
    2012      A[2]
    2016      A[3]
    2020      A[4]
    2024
    2028
For 4-byte integers: Location of A[n] = Start + n*4
For data items of size s bytes: Location of A[n] = Start + n*s
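The address formula is easy to check numerically. In this sketch `element_address` is a hypothetical helper name, and the start address 2004 is the one used in the slide's example.

```python
def element_address(start, n, size=4):
    """Byte address of A[n] for elements of `size` bytes in an array starting at `start`."""
    return start + n * size


for n in range(5):
    print(f"A[{n}] is at {element_address(2004, n)}")
```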
Accessing data in memory
Assume that the variable List points to the beginning of an array of 32-bit integers. List = 6000
    Address   Element   Contents
    6000      List[0]   123
    6004      List[1]   3288
    6008      List[2]   43
    6012      List[3]   1
    6016      List[4]   45
Move List[0] into $3:
    lw $3, List($0)      # $3 <-- List[0]
Move List[1] into $4:
    addi $8, $0, 4       # $8 <-- 4
    lw $4, List($8)      # $4 <-- List[1]
List and the contents of $8 are added together to form the address.
Move List[4] into $5:
    addi $8, $0, 16      # $8 <-- 16
    lw $5, List($8)      # $5 <-- List[4]
Note: Memory addresses refer to 8-bit bytes! We usually reference 32-bit words. All lw/sw instructions must use an address that is a multiple of 4! To get the proper byte offset, multiply the index by 4.
Load/Store Format
• What instruction format do LW and SW have?
    lw $5, 240($9)    # load M[240+$9] into $5
• Needs • Opcode • Source register ($9) • Immediate operand (240) • Destination register ($5) • Hmmm, we've seen this before....
I-Type Instruction:
    Field:   Opcode    RS       RT       Immediate Data
    Width:   6 bits    5 bits   5 bits   16 bits
    Value:   35        9        5        240
    Bits:    100011    01001    00101    0000 0000 1111 0000
Opcode for LW: 35. Opcode for SW: 43. Think Regularity!
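Going the other way, the fields can be pulled back out of the 32-bit word with shifts and masks. `decode_i` is a hypothetical helper; sign-extending the immediate is an assumption added here so that negative offsets would decode correctly.

```python
def decode_i(word):
    """Split a 32-bit I-type word into (opcode, rs, rt, immediate)."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm


word = int("10001101001001010000000011110000", 2)   # lw $5, 240($9)
print(decode_i(word))   # (35, 9, 5, 240)
```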
Assembly Language Programming and Control Logic
Assembler directives • Somehow, we've got to get data into memory • User input • Involves system calls (we'll get to that later) • Constant data • Constant data is data that is in memory before our program starts executing • Machine-language instructions don't give much help • The only way is to use Immediate instructions • The assembler helps us here! • Assembler directives are special commands to the assembler. The most common directives put data into memory.
.word Assembler Directive
    Buffer: .word 01, 02
• Buffer: is the label. A name for this memory to go by. Acts as a variable name.
• .word is the directive to store words in memory here. Remember: Words are 4 bytes each!
• 01, 02 is the data to be stored.
Resulting memory:
    Buffer:     00 00 00 01
    Buffer+4:   00 00 00 02
    lw $12, Buffer($0)      # $12 <-- 00 00 00 01   (loads from Buffer+0)
    addi $10, $0, 4         # $10 <-- 4
    lw $13, Buffer($10)     # $13 <-- 00 00 00 02   (loads from Buffer+4)
The Assembler Location Counter
The assembler keeps track of where to put things by using a location counter. The location counter just points to the memory location to put the "next" item. For this example, assume the location counter starts at 4000. Hex constants are denoted by the "0x" prefix.
    Loc. Ctr.   Source
    4000:       buffer1: .word 12
    4004:       buffer2: .word 3, 4, 0x20, 0x5
    4020:                add $9, $0, $0
    4024:       firstld: lw $8, buffer1($9)
    4028:                addi $9, $9, 4
    4032:       secld:   lw $10, buffer2($9)
Label table:
    buffer1 = 4000
    buffer2 = 4004
    firstld = 4024
    secld   = 4032
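The bookkeeping can be sketched as a single pass over the items to be placed. This follows the slide's simplified picture of one counter for both data and code; `assign_addresses` and the `(label, size)` item format are hypothetical, introduced only for this illustration.

```python
def assign_addresses(items, start):
    """items: list of (label_or_None, size_in_bytes). Returns the label table."""
    labels = {}
    counter = start
    for label, size in items:
        if label is not None:
            labels[label] = counter   # a label names the current location
        counter += size               # advance past the item just placed
    return labels


program = [
    ("buffer1", 4),     # .word 12
    ("buffer2", 16),    # .word 3, 4, 0x20, 0x5  (four words)
    (None, 4),          # add  $9, $0, $0
    ("firstld", 4),     # lw   $8, buffer1($9)
    (None, 4),          # addi $9, $9, 4
    ("secld", 4),       # lw   $10, buffer2($9)
]
print(assign_addresses(program, 4000))
```

Each instruction and each word advances the counter by 4, so the four-word `buffer2` pushes the first instruction to 4020.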
Data Structures? No, thanks
• Assembly has no concept of "data structures" • You can access a "variable" using another "variable"
            .data
    Tonto:  .word 0x44, 0x22
    Tonto2: .word 0x32
            .text
    main:   add $9, $0, $0        # clear $9
            lw $8, Tonto($9)      # put Tonto[0] in $8
            addi $9, $9, 4        # increment $9
            lw $10, Tonto($9)     # put Tonto[1] in $10
            addi $9, $9, 4        # increment $9
            lw $10, Tonto($9)     # put Tonto[2] ???? in $10
            addi $v0, $0, 10
            syscall
Other Memory Assembler Directives
.byte - reserves bytes in memory
    borg: .byte 33, 22, 12, 10, 8, 1
.asciiz - reserves null-terminated ASCII chars
    greeting: .asciiz "Resistance is small."
.ascii - reserves ASCII characters (no NULL)
    greeting2: .ascii "You will be informed."
Memory picture from the slide:
    borg:        10 12 22 33   ? ? 1 8
    greeting:    52 65 73 69 ... 6C 6C 2E 0    (null-terminated)
    greeting2:   59 6F 75 20 ... 65 64 2E --
Meeting all your needs for space
Sometimes, we need to allocate (empty) space to be used later.
    inputbuffer: .space 100
Allocates 100 bytes of space for an input buffer. Space allocated this way is just reserved by the assembler. You have to make your own use of it.
    addi $12, $0, 6
    sw $12, inputbuffer($0)    # stores 6 in buffer
Our first program!
    # This is our first program! Yeah!
            .data
    Tonto:  .word 0x44, 0x22
            .text
    main:   add $9, $0, $0        # clear $9
            lw $8, Tonto($9)      # put Tonto[0] in $8
            addi $9, $9, 4        # increment $9
            lw $10, Tonto($9)     # put Tonto[1] in $10
            addi $v0, $0, 10
            syscall
• .data means that data follows
• .text means that code follows
• main: tells SPIM where to start
• The last two instructions (addi $v0, $0, 10 and syscall) end the program
Logic Instructions
    and $10, $8, $6     # bitwise and between $8 and $6, result in $10
• Example:
    $8  = 0010 0001 1100 0001 0011 1100 1010 0000
    $6  = 1101 1110 0011 0001 1111 0000 1100 0001
    $10 = 0000 0000 0000 0001 0011 0000 1000 0000
    or $10, $8, $6      # bitwise or between $8 and $6, result in $10
    xor $10, $8, $6     # bitwise xor between $8 and $6, result in $10
• The above are R-Type instructions
    andi $10, $8, 6     # bitwise and between $8 and 6, result in $10
• Example:
    $8  = 0010 0001 1100 0001 0011 1100 1010 0000
    6   = 0000 0000 0000 0000 0000 0000 0000 0110
    $10 = 0000 0000 0000 0000 0000 0000 0000 0000
    ori $10, $8, 6      # bitwise or between $8 and 6, result in $10
    xori $10, $8, 6     # bitwise xor between $8 and 6, result in $10
• The above are I-Type instructions
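The two worked examples can be reproduced with Python's bitwise operators, which behave the same way on these 32-bit patterns. The variable names `r8` and `r6` and the `show` helper are made up for this sketch.

```python
r8 = 0b0010_0001_1100_0001_0011_1100_1010_0000   # contents of $8 in the example
r6 = 0b1101_1110_0011_0001_1111_0000_1100_0001   # contents of $6 in the example

def show(value):
    """Format a 32-bit value as eight groups of four bits."""
    bits = f"{value:032b}"
    return " ".join(bits[i:i + 4] for i in range(0, 32, 4))


print(show(r8 & r6))   # and  $10, $8, $6
print(show(r8 | r6))   # or   $10, $8, $6
print(show(r8 ^ r6))   # xor  $10, $8, $6
print(show(r8 & 6))    # andi $10, $8, 6
```

The `andi` result is all zeros because the low four bits of $8 are 0000, so neither bit of the constant 6 (0110) survives the AND.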
Example • Write at least 5 ways of clearing a register • Write at least 4 ways of copying the contents of a register to another register
Pseudoinstructions • Some "missing" instructions are commonly composed of others • The assembler "implements" these by allowing the "missing" instructions to be entered in assembly code. • When machine code is generated, the pseudoinstructions are converted to real instructions. • Pseudoinstructions are assembler-dependent • They can be turned off in SPIM
    Pseudoinstruction    Real instruction
    move $5, $3          add $5, $3, $0
    neg $8, $9           sub $8, $0, $9
    li $8, 44            addi $8, $0, 44   or   ori $8, $0, 44
SPIM I/O
• SPIM I/O uses the SYSCALL instruction • Set up parameters • Place correct code in $v0 • Execute SYSCALL
    Action             Code (in $v0)   Parameters
    Print an Integer   1               $a0 = value to print
    Print a String     4               $a0 = location of string
    Input an Integer   5               (after syscall) $v0 contains integer
    Input a String     8               $a0 = location of buffer, $a1 = length
    Exit program       10
To display a string:
    prompt: .asciiz "hello world"
            la $a0, prompt
            li $v0, 4
            syscall
To print the value in $t3:
    move $a0, $t3
    li $v0, 1
    syscall
Boring • Straight-line code is nice, but boring • Just arithmetic and loads/stores based on a predetermined sequence • Decision-making elements add some spice to the equation • Control allows programs to make decisions based on their current state • The most common control structure is the branch
[Figure: straight-line code (A = B+C, D = B+F, M[18] = D) contrasted with a flowchart that branches on "D > 23?"]
Going places
Consider the GoTo. Each structured statement can be rewritten using only GoTos:
    if (x == y) q = 13;
becomes
            if (x != y) GoTo Next;
            q = 13;
    Next:   ...

    if (p > q) r = 3; else r = 2;
becomes
            if (p > q) GoTo R3;
            r = 2;
            GoTo Next;
    R3:     r = 3;
    Next:   ...

    while (y < 2) y = y+1;
becomes
    Loop:   if (y >= 2) GoTo End;
            y = y+1;
            GoTo Loop;
    End:    ...
"if (condition) GoTo location" and "GoTo location" are all we need.
Branching out
    if ($9 == $10) GoTo Label;    -->   beq $9, $10, Label     # beq - Branch if EQual
    if ($7 != $13) GoTo Next;     -->   bne $7, $13, Next      # bne - Branch if Not Equal
Branches use the I-Type instruction format (need two registers and 16-bit data).
    GoTo Exit;                    -->   j Exit                 # j - Jump (unconditionally)
Jumps need only an opcode and data, so there is a lot more room for the data.
J-Type Instruction:
    Field:   Opcode    Immediate Data
    Width:   6 bits    26 bits
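To see what goes into the 16-bit and 26-bit data fields, here is a sketch of both encodings. The slide gives only the field widths; the conventions used below (a branch stores a word offset relative to the instruction after it, a jump stores the target's word address) and the opcodes 4, 5 and 2 are the standard MIPS ones, while the function names and the addresses are made up for illustration.

```python
BEQ, BNE, J = 4, 5, 2   # standard MIPS opcodes for beq, bne and j

def encode_branch(opcode, rs, rt, pc, target):
    """I-type branch: the 16-bit field holds a word offset from the next instruction."""
    offset = (target - (pc + 4)) // 4
    return (opcode << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

def encode_jump(target, opcode=J):
    """J-type jump: the 26-bit field holds the target's word address."""
    return (opcode << 26) | ((target >> 2) & 0x3FFFFFF)


# Hypothetical addresses, chosen only for this example.
print(f"{encode_branch(BEQ, 9, 10, pc=0x400000, target=0x400010):032b}")
print(f"{encode_jump(0x400020):032b}")
```

With only 16 bits, a branch can reach targets a limited distance from itself; the 26-bit jump field covers a much larger range, which is the "more room" the slide mentions.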
IF-Then Structures
    if $x == $y then S1
    S2
S1 should be executed if $x == $y is True. If $x != $y, or after S1 is executed, S2 is executed.
            bne $x, $y, False    # if $x != $y, skip S1
            S1                   # $x == $y, execute S1
    False:  S2                   # either way we get here, execute S2
If you can't express the condition as a negative, try this:
            beq $x, $y, True     # if $x == $y, then execute S1
            j False              # $x != $y, so exit
    True:   S1                   # $x == $y, execute S1
    False:  S2                   # either way we get here, execute S2
IF-Then-Else Structures
    if $x == $y then S1 else S2
    S3
S1 should be executed if $x == $y. S2 should be executed if $x != $y. After executing S1 or S2, execute S3.
            beq $x, $y, IF       # if $x == $y, goto S1, skip S2
            S2                   # $x != $y, execute S2
            j Finish             # now execute S3
    IF:     S1                   # $x == $y, so execute S1
    Finish: S3                   # either way, we do S3 afterwards
More Complicated If-Then-Else structures
    if $x == $y and $z == $w then S1 else S2
    S3
S1 should be executed if both $x == $y and $z == $w. S2 should be executed if either $x != $y or $z != $w. After executing S1 or S2, execute S3.
The simplest way to figure out complex if-then-else structures is to first write the conditions, then all the statements and finally figure out the branches and jumps.
STEP 1: conditions
            bne $x, $y, ____
            bne $z, $w, ____
STEP 2: statements
            bne $x, $y, ____
            bne $z, $w, ____
            S1
            S2
            S3
STEP 3: branches
            bne $x, $y, false
            bne $z, $w, false
            S1
            j skip
    false:  S2
    skip:   S3
Example
    if $x == $y or $z == $w then S1
    else if $x == $z or $y == $w then S2
    else S3
STEP 1: conditions
            beq $x, $y, ____
            beq $z, $w, ____
            beq $x, $z, ____
            beq $y, $w, ____
STEP 2: statements
            beq $x, $y, ____
            beq $z, $w, ____
            beq $x, $z, ____
            beq $y, $w, ____
            S1
            S2
            S3
STEP 3: branches
            beq $x, $y, then1
            beq $z, $w, then1
            beq $x, $z, then2
            beq $y, $w, then2
            S3                   # no condition held, so take the final else
            j skip
    then1:  S1
            j skip
    then2:  S2
    skip:   ...
Interlude: Comments on Comments • Code should be thoroughly commented • Code may be modified by different programmers • Code may be reused a long time after writing it • There are two types of commenting style, depending on the reader the comments are intended for • Comments that explain what the instruction does • Typical for programming language books and lecture notes • The purpose is to learn the language • Worthless if used in a real program • Comments that explain the logic behind the instruction • Typical for real programs, assumes reader knows language • The purpose is to understand the program • Not helpful for learning the language • Examples:
    Instruction           Explains the instruction       Explains the logic
    addi $7, $7, 4        # add 4 to R7, R7 <- R7+4      # incrementing address
    lw $5, ARRAY($7)      # R5 <- [ARRAY+R7]             # fetching next ARRAY element
While Loops
    while $x == $y do S1
Execute S1 repeatedly as long as $x == $y is true.
    Repeat: bne $x, $y, Exit     # exit if $x == $y is False
            S1                   # execute body of loop
            j Repeat             # do it all over again
    Exit:                        # end of the loop
Warning: The following loop always executes at least once, no matter what $x and $y are:
    Repeat: S1                   # execute body of loop
            beq $x, $y, Repeat   # do it again if $x == $y
    Exit:                        # end of the loop
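The difference between the two loop shapes can be checked with a very small interpreter. This is a toy model, not SPIM: the `run` function, the tuple format, and the `"body"` pseudo-operation (which stands in for S1 and simply increments x) are all assumptions made for this sketch.

```python
def run(program, regs, max_steps=1000):
    """Run a list of (label, op, args) tuples; return how many times 'body' ran."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    pc = 0
    body_runs = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        _, op, args = program[pc]
        steps += 1
        pc += 1                                   # default: fall through
        if op == "body":                          # stands in for S1
            regs["x"] += 1
            body_runs += 1
        elif op == "bne" and regs[args[0]] != regs[args[1]]:
            pc = labels[args[2]]                  # branch taken
        elif op == "beq" and regs[args[0]] == regs[args[1]]:
            pc = labels[args[2]]                  # branch taken
        elif op == "j":
            pc = labels[args[0]]                  # unconditional jump
    return body_runs


while_loop = [
    ("Repeat", "bne", ("x", "y", "Exit")),
    (None, "body", ()),
    (None, "j", ("Repeat",)),
    ("Exit", "nop", ()),
]
test_at_bottom = [
    ("Repeat", "body", ()),
    (None, "beq", ("x", "y", "Repeat")),
    ("Exit", "nop", ()),
]

# x != y on entry: the while loop skips the body, the second form still runs it once.
print(run(while_loop, {"x": 5, "y": 2}))       # 0
print(run(test_at_bottom, {"x": 5, "y": 2}))   # 1
```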