ECE 371 Microprocessors Chapter 5 x86 Assembly Language 2. Herbert G. Mayer, PSU Status 11/2/2015 For use at CCUT Fall 2015.
Syllabus • Motivation • Integer Multiply • Integer Divide • Conditional Branch • Loop Constructs • Memory Access • Call and Return • Procedure PutDec • Summary • Appendix • Bibliography
Motivation • In another handout about x86 assembly language we cover modules, character- and string-output, and writing assembler procedures • Here we cover integer arithmetic, loops, and develop a more complex program to output signed integer numbers • Since integer multiplication can generate results that are twice as long in bits as any of the source operands, the machine instructions for integer multiply –conversely for integer divide– must make special provisions for the length of operands
X86 Integer Multiply and Divide
Integer Multiply • Our first project is 16-bit signed integer multiplication • To track all minute detail of the result, including overflow, sign of the result, etc. we use the small x86 machine model, which uses 16-bit operands • In that model, the smallest negative integer is -32768, the largest is 32767 • The same principles apply to the newer model with 64-bit precision
Integer Multiply • Under Microsoft’s assembler the opcodes are mul for unsigned, and imul for signed integer multiplication • One operand is the ax register; it is always implied • The other operand may be a memory location, or another register • A literal operand is not permitted with this one-operand form; later x86 processors add imul forms with an explicit destination register that do accept a literal • The result/product is in the register pair dx and ax, with the high-order 16 bits in dx • There exists also a byte-version of the multiply, in which case the implied operand is in al, the other operand is a byte memory location or a byte register, and the result/product is in ax • In the code sample below we multiply literal 10, moved into register bx, with the contents of the second, implied operand: register ax
Integer Multiply
; integer multiplication on x86, small mode:
; multiply literal 10 with contents of ax
; ax holds a copy of memory location MAX
        mov  bx, 10             ; a literal is in bx
        mov  ax, MAX            ; signed word at location MAX
        imul bx                 ; product is in register pair dx:ax
                                ; hi order 16 bits in dx
        . . .
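The following is a minimal sketch, not part of the original slides, of how a program might check whether a signed product still fits into 16 bits. After the one-operand imul, the carry and overflow flags are both set when dx holds significant bits, i.e. when dx is more than the sign extension of ax. The labels fac_a, product_lo, product_hi, and fits_16 are hypothetical names chosen for illustration.

```asm
; Sketch only: fac_a, product_lo, product_hi, fits_16 are hypothetical labels
        .model small
        .stack 100h
        .data
fac_a       dw  -300            ; signed word operand
product_lo  dw  ?               ; low-order 16 bits of product (from ax)
product_hi  dw  ?               ; high-order 16 bits of product (from dx)
fits_16     db  ?               ; 1 if product fits into 16 bits, else 0
        .code
main:   mov  ax, @data
        mov  ds, ax             ; set up data segment register
        mov  bx, 200            ; literal factor goes into a register
        mov  ax, fac_a          ; implied operand in ax
        imul bx                 ; dx:ax = -300 * 200 = -60000 = 0ffff15a0h
        mov  product_lo, ax     ; mov leaves the flags unchanged
        mov  product_hi, dx
        mov  fits_16, 1
        jno  done               ; OF = 0: dx is only the sign extension of ax
        mov  fits_16, 0         ; OF = 1: product needs more than 16 bits
done:   mov  ax, 4c00h          ; terminate with return code 0
        int  21h
        end  main
```

With the values above the product -60000 lies outside -32768..32767, so the jno is not taken and fits_16 ends up 0.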
Integer Divide: cwd • Just as the integer multiply creates a signed integer double-word result in register pair ax and dx, the integer divide instruction assumes the numerator to be in the register pair ax and dx • But if the numerator happens to be a single precision operand, it will have to be explicitly extended • The denominator may be in a register or memory • To create a sign-extended double-register operand in the ax-dx pair from the single-precision operand in ax, the x86 architecture provides the convert-to-double instruction cwd • The cwd has no explicit operand • Assumed operand is the value in ax, ax is unchanged • The sign of ax is extended into dx
Integer Divide: cwd
; memory location B_word holds operand
; that operand is copied i.e. moved into register ax
; to be used as numerator in divide
; but first convert single- to double-precision
        mov ax, B_word          ; signed word at B_word in ax
        cwd                     ; convert word to double-word
                                ; sign of ax extended in dx
; ditto with byte-sized operands
        mov al, a_byte          ; signed byte a_byte into al
        cbw                     ; convert byte to word
; now the numerator can be used as operand in divide
        . . .
Integer Divide • Integer divide needs 2 operands • Numerator is the double word in the register pair dx and ax, with the high-order word in dx • Other operand may be memory location or register • Opcode div is for unsigned and idiv for signed integer division • In example assume numerator to be in memory location A_wd • Denominator is at memory location B_wd • Quotient ends up in ax, and remainder in dx
Integer Divide
; signed integer divide on x86:
; assume operands to be in locations A_wd and B_wd
;
        mov  ax, A_wd           ; signed word at A_wd in ax
        cwd                     ; sign of A_wd 16 times in dx
        idiv B_wd               ; quotient A_wd/B_wd in ax
                                ; remainder of A_wd/B_wd in dx
                                ; flags are undefined after idiv:
                                ; test ax explicitly to see: negative?
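For unsigned operands the cwd instruction is the wrong tool, since it would copy the top bit of ax into dx as if it were a sign. A common alternative, sketched below under the assumption of unsigned word operands, is to clear dx before using div. The labels n_total, n_count, n_quot, and n_rem are hypothetical. The sketch also guards against a zero denominator, which would otherwise raise the divide-error interrupt.

```asm
; Sketch only: n_total, n_count, n_quot, n_rem are hypothetical labels
        .model small
        .stack 100h
        .data
n_total dw  50000               ; unsigned word; top bit is set (0c350h)
n_count dw  7                   ; unsigned denominator
n_quot  dw  ?
n_rem   dw  ?
        .code
main:   mov  ax, @data
        mov  ds, ax             ; set up data segment register
        mov  ax, n_total        ; low-order word of numerator
        xor  dx, dx             ; unsigned: clear high-order word, no cwd
        cmp  n_count, 0
        je   done               ; skip the divide if denominator is 0
        div  n_count            ; ax = 50000 / 7 = 7142, dx = 50000 mod 7 = 6
        mov  n_quot, ax
        mov  n_rem, dx
done:   mov  ax, 4c00h          ; terminate with return code 0
        int  21h
        end  main
```

Had cwd been used here, dx would hold 0ffffh, the numerator would be read as 0ffffc350h, and the quotient would no longer fit into ax.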
Memory Access On the x86 Microprocessor
Memory Access • Key components of any computer architecture are the processor and memory • Memory is referenced implicitly and explicitly by instructions that read data from and write data to memory • Explicit accesses are called loads (for reading) and stores (for writing) • Assemblers provide explicit instructions for these operations • Implicit memory accesses occur in machine instructions whose operands may be memory cells • On RISC systems these implicit references generally do not exist; instead all memory traffic is exclusively funneled through loads and stores on RISC
Memory Access • In an assembler program, memory locations (both for data and code) are generally referred to symbolically • This improves readability and allows for relocation; i.e. the linker and loader have a certain degree of freedom of placement in physical memory • However, explicit memory addressing via hard-coded numbers is also possible; for example, on a hypothetical machine ld r1, 1000 could mean: load the word in memory location 1000 into register r1 • Some assembly languages provide syntax to render the indirection explicit, for example the load operation: ld r1, (1000) uses parentheses to allude to this indirection
Memory Access • A common paradigm of referencing symbolic memory names (labels) is indirection. This means that the label value (memory address) is not what is wanted, but the contents of memory at that label location • For example, if the offset of the data name foo is 10000, then the operation ld r2, foo generally does not mean to load the value 10000 into a register • Instead, foo is used indirectly: the word at that address is referenced and loaded into register r2 • When the address itself is wanted, the IBM 370 architecture for example uses a special type of load, called load address, while the masm assembler for the x86 architecture uses the seg or offset operator to indicate that indirection is not wanted • Instead, the segment portion of the address (seg) or the offset portion of the address (offset) is wanted
Memory Access • During indirect memory references it is sometimes desirable to index • Indexing means that one wishes to modify an otherwise fixed memory address. Typically, such a modifier resides in a register • And if the value in that register is modified from iteration to iteration, the indexing operation can access memory in some sequential order, say in increasing (or decreasing) fashion • The size of the equal steps between such sequential memory addresses is known as the stride. For example, if r2 is a register loaded with a value (say, 2) then the instruction ld r1, foo[r2] means: fetch the word which is located 2 bytes further in memory than the offset expressed by foo • Load that word into register r1
Memory Access • In addition to indexing through a register, many architectures (and thus their assemblers) allow the offset to be modified by an additional literal index • The literal value is encoded into the instruction, referred to as an immediate operand • Immediate values are usually small, since the architectures often provide just a few bits to hold it • On some architectures this immediate operand may be signed, on others only unsigned literal modifiers are possible
Memory Access • Memory holds the data being manipulated • Also intermediate results must be stored somewhere • Registers usually are in short supply, contrasted with the size of memory • Before completing a computation, data must be brought from peripherals to memory • After computation, data must be sent from memory to peripherals, e.g. printers • Often a cache helps overcome the speed bottleneck of memory accesses
Memory Access Indexing on x86 • Indirect memory references are the default semantics for data labels in Microsoft's masm assembler for the x86 architecture • On nasm and masm indirection can also be expressed explicitly via the [ ] operator pair • For example the move instruction --this mov is really a load: • data_seg1 segment • foo dw -1999, 0, . . . • data_seg1 ends • . . . • mov ax, foo ; indirection implied in masm • mov ax, [foo] ; explicit indirection in masm
Memory Access Indexing on x86 • The above mov code loads the word at data segment location foo into register ax, regardless of whether the [] operator is used • In the nasm assembler the instruction mov ax, foo ; load offset of address in nasm • loads the address of the memory location into register ax, while the nasm mnemonic: mov ax, [foo]; loads contents at address in nasm • loads the contents of memory location foo into ax; assembler differences can be very subtle!!
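As a hedged illustration in masm syntax (not taken from the original slides), the sketch below contrasts loading the contents of foo with loading its address. It assumes the offset operator mentioned earlier and the lea instruction, both of which yield the offset of foo rather than the word stored there.

```asm
; Sketch only, masm syntax; foo as declared on the earlier slide
        .model small
        .stack 100h
        .data
foo     dw  -1999, 0
        .code
main:   mov  ax, @data
        mov  ds, ax             ; set up data segment register
        mov  ax, foo            ; masm: contents of foo, ax = -1999
        mov  bx, offset foo     ; address (offset) of foo, fixed before run time
        mov  cx, [bx]           ; contents at that address, cx = -1999
        lea  si, foo            ; lea also yields the offset, at run time
        mov  dx, [si+2]         ; second word of foo, dx = 0
        mov  ax, 4c00h          ; terminate with return code 0
        int  21h
        end  main
```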
Memory Access Indexing on x86 • A handy programming tool that makes indexing so convenient is the ability to modify address labels by registers, literals, or a combination of both • Clearly, the underlying computer architecture must support this, i.e. there must be instructions in place that allow indexed or multiply-indexed load and store operations • Some architectures (including IBM 360 and x86) allow multiple registers to be used to modify (to index) the address label • These registers are referred to as base- and as index-registers • Note that the term base-register often means that the base address sits in that register
Memory Access Indexing on x86 • However, in the x86 architecture, as long as an address expression includes a data memory label, that label is the base address • With the following provisos: If l1, l2 are address labels, and c1, c2 are numeric literal constants, then: • l1 + c1 ; is address of location l1 plus c1 • l1 - c2 ; is address of location l1 minus c2 • l1 - l2 ; is a pure numeric value: l1 - l2 • [l2 + c1] ; is the memory content at l2 + c1 • [l1 - c2] ; is the memory content at l1 - c2 • l1 + l2 ; is illegal on x86
Memory Access Indexing on x86 • On orthogonally designed architectures, any user visible register is usable as an index (or base-) register • Practical limitations often forced compromises. For example, on the x86 architecture, only certain registers can be used for indexing, listed below: address expression + one of ( bx, bp, si, di ) on x86 • An address expression, being indexed by one (even 2) of these index registers, may also contain a literal modifier, making the indexing operation practical and easy to use for array indexing. Note that it is possible to use up to 2 index- and base-registers in a single address expression, but only with the following restriction: address expression + one of ( bx, bp ) + one of ( si, di ) on x86
Memory Access Indexing on x86 • An address expression such as [min_data+bx+si+2] is allowed, while the expression [min_data+bx+bx] is not permitted due to multiple uses of the bx register • These samples assume that min_data is a legal label in the data segment • A complete expression with all typical arithmetic operators is allowed on x86 assemblers, as long as the literal part is reducible to a single numeric value at the time of assembly • Thus, an expression like [chars+bx+si+2*3+4] is legal, provided that chars is a legal data label
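A minimal sketch of the base-plus-index rule, assuming a small two-row table of words: bx selects the row, si selects the column, and a literal modifier adjusts the result. The labels matrix, elem_a, and elem_b are hypothetical names chosen for illustration.

```asm
; Sketch only: matrix, elem_a, elem_b are hypothetical labels
        .model small
        .stack 100h
        .data
matrix  dw  1, 2, 3             ; row 0
        dw  4, 5, 6             ; row 1
elem_a  dw  ?
elem_b  dw  ?
        .code
main:   mov  ax, @data
        mov  ds, ax             ; set up data segment register
        mov  bx, 6              ; row 1: 1 row * 3 words * 2 bytes
        mov  si, 4              ; column 2: 2 words * 2 bytes
        mov  ax, [matrix+bx+si] ; byte offset 10 from matrix: ax = 6
        mov  elem_a, ax
        mov  ax, [matrix+bx+si-2] ; literal modifier -2: offset 8, ax = 5
        mov  elem_b, ax
        mov  ax, 4c00h          ; terminate with return code 0
        int  21h
        end  main
```

Pairing bx with si satisfies the restriction of one base register plus one index register; an expression using bx and bp together, or si and di together, would be rejected.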
Memory Access Implicit Segment Register • Data declared in the data segment below are the hex digits ‘0’ .. ‘f’ • The user-designed Put_Char macro uses system service call 02h for single-character output • Using bx as index register • Note that only base and index registers can be used for this purpose, e.g. not cx • Memory operands (data labels) are used indirectly • Indirection is explicitly expressed via the [ and ] operators • But these are not necessarily needed for memory operands in Microsoft SW, as indirection is the most common case • Since they are needed in nasm and on Unix systems, we recommend use of the [ ] operators
Memory Access Implicit Segment Register • Using explicit brackets to allude to indirection, such as [chars], also improves readability • Note that indirect offset and index register are both allowed • Either or both or none may be modified by an immediate operand • Immediate operands are limited to 16 bits in size • Order of offset and index arbitrary • The output of program below is: hm02012452267
Memory Access
; Purpose: memory references, indexing
; HM for use at CCUT
start   macro                   ; no parameters
        mov ax, @data           ; @data predefined macro
        mov ds, ax              ; now data segment reg set
        endm                    ; end macro: start
termin  macro ret_code          ; 1 parameter: return code
        mov ah, Term_Code       ; we wanna terminate, ah + al
        mov al, ret_code        ; any errors? If /= 0
        int 21h                 ; call DOS for help
        endm                    ; end macro: termin
Put_Char macro char             ; output passed character
        mov ah, Cout_Code       ; tell DOS: Char out
        mov dl, char            ; char into required byte reg
        int 21h                 ; and call DOS
        endm                    ; end macro Put_Char
Cout_Code = 2h
Term_Code = 4ch
        .model small
        .data
chars   db "0123456789abcdef"
Memory Access
        .code
main:   start
        mov bx, 2               ; index char '2' in chars
        mov cl, 'h'
        Put_Char cl             ; o.k. since cl holds char
        Put_Char 'm'
        Put_Char chars          ; not good programming
        Put_Char chars[bx]      ; shows partial indirection
        Put_Char [chars]        ; explicit
        Put_Char [chars+1]      ; explicit
        Put_Char [chars+bx]
        Put_Char chars[bx+2]
        Put_Char [chars+bx+3]
        Put_Char [bx][chars]
        Put_Char [chars]+[bx]
        Put_Char [bx+4][chars]
        Put_Char [bx+3][chars+2]
done:   termin 0                ; no errors if we reach
Memory Access Explicit Segment Register • Again the data in the data segment is the character string “0123456789abcdef” • Macros as in the example earlier • Use bx as index register again • Note: no implicit segment register used • Instead, ds is used explicitly • Note syntax: seg:offset • The output of program below is: h02012452267
Memory Access
; Source file: mem2.asm; use explicit segment reg
; Purpose: memory ref, indexing with explicit ds:
        .model small
        .data
chars   db "0123456789abcdef"
        .code
main:   start
        mov bx, 2               ; index '2' in chars
Memory Access
        Put_Char 'h'
        Put_Char ds:chars       ; not good programming
        Put_Char ds:chars[bx]   ; only partial indirect
        Put_Char ds:[chars]     ; explicit
        Put_Char ds:[chars+1]   ; explicit
        Put_Char ds:[chars+bx]
        Put_Char ds:chars[bx+2]
        Put_Char ds:[chars+bx+3]
        Put_Char ds:[bx][chars]
        Put_Char ds:[chars]+[bx]
        Put_Char ds:[bx+4][chars]
        Put_Char ds:[bx+3][chars+2]
        . . .
Word Access • Goal: to reference memory as words • Output these integers as decimal numbers • Use the yet to be designed PutDec() assembler procedure to print decimal numbers • Macros start and termin as before • Use register bx again as index register • Data segment defines some decimal and some hex literals • Data label nums defines an array of integer words
Word Access • Observe that modifications to the index register are done in steps of 2 • Stride of word is 2 on x86! • Note that words initialized via hex literals are still printed as signed integers; e.g. 0deadh is 57005 unsigned, and 57005 - 65536 = -8531 as a signed word • Intended output shown below: Output: 511 512 512 513 1025 -8531 -8531 -17730 -17730
Word Access
; Purpose: word memory references, indexing
start   macro                   ; no parameters
        mov ax, @data           ; @data predefined macro
        mov ds, ax              ; now data segment reg set
        endm                    ; end macro: start
termin  macro ret_code          ; 1 parameter: return code
        mov ah, Term_Code       ; terminate, ah + al
        mov al, ret_code        ; any errors? If /= 0
        int 21h                 ; call DOS for help
        endm                    ; end macro: termin
Term_Code = 4ch
        .model small
        .data
nums    dw 511, 512, 513, 1023, 1024, 1025
w1      dw 0deadh
w2      dw 0beefh
w3      dw 0c0edh
w4      dw 0babeh
Word Access, Cont’d
        .code
        extrn PutDec : near
main:   start
        mov bx, 2               ; use bx as index register
        mov ax, nums
        call PutDec             ; output is: 511
        mov ax, [nums + 2]
        call PutDec             ; output is: 512
        mov ax, [nums + bx]
        call PutDec             ; output is: 512
        mov ax, [nums][bx + 2]
        call PutDec             ; output is: 513
        mov ax, [nums+2][bx+6]
        call PutDec             ; output is: 1025
Word Access, Cont’d
        mov nums, 0deadh
        mov ax, nums
        call PutDec             ; output is: -8531
        mov ax, w1
        call PutDec             ; output is: -8531
        mov ax, [w2+bx+2]
        call PutDec             ; output is: -17730
        mov ax, [w1+6]
        call PutDec             ; output is: -17730
done:   termin 0                ; no errors if we reach
        end main                ; start here!
Comparison • By default, a machine executes one instruction after another, in sequence • That sequence can be changed via branches • Branches are also known as jumps, depending on manufacturer • Unconditional branches transfer control to their defined destination • Conditional branches make this change in control flow only if their associated condition holds
Comparison • How does the microprocessor “know” when or whether a condition is true? • The CPU has flags that specify this condition, and instructions that test for the condition • Typical conditions are zero, negative, positive, overflow, carry, etc. • Symbolic flags are CF, ZF, OF • On x86 the flags are not written as explicit operands; each conditional branch opcode (e.g. je, jle) implies which flags it tests
Conditional Branch
-- high-level source program snippet
if a > b then
    max := a;
else
    max := b;
end if;
; corresponding x86 assembler snippet:
        mov ax, [a]             ; memory location a
        cmp ax, [b]             ; memory location b
        jle b_is_max            ; jump to b_is_max if a <= b (signed)
        mov [max], ax
        jmp end_if              ; jump around else
b_is_max:                       ; this is else
        mov ax, [b]
        mov [max], ax
end_if:
        . . .
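The jle in the snippet above treats both operands as signed words. When the same bit patterns are meant as unsigned values, the unsigned branch opcodes (ja, jae, jb, jbe) apply instead. The sketch below, with the hypothetical labels val_a, val_b, smax, and umax, computes both maxima for one pair of words to show that the results can differ.

```asm
; Sketch only: val_a, val_b, smax, umax are hypothetical labels
        .model small
        .stack 100h
        .data
val_a   dw  0c000h              ; 49152 unsigned, -16384 signed
val_b   dw  1
smax    dw  ?                   ; maximum under signed interpretation
umax    dw  ?                   ; maximum under unsigned interpretation
        .code
main:   mov  ax, @data
        mov  ds, ax             ; set up data segment register
        mov  ax, val_a
        cmp  ax, val_b
        jge  s_done             ; signed: val_a >= val_b, keep val_a
        mov  ax, val_b          ; else val_b is the signed maximum
s_done: mov  smax, ax           ; smax = 1
        mov  ax, val_a
        cmp  ax, val_b
        jae  u_done             ; unsigned: val_a >= val_b, keep val_a
        mov  ax, val_b          ; else val_b is the unsigned maximum
u_done: mov  umax, ax           ; umax = 0c000h
        mov  ax, 4c00h          ; terminate with return code 0
        int  21h
        end  main
```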
Loops • Operations are performed repeatedly via loops • In higher level languages, loops are hand-manufactured via conditions and branches (If Statement and Gotos) or using language defined structured loop statements • The latter include Repeat, While, and For Statements • We introduce the x86 loop instruction • Generally a loop body is repeated until a particular value (sentinel) is found • A loop body entered unconditionally is akin to a Repeat Statement
x86 Loop • Another assembler example knows the iteration count at the time of assembly, hence the x86 provided loop instruction can be used • A sample x86 loop instruction follows: loop next ; is executed: if --cx then goto next; • This loop body can be characterized as a For Statement • The third example does not know the number of iterations at the time of assembly. Hence, before entering the loop body the first time, a check must be made for the loop count to be = 0 • If so, the body is bypassed; else the body is entered and executed countably many times. Thus, the loop resembles a C-style For Statement
x86 Loop • As we saw, loops allow the repeated operation of their bodies • Based on a condition, or based on a defined number of steps, which in effect defines that condition • On the x86 architecture, the cx register functions as the counter for counted loops, with the loop opcode • On x86 the counted loop is executed by the loop instruction, assuming the loop count in cx • Each execution of the loop opcode first decrements the value in cx by 1 • If cx is then not 0, execution continues at the place of the loop label • Else execution continues at the next instruction after the loop opcode
x86 Loop
; demonstrate the x86 "loop" instruction
; assumes count to be in cx
; when loop is executed: decrement cx
; once cx is 0, continue at instruction after loop
; else branch to label
; place 10 into cx to define loop steps
        mov cx, 10
again:                          ; a label! Note the colon :
        mov ax, cx              ; print value in ax
        call PutDec             ; via PutDec procedure
        loop again              ; check, if need to loop more
; prints the numbers 10 down to 1, but NOT 0
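When the count is only known at run time, cx may be 0 on entry. Because loop decrements before it tests, a body entered with cx = 0 would see cx wrap to 0ffffh and repeat 65536 times. One way to perform the check before the first iteration is the jcxz instruction, sketched below; the labels n_iter and n_steps are hypothetical.

```asm
; Sketch only: n_iter, n_steps are hypothetical labels
        .model small
        .stack 100h
        .data
n_iter  dw  0                   ; run-time iteration count, may be 0
n_steps dw  0                   ; counts how often the body was executed
        .code
main:   mov  ax, @data
        mov  ds, ax             ; set up data segment register
        mov  cx, n_iter
        jcxz skip               ; cx = 0: bypass the loop body entirely
again:  inc  n_steps            ; loop body
        loop again              ; decrement cx; branch while cx /= 0
skip:   mov  ax, 4c00h          ; terminate with return code 0
        int  21h
        end  main
```

With n_iter set to 0 as above, the body is skipped and n_steps stays 0.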
First Loop • We define a string in data segment, all ‘0’..’f’ digits • The data area is named ‘chars’ and is used as address (data offset) • The sentinel for loop termination is ‘#’ • Register bx used as index register • Note that only bx, si, di, and bp can be used for indexing on x86 • Practice the cmp instruction, which compares by subtracting, and then sets flags • Get to know conditional (jcc) and unconditional (jmp) jumps • See use of labels as destinations of jumps • Output of program is: 0123456789abcdef
First Loop
; Source file: loop1.asm
; Purpose: use, syntax of indexing array w. sentinel
Start   macro                   ; no parameters
        mov ax, @data           ; @data predefined macro
        mov ds, ax              ; now data segment reg set
        endm                    ; end macro: start
Termin  macro ret_code          ; 1 parameter: return code
        mov ah, 4ch             ; terminate: set ah + al
        mov al, ret_code        ; any errors? If /= 0
        int 21h                 ; call sys sw for help
        endm                    ; end macro: termin
Char_Out = 2h
Sentin   = '#'
        .model small
        .data
chars   db "0123456789abcdef", Sentin
First Loop
        .code
main:   start
        mov ah, Char_Out        ; set up ah for sys
        mov bx, 0               ; to index string, init 0
next:   mov dl, chars[bx]       ; find next char
        inc bx                  ; increment index reg bx
        cmp dl, Sentin          ; found sentinel?
        je  done                ; yep, so stop
        int 21h                 ; nope, so print it
        jmp next                ; try next; could be sentinel
done:   termin 0                ; no errors if we reach
        end main                ; start here!
Second Loop • Again we define character string in data segment, all ‘0’..’f’ hex digits • This time we use no sentinel • Assume that the loop is executed exactly 16 times, and is known a-priori, i.e. a countable loop • Again we use register bx as index register • Learn loop instruction, which tracks loop count and conditional branch • Loop instruction on x86 subtracts 1 from cx each time it is executed • If cx = 0, fall through; else branch to target, which is part of instruction • Output of program is: 0123456789abcdef
Second Loop
; Source file: loop2.asm
; Purpose: use, syntax of indexing char array
; loop is "countable": we know # of elements
; before start of loop; we know at assembly time
        . . . same macros start, termin
Char_Out = 2h
Num_El   = 10h                  ; 16 elements in chars array[]
        .model small
        .data
chars   db "0123456789abcdef"
Second Loop
        .code                   ; abbreviation
main:   start
        mov ah, Char_Out        ; set up ah for system call
        mov bx, 0               ; initial index off 'chars'
        mov cx, Num_El          ; know # iterations a priori
next:   mov dl, chars[bx]       ; find next char
        inc bx                  ; increment index register
        int 21h                 ; print it
        loop next               ; try next one; could be 0: end