Assembling and Linking • An assembly language program is turned into an executable file in two steps • Assembling turns the source code into object code (.obj file) – an intermediate and inexact form of machine code • Linking turns the object code into an executable form (.exe file) • The code in two executables can’t be combined, but the code in two object files can
The Debugger • Contrary to the name, it is not only for debugging • Stepping through assembly language programs in the debugger can help the programmer understand the CPU’s inner workings • Many people advocate stepping through every program, assembly language or not
The “Moving” Example • Pages 82-83 • Demonstration of assembling, linking, and debugging steps • Common mistakes • Trying to operate on two operands of differing sizes • Using the label rather than the memory it labels • The offset keyword
Comments on Comments • In assembly language, comments start with a semicolon • Comments used with properly-inserted blank lines can make for a very readable program • Comments should also be used to explain confusing instructions
The Stack • A stack is like a spring-loaded bin of dishes in a cafeteria • Only the top is readily available • Placing a dish is called a push • Taking a dish is called a pop • A stack is known as a LIFO structure (last in, first out)
The Stack (cont.) • A computer doesn’t actually move bytes up and down, but keeps track of the top of the stack with the stack pointer (8086 pointer register sp) • The assembly language instructions push and pop directly manipulate the current program’s stack • Every push in a program should have a balancing pop • One of the best uses of the stack is to save the values in the registers • The stack demo (pages 86-87)
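A minimal sketch of using the stack to save registers, assuming the same TASM Ideal-mode syntax as the other slides. It is a fragment for the code segment of an existing program, and the register choice and values are only illustrative. The pops come in the reverse order of the pushes because the stack is LIFO.

```asm
        push ax         ; save ax (sp decreases by 2)
        push bx         ; save bx on top of ax
        mov  ax, 1234h  ; ax and bx can now be overwritten freely
        mov  bx, 5678h
        pop  bx         ; restore bx first: it was pushed last
        pop  ax         ; then restore ax; sp is back where it started
```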
The Flags • The flags register keeps information on the state of the CPU • Most arithmetic and bitwise instructions have some effect on the flags register • The boolean values in the flags register affect the results of some instructions • The flags register is also used in conditional program flow (e.g. decision-making, loops)
Addition Instructions • Common addition instructions:
    add dest, source
    adc dest, source
• If an add (or adc) instruction overflows, the bit carried out of the top (what would be bit 8 for a byte, or bit 16 for a word) is stored in the carry flag • The adc instruction adds like add, but also adds the carry flag into the lowest bit (bit 0)
Addition Instructions (cont.) • add and adc can be used together to add very large integers:
    ; number1 and number2 are
    ; defined as DD above
    mov ax, [word number1]
    mov dx, [word number1 + 2]
    add ax, [word number2]
    adc dx, [word number2 + 2]
Subtraction Instructions • Common subtraction instructions:
    sub dest, source
    sbb dest, source
• If a sub (or sbb) instruction borrows from a nonexistent bit (bit 8 for a byte, or bit 16 for a word), the carry flag is set • The sbb instruction subtracts like sub, but subtracts one more if the carry flag is set, as if a borrow had already been taken from bit 0
Subtraction Instructions (cont.) • sub and sbb can be used together to subtract very large integers:
    ; number1 and number2 are
    ; defined as DD above
    mov ax, [word number1]
    mov dx, [word number1 + 2]
    sub ax, [word number2]
    sbb dx, [word number2 + 2]
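Multiplication Instructions • Common multiplication instructions:
    mul source
    imul source
• The destination is implied by the size of the source • If the source is 16-bit, ax is multiplied by it and the 32-bit product is left in dx:ax (dx holds the high word) • If the source is 8-bit, al is multiplied by it and the 16-bit product is left in ax (ah holds the high byte)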
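The two operand sizes can be sketched as follows. This is an illustrative fragment in the same syntax as the other slides, and the values are chosen only so that the product spills into the high half.

```asm
        ; 8-bit source: ax = al * source
        mov  al, 200
        mov  bl, 3
        mul  bl         ; 200 * 3 = 600 = 0258h: ah = 02h, al = 58h

        ; 16-bit source: dx:ax = ax * source
        mov  ax, 1000h
        mov  cx, 20h
        mul  cx         ; 1000h * 20h = 20000h: dx = 0002h, ax = 0000h
```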
Division Instructions • Common division instructions:
    div source
    idiv source
• The dividend is always understood to be the ax register or the dx and ax registers together • If the source is 8-bit, the dividend is the ax register, and the quotient is put in the al register, with the remainder in ah • If the source is 16-bit, the dividend is the 32-bit value in dx:ax, and the quotient is put in the ax register, with the remainder in dx
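A short sketch of both unsigned forms, again as an illustrative fragment. Note that dx is cleared before the 16-bit divide because it forms the high word of the dividend. If the quotient does not fit in the destination, the CPU raises a divide error.

```asm
        ; 8-bit source: ax / source -> quotient in al, remainder in ah
        mov  ax, 100
        mov  bl, 7
        div  bl         ; 100 / 7: al = 14, ah = 2

        ; 16-bit source: dx:ax / source -> quotient in ax, remainder in dx
        mov  ax, 50000
        xor  dx, dx     ; clear the high word of the dividend first
        mov  cx, 300
        div  cx         ; 50000 / 300: ax = 166, dx = 200
```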
Signed Arithmetic • Addition and subtraction work the same way regardless of sign • imul and idiv treat operands as signed values; mul and div assume all values are unsigned • Sometimes, for imul and idiv, it is necessary to convert from 8 bits to 16, or from 16 to 32
Signed Arithmetic (cont.) • Just setting high-order bits to zero will not work when using two’s complement • cbw (convert byte to word) and cwd (convert word to doubleword) exist for conversion • cbw assumes the byte to be converted is in al and extends the sign bit through ah • cwd assumes the word to be converted is in ax and extends the sign bit through dx
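The sign extensions described above can be sketched like this, as an illustrative fragment with arbitrary values. idiv truncates toward zero, so the remainder takes the sign of the dividend.

```asm
        mov  al, -5     ; al = 0FBh
        cbw             ; ax = 0FFFBh, still -5 as a signed word
        mov  bl, 2
        idiv bl         ; -5 / 2: al = -2 (quotient), ah = -1 (remainder)

        mov  ax, -1000
        cwd             ; dx = 0FFFFh, so dx:ax is -1000 as a doubleword
        mov  cx, 7
        idiv cx         ; -1000 / 7: ax = -142 (quotient), dx = -6 (remainder)
```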
Bitwise Instructions • The text calls them logic instructions • Common bitwise instructions:
    and dest, source
    or dest, source
    xor dest, source
    not dest
    test dest, source
    rcl dest, source
    rcr dest, source
    rol dest, source
    ror dest, source
    sar dest, source
    shl/sal dest, source
    shr dest, source
Bitwise Instructions (cont.) • and, or, xor, and not should be intuitive • and, or, and xor each take a destination and a source operand, and leave the result in the destination operand • not operates on the destination alone, and leaves the result in the destination • test performs a logical and on the destination and source, and throws away the result • test sets the zero flag if the result is zero, and clears it if it isn’t
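A small worked sequence may make the effects concrete. It is an illustrative fragment, and the bit patterns are arbitrary.

```asm
        mov  al, 10110110b
        and  al, 00001111b  ; keep only the low nibble: al = 00000110b
        or   al, 10000000b  ; force bit 7 on:           al = 10000110b
        xor  al, 00000011b  ; toggle bits 0 and 1:      al = 10000101b
        not  al             ; invert every bit:         al = 01111010b
        test al, 00000010b  ; bit 1 is set, so ZF = 0; al is unchanged
```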
Shift Instructions • Shifts can be grouped into four groups • Plain shifts (shl, shr) • Plain rotations (rol, ror) • Rotations through the carry flag (rcl, rcr) • Arithmetic shifts (sal, sar) • They each require the same operands, but have subtle differences
Shift Instructions (cont.) • Common syntax:
    shl ax, 1
    shl bh, cl
    shl [number], 1
    shl [number], cl
• On the 8086, if the source is a constant, it must be 1 • If the destination is to be shifted more than one place, the source must be the register cl • If the processor is an 80186 or later, the source may be an 8-bit constant
Shifting Instructions • Page 107 in the text • The shift instructions (shl, shr) move a zero value into the empty bit, and put what was shifted out into the carry flag
Rotate-through-carry Instructions • Page 108 in the text • The rotate-through-carry instructions (rcl, rcr) move the carry flag into the empty bit, and put what was shifted out into the carry flag • These can be combined with shifting instructions to shift very large integers
Rotate-through-carry Instructions (cont.) • This will only work shifting bits one position at a time:
    ; number1 is defined as DD above
    ; this multiplies it by two
    mov ax, [word number1]
    mov dx, [word number1 + 2]
    shl ax, 1
    rcl dx, 1
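The same idea works in the other direction. As a sketch, assuming number1 is defined with DD as on the slide, an unsigned divide by two starts at the high word with shr and pulls the bit it drops into the low word with rcr:

```asm
        mov  ax, [word number1]
        mov  dx, [word number1 + 2]
        shr  dx, 1      ; bit 0 of dx -> carry flag, 0 -> bit 15 of dx
        rcr  ax, 1      ; carry flag -> bit 15 of ax
```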
Rotation Instructions • Page 107 in the text • rol shifts all bits left, moving the most significant bit (MSB) into the least significant bit (LSB) and also into the carry flag • ror shifts all bits right, moving the LSB into the MSB and also into the carry flag • Rotating the same number of places as there are bits will return the same number • Rotation instructions are usually of less practical value than the other shift instructions
Arithmetic Shifts • Page 108 in the text • Shifting a negative (in two’s complement) number right using shr will not divide the number by two properly, because shr moves a zero into the sign bit • sar copies the old most significant bit (MSB) into the new MSB, preserving the sign bit • sal is the same as shl
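The difference shows up as soon as the value is negative. An illustrative fragment:

```asm
        mov  ax, -8     ; ax = 0FFF8h
        sar  ax, 1      ; ax = 0FFFCh = -4: the sign bit is copied back into bit 15

        mov  bx, -8     ; bx = 0FFF8h
        shr  bx, 1      ; bx = 7FFCh = 32764: a zero entered bit 15
```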
Flow Control • Assembly language (and machine code, for that matter) lacks certain elements we take for granted in higher-level languages: • for (<initializer>; <condition>; <increment>)… • while (<expression>)… • do…while (<expression>) • if (<expression>)…else if (<expression>)…else… • <function>(<parameter>,…) • Expressions and flow-control structures
Flow Control (cont.) • Instructions that allow non-sequential execution are called transfer or jump instructions • All work by changing the ip register (or, sometimes, cs:ip) under certain circumstances • Unconditional transfer instructions jump under any circumstances • Conditional transfer instructions jump when certain flags are set or cleared
Flow Control (cont.) • Three types of transfers • Subroutine (opcodes call, int) – these can be returned from, and are unconditional • Jump (opcode jmp) – these cannot be returned from, and are unconditional • Conditional jump (many opcodes)
Subroutines • The call instruction is one of two transfer instructions that may be returned from • call • pushes the address of the instruction after it onto the stack (or cs:<address>) • changes ip to be the address of the function that is being called (or cs:ip) • ret • pops ip from the top of the stack (or cs:ip)
Subroutines (cont.) • Example program – page 113 • Any subroutine should either: • document which registers it destroys • save all registers it uses on the stack • Subroutines should: • be as short as possible • be only as long as necessary • accomplish one simple task
PROC and ENDP • Are assembler directives, and are optional (except in this class) • Mark the beginning and end of a subroutine • Should each be followed by the name (or label) of the subroutine
NEAR and FAR • A near, or intrasegment call is one to the same code segment • A far, or intersegment call is one to a different code segment • There is only one call mnemonic (the assembler chooses the near or far form), but two return mnemonics: retn and retf • The mnemonic ret is translated into either retn or retf • A subroutine can be made explicitly near or far with the directives NEAR and FAR
Passing Values • There are three common methods for passing values to a subroutine: • Storing parameters in registers (like in AddRegisters) • Storing data in global variables (in the data segment) • Passing data on the stack • Choosing the second option is generally bad – if two subroutines use the same global variables, things could get ugly very fast
Passing Values (cont.) • The first option (registers) is extremely common, fast, and very workable • The third option (stack) is best for working with many parameters • Most high-level languages pass parameters to functions (or methods) on the stack
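Passing Values (cont.) This will not work, because the first pop in AddValues removes the return address that call pushed, not the last parameter:
    mov ax, 1
    push ax
    mov ax, 2
    push ax
    mov ax, 3
    push ax
    mov ax, 4
    push ax
    call AddValues

PROC AddValues
    pop dx
    pop cx
    pop bx
    pop ax
    .
    .
    ret
ENDP AddValues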
Passing Values (cont.) This will work, because the return address is popped into si first and pushed back before ret:
PROC AddValues
    pop si
    pop dx
    pop cx
    pop bx
    pop ax
    push si
    .
    .
    ret
ENDP AddValues

    mov ax, 1
    push ax
    mov ax, 2
    push ax
    mov ax, 3
    push ax
    mov ax, 4
    push ax
    call AddValues
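A common alternative, which the slides do not show, is to leave the parameters on the stack and read them through bp. The sketch below assumes a near call (a 2-byte return address) and the four word parameters pushed by the caller above. Summing them into ax is only an illustration of what AddValues might do.

```asm
PROC    AddValues
        push bp             ; save the caller's bp
        mov  bp, sp         ; [bp] = saved bp, [bp + 2] = return address
        mov  ax, [bp + 4]   ; last value pushed (4)
        add  ax, [bp + 6]   ; + 3
        add  ax, [bp + 8]   ; + 2
        add  ax, [bp + 10]  ; + first value pushed (1)
        pop  bp             ; restore the caller's bp
        ret  8              ; return and discard the 8 bytes of parameters
ENDP    AddValues
```

With this layout no register has to hold the return address, and the caller does not need to clean up the stack after the call.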
I’m SAVED! • Who should save the registers’ values? • The subroutine? • The caller? • Each method has its own strengths and weaknesses
Saving Private Registers • The caller saves the registers:
    push bx
    push cx
    call AddRegisters
    pop cx
    pop bx
versus • The subroutine saves the registers:
PROC AddRegisters
    push bx
    push cx
    .
    .
    pop cx
    pop bx
    ret
ENDP AddRegisters
Goto, er, Jump • Assembly language has one unconditional jump instruction – jmp • jmp works exactly the same way as call, except it doesn’t push the address of the instruction after it onto the stack • Syntax: jmp label • A jump may be near or far, depending on which code segment the label is in • Use it as little as possible
Goto-if (Conditional Jumps) • Many instructions affect the flags register • Conditional jump instructions decide whether to jump or fall through based on the contents of the flags register • Consider the following:
    mov cx, 5       ; 5 -> cx
Back:
    add ax, bx      ; ax = ax + bx
    dec cx
    jnz Back        ; while cx != 0
Comparison == Subtraction? • The cmp instruction is listed as a subtraction instruction on page 91 • Why? • sub sets the flags, but also changes its destination operand • cmp subtracts like sub and sets the same flags, but doesn’t change either operand • The flags can be tested after a cmp to find out how the two operands are related
Equal Equals Zero • Consider the following:
PROC RegEqual
    mov cx, 1       ; Preset cx to 1
    cmp ax, bx      ; Compare ax and bx
    je Continue     ; Jump if ax == bx
    xor cx, cx      ; Otherwise, set cx to 0
Continue:
    ret             ; Return to caller
ENDP RegEqual
Endings for Relationships • Useful relationships • op1 greater than op2? • op1 equal to op2? • op1 less than op2? • All conditional jump instructions start with the letter j and end with letters that match a relationship • Page 121 contains a list of useful conditional jump endings • above and below versus greater and less
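The above/below versus greater/less distinction can be sketched with a single comparison. The above and below endings treat the operands as unsigned, while greater and less treat them as signed. The labels IsAbove and IsGreater are hypothetical, and this is a fragment rather than a full program.

```asm
        mov  ax, 0FFFFh     ; 65535 if unsigned, -1 if signed
        mov  bx, 1
        cmp  ax, bx
        ja   IsAbove        ; taken: as unsigned values, 65535 is above 1
        ; (not reached in this example)
IsAbove:
        cmp  ax, bx
        jg   IsGreater      ; not taken: as signed values, -1 is less than 1
        ; execution falls through to here
IsGreater:
```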
Seeing in Double • Some conditional jump endings are synonymous: • jz is the same as je • jge is the same as jnl • Conditional jump synonyms are translated into the same machine code • They exist only for clarity
Destination Restrictions • The jmp and call instructions may direct program flow to anywhere in memory (near or far) • Conditional jump instructions may only go 128 bytes back or 127 bytes forward • When the destination is out of reach, reverse the condition and add an unconditional jump
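The workaround in the last bullet can be sketched as follows. FarAway and Skip are hypothetical labels, with FarAway assumed to lie outside the range of a conditional jump.

```asm
        cmp  ax, bx
        ; je FarAway        ; what we want, but FarAway is out of range
        jne  Skip           ; reversed condition: short hop over the jmp
        jmp  FarAway        ; jmp has no such distance limit
Skip:
        ; execution continues here when ax != bx
```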
Flag Operations • Some instructions exist only to modify the flags register • Carry flag instructions: • stc – sets the carry flag • clc – clears the carry flag • cmc – complements (toggles) the carry flag • Direction flag instructions: • std – sets the direction flag • cld – clears the direction flag • used only for string instructions (covered later)
Flag Operations (cont.) • Interrupt flag instructions: • sti – sets the interrupt flag • cli – clears the interrupt flag • Carry flag instructions are commonly used to pass information back from subroutines, or indicate that an error occurred
Carry Flag
    mov dl, [value]
    call TestBit
    jc BitIsSet
    .
    .
PROC TestBit
    clc
    test dl, 08h
    jz Exit
    stc
Exit:
    ret
ENDP TestBit
String Operations • “Strings” in assembly language are actually any contiguous group of bytes of any length • The 8086 CPU provides instructions that • Transfer strings • Inspect strings • All string instructions have common traits
String Instruction Commonalities • All operations that act on a source string (loading, copying, comparing) expect the source string to be at ds:si • All operations that act on a destination string (storing, copying, comparing) expect the destination string to be at es:di • All string operations increment or decrement si, di, or both • String operations increment when the direction flag is clear and decrement when it is set • All string operations can be prefixed with a repeat modifier
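These conventions can be seen together in a block copy. This is a minimal sketch: source and dest are hypothetical names assumed to be defined in the data segment, and the length of 10 bytes is arbitrary.

```asm
        mov  ax, ds
        mov  es, ax             ; make es address the same segment as ds
        mov  si, offset source  ; ds:si -> source string
        mov  di, offset dest    ; es:di -> destination buffer
        mov  cx, 10             ; number of bytes to copy
        cld                     ; direction flag clear: si and di count up
        rep  movsb              ; copy cx bytes, incrementing si and di each time
```

With std in place of cld, si and di would instead have to start at the last byte of each string, since both would count down.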
String Load Example • Consider the following:
    mov si, offset words        ; Get the address of the string
    cld                         ; Go forward
Repeat:
    lods [word ptr ds:si]       ; word at ds:si -> ax, si = si + 2
    cmp ax, 0                   ; Was the word zero?
    jne Repeat                  ; If not, load the next word