Chapter 8 String Operations
8.1 Using String Instructions

String in the 80x86 Environment
• Contiguous collection of bytes, words, doublewords or quadwords in memory
• Commonly defined in a program’s data segment using such directives as
    response  BYTE   20 DUP (?)
    label1    BYTE   'The results are ', 0
    arrayD    DWORD  60 DUP (0)

String Instructions
• movs (move string)
  • Copy a string from one location to another
• cmps (compare string)
  • Compare the contents of two strings
• scas (scan string)
  • Search a string for one particular value
• stos (store string)
  • Store a new value in some string position
• lods (load string)
  • Copy a value out of some string position

String Instruction Operation
• Each instruction applies to a source string, a destination string, or both
• The elements (bytes, words, doublewords or quadwords) are processed one at a time
• Register indirect addressing is used to locate the individual string elements
  • ESI (RSI in 64-bit mode) is used for source string elements
  • EDI (RDI in 64-bit mode) is used for destination string elements

Mnemonic Variants
• Since ESI and EDI are used automatically, operands are unnecessary
• Without operands the assembler cannot tell the size of a string element, so a suffix on the mnemonic supplies it: b (byte), w (word), d (doubleword) or q (quadword, 64-bit mode only)
• For example, movsb moves byte elements and movsd moves doubleword elements

More on String Instruction Operation
• A string instruction operates on only one string element at a time, but gets ready to operate on the next element
• It changes the source index register ESI and/or the destination index register EDI to contain the address of the next element of the string(s), adding or subtracting the element size (1, 2, 4 or 8)
• Movement is forward or backward depending on the direction flag DF
  • cld clears DF (DF=0), which selects the forward direction
  • std sets DF (DF=1), which selects the backward direction
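
To see how the direction flag steers the index registers, here is a minimal sketch in the same MASM style as the rest of the chapter. The names srcD and dstD are illustrative and not from the slides, and the sketch assumes a 32-bit flat-model project configured so that main is the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
srcD    DWORD  10, 20, 30        ; hypothetical source string of doublewords
dstD    DWORD  3 DUP (?)         ; hypothetical destination string

.CODE
main    PROC
        push   esi               ; preserve the index registers
        push   edi
        lea    esi, srcD         ; ESI -> first source element
        lea    edi, dstD         ; EDI -> first destination element
        cld                      ; DF = 0: move forward
        movsd                    ; copies 10; ESI and EDI each increase by 4
        movsd                    ; copies 20; ESI and EDI each increase by 4

        lea    esi, srcD+8       ; ESI -> last source element
        lea    edi, dstD+8       ; EDI -> last destination element
        std                      ; DF = 1: move backward
        movsd                    ; copies 30; ESI and EDI each decrease by 4
        cld                      ; leave DF clear again before returning

        pop    edi
        pop    esi
        mov    eax, 0            ; exit code
        ret
main    ENDP
END
```

Because the element is a doubleword, each movsd changes the index registers by 4 rather than 1; with movsb the step would be 1.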

Design to Copy Null-Terminated String
    while next source byte is not null loop
        copy source byte to destination;
        increment source index;
        increment destination index;
    end while;
    put null byte at end of destination string;

Implementation of String Copy
        mov   edi, [ebp+8]         ; destination
        mov   esi, [ebp+12]        ; initial source address
        cld                        ; clear direction flag
    whileNoNull:
        cmp   BYTE PTR [esi], 0    ; null source byte?
        je    endWhileNoNull       ; stop copying if null
        movsb                      ; copy one byte
        jmp   whileNoNull          ; go check next byte
    endWhileNoNull:
        mov   BYTE PTR [edi], 0    ; terminate dest string
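
The fragment above reads its parameters from [ebp+8] and [ebp+12], so it presumably sits inside a procedure with a standard stack frame. The following is one possible sketch of such a procedure together with a caller. The procedure name strcopy, the data names, the saved-register list, and the cdecl-style parameter cleanup are all assumptions made for illustration, as is a project configured with main as the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
source  BYTE   "Hello", 0        ; hypothetical null-terminated source string
dest    BYTE   6 DUP (?)         ; room for 5 characters plus the null byte

.CODE
main    PROC
        lea    eax, source
        push   eax               ; second parameter: source address
        lea    eax, dest
        push   eax               ; first parameter: destination address
        call   strcopy           ; strcopy(dest, source)
        add    esp, 8            ; remove parameters
        mov    eax, 0            ; exit code
        ret
main    ENDP

strcopy PROC                     ; hypothetical name
        push   ebp               ; establish stack frame
        mov    ebp, esp
        push   esi               ; save registers used below
        push   edi
        pushfd                   ; save flags, including DF
        mov    edi, [ebp+8]      ; destination
        mov    esi, [ebp+12]     ; initial source address
        cld                      ; clear direction flag
whileNoNull:
        cmp    BYTE PTR [esi], 0 ; null source byte?
        je     endWhileNoNull    ; stop copying if null
        movsb                    ; copy one byte
        jmp    whileNoNull       ; go check next byte
endWhileNoNull:
        mov    BYTE PTR [edi], 0 ; terminate dest string
        popfd                    ; restore flags and registers
        pop    edi
        pop    esi
        pop    ebp
        ret
strcopy ENDP
END
```

The destination address is pushed last so that it lands at [ebp+8], matching the offsets used in the slide's fragment.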

Repeat Prefixes
• Change the string instructions into versions which repeat automatically, either for a fixed number of iterations or until some condition is satisfied
• The three repeat prefixes actually correspond to two different single-byte codes (F3 for rep and repe/repz, F2 for repne/repnz)
• They are not themselves instructions, but supplement the machine codes for the primitive string instructions, making new instructions

rep prefix
• Normally used with movs and with stos
• Causes this design to be executed:
    while count in ECX > 0 loop
        perform primitive instruction;
        decrement ECX by 1;
    end while;
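
As a concrete instance of the rep design above, this sketch copies a fixed number of bytes with rep movsb. The data names and the count of 8 are illustrative assumptions, as is a project configured with main as the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
source  BYTE   "ABCDEFGH"        ; hypothetical 8-byte source string
dest    BYTE   8 DUP (?)         ; hypothetical destination of the same size

.CODE
main    PROC
        push   esi               ; preserve the index registers
        push   edi
        lea    esi, source       ; source index
        lea    edi, dest         ; destination index
        mov    ecx, 8            ; number of elements to copy
        cld                      ; forward direction
        rep movsb                ; copy one byte and decrement ECX until ECX = 0
        ; here ECX = 0, ESI = address of source + 8, EDI = address of dest + 8
        pop    edi
        pop    esi
        mov    eax, 0            ; exit code
        ret
main    ENDP
END
```

If ECX is already zero when rep movsb is reached, the primitive instruction is not executed at all, which matches the while loop in the design.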

Additional Repeat Prefixes
• repe (equivalent mnemonic repz)
  • “repeat while equal” (“repeat while zero”)
• repne (same as repnz)
  • “repeat while not equal” (“repeat while not zero”)
• Each is appropriate for use with cmps and scas, which affect the zero flag ZF

repe and repne Operation
• Each works the same as rep, iterating a primitive instruction while ECX is not zero
• Each also examines ZF after the string instruction is executed
• repe and repz continue iterating while ZF=1, as it would be following a comparison where two operands were equal
• repne and repnz continue iterating while ZF=0

cmps
• Subtracts two string elements (source element minus destination element) and sets flags based on the difference; neither element is changed
• If used in a loop, it is appropriate to follow cmps by a conditional jump instruction
• repe and repne prefixes are often used with cmps instructions
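
A short sketch of repe cmpsb followed by a conditional jump, as the slide suggests. The strings, the result variable, and the count of 8 are illustrative assumptions, as is a project configured with main as the entry point. The two strings deliberately differ only in their last byte.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
str1    BYTE   "assembly"        ; hypothetical first string, 8 bytes
str2    BYTE   "assemble"        ; hypothetical second string, 8 bytes
result  DWORD  ?                 ; 0 if the strings match, 1 if they differ

.CODE
main    PROC
        push   esi               ; preserve the index registers
        push   edi
        lea    esi, str1         ; source string
        lea    edi, str2         ; destination string
        mov    ecx, 8            ; number of bytes to compare
        cld                      ; forward direction
        repe cmpsb               ; compare while bytes are equal and ECX is not 0
        je     same              ; ZF = 1: the last pair compared was equal
        mov    result, 1         ; ZF = 0: a mismatch ended the loop
        jmp    done
same:
        mov    result, 0
done:
        pop    edi
        pop    esi
        mov    eax, 0            ; exit code
        ret
main    ENDP
END
```

With this data the mismatch occurs on the eighth comparison, so ECX is 0 when the loop stops even though the strings differ. That is why the test after repe cmpsb looks at ZF rather than at ECX.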

scas
• Used to scan a string for the presence or absence of a particular string element
• The string which is examined is a destination string: the address of the element being examined is in the destination index register EDI
• The accumulator (AL, AX, EAX or RAX, matching the element size) contains the element being scanned for
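
The scan described above can be sketched with repne scasb, which keeps scanning while the element differs from AL. The names phrase and position, the search character, and the length of 17 are assumptions chosen for illustration, as is a project configured with main as the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
phrase    BYTE   "string operations"   ; hypothetical 17-byte string to scan
position  DWORD  ?                     ; offset of first match, or -1 if none

.CODE
main    PROC
        push   edi                 ; preserve the destination index register
        lea    edi, phrase         ; string being scanned is a destination string
        mov    ecx, 17             ; number of bytes to examine
        mov    al, 'r'             ; element being scanned for
        cld                        ; forward direction
        repne scasb                ; scan while byte is not 'r' and ECX is not 0
        jne    notFound            ; ZF = 0: no byte matched
        dec    edi                 ; EDI was advanced one past the match
        sub    edi, OFFSET phrase  ; convert address to offset within phrase
        mov    position, edi       ; 2 for this data ('r' is the third byte)
        jmp    done
notFound:
        mov    position, -1
done:
        pop    edi
        mov    eax, 0              ; exit code
        ret
main    ENDP
END
```

Since scasb updates EDI after every comparison, EDI ends up one element beyond the match, so it has to be decremented before it is used.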

stos
• Copies a byte, a word, a doubleword or a quadword from the accumulator to an element of a destination string
• Affects no flag, so only the rep prefix is appropriate for use with it
• When repeated, it copies the same value into consecutive positions of a string
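
For instance, rep stosb can fill a buffer with spaces. In this sketch the buffer name and its size of 20 are illustrative assumptions, as is a project configured with main as the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
buffer  BYTE   20 DUP (?)        ; hypothetical 20-byte destination string

.CODE
main    PROC
        push   edi               ; preserve the destination index register
        lea    edi, buffer       ; destination index
        mov    al, ' '           ; value to store, held in the accumulator
        mov    ecx, 20           ; number of bytes to fill
        cld                      ; forward direction
        rep stosb                ; store AL at [EDI] 20 times, advancing EDI
        pop    edi
        mov    eax, 0            ; exit code
        ret
main    ENDP
END
```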

lods
• Copies a source string element to the accumulator
• No repeat prefix is useful with lods
• lods and stos are often used together in a loop
  • lods at the beginning of a loop to fetch an element
  • stos at the end after the element is manipulated
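
The lods/stos loop pattern might look like the following sketch, which doubles every element of a doubleword array in place. The array name, its contents, and the choice of doubling as the manipulation are assumptions for illustration, as is a project configured with main as the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
values  DWORD  1, 2, 3, 4, 5     ; hypothetical array, updated in place

.CODE
main    PROC
        push   esi               ; preserve the index registers
        push   edi
        lea    esi, values       ; source index
        mov    edi, esi          ; destination is the same array
        mov    ecx, 5            ; number of elements
        cld                      ; forward direction
forElement:
        lodsd                    ; EAX := next element; ESI increases by 4
        add    eax, eax          ; manipulate the element (double it)
        stosd                    ; store it back; EDI increases by 4
        loop   forElement        ; repeat for all elements
        ; values now holds 2, 4, 6, 8, 10
        pop    edi
        pop    esi
        mov    eax, 0            ; exit code
        ret
main    ENDP
END
```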

xlat
• “translate”
• Uses a lookup table to modify the byte in AL
• The table is at the address in EBX
• The original value in AL is used as an index into the table
• The byte at that index is stored as the new value in AL
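
A single xlat can be tried on a small table before building a full 256-byte one. This sketch converts a value from 0 to 15 into its hexadecimal digit character. The names hexDigits and digitChar and the sample value 11 are assumptions for illustration, as is a project configured with main as the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
hexDigits  BYTE  "0123456789ABCDEF"   ; hypothetical 16-entry lookup table
digitChar  BYTE  ?                    ; receives the translated byte

.CODE
main    PROC
        push   ebx               ; preserve EBX
        lea    ebx, hexDigits    ; EBX := address of the table
        mov    al, 11            ; index into the table
        xlat                     ; AL := byte at [EBX + 11], the character 'B'
        mov    digitChar, al
        pop    ebx
        mov    eax, 0            ; exit code
        ret
main    ENDP
END
```

A table this short is only safe because the index is known to be below 16; any byte value can appear in AL in general, which is why translation tables are normally 256 bytes long.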

Translation Code Example
        mov   ecx, strLength    ; string length
        lea   ebx, table        ; addr of translation table
        lea   esi, string       ; address of string
        lea   edi, string       ; destination also string
    forIndex:
        lodsb                   ; copy next character to AL
        xlat                    ; translate character
        stosb                   ; copy character back into string
        loop  forIndex          ; repeat for all characters

Building a Translation Table
• Normally 256 bytes long
• One entry for each possible byte value

Example Table for ASCII Codes
• Leaves lower case letters and digits unchanged
• Translates upper case letters to lower case
• Translates all other characters to spaces
    table  BYTE  48 DUP (' '), "0123456789", 7 DUP (' ')
           BYTE  "abcdefghijklmnopqrstuvwxyz", 6 DUP (' ')
           BYTE  "abcdefghijklmnopqrstuvwxyz", 133 DUP (' ')
• 0 is at position 48 (30h), so 0 is translated to 0
• a is at position 65 (41h), so A is translated to a
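
Putting the translation loop and the example table together gives a complete program along these lines. The sample text, the definition of strLength as a constant, the added cld, and the saved registers are assumptions for illustration, as is a project configured with main as the entry point.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
string     BYTE  "Hello, World 42!"   ; hypothetical sample text, 16 bytes
strLength  EQU   16                   ; assumed here to be a constant
table      BYTE  48 DUP (' '), "0123456789", 7 DUP (' ')
           BYTE  "abcdefghijklmnopqrstuvwxyz", 6 DUP (' ')
           BYTE  "abcdefghijklmnopqrstuvwxyz", 133 DUP (' ')

.CODE
main    PROC
        push   ebx               ; preserve registers used below
        push   esi
        push   edi
        mov    ecx, strLength    ; string length
        lea    ebx, table        ; addr of translation table
        lea    esi, string       ; address of string
        lea    edi, string       ; destination also string
        cld                      ; forward direction
forIndex:
        lodsb                    ; copy next character to AL
        xlat                     ; translate character
        stosb                    ; copy character back into string
        loop   forIndex          ; repeat for all characters
        ; string now holds "hello  world 42 "
        pop    edi
        pop    esi
        pop    ebx
        mov    eax, 0            ; exit code
        ret
main    ENDP
END
```

The comma and the exclamation mark fall in the ranges covered by spaces in the table, so both are replaced by blanks, while H and W map to their lower case forms.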

dtoa macro expansion
        push  ebx             ; save EBX
        lea   ebx, dest       ; destination address
        push  ebx             ; destination parameter
        mov   ebx, [esp+4]    ; in case source was EBX
        mov   ebx, source     ; source value
        push  ebx             ; source parameter
        call  dtoaproc        ; dtoaproc(source,dest)
        add   esp, 8          ; remove parameters
        pop   ebx             ; restore EBX
• The real work is done by dtoaproc

dtoaproc algorithm
    determine whether source positive or negative;
    put 10 spaces in destination area;
    make EDI point at 11th byte;
    repeat
        digit := source mod 10;
        convert digit to ASCII and store at [EDI];
        decrement EDI;
        divide source by 10;
    until source = 0;
    if original source negative, store a minus sign in front of the leading digit;

dtoaproc special case
• Most negative numbers are handled by processing the corresponding positive number, “remembering” the minus sign
• 80000000h has no corresponding positive number, so the characters -2147483648 are stored in the destination area one at a time
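
The dtoaproc algorithm can be sketched as follows. This is not the library's dtoaproc: the procedure name dtoaSketch, the caller, and the saved-register list are assumptions, as is a project configured with main as the entry point. It also swaps in a different technique for the special case. Instead of storing the characters of -2147483648 one at a time, it uses unsigned division, so the magnitude 80000000h is divided correctly without a separate branch. The parameter offsets follow the macro expansion, where the source value is pushed last.

```asm
.586
.MODEL FLAT
.STACK 4096

.DATA
dest    BYTE   11 DUP (?)        ; hypothetical 11-byte destination area

.CODE
main    PROC
        lea    eax, dest
        push   eax               ; destination parameter
        mov    eax, -123
        push   eax               ; source parameter
        call   dtoaSketch        ; dest becomes 7 spaces followed by "-123"
        add    esp, 8            ; remove parameters
        mov    eax, 0            ; exit code
        ret
main    ENDP

dtoaSketch PROC                  ; hypothetical name, not the real dtoaproc
        push   ebp               ; establish stack frame
        mov    ebp, esp
        push   eax               ; save registers and flags
        push   ebx
        push   ecx
        push   edx
        push   edi
        pushfd
        mov    edi, [ebp+12]     ; destination address
        mov    al, ' '
        mov    ecx, 10
        cld
        rep stosb                ; 10 spaces; EDI now points at the 11th byte
        mov    eax, [ebp+8]      ; source value
        cmp    eax, 0
        jge    nonNegative
        neg    eax               ; magnitude; 80000000h stays 80000000h
nonNegative:
        mov    ebx, 10           ; divisor
nextDigit:
        mov    edx, 0            ; EDX:EAX is the unsigned dividend
        div    ebx               ; EAX := quotient, EDX := remainder (digit)
        add    dl, '0'           ; convert digit to ASCII
        mov    [edi], dl         ; store digit
        dec    edi               ; move left one position
        cmp    eax, 0
        jne    nextDigit         ; until the quotient is 0
        cmp    DWORD PTR [ebp+8], 0
        jge    done              ; original source was not negative
        mov    BYTE PTR [edi], '-' ; minus sign in front of the leading digit
done:
        popfd                    ; restore flags and registers
        pop    edi
        pop    edx
        pop    ecx
        pop    ebx
        pop    eax
        pop    ebp
        ret
dtoaSketch ENDP
END
```

The largest magnitude has 10 digits, which occupy bytes 2 through 11 of the destination and leave byte 1 for the minus sign, so the 11-byte area is never overrun.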