Learn how to create masks for uppercase characters in a text string using X86-SSE instructions, handling EOS characters, and multiple character ranges within text fragments.
Computer Architecture and System Programming Laboratory, TA Session 12: x86-SSE text string processing instructions
X86-SSE Programming – Text Strings (SSE4.2)
An implicit-length text string uses a terminating End-Of-String (EOS) character. X86-SSE includes four SIMD text string instructions (PCMPESTRI, PCMPESTRM, PCMPISTRI, PCMPISTRM) that can process text string fragments up to 128 bits long. Suppose you are given a text string fragment and want to create a mask that marks the positions of its uppercase characters. For example, each 1 in the mask 1000110000010010b, written here in string order (leftmost digit = first character), marks an uppercase character at the corresponding position of the text string "Ab1cDE23f4gHi5J6". The desired character range and the text string fragment are loaded into registers XMM1 and XMM2, respectively.
Resulting mask value, with bit 0 corresponding to the first character: 0x4831 = 0100100000110001b
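To make the bit ordering concrete, here is a minimal Python sketch of the range comparison. It is a software model, not the instruction itself: the function name `range_mask` is made up for this illustration, and EOS handling is deliberately left out.

```python
def range_mask(ranges: bytes, fragment: bytes) -> int:
    """Model of a 'ranges' comparison: bit i is set when fragment[i]
    lies inside any (low, high) pair taken from `ranges`."""
    # Hypothetical helper for illustration only; ignores EOS handling.
    pairs = list(zip(ranges[0::2], ranges[1::2]))
    mask = 0
    for i, ch in enumerate(fragment[:16]):
        if any(lo <= ch <= hi for lo, hi in pairs):
            mask |= 1 << i
    return mask


mask = range_mask(b"AZ", b"Ab1cDE23f4gHi5J6")
print(hex(mask))                    # 0x4831
print(format(mask, "016b"))         # 0100100000110001 (bit 15 first)
print(format(mask, "016b")[::-1])   # 1000110000010010 (string order)
```

The last two lines print the same mask twice: once as an ordinary binary number and once reversed, so that the leftmost digit lines up with the first character of the string.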
The output format bit (bit 6 of the immediate) is set, which means that the mask value is expanded to bytes.
Multiple character ranges: XMM1 contains two range pairs, one for uppercase letters and one for lowercase letters.
Embedded EOS:
• the text string fragment includes an embedded EOS ('\0') character
• ZF is set to 1
• the final mask value excludes matching range characters that follow the EOS
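The three cases above (byte-expanded mask, two range pairs, embedded EOS) can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the helper names are invented, the operands are byte strings of at most 16 bytes, and the fragment with the embedded EOS is a made-up example rather than the one from the slide.

```python
def valid_prefix(reg: bytes) -> bytes:
    """Bytes before the first EOS (0x00); the whole register if none."""
    return reg[:16].split(b"\0", 1)[0]


def ranges_mask(xmm1: bytes, xmm2: bytes) -> int:
    """Bit i is set when byte i of xmm2 lies in a range pair of xmm1."""
    bounds = valid_prefix(xmm1)
    pairs = list(zip(bounds[0::2], bounds[1::2]))  # complete pairs only
    mask = 0
    for i, ch in enumerate(valid_prefix(xmm2)):    # bytes after EOS never match
        if any(lo <= ch <= hi for lo, hi in pairs):
            mask |= 1 << i
    return mask


def expand_to_bytes(mask: int) -> bytes:
    """Output format bit 6 set: every mask bit becomes 0xFF or 0x00."""
    return bytes(0xFF if (mask >> i) & 1 else 0x00 for i in range(16))


text = b"Ab1cDE23f4gHi5J6"
print(expand_to_bytes(ranges_mask(b"AZ", text)).hex())
# ff000000ffff0000000000ff0000ff00
print(hex(ranges_mask(b"AZaz", text)))               # 0x5d3b (all letters)
print(hex(ranges_mask(b"AZ", b"Ab1cDE2\0f4gHi5J6")))  # 0x31 ('H' and 'J' follow EOS)
```

In the last call the uppercase letters after the `\0` byte no longer contribute to the mask, which is the behaviour the third bullet describes.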
RFLAGS is set in a non-standard manner in order to supply the most relevant information:
• CF flag – Reset if IntRes2 is equal to zero, set otherwise
• ZF flag – Set if any byte/word of xmm2/mem128 is null, reset otherwise
• SF flag – Set if any byte/word of xmm1 is null, reset otherwise
• OF flag – IntRes2[0]
• AF flag – Reset
• PF flag – Reset
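The flag rules can be written down directly as a small Python model. The function name `string_flags` is hypothetical, and the sketch assumes byte-sized elements with the operands given as 16-byte strings.

```python
def string_flags(xmm1: bytes, xmm2: bytes, intres2: int) -> dict:
    """Flag values after an implicit-length string compare (byte elements)."""
    return {
        "CF": int(intres2 != 0),  # reset only when IntRes2 is zero
        "ZF": int(0 in xmm2),     # a null byte in xmm2/mem128
        "SF": int(0 in xmm1),     # a null byte in xmm1
        "OF": intres2 & 1,        # IntRes2[0]
        "AF": 0,
        "PF": 0,
    }


# Uppercase-range example: XMM1 = 'A','Z' followed by 14 null bytes.
flags = string_flags(b"AZ" + bytes(14), b"Ab1cDE23f4gHi5J6", 0x4831)
print(flags)
# {'CF': 1, 'ZF': 0, 'SF': 1, 'OF': 1, 'AF': 0, 'PF': 0}
```

Here SF is 1 because the range operand is shorter than 16 bytes, while ZF is 0 because the text fragment fills the whole register without an EOS.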
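Example: converting the uppercase letters of a text string fragment to lowercase.

    section .data
    str:        db 'Ab1cDE23f4gHi5J6'
    AZ_mask:    db 'A', 'Z'               ; one range pair: 'A'..'Z'
                times 14 db 0             ; remaining bytes are EOS
    imm:        equ 01000100b             ; imm[3:2] = 01 (ranges), imm[6] = 1 (byte mask)
    AZ2az_mask: times 16 db ('a' - 'A')   ; 0x20 in every byte
    result:     times 16 db 0
                db `\n\0`

    extern printf

    section .text
    global main
    main:
        enter   0, 0
        movdqu  xmm1, [AZ_mask]           ; character range
        movdqu  xmm2, [str]               ; text string fragment
        pcmpistrm xmm1, xmm2, imm         ; xmm0 = 0xFF in every uppercase position
        movdqu  xmm3, [AZ2az_mask]
        pand    xmm0, xmm3                ; xmm0 = 0x20 in every uppercase position
        paddb   xmm2, xmm0                ; uppercase + 0x20 = lowercase
        movdqu  [result], xmm2
        mov     rdi, result               ; result doubles as the printf format string
        mov     rax, 0                    ; no vector registers carry variadic arguments
        call    printf
        leave
        ret

Instructions used:
• MOVDQU xmm1, xmm2/m128 – Move unaligned double quadword from xmm2/m128 to xmm1.
• PAND xmm1, xmm2/m128 – Bitwise AND of xmm2/m128 and xmm1.
• PADDB xmm1, xmm2/m128 – Add packed byte integers from xmm2/m128 and xmm1.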
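The data flow of this program (byte mask, then PAND, then PADDB) can be checked with a short Python model. It is a sketch of the arithmetic only; the function name `to_lower_fragment` is an assumption made for this illustration.

```python
def to_lower_fragment(fragment: bytes) -> bytes:
    """Model of the pcmpistrm / pand / paddb sequence on one fragment."""
    # pcmpistrm with imm[6] = 1: 0xFF where the byte is in 'A'..'Z'
    byte_mask = bytes(0xFF if ord("A") <= c <= ord("Z") else 0x00
                      for c in fragment)
    # pand with 16 copies of ('a' - 'A') = 0x20
    delta = bytes(m & (ord("a") - ord("A")) for m in byte_mask)
    # paddb: byte-wise addition, wrapping modulo 256
    return bytes((c + d) & 0xFF for c, d in zip(fragment, delta))


print(to_lower_fragment(b"Ab1cDE23f4gHi5J6"))  # b'ab1cde23f4ghi5j6'
```

Because the mask is 0x00 outside the uppercase range, digits and lowercase letters get 0 added to them and pass through unchanged.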
Equal any (imm[3:2] = 00). The result is a mask: 1 if the character belongs to the set, 0 if not.
    pcmpistrm xmm1, xmm2, 01000000b
Equal each (imm[3:2] = 10). The result is a mask: 1 if the corresponding bytes are equal, 0 if not equal.
    pcmpistrm xmm1, xmm2, 01001000b
Equal ordered (imm[3:2] = 11). The result is a mask: 1 if the substring is found at the corresponding position, 0 otherwise.
    pcmpistrm xmm1, xmm2, 01001100b
In all three examples the inputs are in xmm1 and xmm2 and the result is written to xmm0. Because bit 6 of the immediate is set, each mask bit is expanded to a full byte.
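The three aggregation modes differ mainly in how they treat bytes at or after the EOS. The Python sketch below models them for byte elements; all function names are hypothetical, and the operands are padded to 16 bytes with null bytes to stand in for XMM registers.

```python
def reg(s: bytes) -> bytes:
    """Pad or truncate to a 16-byte 'register'."""
    return s.ljust(16, b"\0")[:16]


def valid_len(r: bytes) -> int:
    """Number of bytes before the first EOS (16 if there is none)."""
    i = r.find(0)
    return 16 if i < 0 else i


def equal_any(xmm1: bytes, xmm2: bytes) -> int:
    chars = xmm1[:valid_len(xmm1)]                 # xmm1 is a character set
    return sum(1 << j for j in range(valid_len(xmm2)) if xmm2[j] in chars)


def equal_each(xmm1: bytes, xmm2: bytes) -> int:
    n1, n2 = valid_len(xmm1), valid_len(xmm2)
    mask = 0
    for j in range(16):
        both_valid = j < n1 and j < n2
        both_invalid = j >= n1 and j >= n2         # past EOS in both: counts as equal
        if both_invalid or (both_valid and xmm1[j] == xmm2[j]):
            mask |= 1 << j
    return mask


def equal_ordered(xmm1: bytes, xmm2: bytes) -> int:
    n1, n2 = valid_len(xmm1), valid_len(xmm2)      # xmm1 is the substring
    mask = 0
    for j in range(16):
        # Only substring bytes that still fit in the register are compared.
        if all(j + i < n2 and xmm1[i] == xmm2[j + i]
               for i in range(min(n1, 16 - j))):
            mask |= 1 << j
    return mask


print(hex(equal_any(reg(b"aeiou"), reg(b"hello world"))))   # 0x92
print(hex(equal_each(reg(b"hello"), reg(b"help!"))))        # 0xffe7
print(hex(equal_ordered(reg(b"ab"), reg(b"xxabxxab"))))     # 0x44
```

In the equal-each example the upper bits are set because both operands have already ended at those positions, which the instruction treats as a match.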
Index output in RCX, no set bit in IntRes2:
• IntRes1 calculation – mask according to the given range
• IntRes2 calculation – negative polarity (IntRes1 is negated)
• RCX = index of least significant set bit in IntRes2
• RCX = 16 (invalid index)
Index output in RCX, fragment containing EOS:
• IntRes1 calculation – mask according to the given range
• IntRes2 calculation – negative polarity (IntRes1 is negated)
• RCX = index of least significant set bit in IntRes2
• RCX = 11 (index of '\0' character, or length of string)
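Example: computing the length of an implicit-length text string.

    section .data
    str:        db 'Ab1cDE23f4gHi5J6'
                db 'Ab1cDE23f4g', 0
    EOS_mask:   db 0x1, 0xFF              ; one range pair: every non-EOS byte
                times 14 db 0
    imm:        equ 00010100b             ; imm[3:2] = 01 (ranges), imm[5:4] = 01 (negative polarity)

    section .text
    global strlen
    strlen:
        enter   0, 0
        xor     rax, rax                  ; rax = offset of the current fragment
        xor     rcx, rcx
        movdqu  xmm1, [EOS_mask]
    .loop:
        add     rax, rcx                  ; skip the fragment already scanned
        pcmpistri xmm1, [str+rax], imm    ; rcx = index of the first EOS, or 16
        jnz     .loop                     ; ZF = 0: no EOS in this fragment
        add     rax, rcx                  ; add the index of EOS in the last fragment
        leave
        ret

First loop cycle: the fragment contains no EOS, so ZF = 0 and RCX = 16.
Second loop cycle: the fragment contains EOS, so ZF = 1 and RCX = 11; the returned length is 16 + 11 = 27.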
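The loop can be traced with a small Python model of PCMPISTRI for this particular immediate (ranges 0x01..0xFF, negative polarity, least significant index). The function names are invented for the sketch, and padding a short final fragment with null bytes is a simplification: the real instruction always reads 16 bytes of memory.

```python
def pcmpistri_eos(fragment: bytes):
    """Return (rcx, zf) for a 16-byte fragment under the strlen immediate."""
    pos = fragment.find(0)
    valid = 16 if pos < 0 else pos
    intres1 = (1 << valid) - 1          # every valid byte lies in 0x01..0xFF
    intres2 = ~intres1 & 0xFFFF         # negative polarity
    if intres2:
        rcx = (intres2 & -intres2).bit_length() - 1   # least significant set bit
    else:
        rcx = 16                        # no set bit: invalid index
    return rcx, pos >= 0                # ZF = fragment contains a null byte


def sse_strlen(buf: bytes) -> int:
    rax = rcx = 0
    while True:
        rax += rcx                                    # add rax, rcx
        fragment = buf[rax:rax + 16].ljust(16, b"\0")
        rcx, zf = pcmpistri_eos(fragment)             # pcmpistri
        if zf:                                        # jnz .loop falls through
            return rax + rcx


print(pcmpistri_eos(b"Ab1cDE23f4gHi5J6"))                 # (16, False)
print(sse_strlen(b"Ab1cDE23f4gHi5J6" + b"Ab1cDE23f4g\0"))  # 27
```

The first call corresponds to the first loop cycle (no EOS, index 16); the full run adds the index 11 from the second fragment to give 27.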