Real-World Instruction Set Architectures: Focus on IA-32
Slides: http://www.pds.ewi.tudelft.nl/~iosup/Courses/2012_ti1400_5.ppt
Course website: http://www.pds.ewi.tudelft.nl/~iosup/Courses/2012_ti1400_results.htm

IA family
• IA (Intel Architecture) is a family of processors
• Each processor has the same architecture but a different organization
  • same instruction set
  • different performance
• 32-bit memory addresses and variable-length instructions
• Very large instruction set (not RISC)
Timeline (figure): 1982, 1985, 1989, 1993

Other Example: PowerPC
Figure (block diagram) with labels: instruction unit, instructions, floating-point unit, integer unit, cache, main memory

Floorplan PowerPC
Figure (floorplan) with labels: Registers, Load/Store Unit, Data Cache, MMU, Instr. Cache, FPU

IA-32
• Introduction
• Memory Layout
• Registers
• Instructions
• Examples of Assembler Code for IA-32
• Subroutines

Memory
• Memory is byte addressable
• Doublewords can start at any byte location
• Data operands are 8 or 32 bits wide
• Byte order is little-endian (vs. big-endian on PowerPC)

Addressable data units
• Byte: bits 7..0
• Doubleword: bits 31..0, byte 3 (most significant) down to byte 0 (least significant)
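
To make the little-endian byte order concrete, the following sketch lays out one doubleword in memory order, lowest address first. The class name `EndianDemo` and the sample value are illustrative assumptions; `java.nio.ByteBuffer` is only used here as a stand-in for byte-addressable memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianDemo {
    /** Returns the 4 bytes of value in memory order (index 0 = lowest address). */
    static byte[] layout(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    public static void main(String[] args) {
        int doubleword = 0x12345678; // sample value (assumption)
        byte[] little = layout(doubleword, ByteOrder.LITTLE_ENDIAN); // IA-32 style
        byte[] big = layout(doubleword, ByteOrder.BIG_ENDIAN);       // PowerPC style
        for (int i = 0; i < 4; i++) {
            System.out.printf("address+%d: little=%02X big=%02X%n", i, little[i], big[i]);
        }
    }
}
```

With little-endian order, byte 0 (the least significant byte) sits at the lowest address; with big-endian order, byte 3 does.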

IA register structure
• FP0 ... FP7: floating-point registers
• R0 ... R7: general-purpose registers

Register Naming
R0   EAX      Data registers (AX is the low half of EAX; AX consists of AH and AL)
R1   EBX
R2   ECX
R3   EDX
R4   ESP      Pointer registers
R5   EBP
R6   ESI      Index registers
R7   EDI
     EIP      Instruction Pointer
     EFLAGS   Status Register
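
The relation between EAX and its named parts AX, AH, and AL can be sketched with bit masks. This is a minimal illustration, not processor code; the class and method names are assumptions made for the example.

```java
final class SubRegisters {
    static int ax(int eax) { return eax & 0xFFFF; }        // bits 15..0
    static int ah(int eax) { return (eax >>> 8) & 0xFF; }  // bits 15..8
    static int al(int eax) { return eax & 0xFF; }          // bits 7..0

    /** Writing AL replaces only the low byte; the upper 24 bits of EAX are kept. */
    static int withAl(int eax, int value) {
        return (eax & 0xFFFFFF00) | (value & 0xFF);
    }

    public static void main(String[] args) {
        int eax = 0x12345678; // sample register contents (assumption)
        System.out.printf("AX=%04X AH=%02X AL=%02X%n", ax(eax), ah(eax), al(eax));
        System.out.printf("EAX after writing AL=FF: %08X%n", withAl(eax, 0xFF));
    }
}
```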

Status Register
Bit layout (bits 31..0): IOPL = bits 13-12, OF = bit 11, IF = bit 9, TF = bit 8, SF = bit 7, ZF = bit 6, CF = bit 0
• CF Carry
• ZF Zero
• SF Sign
• TF Trap
• IOPL I/O privilege level
• OF Overflow
• IF Interrupt enable
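
As a rough illustration of what the arithmetic flags record, the sketch below computes CF, ZF, SF, and OF for a 32-bit addition and packs them at the bit positions listed on the slide. The class `AddFlags` is a hypothetical helper; it models only these four flags.

```java
final class AddFlags {
    final int result;
    final boolean cf, zf, sf, of;

    AddFlags(int a, int b) {
        result = a + b;                               // wraps around like a 32-bit register
        cf = Integer.compareUnsigned(result, a) < 0;  // unsigned carry out of bit 31
        zf = result == 0;                             // result is zero
        sf = result < 0;                              // bit 31 of the result
        of = ((a ^ result) & (b ^ result)) < 0;       // signed overflow
    }

    /** Packs the flags at their status-register positions: CF=0, ZF=6, SF=7, OF=11. */
    int toEflagsBits() {
        int bits = 0;
        if (cf) bits |= 1;
        if (zf) bits |= 1 << 6;
        if (sf) bits |= 1 << 7;
        if (of) bits |= 1 << 11;
        return bits;
    }

    public static void main(String[] args) {
        AddFlags f = new AddFlags(0x7FFFFFFF, 1); // largest positive value plus one
        System.out.printf("result=%08X CF=%b ZF=%b SF=%b OF=%b%n",
                f.result, f.cf, f.zf, f.sf, f.of);
    }
}
```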

Special registers
• CS: Code Segment
• SS: Stack Segment
• DS, ES, FS, GS: Data Segments

Instructions
• Variable-length instructions, 1-12 bytes
• Five types of instructions
  • Copy instructions (MOV)
  • Arithmetic and logic instructions
  • Flow control
  • Processor control instructions
  • I/O instructions
• Format: INSTR Rdst, Rsrc

Instruction Format
Field          Length
Opcode         1 or 2 bytes (variable opcode length)
Addressing     1 or 2 bytes
Displacement   1 or 4 bytes
Immediate      1 or 4 bytes

Addressing modes
Many addressing modes:
1. Immediate: value
2. Direct: M(value)
3. Register: [reg]
4. Register indirect: M([reg])
5. Base with displacement: M([reg] + Disp)
6. Index with displacement: M([reg]*S + Disp)
7. Base with index: M([reg1] + [reg2]*S)
8. Base with index and displacement: M([reg1] + [reg2]*S + Disp)
S = 1, 2, 4 or 8; Disp = 8-bit or 32-bit signed number
Q: CISC or RISC?
Q: Why both 5 and 6?

Immediate and Direct
• Immediate
    MOV EAX, 25           [EAX] ← #25
    MOV EAX, 3FA00H       [EAX] ← #3FA00H
• Direct
    MOV EAX, loc          [EAX] ← M(loc)
  or
    MOV EAX, [loc]        [EAX] ← M(loc)

Register indirect
• Register
    MOV EBX, OFFSET loc         [EBX] ← #loc
  or
    LEA EBX, loc                [EBX] ← #loc
• Register indirect
    MOV EAX, [EBX]              [EAX] ← M([EBX])
  and
    MOV [EBX], 10               M([EBX]) ← #10
    MOV DWORD PTR [EBX], 10     M([EBX]) ← #10
Q: Why DWORD PTR?
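
One way to explore the DWORD PTR question is to compare what a byte-sized and a doubleword-sized store of the value 10 do to memory. The sketch below is an assumption-laden model: a 4-byte array pre-filled with FF stands in for memory at the address in EBX, and the class `OperandSizeDemo` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class OperandSizeDemo {
    /** Four bytes of toy memory, pre-filled with FF so untouched bytes stay visible. */
    static byte[] freshMemory() {
        byte[] memory = new byte[4];
        Arrays.fill(memory, (byte) 0xFF);
        return memory;
    }

    /** Models MOV BYTE PTR [EBX], value: one byte is written. */
    static byte[] storeByte(int address, int value) {
        byte[] memory = freshMemory();
        ByteBuffer.wrap(memory).order(ByteOrder.LITTLE_ENDIAN).put(address, (byte) value);
        return memory;
    }

    /** Models MOV DWORD PTR [EBX], value: four bytes are written, little-endian. */
    static byte[] storeDoubleword(int address, int value) {
        byte[] memory = freshMemory();
        ByteBuffer.wrap(memory).order(ByteOrder.LITTLE_ENDIAN).putInt(address, value);
        return memory;
    }

    public static void main(String[] args) {
        System.out.println("byte store:       " + Arrays.toString(storeByte(0, 10)));
        System.out.println("doubleword store: " + Arrays.toString(storeDoubleword(0, 10)));
    }
}
```

The immediate 10 by itself does not say how many bytes to write, so the two stores leave memory in different states.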

Base with Index and Displacement
• MOV EAX, [EBP+ESI*4+200]     [EAX] ← M([EBP] + [ESI]*4 + #200)
Example (figure): EBP = 1000, ESI = 40; the operand is at address 1000 + 40*4 + 200 = 1360 (the figure also marks address 1200)
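
The effective-address calculation for the base with index and displacement mode can be checked with a few lines of code. This is a sketch; the class `EffectiveAddress` and its parameter names are assumptions, and the values come from the slide's example.

```java
final class EffectiveAddress {
    /** Computes base + index * scale + disp; scale must be 1, 2, 4 or 8. */
    static int compute(int base, int index, int scale, int disp) {
        if (scale != 1 && scale != 2 && scale != 4 && scale != 8) {
            throw new IllegalArgumentException("scale must be 1, 2, 4 or 8");
        }
        return base + index * scale + disp;
    }

    public static void main(String[] args) {
        // MOV EAX, [EBP+ESI*4+200] with EBP = 1000 and ESI = 40
        System.out.println(compute(1000, 40, 4, 200));
    }
}
```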

Arithmetic instructions
• May have one or two operands
    ADD dst, src     meaning [dst] ← [dst] + [src]

Compare
• Used to compare values and leave register contents unchanged
    CMP dst, src     computes [dst] - [src], sets the status flags, and discards the result

Flow control
• Two basic branch instructions:
  • JMP [loc]: branch unconditionally
  • JG, JZ, JS, etc.: branch if condition is satisfied
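
How a conditional branch uses the flags left by CMP can be sketched as follows. JG is taken when ZF = 0 and SF = OF; the helper names `jg` and `jz` are illustrative assumptions that fold the CMP and the branch decision into one call.

```java
final class CompareAndBranch {
    /** True if JG would branch after CMP dst, src (signed dst > src). */
    static boolean jg(int dst, int src) {
        int diff = dst - src;                             // CMP computes dst - src
        boolean zf = diff == 0;
        boolean sf = diff < 0;
        boolean of = ((dst ^ src) & (dst ^ diff)) < 0;    // signed overflow of the subtraction
        return !zf && sf == of;
    }

    /** True if JZ would branch after CMP dst, src (dst equals src). */
    static boolean jz(int dst, int src) {
        return dst - src == 0;
    }

    public static void main(String[] args) {
        System.out.println(jg(5, 3));
        System.out.println(jg(Integer.MIN_VALUE, 1)); // the subtraction overflows here
    }
}
```

The overflow flag matters: for `Integer.MIN_VALUE` compared with 1 the difference wraps to a positive number, and only the SF = OF test keeps the signed comparison correct.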

Summation example: Java code
    int[] list = new int[n];
    int sum = 0;
    for (int index = n - 1; index >= 0; index--) {
        sum += list[index];
    }

Summation example: Assembler code, Version 1
    LEA EBX, NUM1            [EBX] ← #NUM1
    MOV ECX, N               [ECX] ← M(N)
    MOV EAX, 0               [EAX] ← #0
    MOV EDI, 0               [EDI] ← #0
L:  ADD EAX, [EBX+EDI*4]     Add next number to EAX
    INC EDI                  [EDI] ← [EDI] + 1
    DEC ECX                  [ECX] ← [ECX] - 1
    JG  L                    Branch if [ECX] > 0
    MOV SUM, EAX             M(SUM) ← [EAX]
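
To follow the Version 1 loop register by register, here is a rough Java transliteration. Local variables are named after the registers they stand for; the class name is an assumption, and the array plays the role of the memory starting at NUM1. Like the assembler loop, it assumes at least one element.

```java
final class SumVersion1 {
    static int sum(int[] num1) {
        int ecx = num1.length;   // MOV ECX, N
        int eax = 0;             // MOV EAX, 0
        int edi = 0;             // MOV EDI, 0
        do {
            eax += num1[edi];    // L: ADD EAX, [EBX+EDI*4]
            edi++;               // INC EDI
            ecx--;               // DEC ECX
        } while (ecx > 0);       // JG L
        return eax;              // MOV SUM, EAX
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {0, 1, 2, -1, -2}));
    }
}
```

The scale factor 4 disappears in Java because array indexing already counts in elements rather than bytes.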

Summation example: Assembler code, Version 2
    LEA EBX, NUM1            [EBX] ← #NUM1
    SUB EBX, 4
    MOV ECX, N               [ECX] ← M(N)
    MOV EAX, 0               [EAX] ← #0
L:  ADD EAX, [EBX+ECX*4]     Add next number to EAX
    LOOP L                   [ECX] ← [ECX] - 1; branch if [ECX] > 0
    MOV SUM, EAX             M(SUM) ← [EAX]
Q: Why SUB EBX, 4?
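
A Java transliteration of Version 2 shows the effect of the SUB instruction. Because ECX counts from N down to 1, the address EBX + ECX*4 with EBX = NUM1 - 4 selects element ECX - 1. The class name is an assumption, and the loop again assumes at least one element.

```java
final class SumVersion2 {
    static int sum(int[] num1) {
        int ecx = num1.length;       // MOV ECX, N
        int eax = 0;                 // MOV EAX, 0
        do {
            eax += num1[ecx - 1];    // L: ADD EAX, [EBX+ECX*4], with EBX = NUM1 - 4
            ecx--;                   // LOOP L decrements ECX ...
        } while (ecx != 0);          // ... and branches while ECX is not zero
        return eax;                  // MOV SUM, EAX
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {0, 1, 2, -1, -2}));
    }
}
```

Without the adjustment, the first iteration would read one doubleword past the end of the list and the last iteration would skip the first element.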

Summation example: Performance, Version 1 vs Version 2
Version 1:
    LEA EBX, NUM1
    MOV ECX, N
    MOV EAX, 0
    MOV EDI, 0
L:  ADD EAX, [EBX+EDI*4]
    INC EDI
    DEC ECX
    JG  L
    MOV SUM, EAX
Version 2:
    LEA EBX, NUM1
    SUB EBX, 4
    MOV ECX, N
    MOV EAX, 0
L:  ADD EAX, [EBX+ECX*4]
    LOOP L
    MOV SUM, EAX
• Replaced 1x MOV with 1x SUB
• Replaced 1x INC + 1x DEC + 1x JG with 1x LOOP
Q: What is the performance loss/gain?
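
A first, rough way to approach the performance question is to count executed instructions as a function of the list length n. The sketch below does only that; it is not a timing model, since the time per instruction differs between instructions and between processors (LOOP in particular is not necessarily as fast as the instructions it replaces). The class name is an assumption.

```java
final class InstructionCount {
    /** Version 1: 4 setup instructions, 4 per iteration (ADD, INC, DEC, JG), 1 final MOV. */
    static long version1(int n) {
        return 4 + 4L * n + 1;
    }

    /** Version 2: 4 setup instructions, 2 per iteration (ADD, LOOP), 1 final MOV. */
    static long version2(int n) {
        return 4 + 2L * n + 1;
    }

    public static void main(String[] args) {
        int n = 5; // list length used in the .asm file of this example
        System.out.println("Version 1: " + version1(n) + " instructions");
        System.out.println("Version 2: " + version2(n) + " instructions");
    }
}
```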

Summation example: The .asm File
.data
NUM1  DD 0, 1, 2, -1, -2
N     DD 5
SUM   DD 0
.code
MAIN: LEA EBX, NUM1
      SUB EBX, 4
      MOV ECX, N
      MOV EAX, 0
L:    ADD EAX, [EBX+ECX*4]
      LOOP L
      MOV SUM, EAX
      CMP SUM, 0
END MAIN

Sorting example: Java code
    int[] list = new int[n];
    int temp;
    for (int j = n - 1; j > 0; j--) {
        for (int k = j - 1; k >= 0; k--) {
            if (list[j] > list[k]) {
                temp = list[k];
                list[k] = list[j];
                list[j] = temp;
            }
        }
    }

Sorting example: Assembler code
        LEA  EAX, list          [EAX] ← #list
        MOV  EDI, N             [EDI] ← n
        DEC  EDI                [EDI] ← n-1, init(j)
outer:  MOV  ECX, EDI           [ECX] ← j
        DEC  ECX                [ECX] ← j-1, init(k)
        MOV  DL, [EAX+EDI]      load list(j) into DL
inner:  CMP  [EAX+ECX], DL      compare list(k) to list(j)
        JLE  next               if list(j) >= list(k)
        XCHG [EAX+ECX], DL      swap list(j), list(k)
        MOV  [EAX+ECX], DL      new list(j) in DL
next:   DEC  ECX                decrement k
        JGE  inner              repeat or terminate
        DEC  EDI                decrement j
        JGE  outer              repeat or terminate
Q: Is this code a correct implementation of the Java code?

Subroutines
• Call is CALL sub: [EIP] ← #sub
• Return address is saved on the stack (ESP register)
• Return is RET: [EIP] ← M([ESP]); [ESP] ← [ESP] + 4

Stack instructions
• ESP register is used as stack pointer
• PUSH src     [ESP] ← [ESP] - #4; M([ESP]) ← [src]
• POP dst      [dst] ← M([ESP]); [ESP] ← [ESP] + #4
• PUSHAD (POPAD): push (pop) all 8 registers on (from) the stack
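
The register-transfer descriptions of PUSH and POP translate almost line by line into code. In this sketch a map from address to doubleword stands in for memory; the class `StackSim` and the starting value of ESP are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class StackSim {
    private final Map<Integer, Integer> memory = new HashMap<>(); // address -> doubleword
    private int esp;

    StackSim(int initialEsp) {
        esp = initialEsp;
    }

    /** PUSH src: [ESP] <- [ESP] - 4, then M([ESP]) <- [src]. */
    void push(int src) {
        esp -= 4;
        memory.put(esp, src);
    }

    /** POP dst: [dst] <- M([ESP]), then [ESP] <- [ESP] + 4. */
    int pop() {
        int dst = memory.getOrDefault(esp, 0);
        esp += 4;
        return dst;
    }

    int esp() {
        return esp;
    }

    public static void main(String[] args) {
        StackSim stack = new StackSim(10056);
        stack.push(7);
        System.out.println("ESP after PUSH: " + stack.esp());
        System.out.println("popped " + stack.pop() + ", ESP after POP: " + stack.esp());
    }
}
```

The stack grows toward lower addresses: every PUSH lowers ESP by 4 and every POP raises it by 4.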

Stack frames
Note: Sub1 starts at address 2400
    ....  PUSH N        Parameter N on stack
    2000  CALL Sub1     Call subroutine at 2400
    ...........
Step   State                                 ESP     EIP    Stack contents
1/4    before PUSH N                         10056
2/4    after PUSH N                          10052   2000   10052: N
3/4    CALL Sub1 saves the return address    10048   2000   10048: 2004; 10052: N
4/4    after CALL Sub1                       10048   2400   10048: 2004; 10052: N
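
The four steps can be replayed with a small simulation of ESP, EIP, and the stack. The addresses 10056, 2000, 2004, and 2400 are taken from the slides; the class `CallSim`, its methods, and the value chosen for N are assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

final class CallSim {
    private final Map<Integer, Integer> memory = new HashMap<>(); // address -> doubleword
    int esp;
    int eip;

    CallSim(int esp, int eip) {
        this.esp = esp;
        this.eip = eip;
    }

    void push(int value) {
        esp -= 4;
        memory.put(esp, value);
    }

    int pop() {
        int value = memory.getOrDefault(esp, 0);
        esp += 4;
        return value;
    }

    /** CALL: save the return address on the stack, then jump to the subroutine. */
    void call(int target, int returnAddress) {
        push(returnAddress);
        eip = target;
    }

    /** RET: pop the return address from the stack into EIP. */
    void ret() {
        eip = pop();
    }

    public static void main(String[] args) {
        CallSim cpu = new CallSim(10056, 2000);
        cpu.push(5);            // PUSH N, with N = 5 as a sample value
        cpu.call(2400, 2004);   // CALL Sub1
        System.out.println("in Sub1:   ESP=" + cpu.esp + " EIP=" + cpu.eip);
        cpu.ret();              // RET
        System.out.println("after RET: ESP=" + cpu.esp + " EIP=" + cpu.eip);
    }
}
```

After RET the parameter N is still on the stack, so ESP is back at 10052 rather than 10056.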

Subroutine Sub1
Sub1: PUSH EAX                Save EAX
      PUSH EBX                Save EBX
      MOV  EAX, [ESP + 12]    n to EAX
      DEC  EAX
      ....
      PUSH EAX                Load n-1 on stack
L:    CALL Sub2               Call subroutine
      POP  N                  Put result in M(N)
      POP  EBX                Restore EBX
      POP  EAX                Restore EAX
      RET                     Return

Stack frame in Sub1, after PUSH EBX
2400: PUSH EAX
      PUSH EBX                (stack frame shown at this point)
      MOV  EAX, [ESP + 12]
      DEC  EAX
Address   Contents
10036
10040     [EBX]               (top of stack, ESP)
10044     [EAX]
10048     Return address
10052     N
EIP = ?
Q: What is the value of EIP?
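
The offset 12 used to fetch n follows from the stack layout: above the top of the stack sit the saved registers and the return address, each 4 bytes wide. The sketch below recomputes it under the assumption, taken from the earlier stack-frame slides, that ESP is 10048 on entry to Sub1 and N is stored at 10052. The class and method names are hypothetical.

```java
final class Sub1Frame {
    static final int WORD = 4; // size of one stack entry in bytes

    /** ESP after the subroutine has pushed the given number of registers. */
    static int espAfterSaves(int espOnEntry, int savedRegisters) {
        return espOnEntry - WORD * savedRegisters;
    }

    /** Distance from the top of the stack to the parameter: saved registers plus return address. */
    static int parameterOffset(int savedRegisters) {
        return WORD * savedRegisters + WORD;
    }

    public static void main(String[] args) {
        int esp = espAfterSaves(10048, 2);  // after PUSH EAX and PUSH EBX
        int offset = parameterOffset(2);
        System.out.println("ESP=" + esp + ", offset=" + offset + ", address of N=" + (esp + offset));
    }
}
```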