Introduction to Assembly Language IA32-II Summer 2014 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University
Instructions
• NOP: does nothing and takes no operands
  • May be used for delay
  • It is actually encoded as an exchange of a register with itself: XCHG EAX, EAX
• PUSH: pushes a word, double-word or quad-word onto the stack
  • It automatically decrements the stack pointer esp by the operand size (4 for a double-word)
• POP: pops data from the stack
  • It adjusts esp automatically, incrementing it by the operand size
• EQU: an assembler directive that gives a symbol a constant value
• HLT: halts the processor (a privileged instruction; user programs exit through a system call instead)
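The effect of PUSH and POP on esp can be sketched in C. This is only a toy model under stated assumptions: the `Stack32` type, its 64-byte size, and the function names are hypothetical and do not come from the slides; esp is modelled as a byte offset into a small array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 64-byte stack and an esp that is a byte offset. */
#define STACK_BYTES 64u

typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* one slot per double-word */
    uint32_t esp;                  /* byte offset of the current top of stack */
} Stack32;

/* An empty stack: esp points just past the highest slot. */
void stack_init(Stack32 *s) { s->esp = STACK_BYTES; }

/* Like pushl: decrement esp by 4 first, then store at the new top. */
void push32(Stack32 *s, uint32_t value) {
    assert(s->esp >= 4);               /* otherwise the model stack is full */
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Like popl: read the top of the stack, then increment esp by 4. */
uint32_t pop32(Stack32 *s) {
    assert(s->esp <= STACK_BYTES - 4); /* otherwise the model stack is empty */
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;
    return value;
}
```

The stack grows toward lower addresses, so the last value pushed is the first one popped.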
Sections of the program
• An assembly program can be divided into three sections:
  • The data section: for declaring initialized data or constants. This data is typically not changed at runtime, although the section itself is writable
  • The bss section: for declaring uninitialized variables
  • The text section: the section containing the code
Memory Segments
Assembly follows the segmented memory model, which divides the system memory into groups of independent segments referenced by pointers located in the segment registers.
• Data segment: represented by the .data and .bss sections. It declares the memory region where the program's data elements are stored. Once declared, it cannot be extended and it remains static throughout the program. The .bss memory is zero-filled. The DS register stores the starting address of the data segment
• Code segment: represented by the .text section. It defines the area in memory that stores the instruction codes. This is also a fixed area. The CS register stores the starting address of the code segment
• Stack: this segment contains data values passed to functions and procedures within the program. The SS (stack segment) register stores the starting address of the stack
Addressing Modes
There are three basic addressing modes
• Register addressing: one or both operands are registers
• Immediate addressing: the operand is a constant value or expression
• Memory addressing: the operand is in memory, and its address may be given by
  • Direct memory addressing
  • Indirect memory addressing
  • Offset addressing
Carnegie Mellon
Moving Data: IA32
Integer registers: %eax, %ecx, %edx, %ebx, %esi, %edi, %esp, %ebp
• Moving data (data transfer operations): movl Source, Dest
• Operand types
  • Immediate: constant integer data
    • Example: $0x400
    • Like a C constant, but prefixed with '$'
    • Encoded with 1, 2, or 4 bytes
  • Register: one of the 8 integer registers
    • Example: %eax, %edx
    • But %esp and %ebp are reserved for special use
    • Others have special uses for particular instructions
  • Memory: 4 consecutive bytes of memory at the address given by a register
    • Simplest example: (%eax)
    • Various other "address modes"
movl Operand Combinations
Assumption: %eax <- temp1; %edx <- temp2; %ecx <- p;

  Source  Dest  Src, Dest            C Analog
  Imm     Reg   movl $0x4,%eax       temp1 = 0x4;
  Imm     Mem   movl $-147,(%ecx)    *p = -147;
  Reg     Reg   movl %eax,%edx       temp2 = temp1;
  Reg     Mem   movl %eax,(%ecx)     *p = temp1;
  Mem     Reg   movl (%ecx),%eax     temp1 = *p;

• Cannot do a memory-memory transfer with a single instruction
• How to do a memory-memory transfer? Load the value into a register first, then store the register to the destination
Simple Memory Addressing Modes
• Normal (R): Mem[Reg[R]]
  • Register R specifies the memory address
  • movl (%ecx),%eax
• Displacement D(R): Mem[Reg[R]+D]
  • Register R specifies the start of the memory region
  • Constant displacement D specifies the offset
  • movl 8(%ebp),%edx, i.e. %edx = Mem[%ebp + 8]
Memory Operands
• Addressing memory
  • A byte (8 bits) is the smallest addressable unit
  • 32-bit addresses (extended to 64 bits in 64-bit assembly)
  • IA32 is little endian
• Examples
    movb $0x4a, %al    # stores 0x4a in one byte
    movw $5, %ax       # stores 5 in two bytes
    movl $7, %eax      # stores 7 in four bytes
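Little endian means that the least significant byte of a multi-byte value sits at the lowest address. A minimal C sketch of that byte order follows; the helper names `store_le32` and `load_le32` are hypothetical and only illustrate the layout, they are not part of the course material.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian order: the least significant
   byte goes to dest[0], the most significant byte to dest[3]. */
void store_le32(uint8_t *dest, uint32_t value) {
    assert(dest != 0);
    for (int i = 0; i < 4; i++) {
        dest[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Rebuild the 32-bit value from four little-endian bytes. */
uint32_t load_le32(const uint8_t *src) {
    assert(src != 0);
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        value |= (uint32_t)src[i] << (8 * i);
    }
    return value;
}
```

Storing 0x12345678 this way places 0x78 in the first byte and 0x12 in the last.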
Examples
• Given the register and memory contents, fill in the following table showing the operand values

  Operand            Value   Comment
  %eax               0x100   register contents
  0x104              0xAB    data at absolute address 0x104
  $0x108             0x108   immediate value
  (%eax)             0xFF    data at address 0x100
  4(%eax)            0xAB    data at address 0x104
  9(%eax,%edx)       0x11    data at location %eax + %edx + 9
  260(%ecx,%edx)     0x13    data at location %ecx + %edx + 260
  (%eax,%edx,4)      0x11    data at location %eax + %edx * 4
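The table entries can be checked with a small C model of the general form D(Rb,Ri,S), whose address is D + Rb + Ri*S. The machine state here is an assumption: the text does not list the register and memory contents, so the sketch uses %eax = 0x100, %ecx = 0x1, %edx = 0x3 and four words at 0x100 to 0x10C, which are merely consistent with the answers in the table. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED memory contents (not given in the text): one 32-bit word at
   each of the addresses 0x100, 0x104, 0x108 and 0x10C. */
#define MEM_BASE  0x100u
#define MEM_WORDS 4u
static const uint32_t memory_words[MEM_WORDS] = { 0xFF, 0xAB, 0x13, 0x11 };

/* Read the word stored at an address inside the modelled region. */
uint32_t load32(uint32_t address) {
    assert(address >= MEM_BASE && address < MEM_BASE + 4u * MEM_WORDS);
    assert(address % 4u == 0);
    return memory_words[(address - MEM_BASE) / 4u];
}

/* Address named by the operand D(Rb,Ri,S): D + Rb + Ri * S. */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return disp + base + index * scale;
}
```

With the assumed values, `load32(effective_address(9, 0x100, 0x3, 1))` models the operand 9(%eax,%edx), and `load32(effective_address(0, 0x100, 0x3, 4))` models (%eax,%edx,4).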
Space allocation for data initialization
A variable may be defined and given an initial value. The directive determines how much space is allocated:
• DB: Define Byte (1 byte)
• DW: Define Word (2 bytes)
• DD: Define Double Word (4 bytes)
• DQ: Define Quad Word (8 bytes)
• DT: Define Ten Bytes (10 bytes)
Opcodes/Commands
• Some common assembly commands are:
leal
• Load effective address: a variant of movl
• It has the form of an instruction that reads from memory to a register, but it does not reference memory at all
• If %edx contains the value x, then leal 7(%edx,%edx,4), %eax sets %eax to 5x + 7
• The destination operand must be a register
LEA is the only instruction that performs memory addressing calculations but doesn't actually address memory. LEA accepts a standard memory addressing operand, but does nothing more than store the calculated memory offset in the specified register, which may be any general purpose register.
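The arithmetic that leal performs can be written out in C. This is a sketch of the address calculation only; the function names `lea` and `five_x_plus_seven` are hypothetical, and unsigned 32-bit arithmetic stands in for the register contents.

```c
#include <assert.h>
#include <stdint.h>

/* Value produced by leal D(Rb,Ri,S), Rd: the address D + Rb + Ri * S is
   computed and returned, and no memory is read. */
uint32_t lea(uint32_t disp, uint32_t base, uint32_t index, uint32_t scale) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return disp + base + index * scale;
}

/* Models leal 7(%edx,%edx,4), %eax with %edx holding x:
   x + 4*x + 7 = 5x + 7. */
uint32_t five_x_plus_seven(uint32_t x) {
    return lea(7, x, x, 4);
}
```

For x = 3 the result is 22, which is why compilers often use leal for small multiplications and additions that have nothing to do with addresses.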
Group II
• These are all unary operations
• inc
• dec
• neg
• not
NEG: negate (two's complement, i.e., multiply by −1).
NOT: the bitwise NOT, or complement, is a unary operation that performs logical negation on each bit, forming the ones' complement of the given binary value. Bits that are 0 become 1, and those that are 1 become 0.
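The relation between the two operations is that the two's complement is the ones' complement plus one. A short C sketch on 32-bit values follows; the function names are hypothetical, and unsigned arithmetic is used so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Like notl: flip every bit (ones' complement). */
uint32_t not32(uint32_t x) {
    return ~x;
}

/* Like negl: two's complement, computed as the ones' complement plus 1. */
uint32_t neg32(uint32_t x) {
    uint32_t result = not32(x) + 1u;
    /* Sanity check: x + (-x) wraps around to 0 in 32-bit arithmetic. */
    assert((uint32_t)(x + result) == 0);
    return result;
}
```

For example, negating 1 gives 0xFFFFFFFF, the 32-bit pattern for −1.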
Group III
• add
• sub
• imul (signed multiply)
• mul (unsigned multiply)
• xor
• or
• and
Shifts
• shl
• shr
• sar
• sal
    movw $0xff00,%ax  # ax=1111.1111.0000.0000 (0xff00, unsigned 65280, signed -256)
    shrw $3,%ax       # ax=0001.1111.1110.0000 (0x1fe0, signed and unsigned 8160)
                      # (logical shifting of unsigned numbers right by 3
                      # is like integer division by 8)
    shlw $1,%ax       # ax=0011.1111.1100.0000 (0x3fc0, signed and unsigned 16320)
                      # (logical shifting of unsigned numbers left by 1
                      # is like multiplication by 2)
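The difference between the logical right shift (shr) and the arithmetic right shift (sar) can be sketched in C on 16-bit values. The function names are hypothetical. The arithmetic shift is built by hand from unsigned operations, because right-shifting a negative signed value in C is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Like shlw (and salw): shift left, filling with zeros on the right. */
uint16_t shl16(uint16_t v, unsigned n) {
    assert(n < 16);
    return (uint16_t)((uint32_t)v << n);
}

/* Like shrw: logical shift right, filling with zeros on the left. */
uint16_t shr16(uint16_t v, unsigned n) {
    assert(n < 16);
    return (uint16_t)(v >> n);
}

/* Like sarw: arithmetic shift right, filling with copies of the sign bit. */
uint16_t sar16(uint16_t v, unsigned n) {
    assert(n < 16);
    uint16_t result = (uint16_t)(v >> n);
    if (v & 0x8000u) {
        /* Set the n vacated high bits to 1 to preserve the sign. */
        result |= (uint16_t)~(0xFFFFu >> n);
    }
    return result;
}
```

Starting from 0xff00, `shr16` by 3 gives 0x1fe0 (8160), while `sar16` by 3 gives 0xffe0, which is −32 when read as a signed value, matching −256 divided by 8.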
Exercise
• Let us review a book example
To calculate z*48, the work is divided into two statements: first compute z = z*3, then shift the result left by 4 bits (a multiplication by 16), which gives z*48.
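The same decomposition can be written in C. This is a sketch, assuming an unsigned 32-bit z so that overflow wraps around; the function name `times48` is hypothetical, and the instructions named in the comments are one plausible pairing rather than a quotation from the book.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by 48 without a multiply instruction: 48 = 3 * 16. */
uint32_t times48(uint32_t z) {
    uint32_t tripled = z + z * 2;    /* z*3, e.g. leal (%eax,%eax,2), %eax */
    uint32_t result = tripled << 4;  /* times 16, e.g. sall $4, %eax */
    assert(result == z * 48u);       /* both forms agree modulo 2^32 */
    return result;
}
```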
Linux System Calls
• Linux system calls are the interface between user space and kernel space
• In order to take input, produce output or exit from the program, it is required to call the OS services
• The system call number is stored in the register eax and then the kernel is called
• The common system calls are:
  • sys_exit: number 1
  • sys_read: number 3
  • sys_write: number 4
• To call the kernel, a software interrupt is raised:
  • int $0x80 (in Linux) and int 21h (DOS)
• The result is usually returned in eax
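The same number-plus-arguments convention is visible from C through the `syscall()` wrapper declared in `<unistd.h>` on Linux. A minimal sketch, assuming a Linux system with glibc; the helper name `write_stdout` is hypothetical. The constant `SYS_write` from `<sys/syscall.h>` supplies the correct number for the architecture being compiled for (4 on 32-bit x86, as listed above).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke the write system call directly. The arguments play the roles of
   the registers in the assembly version: the call number (eax), the file
   descriptor (ebx), the buffer address (ecx) and the byte count (edx). */
long write_stdout(const char *text, size_t length) {
    assert(text != NULL);
    return syscall(SYS_write, 1, text, length);
}
```

The return value is the number of bytes written, or −1 on error, which corresponds to the result the kernel leaves in eax.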
Text section
• The text section is where the code is written. We may have procedures in this section too
• The entry point of the program must be defined in this section
    .text
    .globl _start
_start:            # the default entry point of the program
    # all the statements are written here
Role of eax register
• It is the accumulator register
• Many calculations use the accumulator, and some instructions (such as the one-operand forms of mul and div) use it implicitly
• For a system call, the call number is placed in eax and the result is returned in eax
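Role of ebx register
• It does not have a dedicated role in most instructions
• It is commonly used as a base pointer for memory access
• It is also used to store an extra pointer or an intermediate calculation step
• In the Linux system calls shown here it holds the first argument (the file descriptor or the exit status)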
Role of ecx register
• This is the count register (for loops etc.)
• The counting instructions use this register
• The register counts downwards rather than up
• In the sys_read and sys_write calls it holds the address of the data buffer
Role of edx register
• This is the data register
• In the sys_read and sys_write calls it holds the size of the data in bytes
Data declarations
• An example of the data segment is:
    .data
msg:
    .asciz "Hello, world!\n"
    len = . - msg
msg2:
    .asciz "this is the first program \n"
    len1 = . - msg2
• .ascii expects zero or more string literals separated by commas. It assembles each string (with no automatic trailing zero byte) into consecutive addresses
• .asciz is just like .ascii, but each string is followed by a zero byte. The "z" in .asciz stands for "zero"
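One consequence of using .asciz is that `len = . - msg` also counts the trailing zero byte. C string literals behave like .asciz, so the difference can be sketched in C; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A C string literal carries a trailing zero byte, like .asciz. */
static const char msg[] = "Hello, world!\n";

/* Size of the whole array, zero byte included: the analog of
   "len = . - msg" placed after an .asciz string. */
size_t length_with_terminator(void) {
    return sizeof msg;
}

/* Number of characters before the zero byte: the analog of the same
   expression placed after an .ascii string. */
size_t length_without_terminator(void) {
    size_t length = strlen(msg);
    assert(length + 1 == sizeof msg);
    return length;
}
```

Here the two lengths are 15 and 14, so a write of `len` bytes after .asciz also sends the zero byte to the output.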
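Text section
    .globl _start
    .text
_start:
    movl $len, %edx    # number of bytes to write
    movl $msg, %ecx    # address of the message
    movl $1, %ebx      # file descriptor 1 (standard output)
    movl $4, %eax      # system call number 4 (sys_write)
    int $0x80          # call the kernel
    movl $0, %ebx      # exit status 0
    movl $1, %eax      # system call number 1 (sys_exit)
    int $0x80          # call the kernel
• .globl makes the symbol visible to ld. Both spellings (.globl and .global) are accepted, for compatibility with other assemblers. .xdef is also accepted as a synonym for .global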
Bss section
    .section .bss
    .lcomm input1, 1
    .lcomm input2, 1
    .lcomm ans, 1
• .lcomm symbol, length: reserves length (an absolute expression) bytes for a local common denoted by symbol. The section and value of symbol are those of the new local common. The addresses are allocated in the bss section, so that at run time the bytes start off zeroed
Example 1
    .globl _start
    .text
_start:
    movl $len, %edx    # write the first message
    movl $msg, %ecx
    movl $1, %ebx
    movl $4, %eax
    int $0x80
    movl $len1, %edx   # write the second message
    movl $msg2, %ecx
    movl $1, %ebx
    movl $4, %eax
    int $0x80
    movl $0, %ebx      # exit with status 0
    movl $1, %eax
    int $0x80
    .data
msg:
    .asciz "Hello, world!\n"
    len = . - msg
msg2:
    .asciz "this is the first program \n"
    len1 = . - msg2

[msharma@cs1 msharma]$ as first.as
[msharma@cs1 msharma]$ ld -o demo a.out
[msharma@cs1 msharma]$ ./demo
Hello, world!
this is the first program
Example 2
    .section .data
prompt_str1:
    .ascii "Enter first number: "
str1_end:
    .set STR1_SIZE, str1_end-prompt_str1
prompt_str2:
    .ascii "\nThe number entered is : "
str2_end:
    .set STR2_SIZE, str2_end-prompt_str2
    .section .bss
    .lcomm input1, 2             # 2 bytes: the digit and the newline read below
    .section .text
    .globl _start
_start:
    movl $4, %eax                # sys_write: print the first prompt
    movl $1, %ebx
    movl $prompt_str1, %ecx
    movl $STR1_SIZE, %edx
    int $0x80
    movl $3, %eax                # sys_read: read 2 bytes from standard input
    movl $0, %ebx
    movl $input1, %ecx
    movl $2, %edx
    int $0x80
    movl $4, %eax                # sys_write: print the second prompt
    movl $1, %ebx
    movl $prompt_str2, %ecx
    movl $STR2_SIZE, %edx
    int $0x80
    movl $4, %eax                # sys_write: echo the input
    movl $1, %ebx
    movl $input1, %ecx
    movl $2, %edx
    int $0x80
exit:
    movl $1, %eax                # sys_exit with status 0
    movl $0, %ebx
    int $0x80