230 likes | 243 Views
Learn about bit, nibble, byte, word, memory addressing, registers, assembly language programming, instruction set, operand types, arithmetic, and logical instructions.
E N D
Computer Architecture and System Programming Laboratory TA Session 1
Data Representation Basics • bit–basic information unit: (1/0) • nibble – sequence of 4 bits: • byte – sequence of 8 bits: • word– sequence of 2 (or 4) bytes 32 bit word 16 bit word main memory • address space:0 to 264-1= 0xFFFFFFFFFFFFFFFF • 264 bytes = 24∙260 bytes • = 24∙230 Gb = 16 exabytes address space physical memory
Registers file - CPU unit which contains registers general purpose registers r8 – r15 r8d r8w r8b Extended High Low flag register RFLAGS Instruction Pointer registerRIP - contains address of the next instruction that is going to be executed (at run time)- changed by unconditional jump, conditional jump, procedure call, and return instructions Note that the list of registers above is partial. The full list can be found here.
Assembly Language Program • consists of a series of processor instructions, assembler directives, comments, and data • translated by assembler into machine language instructions (binary code) that can be loaded into memory and executed • NASM - Netwide Assembler - is an assembler and for x86 architecture • Example: • assembly code: • MOVAL, 0x61 ; load AL with 97 decimal (61 hex) • machine code (in binary): • 1011000001100001 1011 a binary opcode of instruction 'MOV' 0 specifies if data is byte (‘0’) or full size 16/32 bits (‘1’) 000 a binary identifier for a register 'AL' 01100001 a binary representation of 97 decimal (97d = (int)(97/16)*10 + (97%16 converted to hex digit) = 61h)
Basic structure of an assembly instruction label:(pseudo) opcodeoperands; comment either required or forbidden by instruction optional fields • each instruction has its address in memory • we mark an instruction with a label to refer it in the code • (non-local) labels have to be unique • an instruction that follows a label can be at the same / next line • colon is optional RAM Examples: movax, 2; moves constant 2 to the register axbuffer: resb64; reserves 64 bytes 64 bytes 2048 mov buffer, 2 mov 2048, 2 mov [buffer], 2 mov [2048], 2 mov byte [buffer], 2 mov byte [2048], 2 starting from buffer address in memory to one byte numeric value 2 move - backslash (\) : if a line ends with backslash, the next line is considered to be a part of the backslash-ended line- no restrictions on white space within a line
Instruction Arguments A typical instruction has 2 operands- target operand (left)- source operand (right) 3 kinds of operands exists- immediate : value- register : AX,EBP,DL etc.- memory location : variable or pointer Examples: mov ax, 2 mov [buffer], ax target operand source operand target operand source operand register immediate memory location register mov [var1],[var2] ! Note that we cannot have both source and target operands specified as memory locations.
MOV - Move Instruction – copies source to target mov reg8/mem8(16,32,64),reg8/imm8(16,32,64) (copies content of register / immediate (source) to register / memory location (target)) mov reg8(16,32,64),reg8/mem8(16,32,64)(copies content of register / memory location (source) to register (target)) operands have to be of the same size Examples: mov eax, 0x2334AAFF mov [buffer], ax mov word[var], 2 reg32 imm32 reg16 mem16 imm16 mem16 Note that NASM doesn’t remember the size of variables you declare . It will deliberately remember nothing about the symbol var except where it begins, and so you must explicitly code mov word [var], 2.
Basic Arithmetical Instruction carry means the digit with promote to higher powers, whereas borrow means the digit we demote to lower powers <instruction> reg8/mem8(16,32,64),reg8/imm8(16,32,64) (source - register / immediate, - register / memory location) <instruction> reg8(16,32,64),reg8/mem8(16,32,64) (source - register / immediate, - register / memory location) ADD - add integers without carry Example:add AX, BX;(AX gets a value of AX+BX) SUB - subtract integers without carry Example:sub AX, BX ;(AX gets a value of AX-BX) ADC - add integers with carry (value of Carry Flag) Example:adc AX, BX;(AX gets a value of AX+BX+CF) SBB- subtract with borrow (value of Carry Flag) Example:sbb AX, BX;(AX gets a value of AX-BX-CF)
Basic Arithmetical Instruction <instruction> reg8/mem8(16,32,64)(source / - register / memory location) INC - increment integer Example: inc AX ;(AX gets a value of AX+1) DEC -decrement integer Example: dec byte [buffer] ;(first byte of [buffer] gets a value of first byte of [buffer] -1)
Basic Logical Instructions <instruction> reg8/mem8(16,32,64)(source / - register / memory location) NOT –one’s complement negation – inverts all the bits Example: mov al, 11111110b not al ;(AL gets a value of 00000001b);(11111110b + 00000001b = 11111111b) NEG –two’s complement negation – inverts all the bits, and adds 1 Example: mov al, 11111110b neg al ;(AL gets a value of not(11111110b)+1=00000001b+1=00000010b);(11111110b + 00000010b = 100000000b = 0)
Basic Logical Instructions <instruction> reg8/mem8(16,32,64),reg8/imm8(16,32,64) (source - register / immediate, - register / memory location) <instruction> reg8(16,32,64),reg8/mem8(16,32,64) (source - register / immediate, - register / memory location) OR –bitwise or – bit at index i of the gets ‘1’ if bit at index i of source or are ‘1’; otherwise ‘0’ Example: mov al, 11111100 b mov bl, 00000010b or AL, BL ;(AL gets a value 11111110b) AND–bitwise and – bit at index i of the gets ‘1’ if bits at index i of both source and are ‘1’; otherwise ‘0’ Example: or AL, BL ;(with same values of AL and BL as in previous example, AL gets a value 0)
CMP – Compare Instruction – compares integers CMP performs a ‘mental’ subtraction - affects the flags as if the subtraction had taken place, but does not store the result of the subtraction. cmp reg8/mem8(16,32,64),reg8/imm8(16,32,64) (source - register / immediate, - register / memory location) cmp reg8(16,32,64),reg8/mem8(16,32,64) (source - register / immediate, - register / memory location) Examples: mov al, 11111100bmov bl, 00000010bcmp al, bl ;(ZF (zero flag) gets a value 0) mov al, 11111100bmov bl, 11111100 bcmp al, bl ;(ZF (zero flag) gets a value 1)
JMP – unconditional jump jmp label JMP tells the processor that the next instruction to be executed is located at the label that is given as part of jmp instruction. Example: mov eax, 1inc_again: inc eaxjmpinc_again mov ebx, eax this is infinite loop ! this instruction is never reached from this code rip register gets inc_again label (address)
J<Condition> – conditional jump j<cond> label • execution is transferred to the target instruction only if the specified condition is satisfied • usually, the condition being tested is the result of the last arithmetic or logic operation mov eax, 1inc_again: inc eax cmp eax, 10jneinc_again ; if eax ! = 10, go back to loop Example: mov eax, 1inc_again: inc eax cmp eax, 10jeend_of_loop ; if eax = = 10, jump to end_of_loopjmpinc_again ; go back to loop end_of_loop:
Note that the list above is partial. The full list can be found here.
d<size> – declare initialized data d<size> initial value Examples:var:db 0x55 ; define a variable var of size byte, initialized by 0x55 ; three bytes in succession var: db 0x55,0x56,0x57 var: db 'a‘ ; character constant 0x61 (ascii code of ‘a’) ; string constant var: db 'hello’,10 var: dw 0x1234 ; 0x34 0x12 var: dw ‘A' ; 0x41 0x00 – complete to word var: dw ‘ABC' ; 0x41 0x42 0x43 0x00 – complete to word var: dd 0x12345678 ; 0x78 0x56 0x34 0x12
Assignment 0 You get a simple program that receives a string from the user. Than, it calls to a function (that you’ll implement in assembly) that receives one string as an argument and should do the following: • Convert lower case to upper case. • Convert ‘(’ into ‘<’. • Convert ‘)’ into ‘>’. • Count the number of the non-letter characters, that is ANY character that is not 'a'->'z' or 'A'->'Z‘, including‘\n’ character The function shall return the number of the letter characters in the string. The characters conversion should be in-place. 42: heLLo() WorLd! 42: HELLO<> WORLD! 9 Example:
main.c #include <stdio.h> # define MAX_LEN 100 /* Maximal line size */ extern int do_Str (char*); int main(void) { char str_buf[MAX_LEN]; int counter = 0; fgets(str_buf, MAX_LEN, stdin); /* Read user's command line string */ counter = do_Str (str_buf); /* Your assembly code function */ printf("%s%d\n",str_buf,counter); return 0; }
myasm.s section .data ; data section, read-write an: DQ 0 ; this is a temporary var section .text ; our code is always in the .text section global do_Str; makes the function appear in global scope extern printf ; tell linker that printf is defined elsewhere ; (not used in the program) do_Str: ; functions are defined as labels push rbp; save Base Pointer (bp) original value movrbp, rsp; set base pointer to point on function frame movrcx, rdi; get function argument ;;;;;;;;;;;;;;;; FUNCTION EFFECTIVE CODE STARTS HERE ;;;;;;;;;;;;;;;; movaword [an], 0 ; initialize answer label_here: ; Your code goes somewhere around here... inc rcx ; increment pointer cmp byte [rcx], 0 ; check if byte pointed to is zero jnz label_here ; keep looping until it is null terminated ;;;;;;;;;;;;;;;; FUNCTION EFFECTIVE CODE ENDS HERE ;;;;;;;;;;;;;;;; movrax,[an] ; return an (returned values are in rax) movrsp, rbp pop rbp ret
suppose that register RCX contains address of (first character of) input string str_buf • suppose an is a name of counter that counts non letters characters of str_buf • an is defined in the second line of the assembly code below • an is of type DQ, this means Define Quadword, i.e. 64-bit sized variable • using RCX, get to first character of str_buf, and change it according to the assignment requirements • if this character is not letter, increment the value of an Example: str_buf and an at the beginning of do_Strfunction execution: RAM 8 bytes of quadword variable an an ‘\n’ ASSCII code input string str_buf RCX
Running NASM To assemble a file, you issue a command of the form > nasm -f <format> <filename> [-o <output>] [ -l listing] Example: > nasm -f elf64 myasm.s -o myelf.o It would create myelf.o file that has elf format (executable and linkable format).We use main.c file (that is written in C language) to start our program, and sometimes also for input / output from a user. So to compile main.c with our assembly file we should execute the following command: gcc main.c myelf.o -o myexe.out If we do not use main.c, we compile with gcc as follows: gcc myelf.o -o myexe.out It would create executable file myexe.out.In order to run it you should write its name on the command line: > myexe.out
How to run Linux from Window • Go to http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html • Run the following executable • Use “lvs.cs.bgu.ac.il” or “lace.cs.bgu.ac.il” host nameand click ‘Open’ • Use your Linux username and password to login lace server • Go to http://in.bgu.ac.il/en/natural_science/cs/LocalGuides/Pages/ Computer-Labs-building-34-and-95.aspx • Choose any free Linux computer • Connect to the chosen computer by using “ssh –X cs302row2-2” (maybe you would be asked for your password again) • cd (change directory) to your working directory