EEL 3801 Part III Assembly Language Programming EEL 3801C
Assembly Language Programming • The basic element of an assembly program is the statement. • Two types of statements: • Instructions: executable statements that actually do something. • Directives: statements that provide information to assist the assembler in producing executable code, for example by creating storage for a variable and initializing it.
Assembly Programming (cont’d) • Assembly language instructions equate one-to-one to machine-level instructions, except they use a mnemonic to assist the memory. • Program control: Control the flow of the program and what instructions to execute next (jump, goto). • Data transfer: Transfer data to a register or to main memory. • Arithmetic: add, subtract, multiply, divide, etc. EEL 3801C
Assembly Programming (cont’d) • Logical: > < = etc. • Input-output: read, print etc. EEL 3801C
Statements • A statement is composed of a name, a mnemonic, operands and an optional comment. Their general format is as follows: [name] [mnemonic] [operand(s)] [;comments] EEL 3801C
Names, labels • Name: Identifies a label for a statement or for a variable or constant. • Can contain one or more of several characters (see page 56 of new textbook). • Only first 31 characters are recognized • Case insensitive. • First character may not be a digit. EEL 3801C
Names, labels • The period "." may only be used as the first character. • Reserved names cannot be used. • A name can identify a variable when it is placed in front of a memory allocation directive. It can also be used to define a constant. For example:
    count1 db 50      ; a variable (memory allocation directive db)
    count2 equ 100    ; a constant
Names, labels • Can be used as labels for statements to act as place markers to indicate where to jump to. A label can also stand alone on an otherwise blank line. For example:
    label1: mov ax,10
            mov bx,0
            . . .
    label2: jmp label1    ; jump to label1
Mnemonics • Mnemonic: identifies an instruction or directive. These were described above. • The mnemonics are standard keywords of the assembly language for a particular processor. • You will become familiar with them as time goes on this semester. EEL 3801C
Operands • Operands are the pieces of information that tell the CPU what to act on. An operand may be a register, a variable, a memory location or an immediate value:
    10        immediate value
    count     variable
    AX        register
    [0200]    memory location
• Comments: Any text can be written after the statement as long as it is preceded by the ";".
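As a rough illustration of the statement format and the four operand kinds, the sketch below uses the same MASM-style syntax as the other examples in these slides. The names count and start are made up for the example, and the ds: prefix on the memory-location line is an assumption about MASM, which otherwise tends to treat a bare bracketed number as an immediate value.

```asm
; Sketch only: "count" and "start" are hypothetical names.
dosseg
.model small
.stack 100h

.data
count   dw 7                ; name = count, directive = dw, operand = 7

.code
main proc
        mov ax,@data        ; set up DS so that variables can be reached
        mov ds,ax

start:  mov ax,10           ; name = start, mnemonic = mov; 10 is an immediate value
        mov bx,ax           ; AX is a register operand
        mov cx,count        ; count is a variable operand
        mov dx,ds:[0200h]   ; memory location at offset 0200h (ds: prefix assumed)

        mov ax,4C00h        ; return to DOS
        int 21h
main endp
end main
```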
Elements of Assembly Language for the 8086 Processor • Assembler Character Set These are used to form the names, mnemonics, operands, variables, constants, numbers etc. which are legal in 8086 assembly. • Constant: A value that is either known or calculated at assembly time. May be a number or a string of characters. Cannot be changed at run time. EEL 3801C
Elements of Assembly Language… (cont.) • Variable: A storage location that is referenced by name. A directive must be used to associate the variable name with its location in memory. • Integers: Numeric digits with no decimal point, followed by the radix suffix mentioned before (e.g., d, h, o, or b). Can be signed or unsigned.
Elements of Assembly Language… (cont.) • Real numbers: a floating-point number made up of digits, a decimal point, an optional exponent, and an optional leading sign:
    [sign] digits.digits [E [sign] digits]
where sign is + or -.
Elements of Assembly Language… (cont.) • Characters and strings: • A character is one byte long. • Can be mapped into the binary code equivalent through the ASCII table, and vice-versa. • May be enclosed within single or double quotation marks. • Length of string determined by number of characters in string, each of which is 1 byte. For example: EEL 3801C
Elements of Assembly Language… (cont.)
    "a"               - 1 byte long
    'b'               - 1 byte long
    "stack overflow"  - 14 bytes long
    'abc#?%%A'        - 8 bytes long
Example of Simple Assembly Program • The following simple program will be demonstrated and explained:
    mov ax,5h       ; move 5h into the AX register
    add ax,10h      ; add 10h to the AX register
    add ax,20h      ; add 20h to the AX register
    mov sum,ax      ; store value of AX in variable
    ; ending the program
    mov ax,4C00h
    int 21h
Example (cont.) • The result is that at the end of the program, the variable sum, which exists somewhere in memory (declaration not shown), holds the value accumulated in AX, namely, 35h. • Explain program.
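The slide leaves out the declaration of sum. A minimal complete version might look like the sketch below. The dw ? declaration and the segment setup (borrowed from the Hello World program later in these slides) are assumptions, not part of the original example.

```asm
; Sketch only: the declaration of "sum" is assumed, the slide does not show it.
dosseg
.model small
.stack 100h

.data
sum     dw ?            ; one uninitialized 16-bit word

.code
main proc
        mov ax,@data    ; make DS point at the data segment
        mov ds,ax

        mov ax,5h       ; AX = 5h
        add ax,10h      ; AX = 15h
        add ax,20h      ; AX = 35h
        mov sum,ax      ; sum = 35h

        mov ax,4C00h    ; ending the program
        int 21h
main endp
end main
```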
More Complex Example (cont.)
     1: title Hello World Program (hello.asm)
     2:
     3: ; this program displays "Hello, World"
     4:
     5: dosseg
     6: .model small
     7: .stack 100h
     8:
     9: .data
    10: hello_message db 'Hello, World!',0dh,0ah,'$'
    11:
More Complex Example (cont.)
    12: .code
    13: main proc
    14:     mov ax,@data
    15:     mov ds,ax
    16:
    17:     mov ah,9
    18:     mov dx,offset hello_message
    19:     int 21h
    20:
    21:     mov ax,4C00h
    22:     int 21h
    23: main endp
    24: end main
More Complex Example (cont.) The program is explained as follows: • Line 1: Contains the title directive. All characters located after the title directive are considered as comments, even though the ; symbol is not used. • Line 3: Comment line with the ; symbol • Line 5: The dosseg directive specifies a standard segment order for the code, data and stack segments. The code segment is where the program instructions are stored. The data segment is where the data (variables) are stored. The stack segment is where the stack is maintained. EEL 3801C
More Complex Example (cont.) • Line 6: The .model directive indicates the memory architecture to be used. In this case, it uses the Microsoft small memory architecture. It indicates this by the word small after .model. • Line 7: This directive sets aside 100h bytes of memory for the stack. This is equivalent to 256 bytes of memory (16^2 = 256). • Line 9: The .data directive marks the beginning of the data segment, where the variables are defined and memory allocated to them.
More Complex Example (cont.) • Line 10: The db directive stands for “define byte”, which tells the assembler to allocate a sequence of memory bytes to the data that follow. 0dh is a carriage return and 0ah is the linefeed symbol. The $ is the required terminator character. The number of memory bytes is determined by the data themselves. Hello_message is the name of the variable to be stored in memory, and the db allocates memory to it in the size defined by the data following it. • Line 12: The directive .code is the indication of the beginning of the code segment. The next few lines represent instructions. EEL 3801C
More Complex Example (cont.) • Line 13: The proc directive is used to declare the main procedure called “main”. Any name could have been used, but this is in keeping with the C/C++ programming requirement that the main procedure be called the main function. The first executable instruction following this directive is called the program entry point - the point at which the program begins to execute. EEL 3801C
More Complex Example (cont.) • Line 14: The instruction mov is used to copy the contents of one operand into another. The first operand is called the destination, while the second one is called the source. In this particular use, we tell the assembler to copy the address of the data segment (@data) into the AX register. • Line 15: Copies the content of AX into DS, the register that holds the address of the data segment, which is the default base location for variables.
More Complex Example (cont.) • Line 17: This instruction places the value 9 in the AH register. Remember that from the page.com program, this is the register used to store the number of the DOS function to be called with the int 21h instruction. • Line 18: This instruction places the address of the string to be displayed in the DX register. Remember that this is only the offset; the segment has already been placed in the DS register. Since the variable hello_message is located in the data segment identified by DS, we only need to supply the offset for the variable.
More Complex Example (cont.) • Line 19: The instruction int 21h, as we saw before, takes the number of the function from the AH register, which in this case is 9. DOS function 9, incidentally, sends a string to the display (CRT) output device; the DX register contains the address of the string to be sent. • Lines 21 and 22: These instructions represent the equivalent of an end or stop statement. This is different from that done for page.com because this will be an executable program (.exe), rather than a .com program. More on this later. • Line 23: Indicates the end of the main procedure.
More Complex Example (cont.) • Line 24: The END directive – the last line to be assembled. The main next to it indicates the program entry point. EEL 3801C
More Complex Example (cont.) • The program may seem overly complicated for such a simple task. • But remember that assembly language corresponds one-to-one with machine language instructions. • Note that it takes only 562 bytes of memory when assembled and linked.
More Complex Example (cont.) • The same program written in a high level language will require several more machine level instructions to carry out the same thing. • Written in Turbo C++, the executable program would take 8772 bytes of memory to store. EEL 3801C
Specifics of ALP – Data Definition Directives • A variable is a symbolic name for a location in memory. This is done because it is easy to remember variables, but not memory locations. It is like an aka, or a pseudonym. EEL 3801C
Data Definition Directives • Variables are identified by labels. A label shows the starting location of a variable’s memory location. A variable’s offset is the distance from the beginning of the data segment to the beginning of the variable. EEL 3801C
Data Definition Directives (cont.) • A label does not indicate the length of the memory that the variable takes up. • If a string is being defined, the label offset is the address of the first byte of the string (the first element of the string). • The second element is the offset + 1 byte. • The third element is offset +2 bytes. EEL 3801C
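The offsets of the string elements can be sketched as follows. The variable name letters is hypothetical, and the register-indirect form [bx] is used here only to make the "offset + n" arithmetic visible.

```asm
; Sketch only: "letters" is a hypothetical variable name.
dosseg
.model small
.stack 100h

.data
letters db 'ABC'                ; three bytes at consecutive offsets

.code
main proc
        mov ax,@data
        mov ds,ax

        mov bx,offset letters   ; BX = offset of the first byte of the string
        mov al,[bx]             ; AL = 'A' (offset + 0)
        mov al,[bx+1]           ; AL = 'B' (offset + 1 byte)
        mov al,[bx+2]           ; AL = 'C' (offset + 2 bytes)

        mov ax,4C00h            ; return to DOS
        int 21h
main endp
end main
```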
Data Definition Directives (cont.) • The amount of memory to be allocated is determined by the directive itself.
    DB        Define byte
    DW        Define word
    DD        Define doubleword
    DF, DP    Define far pointer
    DQ        Define quadword
    DT        Define tenbytes
Define Byte • Allocates storage for one or more 8-bit values (bytes). Has the following format: [name] DB initialvalue [,initialvalue] • The name is the name of the variable. Notice that it is optional. EEL 3801C
Define Byte (cont) • initialvalue can be one or more 8-bit numeric values, a string constant, a constant expression or a question mark. • If signed, it has a range of -128 to 127. If unsigned, it has a range of 0 to 255.
Define Byte - Example
    char      db 'A'     ; ASCII character
    signed1   db -128    ; min signed value
    signed2   db 127     ; max signed value
    unsigned1 db 0       ; min unsigned value
    unsigned2 db 255     ; max unsigned value
Define Byte (cont) • Multiple values: A sequence of 8-bit numbers can be used as initialvalue. They are then grouped together under a common label, as in a list. • The values must be separated by commas. list db 10,20,30,40 EEL 3801C
Define Byte (cont) • The 10 would be stored at offset 0000; • 20 at offset 0001; • 30 at offset 0002; and • 40 at offset 0003, • where each step in the offset represents a single byte.
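A small sketch of how those offsets are used when reading the list back. It assumes list is the first variable in the data segment, so that its offset is 0000, and it reuses the program skeleton of the Hello World example.

```asm
; Sketch only: assumes "list" sits at the start of the data segment.
dosseg
.model small
.stack 100h

.data
list    db 10,20,30,40

.code
main proc
        mov ax,@data
        mov ds,ax

        mov al,list         ; AL = 10 (offset 0000)
        mov al,list+1       ; AL = 20 (offset 0001)
        mov al,list+2       ; AL = 30 (offset 0002)
        mov al,list+3       ; AL = 40 (offset 0003)

        mov ax,4C00h        ; return to DOS
        int 21h
main endp
end main
```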
Define Byte (cont) • A variable’s value may be left undefined. This can be done by placing a ‘?’ for each byte to be allocated (as in a list). count db ? EEL 3801C
Define Byte (cont) • A string may be assigned to a variable, each of whose elements will be allocated a byte.
    c_string db 'This is a long string'
• The assembler can determine the length of a string automatically with the '$' symbol (the current location counter): on the line right after the string, $ minus the string's starting offset gives its length. • See page 65 of new book for details.
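One common way to let the assembler compute a string length with the $ symbol is sketched below. The constant name c_len is hypothetical, and the textbook's own example may differ in detail.

```asm
; Sketch only: "c_len" is a hypothetical constant name.
dosseg
.model small
.stack 100h

.data
c_string db 'This is a long string'
c_len    equ $ - c_string       ; current location minus start of string = 21

.code
main proc
        mov ax,@data
        mov ds,ax

        mov cx,c_len            ; CX = 21, the number of bytes in c_string

        mov ax,4C00h            ; return to DOS
        int 21h
main endp
end main
```

The equ line has to come directly after the string, because $ always refers to the current position in the segment.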
Define Word • Serves to allocate memory to one or more variables that are 16 bits long. Has the following format: [name] DW initialvalue [,initialvalue] • The name is the name of the variable. Notice that it is optional. • initialvalue can be one or more 16-bit numeric values, a string constant, a constant expression or a question mark. EEL 3801C
Define Word (cont) • If signed, it has a range of 32,767 to –32,768, • If unsigned, it has a range of 0 – 65,535. EEL 3801C
Define Word (Example)
    var       dw 1,2,3     ; defines 3 words
    signed1   dw -32768    ; smallest signed value
    signed2   dw 32767     ; largest signed value
    unsigned1 dw 0         ; smallest unsigned value
    unsigned2 dw 65535     ; largest unsigned value
    var_bin   dw 1111000011110000b
    var_hex   dw 4000h
    var_mix   dw 1000h,4096,'AB',0
Reverse Storage Format • The assembler reverses the bytes in a word when storing it in memory. • The lowest byte is placed in the lowest address. • It is re-reversed when moved to a 16-bit register.
    value dw 2AB6h    ; stored in memory as: B6 2A
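The byte order can be observed by reading the word one byte at a time. The sketch below uses the byte ptr operator to override the word size of value; the surrounding program skeleton is the one from the Hello World example.

```asm
; Sketch only: reads the two bytes of "value" separately, then as a whole word.
dosseg
.model small
.stack 100h

.data
value   dw 2AB6h

.code
main proc
        mov ax,@data
        mov ds,ax

        mov al,byte ptr value       ; AL = B6h, the low byte at the lowest address
        mov ah,byte ptr value+1     ; AH = 2Ah, the high byte at the next address
        mov bx,value                ; BX = 2AB6h, the order is restored in the register

        mov ax,4C00h                ; return to DOS
        int 21h
main endp
end main
```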
Define Doubleword - DD • Same as DB and DW, except the memory allocated is now 4 bytes (2 words, 32 bits). [name] DD initialvalue [,initialvalue] • The name is the name of the variable. Notice that it is optional. • initialvalue can be one or more 32-bit numeric values (in decimal, hexadecimal or binary form), a string constant, a constant expression, or a question mark.
Multiple Values • A sequence of 32-bit numbers can be used as initialvalue. • They are then grouped together under a common label, as in a list. • The values must be separated by commas. EEL 3801C
Reverse Order Format • As in define word, the bytes in a doubleword are stored in reverse order. • For example:
    var dd 12345678h    ; stored in memory as: 78 56 34 12
Type Checking • When a variable is created, the assembler characterizes it according to its size (i.e., byte, word, doubleword, etc.). • When a variable is later referenced, the assembler checks its size and only allows values to be stored that come from a register or other memory that matches in size. • Mismatched movements of data are not allowed.
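A short sketch of what the size check means in practice. The variable names are hypothetical, and the rejected statement is left as a comment so that the program still assembles.

```asm
; Sketch only: "small_val" and "big_val" are hypothetical names.
dosseg
.model small
.stack 100h

.data
small_val db 10             ; byte-sized variable
big_val   dw 1000           ; word-sized variable

.code
main proc
        mov ax,@data
        mov ds,ax

        mov al,small_val    ; accepted: 8-bit register, 8-bit variable
        mov bx,big_val      ; accepted: 16-bit register, 16-bit variable
;       mov bx,small_val    ; rejected by the assembler: 16-bit register, 8-bit variable

        mov ax,4C00h        ; return to DOS
        int 21h
main endp
end main
```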
Data Transfer Instructions - mov • The instruction mov is called the data transfer instruction. • A very important one in assembly: much programming involves moving data around. • Operands are 8 or 16 bits long on the 8086, 80186 and 80286. • On the 80386 and beyond, operands can also be 32 bits long.
Data Transfer Instructions – mov (cont) • The format is: mov destination,source • The source and destination operands are self-explanatory. • The source can be an immediate value (a constant), a register, or a memory location (variable). It is not changed by the operation. • The destination can be a register or a memory location (variable). EEL 3801C
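The source and destination combinations described above can be sketched as follows. The variable names are hypothetical. The commented-out line reflects a general x86 restriction that is not stated on this slide: a single mov cannot have both operands in memory.

```asm
; Sketch only: "count" and "total" are hypothetical variable names.
dosseg
.model small
.stack 100h

.data
count   dw 100
total   dw ?

.code
main proc
        mov ax,@data
        mov ds,ax

        mov ax,25           ; immediate value -> register
        mov bx,ax           ; register -> register (AX is left unchanged)
        mov total,ax        ; register -> memory
        mov cx,count        ; memory -> register
        mov count,50        ; immediate value -> memory
;       mov total,count     ; not allowed: both operands are memory locations

        mov ax,4C00h        ; return to DOS
        int 21h
main endp
end main
```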