52.223 Low Level Programming
Lecturer: Duncan Smeed
Overview of IA-32 Assembly Language Programming Part 1
Program Translation Hierarchy
[Figure: the program translation hierarchy, with the Assembly Language Programming level marked]
An Assembly Language Program: Global View
• Typically, an Assembly Language Program (ALP) is divided into three sections that specify the main components of a program. In some cases these sections can be intermixed to provide better design and structure. These sections are:
  • Assembler Directives (aka Pseudo-ops)
  • Assembly Language Instructions
  • Data Storage Directives
Assembler Directives (Pseudo-ops)
• These are directives supplied by the user to the assembler for defining data and symbols, setting assembler and linking conditions, specifying output formats, etc. The directives do not produce machine code. Examples:
    DOSSEG - Specifies a standard segment order for the code, data and stack segments.
    PROC - Marks the start of a procedure. The main procedure typically contains the first executable instruction, i.e. the program entry point.
    END - Program End. This informs the assembler that the program source is finished; its optional operand names the entry point.
Assembly Language Instructions
• These are the actual IA-32 instructions that are translated into executable machine code. Examples:
    MOV [operands] ; to move data, e.g. memory to register
    ADD [operands] ; to add two data values
    AND [operands] ; to logically AND two data values
Data Storage Directives
• Also known as Data Definition Directives
• These allocate data storage locations containing initialised or uninitialised data. Examples:
    db "Good afternoon",0
    db 20 dup(0)          ; 20 bytes, all zeroed
    db 20 dup(?)          ; 20 uninitialised bytes
    dw ?,?,?,?,?          ; 5 uninitialised words
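As a rough sketch of how the three kinds of statement fit together, the small program below marks each line as a directive, an instruction or a data definition. It assumes MASM/TASM syntax and a 16-bit DOS target; the simplified segment directives (.model, .stack, .data, .code), the names count, total and main, and the DOS terminate call (INT 21h, function 4Ch) are illustrative choices that do not come from the slides.

```asm
; Sketch only: MASM/TASM syntax and a 16-bit DOS target are assumed.
        .model small            ; directive: memory model
        .stack 100h             ; directive: reserve 256 bytes of stack
        .data                   ; directive: start of the data segment
count   db 3                    ; data definition: one initialised byte
total   dw ?                    ; data definition: one uninitialised word
        .code                   ; directive: start of the code segment
main    proc                    ; directive: start of procedure main
        mov ax, @data           ; instruction: get the data segment address
        mov ds, ax              ; instruction: make the variables addressable
        mov al, count           ; instruction: memory to register
        mov ah, 0               ; instruction: clear the high byte of AX
        add ax, 10              ; instruction: AX = 3 + 10 = 13
        mov total, ax           ; instruction: register to memory
        mov ax, 4C00h           ; instruction: DOS terminate, return code 0
        int 21h                 ; instruction: call DOS
main    endp                    ; directive: end of procedure main
        end main                ; directive: end of source, entry point is main
```

Only the lines marked "instruction" become machine code; the data definitions reserve storage, and the remaining directives just steer the assembler and linker.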
Format of Assembly Language Statements
• In general an assembly language (AL) statement can contain up to four fields, namely:
    [name][mnemonic][operand(s)][comment]
• name identifies a label, variable, constant (symbol) or keyword.
• mnemonic identifies the AL instruction (opcode) or an assembler directive.
• operand(s) identifies the operand(s) for the mnemonic.
• comment signifies AL commentary/documentation.
[name]
• This field identifies a label, variable, constant or keyword.
• Label - When a name appears next to a program instruction, it is called a label. Labels serve as place markers to be used as, for example, an address reference in a jump instruction:
    jmp endif_01
• Variable - A name used before a data allocation directive identifies a location where data resides in memory. E.g.:
    Count1 db 50          ; the variable Count1
• Constant - A name used to define a constant. E.g.:
    max_col equ 80        ; the constant max_col
• Keyword - A keyword, or reserved word, has some predefined meaning to the assembler. It may be an instruction mnemonic or an assembler directive. Keywords cannot be used out of context or as identifiers:
    add mov ax,10         ; illegal use of add as label
[mnemonic]
• This field contains the mnemonic of:
  • an instruction opcode (e.g. MOV, ADD) or,
  • a pseudo-op (e.g. DB, EQU)
• To distinguish labelled statements from unlabelled ones, one of the following must hold (depending on the assembler):
  • the mnemonic field of an unlabelled statement must not start in the first column, since that is where labels start,
  • or labels must have an identifying character, often a ':' suffix, to differentiate them from other fields.
• E.g. the following code uses both types of formatting for illustration (but note most assemblers use just one style or the other):
             jmp endif_01      ; 'tabbed in' statement
    else_01: mov ax,10         ; 'suffix :' style label
    endif_01 add dx,ax         ; 'column 1' style label
[operand(s)]
• For those instructions or pseudo-ops that require operands, this field contains one or more operands, typically separated by commas (e.g. registers or addresses of data to be operated upon by the instruction in the mnemonic (op-code) field). Examples:
    ax
    'A'
    ax,100
    [200],bx
    dx,[bx]
    [bx+si],cx
    ax,[bx+si+2]
[comment]
• The remainder of the statement is the comment field.
• Some assemblers require this field to start with a special character, such as ';' or '#'.
• Comments in the program are for documentation purposes only and are ignored by the assembler.
• Such comments are absolutely vital when programming in AL since there is such a large semantic gap between the design of a program/algorithm at a high level and its implementation at such a low level.
Comment-only Statements
• The exception to the format of [name][mnemonic][operand(s)][comment] is that if a line starts with a special comment-line character then the whole line is treated as a comment:
    ; This is an example of a comment-only line. If you ever
    ; write AL programs then such comment lines should
    ; ideally outnumber code lines by a significant factor!
    ; IOW, AL is a write-only language ;-)
    ; Incidentally, the following in-line comment is almost
    ; worthless!!:
        mov ax,10         ; move the value 10 into AX
Field Separators
• In general, the fields are separated by spaces, and if the label field is NOT present it must be replaced by at least one space. To improve the appearance of the program it is wise to position the fields at particular column positions (e.g. at tab stops). For example, contrast the following two programs, one with an untidy layout and the other with a neat layout.
    ;1) Untidily laid out
    ; example program
     mov ax,[150]
      mov bx,[152]
     add ax,2
     mov [154],ax
       mov [150], bx
     int 20

    ;2) Neatly laid out
    ; example program
        mov ax,[150]      ; blah
        mov bx,[152]      ; blah blah
        add ax,2          ; wibble
        mov [154],ax      ; ...wibble
        mov [150],bx      ; blah
        int 20            ; End program
Data Definition Directives Revisited
• Variables are really just symbolic names for locations in memory where data is stored. In assembly language, (global) variables are identified by labels.
• A label does not, however, indicate how many bytes of storage are allocated to a variable. It is, in effect, the address of the first byte of a data structure.
• The following syntax diagram shows that label is optional, and only one initialvalue is required. If more are supplied, they must be separated by commas:
    [label] <directive> initialvalue [,initialvalue]
…Data Definition Directives
• Data definition directives are used to allocate storage and include the following pre-defined types:
    DB - Define Byte (8 bits)
    DW - Define Word (16 bits)
    DD - Define Doubleword (32 bits)
DB - Define Byte
• The DB directive allocates storage for one or more 8-bit values:
    [label] DB initialvalue [,initialvalue]
• Initialvalue can be one or more 8-bit values, a string constant, a constant expression (evaluated at assembly time), or a question mark (?). If the value is signed, it has the range -128 to +127; if unsigned, the range is 0 to 255. Here are a few examples:
    char  db 'A'          ; ASCII character
    min_s db -128         ; min. signed value
    max_s db +127         ; max. signed value
    min_u db 0            ; min. unsigned value
    max_u db 255          ; max. unsigned value
… DB - Define Byte
• Each value may also be expressed in a different radix. For example, the following variables all contain exactly the same value. Which radix to use is entirely up to the programmer, but it is usually chosen to reinforce the context of its use, i.e. if a value is to be treated in a 'character' context then the definition reflects that. Thus:
    char_version db 'A'           ; ASCII character
    hex_version  db 41h           ; as hexadecimal
    dec_version  db 65            ; as decimal
    bin_version  db 01000001b     ; as binary
    oct_version  db 101q          ; as octal
… DB - Define Byte
• A list of values may be grouped under a single label, with the values separated by commas. In the following example, list1 and list2 have the same contents:
    list1 db 10, 32, 41h,00100010b
    list2 db 0Ah,20h,'A',22h
• A variable's contents may be left undefined by using the question mark (?) operator. Or a numeric expression can initialise a variable with a value that is calculated at assembly time, provided the result fits the directive's size. Examples:
    count     db ?
    ages      db ?,?,?,?,?
    scrn_size dw 80*24    ; 1920 is too large for a byte, so DW is used
…DB - Define Byte
• A string may be assigned to a variable, in which case the variable (label) stands for the address of the first byte.
    C_string      db "Good morning",0
    pascal_string db 12,"Good morning"
• Long strings can be made more readable in an AL source program by continuing them over multiple lines without the necessity of supplying a label for each. The following string is terminated by an end-of-line sequence and a null byte:
    a_long_string db "This is a string "
                  db "that clearly is going to take "
                  db "several lines to store in an "
                  db "assembly language program."
                  db 0Dh,0Ah,0    ; EOL sequence + NULL
$ Operator
• The assembler can automatically calculate the length of a string by making use of the $ operator, which represents the assembler's current location counter value. In the following example, a_string_len is initialised to 16:
    a_string     db "This is a string"
    a_string_len db $-a_string
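A variation on this idea, sketched here under the assumption of MASM/TASM syntax and a 16-bit DOS target, uses EQU so that the length becomes an assembly-time constant rather than a stored byte (which could only hold lengths up to 255). The surrounding program skeleton and the names other and main are illustrative and not taken from the slides.

```asm
; Sketch only: MASM/TASM syntax and a 16-bit DOS target are assumed.
        .model small
        .stack 100h
        .data
a_string     db "This is a string"
a_string_len equ $ - a_string   ; 16; must come straight after the string
other        db "more data"     ; defined afterwards, so not counted
        .code
main    proc
        mov ax, @data
        mov ds, ax              ; make the data segment addressable
        mov cx, a_string_len    ; assembled as mov cx,16
        mov ax, 4C00h           ; DOS terminate, return code 0
        int 21h
main    endp
        end main
```

Because $ is the location counter at the point of use, placing the length definition after other would silently include those extra bytes in the result.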
DW - Define Word
• The DW directive creates storage for one or more 16-bit words. The syntax is:
    [label] DW initialvalue [,initialvalue]
• Initialvalue can be any 16-bit value from 0 to 65,535 (FFFFh) if unsigned, or from -32,768 (8000h) to +32,767 (7FFFh) if signed, a constant expression (evaluated at assembly time), or a question mark (?) to leave a variable uninitialised.
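By analogy with the DB examples, a few DW definitions might look like the sketch below. MASM/TASM syntax and a 16-bit DOS target are assumed, and all of the variable names and the program skeleton are illustrative rather than taken from the slides.

```asm
; Sketch only: MASM/TASM syntax and a 16-bit DOS target are assumed.
        .model small
        .stack 100h
        .data
min_s   dw -32768       ; min. signed value, stored as 8000h
max_s   dw +32767       ; max. signed value, 7FFFh
max_u   dw 65535        ; max. unsigned value, FFFFh
hex_w   dw 0ABCDh       ; a hex constant starting with a letter needs a leading 0
words   dw 1,2,3        ; three consecutive words (6 bytes)
area    dw 80*25        ; constant expression, evaluated at assembly time (2000)
unset   dw ?            ; uninitialised word
        .code
main    proc
        mov ax, @data
        mov ds, ax      ; make the data segment addressable
        mov ax, max_s   ; AX = 7FFFh
        mov ax, 4C00h   ; DOS terminate, return code 0
        int 21h
main    endp
        end main
```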
DW and Near Pointers
• The offset of a variable or subroutine may be stored in another variable. In the next example, the assembler sets listPtr to the offset of list. Then listPtrPtr contains the address of listPtr. Finally, aProcPtr contains the offset of a label called clear_screen.
    list       dw 256,257,258,259
    listPtr    dw list
    listPtrPtr dw listPtr
    aProcPtr   dw clear_screen
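To show how such a stored offset might be used at run time, the sketch below loads listPtr into BX and reads the list through it. It assumes MASM/TASM syntax and a 16-bit DOS target; the program skeleton and the choice of BX are illustrative, and the procedure pointer is left out to keep the example short.

```asm
; Sketch only: MASM/TASM syntax and a 16-bit DOS target are assumed.
        .model small
        .stack 100h
        .data
list    dw 256,257,258,259
listPtr dw list                 ; holds the offset of list
        .code
main    proc
        mov ax, @data
        mov ds, ax              ; make the data segment addressable
        mov bx, listPtr         ; BX = offset of list
        mov ax, [bx]            ; AX = 256, the first element
        mov ax, [bx+2]          ; AX = 257, the second element (words are 2 bytes)
        mov ax, 4C00h           ; DOS terminate, return code 0
        int 21h
main    endp
        end main
```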
DD - Define Doubleword
• The DD directive creates storage for one or more 32-bit doublewords. The syntax is:
    [label] DD initialvalue [,initialvalue]
• Initialvalue can be any 32-bit value up to FFFFFFFFh, a segment-offset address, a 4-byte encoded real number, or a decimal real number. The bytes are stored in little-endian format, i.e. the value 12345678h would be stored in memory as:
    memory address (offset): 00 01 02 03
    contents:                78 56 34 12
…DD - Define Doubleword
• You can define either a single doubleword or a list of doublewords. In the example that follows, far_pointer1 is uninitialised and the assembler automatically initialises far_pointer2 to the 32-bit segment-offset address of subroutine1:
    signed_val   dd -2147483648
    far_pointer1 dd ?
    far_pointer2 dd subroutine1
DUP Operator
• The DUP operator only appears after a storage allocation directive (DB, DW,...). DUP allows for the repetition of one or more values when allocating storage. This is especially useful when allocating space for a table or array. For example:
    db 20 dup(0)          ; 20 bytes, all zeroed
    db 20 dup(?)          ; 20 uninitialised bytes
    db 4 dup('ABC')       ; 12 bytes: 'ABCABCABCABC'
• The DUP operator may also be nested. The first example below creates 20 bytes of storage containing (in ASCII) 000XX000XX000XX000XX, i.e. the pattern 000XX repeated four times. The second example creates a 2-dimensional word table of 3 rows by 4 columns:
    aTable  db 4 dup( 3 dup('0'), 2 dup('X') )
    anArray dw 3 dup( 4 dup(0) )
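The assembler only reserves the 12 words; treating them as 3 rows by 4 columns is up to the program. One possible way to address an element, assuming row-major order, MASM/TASM syntax and a 16-bit DOS target, is sketched below. The constants ROWS and COLS, the stored value 7 and the program skeleton are illustrative assumptions.

```asm
; Sketch only: MASM/TASM syntax and a 16-bit DOS target are assumed.
        .model small
        .stack 100h
ROWS    equ 3
COLS    equ 4
        .data
anArray dw ROWS dup( COLS dup(0) )  ; 12 words = 24 bytes, all zero
        .code
main    proc
        mov ax, @data
        mov ds, ax                  ; make the data segment addressable
        ; byte offset of (row, col) = (row*COLS + col) * 2, words being 2 bytes
        mov bx, (1*COLS + 2) * 2    ; row 1, column 2: offset 12
        mov anArray[bx], 7          ; store 7 in that element
        mov ax, anArray[bx]         ; AX = 7
        mov ax, 4C00h               ; DOS terminate, return code 0
        int 21h
main    endp
        end main
```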
Type Checking
• When a variable is created using DB, DW, etc., the assembler gives it a default attribute (byte, word, etc.) based on its size. This type is checked on referencing the variable and an error results if the types do not match. So:
    count dw 20h
    ...
    mov al,count          ; error: type mismatch
…Type Checking
• To overcome type checks requires the use of a LABEL directive to create a new name (and associated type) at the same address. Thus:
    count_lo label byte   ; byte attribute
    count    dw 20h       ; word attribute
    ...
    mov al,count_lo       ; use low byte of count
    mov cx,count          ; use all of count
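MASM-compatible assemblers also offer the PTR operator, which overrides the type for a single reference without declaring a second name. The sketch below is an alternative to the LABEL approach rather than the method used in these slides; it assumes MASM/TASM syntax and a 16-bit DOS target, and uses the value 1234h (not the slide's 20h) so that the two bytes are distinguishable.

```asm
; Sketch only: MASM/TASM syntax and a 16-bit DOS target are assumed.
        .model small
        .stack 100h
        .data
count   dw 1234h                    ; stored low byte first: 34h, 12h
        .code
main    proc
        mov ax, @data
        mov ds, ax                  ; make the data segment addressable
        mov al, byte ptr count      ; AL = 34h, the low byte
        mov ah, byte ptr [count+1]  ; AH = 12h, the high byte
        mov cx, count               ; CX = 1234h, no override needed
        mov ax, 4C00h               ; DOS terminate, return code 0
        int 21h
main    endp
        end main
```

LABEL suits a byte view that is used throughout the program, while PTR is convenient for an occasional one-off access.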
Addressing Modes Revisited
• As we have seen, an instruction consists of
  • the op-code that tells the processor what instruction to perform and,
  • the operand or address field, which tells the processor where to find the data to be operated upon. This address is known as the Effective Address (EA).
• To determine the EA, the processor uses one of a number of addressing modes that are defined by the operand field of the instruction. Getting the EA from the addressing mode may be quite simple (e.g. the operand is [the contents of] a data register) or complex (e.g. the operand is in memory, the address of which is contained in an address register). [See 52223_02/16-34 for details of the IA-32 AMs.]
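As a brief illustration of how the operand field selects the addressing mode, the sketch below reads from a small table in several ways. It assumes MASM/TASM syntax and a 16-bit DOS target; the mode names follow common Intel usage and may differ from those in the referenced lecture, and the names value, table and main are illustrative.

```asm
; Sketch only: MASM/TASM syntax and a 16-bit DOS target are assumed.
        .model small
        .stack 100h
        .data
value   dw 5
table   dw 10,20,30,40
        .code
main    proc
        mov ax, @data
        mov ds, ax              ; make the data segment addressable
        mov ax, 100             ; immediate: operand is in the instruction
        mov cx, ax              ; register: operand is in a register
        mov ax, value           ; direct: EA = offset of value, AX = 5
        mov bx, offset table    ; BX = offset of table
        mov ax, [bx]            ; register indirect: EA = BX, AX = 10
        mov si, 2
        mov ax, [bx+si]         ; based indexed: EA = BX+SI, AX = 20
        mov ax, [bx+si+2]       ; based indexed + displacement: EA = BX+SI+2, AX = 30
        mov ax, 4C00h           ; DOS terminate, return code 0
        int 21h
main    endp
        end main
```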
Aside: Lecture Notes Archive
• Further examples of AMs, etc., of the IA-16 subset of IA-32 can be found in my lecture notes archive at: <http://www.cis.strath.ac.uk/~dunc/cdrom/archives/ay2000/teaching/llp/lectures/odd/part2.html>
DEBUG
• DEBUG is included as part of the standard (32-bit) Windows installation.
• DEBUG is a DOS-mode debugger, which means:
  • It's of no use for debugging Win32 applications
  • But it is useful to explore the wonderful(!) world of the IA-16 (real mode) subset of IA-32.
• An overview of DEBUG can be found in my lecture notes archive at: <http://www.cis.strath.ac.uk/~dunc/cdrom/archives/ay2000/teaching/llp/practicals/debug.html>
H:/llp/p1>debug
References & Bibliography
• Duncan's Archived 52.223 Lecture Notes <http://www.cis.strath.ac.uk/~dunc/cdrom/archives/ay2000/teaching/llp/>
• sandpile.org -- IA-32 architecture <http://www.sandpile.org/ia32/index.htm>
• PC Assembly Language <http://www.drpaulcarter.com/pcasm/>
• Linux Assembly HOWTO <http://www.faqs.org/docs/Linux-HOWTO/Assembly-HOWTO.html>
• Inline Assembly with DJGPP <http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html>
• docs.sun.com: IA-32 Assembly Language Reference Manual <http://docs.sun.com/app/docs/doc/806-3773/6jct9o0ad?a=view>
• Pentium Assembly Code Using gcc <http://william.krieger.faculty.noctrl.edu/archive/c2003_09_csc220/assembly/>
• Microsoft Windows XP - Debug <http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/debug.mspx>