1 / 25

IA32 programming for Linux

IA32 programming for Linux. Concepts and requirements for writing Linux assembly language programs for Pentium CPUs. A source program’s format. Source-file: a pure ASCII-character textfile It’s created using a text-editor (such as ‘vi’) You cannot use a ‘word processor’ (why?)

mharbour
Download Presentation

IA32 programming for Linux

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. IA32 programming for Linux Concepts and requirements for writing Linux assembly language programs for Pentium CPUs

  2. A source program’s format • Source-file: a pure ASCII-character textfile • It’s created using a text-editor (such as ‘vi’) • You cannot use a ‘word processor’ (why?) • Program consists of series of ‘statements’ • Every program-statement fits on one line • Program-statements all have same layout • Designed in 1950s for IBM’s punch-cards

  3. Statement Layout (1950s) • Each ‘statement’ was comprised of four ‘fields’ • Fields appear in a prescribed left-to-right order • These four fields were named (in order): -- the ‘label’ field -- the ‘opcode’ field -- the ‘operand’ field -- the ‘comment’ field • In many cases some fields could be left blank • Extreme case (very useful): whole line is blank!

  4. Statement layout • Parsing an assembly language statement • A colon character (‘:’) terminates the label • A white-space character (e.g., blank or tab) separates the opcode from its operand(s) • A hash character (‘#’) begins the comment Label: opcode operand(s) # comment

  5. The ‘as’ program • The ‘assembler’ is a computer program • It accepts a specified text-file as its input • It must be able to ‘parse’ each statement • It can produce onscreen ‘error messages’ • It can generate an ELF-format output file • (That file is known as an ‘object module’) • It can also generate a ‘listing file’ (optional)

  6. The ‘label’ field • A label is a ‘symbol’ followed by a colon (‘:’) • The programmer invents his/her own ‘symbols’ • Symbols can use letters and digits, plus a very small number of ‘special’ characters ( ‘.’, ‘_’, ‘$’ ) • A ‘symbol’ is allowed to be of arbitrarily length • The Linux assembler (‘as’) was designed for translating source-text produced by a high-level language compiler (such as ‘cc’) • But humans can also write such files directly

  7. The ‘opcode’ field • Opcodes are predefined symbols that are recognized by the GNU assembler • There are two categories of ‘opcodes’ (called ‘instructions’ and ‘directives’) • ‘Instructions’ represent operations that the CPU is able to perform (e.g., ‘add’, ‘inc’) • ‘Directives’ are commands that guide the work of the assembler (e.g., ‘.global’, ‘.int’)

  8. Instructions vs Directives • Each ‘instruction’ gets translated by ‘as’ into a machine-language statement that will be fetched and executed by the CPU when the program runs (i.e., at ‘run time’) • Each ‘directive’ modifies the behavior of the assembler (i.e., at ‘assembly time’) • With GNU assembly language, these are easy to distinguish: directives begin with ‘.’

  9. A list of the Pentium opcodes • An ‘official’ list of the instruction codes can be found in Intel’s programmer manuals: http://developer.intel.com • But it’s three volumes, nearly 1000 pages (it describes ‘everything’ about Pentiums) • An ‘unofficial’ list of (most) Intel instruction codes can fit on one sheet, front and back: http://www.jegerlehner/intel/

  10. The AT&T syntax • The GNU assembler uses AT&T syntax (instead of official Intel/Microsoft syntax) so the opcode names differ slightly from names that you will see on ‘official’ lists: Intel-syntax AT&T-syntax --------------- ---------------------- ADD  addb/addw/addl INC  incb/incw/incl CMP  cmpb/cmpw/cmpl

  11. The UNIX culture • Linux is intended to be a version of UNIX (so that UNIX-trained users already know Linux) • UNIX was developed at AT&T (in early 1970s) and AT&T’s computers were built by DEC, thus UNIX users learned DEC’s assembley language • Intel was early ally of DEC’s competitor, IBM, which deliberately used ‘incompatible’ designs • Also: an ‘East Coast’ versus ‘West Coast’ thing (California, versus New York and New Jersey)

  12. Bytes, Words, Longwords • CPU Instructions usually operate on data-items • Only certain sizes of data are supported: BYTE: one byte consists of 8 bits WORD: consists of two bytes (16 bits) LONGWORD: uses four bytes (32 bits) • With AT&T’s syntax, an instruction’s name also incorporates its effective data-size (as a suffix) • With Intel syntax, data-size usually isn’t explicit, but is inferred by context (e.g., from operands)

  13. The ‘operand’ field • Operands can be of several types: -- a CPU register may hold the datum -- a memory location may hold the datum -- an instruction can have ‘built-in’ data -- frequently there are multiple data-items -- and sometimes there are no data-items • An instruction’s operands usually are ‘explicit’, but in a few cases they also could be ‘implicit’

  14. Examples of operands • Some instruction will have two operands: movl %ebx, %ecx addl $4, %esp • Some instructions will have one operand: incl %eax pushl $fmt • An instruction that lacks explicit operands: ret

  15. The ‘comment’ field • An assembly language program often can be hard for a human being to understand • Even a program’s author may not be able to recall his programming idea after awhile • So programmer ‘comments’ can be vital! • A comment begin with the ‘#’ character • The assembler disregards all comments (but they will appear in program listings)

  16. ‘Directives’ • Sometimes called ‘pseudo-instructions’ • They tell the assembler what to do • The assembler will recognize them • Their names begin with a dot (‘.’) • Examples: ‘.section’, ‘.int,’ ‘.string’, … • The names of valid directives appears in the table-of-contents of the GNU manual

  17. New program example • Let’s look at a demo program (‘squares.s’) • It will display a mathematical table showing some numbers and their squares • But it doesn’t use any multiplications! • It uses an algorithm based on algebra: (n+1)2 - n2 = n + n + 1 If you already know the square of a given number n , you can get the square of the next number n+1 by just doing additions

  18. Visualizing the algorithm idea n n (n + 1)2 = n2 + 2n + 1

  19. Our program uses a ‘loop’ • Here’s our program idea (expressed in C++) int num = 0, val = 0; do { val += num + num + 1; num += 1; printf( “ %d %d \n”, num, val ); } while ( num != 20 );

  20. Some program ‘directives’ • ‘.equ’ – equates a symbol to a value: .equ MAX, 20 • ‘.string’ – just an alternative for ‘.asciz’: body: .string “ %d %d \n” • ‘.space’ – reserves uninitialized memory num: .space 4 # to hold an ‘int’

  21. Some new ‘instructions’ • ‘inc’ – adds one to the specified operand: incl arg • ‘cmp’ – compares two specified operands: cmpl $MAX, arg • ‘je’ – jumps (to a specified instruction) if the condition ‘equal to’ is currently true: je again

  22. Comparisons can be ‘tricky’ • It’s easy to get confused by AT&T syntax: mov $5, %eax while: add $-1, %eax cmp $5, %eax jge while (e.g., will this loop ever finish executing?) • REMEMBER: ‘compare’ means ‘subtract’

  23. The FLAGS register O F D F I F T F S F Z F 0 A F 0 P F 1 C F Legend: ZF = Zero Flag SF = Sign Flag CF = Carry Flag PF = Parity Flag OF = Overflow Flag AF = Auxiliary Flag

  24. In-class exercise #1 • How would you modify the source-code for the ‘squares’ program so that it prints out a larger table (i.e., more than 20 lines)? • Question: How many lines can you display on the screen before your program starts to show ‘wrong’ entries?

  25. In-class exercise #2 • Can you write an enhanced version of our ‘squares.s’ program that would also show the cube for each number (but using only the ‘add’ and ‘inc’ arithmetic operations)? • HINT: Remember these algebra formulas: (n+1)3 - n3 = 3n2 + 3n + 1 3n = n + n + n

More Related