Machine Independent Assembler Features Literal, Symbol, Expression
Literals • Motivation • It is convenient if a programmer can write the value of a constant operand as a part of the instruction that uses it. • This avoids having to define the constant elsewhere in the program and make up a label for it. • Such an operand is called a literal because the value is stated literally in the instruction.
Literal Example (1) Use = to represent a literal
Literal Program (1)’s Object Code Notice that the object code is the same as the previous one.
Literal Program (2)’s Object Code Notice that the object code is the same as the previous one.
Difference between Literal and Immediate Operand • Immediate addressing • The operand value is assembled as part of the machine instruction. • Literal • The assembler generates the specified value as a constant at some other memory location. • The address of this generated constant is used as the target address for the machine instruction. • The effect of using a literal is exactly the same as if the programmer had defined the constant explicitly and used the label assigned to the constant as the instruction operand.
Literal Pool • All of the literal operands used in the program are gathered together into one or more literal pools. • Normally literals are placed into a pool at the end of the program. • Sometimes, it is desirable to place literals into a pool at some other location in the object program. • The LTORG directive is introduced for this purpose. • When the assembler encounters an LTORG, it creates a pool that contains all of the literals used since the previous LTORG (or since the beginning of the program).
LTORG Example • If we do not use an LTORG on line 93, the literal =C’EOF’ will be placed at the end of the program. • This operand will then begin at address 1073 and be too far away from the instruction that uses it, so PC-relative addressing cannot be used. • LTORG is thus used when we want to keep literal operands close to the instructions that use them.
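The "too far away" claim can be checked with a little arithmetic. The sketch below assumes the SIC/XE format 3 rule that a PC-relative displacement must fit in a signed 12-bit field (-2048 to 2047). The address 0x001D for the next instruction and 0x002D for the LTORG pool are illustrative assumptions; only 1073 comes from the slide. The helper name `pc_relative_disp` is hypothetical.

```python
def pc_relative_disp(target: int, next_instr: int):
    """Return the 12-bit PC-relative displacement field, or None if the
    target is out of range for a signed 12-bit displacement."""
    disp = target - next_instr
    if -2048 <= disp <= 2047:
        return disp & 0xFFF  # two's-complement encoding in 12 bits
    return None


NEXT_INSTR = 0x001D  # assumed address of the instruction after the one using =C'EOF'

# Literal left in the pool at the end of the program (address 1073 from the slide).
print(pc_relative_disp(0x1073, NEXT_INSTR))

# Literal placed by LTORG shortly after the instruction (assumed address 0x002D).
print(pc_relative_disp(0x002D, NEXT_INSTR))
```

With these assumed addresses the first call fails the range check and the second succeeds, which is the reason for placing a pool nearby.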
Duplicate Literals • Duplicate literals – the same literal used in more than one place in the program • For duplicate literals, we should store only one copy of the specified data value to save space. • Most assemblers can recognize duplicate literals. • E.g., There are two uses of =X’05’ on lines 215 and 230 respectively. • However, just one copy is generated, at address 1076.
Recognize Duplicate Literals • Comparing the character string defining them • For example, =X’05’ and =X’05’ • Comparing the generated data value • For example, =C’EOF’ and =X’454F46’, which generate the same three bytes • More intelligent • However, usually the benefit is not great enough to justify the added complexity.
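The two comparison strategies above can be sketched as follows. This is a minimal illustration, not any particular assembler's code: the function names are hypothetical, only the `=C'...'` and `=X'...'` forms are handled, and character literals are assumed to be ASCII.

```python
def literal_bytes(literal: str) -> bytes:
    """Return the data bytes generated by a =C'...' or =X'...' literal."""
    if len(literal) < 4 or literal[0] != "=" or literal[2] != "'" or literal[-1] != "'":
        raise ValueError(f"malformed literal: {literal}")
    kind, body = literal[1], literal[3:-1]
    if kind == "C":
        return body.encode("ascii")   # one byte per character
    if kind == "X":
        return bytes.fromhex(body)    # two hex digits per byte
    raise ValueError(f"unsupported literal type: {kind}")


def same_by_text(a: str, b: str) -> bool:
    """Cheap test: the defining character strings are identical."""
    return a == b


def same_by_value(a: str, b: str) -> bool:
    """Smarter test: the generated data values are identical."""
    return literal_bytes(a) == literal_bytes(b)


print(same_by_text("=X'05'", "=X'05'"))
print(same_by_text("=C'EOF'", "=X'454F46'"))
print(same_by_value("=C'EOF'", "=X'454F46'"))
```

Comparing by text misses the `=C'EOF'` / `=X'454F46'` pair, while comparing by value catches it at the cost of evaluating every literal first.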
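The Problem with String-Defining Literals • We should be careful with literals whose values depend on their location in the program, because two occurrences with identical defining strings can have different values. • E.g., ( * usually denotes the current value of the location counter) BASE * LDB =* • If the literal =* appears on line 13, it would specify an operand with value 0003. If the same literal appears on line 55, it would specify an operand with value 0020. • Such occurrences must not be merged as duplicates.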
Literal Processing (1) • Need a literal table LITTAB. For each literal used, the table contains: • Literal name • The operand value and length • The address assigned to the operand when it is placed in a literal pool.
Literal Processing (2) • Pass 1: • When encountering a literal, try to find it in LITTAB. If it can be found, do nothing. • Otherwise, the literal is added to LITTAB. • Name, operand value and length can be entered now. • Only the address field is left blank. • When encountering an LTORG or the end of the program, the assembler makes a scan of LITTAB. At this time, each literal currently in this table is assigned an address. The location counter should then be updated to reflect the number of bytes occupied by each literal.
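The Pass 1 steps above can be sketched as a small table class. The names `LitEntry`, `LitTab`, `see` and `place_pool` are hypothetical. One assumption goes beyond the slide: a literal that already received an address in an earlier pool keeps it and is not placed again. The address 0x002D is illustrative; 0x1076 is the address mentioned on the Duplicate Literals slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LitEntry:
    value: bytes                    # operand value
    length: int                     # length in bytes
    address: Optional[int] = None   # left blank until a pool is placed


class LitTab:
    def __init__(self):
        self.entries = {}  # literal name -> LitEntry, in order of first use

    def see(self, name: str, value: bytes) -> None:
        """Called for each literal operand met in Pass 1."""
        if name not in self.entries:  # duplicates are ignored
            self.entries[name] = LitEntry(value, len(value))

    def place_pool(self, locctr: int) -> int:
        """Called at LTORG or END: assign addresses, return the new LOCCTR."""
        for entry in self.entries.values():
            if entry.address is None:  # assumption: earlier pools stay put
                entry.address = locctr
                locctr += entry.length
        return locctr


littab = LitTab()
littab.see("=C'EOF'", b"EOF")
after_first_pool = littab.place_pool(0x002D)    # LTORG (illustrative address)

littab.see("=X'05'", b"\x05")
littab.see("=X'05'", b"\x05")                   # second use, no new entry
after_second_pool = littab.place_pool(0x1076)   # end of program

for name, entry in littab.entries.items():
    print(name, entry.value.hex().upper(), entry.length, format(entry.address, "04X"))
```

Pass 2 would then only need to look up each literal's `address` in this table.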
Literal Processing (3) • Pass 2: • For each literal operand encountered, we look it up in LITTAB to find its assigned address. • The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program exactly as if these values had been generated by BYTE or WORD statements. • If a literal represents an address in the program (e.g., a location counter value), the assembler must also generate the appropriate Modification record.
Symbols • We can use the EQU directive to define a symbol’s value. • So far, the only symbols defined in a program are labels, whose values are the addresses assigned to them. • E.g., MAXLEN EQU 4096. • The value assigned to a symbol may be a constant, or any expression involving constants and previously defined symbols.
Usages of Symbols (1) • Establish symbolic names to improve readability in place of numeric values. • E.g., +LDT #4096 can be changed to: MAXLEN EQU 4096 +LDT #MAXLEN • When the assembler encounters the EQU statement, it enters MAXLEN into SYMTAB (with value 4096). • During assembly of the LDT instruction, the assembler searches SYMTAB for the symbol MAXLEN, using its value as the operand in the instruction.
Usages of Symbols (2) • Define mnemonic names for registers: (1) (2)
ORG Directive • This can be used to indirectly assign values to symbols. • When this statement is encountered during assembly of a program, the assembler resets its location counter to the specified value. • The ORG statement will thus affect the values of all labels defined until the next ORG. • Normally, when an ORG without a specified value is encountered, the previously saved location counter value is restored.
ORG Usage Example • Suppose that we have the following data structure and want to access its fields:
Program without Using ORG • Shows offsets, less readable • We can then use LDA VALUE,X to fetch the VALUE field from the table entry indicated by the content of register X. • To fetch the next record, X is advanced by the size of one entry (6 + 1 + 2).
Program Using ORG • Shows sizes, more readable • The final ORG restores the location counter to its old value
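How ORG assigns field offsets can be sketched by simulating the location counter. The slide's program is not reproduced in this text, so the statements below are an assumed reconstruction: a table named STAB of 100 entries, each with a 6-byte SYMBOL, a one-word VALUE (3 bytes on SIC) and a 2-byte FLAGS field. The `layout` function and its tuple format are hypothetical.

```python
WORD = 3  # bytes per SIC word


def layout(statements, start=0):
    """Walk (label, op, operand) tuples and return (SYMTAB, final LOCCTR)."""
    symtab, locctr = {}, start
    for label, op, operand in statements:
        if op == "ORG":
            base, offset = operand          # operand form: symbol + constant
            locctr = symtab[base] + offset  # symbol must already be defined
            continue
        if label is not None:
            symtab[label] = locctr
        if op == "RESB":
            locctr += operand
        elif op == "RESW":
            locctr += WORD * operand
        else:
            raise ValueError(f"unsupported directive: {op}")
    return symtab, locctr


statements = [
    ("STAB",   "RESB", 1100),           # assumed: 100 entries of 11 bytes
    (None,     "ORG",  ("STAB", 0)),    # reset LOCCTR to the start of the table
    ("SYMBOL", "RESB", 6),
    ("VALUE",  "RESW", 1),
    ("FLAGS",  "RESB", 2),
    (None,     "ORG",  ("STAB", 1100)), # restore LOCCTR past the table
]

symtab, locctr = layout(statements)
print(symtab, locctr)
```

The labels defined between the two ORG statements take the addresses of the first entry's fields, so an instruction such as LDA VALUE,X reaches the field of the entry selected by X.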
No Forward Reference Allowed • For EQU and ORG, all symbols used on the right hand side of the statement must have been defined previously in the program. • This is because in the two-pass assembler, we require that all symbols must be defined in pass 1. Allowed Not allowed
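The restriction can be sketched as a check made while processing an EQU statement in Pass 1. The function `define_equ`, the term format, and the symbols ALPHA, BETA, GAMMA and DELTA are hypothetical; MAXLEN EQU 4096 is the example from the earlier slide.

```python
def define_equ(symtab, name, terms):
    """Evaluate an EQU operand given as (sign, term) pairs and enter it in SYMTAB.

    Each term is an integer constant or the name of a symbol, which must
    already be in SYMTAB when the statement is processed.
    """
    value = 0
    for sign, term in terms:
        if isinstance(term, str):
            if term not in symtab:
                raise ValueError(f"{name}: forward reference to {term}")
            term = symtab[term]
        value += sign * term
    symtab[name] = value
    return value


symtab = {"ALPHA": 0x0030}                     # label defined earlier (assumed address)

define_equ(symtab, "MAXLEN", [(+1, 4096)])     # constant operand
define_equ(symtab, "BETA", [(+1, "ALPHA")])    # allowed: ALPHA already defined

try:
    define_equ(symtab, "GAMMA", [(+1, "DELTA")])  # not allowed: DELTA comes later
except ValueError as err:
    print("rejected:", err)
```

Because the failing statement never enters GAMMA into SYMTAB, any later statement that depends on GAMMA fails the same check.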
ORG Restriction Not allowed Not allowed
Expression • So far, when we define the value of a symbol or label, only one term is used. • E.g., 106 BUFEND EQU * • Actually, we can also use expressions that contain several terms to define a symbol’s value. • E.g., 107 MAXLEN EQU BUFEND - BUFFER
Relative vs. Absolute Value • Generally, the +, -, *, and / operations are allowed in an expression. • Division is defined to produce an integer. • Regarding program relocation, a symbol’s value can be classified as • Relative • Its value is relative to the beginning of the object program, and thus depends on the program’s location. • E.g., labels or references to the location counter (*) • Absolute • Its value is independent of program location. • E.g., a constant.
Relative vs. Absolute Expression • Depending on the type of value they produce, expressions are classified as: • Absolute • An expression that contains only absolute terms, or • An expression that contains relative terms, provided the relative terms occur in pairs and the terms in each pair have opposite signs. (Relative terms may not enter into / or * operations.) • Relative • An expression in which all relative terms except one can be paired, and the remaining unpaired term has a positive sign. (Relative terms may not enter into / or * operations.)
Absolute Expression Example • 107 MAXLEN EQU BUFEND – BUFFER • Although BUFEND and BUFFER are relative terms (because their values will change when the program is loaded into a different place in memory), the expression (BUFEND – BUFFER) is an absolute expression. • Why? the value of this expression is 0x1000, which is the same no matter where this program is loaded. • Why? Because BUFEND and BUFFER can be represented as (startingaddr + x) and (startingaddr + y), BUFEND – BUFFER becomes (x – y), which is a constant.
Illegal Expressions • BUFEND + BUFFER • 100 – BUFFER • 3 * BUFFER • These expressions represent neither absolute values nor locations within the program. • Therefore, these expressions are considered illegal.
Enhanced Symbol Table • To determine the type of an expression, we must keep track of the types of all symbols defined in the program. • Therefore, we need a flag in the symbol table to indicate the type of value (absolute or relative) in addition to the value itself.
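The pairing rules and the type flag can be sketched together. The approach assumed here is to sum the signs of the relative terms: a net count of 0 means every relative term is paired (absolute), a net count of +1 means one positive term is left over (relative), and anything else is illegal. The names `Symbol` and `classify` are hypothetical, only + and - are modelled, and the addresses 0x0036 and 0x1036 are assumptions chosen to give the 0x1000 difference quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    value: int
    relative: bool  # the extra flag: True = relative, False = absolute


def classify(terms, symtab):
    """Classify an expression given as (sign, operand) pairs.

    sign is +1 or -1; operand is an integer constant or a symbol name.
    Returns (kind, value), where value is None for illegal expressions.
    """
    net_relative = 0
    value = 0
    for sign, operand in terms:
        if isinstance(operand, int):
            value += sign * operand
            continue
        sym = symtab[operand]
        value += sign * sym.value
        if sym.relative:
            net_relative += sign
    if net_relative == 0:
        return "absolute", value
    if net_relative == 1:
        return "relative", value
    return "illegal", None


symtab = {
    "BUFFER": Symbol(0x0036, relative=True),   # assumed address
    "BUFEND": Symbol(0x1036, relative=True),   # assumed address
    "MAXLEN": Symbol(0x1000, relative=False),
}

print(classify([(+1, "BUFEND"), (-1, "BUFFER")], symtab))  # paired relative terms
print(classify([(+1, "BUFFER"), (+1, 3)], symtab))         # one unpaired positive term
print(classify([(+1, "BUFEND"), (+1, "BUFFER")], symtab))  # two unpaired terms
print(classify([(+1, 100), (-1, "BUFFER")], symtab))       # unpaired negative term
```

An expression such as 3 * BUFFER is outside this sketch; treating it as BUFFER + BUFFER + BUFFER gives a net count of 3, which the same rule rejects.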