Learn the role of assemblers in translating human-readable mnemonic instructions into machine language, and how assemblers differ from compilers. Explore the two-pass assembly process, symbol tables, and direct vs. indirect addressing modes.
A Discussion on Assemblers
• Mnemonic instructions, such as LOAD 104, are easy for humans to write and understand. Labels can also be used to identify particular memory locations.
• They are impossible for computers to understand directly.
• Assemblers translate instructions that are comprehensible to humans into the machine language that is comprehensible to computers.
• We note the distinction between an assembler and a compiler: in assembly language, there is a one-to-one correspondence between a mnemonic instruction and its machine code. With compilers, this is not usually the case.
• Assemblers create an object program file from mnemonic source code (the assembly program) in two passes.
• During the first pass, the assembler assembles as much of the program as it can while it builds a symbol table that contains memory references for all symbols in the program.
• During the second pass, the instructions are completed using the values from the symbol table.
Lecture
A Discussion on Assemblers
• Consider our example program at the right.
• Note that we have included two directives, HEX and DEC, that specify the radix of the constants.
• The first pass creates a symbol table and the partially assembled instructions as shown (i.e., the assembler does not yet know that X is located at address 104).
• After the first pass, the translated instructions are therefore still incomplete.
• Figure labels: the mnemonic instruction (or alphanumeric name), and the label (or memory location name).
A Discussion on Assemblers
• During the second pass, the assembler uses the symbol table to fill in the addresses and create the corresponding machine language instructions.
• After the second pass, the assembler knows that X is located at address 104, and every instruction that refers to X is fully translated to machine code.
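The two passes can be sketched in a few lines of Python. This is a minimal sketch, not MARIE's actual assembler: the example program, the `assemble` and `parse` helpers, and the four-entry opcode table are assumptions made for illustration, and for simplicity the first pass here only builds the symbol table instead of also partially assembling the instructions.

```python
# Minimal two-pass assembler sketch for a MARIE-like 16-bit word:
# 4-bit opcode, 12-bit address. The opcode subset below is an assumption.
OPCODES = {"LOAD": 0x1, "STORE": 0x2, "ADD": 0x3, "HALT": 0x7}
DIRECTIVES = {"DEC": 10, "HEX": 16}  # directive -> radix of its constant


def parse(line):
    """Split 'Label, OP operand' into (label, op, operand)."""
    label = None
    if "," in line:
        label, line = line.split(",", 1)
        label = label.strip()
    parts = line.split()
    operand = parts[1] if len(parts) > 1 else None
    return label, parts[0].upper(), operand


def assemble(source_lines, origin):
    program = [parse(line) for line in source_lines if line.strip()]

    # Pass 1: walk the program once and record the address of every label.
    symbols = {}
    for offset, (label, _op, _operand) in enumerate(program):
        if label is not None:
            symbols[label] = origin + offset

    # Pass 2: emit one 16-bit word per line, resolving symbols to addresses.
    words = []
    for _label, op, operand in program:
        if op in DIRECTIVES:
            words.append(int(operand, DIRECTIVES[op]) & 0xFFFF)
        else:
            address = symbols[operand] if operand is not None else 0
            words.append((OPCODES[op] << 12) | address)
    return symbols, words


# Hypothetical example program, loaded at hexadecimal address 100.
EXAMPLE = [
    "Load X",
    "Add Y",
    "Store Z",
    "Halt",
    "X, DEC 35",
    "Y, DEC -23",
    "Z, HEX 0000",
]
symbols, words = assemble(EXAMPLE, origin=0x100)
print({name: format(addr, "X") for name, addr in symbols.items()})
print([format(word, "04X") for word in words])
```

With this layout, X lands at hexadecimal address 104, so the first instruction can only be completed once the symbol table is available.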
Extending Our Instruction Set
• So far, all of the MARIE instructions that we have discussed use a direct addressing mode.
• This means that the address of the operand is explicitly stated in the instruction.
• It is often useful to employ indirect addressing, where the address of the address of the operand is given in the instruction.
• If you have ever used pointers in a program, you are already familiar with indirect addressing.
Extending Our Instruction Set
• We have included three indirect addressing mode instructions in the MARIE instruction set.
• The first two are LOADI X and STOREI X, where the memory location X holds the address of the operand to be loaded or stored.
• In RTL:
  LOADI X:  MAR ← X;  MBR ← M[MAR];  MAR ← MBR;  MBR ← M[MAR];  AC ← MBR
  STOREI X: MAR ← X;  MBR ← M[MAR];  MAR ← MBR;  MBR ← AC;  M[MAR] ← MBR
• It would be the same conceptually for AddI, SubI, JumpI and JnS.
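The register transfers for LOADI and STOREI can be traced step by step in Python. This is a sketch under stated assumptions: memory is modeled as a plain dictionary, and the function names `loadi` and `storei` are illustrative, not part of any MARIE tool.

```python
def loadi(memory, x):
    """Follow the LOADI X register transfers; return the new AC value."""
    mar = x              # MAR <- X
    mbr = memory[mar]    # MBR <- M[MAR]  (fetch the operand's address)
    mar = mbr            # MAR <- MBR
    mbr = memory[mar]    # MBR <- M[MAR]  (fetch the operand itself)
    return mbr           # AC  <- MBR


def storei(memory, x, ac):
    """Follow the STOREI X register transfers, storing AC indirectly."""
    mar = x              # MAR <- X
    mbr = memory[mar]    # MBR <- M[MAR]  (fetch the target address)
    mar = mbr            # MAR <- MBR
    mbr = ac             # MBR <- AC
    memory[mar] = mbr    # M[MAR] <- MBR


# Location 0x200 acts like a pointer to location 0x300.
memory = {0x200: 0x300, 0x300: 42}
print(loadi(memory, 0x200))
storei(memory, 0x200, 7)
print(memory[0x300])
```

Location 0x200 plays the role of a pointer variable here, which is the connection to pointers in high-level languages.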
Extending Our Instruction Set
• Our first new instruction is the CLEAR instruction.
• All it does is set the contents of the accumulator to all zeroes.
• This is the RTL for CLEAR: AC ← 0
A Discussion on Decoding
• As mentioned earlier, the control unit causes the CPU to execute a sequence of steps correctly.
• It does this by asserting control signals on various components to make them active.
• A computer’s control unit keeps things synchronized, making sure that bits flow to the correct components as the components are needed.
• There are two general ways in which a control unit can be implemented: hardwired control and microprogrammed control.
• With microprogrammed control, a small program is placed into read-only memory in the microcontroller.
• Hardwired controllers implement this program using digital logic components. There is a direct connection between the control lines and the machine instructions.
A Discussion on Decoding
• Your text provides a complete list of the register transfer language (RTL), also called register transfer notation (RTN), for each of MARIE’s instructions.
• The RTL or RTN actually defines the microoperations of the control unit.
• Each microoperation consists of a distinctive signal pattern that is interpreted by the control unit and results in the execution of an instruction.
• The signals are fed to combinational circuits within the control unit that carry out the logical operations for the instruction.
• Recall that the RTL for the Add instruction is: MAR ← X;  MBR ← M[MAR];  AC ← AC + MBR
A Discussion on Decoding
• Each of MARIE’s registers and main memory has a unique address along the datapath (0 through 7).
• The addresses take the form of signals issued by the control unit.
• Let us define two sets of three signals.
• One set, P2, P1, P0, controls reading from memory or a register, and the other set, P5, P4, P3, controls writing to memory or a register.
• Let’s examine MARIE’s MBR (with address 3).
• Keep in mind from Ch 2 how registers are configured using flip-flops.
A Discussion on Decoding - MBR
• The MBR register (address 3, binary 011) is enabled for reading when P0 and P1 are high and P2 is low.
• The MBR register is enabled for writing when P3 and P4 are high and P5 is low.
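The address decoding for the MBR can be modeled as a simple bit comparison. This is a sketch of the idea, not a description of the actual circuit: the helper names are hypothetical, and only the MBR's address (3) is taken from the slides.

```python
def address_bits(address):
    """Return the select bits (high, middle, low) of a 3-bit datapath address."""
    if not 0 <= address <= 7:
        raise ValueError("datapath addresses run from 0 through 7")
    return ((address >> 2) & 1, (address >> 1) & 1, address & 1)


def read_enabled(address, p2, p1, p0):
    """True when the read signals P2, P1, P0 select the given address."""
    return (p2, p1, p0) == address_bits(address)


def write_enabled(address, p5, p4, p3):
    """True when the write signals P5, P4, P3 select the given address."""
    return (p5, p4, p3) == address_bits(address)


MBR = 3  # binary 011: the low two select bits are high

print(read_enabled(MBR, p2=0, p1=1, p0=1))
print(write_enabled(MBR, p5=0, p4=1, p3=1))
```

In hardware, the comparison corresponds to an AND of the select lines (with the high bit inverted for address 3) feeding the register's enable input.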
A Discussion on Decoding
• We note that the signal pattern just described is the same whether our machine uses hardwired or microprogrammed control.
• In hardwired control, the bit pattern of the machine instruction in the IR is decoded by combinational logic.
• The decoder output works with the control signals of the current system state to produce a new set of control signals.
• Figure annotations: (1) the decoder produces a unique output signal corresponding to the opcode in the IR; (2) the control logic produces the series of signals that result in the execution of the microoperations; (3) the timing logic produces the timing signal for each tick of the clock (sequential logic is used here because the series of timing signals is repeated), so that a different group of logic can be activated on each tick.
A Discussion on Decoding - ADD
• Figure annotations: the bit pattern for the Add instruction (0011) is in the IR; the timing signals, combined with the instruction bits, produce the required behavior; the result appears on the control lines and bits that control the register functions and the ALU.
A Discussion on Decoding
• The hardwired approach is fast; however, the control logic is tied together in circuits and is complex to modify.
• In microprogrammed control, the control logic can be modified more easily.
• In microprogrammed control, instruction microcode produces control signal changes.
• Machine instructions are the input for a microprogram that converts the 1s and 0s of an instruction into control signals.
• The microprogram is stored in firmware, which is also called the control store.
• A microcode instruction is retrieved during each clock cycle.
A Discussion on Decoding
• All machine instructions are input into the microprogram. The microprogram’s job is to convert the machine instructions into control signals.
• In the hardwired approach, timing signals from the clock are ANDed with the instruction bits using combinational logic circuits to invoke the control signals.
• In the microprogram approach, the instruction microcode produces changes in the data-path signals.
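The microprogrammed idea can be illustrated with a toy control store in Python. This is a heavily simplified sketch: real microcode emits control-signal bit patterns, whereas here each microoperation is a hypothetical Python function acting on a dictionary of registers, and only the Add instruction (opcode 0011) is modeled.

```python
def mar_from_ir(state):
    # MAR <- X (the 12-bit address field of the instruction in the IR)
    state["MAR"] = state["IR"] & 0x0FFF


def mbr_from_memory(state):
    # MBR <- M[MAR]
    state["MBR"] = state["M"][state["MAR"]]


def ac_add_mbr(state):
    # AC <- AC + MBR, kept within 16 bits
    state["AC"] = (state["AC"] + state["MBR"]) & 0xFFFF


# Toy control store: opcode -> ordered list of microoperations.
CONTROL_STORE = {0b0011: [mar_from_ir, mbr_from_memory, ac_add_mbr]}


def execute(state):
    """Look up the microprogram for the opcode and run one step per tick."""
    opcode = state["IR"] >> 12
    for micro_op in CONTROL_STORE[opcode]:
        micro_op(state)
    return state


state = {"IR": 0x3105, "AC": 5, "MAR": 0, "MBR": 0, "M": {0x105: 10}}
print(execute(state)["AC"])
```

Changing an instruction's behavior here means editing a table entry, which mirrors why microprogrammed control is easier to modify than hardwired logic.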
A Closer Look at Instruction Set Architectures, Chapter 5
Dr. Clincy
Introduction
• In Ch 4, we learned that machine instructions consist of opcodes and operands. Opcodes specify the operations to be executed; operands specify register or memory locations of data.
• Sections 5.1 and 5.2 build upon the ideas in Chapter 4 and look more closely at instruction set architecture (ISA), specifically the instruction format.
• We will look at different instruction formats.
• We will see the interrelation between machine organization and instruction formats.
• Understanding the low-level instruction set architecture and format beneath a high-level language leads to a deeper understanding of the computer’s architecture in general.
• As a computer scientist, understanding the computer’s architecture in more detail lets you build more efficient and reliable programs.
• When a computer architecture is in the design phase, the instruction set format must be determined first; it must match the architecture and last for years.
Instruction Formats
Instruction sets are differentiated by the following:
• Number of bits per instruction (16, 32, 64).
• How the data is stored (stack-based or register-based).
• Number of explicit operands per instruction (0, 1, 2, or 3).
• Operand location (instructions can be classified as register-to-register, register-to-memory, or memory-to-memory).
• Types of operations (or instructions), and which instructions can access memory.
• Type and size of operands (operands can be addresses, numbers, or characters).
Instruction Formats
As mentioned earlier, when a computer architecture is in the design phase, the instruction set format must be determined first; it must match the architecture and last for years. Instruction set architectures are measured by several factors:
• The amount of space a program requires.
• Instruction complexity (i.e., the decoding required).
• Instruction length (in bits).
• The total number of instructions in the instruction set.
Instruction Formats
Issues to consider when designing an instruction set:
• Instruction length.
  • Whether short, long, or variable.
  • Short instructions use less space but are limited.
  • Fixed-size instructions are easier to decode but waste space.
• Memory organization.
  • Whether byte- or word-addressable.
  • If memory is word-addressable rather than byte-addressable, it can be difficult to access a single character (one byte).
• Number of operands.
  • How should operands be stored in the CPU?
• Number of addressable registers.
  • How many registers?
  • How should the registers be organized?
• Addressing modes.
  • Choose any or all: direct, indirect, or indexed. For example, MARIE used the direct and indirect addressing modes.
• Byte order.
  • Given a byte-addressable machine, should the least significant byte be stored at the highest or lowest byte address? (The little endian vs. big endian debate.)
Instruction Formats
• The term “endian” refers to a computer’s byte order, or the way the computer stores the bytes of a multiple-byte value.
• If we have a two-byte integer, the integer may be stored so that the least significant byte is followed by the most significant byte, or vice versa.
• In little endian machines, the least significant byte is stored first (at the lower address) and is followed by the most significant byte (i.e., LSB-MSB) (most PCs, Intel).
• Big endian machines store the most significant byte first (at the lower address) (i.e., MSB-LSB) (most UNIX machines, computer networks, Motorola).
• As an example, suppose we have the hexadecimal number 12345678.
• At increasing byte addresses, the big endian arrangement is 12 34 56 78 and the little endian arrangement is 78 56 34 12.
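The two byte orders for the hexadecimal number 12345678 can be checked with Python's built-in `int.to_bytes`. The helper name `byte_layout` is an assumption made for this sketch; it lists the bytes from the lowest address to the highest.

```python
def byte_layout(value, size, byteorder):
    """Return the bytes of value as hex strings, lowest address first."""
    return [format(b, "02X") for b in value.to_bytes(size, byteorder=byteorder)]


value = 0x12345678
print("big endian:   ", byte_layout(value, 4, "big"))
print("little endian:", byte_layout(value, 4, "little"))
```

The standard library's `sys.byteorder` reports which of the two orders the host machine uses.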
Instruction Formats
• Big endian:
  • Is more natural.
  • The sign of the number can be determined by looking at the byte at address offset 0.
  • Strings and integers are stored in the same order.
  • Doesn’t allow values on non-word boundaries (i.e., odd-numbered byte addresses); for example, if a word is 2 bytes, values must start at address 0, 2, 4, 6, 8, etc.
  • Conversion from a 16-bit integer address to a 32-bit integer address requires arithmetic.
• Little endian:
  • Makes it easier to place values on non-word boundaries (i.e., odd-numbered byte addresses).
  • Conversion from a 16-bit integer address to a 32-bit integer address does not require any arithmetic.
Instruction Formats
• Once the designer decides how the bytes should be ordered in memory, the next consideration for the architecture design is how the CPU will store data.
• We have three choices:
  1. A stack architecture
  2. An accumulator architecture
  3. A general purpose register architecture
• In choosing one over the others, the tradeoff is simplicity (and cost) of hardware design versus execution speed and ease of use.
Instruction Formats
• In a stack architecture, instructions and operands are implicitly taken from the stack.
  • A stack cannot be accessed randomly.
• In an accumulator architecture, one operand of a binary operation is implicitly in the accumulator.
  • The other operand is in memory, creating lots of bus traffic.
• In a general purpose register (GPR) architecture, registers can be used instead of memory.
  • Faster than accumulator architecture.
  • Efficient implementation for compilers.
  • Results in longer instructions.
Instruction Formats
• Most systems today are GPR systems.
• There are three types:
  • Memory-memory, where two or three operands may be in memory.
  • Register-memory, where at least one operand must be in a register.
  • Load-store, where no operands may be in memory.
• The number of operands and the number of available registers have a direct effect on instruction length.
Instruction Formats
• Machine instructions that have no operands must use a stack (last in, first out (LIFO)).
• Stack architectures make use of “push” and “pop” instructions.
  • Push X places the data in memory location X onto the stack.
  • Pop X removes the top element in the stack and stores it at location X.
• Stack architectures require us to think about arithmetic expressions a little differently.
• We are accustomed to writing expressions using infix notation, such as: Z = X + Y.
• Stack arithmetic requires that we use postfix notation, which places the operator after the operands: Z = X Y +.
• This is also called reverse Polish notation, (somewhat) in honor of its Polish inventor, Jan Lukasiewicz (1878 - 1956).
Instruction Formats
• Example 1: adding using a stack
  • The CPU adds the top two elements of the stack, popping them both,
  • and then pushes the sum onto the top of the stack.
• Example 2: subtracting using a stack
  • The top stack element is subtracted from the next-to-the-top element, and both are popped.
  • The result is then pushed onto the top of the stack.
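The push, pop, add, and subtract behavior described above can be sketched as a tiny stack machine. The class name `StackMachine` and the use of a dictionary for memory are assumptions made for illustration.

```python
class StackMachine:
    """Toy zero-operand machine: arithmetic works only on the stack."""

    def __init__(self, memory):
        self.memory = memory
        self.stack = []

    def push(self, x):
        # Push X: place the data in memory location X onto the stack.
        self.stack.append(self.memory[x])

    def pop(self, x):
        # Pop X: remove the top element and store it at location X.
        self.memory[x] = self.stack.pop()

    def add(self):
        # Pop the top two elements and push their sum.
        top = self.stack.pop()
        below = self.stack.pop()
        self.stack.append(below + top)

    def sub(self):
        # Subtract the top element from the next-to-the-top element.
        top = self.stack.pop()
        below = self.stack.pop()
        self.stack.append(below - top)


# Z = X - Y, written for the stack machine as: Push X, Push Y, Sub, Pop Z
machine = StackMachine({"X": 10, "Y": 4, "Z": 0})
machine.push("X")
machine.push("Y")
machine.sub()
machine.pop("Z")
print(machine.memory["Z"])
```

Operand order matters for subtraction: the element pushed first is the one subtracted from.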
Instruction Formats
• The principal advantage of postfix notation is that parentheses are not used.
• The infix expression “3 + 4” is equivalent to the postfix expression “3 4 +”.
• For example, the infix expression Z = (X × Y) + (W × U) becomes Z = X Y × W U × + in postfix notation.
Instruction Formats
• Example: Convert the infix expression (2+3) - 6/3 to postfix:
  1. The sum in parentheses takes precedence; we replace (2+3) with 2 3 +, giving 2 3 + - 6/3.
  2. The division operator takes next precedence; we replace 6/3 with 6 3 /, giving 2 3 + - 6 3 /.
  3. The quotient 6/3 is subtracted from the sum of 2 + 3, so we move the - operator to the end, giving 2 3 + 6 3 / -.
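The conversion can also be automated. The sketch below uses the shunting-yard algorithm, a standard technique that is swapped in here and is not the by-hand method shown on the slide; the function names and the space-separated token format are assumptions. A small postfix evaluator is included to show how a stack consumes the result.

```python
import operator

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def to_postfix(tokens):
    """Convert infix tokens to postfix (all operators left-associative)."""
    output, pending = [], []
    for token in tokens:
        if token in PRECEDENCE:
            # Emit pending operators of equal or higher precedence first.
            while (pending and pending[-1] != "("
                   and PRECEDENCE[pending[-1]] >= PRECEDENCE[token]):
                output.append(pending.pop())
            pending.append(token)
        elif token == "(":
            pending.append(token)
        elif token == ")":
            while pending[-1] != "(":
                output.append(pending.pop())
            pending.pop()  # discard the matching "("
        else:
            output.append(token)  # operand
    while pending:
        output.append(pending.pop())
    return output


def evaluate_postfix(tokens):
    """Evaluate numeric postfix tokens with an explicit stack."""
    stack = []
    for token in tokens:
        if token in OPERATIONS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATIONS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


postfix = to_postfix("( 2 + 3 ) - 6 / 3".split())
print(" ".join(postfix))
print(evaluate_postfix(postfix))
```

Because operands are emitted in their original order and only operators move, the postfix output needs no parentheses.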
Instruction Formats
• We have seen how instruction length is affected by the number of operands supported by the ISA.
• In any instruction set, not all instructions require the same number of operands.
• Operations that require no operands, such as HALT, necessarily waste some space when fixed-length instructions are used.
• One way to recover some of this space is to use expanding opcodes.
Instruction Formats
• A system has 16 registers and 4K of memory.
• We need 4 bits to access one of the registers. We also need 12 bits for a memory address.
• If the system is to have 16-bit instructions, we have two choices for our instructions: a 4-bit opcode followed by three 4-bit register addresses, or a 4-bit opcode followed by one 12-bit memory address.
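The bit arithmetic behind a 16-bit instruction can be made concrete with shifts and masks. This sketch assumes a 4-bit opcode in the top bits, leaving 12 bits for either three 4-bit register numbers or one 12-bit memory address; the function names are hypothetical.

```python
def decode_register_format(word):
    """Split a 16-bit word into a 4-bit opcode and three 4-bit register numbers."""
    opcode = (word >> 12) & 0xF
    r1 = (word >> 8) & 0xF
    r2 = (word >> 4) & 0xF
    r3 = word & 0xF
    return opcode, r1, r2, r3


def decode_memory_format(word):
    """Split a 16-bit word into a 4-bit opcode and a 12-bit memory address."""
    opcode = (word >> 12) & 0xF
    address = word & 0xFFF  # 12 bits address 4K locations
    return opcode, address


word = 0x3105
print(decode_register_format(word))
print(decode_memory_format(word))
```

The same 16 bits can be read either way, which is why the opcode must tell the decoder which format applies.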
Instruction Formats
• If we allow the length of the opcode to vary, we could create a very rich instruction set.
• Is there something missing from this instruction set?
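Instruction Formats
• Example: Given 8-bit instructions, is it possible to allow the following to be encoded?
  • 3 instructions with two 3-bit operands.
  • 2 instructions with one 4-bit operand.
  • 4 instructions with one 3-bit operand.
• We need:
  • 3 × 2^3 × 2^3 = 192 bit patterns for the instructions with two 3-bit operands.
  • 2 × 2^4 = 32 bit patterns for the instructions with one 4-bit operand.
  • 4 × 2^3 = 32 bit patterns for the instructions with one 3-bit operand.
• Total: 192 + 32 + 32 = 256 bit patterns, which is exactly the 2^8 = 256 patterns that 8 bits provide, so the encoding is possible.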
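The counting argument above can be written as a short check. This is a sketch of the counting test only: it confirms that enough bit patterns exist, and an actual opcode assignment would still have to be constructed. The function names and the `(count, operand_widths)` representation are assumptions.

```python
def patterns_needed(groups):
    """Sum the bit patterns consumed by each group of instructions.

    Each group is (instruction_count, operand_bit_widths); every instruction
    uses one pattern per possible combination of operand values.
    """
    return sum(count * 2 ** sum(widths) for count, widths in groups)


def fits(groups, instruction_bits):
    """True when the groups need no more patterns than the word provides."""
    return patterns_needed(groups) <= 2 ** instruction_bits


GROUPS = [
    (3, (3, 3)),  # 3 instructions with two 3-bit operands
    (2, (4,)),    # 2 instructions with one 4-bit operand
    (4, (3,)),    # 4 instructions with one 3-bit operand
]
print(patterns_needed(GROUPS), fits(GROUPS, 8))
```

Adding even one more instruction to any group would exceed the 256 available patterns.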