CS 3501 - Chapter 4 (Sec 5.1 &5.2). Dr. Clincy Professor of CS. Dr. Clincy. Lecture. Slide 1. Chapter 4 Objectives. Learn the components common to every modern computer system. Be able to explain how each component contributes to program execution.
CS 3501 - Chapter 4 (Sec 5.1 &5.2) Dr. Clincy Professor of CS Dr. Clincy Lecture Slide 1
Chapter 4 Objectives • Learn the components common to every modern computer system. • Be able to explain how each component contributes to program execution. • Understand a simple architecture invented to illuminate these basic concepts, and how it relates to some real architectures. • Know how the program assembly process works. Lecture
Introduction • In Chapter 2, we discussed how binary-coded data is stored and manipulated by various computer system components. • In Chapter 3, we described how fundamental components are designed and built from digital circuits. • Also from Chapter 3, we know that memory is used to store both data and program instructions in binary • Having this background, we can now understand how computer components are fundamentally built • The next question is, how do the various components fit together to create useful computer systems. Lecture
Basic Structure of Computers • Input unit: accepts coded info from human operators, electromechanical devices (e.g., a keyboard), and other computers via networks. • Memory: coded info is stored in memory for later use; the program is also stored in memory and determines the processing steps. • ALU: uses the coded info to perform the desired operations. • Control unit: coordinates all actions. • Output unit: sends the results back out externally. • The input and output units are collectively called the I/O unit; the ALU and control unit are collectively called the processor.
CPU Basics • The next question is, how is the program EXECUTED and how is the data PROCESSED properly? • The computer's CPU or processor • Fetches the program instructions, • Decodes each instruction that is fetched, and • Performs the indicated sequence of operations on the data (execute) • The two principal parts of the CPU are the Datapath and the Control unit. • Datapath: consists of an arithmetic-logic unit (ALU) and a network of storage units (registers) that are interconnected by a data bus that is also connected to main memory. • Control unit: responsible for sequencing the operations and making sure the correct data is in the correct place at the correct time.
CPU Basics • Registers hold data that can be readily accessed by the CPU, such as addresses, the program counter, data, and control info • Registers can be implemented using D flip-flops. • A 32-bit register requires 32 D flip-flops. • There are many different registers: • to store values, • to shift values, • to compare values, • registers that count, • registers that temporarily store values, • index registers to control program looping, • stack pointer registers to manage stacks of info for processes, • status or flag registers to hold the status or mode of operation, • and general-purpose registers
CPU Basics • The arithmetic-logic unit (ALU) carries out • logical operations (e.g., comparisons) and • arithmetic operations (e.g., adding or multiplying) • The ALU knows which operations to perform because it is controlled by signals from the control unit. • The control unit determines which actions to carry out according to the values in a program counter register and a status register. • The control unit tells the ALU which registers to use and turns on the correct circuitry in the ALU for execution of the operation. • The control unit uses a program counter register to find the next instruction for execution and uses a status register to keep track of overflows, carries, and borrows.
The Bus • The CPU shares data with other system components by way of a data bus. • A bus is a set of wires that simultaneously convey a single bit along each line. • One or more devices can share the bus. • The "sharing" often results in communication bottlenecks • The speed of the bus is affected by its length and the number of devices sharing it
The Bus • Two types of buses are commonly found in computer systems: point-to-point and multipoint buses. • A point-to-point bus connects two specific devices. • A multipoint bus connects a number of devices. Because the bus is shared, a bus protocol is used.
The Bus • Buses consist of data lines, control lines, and address lines. • Address lines determine the location of the source or destination of the data. • Data lines convey bits from one device to another; they carry the actual information that must be moved from one location to another. • Control lines determine the direction of data flow and when each device can access the bus. • When sharing the bus, concurrent bus requests must be arbitrated. • Four categories of bus arbitration are: • Daisy chain: Permissions are passed from the highest-priority device to the lowest. • Centralized parallel: Each device is directly connected to an arbitration circuit. • Distributed using self-detection: Devices decide among themselves which one gets the bus. • Distributed using collision-detection: Any device can try to use the bus. If its data collides with the data of another device, it tries again.
Types of Buses • Processor-memory bus: a short, high-speed bus used to transfer data to and from memory • I/O buses: longer buses that interface with many I/O devices other than the processor • Backplane bus (or system bus): connects the processor, I/O devices, and memory • Expansion bus: connects external devices • Local bus: a data bus that connects a peripheral device directly to the CPU • Buses from a timing perspective: • Synchronous buses work off clock ticks; all devices using this bus type are synchronized by the clock rate • Asynchronous buses use control lines to coordinate the operations, and a "handshaking protocol" is used for the timing. These types of buses can scale better and work with more devices
Clocks • Every computer contains at least one clock that: • Regulates how quickly instructions can be executed • Synchronizes the activities of its components. • A fixed number of clock cycles are required to carry out each data movement or computational operation. • As a result, instruction performance is measured in clock cycles. • The clock frequency, measured in megahertz or gigahertz, determines the speed with which all operations are carried out. • Clock cycle time is the reciprocal (or inverse) of its clock frequency. • An 800 MHz clock has a cycle time of 1.25 ns. • Clock speed should not be confused with CPU performance. • The CPU time required to run a program is given by the general performance equation: CPU time = seconds / program = (instructions / program) × (average cycles / instruction) × (seconds / cycle) • We see that we can improve CPU throughput when we reduce the number of instructions in a program, reduce the number of cycles per instruction, or reduce the number of nanoseconds per clock cycle.
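The relationship between clock frequency, cycle time, and CPU time can be checked with a few lines of Python. This is only a sketch: the function names, the instruction count, and the cycles-per-instruction figure are illustrative assumptions, not values from the lecture.

```python
def cycle_time_ns(freq_hz: float) -> float:
    """Clock cycle time in nanoseconds: the reciprocal of the clock frequency."""
    return 1e9 / freq_hz


def cpu_time_seconds(instructions: int, avg_cpi: float, freq_hz: float) -> float:
    """CPU time = instructions x average cycles per instruction x seconds per cycle."""
    return instructions * avg_cpi / freq_hz


# The 800 MHz clock from the slide.
print(cycle_time_ns(800e6))  # 1.25

# Hypothetical workload: one million instructions at an average of 2 cycles each.
print(cpu_time_seconds(1_000_000, 2.0, 800e6))
```

Halving any one of the three factors (instruction count, cycles per instruction, or cycle time) halves the computed CPU time, which is the point of the last bullet above.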
The Input/Output Subsystem • A computer communicates with the outside world through its input/output (I/O) subsystem. • Input device examples: keyboard, mouse, card readers, scanners, voice recognition systems, touch screens • Output device examples: monitors, printers, plotters, speakers, headphones • I/O devices connect to the CPU through various interfaces. • I/O can be memory-mapped-- where the I/O device behaves like main memory from the CPU’s point of view. • Or I/O can be instruction-based, where the CPU has a specialized I/O instruction set. Lecture
Memory Organization • We discussed a simple example of how memory is configured in Ch 3; we will now cover in more detail: • How memory is laid out • How memory is addressed • Envision memory as a matrix of bits, with each row implemented as a register or "storage cell" and each row being the size of an addressable word. • Each register or storage cell (typically called a memory location) has a unique address. • The memory addresses typically start at zero and progress upward
Memory Organization • Computer memory consists of a linear array of addressable storage cells that are similar to registers. • Memory can be byte-addressable or word-addressable, where a word typically consists of two or more bytes. • Byte-addressable case: although the word could be multiple bytes, each individual byte has its own address, with the lowest of those addresses serving as the "address" of the word • Memory is constructed of RAM chips, often referred to in terms of length × width. • If the memory word size of the machine is 16 bits, then a 4M × 16 RAM chip gives us 4M (2^22) 16-bit memory locations.
Memory Organization • For alignment reasons, in reading 16-bit words on a byte-addressable machine, the address should be a multiple of 2 (i.e., 2 bytes) • For alignment reasons, in reading 32-bit words on a byte-addressable machine, the address should be a multiple of 4 (i.e., 4 bytes) • For alignment reasons, in reading 64-bit words on a byte-addressable machine, the address should be a multiple of 8 (i.e., 8 bytes).
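The alignment rule amounts to a divisibility check on the byte address. A minimal Python sketch, where the function name `is_aligned` is an illustrative assumption:

```python
def is_aligned(address: int, word_bytes: int) -> bool:
    """True if a word of `word_bytes` bytes may start at this byte address."""
    return address % word_bytes == 0


# Word sizes of 2, 4, and 8 bytes correspond to 16-, 32-, and 64-bit words.
for word_bytes in (2, 4, 8):
    aligned = [a for a in range(17) if is_aligned(a, word_bytes)]
    print(word_bytes, aligned)
```

For 8-byte words the loop lists only addresses 0, 8, and 16, matching the multiple-of-8 rule for 64-bit words.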
Memory Organization • How does the computer access the memory location that corresponds to a particular address? • Memory is referred to using the notation Length × Width (L × W). • We observe that 4M can be expressed as 2^2 × 2^20 = 2^22 words, so a 4M × 8 memory is 4M locations long with each item 8 bits wide. • Provided this is byte-addressable, the memory locations will be numbered 0 through 2^22 − 1. • Thus, the memory bus of this system requires at least 22 address lines.
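The number of address lines is the base-2 logarithm of the number of addressable locations, rounded up. The sketch below computes it with Python's built-in `int.bit_length`; the helper name `address_lines` is an assumption made for illustration.

```python
def address_lines(locations: int) -> int:
    """Minimum number of address lines needed to select one of `locations` cells."""
    if locations < 1:
        raise ValueError("memory must have at least one location")
    # The highest address is locations - 1; its bit length is the line count.
    return (locations - 1).bit_length()


four_m = 4 * 2**20              # 4M = 2^2 x 2^20 = 2^22 locations
print(address_lines(four_m))    # 22
print(address_lines(4 * 2**10)) # 12, enough for a 4K-word memory
```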
Memory Organization • Physical memory usually consists of more than one RAM chip. • A single memory module causes all accesses to memory to be sequential - only one memory access can be performed at a time • By splitting or spreading memory across multiple memory modules (or banks), access can be performed in parallel – this is called Memory interleaving • With low-order interleaving, the low order bits of the address specify which memory bank contains the address of interest. • In high-order interleaving, the high order address bits specify the memory bank. Lecture
Memory Organization • Example: Suppose we have a memory consisting of 16 chips, each 2K × 8 bits. • Memory is 32K = 2^5 × 2^10 = 2^15 bytes. • 15 bits are needed for each address. • We need 4 bits to select the chip, and 11 bits for the offset into the chip that selects the byte.
Memory Organization • In high-order interleaving the high-order 4 bits select the chip. • In low-order interleaving the low-order 4 bits select the chip. Lecture
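The two interleaving schemes differ only in which end of the 15-bit address supplies the 4 chip-select bits. The Python sketch below splits an address both ways for the 16-chip, 2K × 8 example; the function and constant names are illustrative assumptions.

```python
CHIP_BITS = 4     # 16 chips -> 4 chip-select bits
OFFSET_BITS = 11  # 2K bytes per chip -> 11 offset bits


def high_order(address: int) -> tuple[int, int]:
    """High-order interleaving: the top 4 bits pick the chip."""
    chip = address >> OFFSET_BITS
    offset = address & ((1 << OFFSET_BITS) - 1)
    return chip, offset


def low_order(address: int) -> tuple[int, int]:
    """Low-order interleaving: the bottom 4 bits pick the chip."""
    chip = address & ((1 << CHIP_BITS) - 1)
    offset = address >> CHIP_BITS
    return chip, offset


# Four consecutive addresses: same chip under high-order, different chips under low-order.
print([high_order(a)[0] for a in range(4)])  # [0, 0, 0, 0]
print([low_order(a)[0] for a in range(4)])   # [0, 1, 2, 3]
```

With low-order interleaving, consecutive addresses land on different chips, which is what allows neighbouring accesses to proceed in parallel.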
MARIE • Up to this point, we have discussed: (1) the CPU in general, (2) the ALU, (3) the Control Unit, (4) the Bus, (5) the Clock, (6) the I/O subsystem, and (7) Memory and Addressing • We can now bring together many of the ideas that we have discussed to this point using a very simple computer architecture model called MARIE • MARIE stands for the Machine Architecture that is Really Intuitive and Easy. • MARIE was designed for the singular purpose of illustrating basic computer system concepts. • While this system is too simple to do anything useful in the real world, a deep understanding of its functions will enable you to comprehend system architectures that are much more complex. Lecture
MARIE Characteristics • MARIE in general consists of a CPU (which contains an ALU and several registers) and memory (for storing data and programs). • The MARIE architecture has the following characteristics: • Binary, two's complement data representation. • Stored program, fixed word length data and instructions. • 4K words of word-addressable main memory (not byte-addressable). • 16-bit data words. • 16-bit instructions, 4 bits for the opcode and 12 bits for the address. • A 16-bit arithmetic logic unit (ALU). • Seven registers for control and data movement: • Accumulator (AC) • Instruction Register (IR) • Memory Buffer Register (MBR) • Program Counter (PC) • Memory Address Register (MAR) • Input Register • Output Register
MARIE Registers • Registers are storage locations within the CPU. MARIE's seven registers are: • Accumulator, AC, a 16-bit register that holds a conditional operator (e.g., "less than") or one operand of a two-operand instruction. • Memory address register, MAR, a 12-bit register that holds the memory address of an instruction or the operand of an instruction. • Memory buffer register, MBR, a 16-bit register that holds the data after its retrieval from, or before its placement in, memory. • Program counter, PC, a 12-bit register that holds the address of the next program instruction to be executed. • Instruction register, IR, which holds an instruction immediately preceding its execution. • Input register, InREG, an 8-bit register that holds data read from an input device. • Output register, OutREG, an 8-bit register that holds data that is ready for the output device.
MARIE Architecture Depicted • Main memory: each memory location has a unique address. Addresses go from 0 to 4K − 1 (which is 4095). Each location can store a 16-bit word. • ALU: carries out the logic operations (comparisons) and arithmetic operations (adding, etc.). • Control unit (CU): monitors and controls the execution of all instructions and the transfer of all information. • The CU extracts the instruction from memory, decodes it, makes sure data is in the right place at the right time, tells the ALU which registers to use, services interrupts, and turns on the correct circuitry in the ALU for operation execution.
MARIE Architecture Depicted • Accumulator, AC, holds a conditional operator (e.g., "less than") or one operand of a two-operand instruction. • Memory address register, MAR, a 12-bit register that holds the memory address of an instruction or the operand of an instruction. • Memory buffer register, MBR, a 16-bit register that holds the data after its retrieval from, or before its placement in, memory. • Program counter, PC, a 12-bit register that holds the address of the next program instruction to be executed. • Instruction register, IR, which holds an instruction immediately preceding its execution. • Input register, InREG, an 8-bit register that holds data read from an input device. • Output register, OutREG, an 8-bit register that holds data that is ready for the output device. • Not shown is the status or flag register of the ALU, which holds info indicating various conditions such as overflows.
MARIE Bus • A Bus is needed to transfer data or instructions into or out of registers • The registers are interconnected, and connected with main memory through a common data bus. • Each device connected to the bus is identified by a unique id or number • Before any device can use the bus, the device’s unique number is set on the control lines to allow that device to carry out an operation. • Separate connections are also provided between the accumulator and the memory buffer register, and the ALU and the accumulator and memory buffer register. • This permits data transfer between these devices without use of the main data bus. Lecture
MARIE Data Path • MARIE also has separate communication or data paths outside of the bus to speed up execution • The benefit is that these paths allow events to occur in parallel • One event can be using the bus • And some other event can be using a data path at the same time • There is a communication path between the MAR and memory • The MAR provides the inputs to the address lines for memory • So the CPU knows where to read from or write to memory • There is a communication path from the MBR to the AC • There is a communication path from the MBR to the ALU • This allows data in the MBR to be used in arithmetic operations • Also, info can flow from the AC through the ALU and back into the AC without traveling on the bus • The datapath in general is the path that information follows; the figure depicts MARIE's datapath
MARIE Instruction Set Architecture • A computer’s instruction set architecture (ISA) specifies the format of its instructions and the primitive operations that the machine can perform. • The ISA is an interface between a computer’s hardware and its software. • Some ISAs include hundreds of different instructions for processing data and controlling program execution. • The MARIE ISA consists of only thirteen instructions. Lecture
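MARIE ISA • This is the format of a MARIE instruction: a 4-bit opcode (bits 15-12) followed by a 12-bit address (bits 11-0). • The opcode specifies the instruction to be executed, so there can be at most 2^4 = 16 instructions. • The 12-bit address allows for a maximum memory size of 2^12 = 4096 words, addressed 0 through 2^12 − 1. • The fundamental MARIE instructions are: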
LOAD Instruction • The LOAD instruction allows data to be moved from memory into the CPU via the MBR and the AC • All data must first be moved into the MBR and then into either the AC or the ALU • The LOAD instruction doesn't have to name the AC as a final destination; the AC register is implicit in the instruction • This is a bit pattern for a LOAD instruction as it would appear in the IR: 0001 0000 0000 0011 • We see that the opcode is 1 and the address from which to load the data is 3.
Intro to Other Instructions • STORE X: Allows data to be moved from the CPU back to memory • INPUT and OUTPUT: Data is typically represented as ASCII. In real life, it has to be converted if used as numeric. For MARIE, assume numeric only • HALT: Causes the current program execution to terminate • ADD X: Move the data value found at address X into the MBR, then add the MBR value to the value in the AC • SUB X: Move the data value found at address X into the MBR, then subtract the MBR value from the value in the AC • SKIPCOND: Allows conditional branching (e.g., while loops, if statements). When the instruction is executed, the value in the AC is inspected • JUMP X: Allows an unconditional branch. When the instruction is executed, it causes the contents of the Program Counter (PC) to be replaced with the value of X, which is the address of the next instruction to fetch
SKIPCOND Instruction • As mentioned earlier, when SKIPCOND is executed, the value in the AC is examined • In MARIE's case, bits 11 and 10 of the address field specify the condition to be tested: • If 00, translates to "bypass the next instruction if the AC is negative" • If 01, translates to "bypass the next instruction if the AC is equal to 0" • If 10, translates to "bypass the next instruction if the AC is greater than 0" • If 11, an error condition occurred • This is a bit pattern for a SKIPCOND instruction as it would appear in the IR: • We see that the opcode is 8 and bits 11 and 10 are 10, meaning that the next instruction will be skipped if the value in the AC is greater than zero.
Examine Some Instructions • Opcode is binary 1, for LOAD. The address where the value is located in memory is 3. Data found at address 3 is copied into the AC • Opcode is binary 3, for ADD. The address where the value is located in memory is 13. Data found at address 13 is placed in the MBR, then the MBR value is added to the value in the AC, and then the value in the AC is overwritten with the sum • Opcode is binary 8, for SKIPCOND. Bits 11 and 10 are 10, indicating "bypass the next instruction if the AC is greater than 0". If the AC's value is less than or equal to zero, the SKIPCOND has no effect and the next instruction is executed. Otherwise, the PC is incremented by 1, thus skipping the next instruction
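Packing and unpacking the 4-bit opcode and 12-bit address can be sketched with shifts and masks. The Python below is an illustration only; the function names `encode` and `decode` are assumptions, and the SKIPCOND word assumes the unused low-order address bits are zero.

```python
def encode(opcode: int, address: int) -> int:
    """Pack a 4-bit opcode and a 12-bit address into one 16-bit instruction word."""
    if not 0 <= opcode < 16:
        raise ValueError("opcode must fit in 4 bits")
    if not 0 <= address < 4096:
        raise ValueError("address must fit in 12 bits")
    return (opcode << 12) | address


def decode(word: int) -> tuple[int, int]:
    """Split a 16-bit instruction word into (opcode, address)."""
    return (word >> 12) & 0xF, word & 0xFFF


load_3 = encode(0x1, 0x003)            # LOAD, address 3
print(f"{load_3:016b}")                # 0001000000000011

skip_gt = encode(0x8, 0b10 << 10)      # SKIPCOND with bits 11-10 = 10
print(f"{skip_gt:016b}")               # 1000100000000000
print(decode(skip_gt)[1] >> 10)        # 2, i.e. binary 10
```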
MARIE Register Transfer Notation • Instead of using binary values to represent the instruction, instruction names or mnemonics are used (pronounced Nee-Monics) • Binary version: called machine instructions • Mnemonics version: called assembly language instructions • An assembler's job is to convert the assembly instructions into the machine instructions • Recall that architectures are composed of various components like the ALU, registers, memory decoders, and control units • A single machine instruction causes these components to execute tasks • Each machine instruction can consist of multiple component-level operations • Mini-instructions are being executed. These mini-instructions are called microoperations • The exact sequence of microoperations that are carried out by an instruction can be specified using register transfer language (RTL) or register transfer notation (RTN). • In the MARIE RTL, we use the notation M[X] to indicate the actual data value stored in memory location X, and ← to indicate the transfer of bytes to a register or memory location.
MARIE LOAD RTL • Loads the contents of memory location X into the AC. • The RTL for the LOAD instruction is:
MAR ← X
MBR ← M[MAR]
AC ← MBR
• MAR ← X: Address X is placed into the MAR. The IR uses the bus to copy the value of X into the MAR. • MBR ← M[MAR]: Data at location M[MAR] (or address X) is moved into the MBR. This operation and the operation above must be in sequence and can't occur at the same time. • AC ← MBR: Data is then placed in the AC. This operation can occur immediately after the above operation because the MBR and AC have a direct connection with one another.
MARIE STORE RTL • Stores the contents of the AC in memory location X. • The RTL for the STORE instruction is:
MAR ← X, MBR ← AC
M[MAR] ← MBR
• MAR ← X, MBR ← AC: Address X is placed into the MAR, and the value in the AC is placed in the MBR. • M[MAR] ← MBR: The contents of the MBR are stored at location M[MAR] (or address X).
MARIE ADD RTL • Data stored at memory location X is added to the AC. • The RTL for the ADD instruction is:
MAR ← X
MBR ← M[MAR]
AC ← AC + MBR
• MAR ← X: Address X is placed into the MAR. • MBR ← M[MAR]: Data at location M[MAR] (or address X) is moved into the MBR. • AC ← AC + MBR: Data in the MBR is added to the value in the AC and the result is stored back in the AC.
MARIE SUB RTL • Data stored at memory location X is subtracted from the AC. • The RTL for the SUB (subtract) instruction is:
MAR ← X
MBR ← M[MAR]
AC ← AC - MBR
• MAR ← X: Address X is placed into the MAR. • MBR ← M[MAR]: Data at location M[MAR] (or address X) is moved into the MBR. • AC ← AC - MBR: Data in the MBR is subtracted from the value in the AC and the result is stored back in the AC.
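The register transfers for LOAD, STORE, ADD, and SUB map almost one-to-one onto assignments. The Python sketch below models them under stated assumptions: the class name `MarieDatapath` and its method names are hypothetical, and 16-bit two's complement wrap-around is modelled by masking with 0xFFFF.

```python
class MarieDatapath:
    """Toy model of the AC, MAR, MBR, and a 4K-word memory (illustrative only)."""

    def __init__(self) -> None:
        self.mem = [0] * 4096
        self.ac = self.mar = self.mbr = 0

    def load(self, x: int) -> None:
        self.mar = x                               # MAR <- X
        self.mbr = self.mem[self.mar]              # MBR <- M[MAR]
        self.ac = self.mbr                         # AC  <- MBR

    def store(self, x: int) -> None:
        self.mar, self.mbr = x, self.ac            # MAR <- X, MBR <- AC
        self.mem[self.mar] = self.mbr              # M[MAR] <- MBR

    def add(self, x: int) -> None:
        self.mar = x                               # MAR <- X
        self.mbr = self.mem[self.mar]              # MBR <- M[MAR]
        self.ac = (self.ac + self.mbr) & 0xFFFF    # AC  <- AC + MBR (16-bit wrap)

    def sub(self, x: int) -> None:
        self.mar = x                               # MAR <- X
        self.mbr = self.mem[self.mar]              # MBR <- M[MAR]
        self.ac = (self.ac - self.mbr) & 0xFFFF    # AC  <- AC - MBR (16-bit wrap)


m = MarieDatapath()
m.mem[3], m.mem[4] = 5, 7
m.load(3)      # AC = 5
m.add(4)       # AC = 12
m.sub(3)       # AC = 7
m.store(10)    # M[10] = 7
print(m.ac, m.mem[10])  # 7 7
```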
MARIE INPUT, OUTPUT, HALT RTL • INPUT: AC ← InREG • Input from any input device is first placed into the InREG, then the data is transferred into the AC • OUTPUT: OutREG ← AC • The contents of the AC are placed into the OutREG, and eventually sent out to an output device • HALT: no operation • No operations are performed on registers; the machine simply ceases execution of the program
MARIE JUMP RTL • Causes an unconditional branch to the given address, X. • The RTL for the JUMP instruction is: PC ← X • Therefore, to execute this instruction, the address X must be loaded into the PC • Since the least significant 12 bits of the 16-bit instruction are the address, the instruction above is really: PC ← IR[11-0]
MARIE SKIPCOND RTL • Recall that SKIPCOND skips the next instruction according to the value of the AC. • Uses bits 11 and 10 to determine what comparison to perform on the AC • If the condition is true, the next instruction is skipped (PC incremented) • The RTL for this instruction is the most complex in our instruction set:
If IR[11-10] = 00 then (checking if AC is negative)
    If AC < 0 then PC ← PC + 1
else If IR[11-10] = 01 then (checking if AC is equal to zero)
    If AC = 0 then PC ← PC + 1
else If IR[11-10] = 10 then (checking if AC is positive)
    If AC > 0 then PC ← PC + 1
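The SKIPCOND decision can be written as a small function of the IR, AC, and PC. This Python sketch uses the condition codes 00, 01, and 10 listed on the SKIPCOND Instruction slide and treats 11 as an error; the function names and the choice to raise `ValueError` for 11 are assumptions made for illustration.

```python
def to_signed(word: int) -> int:
    """Interpret a 16-bit word as a two's complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def skipcond(ir: int, ac: int, pc: int) -> int:
    """Return the new PC after a SKIPCOND held in `ir` inspects the 16-bit `ac`."""
    condition = (ir >> 10) & 0b11        # IR[11-10]
    value = to_signed(ac)
    if condition == 0b11:
        raise ValueError("condition 11 is an error condition")
    skip = (
        (condition == 0b00 and value < 0)
        or (condition == 0b01 and value == 0)
        or (condition == 0b10 and value > 0)
    )
    return (pc + 1) & 0xFFF if skip else pc   # PC <- PC + 1 when the test holds


print(hex(skipcond(0x8800, 5, 0x101)))       # 0x102: AC > 0, so skip
print(hex(skipcond(0x8800, 0, 0x101)))       # 0x101: AC = 0, no skip
print(hex(skipcond(0x8000, 0xFFE9, 0x101)))  # 0x102: AC is negative, so skip
```

The `pc` argument is assumed to have already been incremented during the fetch, so the extra increment is what bypasses the following instruction.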
Instruction Processing • The fetch-decode-execute cycle is the series of steps that a computer carries out when it runs a program. • We first have to fetch an instruction from memory, and place it into the IR. • Once in the IR, it is decoded to determine what needs to be done next. • If a memory value (operand) is involved in the operation, it is retrieved and placed into the MBR. • With everything in place, the instruction is executed. Lecture
Instruction Processing – Flow Chart (1) When the program is first loaded, the address of the first instruction is placed into the PC. (2) Copy the contents of the PC to the MAR. (3) Go to main memory and fetch the instruction found at the address in the MAR, place it in the IR, then increment the PC by 1. (4) Decode the leftmost 4 bits of the IR to determine the opcode, and copy the rightmost 12 bits of the IR to the MAR. (5) If a memory value (operand) is involved in the operation, it is retrieved and placed into the MBR. (6) With everything in place, the instruction is executed.
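The fetch and decode portion of the flow chart can be sketched as one function. This is an illustrative Python model, not MARIE's actual control logic; the name `fetch_decode` and the returned tuple layout are assumptions.

```python
def fetch_decode(memory: list[int], pc: int) -> tuple[int, int, int, int]:
    """One fetch/decode pass; returns (new_pc, ir, opcode, mar)."""
    mar = pc                  # copy the contents of the PC to the MAR
    ir = memory[mar]          # fetch the instruction at that address into the IR
    pc = (pc + 1) & 0xFFF     # increment the PC (12 bits wide)
    opcode = (ir >> 12) & 0xF # decode the leftmost 4 bits of the IR
    mar = ir & 0xFFF          # copy the rightmost 12 bits of the IR to the MAR
    return pc, ir, opcode, mar


memory = [0] * 4096
memory[0x100] = 0x1104        # opcode 1 (LOAD) with address 104 hex
pc, ir, opcode, mar = fetch_decode(memory, 0x100)
print(hex(pc), hex(ir), opcode, hex(mar))  # 0x101 0x1104 1 0x104
```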
Instruction Processing - Interrupts • All computers provide a way of interrupting the fetch-decode-execute cycle. • Interrupts occur when: • A user break (e.g., Control+C) is issued • I/O is requested by the user or a program • A critical error occurs • Interrupts can be caused by hardware or software. • Software interrupts are also called traps. • Interrupt processing involves adding another step to the fetch-decode-execute cycle as shown below
Interrupt Instruction Processing Interrupt service routines Lecture
Interrupt - Instruction Processing • For general-purpose systems, it is common to disable all interrupts during the time in which an interrupt is being processed. • Typically, this is achieved by setting a bit in the flags register. • Interrupts that are ignored in this case are called maskable. • Nonmaskable interrupts are those interrupts that must be processed in order to keep the system in a stable condition. Lecture
I/O - Instruction Processing • Interrupts are very useful in processing I/O. • However, interrupt-driven I/O is complicated, and is beyond the scope of our present discussion. • MARIE, being the simplest of simple systems, uses a modified form of programmed I/O. • All output is placed in an output register, OutREG, and the CPU polls the input register, InREG, until input is sensed, at which time the value is copied into the accumulator. Lecture
A Simple Program • Let's consider a program that adds two numbers together, storing the sum in memory • Both the assembly language program and data are stored in memory • Consider the simple MARIE program given below. We show a set of mnemonic instructions and data stored at addresses 100 - 106 (hex): [Figure: listing of addresses 100 - 106 marking the program, data, and result, shown in both assembly language and machine language]
A Simple Program - Continuing • Let's look at what happens inside the computer when our program runs. • This is the LOAD 104 instruction: • The PC is loaded with the address of the first instruction • The PC's contents are copied into the MAR • The instruction at the address held in the MAR is copied into the IR (the first instruction has been fetched) • The PC is incremented • The address portion of the instruction is loaded into the MAR • The opcode portion is decoded • The data at the address held in the MAR is copied into the MBR (operand 35 has been fetched) • The MBR's contents are copied into the AC
A Simple Program - Continuing • The second instruction, ADD 105: • The PC's contents are copied into the MAR (101 this time, rather than 100) • The instruction at the address held in the MAR is copied into the IR (the second instruction has been fetched) • The PC is incremented • The address portion of the instruction is loaded into the MAR • The opcode portion is decoded • The data at the address held in the MAR is copied into the MBR (operand -23 has been fetched) • The MBR's contents are added to the contents of the AC, and the result, 12, is stored in the AC
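The whole walk-through can be reproduced with a tiny interpreter. The sketch below is an illustration under stated assumptions: the text confirms LOAD 104, ADD 105, the operands 35 and -23, and the sum 12, while the STORE 106 and HALT instructions and the opcodes 2 (STORE) and 7 (HALT) are taken from the standard MARIE opcode table rather than from the text. The function name `run` is hypothetical.

```python
LOAD, STORE, ADD, HALT = 0x1, 0x2, 0x3, 0x7   # STORE and HALT opcodes are assumed


def run(memory: list[int], pc: int) -> int:
    """Fetch, decode, and execute until HALT; returns the final AC."""
    ac = 0
    while True:
        ir = memory[pc]                        # fetch
        pc = (pc + 1) & 0xFFF
        opcode, x = ir >> 12, ir & 0xFFF       # decode
        if opcode == LOAD:                     # execute
            ac = memory[x]
        elif opcode == ADD:
            ac = (ac + memory[x]) & 0xFFFF     # 16-bit two's complement wrap
        elif opcode == STORE:
            memory[x] = ac
        elif opcode == HALT:
            return ac
        else:
            raise ValueError(f"opcode {opcode:#x} not modelled in this sketch")


memory = [0] * 4096
memory[0x100:0x107] = [
    0x1104,  # 100: LOAD 104
    0x3105,  # 101: ADD 105
    0x2106,  # 102: STORE 106 (assumed)
    0x7000,  # 103: HALT (assumed)
    0x0023,  # 104: 35
    0xFFE9,  # 105: -23 in 16-bit two's complement
    0x0000,  # 106: result goes here
]
run(memory, 0x100)
print(memory[0x106])  # 12
```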