340 likes | 501 Views
ECE 353 Lab C Pipeline Simulator. Aims. Further experience in C programming Handling strings Further experience in the use of assertions Reinforce concepts from pipelining Techniques for simulating parallel operations with a sequential program. Statement of Work.
E N D
Aims • Further experience in C programming • Handling strings • Further experience in the use of assertions • Reinforce concepts from pipelining • Techniques for simulating parallel operations with a sequential program
Statement of Work • Write a C program to simulate the register contents of a MIPS machine running an abbreviated instruction set • Should be cycle-accurate with respect to register contents. • Simulates a machine consisting of add, sub, addi, mul, lw, sw, beqinstructions only. • Will need to provide some error detection wrt the assembly program input.
Simulation Approaches • State change in simulation • Event-driven Simulation: Follows events as they occur. • Time-driven Simulation: Driven by a clock; does something every clock transition. • Computational Platform • Sequential: Simpler to program and debug; slower. • Parallel: More complicated program (need to take interactions into account); much higher throughput (potentially). Lab C involves a sequential, time-driven simulation.
Inputs to the Simulator • MIPS assembly language program which will be in a file read by the simulator. • Parameters of the simulated machine: • Memory access time: c cycles. • Multiply time: m cycles. • All other EX operations: ncycles. • ID and WB stages take one cycle to complete their tasks.
Memory • Separate instruction and data memories (separate address spaces). • Instruction Memory: • Addressed by the PC. • Size: 2K bytes (512 words). • Data Memory: • Accessible to the program. • Size: 2K bytes (512 words). Implement both as linear arrays.
Processor • Five stages: IF, ID, EX, MEM, WB • Each stage follows the rules you learned in ECE 232, except that IF and EX can take multiple cycles to complete an operation.
IF • Fetches from the Instruction Memory. • Will have to freeze if there is an unresolved branch. • Branches are detected in the ID stage. • When a branch is detected, no further instructions are decoded until the branch is resolved. • Branches are resolved in the EX stage where a subtraction operation is carried out and the result used to determine if the branch is taken or not.
ID • Decodes instructions (takes 1 cycle to do this). • Checks for data (RAW) and control hazards. • Holds up further processing until these hazards have been resolved (no forwarding in this machine). • Provides the EX stage with the operands it needs.
EX • Executes the specified operation on the operands it receives from ID. • Can take multiple cycles depending on the operation. • EX is not itself pipelined; only one instruction can be in EX at any time. • Use a counter to keep track of ongoing operations.
MEM • Is only used by the lw and sw instructions. • Receives the data memory address from EX and carries out the data memory read/write operation. • Can take multiple cycles to do this (c cycles).
WB • Writes back into the register file. • Will have nothing to do if the instruction does not do a register write. • Register reads/writes follow the approach in Hennessy & Patterson.
Simulator Modules • progScanner(): Reads from the file provided by the user, containing the assembly language program. Note that this will have to flexible in dealing with empty spaces. • parser(): • From the instruction passed to it by progscanner(), break it up into its constituent fields (opcode, source registers, immediate field, destination register). • Detect a certain subset of errors in the program: • Illegal opcode • Unrecognized register name • Register number out of bounds • Immediate field that is too large
Simulator Modules (contd.) • Pipeline stage modules: • IF • ID • EX • MEM • WB • Pipeline stages will communicate via latches separating them. A latch can only hold information relating to a single instruction. • Deal with structural hazards. • Use valid/empty bits to specify if (a) the contents of a latch are valid for use by the stage to its right and (b) the latch is available to be written into by the stage to its left.
Simulator Execution • We need to simulate parallel operations with a sequential program. How? • Maintain a counter to represent the clock value. • Each time the counter increments (representing an advance in time by one cycle), call the stages in reverse order: WB, MEM, EX, ID, IF. • Why should this work?
Assertions • Use assertions to help with correctness. • Assertions should have been thought through and contribute meaningfully to the reliability of the program. • Preconditions, postconditions, invariants can all be checked. For a good introduction, see http://ptolemy.eecs.berkeley.edu/~johnr/tutorials/assertions.html
Some Design Issues • Coordinating producer and consumer operation with respect to the latch between them. • Multicycle instruction handling • Reading assembly instruction lines and parsing them • Information that needs to be carried along with the instruction down the pipeline • Dealing with hazards: • Structural: Can only move from a stage if the next stage is free. • Data: RAW. • Control: Pending branch freezes all subsequent ops until it is resolved.
Producer/Consumer Latch Management • Rules for correct functioning: • Producer stage should not load into a latch before the consumer has consumed the previous item it placed there. • Consumer stage should not read from a latch before the producer has placed a new item there. • Solution: Use a validItem bit, stored in the latch itself. • Set by the producer when it places a new item in the latch • Cleared by the consumer • Must be done in the same cycle in which the consumer places its result in its output latch or writes back to the register file. • This approach is only safe since there is no race condition between producer and consumer stages in the simulation. Do not try this if doing a hardware implementation.
Multicycle Operations • IF takes c cycles to access instruction memory • EX takes m cycles for multiply and n for other operations • MEM takes c cycles to access data memory for lw and sw instructions • Implementation: • Use a counter initialized appropriately when an operation starts in a stage. • The counter will be decremented each cycle.
Reading Assembly Code • Need to find a suitable function from the standard C library for this. • fscanf() is not the most convenient option: why? • There are other options, e.g., char *fgets(char *string, int n, FILE *stream);
Parsing Assembly Code • Break the code down into the opcode and other fields. • Opcode: Is delimited to its right by at least one empty space. • Other fields: Delimited by a comma. Empty spaces can be ignored. • Programming Approaches: • Walk down the string returned by fgets() looking for delimiters. • Use a library function strtok(). • Caution: strtok() modifies its first argument… • Some useful functions: strcmp(); atoi(); isdigit(). • Make sure you know what these do before using them! (Most groups used feof() incorrectly last time…)
Some error detection • Illegal opcodes: Match the opcode field against all legal opcodes (the MIPS subset plus haltSimulation). • Register not recognized or number out of bounds: • Registers can be specified either symbolically (e.g., $s0, $t5) or numerically (e.g., $8): check first character beyond $ to see which it is. • If numerically specified, registers must be numbered 0-31. • If symbolically specified, the character string must match a legal register name.
Error detection (continued) • Immediate field must be representable in 16 bits (two’s complement). • Check if the numerical quantity is within bounds. • Memory accesses must be aligned to integer boundaries • Data addresses must be a multiple of 4. • Memory alignment problems are detected by MEM; other errors are detected by ID. • When an error is detected, the simulator should print an error message specifying the error type and then stop.
Carrying information down the pipeline • Since control signals are not required to be simulated, you can pass the entire instruction down the pipeline. • To handle beq, carry the PC as well. • exResult will go from EX to MEM and then to WB.
Hazards: Structural • Each stage can be occupied by no more than one instruction at any time. • A pipeline latch will be cleared by the consuming stage only in the cycle in which it fills the latch to its right, if any. • A completed operation with an uncleared latch to its right has to wait until that latch has been cleared (by setting validItem=0) by the next stage. This can cause backpressure in the pipeline.
Hazards: Control • beq is the only instruction in this set that causes a control hazard. • When beq is decoded in ID: • A branchPending flag is set by ID. • branchPending=1 causes the instruction following the one decoded to be removed. • IF idles while a branch is pending; it can only resume in the cycle afterbranchPending has been reset (be careful about how you implement this since you are calling IF() after EX() in your simulation!). • The branch is resolved in EX. • If branch is taken, PC is set to point to the target instruction, else to the next instruction immediately following the beq. • EX resets branchPending.
Hazards: Data • Three types of data hazard: RAW, WAR, WAW. • Due to the linear nature of the simple MIPS pipeline, MIPS is only subject to RAW hazards. • Detected in the ID stage. • ID checks to see if any latch in front of it has an instruction producing data that will be consumed by the instruction it is decoding. • If so, hold that instruction in ID until all such hazards resolve.
Statistics • Stage Utilization: Fraction of cycles for which the stage is doing useful work. • Useful work does NOT include waiting for some hazard to clear. • Keep a counter in each function which counts the number of cycles the stage has been busy. • Global. • Static. • Remember to initialize all counters!
Implementation: IF Stage • Takes the PC value and fetches the appropriate instruction from the Instruction Memory (takes c cycles for this). • Sends the instruction along with its PC value to the next stage. • Freezes whenever instructed to (IF_ID latch full or branchPending set). • Increments PC and starts fetching next instruction unless it is made to freeze. • Increments usefulCyclesIF counter whenever it is doing useful work.
Implementation: ID Stage • When IF_ID latch has a valid element, decodes the instruction. • Detects all data hazards. • If the instruction is beq, sets branchPending flag. • Waits for any structural hazards to clear. • Fetches the operands for use by the EX stage and places them in the ID_EX latch.
Implementation: EX Stage • Does not need to worry about data or control hazards; these have been resolved by the time this instruction has got to EX. • Calculates the result of the requested operation: this takes m cycles for MULT and n cycles for all other operations. • If the EX_MEM latch is free, places the result in this latch and clears the ID_EX latch for use by ID; else waits for the EX_MEM latch to clear.
Implementation: MEM Stage • All non-memory instructions simply pass through this stage. • lw and sw instructions involve memory access which takes c cycles. • sw instructions complete at MEM; the result for lw is passed to the WB stage along with the instruction. • Does MEM have to worry about structural hazards?
Implementation: WB Stage • Has no latch in front of it; hence no structural hazards can impede it. • Takes just one cycle. • Does nothing for beqand sw instructions.
Submission • Summary sheet on Moodle • Authorship/testing information. • Call graph of code • Testing procedures used • Assertions used • Results (Seven tables: one for each of the programs provided on the course website). • Upload program code on quark. • Only a single file should be uploaded: no zipfiles. • ONLY ONE COPY PER GROUP SHOULD BE SUBMITTED.