Caching II Andreas Klappenecker CPSC321 Computer Architecture
Verilog Q & A • How is the xor instruction encoded? • R-format instruction, function field 0x26 • See [PH] page A-59 • What is the purpose of Idealmem.v? • It models the memory • dmeminit.v initializes data memory • imeminit.v initializes instruction memory
Verilog Q & A • How do I specify delays?
`define DEL 10
begin
  a <= #(`DEL) b;
  c <= #(`DEL) d;
end
• A delay can be placed before a statement (#d a = b;) or inside an assignment, after the assignment operator (a = #d b;)
Delays (blocking assignments) • Simulation starts @time 0: i=3, j=4 • Simulation continues until the first delay #1 and waits until time 1 • @time 1, j is sampled • @time 2, 4 is assigned to i; continue with the next statement • @time 3, i is sampled • @time 4, 4 is assigned to j
module iab;
  integer i, j;
  initial begin
    i = 3;
    j = 4;
    begin
      #1 i = #1 j;
      #1 j = #1 i;
    end
  end
endmodule
Delays (non-blocking assignments) • @time 0: i=3, j=4 • Both non-blocking assignments finish at time 0 [intra-assignment delays do not delay the execution of the statement] • Sample j and schedule the assignment to i at time 1 • Sample i and schedule the assignment to j at time 1 • @time 1: i=4, j=3
module ianb;
  integer i, j;
  initial begin
    i = 3;
    j = 4;
    begin
      i <= #1 j;
      j <= #1 i;
    end
  end
endmodule
Delays • Hint: Using unit delays simplifies debugging • It allows you to find out which signal depends on which • Do not hard-code the delay as #1; define a macro instead:
`define foo_del 1 // Change later
a <= #(`foo_del) b;
Clock
module m555 (CLK);
  parameter STime = 0, Ton = 50, Toff = 50, Tcc = Ton + Toff;
  output CLK;
  reg CLK;
  initial begin
    #STime CLK = 0;
  end
  always begin
    #Toff CLK = ~CLK;
    #Ton CLK = ~CLK;
  end
endmodule
Project • For jal and jr, the datapath of the book is not enough • You need more control signals for ALUop, so there is no point in sticking to the way it is done in the book
Report • Include a table explaining your control signals, e.g.,
Memory • Users want large and fast memories • SRAM is too expensive for main memory • DRAM is too slow for many purposes • Compromise: Build a memory hierarchy
Locality • Temporal locality • A referenced item is likely to be referenced again soon • Spatial locality • Data near a referenced item is likely to be referenced soon
Direct Mapped Cache • Mapping: address modulo the number of blocks in the cache, x -> x mod B
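The mapping x -> x mod B can be sketched in a few lines of Python. The function name `cache_index` and the 8-block cache are assumptions chosen for illustration; they do not come from the slides.

```python
def cache_index(block_address: int, num_blocks: int) -> int:
    """Return the cache slot for a block address: x -> x mod B."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    return block_address % num_blocks


# Hypothetical cache with B = 8 blocks.
# Block addresses 1, 9 and 17 all compete for the same slot.
NUM_BLOCKS = 8
slots = [cache_index(x, NUM_BLOCKS) for x in (1, 9, 17, 12)]
print(slots)
```

When B is a power of two, x mod B is simply the low-order log2(B) bits of x, which is why hardware can compute the index without a divider.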
Direct Mapped Cache • Cache with 1024 = 2^10 words • The tag from the cache is compared against the upper portion of the address • If the tag equals the upper 20 bits of the address and the valid bit is set, then we have a cache hit; otherwise it is a cache miss • What kind of locality are we taking advantage of?
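The hit test described above can be sketched as a small Python model. It is a minimal illustration, assuming 32-bit byte addresses and 4-byte words (which gives the 20-bit tag, 10-bit index and 2-bit byte offset); the names `split_address` and `DirectMappedCache` are hypothetical.

```python
INDEX_BITS = 10        # 1024 = 2**10 one-word blocks (from the slide)
BYTE_OFFSET_BITS = 2   # assumption: 4-byte words, byte-addressed memory
# Assumption: 32-bit addresses, so the tag is the upper 32 - 10 - 2 = 20 bits.


def split_address(addr):
    """Split a byte address into (tag, index)."""
    index = (addr >> BYTE_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BYTE_OFFSET_BITS + INDEX_BITS)
    return tag, index


class DirectMappedCache:
    """Tracks only valid bits and tags; the data words are omitted."""

    def __init__(self):
        num_blocks = 1 << INDEX_BITS
        self.valid = [False] * num_blocks
        self.tags = [0] * num_blocks

    def access(self, addr):
        """Return True on a hit; on a miss, install the block and return False."""
        tag, index = split_address(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


cache = DirectMappedCache()
# 0x1000 and 0x2000 share index 0 but have different tags, so they evict each other.
results = [cache.access(a) for a in (0x1000, 0x1000, 0x2000, 0x1000)]
print(results)
```

With one-word blocks, only a repeated access to the same word can hit, so this organization exploits temporal locality.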
Direct Mapped Cache • Taking advantage of spatial locality:
Cache Hits and Misses • Read hits • this is what we want! • Read misses • stall the CPU, fetch block from memory, deliver to cache, restart • Write hits • replace the data in both cache and memory (write-through) • write the data only into the cache and write it back to memory later (write-back) • Write misses • read the entire block into the cache, then write the word
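The difference between the two write-hit policies shows up in the number of memory writes. The following is a rough sketch, assuming a toy cache with a single one-word line; the class `OneLineCache` and the helper `count_memory_writes` are hypothetical names, not part of the course material.

```python
class OneLineCache:
    """A single one-word cache line, enough to contrast the two write policies."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.addr = None          # address currently cached (None = empty)
        self.data = 0
        self.dirty = False        # only used by the write-back policy
        self.memory = {}          # backing store: address -> word
        self.memory_writes = 0

    def _write_memory(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1

    def write(self, addr, value):
        if self.addr != addr:                       # write miss
            if self.write_back and self.dirty:      # evict the old dirty word first
                self._write_memory(self.addr, self.data)
            self.addr = addr                        # allocate the line for addr
            self.dirty = False
        self.data = value                           # write the word into the cache
        if self.write_back:
            self.dirty = True                       # defer the memory update
        else:
            self._write_memory(addr, value)         # write-through: update memory now

    def flush(self):
        """Write a dirty line back to memory (no effect for write-through)."""
        if self.write_back and self.dirty:
            self._write_memory(self.addr, self.data)
            self.dirty = False


def count_memory_writes(write_back):
    """Write 5 values to one address; return (memory writes, final memory word)."""
    cache = OneLineCache(write_back)
    for value in range(5):
        cache.write(0x40, value)
    cache.flush()
    return cache.memory_writes, cache.memory[0x40]


print("write-through:", count_memory_writes(False))
print("write-back:   ", count_memory_writes(True))
```

Both policies leave the same final value in memory; write-back simply defers the update until the line is evicted or flushed, at the cost of tracking a dirty bit.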
What Block Size? • A larger block size reduces cache misses by exploiting spatial locality • But the cache miss penalty increases, since more data is fetched per miss • We need to balance these two constraints • Next time: • How can we measure cache performance? • How can we improve cache performance?
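The trade-off can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the 64-word direct-mapped cache, the two access traces, and the cost model in `miss_penalty` (a fixed latency plus a per-word transfer time) are all invented for illustration.

```python
def count_misses(word_addresses, cache_words, block_words):
    """Count misses in a direct-mapped cache of cache_words words."""
    num_blocks = cache_words // block_words
    stored = [None] * num_blocks           # block address held in each slot
    misses = 0
    for word_addr in word_addresses:
        block_addr = word_addr // block_words
        index = block_addr % num_blocks
        if stored[index] != block_addr:    # empty slot or different block: miss
            stored[index] = block_addr
            misses += 1
    return misses


def miss_penalty(block_words, latency=10, cycles_per_word=2):
    """Assumed cost model: fixed latency plus transfer time per word."""
    return latency + cycles_per_word * block_words


CACHE_WORDS = 64
sequential = list(range(64))               # good spatial locality
strided = list(range(0, 64 * 16, 16))      # every access lands in a new block

for name, trace in (("sequential", sequential), ("strided", strided)):
    for block_words in (1, 4, 16):
        misses = count_misses(trace, CACHE_WORDS, block_words)
        stall = misses * miss_penalty(block_words)
        print(name, "block =", block_words, "misses =", misses, "stall cycles =", stall)
```

On the sequential trace, larger blocks cut the miss count enough to outweigh the higher penalty; on the strided trace the miss count stays the same, so larger blocks only add stall cycles.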