This review covers topics such as two's complement arithmetic, ripple carry ALU logic, cache design issues, and the IEEE 754 floating point standard. It also includes practice problems and examples.
Exam 2 Review
• Two's Complement Arithmetic
• Ripple carry ALU logic and performance
• Look-ahead techniques, performance and equations
• Basic multiplication and division (non-restoring) algorithms
• IEEE 754 floating point standard (definition provided)
• Write a sequence of register transfers to implement a given instruction for MIPS
• Given a set of Register Transfers, design the control needed for some component
Cache Design Issues
1. Where can a word or block of words be placed in the cache?
2. How can a word be found if it is in the cache?
3. Which word or block of words should be replaced on a cache miss?
4. When do we write the main memory?
Remember: READS dominate WRITES
Direct-mapped example: Main memory of 1K words (10-bit address, 10-bit words); Cache of 16 entries (4-bit index taken from the address)
• Where can a word be placed in the cache?
• Use the last 4 bits of the address.
Address bits: 9 8 7 6 5 4 3 2 1 0; Index = bits 3-0 (4 bits), selecting one of the 16 cache entries
How can a word be found if it is in the cache?
2^6 = 64 possibilities: 64 main memory words share each index.
Need to know the rest of the address! Save it in the Cache.
Address: Tag = bits 9-4 (6 bits) | Index = bits 3-0 (4 bits)
Cache entry: Valid (1 bit) | Tag (6 bits) | Data (10 bits)
Also need a Valid bit to indicate that the cache entry holds valid data.
Which word should be replaced on a cache miss?
The one with the same Index.
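The tag/index split for the 10-bit example can be sketched in a few lines. This is only an illustration: the function name `split_address` and the sample address are made up here, not taken from the slides.

```python
INDEX_BITS = 4    # 16 cache entries
ADDR_BITS = 10    # 1K words of main memory

def split_address(addr):
    """Return (tag, index) for a 10-bit word address."""
    if not 0 <= addr < (1 << ADDR_BITS):
        raise ValueError("address must fit in 10 bits")
    index = addr & ((1 << INDEX_BITS) - 1)   # last 4 bits pick the cache entry
    tag = addr >> INDEX_BITS                 # remaining 6 bits are stored as the tag
    return tag, index

tag, index = split_address(0b1011010110)
print(tag, index)   # 45 6
```

Any two addresses that agree in their last 4 bits map to the same entry, which is why the 6-bit tag has to be stored next to the data.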
Hit if: the location has been accessed and no other location with the same index has been accessed since then.
Temporal locality: the most recently accessed word stays in the cache.
Spatial locality: a word will not be replaced until an access occurs beyond the size of the cache.
The larger the cache, the lower the miss rate and the lower the average access time (approaches Hit time).
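The hit condition above can be checked with a small simulation of the 16-entry example. This is a minimal sketch, assuming one word per entry and word addresses; the class name `DirectMappedCache` and the access trace are illustrative only.

```python
class DirectMappedCache:
    """Tracks only valid bits and tags; data values are not modeled."""

    def __init__(self, index_bits=4):
        self.index_bits = index_bits
        size = 1 << index_bits
        self.valid = [False] * size
        self.tags = [0] * size

    def access(self, addr):
        """Return True on a hit; on a miss, replace the entry at that index."""
        index = addr & ((1 << self.index_bits) - 1)
        tag = addr >> self.index_bits
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit

cache = DirectMappedCache()
trace = [5, 5, 21, 5]          # 5 and 21 share index 5 but differ in tag
results = [cache.access(a) for a in trace]
print(results)                 # [False, True, False, False]
```

The last access to address 5 misses because address 21, which has the same index, was accessed in between.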
When do we write the main memory?
As soon as the cache is written. This is called Write-Through.
Direct Mapped General Structure
Address (n bits): Tag (n-k-2 bits) | Index (k bits) | Byte Offset (2 bits)
Cache: 2^k words, each entry = Valid (1 bit) | Tag (n-k-2 bits) | Computer Word (n bits)
Ex: 32-bit address and 2^14 words of data cache, k = 14
Cache width is 32 + (32-14-2) + 1 = 49 bits
49 / 32 = 1.53, so the cache stores about 53% more bits than the data alone
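The width arithmetic can be reproduced as follows. This is a sketch under the slide's assumptions (one word per entry, word size equal to address size, 2-bit byte offset); the function name `cache_entry_bits` is hypothetical.

```python
def cache_entry_bits(addr_bits, index_bits, word_bits=32, byte_offset_bits=2):
    """Bits per cache entry: data word + tag + valid bit."""
    tag_bits = addr_bits - index_bits - byte_offset_bits
    return word_bits + tag_bits + 1

width = cache_entry_bits(addr_bits=32, index_bits=14)
total_bits = (1 << 14) * width      # 2^14 entries
overhead = width / 32               # ratio to data-only storage

print(width)        # 49
print(total_bits)   # 802816
print(overhead)     # 1.53125
```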
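Address (32 bits): Tag (16 bits) | Index (14 bits) | Byte Offset (2 bits)
Cache: 16K entries, each Valid (1 bit) | Tag (16 bits) | Data (32 bits)
The stored Tag is compared (=) with the address Tag and combined with Valid to produce Hit; the Data field is read out.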
Cache Control for MIPS Lite
Address: Tag = bits 31-16 | Index = bits 15-2 | Byte Offset = bits 1-0
Cache entry: Valid = bit 48 | Tag = bits 47-32 | Data = bits 31-0
Notation: Cache Memory (Field) [Address]
Data field addressed by PC Index: CM(31-0)[PC(15-2)]
Tag field addressed by PC Index: CM(47-32)[PC(15-2)]
Cache Control for MIPS Lite: Instruction Fetch
Assume Main Memory access is 5 clock cycles. S is the state register; Hit' means NOT Hit.
Was:
S0: M[PC] → IR, PC+4 → PC, S1 → S
Now:
S0: CM(31-0)[PC(15-2)] → IR, PC+4 → PC, Hit·S1 + Hit'·S10 → S
S10: PC-4 → PC, S11 → S
S11: MM[PC] → MMOut, S12 → S
S12: S13 → S
S13: S14 → S
S14: S15 → S
S15: S16 → S
S16: MMOut → CM(31-0)[PC(15-2)], PC(31-16) → CM(47-32)[PC(15-2)], 1 → CM(48)[PC(15-2)], S0 → S
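To see what a miss costs, the state sequence can be listed for both cases. This sketch assumes each state takes one clock cycle and that the second pass through S0 after the refill always hits; the function name `fetch_states` is made up for illustration.

```python
def fetch_states(hit_first_try):
    """List the control states visited for one instruction fetch."""
    states = ["S0"]
    if not hit_first_try:
        # S10 undoes PC+4, S11-S15 cover the 5-cycle main memory access,
        # S16 fills data, tag and valid bit, then S0 is repeated.
        states += ["S10", "S11", "S12", "S13", "S14", "S15", "S16", "S0"]
    return states

print(len(fetch_states(True)))    # 1
print(len(fetch_states(False)))   # 9
```

Under these assumptions a fetch miss adds 8 states to the single-state hit case before control moves on to S1.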
Cache Control for MIPS Lite: Hit signal
Hit = Valid[Index] · (Cache Tag[Index] = Addr Tag)
For CM addressed by PC(15-2): Hit = CM(48)[PC(15-2)] · (CM(47-32)[PC(15-2)] = PC(31-16))
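The Hit equation maps directly onto bit operations on a 32-bit address. A minimal sketch, assuming the valid bits and tags are held in plain Python lists; the names `fields` and `is_hit` and the sample address are illustrative.

```python
def fields(addr):
    """Split a 32-bit address into (tag, index, byte_offset)."""
    byte_offset = addr & 0x3          # bits 1-0
    index = (addr >> 2) & 0x3FFF      # bits 15-2 (14 bits)
    tag = (addr >> 16) & 0xFFFF       # bits 31-16 (16 bits)
    return tag, index, byte_offset

def is_hit(valid, tags, addr):
    """Hit = Valid[Index] AND (CacheTag[Index] == AddrTag)."""
    tag, index, _ = fields(addr)
    return valid[index] and tags[index] == tag

valid = [False] * (1 << 14)
tags = [0] * (1 << 14)

addr = 0x00401234
print(is_hit(valid, tags, addr))      # False: entry not yet valid

tag, index, _ = fields(addr)
valid[index], tags[index] = True, tag
print(is_hit(valid, tags, addr))      # True
```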
WRITE: Write Cache and Write Main
Address (32 bits): Tag (16 bits) | Index (14 bits) | Byte Offset (2 bits)
Cache: 16K entries, each Valid (1 bit) | Tag (16 bits) | Data (32 bits)
Cache Control for MIPS Lite: Memory Write
Assume Main Memory access is 5 clock cycles. S is the state register.
Was:
S5: B → M[ALUOut], S0 → S
Now:
S5: B → CM(31-0)[ALUOut(15-2)], ALUOut(31-16) → CM(47-32)[ALUOut(15-2)], 1 → CM(48)[ALUOut(15-2)], B → MM[ALUOut], S17 → S
S17: S18 → S
S18: S19 → S
S19: S20 → S
S20: S0 → S
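The effect of the S5 transfers can be sketched as a single function that updates both the cache entry and main memory. This is only an illustration: dictionaries stand in for the cache fields and main memory, and the name `write_through` is hypothetical. As in the transfers, the store overwrites the indexed entry without first checking Hit.

```python
def write_through(cache_data, cache_tags, cache_valid, main_memory, addr, value):
    """Store a word: update the indexed cache entry and main memory together."""
    index = (addr >> 2) & 0x3FFF            # ALUOut(15-2)
    cache_data[index] = value               # B -> CM(31-0)
    cache_tags[index] = (addr >> 16) & 0xFFFF   # ALUOut(31-16) -> CM(47-32)
    cache_valid[index] = True               # 1 -> CM(48)
    main_memory[addr] = value               # B -> MM[ALUOut]

cache_data, cache_tags, cache_valid, main_memory = {}, {}, {}, {}
write_through(cache_data, cache_tags, cache_valid, main_memory, 0x00401234, 99)
print(cache_data[1165], cache_tags[1165], main_memory[0x00401234])   # 99 64 99
```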
DECStation 3100
Processor → Instruction Cache and Data Cache → Main Memory
Instruction Cache: 14-bit Index, 32-bit Data Word, 16K Entries, 64 KB Data
Data Cache: 14-bit Index, 32-bit Data Word, 16K Entries, 64 KB Data
DECStation 3100
Program   Instruction Miss Rate   Data Miss Rate   Effective Miss Rate
gcc       6.1%                    2.1%             5.4%
spice     1.2%                    1.3%             1.2%
• Only Read Misses
• Effective is weighted average of accesses
• Instruction Miss Rate not always less than Data
• Miss Rates clearly depend on the program
• Direct Mapped 1 word cache is effective
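The "weighted average of accesses" can be written out as below. The slides do not give the instruction/data access mix, so the fraction 0.825 is an assumption, worked backwards from the rounded gcc figures; the function name `effective_miss_rate` is also illustrative.

```python
def effective_miss_rate(instr_rate, data_rate, instr_fraction):
    """Weight each miss rate by its share of all memory accesses."""
    return instr_fraction * instr_rate + (1.0 - instr_fraction) * data_rate

# Assumption: 82.5% of gcc's accesses are instruction fetches.
# This value is inferred from the table, not stated in the slides.
GCC_INSTR_FRACTION = 0.825

gcc_effective = effective_miss_rate(6.1, 2.1, GCC_INSTR_FRACTION)
print(round(gcc_effective, 1))   # 5.4
```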
DECStation 3100
Average Memory Access Time = Hit Time + Miss Rate × Miss Penalty
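A quick worked instance of the formula, using the gcc effective miss rate from the table. The hit time of 1 cycle and miss penalty of 8 cycles are hypothetical values chosen for illustration, not DECStation 3100 measurements.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time = Hit Time + Miss Rate * Miss Penalty."""
    return hit_time + miss_rate * miss_penalty

# Assumed timing: 1-cycle hit, 8-cycle miss penalty (illustrative only).
gcc_amat = amat(hit_time=1, miss_rate=0.054, miss_penalty=8)
print(round(gcc_amat, 3))   # 1.432
```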
DECStation 3100
Program   Instruction Miss Rate   Data Miss Rate   Effective Miss Rate   Combined Miss Rate
gcc       6.1%                    2.1%             5.4%                  4.8%
spice     1.2%                    1.3%             1.2%
What if combined into one large cache?
• One large cache has a lower miss rate than two half-size caches
• Split caches can double the bandwidth by simultaneous access (pipelining)
Write-Through Performance Improvement
Every Write: Write Cache and Write Main Memory
Writes can be 10% to 15% of instructions
Consider a Write Buffer (diagram: Processor, Cache, Write Buffer, Main Memory)
Each Write Buffer entry holds Address, Data, Valid
Memory Controller writes data from the Buffer to Main Memory and releases the Buffer entry
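The write buffer behaves like a small FIFO between the processor and main memory. The sketch below is one possible model, assuming a 4-entry buffer and a processor stall when it is full; the class name `WriteBuffer`, its methods, and the capacity are all illustrative choices, not details from the slides.

```python
from collections import deque

class WriteBuffer:
    """FIFO of pending (address, data) writes; presence in the queue means valid."""

    def __init__(self, capacity=4):      # capacity is an assumed value
        self.capacity = capacity
        self.entries = deque()

    def put(self, addr, data):
        """Processor side: return False (stall) if the buffer is full."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append((addr, data))
        return True

    def drain_one(self, main_memory):
        """Memory controller side: write the oldest entry and release it."""
        if self.entries:
            addr, data = self.entries.popleft()
            main_memory[addr] = data

main_memory = {}
buf = WriteBuffer()
buf.put(0x100, 7)
buf.put(0x104, 8)
buf.drain_one(main_memory)
print(main_memory, len(buf.entries))   # {256: 7} 1
```

The processor can continue as soon as `put` succeeds, so the 5-cycle main memory write is hidden unless writes arrive faster than the controller can drain them.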