
Exam Review: Cache Design Issues and Two's Complement Arithmetic

This review covers topics such as two's complement arithmetic, ripple carry ALU logic, cache design issues, and the IEEE 754 floating point standard. It also includes practice problems and examples.

tadame

Presentation Transcript


  1. Exam 2 Review • Two’s Complement Arithmetic • Ripple carry ALU logic and performance • Look-ahead techniques, performance and equations • Basic multiplication and division (non-restoring) algorithms • IEEE 754 floating point standard (definition provided) • Write a sequence of register transfers to implement a given instruction for MIPS • Given a set of Register Transfers, design the control needed for some component

  2. Cache Design Issues • 1. Where can a word or block of words be placed in the cache?

  3. Cache Design Issues • 1. Where can a word or block of words be placed in the cache? • 2. How can a word be found if it is in the cache?

  4. Cache Design Issues • 1. Where can a word or block of words be placed in the cache? • 2. How can a word be found if it is in the cache? • 3. Which word or block of words should be replaced on a cache miss?

  5. Cache Design Issues • 1. Where can a word or block of words be placed in the cache? • 2. How can a word be found if it is in the cache? • 3. Which word or block of words should be replaced on a cache miss? • 4. When do we write the main memory?

  6. Cache Design Issues • 1. Where can a word or block of words be placed in the cache? • 2. How can a word be found if it is in the cache? • 3. Which word or block of words should be replaced on a cache miss? • 4. When do we write the main memory? • Remember: READS dominate WRITES

  7. [Diagram: 10-bit address; cache of 16 entries with a 4-bit cache address; main memory of 1K words with a 10-bit address and 10-bit words]

  8. [Diagram: 16-entry cache, 1K-word main memory, 10-bit words] • Where can a word be placed in the cache?

  9. [Diagram: 16-entry cache, 1K-word main memory, 10-bit words] • Where can a word be placed in the cache? • Use the last 4 bits of the address.

  10. [Diagram: the last 4 bits of the 10-bit address are labeled Index and select one of the 16 cache entries] • Where can a word be placed in the cache? • Use the last 4 bits of the address.

  11. [Diagram: address bits 9..0; the low 4 bits form the Index into the 16-entry cache; main memory of 1K words, 10-bit words]

  12. [Diagram: address bits 9..0 with a 4-bit Index] • How can a word be found if it is in the cache? • 2^6 = 64 possibilities: 64 different main-memory words share each index • Need to know the rest of the address!

  13. [Diagram: address bits 9..0 with a 4-bit Index] • How can a word be found if it is in the cache? • 2^6 = 64 possibilities • Need to know the rest of the address! • Save it in the cache.

  14. [Diagram: address split into Tag (bits 9..4) and Index (bits 3..0)] • Each cache entry holds a Tag (6 bits) and Data (10 bits)

  15. [Diagram: address split into Tag (bits 9..4) and Index (bits 3..0)] • Each cache entry holds a Tag (6 bits) and Data (10 bits) • Also need a Valid bit (1 bit) to indicate that the entry holds valid data

  16. [Diagram: each cache entry holds Valid (1 bit), Tag (6 bits), Data (10 bits)] • Which word should be replaced on a cache miss?

  17. [Diagram: each cache entry holds Valid (1 bit), Tag (6 bits), Data (10 bits)] • Which word should be replaced on a cache miss? • The one with the same Index.
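The tag/index split used in this 10-bit example can be sketched in a few lines. This is only an illustration: the function name `split_address` and the constants are made up here, and the address is treated as a word address as on the slides.

```python
INDEX_BITS = 4    # 16-entry cache
ADDR_BITS = 10    # 1K-word main memory

def split_address(addr):
    """Return (tag, index) for a 10-bit word address."""
    if not 0 <= addr < (1 << ADDR_BITS):
        raise ValueError("address out of range")
    index = addr & ((1 << INDEX_BITS) - 1)   # last 4 bits select the entry
    tag = addr >> INDEX_BITS                 # remaining 6 bits are stored as the tag
    return tag, index

print(split_address(0b1011010011))   # (45, 3): tag 0b101101, index 0b0011
```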

  18. Hit if: the location has been accessed, and no other location with the same index has been accessed since then.

  19. Hit if: the location has been accessed, and no other location with the same index has been accessed since then. • Temporal locality: the most recently accessed word is kept • Spatial locality: a word will not be replaced until an access occurs beyond the size of the cache

  20. Hit if: the location has been accessed, and no other location with the same index has been accessed since then. • Temporal locality: the most recently accessed word is kept • Spatial locality: a word will not be replaced until an access occurs beyond the size of the cache • The larger the cache, the lower the miss rate and the lower the average access time (it approaches the Hit time)
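A tiny simulation makes the hit condition concrete. The class below is a hypothetical sketch of the 16-entry direct-mapped cache from the example (it tracks only valid bits and tags, not data), and the access trace is invented for illustration.

```python
class DirectMappedCache:
    """Tracks valid bits and tags only; enough to decide hit or miss."""

    def __init__(self, index_bits=4):
        self.index_bits = index_bits
        entries = 1 << index_bits
        self.valid = [False] * entries
        self.tags = [0] * entries

    def access(self, addr):
        """Return True on a hit; on a miss, replace the entry at that index."""
        index = addr & ((1 << self.index_bits) - 1)
        tag = addr >> self.index_bits
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit

cache = DirectMappedCache()
trace = [5, 5, 21, 5]    # 5 and 21 share index 5 (21 = 16 + 5)
results = [cache.access(a) for a in trace]
print(results)           # [False, True, False, False]
```

The second access to address 5 hits. The access to 21 evicts it, so the final access to 5 misses again even though the rest of the cache is empty.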

  21. [Diagram: each cache entry holds Valid (1 bit), Tag (6 bits), Data (10 bits)] • When do we write the main memory?

  22. [Diagram: each cache entry holds Valid (1 bit), Tag (6 bits), Data (10 bits)] • When do we write the main memory? • As soon as the cache is written. This is called Write-Through.

  23. Direct Mapped General Structure • Address (n bits): Tag (n-k-2 bits), Index (k bits), Byte Offset (2 bits) • Cache: 2^k words; each entry holds Valid (1 bit), Tag (n-k-2 bits), and one computer word (n bits)

  24. Direct Mapped General Structure • Address (n bits): Tag (n-k-2 bits), Index (k bits), Byte Offset (2 bits) • Cache: 2^k words; each entry holds Valid (1 bit), Tag (n-k-2 bits), and one computer word (n bits) • Ex: 32-bit address and 2^14 words of data cache, k = 14

  25. Direct Mapped General Structure • Address (n bits): Tag (n-k-2 bits), Index (k bits), Byte Offset (2 bits) • Cache: 2^k words; each entry holds Valid (1 bit), Tag (n-k-2 bits), and one computer word (n bits) • Ex: 32-bit address and 2^14 words of data cache, k = 14 • Cache width is 32 + (32 - 14 - 2) + 1 = 49 bits • 49 / 32 ≈ 1.53 times the bits needed for the data alone
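The width calculation can be checked with a short helper. The function name `entry_bits` and its parameters are illustrative; it assumes one word per entry and a 2-bit byte offset, as in the general structure on the slide.

```python
def entry_bits(addr_bits, index_bits, word_bits=32, offset_bits=2):
    """Bits per cache entry: data word + tag + one valid bit."""
    tag_bits = addr_bits - index_bits - offset_bits
    return word_bits + tag_bits + 1

width = entry_bits(addr_bits=32, index_bits=14)
total = (1 << 14) * width            # bits in the whole cache
print(width, round(width / 32, 2))   # 49 1.53
print(total)                         # 802816
```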

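  26. [Diagram: 32-bit address split into Tag (16 bits), Index (14 bits), Byte Offset (2 bits); cache of 16K entries, each with Valid, Tag (16 bits), Data (32 bits)]

  27. [Diagram: 32-bit address split into Tag (16 bits), Index (14 bits), Byte Offset (2 bits); cache of 16K entries, each with Valid, Tag (16 bits), Data (32 bits)]

  28. [Diagram: same direct-mapped cache; a comparator (=) on the stored Tag and the address Tag produces Hit]

  29. [Diagram: same direct-mapped cache; the comparator (=) produces Hit and the indexed entry supplies Data]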

  30. Cache Control for MIPS Lite • Address: Tag (bits 31-16), Index (bits 15-2), Byte Offset (bits 1-0) • Cache entry: Valid (bit 48), Tag (bits 47-32), Data (bits 31-0) • Notation: Cache Memory (Field) [Address] • Data field addressed by the PC Index: CM(31-0)[PC(15-2)] • Tag field addressed by the PC Index: CM(47-32)[PC(15-2)]

  31. Cache Control for MIPS Lite • Instruction Fetch (assume Main Memory access is 5 clock cycles) • Was: S0: M[PC] → IR, PC+4 → PC, S1 → S

  32. Cache Control for MIPS Lite • Instruction Fetch (assume Main Memory access is 5 clock cycles) • Was: S0: M[PC] → IR, PC+4 → PC, S1 → S • Now: S0: CM(31-0)[PC(15-2)] → IR, PC+4 → PC, Hit·S1 + ¬Hit·S10 → S

  33. Cache Control for MIPS Lite • Instruction Fetch (assume Main Memory access is 5 clock cycles) • Was: S0: M[PC] → IR, PC+4 → PC, S1 → S
      S0: CM(31-0)[PC(15-2)] → IR, PC+4 → PC, Hit·S1 + ¬Hit·S10 → S
      S10: PC - 4 → PC, S11 → S
      S11: MM[PC] → MMOut, S12 → S
      S12: S13 → S
      S13: S14 → S
      S14: S15 → S
      S15: S16 → S

  34. Cache Control for MIPS Lite • Instruction Fetch (assume Main Memory access is 5 clock cycles) • Was: S0: M[PC] → IR, PC+4 → PC, S1 → S
      S0: CM(31-0)[PC(15-2)] → IR, PC+4 → PC, Hit·S1 + ¬Hit·S10 → S
      S10: PC - 4 → PC, S11 → S
      S11: MM[PC] → MMOut, S12 → S
      S12: S13 → S
      S13: S14 → S
      S14: S15 → S
      S15: S16 → S
      S16: MMOut → CM(31-0)[PC(15-2)], PC(31-16) → CM(47-32)[PC(15-2)], 1 → CM(48)[PC(15-2)], S0 → S
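To see what the miss path costs, the state sequence above can be walked in software. This is a rough sketch that assumes each state takes exactly one clock cycle, which the slides do not state explicitly; the names `MISS_PATH` and `fetch_cycles` are invented for the example.

```python
# Next-state table for the miss path, read off the state sequence above.
MISS_PATH = {
    "S10": "S11", "S11": "S12", "S12": "S13", "S13": "S14",
    "S14": "S15", "S15": "S16", "S16": "S0",
}

def fetch_cycles(hit):
    """Count states visited until S0 can hand off to S1."""
    state, cycles, in_cache = "S0", 0, hit
    while True:
        cycles += 1                  # assumption: one clock per state
        if state == "S0":
            if in_cache:
                return cycles        # Hit: next state is S1
            state = "S10"            # Miss: undo PC+4 and start the refill
        else:
            if state == "S16":
                in_cache = True      # S16 writes data, tag, and valid bit
            state = MISS_PATH[state]

print(fetch_cycles(True), fetch_cycles(False))   # 1 9
```

Under that assumption a miss spends eight extra cycles (the first S0 plus S10 through S16) before the repeated S0 hits.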

  35. Cache Control for MIPS Lite • Instruction Fetch (assume Main Memory access is 5 clock cycles) • Hit = Valid[Index] AND (Cache Tag[Index] = Addr Tag), for CM addressed by PC(15-2)
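The hit equation maps directly onto bit-field extraction from a 32-bit address. The sketch below uses the field positions from the slides (tag 31-16, index 15-2, byte offset 1-0); the function names and the sample address are illustrative.

```python
def fields(addr):
    """Split a 32-bit byte address into (tag, index, byte_offset)."""
    offset = addr & 0x3              # bits 1-0
    index = (addr >> 2) & 0x3FFF     # bits 15-2 (14 bits)
    tag = (addr >> 16) & 0xFFFF      # bits 31-16
    return tag, index, offset

def is_hit(valid, tags, addr):
    """Hit = Valid[Index] AND (Cache Tag[Index] == Addr Tag)."""
    tag, index, _ = fields(addr)
    return bool(valid[index]) and tags[index] == tag

valid = [0] * (1 << 14)
tags = [0] * (1 << 14)
addr = 0x00401234
tag, index, _ = fields(addr)         # tag 0x0040, index 0x48D
print(is_hit(valid, tags, addr))     # False: entry not yet valid
valid[index], tags[index] = 1, tag
print(is_hit(valid, tags, addr))     # True
```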

  36. [Diagram: WRITE on the direct-mapped cache (Tag 16 bits, Index 14 bits, Byte Offset 2 bits; 16K entries of Valid, Tag, Data); a write goes to both the cache (Write Cache) and main memory (Write Main)]

  37. Cache Control for MIPS Lite • Memory Write (assume Main Memory access is 5 clock cycles) • Was: S5: B → M[ALUOut], S0 → S

  38. Cache Control for MIPS Lite • Memory Write (assume Main Memory access is 5 clock cycles) • Was: S5: B → M[ALUOut], S0 → S • Now: S5: B → CM(31-0)[ALUOut(15-2)],

  39. Cache Control for MIPS Lite • Memory Write (assume Main Memory access is 5 clock cycles) • Was: S5: B → M[ALUOut], S0 → S • Now: S5: B → CM(31-0)[ALUOut(15-2)], ALUOut(31-16) → CM(47-32)[ALUOut(15-2)], 1 → CM(48)[ALUOut(15-2)]

  40. Cache Control for MIPS Lite • Memory Write (assume Main Memory access is 5 clock cycles) • Was: S5: B → M[ALUOut], S0 → S
      S5: B → CM(31-0)[ALUOut(15-2)], ALUOut(31-16) → CM(47-32)[ALUOut(15-2)], 1 → CM(48)[ALUOut(15-2)], B → MM[ALUOut], S17 → S
      S17: S18 → S
      S18: S19 → S
      S19: S20 → S
      S20: S0 → S
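The S5 transfers amount to updating the cache entry and main memory with the same word. Here is a minimal sketch of that write-through step, with the cache and main memory modeled as plain dictionaries; `write_word` and the sample values are hypothetical.

```python
INDEX_MASK = (1 << 14) - 1   # 14-bit index, address bits 15-2

def write_word(cache, main_memory, addr, value):
    """Write-through: update the indexed cache entry and main memory together."""
    index = (addr >> 2) & INDEX_MASK
    # Data, tag, and valid bit are all written, whether or not the entry hit.
    cache[index] = {"valid": 1, "tag": (addr >> 16) & 0xFFFF, "data": value}
    main_memory[addr] = value

cache, main_memory = {}, {}
write_word(cache, main_memory, 0x00401234, 99)
print(cache[0x48D], main_memory[0x00401234])
```

Because each entry holds exactly one word, overwriting the tag and valid bit on every write is safe: the entry always ends up describing the word just written.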

  41. DECStation 3100 • Processor with a separate Instruction Cache and Data Cache, both backed by Main Memory • Each cache: 14-bit Index, 32-bit Data Word, 16K Entries, 64 KB Data

  42. DECStation 3100

      Program   Instruction Miss Rate   Data Miss Rate   Effective Miss Rate
      gcc       6.1%                    2.1%             5.4%
      spice     1.2%                    1.3%             1.2%

      • Only Read Misses • Effective is the weighted average over accesses • Instruction Miss Rate is not always less than Data Miss Rate • Miss Rates clearly depend on the program • Direct Mapped 1 word cache is effective
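The "weighted average" can be made explicit: the effective rate is the instruction and data miss rates weighted by each kind's share of accesses. The slides do not give that share, so the sketch below back-solves it from the gcc row; the function names are invented, and the result is only approximate because the table values are rounded.

```python
def effective_miss_rate(instr_rate, data_rate, instr_fraction):
    """Weighted average of the two miss rates by share of accesses."""
    return instr_fraction * instr_rate + (1 - instr_fraction) * data_rate

def implied_instr_fraction(instr_rate, data_rate, effective):
    """Solve the weighted average for the instruction share of accesses."""
    return (effective - data_rate) / (instr_rate - data_rate)

f = implied_instr_fraction(6.1, 2.1, 5.4)   # gcc row, rates in percent
print(round(f, 3))                          # 0.825
print(round(effective_miss_rate(6.1, 2.1, f), 1))   # 5.4
```

So the gcc figures are consistent with roughly 80% of cache accesses being instruction fetches.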

  43. DECStation 3100

      Program   Instruction Miss Rate   Data Miss Rate   Effective Miss Rate
      gcc       6.1%                    2.1%             5.4%
      spice     1.2%                    1.3%             1.2%

      • Direct Mapped 1 word cache is effective • Average Memory Access Time = Hit Time + Miss Rate × Miss Penalty
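As a worked instance of the access-time formula, the snippet below plugs in the effective miss rates from the table. The hit time of 1 cycle and miss penalty of 8 cycles are assumed values chosen for illustration, not figures from the slides.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time = Hit Time + Miss Rate * Miss Penalty."""
    return hit_time + miss_rate * miss_penalty

HIT_TIME = 1       # cycles (assumed)
MISS_PENALTY = 8   # cycles (assumed)

for program, miss_rate in (("gcc", 0.054), ("spice", 0.012)):
    print(program, round(amat(HIT_TIME, miss_rate, MISS_PENALTY), 3))
```

With these assumptions gcc averages about 1.43 cycles per access and spice about 1.10, both close to the hit time.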

  44. DECStation 3100

      Program   Instruction Miss Rate   Data Miss Rate   Effective Miss Rate
      gcc       6.1%                    2.1%             5.4%
      spice     1.2%                    1.3%             1.2%

      • What if combined into one large cache?

  45. DECStation 3100

      Program   Instruction Miss Rate   Data Miss Rate   Effective Miss Rate   Combined Miss Rate
      gcc       6.1%                    2.1%             5.4%                  4.8%
      spice     1.2%                    1.3%             1.2%

      • What if combined into one large cache? • One large cache has a lower miss rate than two half-size caches • Split caches can double the bandwidth by simultaneous access (pipelining)

  46. Write-Through Performance Improvement • Every Write: Write Cache and Write Main Memory • Writes can be 10% to 15% of instructions

  47. Write-Through Performance Improvement • Consider a Write Buffer • [Diagram: Processor, Cache, Write Buffer, Main Memory]

  48. Write-Through Performance Improvement • Consider a Write Buffer • [Diagram: Processor, Cache, Write Buffer, Main Memory; each Write Buffer entry holds Address, Data, Valid]

  49. Write-Through Performance Improvement • Consider a Write Buffer • [Diagram: Processor, Cache, Write Buffer, Main Memory] • The Memory Controller writes data from the buffer to Main Memory and releases the buffer entry
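A write buffer behaves like a small FIFO between the processor and main memory. The sketch below is one possible model, not the design from the slides: the class name, the capacity of 4 entries, and the stall-on-full policy are all assumptions, and an entry's Valid bit is represented simply by its presence in the queue.

```python
from collections import deque

class WriteBuffer:
    def __init__(self, capacity=4):      # capacity is an assumed value
        self.capacity = capacity
        self.entries = deque()           # each entry: (address, data)

    def put(self, addr, data):
        """Processor side: queue a write. False means the processor must stall."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append((addr, data))
        return True

    def drain_one(self, main_memory):
        """Memory-controller side: write the oldest entry and release it."""
        if self.entries:
            addr, data = self.entries.popleft()
            main_memory[addr] = data

buffer, main_memory = WriteBuffer(capacity=2), {}
print(buffer.put(0x100, 1), buffer.put(0x104, 2), buffer.put(0x108, 3))
buffer.drain_one(main_memory)            # frees one slot
print(buffer.put(0x108, 3), main_memory)
```

The processor only waits when the buffer is full, so the 5-cycle main memory write is hidden as long as writes arrive more slowly than the controller can drain them.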
