Learn about different types of memory and the memory hierarchy, as well as how cache performance impacts overall system efficiency. Explore the trade-offs of memory technologies and understand why caching is essential for faster processing.
CSCI206 - Computer Organization & Programming: Memory Introduction
Revised by Alexander Fuchsberger and Xiannong Meng in Spring 2019, based on the notes by other instructors.
zyBook: 12.1, 12.2
Many different types of memory
Volatile:
• SRAM
• DRAM
• SDRAM
• PC100
• PC133
• DDR SDRAM
• DDR2 SDRAM
• DDR3 SDRAM
• DDR4 SDRAM
• GDDR3
• GDDR4
• GDDR5
• RDRAM
Non-volatile:
• ROM
• EEPROM
• NOR Flash
• NAND Flash
• SD
• SDHC
• SDXC
• FRAM
• HDD
• Optical Drive
WHY?
Memory hierarchy (fastest and smallest at the top, slowest and largest at the bottom):
• CPU registers
• On-board CPU cache
• Main memory: on chips (circuits)
• Secondary storage: typically involving mechanical parts
Trade-offs
• Ideally, memory would be infinitely large, fast, and low power
• This is impossible
• The memory hierarchy simulates a large, fast memory system using a combination of different memory technologies
Why it works
We can simulate a large, fast memory because of:
• temporal locality: recently accessed data is likely to be accessed again in the near future
• spatial locality: data near recently accessed data (by address) is more likely to be requested in the future than data that is far away
Both kinds of locality show up in ordinary code, as in the sketch below.
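As a rough illustration (not from the original slides), the C sketch below sums a 2D array twice. The row-major loop walks consecutive addresses and so exploits spatial locality, while the column-major loop jumps a whole row between accesses and does not; the array name, its size, and the loop structure are illustrative assumptions.

    #include <stdio.h>

    #define N 1024

    static int a[N][N];   /* zero-initialized global array, about 4 MB */

    int main(void) {
        long sum = 0;

        /* Good spatial locality: row-major traversal touches consecutive
           addresses, so each cache block is fully used before eviction. */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += a[i][j];

        /* Poor spatial locality: column-major traversal jumps N*sizeof(int)
           bytes between accesses, so most accesses miss in the cache. */
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += a[i][j];

        printf("%ld\n", sum);
        return 0;
    }

Both loop nests perform the same number of additions, yet on typical hardware the second one runs noticeably slower purely because of cache misses.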
Cache Memory
• The cache is a small amount of fast (expensive) memory holding the data currently being worked on (temporal/spatial locality)
• Main memory is much larger, slower, and cheaper
• The processor interfaces with the cache, so memory appears to be fast!
• A cache algorithm decides which memory blocks to store and when to move blocks back into main memory
Cache Hit
Example: lw $t0, 0($s0), where the data at address $s0+0 is in the cache
• The cache runs at CPU speed, so there is no delay; the data is read from the cache in the MEM stage
• All pipeline diagrams we have done assume cache hits
• Otherwise, MEM-EX or MEM-MEM forwarding would be impossible, since a memory read is much slower than a register read
Cache Miss
Example: lw $t0, 0($s0), where the data at address $s0+0 is NOT in the cache
• Latency to main memory is 100 ns
• In comparison, a register read takes only a few cycles
• If the CPU runs at 2 GHz, how many cycles do we stall?
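The slide leaves this as a question; a worked answer, assuming the entire 100 ns miss latency must be covered by stall cycles: at 2 GHz one clock cycle takes 1 / (2 GHz) = 0.5 ns, so a 100 ns access to main memory costs 100 / 0.5 = 200 stall cycles.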
Cache performance parameters
• Hit rate: The fraction of memory accesses found in a level of the memory hierarchy.
• Miss rate: The fraction of memory accesses not found in a level of the memory hierarchy.
• Hit time: The time required to access a level of the memory hierarchy.
• Miss penalty: The time required to fetch a block into a level of the memory hierarchy from the lower level.
Average Memory Access Time (AMAT)
• AMAT = hit time + miss rate × miss penalty
• If a cache hits 80% of the time and the miss penalty is 200 cycles, what is the AMAT (in clock cycles)?
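A worked answer, assuming a 1-cycle hit time (the slide does not state one): AMAT = 1 + 0.20 × 200 = 41 cycles. More generally, with hit time t the result is t + 40 cycles.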
Performance
A program with 1M instructions runs on a 2 GHz ideal pipelined processor (CPI = 1). 25% of the instructions access memory, with an 80% hit rate and a 100-cycle miss penalty. How long does the program take to execute?
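One way to work this out, assuming instruction fetches always hit and cache hits add no stalls beyond the base CPI: base cycles = 10^6 × 1 = 1,000,000. Memory-stall cycles = 10^6 × 0.25 × (1 − 0.80) × 100 = 5,000,000. Total = 6,000,000 cycles. At 2 GHz (0.5 ns per cycle), that is 6,000,000 × 0.5 ns = 3 ms.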