
Computer Systems Architecture: A Networking Approach. Chapter 12 Introduction: The Memory Hierarchy




Presentation Transcript


    1. Computer Systems Architecture: A Networking Approach. Chapter 12 Introduction: The Memory Hierarchy. CS 147, Nathaniel Gilbert

    2. Levels of Performance: You Get What You Pay For. Recall: Dynamic Random Access Memory (DRAM): capacitors store state (0 or 1); periodically refreshed; relatively cheap. Static Random Access Memory (SRAM): transistors store state; doesn't need to be refreshed, is faster, and uses less power than DRAM; more expensive than DRAM.

    3. Levels of Performance cont.

    4. Levels of Performance cont.

    5. Localization of Access: exploiting repetition. Programs tend to access memory locations close to those they accessed recently. This is partly because the programmer organizes data in clusters and partly because the compiler attempts to organize code efficiently. The memory hierarchy can exploit this localization.
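
The effect of localized access can be sketched with a toy cache model. The code below is a minimal illustration, not anything taken from the slides: it assumes a direct-mapped cache, and the names `hit_rate`, `num_lines`, and `line_size` as well as the two address patterns are hypothetical choices made for the example.

```python
def hit_rate(addresses, num_lines=64, line_size=16):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    tags = [None] * num_lines          # block currently held by each cache line
    hits = 0
    for addr in addresses:
        block = addr // line_size      # memory block containing this address
        index = block % num_lines      # the one cache line this block may use
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block        # miss: load the block into the line
    return hits / len(addresses)


# Neighbouring addresses: after one miss, the rest of each block hits.
sequential = list(range(4096))

# Large stride (assumed pattern): every access maps to the same cache line
# and evicts the block loaded by the previous access.
scattered = [(i * 1024) % 65536 for i in range(4096)]

print(hit_rate(sequential))  # 0.9375
print(hit_rate(scattered))   # 0.0
```

With 16-byte lines, the sequential pattern misses once per block and hits on the other 15 accesses, while the strided pattern never finds its data in the cache.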

    6. Localization of Access cont. Exploiting localization of memory access: keep related data in smaller groups (for example, try not to store all input and output in a single array when reading from or writing to disk). Only the portion of data the CPU is currently using should be loaded into faster memory.
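
One way to follow this advice is to process a file in fixed-size chunks instead of reading it into a single large array. The sketch below is only an illustration under assumptions: the function name `checksum_in_chunks`, the 4096-byte chunk size, and the simple additive checksum are hypothetical, and an in-memory `io.BytesIO` stream stands in for a disk file.

```python
import io


def checksum_in_chunks(stream, chunk_size=4096):
    """Sum all bytes of a stream (mod 65536), holding one chunk at a time."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)   # only this chunk is resident
        if not chunk:                     # empty bytes object means end of stream
            break
        total = (total + sum(chunk)) % 65536
    return total


# Stand-in for a file on disk; a real program would use open(path, "rb").
data = bytes(range(256)) * 100
print(checksum_in_chunks(io.BytesIO(data)))
```

The result does not depend on the chunk size; a smaller chunk only reduces how much data has to sit in fast memory at once.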

    7. Localization of Access cont.

    8. Localization of Access cont. On a Sun workstation (200 MHz CPU, 256 Mbyte main memory, 256 kbyte cache, 4 Gbyte local hard drive), the output was:

    9. Localization of Access cont. The reason for the doubling of time is the movement of data up and down the memory hierarchy. The array is moved into the faster levels in blocks, because the 256 kbytes of cache memory cannot hold the whole object.
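
The cost of an array that does not fit in the cache can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the Sun workstation above: it assumes a fully associative cache with least-recently-used (LRU) replacement, and `misses_for_sweeps` and its parameters are hypothetical names.

```python
from collections import OrderedDict


def misses_for_sweeps(array_blocks, cache_blocks, passes=4):
    """Count cache misses when sweeping an array several times (LRU cache)."""
    cache = OrderedDict()                  # keys are block numbers, oldest first
    misses = 0
    for _ in range(passes):
        for block in range(array_blocks):
            if block in cache:
                cache.move_to_end(block)   # mark as most recently used
            else:
                misses += 1
                cache[block] = True        # load the block
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict the least recently used
    return misses


print(misses_for_sweeps(100, 128))  # 100: only the first pass misses
print(misses_for_sweeps(200, 128))  # 800: every access misses
```

When the array fits, blocks are loaded once and reused. When it is larger than the cache, each sweep evicts blocks before they are needed again, so data keeps moving between levels.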

    10. Instruction and Data Caches: Matching Memory to CPU Speed. A 2 GHz Pentium CPU accesses program memory on average every 0.5 ns just to fetch instructions. DDR DRAM responds within 10 ns. If the CPU used only DRAM, the result would be a 20x loss in speed (10 ns / 0.5 ns). This is where SRAM (cache) comes into play. Drawback of cache: a miss (the desired code is not in the cached memory segment) may take longer, because the cache has to be reloaded from main memory. Negative cache (depending on architecture): a cache in which negative results (failures) are stored.
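
The arithmetic on this slide can be checked in a few lines. The sketch below uses the slide's figures (2 GHz clock, 10 ns DRAM); the 95% hit rate and the simple weighted-average model in `effective_access_ns` are assumptions added for illustration, and both function names are hypothetical.

```python
def cycle_time_ns(clock_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def effective_access_ns(hit_rate, cache_ns, dram_ns):
    """Average access time if hits cost cache_ns and misses cost dram_ns."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * dram_ns


cpu_ns = cycle_time_ns(2e9)      # 0.5 ns per cycle at 2 GHz
dram_ns = 10.0                   # DRAM response time from the slide
print(dram_ns / cpu_ns)          # 20.0, the slowdown with DRAM only

# Assumed 95% hit rate with a cache that matches the CPU clock.
print(effective_access_ns(0.95, cpu_ns, dram_ns))
```

Under these assumptions the average access time is just under 1 ns, which is why even a small SRAM cache recovers most of the lost speed.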

    11. Instruction and Data Caches cont. Cache is built from SRAM chips, which are ideally fast enough to match the system clock speed of the CPU. The Cache Controller Unit (CCU) and cache memory are inserted between the CPU and the main memory. Level 1 and Level 2 cache differ in placement. Level 1 is on the CPU chip. Level 2 was generally located off the CPU chip and was slowed down by the system bus. Intel successfully integrated a 128 kbyte L2 cache memory onto the CPU and continues to offer integrated chips.

    12. Instruction and Data Caches cont. Generic System Architecture. Level 1 is the microprocessor, with three forms of cache: D-cache (data): fast buffer containing application data; I-cache (instruction): speeds up the fetching of executable instructions; TLB (Translation Lookaside Buffer): stores a map of translated virtual page addresses. Level 2 is a unified cache. Memory: DRAM. The CPU and register file reside in Level 1. Register file: a small amount of memory closest to the CPU, where data is manipulated.
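
The role of the TLB as a map of translated virtual page addresses can be sketched in software. This is a simplified illustration, not a description of any real processor: the 4096-byte page size, the four-entry capacity, the first-in-first-out eviction, and the class name `TinyTLB` are all assumptions.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class TinyTLB:
    """Caches virtual-page to physical-frame translations (FIFO eviction)."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table   # virtual page number -> frame number
        self.capacity = capacity
        self.entries = {}              # dicts keep insertion order
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
        else:
            self.misses += 1           # fall back to the full page table
            if len(self.entries) >= self.capacity:
                oldest = next(iter(self.entries))
                del self.entries[oldest]
            self.entries[vpn] = self.page_table[vpn]
        return self.entries[vpn] * PAGE_SIZE + offset


tlb = TinyTLB({0: 7, 1: 3})
print(tlb.translate(4100))   # 12292: page 1 maps to frame 3, offset 4
print(tlb.translate(4200))   # same page, served from the TLB
print(tlb.hits, tlb.misses)  # 1 1
```

The second lookup touches the same virtual page, so the translation comes from the small fast map rather than the page table.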

    13. Thank You
