FSM, Cache Memory
Prof. Sin-Min Lee
Department of Computer Science
CS147 Lecture 14
The Five Classic Components of a Computer: Input, Output, Memory, Datapath, and Control (the Datapath and Control together make up the Processor)
The Processor Picture (figure): Processor/Memory Bus, PCI Bus, I/O Busses
Two Basic Types of Memory • RAM (Random Access Memory) • Used to store programs and data that the computer needs when executing programs • Volatile: loses its information once power is turned off
Two Basic Types of Memory • ROM (Read-Only Memory) • Stores critical information necessary to operate the system, such as the program needed to boot the computer • Non-volatile: always retains its data • Also embedded in systems where the programming does not need to change
Memory Hierarchy • Hierarchical Memory • An approach in which computer systems use a combination of memory types to provide the best performance at the best cost • The basic types that constitute a hierarchical memory system include registers, cache, main memory, and secondary memory
Memory Hierarchy • Today’s computers each have a small amount of very high-speed memory, called cache, where data from frequently used memory locations may be temporarily stored • Cache is connected to main memory, which is typically medium-speed memory • Main memory is complemented by secondary memory, composed of the hard disk and various removable media
von Neumann (Princeton) Architecture (figure): memory, address pointer, Arithmetic Logic Unit (ALU), data/instructions path, and program counter (PC = PC + 1); featuring deterministic execution
Cache Memory • Physical memory is slow (more than 30 times slower than the processor) • Cache memory uses SRAM chips • Much faster • Much more expensive • Situated closest to the processor • Can be arranged hierarchically • L1 cache is incorporated into the processor • L2 cache sits outside the processor
Cache Memory This photo shows level 2 cache memory on the Processor board, beside the CPU
Cache Memory: Three-Level Architecture (figure): a processor with a 2-gigahertz clock and cache control logic, backed by L1 cache (32 kilobytes, 2X), L2 cache (128 kilobytes, 8X), L3 cache (16 megabytes, 16X), and multi-gigabyte main memory (large and slow, 160X); featuring really non-deterministic execution
Cache (1) • The first level of the memory hierarchy encountered once the address leaves the CPU • Since the principle of locality applies, and taking advantage of locality to improve performance is so popular, the term cache is now applied whenever buffering is employed to reuse commonly occurring items • We will study caches by trying to answer the four questions for the first level of the memory hierarchy
Cache (2) • Every address reference goes first to the cache • If the desired address is not there, we have a cache miss • The contents are fetched from main memory into the indicated CPU register, and a copy is also saved in the cache • If the desired data is in the cache, we have a cache hit • The desired data is delivered from the cache at very high speed (low access time) • Most software exhibits temporal locality of access, meaning that the same address is likely to be used again soon, and if so, it will be found in the cache • Transfers between main memory and cache occur at the granularity of cache lines or cache blocks, typically 32 or 64 bytes (rather than individual bytes or processor words). Burst transfers of this kind receive hardware support and exploit spatial locality of access (future accesses are often to addresses near the previous one)
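As a rough illustration of this read path (not from the slides), here is a minimal sketch of a direct-mapped cache that fetches a whole block on a miss; the sizes and the read_main_memory helper are assumptions made for illustration:

```python
BLOCK_SIZE = 64          # bytes per cache line (assumed)
NUM_BLOCKS = 8           # block frames in this toy cache (assumed)

# Each frame holds (valid, tag, data block); everything starts invalid.
cache = [{"valid": False, "tag": None, "data": None} for _ in range(NUM_BLOCKS)]

def read_main_memory(block_addr):
    """Placeholder for a slow main-memory read of one whole block."""
    return bytes(BLOCK_SIZE)  # dummy data

def cache_read(addr):
    block_addr = addr // BLOCK_SIZE          # which memory block holds addr
    offset     = addr %  BLOCK_SIZE          # byte within that block
    index      = block_addr % NUM_BLOCKS     # direct-mapped frame
    tag        = block_addr // NUM_BLOCKS    # identifies the block in that frame
    frame = cache[index]
    if frame["valid"] and frame["tag"] == tag:
        return frame["data"][offset]         # cache hit: fast path
    # Cache miss: fetch the whole block from main memory, keep it, then serve it.
    frame.update(valid=True, tag=tag, data=read_main_memory(block_addr))
    return frame["data"][offset]
```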
Where can a block be placed in Cache? (1) • Our cache has eight block frames and the main memory has 32 blocks
Where can a block be placed in Cache? (2) • Direct mapped cache • Each block has only one place where it can appear in the cache • (Block address) MOD (Number of blocks in cache) • Fully associative cache • A block can be placed anywhere in the cache • Set associative cache • A block can be placed in a restricted set of places in the cache • A set is a group of blocks in the cache • (Block address) MOD (Number of sets in the cache) • If there are n blocks in a set, the placement is said to be n-way set associative
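A small sketch of the three placement rules, using the eight-frame cache from the earlier example; the helper names are hypothetical:

```python
NUM_FRAMES = 8   # block frames in the cache (as in the example above)

def direct_mapped_frame(block_addr):
    # Exactly one legal frame: (block address) MOD (number of blocks in cache)
    return block_addr % NUM_FRAMES

def fully_associative_frames(block_addr):
    # Any frame is legal
    return list(range(NUM_FRAMES))

def set_associative_frames(block_addr, ways):
    # n-way: the block maps to one set of `ways` frames
    num_sets = NUM_FRAMES // ways
    s = block_addr % num_sets            # (block address) MOD (number of sets)
    return list(range(s * ways, s * ways + ways))

# Memory block 12:
print(direct_mapped_frame(12))           # 4
print(set_associative_frames(12, 2))     # set 12 MOD 4 = 0 -> frames [0, 1]
```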
How is a Block Found in the Cache? • Caches have an address tag on each block frame that gives the block address. The tag is checked against the address coming from the CPU • All tags are searched in parallel, since speed is critical • A valid bit is appended to every tag to indicate whether the entry contains a valid address • Address fields: • Block address • Tag – compared against the stored tags for a hit • Index – selects the set • Block offset – selects the desired data from the block • Set associative cache • A larger index means more sets, with fewer blocks per set • With a smaller index, the associativity increases • Fully associative cache – there is no index field
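A hedged sketch of how the three address fields might be carved out with shifts and masks, assuming a byte-addressed machine with 64-byte blocks and 128 sets (illustrative numbers, not from the slides):

```python
BLOCK_OFFSET_BITS = 6    # 2**6 = 64-byte blocks (assumed)
INDEX_BITS        = 7    # 2**7 = 128 sets (assumed)

def split_address(addr):
    offset = addr & ((1 << BLOCK_OFFSET_BITS) - 1)                   # byte within the block
    index  = (addr >> BLOCK_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)   # selects the set
    tag    = addr >> (BLOCK_OFFSET_BITS + INDEX_BITS)                # compared for a hit
    return tag, index, offset

print(split_address(0x1234ABCD))
```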
Which Block should be Replaced on a Cache Miss? • When a miss occurs, the cache controller must select a block to be replaced with the desired data • A benefit of direct mapping is that this hardware decision is much simplified • Two primary strategies for fully and set associative caches • Random – candidate blocks are randomly selected • Some systems generate pseudo-random block numbers to get reproducible behavior, which is useful for debugging • LRU (Least Recently Used) – to reduce the chance of throwing out information that will soon be needed again, the block replaced is the least recently used one • Accesses to blocks are recorded so that LRU can be implemented
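A minimal sketch of LRU bookkeeping for a single set, assuming the set is modeled as a list ordered from most to least recently used (a simple software model, not a hardware design):

```python
class LRUSet:
    """One cache set with LRU replacement; the most recently used tag sits at index 0."""
    def __init__(self, ways):
        self.ways = ways
        self.tags = []                       # ordered from most to least recently used

    def access(self, tag):
        if tag in self.tags:                 # hit: move the tag to the front
            self.tags.remove(tag)
            self.tags.insert(0, tag)
            return "hit"
        if len(self.tags) == self.ways:      # miss with a full set: evict the LRU tag
            self.tags.pop()
        self.tags.insert(0, tag)
        return "miss"

s = LRUSet(ways=2)
print([s.access(t) for t in [7, 3, 7, 9, 3]])   # ['miss', 'miss', 'hit', 'miss', 'miss']
```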
What Happens on a Write? • Two basic options when writing to the cache: • Write through – the information is written to both the block in the cache and the block in the lower-level memory • Write back – the information is written only to the block in the cache • The modified block of cache is written back to the lower-level memory only when it is replaced • To reduce the frequency of writing back blocks on replacement, an implementation feature called the dirty bit is commonly used • This bit indicates whether a block is dirty (has been modified since it was loaded) or clean (not modified). If clean, no write back is needed
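A hedged sketch contrasting the two write policies and the dirty bit; write_main_memory and the frame dictionary are placeholders, not a real interface:

```python
def write_main_memory(block_addr, data):
    pass  # placeholder for a slow lower-level write

def write_through(frame, block_addr, offset, value):
    # Update both the cache block and the lower-level memory on every write.
    frame["data"][offset] = value
    write_main_memory(block_addr, frame["data"])

def write_back(frame, offset, value):
    # Update only the cache block and mark it dirty; memory is written later.
    frame["data"][offset] = value
    frame["dirty"] = True

def evict(frame, block_addr):
    # On replacement, a dirty (modified) block must be written back first.
    if frame.get("dirty"):
        write_main_memory(block_addr, frame["data"])
        frame["dirty"] = False
```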
There are three methods of block placement:
Direct mapped: if each block has only one place it can appear in the cache, the cache is said to be direct mapped. The mapping is usually (Block address) MOD (Number of blocks in cache).
Fully associative: if a block can be placed anywhere in the cache, the cache is said to be fully associative.
Set associative: if a block can be placed in a restricted set of places in the cache, the cache is said to be set associative. A set is a group of blocks in the cache. A block is first mapped onto a set, and then the block can be placed anywhere within that set. The set is usually chosen by bit selection; that is, (Block address) MOD (Number of sets in cache).
A pictorial example for a cache with only 4 blocks and a memory with only 16 blocks.
Direct mapped cache: A block from main memory can go in exactly one place in the cache. This is called direct mapped because there is a direct mapping from any block address in memory to a single location in the cache. (Figure: cache and main memory.)
Fully associative cache: A block from main memory can be placed in any location in the cache. This is called fully associative because a block in main memory may be associated with any entry in the cache. (Figure: cache and main memory.)
Set associative cache: The middle range of designs between direct mapped cache and fully associative cache is called set-associative cache. In an n-way set-associative cache, a block from main memory can go into n (n at least 2) locations in the cache. (Figure: 2-way set-associative cache and main memory.)
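As a concrete check of the formulas, assuming the 4-block cache and 16-block memory of the pictorial example above: memory block 13 can go only into frame 13 MOD 4 = 1 of the direct-mapped cache; into any of the four frames of the fully associative cache; and, in the 2-way set-associative cache (two sets of two frames), into either frame of set 13 MOD 2 = 1.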
Memory/Cache Related Terms
Locality of Reference • If location X is accessed, it is very likely that location X+1 will be accessed soon • This is the benefit of a cache that transfers whole blocks of data
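A small sketch of why block transfers pay off: with 64-byte lines and 8-byte elements (assumed sizes, not from the slides), a cold sequential scan misses only once per 8 elements:

```python
BLOCK_SIZE   = 64        # bytes per cache line (assumed)
ELEMENT_SIZE = 8         # bytes per array element (assumed)

def misses_for_sequential_scan(num_elements):
    """Cold-cache miss count for a sequential scan: one miss per cache line touched."""
    elements_per_line = BLOCK_SIZE // ELEMENT_SIZE     # 8 elements share one line
    return (num_elements + elements_per_line - 1) // elements_per_line

print(misses_for_sequential_scan(1000))   # 125 misses instead of 1000
```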
Flip-Flop Example
Yulia Newton
CS 147, Fall 2009, SJSU
Problem Implement the following state diagram using T Flip-Flop(s) and J-K Flip-Flop(s)
Number Of Flip-Flops Needed • Need 2 flip-flops: • One T Flip-Flop • One JK Flip-Flop
Steps To Solve The Problem
Step 1 – Create the state table
Step 2 – K-Maps for QA+ and QB+
Step 3 – K-Maps for the T and JK Flip-Flops
Step 4 – Draw the Flip-Flop diagram
State Table Derived directly from the state diagram:
K-Maps for QA+ and QB+
JK Truth Table Let’s revisit JK Flip-Flops:
T Truth Table Let’s revisit T Flip-Flops:
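Since the truth tables on these two slides are shown as figures, here is a sketch of the standard characteristic behaviour of the two flip-flop types, written as next-state functions:

```python
def t_next(q, t):
    """T flip-flop: hold when T=0, toggle when T=1 (Q+ = T XOR Q)."""
    return q ^ t

def jk_next(q, j, k):
    """JK flip-flop: J=K=0 hold, J=0 K=1 reset, J=1 K=0 set, J=K=1 toggle."""
    return (j & (q ^ 1)) | ((k ^ 1) & q)   # Q+ = JQ' + K'Q

# Print both characteristic tables.
for q in (0, 1):
    for t in (0, 1):
        print(f"T : Q={q} T={t} -> Q+={t_next(q, t)}")
    for j in (0, 1):
        for k in (0, 1):
            print(f"JK: Q={q} J={j} K={k} -> Q+={jk_next(q, j, k)}")
```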
K-Map for T: T = QA'QB
K-Map for J and K: J = XQA' + X'QA, K = X
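To tie the equations together, here is a hedged simulation of the resulting circuit; the slides do not state which flip-flop holds which state bit, so the pairing below (the T flip-flop holds QA, the JK flip-flop holds QB) is an assumption made for illustration:

```python
def t_next(q, t):                 # T flip-flop characteristic: Q+ = T XOR Q
    return q ^ t

def jk_next(q, j, k):             # JK flip-flop characteristic: Q+ = JQ' + K'Q
    return (j & (q ^ 1)) | ((k ^ 1) & q)

def next_state(qa, qb, x):
    # Excitation equations read off the K-maps above:
    t = (qa ^ 1) & qb                        # T = QA'·QB
    j = (x & (qa ^ 1)) | ((x ^ 1) & qa)      # J = X·QA' + X'·QA
    k = x                                    # K = X
    # Assumed assignment: T flip-flop -> QA, JK flip-flop -> QB.
    return t_next(qa, t), jk_next(qb, j, k)

# Enumerate the transition table implemented by this circuit.
for qa in (0, 1):
    for qb in (0, 1):
        for x in (0, 1):
            print(f"QA={qa} QB={qb} X={x} -> QA+ QB+ = {next_state(qa, qb, x)}")
```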