OpenRISC 1000 Yung-Luen Lan, b9090102
Cache Model • Perspective of the Programming Model. • Hence, hardware implementation details (cache organization, size, etc.) are outside the scope of this topic. • The implementation should work without a cache.
Why Cache? • Memory Hierarchy
Cache Concept • How do we know if a data item is in the cache? • If it is, how do we find it?
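The two questions above can be illustrated with a toy direct-mapped lookup. This is only a sketch: the block size, the number of sets, and the names `split_address` and `DirectMappedCache` are assumptions made for illustration, not part of the OpenRISC 1000 programming model, which deliberately leaves cache organization open.

```python
# Toy direct-mapped cache lookup; all sizes are illustrative assumptions.
BLOCK_SIZE = 16   # bytes per cache block (assumed)
NUM_SETS = 64     # number of blocks in the cache (assumed)


def split_address(addr):
    """Split an address into (tag, index, offset)."""
    offset = addr % BLOCK_SIZE
    index = (addr // BLOCK_SIZE) % NUM_SETS
    tag = addr // (BLOCK_SIZE * NUM_SETS)
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        # One valid bit and one tag per set.
        self.valid = [False] * NUM_SETS
        self.tags = [0] * NUM_SETS

    def lookup(self, addr):
        """Return True on a hit: the indexed entry is valid and its tag matches."""
        tag, index, _ = split_address(addr)
        return self.valid[index] and self.tags[index] == tag

    def fill(self, addr):
        """Record that the block containing addr is now cached."""
        tag, index, _ = split_address(addr)
        self.valid[index] = True
        self.tags[index] = tag
```

The index answers "where would the item be?" and the tag comparison answers "is it really there?". Two addresses that share an index but differ in tag compete for the same entry.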
Cache Coherency • Multiple processors
Multiprocessor Cache Coherency • Extra snooping hardware detects the actions of other processors. • Snoopy protocol • Write-invalidate • Write-update
Snoopy Protocol • Write-invalidate: The writing processor causes all copies in other caches to be invalidated before changing its local copy. The writing processor issues an invalidation signal over the bus, and all caches check to see if they have a copy. If so, they must invalidate the block. • Write-update: The writing processor broadcasts the new data over the bus, and all copies are then updated with the new value.
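The difference between the two policies can be seen in a small software model. This is a hedged sketch, not a hardware description: the `Bus` and `SnoopyCache` classes are hypothetical, each cache line holds a single value, and writes go straight through to memory to keep the model short (an assumption, not something the slides state).

```python
class Bus:
    """Shared bus: knows every attached cache and the backing memory."""

    def __init__(self):
        self.caches = []
        self.memory = {}


class SnoopyCache:
    def __init__(self, bus, policy):
        self.bus = bus
        self.policy = policy   # "invalidate" or "update"
        self.lines = {}        # addr -> value, for valid copies only
        bus.caches.append(self)

    def read(self, addr):
        # On a miss, load the value from memory (0 if never written).
        if addr not in self.lines:
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        # Every other cache snoops the bus and reacts if it holds a copy.
        for other in self.bus.caches:
            if other is self or addr not in other.lines:
                continue
            if self.policy == "invalidate":
                del other.lines[addr]       # write-invalidate
            else:
                other.lines[addr] = value   # write-update
        self.lines[addr] = value
        self.bus.memory[addr] = value       # write-through (simplifying assumption)
```

With write-invalidate the other cache takes a miss on its next read; with write-update it keeps a valid copy at the cost of broadcasting the data on every write.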
DCCR • Data Cache Control Register • Accessible with the l.mtspr/l.mfspr instructions in supervisor mode. • Enable Ways: • 0000 0000 All ways disabled/locked • 1111 1111 All ways enabled/unlocked
ICCR • Instruction Cache Control Register • Accessible with the l.mtspr/l.mfspr instructions in supervisor mode. • Enable Ways: • 0000 0000 All ways disabled/locked • 1111 1111 All ways enabled/unlocked
Cache Control Register (cont.) • If the cache does not implement way locking, the DCCR/ICCR need not be implemented.
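As an illustration of the Enable Ways field, the sketch below builds an 8-bit mask from a list of way numbers. The bit-to-way mapping (bit i controls way i) and the helper names `enable_ways_mask` and `format_mask` are assumptions made for this example. It only computes the value; actually writing it to the DCCR or ICCR would need l.mtspr in supervisor mode.

```python
NUM_WAYS = 8  # width of the Enable Ways field shown on the slides


def enable_ways_mask(ways):
    """Return a mask with one bit set per enabled/unlocked way.

    Assumes bit i corresponds to way i (hypothetical mapping).
    """
    mask = 0
    for way in ways:
        if not 0 <= way < NUM_WAYS:
            raise ValueError("way out of range: %d" % way)
        mask |= 1 << way
    return mask


def format_mask(mask):
    """Format the mask as two groups of four bits, as on the slides."""
    bits = format(mask, "08b")
    return bits[:4] + " " + bits[4:]
```

For example, `format_mask(enable_ways_mask(range(8)))` gives `1111 1111` (all ways enabled/unlocked) and an empty list gives `0000 0000`.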
Cache Management • Memory accesses caused by cache management are not recorded (unlike load or store instructions) and cannot invoke any exception. • Instruction caches do not need to be coherent with the memory or caches of other processors. Software must make the instruction cache coherent with modified instructions in the memory. A typical way to accomplish this is: • Data cache block write-back (update of the memory) • l.csync (wait for update to finish) • Instruction cache block invalidate (clear instruction cache block) • Flush pipeline
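The sequence above can be traced in a toy model of one memory, one write-back data cache, and one instruction cache. This is a sketch under stated assumptions: `ToySystem` and `make_code_visible` are hypothetical names, every operation completes immediately (so l.csync has nothing to wait for), and there is no pipeline to flush.

```python
class ToySystem:
    def __init__(self):
        self.memory = {}
        self.dcache = {}   # addr -> (value, dirty)
        self.icache = {}   # addr -> value

    def store(self, addr, value):
        # Stores land in the data cache only (write-back behaviour).
        self.dcache[addr] = (value, True)

    def fetch(self, addr):
        # Instruction fetch: the instruction cache never looks at the data cache.
        if addr not in self.icache:
            self.icache[addr] = self.memory.get(addr)
        return self.icache[addr]

    def dcache_write_back(self, addr):
        if addr in self.dcache and self.dcache[addr][1]:
            value, _ = self.dcache[addr]
            self.memory[addr] = value
            self.dcache[addr] = (value, False)

    def icache_invalidate(self, addr):
        self.icache.pop(addr, None)


def make_code_visible(system, addr):
    system.dcache_write_back(addr)   # 1. data cache block write-back
    # 2. l.csync would wait here; this model completes operations at once
    system.icache_invalidate(addr)   # 3. instruction cache block invalidate
    # 4. the pipeline flush has no counterpart in this model
```

Before `make_code_visible` runs, a fetch still returns the old instruction because the new one exists only as a dirty data cache block; afterwards the fetch misses and reloads the updated memory contents.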
Data Cache Block Prefetch • Optional special-purpose register (DCBPR). • Accessible with the l.mtspr/l.mfspr instructions in both user and supervisor modes. • 32 bits/64 bits • Write-only: the DCBPR is written with the effective address, and the corresponding block from memory is prefetched into the cache.
Data Cache Block Flush • The DCBFR is written with the effective address. • If coherency is required then the corresponding: • Unmodified data cache block is invalidated in all processors. • Modified data cache block is written back to the memory and invalidated in all processors. • If the block is missing in the local processor, a modified copy in another processor is written back to the memory and invalidated; unmodified copies in other processors are just invalidated. • If coherency is not required then the corresponding: • Unmodified data cache block in the local processor is invalidated. • Modified data cache block is written back to the memory and invalidated in the local processor. • Missing cache block in the local processor does not cause any action.
Data Cache Block Invalidate • The DCBIR is written with the effective address. • If coherency is required then the corresponding: • Unmodified data cache block is invalidated in all processors. • Modified data cache block is invalidated in all processors. • If the block is missing in the local processor, the corresponding data cache blocks in other processors are still invalidated. • If coherency is not required then the corresponding: • Unmodified data cache block in the local processor is invalidated. • Modified data cache block in the local processor is invalidated. • Missing cache block in the local processor does not cause any action.
Data Cache Block Write-Back • The DCBWR is written with the effective address. • If coherency is required then: • the corresponding data cache block in any of the processors is written back to memory if it was modified. • If coherency is not required then: • the corresponding data cache block in the local processor is written back to memory if it was modified.
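Flush, invalidate, and write-back differ only in what happens to the memory and to the cached copy. The sketch below models the non-coherent (local processor only) case; `LocalDataCache` is a hypothetical class, and each block is reduced to a single value plus a modified flag.

```python
class LocalDataCache:
    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}   # addr -> [value, modified]

    def store(self, addr, value):
        self.blocks[addr] = [value, True]

    def write_back(self, addr):
        # Memory is updated only if the block is present and modified;
        # the block stays in the cache.
        block = self.blocks.get(addr)
        if block is not None and block[1]:
            self.memory[addr] = block[0]
            block[1] = False

    def invalidate(self, addr):
        # The block is dropped; modified data is lost.
        self.blocks.pop(addr, None)

    def flush(self, addr):
        # Write back if modified, then invalidate.
        self.write_back(addr)
        self.invalidate(addr)
```

In all three operations a missing block causes no action, which matches the last bullet of the non-coherent case on the flush and invalidate slides.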
Data Cache Block Lock • Optional • The DCBLR is written with the effective address. The corresponding data cache block in the local processor is locked. • If all blocks of the same set in all cache ways are locked, then the cache refill may automatically unlock the least-recently used block.
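The interaction between locking and refill can be sketched for a single cache set. The `CacheSet` class is hypothetical, and it takes the "may" on the slide as "does": when every way is locked, the least-recently used block is unlocked and replaced. Locking a block that is not cached does nothing in this model, which is an assumption for the data cache.

```python
from collections import OrderedDict


class CacheSet:
    def __init__(self, num_ways):
        self.num_ways = num_ways
        self.blocks = OrderedDict()   # tag -> locked flag, least recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, refill and return False."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)   # mark as most recently used
            return True
        self.refill(tag)
        return False

    def lock(self, tag):
        if tag in self.blocks:
            self.blocks[tag] = True

    def refill(self, tag):
        if len(self.blocks) >= self.num_ways:
            # Prefer the least recently used block that is not locked.
            victim = next((t for t, locked in self.blocks.items() if not locked), None)
            if victim is None:
                # All ways locked: the LRU block is unlocked and replaced.
                victim = next(iter(self.blocks))
            del self.blocks[victim]
        self.blocks[tag] = False
```

With two ways holding A (locked) and B, a miss on C evicts B. If C is then locked as well, the next miss has no unlocked victim and replaces A, the least recently used block.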
Instruction Cache Block Prefetch • Optional • The ICBPR is written with the effective address and the corresponding block from memory is prefetched into the instruction cache.
Instruction Cache Block Invalidate • The ICBIR is written with the effective address. • If coherency is required then the corresponding instruction cache blocks in all processors are invalidated. • If coherency is not required then the corresponding instruction cache block is invalidated in the local processor.
Instruction Cache Block Lock • Optional • The ICBLR is written with the effective address. The corresponding instruction cache block in the local processor is locked. • If all blocks of the same set in all cache ways are locked, then the cache refill may automatically unlock the least-recently used block. • Missing cache block in the local processor does not cause any action.
Cache/Memory Coherency • Keep caches and memory synchronized. • In systems that do not provide cache coherency with the PTE attributes (because they do not implement a memory management unit), it may be provided through explicit cache management. • Cache coherency in systems with virtual memory can be provided on a page-by-page basis with PTE attributes. The attributes are: • Cache Coherent (CC Attribute) • Caching-Inhibited (CI Attribute) • Write-Back Cache (WBC Attribute) • When the memory/cache attributes are changed, the cache contents must reflect the new attribute settings. This usually means that cache blocks must be flushed or invalidated.
Pages Designated as Cache Coherent Pages • CC=0: do not need cache coherency. • CC=1: need cache coherency. • To improve performance of uniprocessor systems, memory pages should not be designated as CC=1.
Pages Designated as Caching-Inhibited Pages • CI=1: memory accesses go directly to main memory, bypassing the cache. • The target content should never be available in the cache. • When the OS marks a page as CI, it should flush the corresponding cache blocks so that no stale copy remains in the cache.
Pages Designated as Write-Back Cache Pages • WBC=0: Write to both cache and memory. • WBC=1: Only write to local cache. (Requires cache snooping hardware support)
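The effect of the WBC attribute on a store can be shown with a minimal model. The `store` function, the dictionary-based cache, and the dirty flag are assumptions made for illustration; snooping and the later write-back of dirty blocks are left out.

```python
def store(cache, memory, addr, value, wbc):
    """Model a store to a page with the given WBC attribute.

    cache maps addr -> (value, dirty); memory maps addr -> value.
    """
    if wbc:
        # WBC=1: only the local cache is written; memory is updated later.
        cache[addr] = (value, True)
    else:
        # WBC=0: cache and memory are written together.
        cache[addr] = (value, False)
        memory[addr] = value
```

After a store with `wbc=False` the memory already holds the new value. After a store with `wbc=True` the memory is stale until the dirty block is written back, which is why other observers need snooping or explicit cache management to see it.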
Power Management (optional) • Slow down feature • Doze mode • Sleep mode • Suspend mode • Dynamic clock gating feature