NET+OS 6.1 Training: Cache API
NET+OS 6.1 Cache API • H/W Features • NET+OS Cache Initialization • Cache and DMA • Configuring Cache • API Functions
NS9750 Cache Features • Built into the ARM926EJ-S CPU core • Has a memory management unit (MMU) • 4 KB of data cache • 8 KB of instruction cache • Main write buffer has a 16-word data buffer and a four-address buffer • Data cache write-back buffer has eight data words and a single address buffer • Memory controller has four 16-word buffers
MMU Functions • Controls what sections of the address space can be accessed. • Determines cache mode for different sections of the address space. • Creates virtual address maps. This feature is not used by NET+OS.
Access Modes • MMU_CLIENT_RW: Any process can read and write to this address region. This access mode is used for all memory and registers. • MMU_NO_ACCESS: Any access to this address region causes an abort exception. NET+OS sets this address mode on all invalid addresses.
NS9750 - Defect • The no-access mode is very important for the NS9750. • Revision 1 of the chip has a defect that hangs the processor if software reads or writes certain invalid addresses. Setting invalid regions to no-access turns these accesses into abort exceptions before they can hang the processor.
Buffering vs. Caching • Cached data can be stored in the processor’s cache for an indeterminate amount of time until the cache line is needed for another piece of memory, or the data value changes. • When a write is buffered, the data is only stored until the bus becomes available at which time it is written to memory. This prevents the processor from stalling when it writes to memory or a peripheral.
Data Cache Modes • Non-buffered: Data caching and buffering are both disabled. • Buffered: Data cache is disabled, but writes are buffered. • Write-through: Reads are cached. Writes are buffered, but not cached. • Write-back: Reads are cached. Writes are cached and buffered.
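On ARM9 cores these four modes correspond to the cacheable (C) and bufferable (B) bits of a translation table descriptor. The following is a minimal sketch of that mapping. The enum and function names are hypothetical and are not NET+OS identifiers; the bit positions are those of the standard ARM section and page descriptors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the four data cache modes (not NET+OS identifiers). */
typedef enum {
    MODE_NON_BUFFERED,
    MODE_BUFFERED,
    MODE_WRITE_THROUGH,
    MODE_WRITE_BACK
} cache_mode_t;

#define DESC_B_BIT (1u << 2)  /* bufferable bit in an ARM descriptor */
#define DESC_C_BIT (1u << 3)  /* cacheable bit in an ARM descriptor  */

/* Return the C/B descriptor bits that select the given mode. */
uint32_t cache_mode_bits(cache_mode_t mode)
{
    switch (mode) {
    case MODE_NON_BUFFERED:  return 0u;                      /* C=0, B=0 */
    case MODE_BUFFERED:      return DESC_B_BIT;              /* C=0, B=1 */
    case MODE_WRITE_THROUGH: return DESC_C_BIT;              /* C=1, B=0 */
    case MODE_WRITE_BACK:    return DESC_C_BIT | DESC_B_BIT; /* C=1, B=1 */
    }
    assert(!"unknown cache mode");
    return 0u;
}

/* Reads are cached only in the two modes that set the C bit. */
int reads_are_cached(cache_mode_t mode)
{
    return (cache_mode_bits(mode) & DESC_C_BIT) != 0;
}
```

Only the write-through and write-back modes set the C bit, which matches the slide: those are the two modes in which reads are cached.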
ARM9 and Cache • ARM9 was designed around cache and write buffers, and depends on them for performance. • Performance is very poor when cache and buffering are disabled. • Instruction cache should always be on. • Write buffers should always be on.
Cache Initialization • Instruction cache is enabled in nccInit() by the call to MCEnableICache(). • Data cache and MMU memory protection are set up in main() by NAEnableMmu(). • The first call to nonCachedMalloc() initializes the non-cached memory region.
Cache and DMA • DMA transfers bypass cache. • Cache can become incoherent if DMA transfers data to or from memory that has been cached. • Either DMA must be limited to non-cached buffers, or care must be taken to invalidate or clean cache before starting DMA transfers to cached buffers.
Using Non-cached Buffers • A section of RAM is set up that is not cached and can, therefore, be safely used with DMA. • Use nonCachedMalloc() to allocate buffers. • Use nonCachedFree() to release them.
Using Cached Buffers • Functions are provided to prepare a cached buffer for a DMA transfer by either cleaning or invalidating the cache lines associated with it. • Call NABeforeDma() to prepare a cached buffer for a DMA transfer. • Call NAAfterDma() when the transfer is complete (the current implementation does nothing). • These functions can also be called on buffers from the non-cached region, in which case they do nothing.
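The reason for cleaning or invalidating can be illustrated with a toy model that runs on a host PC. This is only a sketch: it models one buffer with a write-back cached copy, and every name in it (toy_buffer_t, toy_clean(), toy_invalidate(), and so on) is hypothetical. It does not show the NET+OS implementation or the real signatures of NABeforeDma() and NAAfterDma(), which are not given here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_WORDS 4

/* Toy model: a buffer in RAM plus a write-back cached copy of it. */
typedef struct {
    uint32_t ram[BUF_WORDS];    /* what DMA sees                  */
    uint32_t cache[BUF_WORDS];  /* what the CPU sees when valid   */
    int valid;                  /* cache holds a copy of the data */
    int dirty;                  /* cached copy is newer than RAM  */
} toy_buffer_t;

/* Load the buffer into the cache on first CPU access. */
static void toy_fill(toy_buffer_t *b)
{
    if (!b->valid) {
        memcpy(b->cache, b->ram, sizeof b->cache);
        b->valid = 1;
        b->dirty = 0;
    }
}

void cpu_write(toy_buffer_t *b, int i, uint32_t v)
{
    assert(i >= 0 && i < BUF_WORDS);
    toy_fill(b);
    b->cache[i] = v;  /* write-back: RAM is not updated yet */
    b->dirty = 1;
}

uint32_t cpu_read(toy_buffer_t *b, int i)
{
    assert(i >= 0 && i < BUF_WORDS);
    toy_fill(b);
    return b->cache[i];
}

/* Clean: write dirty cached data back to RAM (before DMA reads RAM). */
void toy_clean(toy_buffer_t *b)
{
    if (b->valid && b->dirty) {
        memcpy(b->ram, b->cache, sizeof b->ram);
        b->dirty = 0;
    }
}

/* Invalidate: discard the cached copy (so the CPU re-reads RAM). */
void toy_invalidate(toy_buffer_t *b)
{
    b->valid = 0;
    b->dirty = 0;
}

/* DMA bypasses the cache and touches RAM only. */
uint32_t dma_read_ram(const toy_buffer_t *b, int i) { return b->ram[i]; }
void dma_write_ram(toy_buffer_t *b, int i, uint32_t v) { b->ram[i] = v; }
```

In this model, a DMA read that follows a CPU write sees stale RAM until the buffer is cleaned, and a CPU read that follows a DMA write sees the stale cached copy until the buffer is invalidated.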
Configuration Parameters • mmuTable in customizeCache.c. • #defines in bsp.h. • Constants in customize.ldr/lx.
mmuTable • Sets cache mode for memory regions: not cached, buffered, write-through, or write-back. • Sets memory protection level: read-write or no access. • Sets page sizes for memory regions: 1 MB, 64 KB, or 4 KB.
mmuTable • Table contains a list of address ranges and their cache/memory protection settings. • All address ranges not in the list are set to no-access. • Software can change cache mode and access level on the fly. • Page sizes cannot be changed on the fly.
Changing mmuTable • Adjust the size of the RAM and ROM regions to match the application. • Keep page sizes as large as possible for best performance.
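One way to see why larger pages help: each page needs its own translation entry, so a region covered by small pages needs many more entries than the same region covered by 1 MB pages. The sketch below counts the pages needed to cover an address range. The macro and function names are hypothetical and are not part of NET+OS.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_1M  0x100000u
#define PAGE_64K 0x10000u
#define PAGE_4K  0x1000u

/*
 * Number of pages of page_size bytes needed to cover the address
 * range [start, end], where end is inclusive as in mmuTable.
 */
uint32_t pages_needed(uint32_t start, uint32_t end, uint32_t page_size)
{
    assert(end >= start);
    /* page_size must be a non-zero power of two */
    assert(page_size != 0u && (page_size & (page_size - 1u)) == 0u);

    uint32_t first_page = start / page_size;
    uint32_t last_page  = end / page_size;
    return last_page - first_page + 1u;
}
```

For a 16 MB region such as 0x00000000 to 0x00ffffff, this gives 16 pages at 1 MB, 256 pages at 64 KB, and 4096 pages at 4 KB.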
Example mmuTable
mmuTableType const mmuTable[] =
{
/*   Start       End         Page Size         Cache Mode         User Access   */
/*   ==========  ==========  ================  =================  ============= */
    {0x00000000, 0x00ffffff, MMU_PAGE_SIZE_1M, MMU_WRITE_THROUGH, MMU_CLIENT_RW},
    {0x50000000, 0x57ffffff, MMU_PAGE_SIZE_1M, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x80000000, 0x8fffffff, MMU_PAGE_SIZE_1M, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90000000, 0x900001d3, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90100000, 0x90103163, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90110000, 0x901101d3, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90200000, 0x9020007b, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90300000, 0x9030007b, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90400000, 0x9040017b, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90500000, 0x9050000F, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0x90600000, 0x90600093, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0000000, 0xA00fffff, MMU_PAGE_SIZE_1M, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0100000, 0xA01fffff, MMU_PAGE_SIZE_1M, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0200000, 0xA02fffff, MMU_PAGE_SIZE_1M, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0300000, 0xA0301FFF, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0400000, 0xA040002B, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0401000, 0xA0401007, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0600000, 0xA0601400, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0700000, 0xA070027c, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0800000, 0xA08003FF, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW},
    {0xA0900000, 0xA0900223, MMU_PAGE_SIZE_4K, MMU_BUFFERED,      MMU_CLIENT_RW}
};
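The rule that every address range missing from the table is no-access can be sketched as a simple lookup. The types and names below are hypothetical stand-ins, not the real mmuTableType layout, and the sketch compares against the exact start and end addresses. Actual MMU protection works at page granularity, so on hardware a listed range most likely covers the whole of its last page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access levels and region record (not NET+OS types). */
typedef enum { ACCESS_NONE, ACCESS_RW } access_t;

typedef struct {
    uint32_t start;   /* first address of the region       */
    uint32_t end;     /* last address, inclusive           */
    access_t access;  /* protection applied to this region */
} region_t;

/* Three rows taken from the example table above. */
static const region_t example_regions[] = {
    {0x00000000u, 0x00ffffffu, ACCESS_RW},
    {0x50000000u, 0x57ffffffu, ACCESS_RW},
    {0x90000000u, 0x900001d3u, ACCESS_RW},
};

/*
 * Return the access level for addr. Addresses that fall in no
 * listed region default to no access.
 */
access_t lookup_access(const region_t *table, size_t count, uint32_t addr)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (addr >= table[i].start && addr <= table[i].end) {
            return table[i].access;
        }
    }
    return ACCESS_NONE;
}
```

With these three rows, address 0x00001000 is read-write, while 0x01000000 and 0x58000000 fall outside every listed range and are therefore no-access.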
Parameters in bsp.h • BSP_AUTOMATICALLY_ENABLE_INSTRUCTION_CACHE should always be TRUE. • BSP_CACHE_NETWORK_HEAP should be FALSE for most applications. Caching packet buffers lowers overall performance because it causes other data to be evicted from the cache.
Linker Scripts • TTB_SIZE sets the size of the MMU translation table. Increase this value if the linker reports that the .ttb section is full or has overflowed. • NON_CACHE_MALLOC_SIZE sets the size of the non-cached region. It may need to be increased if new drivers that use non-cached buffers are added.
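A rough way to judge how much translation table space a given mmuTable needs is sketched below. It assumes the .ttb section holds a standard ARM first-level table (4096 entries of 4 bytes) plus one coarse second-level table (256 entries of 4 bytes) for every 1 MB of address space that is mapped with 64 KB or 4 KB pages. How NET+OS actually lays out the .ttb section is not stated here, so treat the result as an estimate only. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define L1_TABLE_BYTES     (4096u * 4u)  /* one entry per 1 MB of a 4 GB space   */
#define COARSE_TABLE_BYTES (256u * 4u)   /* one entry per 4 KB of a 1 MB section */

/*
 * Estimate translation table size in bytes.
 * small_page_megabytes: number of distinct 1 MB sections that contain
 * at least one region mapped with 64 KB or 4 KB pages.
 */
uint32_t ttb_bytes_estimate(uint32_t small_page_megabytes)
{
    assert(small_page_megabytes <= 4096u);
    return L1_TABLE_BYTES + small_page_megabytes * COARSE_TABLE_BYTES;
}
```

The example mmuTable maps 13 distinct 1 MB sections with 4 KB pages, so under these assumptions ttb_bytes_estimate(13) gives 29696 bytes. A table that used only 1 MB pages would need just the 16384-byte first-level table.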
Cache API • NAEnableMmu(): Called from main() to build translation table and start MMU. • NASetCacheAccessModes(): Can be used to change cache and access settings. • nonCachedMalloc(): Get a buffer from the non-cached region of memory. • nonCachedFree(): Return buffer to non-cached region. • NABeforeDma(): Prepare buffer for DMA. • NAAfterDma(): Process buffer after DMA.
Cache API cont. • NAInvalidateBuffer(): Indicate that the buffer contents in memory have changed and the cached copy is now invalid. • NACleanBuffer(): Write cached data back to the buffer in memory. • NAFlushCacheWriteBuffers(): Stall the processor until the cache write buffers are empty. Use for self-modifying code.
Cache API cont. • MCEnableICache(): Turn instruction cache on. MMU does not need to be enabled. • MCDisableICache(): Turn instruction cache off. • MCInitMMU(): Program MMU to use the translation table in the .ttb section and turn it on. • MCEnableDCache(): Turn data cache on. The translation table must be set up and the MMU enabled before calling this function. • MCDisableDCache(): Turn data cache off. Has the side effect of disabling memory protection.