Cache Manager Orlando Robertson Brown Bag Seminar Microsoft Windows Internals 4th Edition
Cache Manager • What is the cache manager? • a set of kernel-mode functions and system threads that cooperate with the memory manager to provide data caching for all Windows file system drivers (both local and network) • how the cache manager works, including its key internal data structures and functions • how it’s sized at system initialization time • how it interacts with other elements of the operating system • how you can observe its activity through performance counters • the five flags on the Windows CreateFile function that affect file caching
Key Features of the Cache Manager • Supports all file system types (local and network) • Uses the memory manager to control what parts of what files are in physical memory • Uses virtual block data caching • Allows for intelligent read-ahead and high-speed access to the cache without involving file system drivers • Supports “hints” passed by applications at file open time • Ex. random vs. sequential access, temporary file creation, … • Supports recoverable file systems to recover data after a system failure • Ex. file systems using transaction logging
Single Centralized Cache • caches all externally stored data • local hard disks, floppy disks, network file servers, or CD-ROMs • Any data can be cached • user data streams or file system metadata • Windows cache method depends on the type of data being cached
Memory Manager • cache manager never knows how much cached data is actually in physical memory • cache manager accesses data by mapping views of files into system virtual address space using standard section objects • Section objects are the basic primitive of the memory manager • memory manager pages in blocks that aren’t in physical memory as addresses in the mapped views are accessed
Memory Manager • cache manager copies data to or from virtual addresses and relies on the memory manager to fault the data into (or out of) memory as needed • avoids generating read or write I/O request packets (IRPs) to access data for files it’s caching • memory manager is allowed to make global trade-offs on how much memory to give to the system cache vs. user processes • design allows processes that open cached files to see the same data as processes that are mapping the same files into their user address spaces
Cache Coherency • cache manager ensures that any process accessing cached data will get the most recent version • cache manager and user applications that map files into their address spaces use the same memory management file mapping services • memory manager guarantees that it has only one representation of each unique mapped file • memory manager maps all views of a file (even if they overlap) to a single set of pages in physical memory
Virtual Block Caching • logical block caching: cache manager keeps track of which blocks of a disk partition are in the cache • Novell NetWare, OpenVMS, and older UNIX systems • virtual block caching: cache manager keeps track of which parts of which files are in the cache • Windows cache manager • cache manager maps 256-KB portions of files into system virtual address space • uses special system cache routines located in the memory manager • intelligent read-ahead possible • cache manager can predict where the caller might be going next • fast I/O possible • I/O system can address cached data directly to satisfy an I/O request, bypassing the file system
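Because the cache is tracked per file in 256-KB pieces, finding the cached piece for a given byte of a file is a division. A minimal sketch, assuming views start on 256-KB boundaries of the file; the function name `locate_in_view` is hypothetical and not part of any Windows API:

```python
from typing import Tuple

VIEW_SIZE = 256 * 1024  # 256-KB view size, as stated on the slide


def locate_in_view(file_offset: int) -> Tuple[int, int]:
    """Return (view index within the file, byte offset inside that view)."""
    if file_offset < 0:
        raise ValueError("file offset must be non-negative")
    return divmod(file_offset, VIEW_SIZE)


# A read at byte 300 KB falls in the second view (index 1), 44 KB into it.
print(locate_in_view(300 * 1024))
```

A logical block cache would key the same lookup on a disk block number instead, which is why it cannot see file-level access patterns.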
Stream Based Cache • Stream: a sequence of bytes within a file • NTFS allows a file to contain more than one stream • cache manager designed to cache each stream independently • NTFS can organize its master file table into streams and cache these streams • cached streams are identified by both a filename and a stream name (if more than one stream exists)
Recoverable File System Support • Half-completed I/O operations can corrupt a disk volume and render an entire volume inaccessible • Recoverable file systems are designed to reconstruct the disk volume structure after a system failure (NTFS) • NTFS maintains a log file which records every intended update to the file system structure (the file system’s metadata) before writing changes to the volume
Recoverable File System Support • cache manager and file system must work together to ensure that the following actions occur in sequence: • The file system writes a log file record documenting the volume update it intends to make. • The file system calls the cache manager to flush the log file record to disk. • The file system writes the volume update to the cache; that is, it modifies its cached metadata. • The cache manager flushes the altered metadata to disk, updating the volume structure.
Recoverable File System Support • logical sequence number (LSN) identifies the record in its log file and corresponds to the cache update • LSN supplied by file system during data writes to the cache • cache manager tracks LSNs associated with each page in the cache • data streams marked as “no write” by NTFS are protected from page writes before the corresponding log records are written
Recoverable File System Support • Before a group of dirty pages is flushed to disk: • cache manager determines the highest LSN associated with the pages to be flushed and reports that number to the file system • file system calls the cache manager back to flush log file data up to the point represented by the reported LSN • cache manager then flushes the corresponding volume structure updates to disk • file system and cache manager record what they are going to do before actually doing it, thereby providing recoverability of the disk volume after a system failure
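The LSN rule on the slides above can be illustrated with a toy model. This is a sketch under stated assumptions, not NTFS or cache manager code: `ToyVolume` and all of its members are hypothetical names, and log records are buffered in memory until a page flush forces them out.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVolume:
    """Hypothetical stand-in for a recoverable volume: a log plus cached metadata."""
    log_buffer: list = field(default_factory=list)   # (lsn, text) records not yet on disk
    log_on_disk: list = field(default_factory=list)  # (lsn, text) records already flushed
    dirty_pages: dict = field(default_factory=dict)  # page number -> (lsn, data) in cache
    disk_pages: dict = field(default_factory=dict)   # page number -> data on disk
    next_lsn: int = 1

    def update_metadata(self, page: int, data: str) -> int:
        """Record the intended update in the log, then modify the cached page."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_buffer.append((lsn, f"page {page} <- {data}"))
        self.dirty_pages[page] = (lsn, data)
        return lsn

    def flush_log_up_to(self, lsn: int) -> None:
        """Move every buffered log record with an LSN <= lsn to disk."""
        self.log_on_disk += [r for r in self.log_buffer if r[0] <= lsn]
        self.log_buffer = [r for r in self.log_buffer if r[0] > lsn]

    def flush_dirty_pages(self, pages: list) -> int:
        """Flush the given dirty pages, forcing their log records out first."""
        highest = max(self.dirty_pages[p][0] for p in pages)
        self.flush_log_up_to(highest)       # log records reach disk first
        for p in pages:
            _, data = self.dirty_pages.pop(p)
            self.disk_pages[p] = data       # only then the metadata itself
        return highest
```

The invariant the sketch maintains is that no metadata page reaches `disk_pages` unless the log record describing it is already in `log_on_disk`.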
Cache Virtual Memory Management • cache manager is given a region of system virtual address space to manage (instead of a region of physical memory) • cache manager divides this address space into 256-KB slots called views • cache manager maps views of files into slots in the cache’s address space on a round-robin basis
Cache Virtual Memory Management • cache manager guarantees that active views are mapped • A view is marked active only during a read or write operation to or from the file • cache manager unmaps inactive views of a file as it maps new views • Except for processes specifying the FILE_FLAG_RANDOM_ACCESS flag in the call to CreateFile • Pages for unmapped views are sent to the standby or modified lists • FILE_FLAG_SEQUENTIAL_SCAN flag moves pages to the front of the lists
Cache Virtual Memory Management • When the cache manager needs to map a view of a file and there are no more free slots in the cache: • it unmaps the least recently mapped inactive view and uses that slot • If no inactive views are available, it returns an I/O error
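The slot-reuse policy on this slide can be sketched as follows. This is an illustrative model only: `Slot`, `ToyViewCache`, and the logical clock are hypothetical, and the sketch picks the first free slot rather than modeling the round-robin placement mentioned earlier.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Slot:
    file_view: Optional[Tuple[str, int]] = None  # (file name, view index), None if free
    active_count: int = 0                        # reads/writes currently using the view
    mapped_at: int = 0                           # logical time of the last mapping


class ToyViewCache:
    def __init__(self, slot_count: int):
        self.slots = [Slot() for _ in range(slot_count)]
        self.clock = 0

    def map_view(self, file_view: Tuple[str, int]) -> int:
        """Return the index of the slot that holds file_view, mapping it if needed."""
        for i, slot in enumerate(self.slots):
            if slot.file_view == file_view:
                return i                          # already mapped
        free = [i for i, s in enumerate(self.slots) if s.file_view is None]
        if free:
            idx = free[0]
        else:
            inactive = [i for i, s in enumerate(self.slots) if s.active_count == 0]
            if not inactive:
                raise OSError("no inactive view available")  # the I/O error case
            # Reuse the least recently mapped inactive view.
            idx = min(inactive, key=lambda i: self.slots[i].mapped_at)
        self.clock += 1
        self.slots[idx] = Slot(file_view=file_view, mapped_at=self.clock)
        return idx
```

With two slots, mapping a third view reuses the slot that was mapped first, as long as no read or write is in progress on it.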
Cache Size • Windows computes the size of the system cache • virtual and physical • size of the system cache depends on a number of factors • memory size • version of Windows
LargeSystemCache • A registry value • HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\LargeSystemCache • Affects both the virtual and physical sizes of the cache • Default value is 0 • Windows 2000 Professional and Windows XP • Default value is 1 • Windows Server systems
LargeSystemCache • can modify LargeSystemCache • System -> Advanced -> Performance -> Settings -> Advanced • Memory Usage
Cache Virtual Size • system cache virtual size depends on the amount of physical memory • default virtual size is 64 MB • virtual size of the system cache computed from physical memory: • virtual size = 128 MB + ((physical memory - 16 MB) / 4 MB) * 64 MB
Cache Virtual Size • x86 system virtual size limited to 512 MB • unless registry value LargeSystemCache = 1 • cache virtual memory then limited to 960 MB • more cache virtual memory results in fewer view unmap and remap operations
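A worked version of the sizing rule from the two slides above, as a sketch. The function name is hypothetical, and one detail is an assumption not stated on the slides: that the 64 MB default applies when physical memory is 16 MB or less, with the formula used above that.

```python
MB = 1024 * 1024


def cache_virtual_size(physical_bytes: int, large_system_cache: bool = False) -> int:
    """Virtual size of the system cache in bytes on x86, per the slide's formula."""
    if physical_bytes <= 16 * MB:
        # ASSUMPTION: the 64 MB default is used at or below 16 MB of physical memory.
        size = 64 * MB
    else:
        size = 128 * MB + (physical_bytes - 16 * MB) // (4 * MB) * 64 * MB
    # x86 limits from the slide: 512 MB, or 960 MB when LargeSystemCache = 1.
    limit = 960 * MB if large_system_cache else 512 * MB
    return min(size, limit)


for mem in (16, 20, 64):
    print(mem, "MB physical ->", cache_virtual_size(mem * MB) // MB, "MB virtual")
```

For 64 MB of physical memory the formula gives 128 MB + 12 * 64 MB = 896 MB, which the x86 limit then reduces to 512 MB unless LargeSystemCache is set.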
Cache Working Set Size • cache manager controls the size of system cache • dynamically balances demands for physical memory between processes and the operating system • system cache shares a single system working set • cache data, paged pool, pageable Ntoskrnl code, pageable driver code • memory manager favors the system working set over processes running on the system when registry value LargeSystemCache = 1 • the physical size of the system cache within the system working set can be examined through performance counters or system variables
Cache Physical Size • system working set does not necessarily reflect the total amount of file data cached in physical memory • file data might be in the memory manager’s standby or modified page lists • standby list can consume nearly all the physical memory outside the system working set • when there is no other demand for physical memory • total amount of file data cached is controlled by the memory manager • Considered in some sense the real cache manager
Cache Physical Size • cache manager subsystem acts as a facade for accessing file data through the memory manager • important for read-ahead and write-behind policies • Task Manager shows the total amount of file data that’s cached on a system • value named System Cache
Cache Data Structures • VACB for each 256-KB slot in the system cache • private cache map for each opened cached file • Information for read-ahead • shared cache map structure for each cached file • points to mapped views of the file
Systemwide Data Structures • virtual address control blocks (VACBs) keep track of the state of the views in the system cache • cache manager allocates all the VACBs required to describe the system cache • Each VACB represents one 256-KB view in the system cache
Systemwide Data Structures • first field in a VACB is the virtual address of the data in the system cache • second field is a pointer to the shared cache map structure • third field identifies the offset within the file at which the view begins • fourth field identifies how many active reads or writes are accessing the view
Per-File Cache Data Structures • shared cache map structure describes the state of the cached file • size and valid data length (for security reasons) • list of private cache maps • VACBs for currently mapped views • section object • private cache map structure contains location of the last two reads for intelligent read-ahead • section object describes the file’s mapping into virtual memory
Per-File Cache Data Structures • VACB index array is an array of pointers to VACBs that the cache manager maintains to track which views of a file are mapped into the system cache • cache manager uses the file’s VACB index array to see if the requested data has been mapped into the cache • a nonzero array entry means the referenced data is in the cache • a zero array entry means the referenced data is not in the cache
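The four VACB fields and the index-array lookup described above can be modeled in a few lines. This is a sketch, not the kernel layout: the class and function names are hypothetical, `None` plays the role of a zero array entry, a string stands in for the shared cache map pointer, and the base address is made up.

```python
from dataclasses import dataclass
from typing import List, Optional

VIEW_SIZE = 256 * 1024  # each VACB describes one 256-KB view


@dataclass
class Vacb:
    base_address: int       # virtual address of the view in the system cache
    shared_cache_map: str   # stand-in for the pointer to the shared cache map
    file_offset: int        # offset within the file at which the view begins
    active_count: int = 0   # number of active reads or writes using the view


def find_cached_address(index_array: List[Optional[Vacb]],
                        file_offset: int) -> Optional[int]:
    """Return the cache virtual address for file_offset, or None if unmapped."""
    if file_offset < 0:
        raise ValueError("file offset must be non-negative")
    entry = file_offset // VIEW_SIZE          # one array entry per 256 KB of file
    if entry >= len(index_array) or index_array[entry] is None:
        return None                           # zero entry: data is not mapped
    vacb = index_array[entry]
    return vacb.base_address + (file_offset - vacb.file_offset)


# Only the second 256-KB piece of the file is mapped (hypothetical address).
array = [None, Vacb(0xC1000000, "file.dat", VIEW_SIZE), None]
print(hex(find_cached_address(array, VIEW_SIZE + 100)))
```

A `None` result corresponds to the case where the cache manager would have to map a new view before the data could be accessed.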
File System Interfaces • file system driver determines whether some part of a file is mapped in the system cache • call the CcInitializeCacheMap function if not • file system driver calls one of several functions to access data in a file • copy method copies user data between cache buffers in system space and a process buffer in user space • mapping and pinning method uses virtual addresses to read and write data directly to cache buffers • physical memory access method uses physical addresses to read and write data directly to cache buffers. • file system drivers must provide two versions of the file read operation to handle a page fault • cached and noncached
File System Interfaces • typical interactions for cache manager, memory manager, and file system drivers in response to user read/write file I/O • cache manager is invoked by a file system through the copy interfaces • cache manager creates a view and reads the file data into the user buffer • copy operation generates page faults as it accesses each previously invalid page • memory manager initiates noncached I/O into the file system driver to retrieve the data
Copying to & from the Cache • system cache is in system space • not available from user mode due to a potential security hole • user application file reads and writes to cached files must be serviced by kernel-mode routines • data copied between cache’s buffers in system space and application’s buffers in process address space
Caching w/ Mapping & Pinning Interfaces • file system drivers need to read and write the data that describes the files • metadata or volume structure data • cache manager provides functions that let file system drivers modify data directly in the system cache • mapping and pinning interfaces resolve a file system’s buffer management problem • file system doesn’t have to predict the maximum number of buffers it will need • eliminates the need for buffers by updating the volume structure in virtual memory
Caching w/ DMA Interfaces • direct memory access (DMA) functions are used to read from or write to cache pages without intervening buffers • Ex. network file system doing a transfer over the network • DMA interface returns to the file system the physical addresses of cached user data • used to transfer data directly from physical memory to a network device • can result in significant performance improvements for large transfers • a memory descriptor list (MDL) is used to describe physical memory references
Fast I/O • Fast I/O is a means of reading or writing a cached file without going through the work of generating an IRP • Fast I/O doesn’t always occur • the first read or write to a file requires setting up the file for caching • the caller requested an asynchronous read or write (the caller might be stalled during paging I/O operations) • the file in question has a locked range of bytes
Fast I/O • A thread performs a read or write operation • the request passes to the fast I/O entry point of the file system driver • the file system driver calls the cache manager read or write routine to access the file data directly in the cache • cache manager translates the supplied file offset into a virtual address in the cache • cache manager copies the data from the cache into the buffer (reads) or copies the data from the buffer to the cache (writes) • for reads, read-ahead information in the caller’s private cache map is updated • for writes, the dirty bit of any modified page in the cache is set so that the lazy writer will know to flush it to disk • for write-through files, modifications are flushed to disk
Read Ahead & Write Behind • cache manager implements reading and writing of file data on behalf of the file system drivers • applies to file I/O only when a file is opened without the FILE_FLAG_NO_BUFFERING flag • applies to file I/O using the Windows I/O functions • excluding mapped files
Intelligent Read Ahead • cache manager uses the principle of spatial locality to perform intelligent read-ahead • predicting what data the calling process is likely to read next • File read-ahead for logical block caching is more complex • asynchronous read-ahead with history extends read-ahead benefits to cases of strided data accesses • uses the last two read requests in the private cache map for the file handle being accessed • passing the FILE_FLAG_SEQUENTIAL_SCAN flag to the CreateFile function makes read-ahead even more efficient • cache manager doesn’t keep a read history • reads ahead two times as much data
Intelligent Read Ahead • cache manager’s read-ahead is asynchronous • performed in a thread separate from the caller’s thread and proceeds concurrently with the caller’s execution • cache manager first accesses the requested virtual page • then queues an additional I/O request to retrieve additional data to a system worker thread • worker thread prereads additional data • preread pages are faulted into memory while the program continues executing • data already in memory when the caller requests it • FILE_FLAG_RANDOM_ACCESS flag for no predictable read pattern
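Read-ahead with history can be pictured as stride detection over the last two reads. The slides do not give the cache manager's actual matching rule, so the one below (the current read must continue the stride set by the previous two) is an assumption, and `predict_next_read` is a hypothetical name.

```python
from typing import List, Optional, Tuple

Read = Tuple[int, int]  # (file offset, length)


def predict_next_read(history: List[Read], offset: int, length: int) -> Optional[Read]:
    """Predict the next read from the last two reads plus the current one.

    Returns (offset, length) of the read to prefetch, or None when the three
    accesses do not form a constant stride (ASSUMED matching rule).
    """
    if len(history) < 2:
        return None
    (older_offset, _), (newer_offset, _) = history[-2:]
    stride = newer_offset - older_offset
    if stride == 0 or offset - newer_offset != stride:
        return None                      # no pattern detected
    next_offset = offset + stride
    if next_offset < 0:
        return None                      # backward stride ran off the file start
    return (next_offset, length)


# Reads of 4 KB at 0, 8 KB, and 16 KB suggest the next one starts at 24 KB.
print(predict_next_read([(0, 4096), (8192, 4096)], 16384, 4096))
```

Plain sequential access is the special case where the stride equals the read length; a caller that sets FILE_FLAG_RANDOM_ACCESS is telling the cache manager that no such pattern exists.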
Write Back Caching & Lazy Writing • lazy write means that data written to files is first stored in cache pages and then written to disk later • Write operations are accumulated and flushed to disk all at once • reduces overall number of disk I/O operations • memory manager flushes cache pages to disk • when the cache manager requests it • when demand for physical memory is high • decision when to flush the cache is important • Flushing too frequently slows system performance with unnecessary I/O • Flushing too rarely runs the risk of losing modified file data during a system failure
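The trade-off described above shows up even in a toy write-back cache. This is a sketch with hypothetical names; it models neither the real lazy writer's scheduling nor the memory manager's page lists, only the idea that repeated writes to a page cost one disk write per flush.

```python
class ToyWriteBackCache:
    """Writes land in cache pages; a later flush pushes dirty pages to 'disk'."""

    def __init__(self) -> None:
        self.pages = {}          # page number -> data held in the cache
        self.dirty = set()       # pages modified since the last flush
        self.disk = {}           # stand-in for the disk contents
        self.flush_count = 0     # number of flush passes that wrote something

    def write(self, page: int, data: bytes) -> None:
        """Store data in the cache and mark the page dirty; no disk I/O yet."""
        self.pages[page] = data
        self.dirty.add(page)

    def flush(self) -> int:
        """Write all dirty pages to disk at once; return how many were written."""
        if not self.dirty:
            return 0
        for page in sorted(self.dirty):
            self.disk[page] = self.pages[page]
        written = len(self.dirty)
        self.dirty.clear()
        self.flush_count += 1
        return written


cache = ToyWriteBackCache()
for value in (b"v1", b"v2", b"v3"):
    cache.write(1, value)        # three writes to the same page
cache.write(2, b"x")
print(cache.flush())             # two pages written, not four writes
```

Until `flush` runs, `disk` holds none of the new data, which is the window in which a system failure would lose modifications.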