Toolbox for Dimensioning Windows Storage Systems
Jalil Boukhobza, Claude Timsit
PRiSM laboratory, Versailles Saint Quentin University
jalil.boukhobza@prism.uvsq.fr
12/09/2006
Outline
• Introduction
• Overview of the Windows I/O subsystem architecture
• The developed tools
  • I/O benchmarking
  • Storage parameter extraction
  • I/O simulation
• Summary
PRiSM Lab/ University of Versailles
Introduction
• The Windows I/O system is poorly studied
  • CreateFile() offers many file access modes with different caching algorithms, which leads to big performance fluctuations for a given workload (ratio of 2 to 10).
• Disk subsystems are built independently from the OS
  • Interactions with the OS are not easily predictable
What is the performance of a given workload on a given system architecture for a defined I/O strategy on Windows systems, and how can it be optimized?
Overview of the Windows I/O system architecture
• Different file access modes in the CreateFile() function:
  • Without using the file system cache: no buffer mode (FILE_FLAG_NO_BUFFERING)
  • Using the file system cache: sequential, normal, and write through modes (FILE_FLAG_SEQUENTIAL_SCAN, FILE_ATTRIBUTE_NORMAL, FILE_FLAG_WRITE_THROUGH)
[Figure: path of an I/O request through the Windows I/O subsystem. Components shown: FastIO, File System Driver, Cache Manager, Virtual Memory Manager (page miss), Storage Device Driver, Storage Device]
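The access modes listed above correspond to bits in the dwFlagsAndAttributes argument of CreateFile(). The sketch below only illustrates how a mode name could be turned into that argument; the flag values are the documented Win32 constants, while the mode names, the ACCESS_MODES table, and the two helper functions are hypothetical and not part of the toolbox.

```python
# Documented Win32 constants for the dwFlagsAndAttributes argument of CreateFile().
FILE_ATTRIBUTE_NORMAL = 0x00000080
FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000
FILE_FLAG_NO_BUFFERING = 0x20000000
FILE_FLAG_WRITE_THROUGH = 0x80000000

# Hypothetical mode names, chosen here to mirror the wording of the slide.
ACCESS_MODES = {
    "normal": FILE_ATTRIBUTE_NORMAL,
    "sequential": FILE_FLAG_SEQUENTIAL_SCAN,
    "no_buffer": FILE_FLAG_NO_BUFFERING,
    "write_through": FILE_FLAG_WRITE_THROUGH,
}


def flags_for(*modes):
    """OR together the flags of the requested access modes."""
    flags = 0
    for mode in modes:
        flags |= ACCESS_MODES[mode]
    return flags


def uses_file_system_cache(flags):
    """Only the no buffer mode bypasses the file system cache."""
    return not (flags & FILE_FLAG_NO_BUFFERING)


if __name__ == "__main__":
    for name in ACCESS_MODES:
        f = flags_for(name)
        print(f"{name:14s} 0x{f:08X} cached={uses_file_system_cache(f)}")
```

On a real Windows system the resulting value would be passed to CreateFile(); note that FILE_FLAG_NO_BUFFERING additionally requires sector-aligned offsets and request sizes.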
Developed tools
1. I/O benchmarking
• All Windows file access modes of Win32 CreateFile():
  • Normal, sequential, random, no buffer, write through
• Request sizes
• Sequential / random / interleaved accesses
• Flexible test file selection (zoning)
• Control over test file fragmentation (file defragmenter and mover)
• Results:
  • I/O throughputs
  • Response times
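As a rough illustration of what such a benchmark measures, the sketch below times sequential or random reads of a fixed request size and reports throughput and per-request response times. It is not the authors' tool: it uses portable Python file I/O, so it always goes through the operating system's default caching and cannot select the Windows access modes. The function names and parameters are assumptions made for this example.

```python
import os
import random
import tempfile
import time


def make_test_file(size_bytes):
    """Create a temporary test file filled with random data and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def read_benchmark(path, request_size, pattern="sequential", requests=64, seed=0):
    """Issue `requests` reads of `request_size` bytes and time each one."""
    slots = os.path.getsize(path) // request_size
    if slots == 0:
        raise ValueError("request size is larger than the test file")
    if pattern == "sequential":
        offsets = [(i % slots) * request_size for i in range(requests)]
    elif pattern == "random":
        rng = random.Random(seed)
        offsets = [rng.randrange(slots) * request_size for _ in range(requests)]
    else:
        raise ValueError("pattern must be 'sequential' or 'random'")

    response_times = []
    total_bytes = 0
    # buffering=0 disables Python's own buffer; the OS cache is still used.
    with open(path, "rb", buffering=0) as f:
        for offset in offsets:
            start = time.perf_counter()
            f.seek(offset)
            data = f.read(request_size)
            response_times.append(time.perf_counter() - start)
            total_bytes += len(data)

    elapsed = sum(response_times)
    throughput = total_bytes / elapsed if elapsed > 0 else float("inf")
    return {"bytes": total_bytes, "throughput": throughput,
            "response_times": response_times}


if __name__ == "__main__":
    test_file = make_test_file(1 << 20)
    try:
        for pattern in ("sequential", "random"):
            result = read_benchmark(test_file, 64 * 1024, pattern, requests=16)
            print(pattern, f"{result['throughput'] / 1e6:.1f} MB/s")
    finally:
        os.remove(test_file)
```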
2. Storage parameter extraction
• Configuration parameters: partly provided by manufacturers (zoning information, cache segment size and number)
• Performance parameters: measured to discover the real application performance, which may differ from the advertised peak performance (seek times, memory-to-memory and disk-cache-to-memory throughput, etc.)
• Disk cache algorithms: the tool helps users identify these algorithms (e.g. read ahead and lazy write)
Example
• Cache segment size
  • Read a block of size T from the disk
  • Re-read that block: if it is entirely loaded from the disk cache, then segment size ≥ T and T is incremented; otherwise T is decremented
  • Empty the cache
• Seek times
  • Study the periodicity to find the track size
  • [Figure: per-request response time]
• Disk cache updating algorithms
  • Generally simple algorithms (LRU, FIFO, LFU, etc.) that can be tested once the segment size is known, by issuing different sequences of block reads and then re-reading the blocks to see which one is accessed from the disk (and so has been ejected from the cache).
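The segment size search can be sketched against a toy model. Everything below is an assumption made for illustration: SimulatedDisk stands in for real hardware with a single cache segment, and the cache hit is returned directly, whereas on a real disk it would have to be inferred from the response time of the re-read.

```python
class SimulatedDisk:
    """Toy disk with a single cache segment (hypothetical model, sizes in KB)."""

    def __init__(self, segment_size_kb):
        self._segment_size_kb = segment_size_kb
        self._cached_kb = 0

    def flush(self):
        """Empty the disk cache."""
        self._cached_kb = 0

    def read(self, size_kb):
        """Read size_kb from a fixed offset; return True if served from the cache."""
        hit = size_kb <= self._cached_kb
        if not hit:
            # The segment can hold at most segment_size_kb of the block just read.
            self._cached_kb = min(size_kb, self._segment_size_kb)
        return hit


def probe(disk, size_kb):
    """Empty the cache, read a block, then report whether the re-read hits the cache."""
    disk.flush()
    disk.read(size_kb)         # first read comes from the platters
    return disk.read(size_kb)  # re-read: True only if the whole block was cached


def find_segment_size(disk, start_kb, step_kb=1, limit_kb=1 << 20):
    """Increment T while the re-read hits the cache, decrement it while it misses."""
    t = start_kb
    if probe(disk, t):
        while t + step_kb <= limit_kb and probe(disk, t + step_kb):
            t += step_kb
        return t
    while t > step_kb:
        t -= step_kb
        if probe(disk, t):
            return t
    return 0


if __name__ == "__main__":
    print(find_segment_size(SimulatedDisk(256), start_kb=64, step_kb=4))
```

The result is the largest probed T that was fully cached, so it is a lower bound on the segment size with a resolution of step_kb.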
3. The I/O simulation tool (WinIOSim)
• Goals:
  • Application optimization: identifying the best I/O strategy for an application I/O workload on a given architecture.
  • Hardware optimization: finding the optimal hardware configuration for a given I/O workload.
• What's new?
  • Implementation of Windows-specific cache algorithms that depend on the access mode, identified through reverse engineering.
  • Specific sequences of I/O requests issued by the system and application processes for each access mode
  • Disk subsystem reactions to these algorithms
  • Specific reactions for specific sequences (issued by the file system cache depending on the disk
The WinIOSim architecture
The WinIOSim modules
• I/O generators:
  • Workload generator
    • Request types, sizes, number, inter-arrival times, access modes, requested addresses, etc.
    • Different possible distributions for each parameter (Poisson, uniform, exponential, etc.) thanks to OMNeT++.
    • Implemented request criticality (synchronous and asynchronous requests).
  • Trace files extracted using Filemon.
• Process memory and file system cache modules: simulating the data copy operations, updating policies, etc.
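A synthetic workload generator of this kind can be sketched as follows. This is not the WinIOSim generator, which relies on OMNeT++; here Python's random module is swapped in, and the Request fields, parameter names, and default ratios are all assumptions. Exponential inter-arrival times yield a Poisson arrival process, and offsets are drawn uniformly.

```python
import random
from dataclasses import dataclass


@dataclass
class Request:
    arrival_time: float  # seconds since the start of the workload
    kind: str            # "read" or "write"
    offset: int          # requested address in bytes
    size: int            # request size in bytes
    synchronous: bool    # request criticality


def generate_workload(n, mean_interarrival_s, request_sizes, file_size,
                      read_ratio=0.7, sync_ratio=0.5, seed=0):
    """Generate n requests with exponential inter-arrival times and uniform offsets."""
    if max(request_sizes) > file_size:
        raise ValueError("request sizes must fit in the file")
    rng = random.Random(seed)
    now = 0.0
    requests = []
    for _ in range(n):
        now += rng.expovariate(1.0 / mean_interarrival_s)
        size = rng.choice(request_sizes)
        requests.append(Request(
            arrival_time=now,
            kind="read" if rng.random() < read_ratio else "write",
            offset=rng.randrange(0, file_size - size + 1),
            size=size,
            synchronous=rng.random() < sync_ratio,
        ))
    return requests


if __name__ == "__main__":
    for req in generate_workload(5, 0.010, [4096, 65536], 1 << 20):
        print(req)
```

Fixing the seed makes a generated workload reproducible, which is convenient when the same request stream has to be replayed against several simulated configurations.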
The WinIOSim modules (2)
• Application and system processes:
  • Request flow control.
  • Request grouping and splitting.
  • File system cache prefetching algorithms, lazy write and write through algorithms depending on access modes.
  • Both communicate to issue the final request sequence (as seen by the disk).
• Different buses: simulating bus throughput, delays, sharing.
The WinIOSim modules (3)
• I/O scheduler: controls the flow of requests to the disk subsystem
  • Queuing system: FIFO, SCAN, LOOK.
• Disk
  • Mapping, zoning, spare area, number of platters, seek times, rotational speed, head switching times, track and cylinder skew, etc.
• Disk cache
  • Segmentation, read ahead algorithms, lazy write and write through algorithms, cache updating policies, etc.
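The three queuing disciplines can be compared with a short sketch. These are the textbook definitions of FIFO, SCAN, and LOOK applied to cylinder numbers, not the simulator's code; the function names and the example queue are chosen for illustration.

```python
def fifo_order(head, queue):
    """FIFO: serve requests in arrival order."""
    return list(queue)


def look_order(head, queue, direction=1):
    """LOOK: sweep in one direction up to the last pending request, then reverse."""
    up = sorted(c for c in queue if c >= head)
    down = sorted((c for c in queue if c < head), reverse=True)
    return up + down if direction > 0 else down + up


def head_travel(head, order):
    """Total number of cylinders crossed when serving `order` from `head`."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


def scan_travel_upward(head, queue, max_cylinder):
    """SCAN sweeping upward: same service order as LOOK, but the head goes on
    to the last cylinder before reversing."""
    if not queue:
        return 0
    below = [c for c in queue if c < head]
    travel = max_cylinder - head
    if below:
        travel += max_cylinder - min(below)
    return travel


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]
    head = 53
    print("FIFO", head_travel(head, fifo_order(head, queue)))
    print("LOOK", head_travel(head, look_order(head, queue)))
    print("SCAN", scan_travel_upward(head, queue, max_cylinder=199))
```

For this queue, with the head at cylinder 53 on a 200-cylinder disk, FIFO crosses 640 cylinders, upward SCAN 331, and LOOK 299.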
Simulator's file system cache strategies
• Windows prefetching algorithms: No buffer mode
• Read operations:
  • Sequential mode: loading data sequentially Bn, Bn+1, Bn+2, Bn+3, etc.
  • Normal mode (default)
[Figure: blocks B1, B2, B3, B4 requested by the system process, where one requested block is 3 blocks of 64KB; the legend distinguishes 64KB blocks loaded by the system process from 64KB blocks loaded by the application process]
The final sequence of request blocks is: B1,1, B1,2, B1,3, B3,1, B2,1, B3,2, B2,2, B3,3, B2,3, B4,1, B4,2, ...
What are the disk cache reactions? Will it load a part of these data?
Simulator's file system cache strategies (2)
• Write operations:
  • Sequential and normal modes: for one request, some blocks are flushed to the disk and the others are written to the file system cache (and flushed to the disk later on).
  • Write through mode
    • Each written block -> file system cache -> disk cache -> disk + modification of a system file on the disk -> acknowledge.
Example with a 320KB request size:
[Figure: requests Req 1 to Req 8 sent to the disk, followed by a file system cache flush; the legend distinguishes 64KB blocks copied to the disk from 64KB blocks copied to the file system cache and flushed later on to the disk]
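The 320KB example relies on a request being split into 64KB blocks. The sketch below shows only that splitting step; which blocks go straight to the disk and which stay in the file system cache was given by the figure and is not modeled here. The function name and the alignment of blocks on 64KB boundaries are assumptions.

```python
BLOCK_SIZE = 64 * 1024  # 64KB block size used in the slides


def split_request(offset, size, block_size=BLOCK_SIZE):
    """Split a request into (offset, length) pieces aligned on block boundaries.

    Alignment on block_size boundaries is an assumption of this sketch.
    """
    pieces = []
    end = offset + size
    while offset < end:
        # Stop at the next block boundary or at the end of the request.
        length = min(block_size - offset % block_size, end - offset)
        pieces.append((offset, length))
        offset += length
    return pieces


if __name__ == "__main__":
    for piece in split_request(0, 320 * 1024):
        print(piece)
```

An aligned 320KB request yields five 64KB blocks; an unaligned request produces shorter first and last pieces.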
Configuration of the simulator
• Inputs
  • I/O generator configuration
    • Modeled by the user
    • Real I/O traces
  • The simulated architecture definition
    • If existing:
      • Obtained from manufacturers (rarely complete)
      • Obtained using the WIOTester parameter extraction tool we developed.
• Outputs
  • Response times and throughputs (2 main metrics for I/Os)
  • The different states of all the modules at each stage of the simulation
Validation of the simulator
• Measurements (SQLio & WIOTester) vs. simulation (WinIOSim)
• For read operations:
  • Sequential access with:
    • "no buffer" mode
    • "normal" mode
  • Random access with:
    • "no buffer" mode
    • "normal" mode
• For write operations:
  • "Normal" mode
  • "No buffer" mode
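Comparing measurements with simulation comes down to a relative error per test case. The sketch below shows one plausible way to compute it; the function names, the tolerance parameter, and the numbers in the usage example are hypothetical and are not results from the study.

```python
def relative_error(measured, simulated):
    """Relative error of the simulated value with respect to the measured one."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return abs(simulated - measured) / abs(measured)


def compare(cases, tolerance):
    """Map each test case name to (relative error, within tolerance?).

    `cases` maps a name to a (measured, simulated) pair, e.g. throughputs.
    """
    report = {}
    for name, (measured, simulated) in cases.items():
        err = relative_error(measured, simulated)
        report[name] = (err, err <= tolerance)
    return report


if __name__ == "__main__":
    # Made-up throughput values in MB/s, for illustration only.
    hypothetical = {
        "sequential read, no buffer": (50.0, 47.5),
        "random read, normal": (4.0, 4.6),
    }
    for name, (err, ok) in compare(hypothetical, tolerance=0.10).items():
        print(f"{name}: {err:.1%} {'ok' if ok else 'out of tolerance'}")
```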
Tested architectures
Validation results
Summary
• Efficient tool for Windows I/O system performance prediction and optimization, based on the complementarity of measurements and simulations.
  • Flexible and dedicated I/O benchmarking tool.
  • I/O parameter extraction tool.
  • Very accurate and flexible simulations of the whole Windows I/O system, from application to disk (<10% error).
• Simulation of the interactions between the modules, for example between the file system cache and the disk cache.
Thank you! Questions?
www.prism.uvsq.fr/~jboukh
jalil.boukhobza@prism.uvsq.fr