Measured Performance of Commodity Operating Systems

Work done at Harvard University • CSE-597A Presentation • V. N. Murali

Initial Remarks
• Any ideas?
• Windows for Workgroups: most expensive, due to frequent changes in machine mode and system call hooks.
• NT: more efficient than Windows for Workgroups; overhead is due to the microkernel-like design (the Win32 API is a user-level server).
• NetBSD: most efficient.

Components

Device Drivers
• Windows: VxDs (virtual device drivers), which are activated by hardware interrupts, system calls, and/or entry points.
• NetBSD: a /dev file is installed. Load a driver and access the counter through the device file; configured using ioctl().
• NT: the DDK is the API for driver development. Drivers are loaded into the NT executive.

Microbenchmarks
• Null: counter access latency
• Syscall: minimum system call latency
• Exec: latency to load and run a program
• Memory access time: access times for arrays of various sizes
• File system
• BitBlt: graphics subsystem
• Network

Application workloads
• Wish: command interpreter for the Tcl language (tests the GUI subsystem; CPU intensive, very little disk activity)
• Ghostscript
• WWW server
• Assumptions: single-user mode, little or no background activity

Metrics
• Cycle counts
• Instruction counts (would this be a correct metric on its own?)
• Data read/write references (similar arguments apply)
• Cache and TLB misses (I-cache, D-cache, ITLB, DTLB)
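The question about instruction counts can be made concrete with cycles per instruction (CPI), the derived metric used on the results slides. The sketch below is only illustrative: the `CounterSample` type and every number in it are made up for the example and are not measurements from the study.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Raw hardware-counter readings for one benchmark run (hypothetical type)."""
    cycles: int
    instructions: int
    data_refs: int
    dcache_misses: int

    def cpi(self) -> float:
        # Cycles per instruction: how much each instruction costs on average.
        return self.cycles / self.instructions

    def dcache_miss_rate(self) -> float:
        # Fraction of data references that missed in the data cache.
        return self.dcache_misses / self.data_refs


# Invented values: same cycle count, very different instruction counts.
few_slow = CounterSample(cycles=3000, instructions=1000, data_refs=400, dcache_misses=20)
many_fast = CounterSample(cycles=3000, instructions=2000, data_refs=400, dcache_misses=20)

if __name__ == "__main__":
    print(few_slow.cpi())   # 3.0
    print(many_fast.cpi())  # 1.5
```

Both samples take the same number of cycles, so ranking them by instruction count alone would suggest a difference that the elapsed time does not show.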

RESULTS: Null micro
• Least instruction count: Windows. Why?
• Maximum instruction count: Windows NT. Why?
• Higher CPI: Windows. Why?
• Highest cycle counts: NT. Why?
• Highest instruction cache misses: NT. Why?

RESULTS: Syscall micro
• Used a simple system call (dup) on UNIX and NT; Get_extended_error_info (16-bit Windows, int 21) and get_interrupt_vector (32-bit, int 21).
• Overhead for dup in NT and BSD is the same as in the Null benchmark. Similar I-cache miss behavior for NT.
• Most efficient in terms of cycle counts? The Windows 32-bit system call?

Contd.
• The Windows 16-bit call is very expensive. Why?
• Hooks? System calls can be intercepted in DOS; this is used for CD-ROM drivers, caching, etc.
• Most efficient: NetBSD and Windows 32-bit. Both NT and Windows 16-bit are expensive.
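A minimal sketch of the idea behind the Syscall microbenchmark: call dup in a loop and keep the minimum observed latency. It uses wall-clock nanoseconds from Python's `time.perf_counter_ns` rather than the hardware cycle counters used in the study, and the function name `min_dup_latency_ns` is an assumption made for this example. Interpreter overhead is included in every sample, so the result is only an upper bound on the real system call cost.

```python
import os
import time


def min_dup_latency_ns(iterations: int = 10000) -> int:
    """Return the smallest observed time, in nanoseconds, for one os.dup() call."""
    fd = os.open(os.devnull, os.O_RDONLY)
    best = None
    try:
        for _ in range(iterations):
            start = time.perf_counter_ns()
            new_fd = os.dup(fd)            # the system call being measured
            elapsed = time.perf_counter_ns() - start
            os.close(new_fd)               # cleanup is kept outside the timed region
            if best is None or elapsed < best:
                best = elapsed
    finally:
        os.close(fd)
    return best


if __name__ == "__main__":
    print("minimum dup latency (ns):", min_dup_latency_ns())
```

Taking the minimum rather than the mean filters out samples disturbed by interrupts or scheduling, which matches the "minimum system call latency" definition on the Microbenchmarks slide.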

RESULTS: Exec micro
• NetBSD uses a vfork() and exec() combination, while NT uses CreateProcess(). Windows has a shared address space, so nothing is created.
• Most efficient: NetBSD static. Why?
• Worst: NT dynamic. Why?

Contd.
• Windows overhead is higher than NetBSD static and lower than either OS's dynamic linkage.
• Maximum data reads and writes: NT. Why?
• Dynamic linkage incurs a lot of overhead.
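The Exec measurement (latency to load and run a program) can be approximated as follows. This is a sketch under stated assumptions: it uses Python's `subprocess.run`, which chooses its own process-creation mechanism on each platform, so it does not reproduce the vfork()/exec() or CreateProcess() paths exactly. The helper name `spawn_latency_s` and the choice of target program are made up for the example.

```python
import subprocess
import sys
import time


def spawn_latency_s(argv, runs: int = 5) -> float:
    """Return the best-of-N wall-clock time, in seconds, to start argv and wait for exit."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # A do-nothing Python process stands in for the benchmark's trivial program.
    print("spawn latency (s):", spawn_latency_s([sys.executable, "-c", "pass"]))
```

Pointing the same helper at a statically and a dynamically linked build of one trivial program would be one way to observe the linkage overhead the slide describes.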

RESULTS: Memory access time
• Repeated references to arrays of various sizes using a stride of 128 bytes.
• 8K on-chip cache and 256K on-board cache.
• NT uses a deterministic page mapping policy => similar performance up to 256K and smooth degradation afterwards.
• NetBSD uses a non-deterministic page mapping policy, hence poor performance for sizes above 8K.

Contd.
• NT's file system buffer cache is integrated with the VM system. This rings a bell, doesn't it?
• The 64K segment size in Windows limits performance.
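The access pattern of the memory benchmark (repeated references to arrays of various sizes at a 128-byte stride) looks roughly like the sketch below. Python is used only to show the loop structure: interpreter overhead dominates each reference, so this version will not expose the 8K and 256K cache boundaries the way a compiled benchmark reading cycle counters would. The names `STRIDE` and `ns_per_reference` are assumptions for the example.

```python
import time

STRIDE = 128  # bytes between consecutive references, as on the slide


def ns_per_reference(size: int, passes: int = 20) -> float:
    """Average nanoseconds per strided reference over a buffer of `size` bytes."""
    buf = bytearray(size)
    offsets = range(0, size, STRIDE)
    total = 0
    start = time.perf_counter_ns()
    for _ in range(passes):          # repeat so the array is referenced many times
        for i in offsets:
            total += buf[i]          # one reference every STRIDE bytes
    elapsed = time.perf_counter_ns() - start
    return elapsed / (passes * len(offsets))


if __name__ == "__main__":
    # Sizes chosen to sit below, at, and above the 8K and 256K cache sizes.
    for kib in (4, 8, 64, 256, 1024):
        print(f"{kib:5d}K  {ns_per_reference(kib * 1024):.1f} ns/ref")
```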

RESULTS: File system micro
• Three activities: a) hit in disk cache, b) access to small files, c) creation.
• Tested NTFS and FAT32 for Windows NT; FAT 16, 32 for Windows; FFS for NetBSD.
• Least overhead: NetBSD
• Most overhead: NT/NTFS, FAT 16

Contd.
• Metadata updates: NTFS performs the best because it logs them, while FFS performs poorly because of synchronous writes.
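Two of the file system activities listed above (creation and access to small files) can be timed with a sketch like this one. It is an illustration under assumptions, not the benchmark from the study: the function name `time_small_files`, the file count, and the file size are all invented, and the reads will normally be served from the OS cache because the files were just written.

```python
import os
import tempfile
import time


def time_small_files(count: int = 100, size: int = 1024):
    """Create `count` files of `size` bytes, read them back, return (create_s, read_s, bytes_read)."""
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as root:
        paths = [os.path.join(root, f"f{i}") for i in range(count)]

        start = time.perf_counter()
        for path in paths:
            with open(path, "wb") as fh:   # creation implies a metadata update
                fh.write(payload)
        create_s = time.perf_counter() - start

        start = time.perf_counter()
        bytes_read = 0
        for path in paths:
            with open(path, "rb") as fh:
                bytes_read += len(fh.read())
        read_s = time.perf_counter() - start

    return create_s, read_s, bytes_read


if __name__ == "__main__":
    create_s, read_s, bytes_read = time_small_files()
    print(f"create: {create_s:.4f}s  read: {read_s:.4f}s  bytes: {bytes_read}")
```

Creation cost depends heavily on how the file system handles metadata, which is where the logging versus synchronous-write contrast on the slide would show up.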

RESULTS: Graphics micro
• Display an array of pixels repeatedly.
• Worst performance: NetBSD. Why?

RESULTS: Network throughput
• Worse code locality for NT => instruction cache and TLB misses.
• However, throughput is comparable to BSD. The limitation is due to the Ethernet.

Summary
• Frequent CPU mode changes in Windows: expensive
• 64K segment size is a limitation
• Higher instruction counts and cache misses for NT. Why?
• Efficient graphics and relaxed file system semantics in NT

Application workloads
• Wish: overhead includes context switches and IPC. Highest overhead in Windows. Why?
• NT is worse than BSD. Why?
• Ghostscript: best performance by NT and Windows.
• Web server: best results for NetBSD and Windows, worst results for NT. NT is believed to have a very poor network implementation.
