uClinux vs Linux Context Switching and IPC Performance Comparison
2004.09.10
Heechul Yun
Digital Media R&D Center
Contents
• Introduction
• ARM9 Cache Architecture
• Benchmark
• IPC Performance
• Context Switching Performance
• Conclusion
• Future Work
Introduction
• Objective
  • Fair performance comparison of uClinux and Linux
  • Focused on context switching and IPC
• Context switch
  • Restore the registers and address space of the next process
• IPC (Inter-Process Communication)
  • Send and receive messages between processes
uClinux vs Linux
• Linux
  • Separate virtual address space for each process
  • The address space must be restored on every context switch
• uClinux
  • Single address space shared by all processes
  • No address space to restore on a context switch
uClinux may therefore perform better at context switching
ARM9 MMU Architecture
• Virtually-indexed caches: flush cache on context switch
  • Direct cost: 1k~18k cycles
  • Indirect cost: up to 54k cycles
  • Up to 270 µs at 200 MHz
• Observation: the cache flush can be avoided if there is no address overlap
[Figure: block diagram of CPU, I-Cache, D-Cache, TLB and Memory, with VA, PA, Perm and Data signals]
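
The cycle counts above translate into time as follows. This is a small worked calculation rather than anything from the original slides; the function name `cycles_to_microseconds` is hypothetical, and the only inputs are the cycle counts and the 200 MHz clock quoted on the slide.

```python
def cycles_to_microseconds(cycles: int, clock_mhz: float) -> float:
    """Convert a CPU cycle count to microseconds at the given clock rate."""
    # 1 MHz means one cycle per microsecond, so the division is direct.
    return cycles / clock_mhz


CLOCK_MHZ = 200           # clock rate quoted on the slide
DIRECT_CYCLES_MIN = 1_000   # direct flush cost, lower figure
DIRECT_CYCLES_MAX = 18_000  # direct flush cost, upper figure
INDIRECT_CYCLES_MAX = 54_000  # indirect cost (refilling the flushed cache)

if __name__ == "__main__":
    print(cycles_to_microseconds(DIRECT_CYCLES_MIN, CLOCK_MHZ))    # 5.0
    print(cycles_to_microseconds(DIRECT_CYCLES_MAX, CLOCK_MHZ))    # 90.0
    print(cycles_to_microseconds(INDIRECT_CYCLES_MAX, CLOCK_MHZ))  # 270.0
```

The 270 µs figure on the slide therefore matches the 54k-cycle indirect cost alone; the direct flush cost adds between 5 and 90 µs on top of it.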
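ARM926EJ-S Virtual Cache Architecture
• Fully virtual address based index & tag
• Separate 4-way set-associative 16K I & D caches
• 8 words per cache line
[Figure: the virtual address is split into Tag, Index, Word and Byte fields (bit labels in the figure: 32, 12, 11, 5, 4, 2, 1, 0); the index selects one of 128 sets, the tags of the 4 ways are compared (=), and a match signals a hit and returns the read data]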
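
The field widths in the figure follow from the three cache parameters on the slide. The sketch below derives them and splits a virtual address into its fields; it assumes 4-byte (32-bit) words, and the name `split_address` is hypothetical. It models only the address arithmetic, not the cache itself.

```python
CACHE_BYTES = 16 * 1024  # 16K per cache (from the slide)
WAYS = 4                 # 4-way set-associative (from the slide)
WORDS_PER_LINE = 8       # 8 words per cache line (from the slide)
BYTES_PER_WORD = 4       # assumption: 32-bit ARM words

LINE_BYTES = WORDS_PER_LINE * BYTES_PER_WORD   # 32 bytes per line
NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)  # 128 sets

# For a power of two n, (n - 1).bit_length() equals log2(n).
BYTE_BITS = (BYTES_PER_WORD - 1).bit_length()   # 2 -> bits [1:0]
WORD_BITS = (WORDS_PER_LINE - 1).bit_length()   # 3 -> bits [4:2]
INDEX_BITS = (NUM_SETS - 1).bit_length()        # 7 -> bits [11:5]
TAG_SHIFT = BYTE_BITS + WORD_BITS + INDEX_BITS  # 12 -> tag is bits [31:12]


def split_address(va: int) -> tuple[int, int, int, int]:
    """Split a 32-bit virtual address into (tag, index, word, byte)."""
    byte = va & (BYTES_PER_WORD - 1)
    word = (va >> BYTE_BITS) & (WORDS_PER_LINE - 1)
    index = (va >> (BYTE_BITS + WORD_BITS)) & (NUM_SETS - 1)
    tag = va >> TAG_SHIFT
    return tag, index, word, byte


if __name__ == "__main__":
    print(NUM_SETS)                # 128
    print(split_address(0x12345))  # (18, 26, 1, 1)
```

Because both index and tag come from the virtual address, two processes that use the same virtual address for different data would hit the same cache line, which is why a flush is needed when address spaces overlap.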
Benchmark
• Lmbench2 [LmBench’96]
  • Famous OS benchmark
  • Modified for uClinux (vfork, FIFO)
  • lat_ctx, lat_fifo and bw_pipe are used
[Figure: ‘lat_ctx’ FIFO architecture: Master, Child1 and Child2 exchange data by reading from and writing to FIFO 0, FIFO 1 and FIFO 2]
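
The token-passing idea behind the ‘lat_ctx’ figure can be sketched as follows. This is not lmbench's code. It is a minimal illustration that assumes a Linux host with fork(), and it uses anonymous pipes where the uClinux port described above uses vfork and named FIFOs. The name `ring_hop_time` and its parameters are hypothetical. The value it returns includes pipe read/write and interpreter overhead, not only the context switch, and on a multi-core host the processes may run on different CPUs, so treat it as a rough indication only.

```python
import os
import time


def ring_hop_time(num_procs: int = 3, rounds: int = 1000) -> float:
    """Pass a one-byte token around a ring of processes.

    Returns the average time in seconds for one hop (one write plus the
    matching read by the next process in the ring).
    """
    # pipes[i] carries the token from process i to process (i + 1) % num_procs.
    pipes = [os.pipe() for _ in range(num_procs)]

    def keep_only(read_fd: int, write_fd: int) -> None:
        # Close every pipe end this process does not use, so that EOF
        # propagates cleanly when the master shuts the ring down.
        for r, w in pipes:
            if r != read_fd:
                os.close(r)
            if w != write_fd:
                os.close(w)

    children = []
    for i in range(1, num_procs):
        pid = os.fork()
        if pid == 0:
            # Child i: read from pipe i-1, forward to pipe i.
            read_fd, write_fd = pipes[i - 1][0], pipes[i][1]
            keep_only(read_fd, write_fd)
            while True:
                token = os.read(read_fd, 1)
                if not token:  # EOF: the previous process closed its end
                    break
                os.write(write_fd, token)
            os._exit(0)
        children.append(pid)

    # Master (process 0): write to pipe 0, read from the last pipe.
    read_fd, write_fd = pipes[num_procs - 1][0], pipes[0][1]
    keep_only(read_fd, write_fd)

    start = time.perf_counter()
    for _ in range(rounds):
        os.write(write_fd, b"x")
        os.read(read_fd, 1)
    elapsed = time.perf_counter() - start

    os.close(write_fd)  # children see EOF one after another and exit
    for pid in children:
        os.waitpid(pid, 0)
    os.close(read_fd)
    return elapsed / (rounds * num_procs)


if __name__ == "__main__":
    print(f"{ring_hop_time() * 1e6:.1f} us per hop")
```

Each process blocks in `read` until its predecessor writes, so on a single CPU every hop forces the scheduler to switch to the next process in the ring, which is the effect the benchmark is designed to expose.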
Benchmark Setup
• The same H/W and the same benchmark program are used
• The only difference is the kernel (uClinux, Linux)
• App: LMBench2 (lat_fifo, lat_ctx, bw_pipe used)
• Kernel: Linux-2.6.7 / uClinux 2.6.7
• H/W: SMDK24A0
  • ARM926EJ-S based S3C24A0
  • 16K I&D Cache
Context Switching Performance
• 0 KB workload
  • Each process immediately switches to the next
  • Pure context-switch overhead comparison
Conclusion
• The first fair performance comparison between uClinux and Linux
  • Same H/W platform
  • Same benchmark S/W
• uClinux has better IPC and context switching performance
  • Because the cache remains valid across context switches
• Beneficial for
  • Real-time critical applications
  • IPC-oriented applications
Future Work (?)
• Extending benchmarks
  • Interrupt latency, …
• Share the results with the community
• Improving uClinux
  • Needs protection
    • Build on ‘Single Address Space Operating System’ research for 64-bit processors
  • More compatibility
    • No fork()
    • Fixed heap & stack size
Reference
• [FASS’03] Adam Wiggins et al. “Implementations of Fast Address-Space Switching and TLB Sharing on the StrongARM Processor”, in Proceedings of the 8th Asia-Pacific Computer Systems Architecture Conference, Aizu-Wakamatsu City, Japan, September 2003.
• [LmBench’96] McVoy, L., Staelin, C. “lmbench: Portable tools for performance analysis”. In: Proceedings of the 1996 USENIX Technical Conference, San Diego, CA, USA (1996)
• [UC] uClinux/ARM 2.6 Project. http://opensrc.sec.samsung.com/
• [24A0] Samsung S3C24A0 Product Datasheet. http://www.samsung.com/Products/Semiconductor/SystemLSI/MobileSolutions/MobileASSP/MobileComputing/S3C24A0/S3C24A0.htm
• [926] ARM926EJ-S Technical Reference Manual. http://www.arm.com/pdfs/DDI0198D_926_TRM.pdf
• [ARM] ARM Architecture Reference Manual. ARM LTD.