Online Memory Compression for Embedded Systems
Lei Yang & Robert P. Dick, Northwestern University
Haris Lekatsas & Srimat Chakradhar, NEC Laboratories America
Presented by Adam Schindelar
ACM Transactions on Embedded Computing Systems, Vol. 9, No. 3

Objective
• Increase memory capacity in embedded systems using compression
• No changes to hardware or application design
• Maintain high performance
• Keep energy consumption low

Motivation
• Embedded system for secure network transactions
• Memory requirements overran the initial estimate
• Two ways to solve the problem:
• 1) Redesign the hardware, increasing time-to-market and cost, OR…
• 2) Memory compression

CRAMES
• “Compressed RAM for Embedded Systems”
• Software-based RAM compression technique
• Implemented as a loadable Linux kernel module
• Evaluated on a PDA

Related Work
• Code compression: often hardware-based (IBM MXT)
• Data compression: often software-based, to reduce disk I/O; two types: compressed caching and swap compression
• Filesystem compression: “Cramfs”, a read-only compressed Linux filesystem

Previous approaches
• The few compression schemes for embedded systems:
• Cannot handle dynamic data memory
• Require redesign plus special-purpose hardware
• Suffer from poor performance and high energy consumption

So, why CRAMES?
• Handles online data memory compression and in-RAM filesystem compression
• Requires no special hardware and no redesign
• No change to applications
• Low overhead (compression algorithm, memory allocation)
• Designed specifically for embedded systems

Design: Virtual Memory Swapping
• Uses swapping to decide which pages to (de)compress
• Compressed pages are swapped out to a “compressed RAM device”
• RAM is divided into two parts: an uncompressed area and a compressed area
• Compressed areas are stored in a linked list
• When the device cannot handle a new write request, it requests more memory from the kernel
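
The swap-out and swap-in path described on this slide can be sketched in user-space Python. This is only an illustrative model, not the CRAMES kernel module: the class name `CompressedRamDevice`, the 4KB page size, the 16KB growth increment, and the use of `zlib` are all assumptions made for the example.

```python
import zlib

PAGE_SIZE = 4096          # assumed page size
CHUNK_SIZE = 16 * 1024    # hypothetical growth increment requested from the "kernel"


class CompressedRamDevice:
    """Toy model of a compressed swap device that grows on demand."""

    def __init__(self):
        self.capacity = 0      # bytes currently granted to the device
        self.used = 0          # bytes occupied by compressed pages
        self.slots = {}        # swap slot number -> compressed bytes

    def write_page(self, slot, page):
        """Swap a page out: compress it and store it under its slot number."""
        assert len(page) == PAGE_SIZE
        blob = zlib.compress(page)
        old = len(self.slots.get(slot, b""))
        needed = self.used - old + len(blob)
        # Ask for more memory only when the current area cannot hold the write.
        while needed > self.capacity:
            self.capacity += CHUNK_SIZE
        self.slots[slot] = blob
        self.used = needed

    def read_page(self, slot):
        """Swap a page in: decompress the stored bytes."""
        return zlib.decompress(self.slots[slot])
```

Writing a page and reading it back returns the original bytes, and the device's capacity only grows when a write no longer fits, which mirrors the "request more memory from the kernel" step in spirit.
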

Design: Block-based Data Compression
• CRAMES performs compression at the page level
• Compression ratio = compressed size / original size; lower is better
• Existing compression algorithms were compared
• Test input: a 64MB swap data file from a workstation, later divided into uniform-sized blocks
• LZO came out on top
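
As a rough illustration of the compression-ratio metric and of block-based compression, the sketch below compresses a buffer one block at a time and reports compressed size over original size. It uses Python's standard `zlib` as a stand-in because LZO is not in the standard library, and the sample data is synthetic rather than the 64MB swap file from the study; the function name `compression_ratio` is chosen for this example.

```python
import zlib


def compression_ratio(data: bytes, block_size: int) -> float:
    """Compressed size / original size when each block is compressed on its own."""
    compressed = 0
    for start in range(0, len(data), block_size):
        compressed += len(zlib.compress(data[start:start + block_size]))
    return compressed / len(data)


# Synthetic, repetitive stand-in for swap data (not the 64MB file from the study).
sample = b"".join(b"record %06d: status=ok\n" % i for i in range(20000))

for size in (1024, 4096, 16384):
    print(size, round(compression_ratio(sample, size), 3))
```

Dictionary-based compressors tend to do better on larger blocks, so a loop like this is one way to see how block size trades off against ratio for a given data set.
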

Design: Kernel Memory Allocation
• Needs to solve the following problems:
• Efficiently allocating or locating a compressed page in the swap device
• Mapping between the virtual locations of uncompressed pages and their actual locations in the compressed area
• Maintaining a linked list of free slots that are merged when appropriate
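
The three problems above can be sketched together in a small first-fit allocator. This is a simplified model under stated assumptions: the class `SlotAllocator` and its methods are hypothetical names, a sorted Python list stands in for the linked list of free slots, and a dictionary stands in for the mapping table.

```python
class SlotAllocator:
    """First-fit allocator over a compressed area, with merging of free neighbors."""

    def __init__(self, size):
        self.free = [(0, size)]   # sorted list of (offset, length) free slots
        self.table = {}           # page id -> (offset, length) in the compressed area

    def store(self, page_id, length):
        """Place a compressed page of `length` bytes; return its offset or None."""
        for i, (off, free_len) in enumerate(self.free):
            if free_len >= length:
                if free_len == length:
                    del self.free[i]
                else:
                    self.free[i] = (off + length, free_len - length)
                self.table[page_id] = (off, length)
                return off
        return None  # no slot large enough; a real device would request more memory

    def release(self, page_id):
        """Free a page's slot and merge it with adjacent free slots."""
        off, length = self.table.pop(page_id)
        self.free.append((off, length))
        self.free.sort()
        merged = [self.free[0]]
        for o, l in self.free[1:]:
            last_o, last_l = merged[-1]
            if last_o + last_l == o:
                merged[-1] = (last_o, last_l + l)
            else:
                merged.append((o, l))
        self.free = merged
```

Merging adjacent free slots on release keeps the free list short and lets a later, larger compressed page reuse space that several smaller pages once occupied.
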

Memory allocators
CRAMES builds upon work on the kernel memory allocation (KMA) problem. Memory allocators implemented:
• Resource Map Allocator
• Simple Power-of-Two Freelists
• McKusick-Karels Allocator
• Buddy System
• Lazy Buddy Algorithm
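
To give a feel for one of the listed schemes, the sketch below shows the size-class rounding at the heart of a simple power-of-two freelist allocator, and the internal fragmentation it implies for variable-sized compressed pages. The function names and the 32-byte minimum block are assumptions for this example, and the per-block header that real allocators of this kind usually keep is left out.

```python
def size_class(request: int, min_size: int = 32) -> int:
    """Round a request up to the next power-of-two block size (at least min_size)."""
    if request <= 0:
        raise ValueError("request must be positive")
    size = min_size
    while size < request:
        size *= 2
    return size


def internal_fragmentation(request: int) -> float:
    """Fraction of the allocated block that the request leaves unused."""
    block = size_class(request)
    return (block - request) / block
```

For instance, a 1536-byte compressed page lands in a 2048-byte block and wastes a quarter of it, which is the kind of cost that motivates comparing several allocators.
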

CRAMES with the Filesystem
If the 12MB used for system storage achieved a 60% compression ratio AND the 8MB device for swapping achieved a 50% compression ratio…
Swap: 8MB / 0.5 = 16MB
Storage: 12MB / 0.6 = 20MB
16MB + 20MB = 36MB
36MB + 10MB + 2MB = 48MB
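
The arithmetic on this slide can be checked with a few lines of Python. The helper name `effective_size_mb` is made up for the example, and the 10MB and 2MB terms are taken from the slide as given, since the slide does not say what they represent.

```python
def effective_size_mb(physical_mb: float, compression_ratio: float) -> float:
    """Uncompressed data that fits in a region, given compressed/original ratio."""
    return physical_mb / compression_ratio


swap_mb = effective_size_mb(8, 0.5)       # 16 MB
storage_mb = effective_size_mb(12, 0.6)   # 20 MB
other_mb = 10 + 2                         # remaining figures from the slide, taken as given
total_mb = swap_mb + storage_mb + other_mb
print(round(total_mb, 1))                 # 48.0
```

Dividing by the ratio works because the ratio is defined as compressed size over original size, so a region of fixed physical size holds 1/ratio times as much uncompressed data.
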

Implementation
• CRAMES must register with the kernel
• CRAMES mapping table
• Request handling designed for memory efficiency

Evaluation
• Performance and power consumption of applications running on a Sharp Zaurus SL-5600 PDA with 32MB RAM
• With and without CRAMES
• For the filesystem on the Zaurus
• For swapping on the Zaurus: with the original RAM size and with constrained memory size

Filesystem experiments
CRAMES was used to create a compressed RAM device for an EXT2 filesystem on the Zaurus. It reduced the RAM requirement with little penalty.
Average compression ratio: 63%

Swapping – Original memory
• A variety of applications were used in the evaluation
• Special software was written to monitor user input
• It stores input and timing information in a file
• This file can later be replayed to simulate identical user input

Swapping
• Applications can be grouped as follows:
• Small working datasets: adpcm, mpeg2, jpeg, Hancom Word, Hancom Sheet, Calculator
• Working datasets nearly as large as physical memory (still barely able to run without CRAMES): 500 x 500 matrix, Opera, Primtest, Quasar
• Working datasets too large to fit into physical memory: Opera + Quasar + large matrix + Media Player

Swapping – reduced memory
• Memory size was artificially constrained using a simple kernel module that reserved memory and did not take part in swapping
• Applications from MediaBench and one matrix multiplication application; no GUI applications and no playback system
• No significant latency penalty, with or without compression, when RAM is above 24MB

Without compression, the kernel must rely on page reclamation from buffer caches to get enough memory to run the applications. This takes longer and hurts both performance and energy consumption.

Interesting… 512 x 512 matrix multiplication
Cannot run with 20-21MB without CRAMES. Power consumption and energy consumption actually improve, because CRAMES starts compressing as soon as the available free memory becomes dangerously low (the kernel then has to do less work).

Conclusions
• CRAMES is capable of doubling the amount of memory available to applications, with small performance and energy consumption penalties.
• With RAM reduced to as low as 20MB, all benchmarks execute with an average 9.5% increase in execution time.
• Without CRAMES (and with reduced RAM), benchmarks become unstable or suffer extreme performance issues.
• In experiments with the EXT2 filesystem, CRAMES increased available storage by at least 40%.

Questions?