Efficient and Flexible Architectural Support for Dynamic Monitoring YUANYUAN ZHOU, PIN ZHOU, FENG QIN, WEI LIU, & JOSEP TORRELLAS UIUC
Outline • Background • iWatcher Functionality • iWatcher Design • Performance • Conclusion
Static or Dynamic Monitoring? • Static Monitoring • Needs annotations and programmer effort • Difficult for unsafe languages (C, C++) • Dynamic Monitoring • Large instrumentation cost • Significant slowdown • Dynamic monitoring is stronger than static monitoring because it is based on the actual execution path
Code- or Location-Controlled Dynamic Monitoring? • Code-Controlled Monitoring • Monitoring is performed by special instructions • Assertions and dynamic checkers belong here • No hardware support needed • Location-Controlled Monitoring • Monitoring is performed only when the program accesses a watched memory location, in any way • Hardware support is usually required • iWatcher and hardware-assisted watchpoints belong here
iWatcher Functionality • Flexible and low-overhead dynamic monitoring • With hardware support • Without expensive exceptions • The program has its own internal light-weight exception handler, the monitoring function • When a watched memory address is accessed, the monitoring function is automatically executed.
iWatcher Functionality (cont) • If the check performed by the monitoring function fails, one of three reactions follows: • Report: simply report the error (non-interactive) • Break: raise a hardware exception, switching control to the debugger • Rollback: revert to a safe checkpoint • More than one monitor may be watching the same address.
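The three reactions above can be modelled in software as a small dispatch on the outcome of the check. This is a minimal sketch, not the hardware mechanism: the names `ReactMode`, `Action`, and `react` are hypothetical, and the debugger and checkpoint steps are represented only by return codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names for the three reaction modes listed on the slide. */
typedef enum { REPORT_MODE, BREAK_MODE, ROLLBACK_MODE } ReactMode;

/* What the (simulated) processor does after the monitoring function returns. */
typedef enum {
    ACT_CONTINUE,           /* check passed, or the failure was only reported */
    ACT_ENTER_DEBUGGER,     /* stands in for raising a hardware exception */
    ACT_RESTORE_CHECKPOINT  /* stands in for rolling back to a safe checkpoint */
} Action;

Action react(ReactMode mode, bool check_passed)
{
    if (check_passed)
        return ACT_CONTINUE;

    switch (mode) {
    case REPORT_MODE:
        fprintf(stderr, "iWatcher: monitoring check failed\n");
        return ACT_CONTINUE;
    case BREAK_MODE:
        return ACT_ENTER_DEBUGGER;
    case ROLLBACK_MODE:
        return ACT_RESTORE_CHECKPOINT;
    }
    assert(0 && "unknown ReactMode");
    return ACT_CONTINUE;
}
```

A passing check never changes control flow; only a failing check consults the mode chosen when the watch was installed.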
iWatcher – Software Level

    bool MonitorX(int *x, int value);   /* prototype, so &MonitorX is valid below */

    int x, *p;   /* assume invariant: x = 1 */
    iWatcherOn(&x, sizeof(int), READWRITE, BreakMode, &MonitorX, &x, 1);
    ...
    p = foo();       /* a bug: p points to x incorrectly */
    *p = 5;          /* line A: a triggering access */
    z = Array[x];    /* line B: a triggering access */
    ...
    iWatcherOff(&x, sizeof(int), READWRITE, &MonitorX);

    /* Monitoring function: returns true while the invariant holds. */
    bool MonitorX(int *x, int value) {
        return (*x == value);
    }
How to monitor a location? • When iWatcherOn() is called • Add the monitoring function to the (software) CheckTable • If size < LargeRegion → all words are transferred to the L2 cache and tagged as watched • Update L1 if necessary • If size > LargeRegion → the entire region is recorded in the Range Watch Table (RWT) • If the RWT is full, proceed as if size < LargeRegion
How to monitor a location? (cont) • If a word is evicted from L2, store its watch bits (if set) in the Victim WatchFlag Table (VWT) • If the VWT is full, OS support is needed (rare) • When the word is brought back, copy the watch bits from the VWT • When iWatcherOff() is called: • Remove the monitoring function from the CheckTable • If no monitors are watching this region any more, update the VWT, RWT, L1 and L2 bits as necessary.
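The placement decision made when a watch is installed can be sketched as a small software model. Everything here is an assumption for illustration: the threshold `LARGE_REGION`, the table size `RWT_ENTRIES`, the helper names `place_watch` and `rwt_lookup`, and the choice to treat a region of exactly `LARGE_REGION` bytes as large (the slide does not say). The CheckTable update, the VWT, and iWatcherOff() are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LARGE_REGION 64u  /* assumed threshold, in bytes */
#define RWT_ENTRIES  4    /* assumed number of RWT entries */

typedef struct {
    uintptr_t start;  /* first watched byte */
    uintptr_t end;    /* one past the last watched byte */
    bool      valid;
} RwtEntry;

static RwtEntry rwt[RWT_ENTRIES];

typedef enum { IN_RWT, IN_CACHE_WATCHFLAGS } Placement;

/* Decide where the watch information for [addr, addr + size) lives. */
Placement place_watch(uintptr_t addr, size_t size)
{
    assert(size > 0);
    if (size >= LARGE_REGION) {
        for (size_t i = 0; i < RWT_ENTRIES; i++) {
            if (!rwt[i].valid) {
                rwt[i] = (RwtEntry){ addr, addr + size, true };
                return IN_RWT;
            }
        }
        /* RWT full: fall through and treat the region as a small one. */
    }
    return IN_CACHE_WATCHFLAGS;
}

/* True if addr falls inside any valid RWT range. */
bool rwt_lookup(uintptr_t addr)
{
    for (size_t i = 0; i < RWT_ENTRIES; i++)
        if (rwt[i].valid && addr >= rwt[i].start && addr < rwt[i].end)
            return true;
    return false;
}
```

With these assumed sizes, a 4-byte watch is tagged per word in the cache, the first four large regions take RWT entries, and a fifth large region falls back to per-word tagging.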
How to detect a triggering access? • With out-of-order execution and pipelining, not all instructions will commit • For each load/store: • Check whether a valid matching entry exists in the RWT • Bring the word and its WatchFlags from the cache (load), or prefetch the word into the cache and get its WatchFlags (store) • Store the flags in the ReOrder Buffer (ROB) • When the instruction retires (if it retires), jump to the monitor if the bits are set.
How to Trigger Monitoring Functions? • When a triggering access is detected • Save the processor state and jump to the address held in the Main_Check_Function register • The main check function scans the CheckTable and serially calls every monitor that: • Watches this address • Watches this access mode • For performance, the Thread-Level Speculation (TLS) mechanism may be used.
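The point of carrying the flags in the ROB is that a speculative access which never retires must not trigger a monitor. A minimal sketch of that rule follows. The struct layout, the one-bit-per-mode encoding (`WATCH_READ`, `WATCH_WRITE`), and the `squashed` field are assumptions of this model, not the actual hardware format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: one watch bit for reads, one for writes. */
enum { WATCH_READ = 1, WATCH_WRITE = 2 };

/* Hypothetical ROB entry for a memory instruction. */
typedef struct {
    uintptr_t addr;
    bool      is_store;
    uint8_t   watch_flags;  /* copied from the RWT or cache at access time */
    bool      squashed;     /* mis-speculated: this entry never retires */
} RobEntry;

/* Called when the entry reaches the head of the ROB. */
bool triggers_at_retirement(const RobEntry *e)
{
    assert(e != NULL);
    if (e->squashed)
        return false;  /* the access never architecturally happened */
    uint8_t needed = e->is_store ? WATCH_WRITE : WATCH_READ;
    return (e->watch_flags & needed) != 0;
}
```

A store to a location watched only for reads does not trigger, and neither does any access that is squashed before retirement.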
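The CheckTable scan described above can be sketched as a plain loop. This is a software stand-in under stated assumptions: the table layout, the fixed monitor signature `MonitorFn`, the helper `check_table_add`, and the example monitor `monitor_equals` are hypothetical, and the monitors run serially, without TLS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Access modes; the bit encoding is an assumption of this sketch. */
enum { WATCH_READ = 1, WATCH_WRITE = 2, WATCH_READWRITE = 3 };

/* Hypothetical monitor signature: a watched location and an expected value. */
typedef bool (*MonitorFn)(int *location, int value);

typedef struct {
    uintptr_t start;       /* first watched byte */
    size_t    length;      /* size of the watched region in bytes */
    int       watch_flag;  /* WATCH_READ, WATCH_WRITE or WATCH_READWRITE */
    MonitorFn monitor;
    int      *location;    /* arguments passed to the monitor */
    int       value;
} CheckEntry;

#define MAX_CHECKS 16
static CheckEntry check_table[MAX_CHECKS];
static size_t check_count;

bool check_table_add(int *loc, size_t length, int watch_flag,
                     MonitorFn fn, int value)
{
    assert(fn != NULL);
    if (check_count == MAX_CHECKS)
        return false;
    check_table[check_count++] =
        (CheckEntry){ (uintptr_t)loc, length, watch_flag, fn, loc, value };
    return true;
}

/* Serially call every monitor that watches addr for this access mode;
   return how many of them reported a failed check. */
int main_check_function(uintptr_t addr, int access)
{
    int failures = 0;
    for (size_t i = 0; i < check_count; i++) {
        const CheckEntry *e = &check_table[i];
        bool in_range = addr >= e->start && addr - e->start < e->length;
        bool mode_matches = (e->watch_flag & access) != 0;
        if (in_range && mode_matches && !e->monitor(e->location, e->value))
            failures++;
    }
    return failures;
}

/* Example monitor: the invariant is that *location equals value. */
bool monitor_equals(int *location, int value)
{
    return *location == value;
}
```

Registering `monitor_equals` on a variable `x` with expected value 1 and then calling `main_check_function` for a write to `&x` reports one failure once `x` has been set to 5, mirroring line A of the software-level example.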
Performance Compared to Valgrind • 4-179% overhead, 25-169x less than Valgrind
Performance with/without TLS • Up to 30% reduction in two cases
Performance varying the fraction of triggering loads and TLS
Performance varying the size of monitoring function and TLS • Above 4 contexts there is no significant improvement
Conclusion • Requires some hardware changes • <180% overhead if 20% of loads are monitored • Detects most bugs • Buffer overflows • Memory leaks • Accesses to non-allocated or non-initialized memory • …