Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

By Paul Sack, Brian E. Bliss, Zhiqiang Ma, Paul Petersen, Josep Torrellas
2014-10-23, OS Lab, Ok-Kyoon Ha
2006 ACM

Motivation
• Debugging data races is a difficult task
• Detectors use two common types of algorithms
  - Lockset-based algorithm
  - Vector clock-based algorithm
• Data race-detection tools
  - have reasonable overheads (2x slowdowns)
  - do not provide much useful information or have limited usage models
• Intel Thread Checker
  - provides an abundance of useful information and has few usage constraints
  - has high performance costs (233x slowdowns)
SBMP06

Overheads of Intel’s Thread Checker
  - instrumentation alone: slowdown of 22x
  - full algorithm: slowdown of 233x
  - memory overhead: 20x
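The two slowdown figures above suggest how much of the cost comes from the detection algorithm itself. The following is a rough back-of-the-envelope sketch, assuming both figures are measured against the same uninstrumented baseline and that the instrumentation cost is still paid when references are filtered. The variable names are illustrative and do not come from the slides.

```python
# Slowdown figures quoted on the slide.
INSTRUMENTATION_SLOWDOWN = 22.0   # instrumentation alone
FULL_SLOWDOWN = 233.0             # instrumentation plus the full algorithm

# Share of the total slowdown attributable to the algorithm
# (assumption: the two costs simply add up).
algorithm_share = (FULL_SLOWDOWN - INSTRUMENTATION_SLOWDOWN) / FULL_SLOWDOWN

# Upper bound on the speedup if filtering removed all algorithm work
# but left the instrumentation cost untouched.
max_speedup = FULL_SLOWDOWN / INSTRUMENTATION_SLOWDOWN

print(f"algorithm share of slowdown: {algorithm_share:.1%}")
print(f"upper bound on speedup from filtering: {max_speedup:.1f}x")
```

Under these assumptions the algorithm accounts for roughly 90% of the slowdown, and filtering alone could give at most about a 10.6x speedup.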

Approach
• Objective
  - to reduce the amount of work done by the algorithm
• Filtering useless references

Three Filters (1/3)
• Stack Filter
  - filters if one thread accesses another’s stack
  - cannot cause data races to be lost and is very efficient
• Implementation Issues of Stack Filter
  - the simplest filter, with the lowest overhead
  - compares the memory reference address with the stack base and limit addresses
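The address comparison described above can be sketched as a simple range check. This is a minimal illustration, not Thread Checker's implementation. It assumes a downward-growing stack whose valid addresses lie between the limit (lowest) and the base (highest), and it assumes the filtered references are those that fall inside the accessing thread's own stack. `StackBounds` and `stack_filter` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackBounds:
    """Hypothetical per-thread stack range (assumption: stack grows down)."""
    base: int    # highest stack address, exclusive
    limit: int   # lowest stack address, inclusive


def stack_filter(addr: int, size: int, own_stack: StackBounds) -> bool:
    """Return True if the reference lies entirely inside the given stack range
    and can therefore be dropped before the detection algorithm sees it."""
    return own_stack.limit <= addr and addr + size <= own_stack.base


# Example with made-up addresses.
stack = StackBounds(base=0x7FFF0000, limit=0x7FFE0000)
on_stack = stack_filter(0x7FFE1000, 4, stack)    # inside the range
off_stack = stack_filter(0x10000000, 4, stack)   # heap or global address
```

Two integer comparisons per reference is consistent with the slide's point that this is the cheapest of the three filters.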

Three Filters (2/3)
• Duplicate Filter
  - maintains the first load and store references to a variable in each segment
  - filters duplicate references within a segment
  - can only cause Thread Checker to lose duplicate data races
• Implementation Issues of Duplicate Filter
  - slower than the stack filter
  - maintains per-thread filter tables (T1, T2 in the figure) organized into 4 fields: add, size, type, ID
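One way to picture the filter table is as a lookup keyed by the reference's address, size, and type. The sketch below is an assumption-laden illustration rather than the tool's actual data structure. It treats the ID field as a segment identifier, uses an unbounded dictionary instead of a fixed-size table, and introduces the hypothetical names `DuplicateFilter`, `new_segment`, and `is_duplicate`.

```python
class DuplicateFilter:
    """Per-thread table remembering the first load and store to each
    location in the current segment (hypothetical sketch)."""

    def __init__(self) -> None:
        # (address, size, type) -> ID of the segment that recorded it.
        # Assumption: the slide's "ID" field is a segment ID.
        self.table: dict[tuple[int, int, str], int] = {}
        self.segment_id = 0

    def new_segment(self) -> None:
        """Start a new segment; older entries become stale because
        their stored ID no longer matches."""
        self.segment_id += 1

    def is_duplicate(self, addr: int, size: int, ref_type: str) -> bool:
        """Return True if an identical reference was already seen in this
        segment; otherwise record it and return False."""
        key = (addr, size, ref_type)
        if self.table.get(key) == self.segment_id:
            return True
        self.table[key] = self.segment_id
        return False


# Example: the second identical load in a segment is filtered.
dup = DuplicateFilter()
first = dup.is_duplicate(0x1000, 4, "load")    # first load is kept
second = dup.is_duplicate(0x1000, 4, "load")   # duplicate is filtered
store = dup.is_duplicate(0x1000, 4, "store")   # first store is kept
dup.new_segment()
after = dup.is_duplicate(0x1000, 4, "load")    # new segment, kept again
```

Because the first load and first store in each segment still reach the detector, only repeated reports of the same race can be lost, which matches the slide's claim.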

Three Filters (3/3)
• FSM Filter
  - based on the Eraser state machine
  - filters references in the Private state and in the Shared Read Only state
  - filters the initial references (Uninit → Private, Private → SHR RO)
[Figure: Eraser state machine with states UNINIT, PRIVATE, SHR RO, and SHR RW; transition labels R1, W1, R’, W’, R, W]
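The state machine can be sketched as a per-address table of states. The transitions below are reconstructed from the figure labels and should be read as an assumption: R1/W1 denote accesses by the first thread to touch the location, R’/W’ denote the first accesses by another thread, and a write in SHR RO moves the location to SHR RW. `FsmFilter` and `should_filter` are hypothetical names, and passing the SHR RO → SHR RW write on to the detector is a choice made for this sketch.

```python
from enum import Enum


class State(Enum):
    UNINIT = "UNINIT"
    PRIVATE = "PRIVATE"
    SHR_RO = "SHR RO"
    SHR_RW = "SHR RW"


class FsmFilter:
    """Tracks one state per address (hypothetical sketch)."""

    def __init__(self) -> None:
        self.state: dict[int, State] = {}
        self.owner: dict[int, int] = {}   # first thread to touch the address

    def should_filter(self, addr: int, thread_id: int, is_write: bool) -> bool:
        """Advance the state for addr and return True if the reference
        can be dropped before the detection algorithm sees it."""
        current = self.state.get(addr, State.UNINIT)
        if current is State.UNINIT:
            self.owner[addr] = thread_id          # R1 / W1
            nxt = State.PRIVATE
        elif current is State.PRIVATE:
            if thread_id == self.owner[addr]:     # R1 / W1 again
                nxt = State.PRIVATE
            else:                                 # R' or W'
                nxt = State.SHR_RW if is_write else State.SHR_RO
        elif current is State.SHR_RO:
            nxt = State.SHR_RW if is_write else State.SHR_RO
        else:
            nxt = State.SHR_RW
        self.state[addr] = nxt
        # Only references that reach or stay in SHR RW are passed on.
        return nxt is not State.SHR_RW


# Example trace on one address.
fsm = FsmFilter()
trace = [
    fsm.should_filter(0x10, 1, True),    # UNINIT -> PRIVATE
    fsm.should_filter(0x10, 1, False),   # stays PRIVATE
    fsm.should_filter(0x10, 2, False),   # PRIVATE -> SHR RO
    fsm.should_filter(0x10, 2, True),    # SHR RO -> SHR RW
    fsm.should_filter(0x10, 1, False),   # stays SHR RW
]
```

In this trace the first three references are filtered and the last two are forwarded to the detector.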

Experimental Setup
• Environment
  - 4-way 2.5GHz Pentium 4 workstation
  - SPLASH-2 applications
  - run with 4 threads on 4 processors
• Measurements
  - filtering statistics are collected by running each application three times
  - performance results are collected by running each application nine times
  - each application is run in Thread Checker with and without the three filters
  - the number of data-race bugs reported with and without the filters is compared

Filtering Effectiveness
[Figure: Incremental filtering effectiveness]
[Figure: Different filter combinations]

Performance
[Figure: Speedups obtained with filtering]

Data-race Detection
[Figure: Characterizing the impact of the three filters combined]

Conclusions and Future Work
• Conclusions
  - Intel Thread Checker has a slowdown of 233x on average
  - the vast majority of memory references can be filtered out
  - the three filters developed filter 98% of all memory references
  - speedups of 3.3x on average
• Future Work
  - improve the FSM filter
  - reduce the other overhead sources in Thread Checker
