Correcting Threading Errors with Intel® Parallel Inspector

fleur

Presentation Transcript


  1. Correcting Threading Errors with Intel® Parallel Inspector

  2. Objectives • After successful completion of this module you will be able to… • Use Parallel Inspector to detect and identify a variety of threading correctness issues in threaded applications • Determine if library functions are thread-safe Intel® Parallel Inspector

  3. Agenda • What is Intel® Parallel Inspector? • Detecting race conditions • Detecting potential for deadlock • Checking library thread-safety

  4. Motivation • Developing threaded applications can be a complex task • A new class of problems is caused by the interaction between concurrent threads • Data races or storage conflicts • More than one thread accesses the same memory location without synchronization, and at least one access is a write • Deadlocks • A thread waits for an event that will never happen

  5. Intel® Parallel Inspector • Debugging tool for threaded software • Plug-in to Microsoft* Visual Studio* • Finds threading bugs in OpenMP*, Intel® Threading Building Blocks, and Win32* threaded software • Quickly locates bugs that can take days to find using traditional methods and tools • Isolates problems, not the symptoms • A bug does not have to occur to be found!

  6. Intel® Parallel Inspector Features • Integrated into Microsoft Visual Studio .NET* IDE • 2005 & 2008 Editions • Supports different compilers • Microsoft* Visual* C++ .NET* • Intel Parallel Composer • View (drill-down to) source code for Diagnostics • One-click help for diagnostics • Possible causes and solution suggestions

  7. Parallel Inspector: Analysis • Dynamic as software runs • Data (workload)-driven execution • Includes monitoring of: • Thread and Sync APIs used • Thread execution order • Scheduler impacts results • Memory accesses between threads • Code path must be executed to be analyzed

  8. Parallel Inspector: Before You Start • Instrumentation: background • Adds calls to library to record information • Thread and Sync APIs • Memory accesses • Increases execution time and size • Use small data sets (workloads) • Execution time and space are expanded • Multiple runs over different paths yield best results • Workload selection is important!

  9. Workload Guidelines • Problem code must execute at least once per thread to be identified • Use smallest possible working data set • Minimize data set size • Smaller image sizes • Minimize loop iterations or time steps • Simulate minutes rather than days • Minimize update rates • Lower frames per second • Finds threading errors faster!

  10. Building for Parallel Inspector • Compile • Use dynamically linked thread-safe runtime libraries (/MDd) • Generate symbolic information (/ZI) • Disable optimization (/Od) • Link • Preserve symbolic information (/DEBUG) • Specify relocatable code sections (/FIXED:NO)

  11. Binary Instrumentation • Build with supported compiler • Running the application • Must be run from within Parallel Inspector • Application is instrumented when executed • External DLLs are instrumented as used

  12. Starting Parallel Inspector • Build the Debug version of the application with appropriate flags set

  13. Starting Parallel Inspector • Select Parallel Inspector from the Tools menu • You can choose to look for • Memory Errors • Threading Errors

  14. Starting Parallel Inspector • The Configure Analysis window pops up • Select the level of analysis to be carried out by Parallel Inspector • The deeper the analysis, the more thorough the results and the longer the execution time • Click Run Analysis

  15. Starting Parallel Inspector • The initial (raw) results come up after analysis • Click the Interpret Results button to filter the raw data into a more readable form

  16. Starting Parallel Inspector • The analysis results are gathered together in related categories • Double-click a line from the Problem Sets pane to see the source code that generated the diagnostic

  17. Starting Parallel Inspector • The source lines involved in a data race can be shown

  18. Activity 1a - Potential Energy • Build and run serial version • Build threaded version • Run application in Parallel Inspector to identify threading problems

  19. Race Conditions • Execution order is assumed but cannot be guaranteed • Concurrent access of same variable by multiple threads • Most common error in multithreaded programs • May not be apparent at all times

  20. Solving Race Conditions • Solution: Scope variables to be local to threads • When to use • Value computed is not used outside parallel region • Temporary or “work” variables • How to implement • OpenMP scoping clauses (private, shared) • Declare variables within threaded functions • Allocate variables on thread stack • TLS (Thread Local Storage) API
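
The "declare variables within threaded functions" approach can be sketched as follows. This is a minimal illustration, not code from the deck: it assumes C++11 `std::thread` as a portable stand-in for Win32 or OpenMP threads, and `parallel_sum` and its workload are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical workload: sum the integers 1..n using num_threads threads.
long parallel_sum(long n, unsigned num_threads) {
    assert(num_threads > 0);
    const long stride = static_cast<long>(num_threads);
    std::vector<long> partial(num_threads, 0);  // one result slot per thread
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&partial, n, t, stride] {
            long local = 0;  // work variable on this thread's own stack
            for (long i = static_cast<long>(t) + 1; i <= n; i += stride)
                local += i;
            partial[t] = local;  // each thread writes only its own slot
        });
    }
    for (std::thread& w : workers) w.join();

    long total = 0;  // combined only after every thread has finished
    for (long p : partial) total += p;
    return total;
}
```

Because `local` is never visible to another thread, the loop needs no synchronization; for example, `parallel_sum(1000, 4)` adds up 1 through 1000 across four threads.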

  21. Solving Race Conditions • Solution: Control shared access with critical regions • When to use • Value computed is used outside parallel region • Shared value is required by each thread • How to implement • Mutual exclusion and synchronization • Lock, semaphore, event, critical section, atomic… • Rule of thumb: Use one lock per data element
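
When the shared value is a single counter, the "atomic" option from the list above may be enough. The sketch below is an assumption-laden illustration using C++11 `std::atomic` and `std::thread` rather than the Win32 API; `count_hits` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Every thread increments one shared counter; the atomic type makes each
// increment indivisible, so no update is lost.
long count_hits(unsigned num_threads, long hits_per_thread) {
    assert(num_threads > 0 && hits_per_thread >= 0);
    std::atomic<long> hits{0};
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&hits, hits_per_thread] {
            for (long i = 0; i < hits_per_thread; ++i)
                hits.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (std::thread& w : workers) w.join();
    return hits.load();  // read after all workers have been joined
}
```

An atomic suits a single word-sized value; updates that span several variables generally still call for a lock or critical section.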

  22. Activity 1b - Potential Energy • Fix errors found by Parallel Inspector

  23. Deadlock • Caused by thread waiting on some event that will never happen • Most common cause is locking hierarchies • Always lock and unlock in the same order • Avoid hierarchies if possible

      ThreadA: L1, then L2

          DWORD WINAPI threadA(LPVOID arg)
          {
              EnterCriticalSection(&L1);
              EnterCriticalSection(&L2);
              processA(data1, data2);
              LeaveCriticalSection(&L2);
              LeaveCriticalSection(&L1);
              return(0);
          }

      ThreadB: L2, then L1

          DWORD WINAPI threadB(LPVOID arg)
          {
              EnterCriticalSection(&L2);
              EnterCriticalSection(&L1);
              processB(data2, data1);
              LeaveCriticalSection(&L1);
              LeaveCriticalSection(&L2);
              return(0);
          }
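
One way to apply "always lock in the same order" is to have both threads take L1 before L2. The sketch below is a portable approximation that assumes `std::mutex` in place of `CRITICAL_SECTION`; the shared data and the work done under the locks are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

namespace {
std::mutex L1, L2;         // stand-ins for the two CRITICAL_SECTION objects
int data1 = 0, data2 = 0;  // hypothetical shared data
}

void worker_a(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> first(L1);   // L1 first
        std::lock_guard<std::mutex> second(L2);  // then L2
        data1 += 1;
        data2 += 1;
    }   // guards release L2, then L1
}

void worker_b(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> first(L1);   // same order as worker_a
        std::lock_guard<std::mutex> second(L2);
        data2 += 2;
        data1 += 2;
    }
}

// Runs both workers concurrently and returns data1 + data2.
int run_ordered_locking(int iterations) {
    assert(iterations >= 0);
    data1 = 0;
    data2 = 0;
    std::thread a(worker_a, iterations);
    std::thread b(worker_b, iterations);
    a.join();
    b.join();
    return data1 + data2;
}
```

With a single agreed order, neither thread can hold one lock while waiting for a lock the other already holds, so the cycle that causes the deadlock cannot form.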

  24. Deadlock • Add lock per element • Lock only elements, not whole array of elements

          typedef struct {
              // some data things
              SomeLockType mutex;
          } shape_t;

          shape_t Q[1024];

          // Pass pointers so the elements' own mutexes are locked, not copies
          void swap(shape_t *a, shape_t *b)
          {
              lock(&a->mutex);
              lock(&b->mutex);
              // Swap data between a and b
              unlock(&b->mutex);
              unlock(&a->mutex);
          }

      Thread 1: swap(&Q[34], &Q[986]); grabs mutex 34
      Thread 4: swap(&Q[986], &Q[34]); grabs mutex 986
      Each thread then waits for the mutex the other holds.
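
A possible fix for the per-element swap is to acquire both element locks together with `std::lock`, which avoids deadlock regardless of argument order. This is a sketch under assumptions: `Shape`, `swap_shapes`, and the integer payload are hypothetical, and C++11 `std::mutex` replaces the slide's unspecified `SomeLockType`.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

struct Shape {
    int value = 0;     // stand-in for "some data things"
    std::mutex mutex;  // one lock per element
};

void swap_shapes(Shape& a, Shape& b) {
    if (&a == &b) return;         // locking one mutex twice would be an error
    std::lock(a.mutex, b.mutex);  // takes both locks without risk of deadlock
    std::lock_guard<std::mutex> guard_a(a.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> guard_b(b.mutex, std::adopt_lock);
    std::swap(a.value, b.value);
}

// Two threads swap the same pair of elements in opposite argument order.
int run_opposing_swaps(int iterations) {
    assert(iterations >= 0);
    Shape Q[1024];
    Q[34].value = 34;
    Q[986].value = 986;
    std::thread t1([&Q, iterations] {
        for (int i = 0; i < iterations; ++i) swap_shapes(Q[34], Q[986]);
    });
    std::thread t4([&Q, iterations] {
        for (int i = 0; i < iterations; ++i) swap_shapes(Q[986], Q[34]);
    });
    t1.join();
    t4.join();
    return Q[34].value + Q[986].value;  // swaps never change the sum
}
```

An alternative with the same effect is to always lock the element with the lower index (or address) first.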

  25. Windows* Critical Section • Lightweight, intra-process only mutex • Most useful and most used • New type • CRITICAL_SECTION cs; • Create and destroy operations • InitializeCriticalSection(&cs); • DeleteCriticalSection(&cs);

  26. Windows* Critical Section • CRITICAL_SECTION cs; • Attempt to enter protected code • EnterCriticalSection(&cs); • Blocks if another thread is in critical section • Returns when no thread is in critical section • Upon exit of critical section • LeaveCriticalSection(&cs); • Must be called by the thread that entered the critical section

  27. Example: Critical Section

          #include <windows.h>

          #define NUMTHREADS 4

          CRITICAL_SECTION g_cs;   // why does this have to be global?
          int g_sum = 0;

          int bigComputation(void);   // defined elsewhere

          DWORD WINAPI threadFunc(LPVOID arg)
          {
              int mySum = bigComputation();
              EnterCriticalSection(&g_cs);
              g_sum += mySum;          // threads access one at a time
              LeaveCriticalSection(&g_cs);
              return 0;
          }

          int main(void)
          {
              HANDLE hThread[NUMTHREADS];
              InitializeCriticalSection(&g_cs);
              for (int i = 0; i < NUMTHREADS; i++)
                  hThread[i] = CreateThread(NULL, 0, threadFunc, NULL, 0, NULL);
              WaitForMultipleObjects(NUMTHREADS, hThread, TRUE, INFINITE);
              DeleteCriticalSection(&g_cs);
              return 0;
          }

  28. Activity 2 - Deadlock • Use Intel® Parallel Inspector to find and correct the potential deadlock problem.

  29. Thread Safe Routines • All routines called concurrently from multiple threads must be thread safe • How to test for thread safety? • Use OpenMP and Parallel Inspector for analysis • Use sections to create concurrent execution

  30. Thread Safety Example • Check for safety issues between • Multiple instances of routine1() • Instances of routine1() and routine2() • Set up sections to test all permutations • Still need to provide data sets that exercise relevant portions of code

          #pragma omp parallel sections
          {
              #pragma omp section
                  routine1(&data1);
              #pragma omp section
                  routine1(&data2);
              #pragma omp section
                  routine2(&data3);
          }

  31. Two Ways to Ensure Thread Safety • Routines can be written to be reentrant • Any variables changed by the routine must be local to each invocation • Don’t modify globally shared variables • Routines can use mutual exclusion to avoid conflicts with other threads • If accessing shared variables cannot be avoided • What if third-party libraries are not thread safe? • Will likely need to control threads’ access to library • It is better to make a routine reentrant than to add synchronization • Avoids potential overhead
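
The two approaches might look like this in practice. Everything here is a hypothetical sketch: `scale_reentrant` illustrates a reentrant routine, `legacy_next_id` stands in for a third-party routine that is not thread safe, and `safe_next_id` serializes access to it with a C++11 `std::mutex`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Reentrant: reads and writes only its arguments and locals.
int scale_reentrant(int x, int factor) { return x * factor; }

// Hypothetical library routine that is NOT thread safe (hidden static state).
int legacy_next_id() {
    static int counter = 0;
    return ++counter;
}

// Wrapper that controls threads' access to the unsafe routine.
int safe_next_id() {
    static std::mutex guard;
    std::lock_guard<std::mutex> lock(guard);
    return legacy_next_id();
}

// Calls the wrapper from several threads and counts the distinct ids returned.
std::size_t count_unique_ids(unsigned num_threads, int ids_per_thread) {
    assert(num_threads > 0 && ids_per_thread >= 0);
    std::vector<std::vector<int>> per_thread(num_threads);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&per_thread, t, ids_per_thread] {
            for (int i = 0; i < ids_per_thread; ++i)
                per_thread[t].push_back(safe_next_id());
        });
    }
    for (std::thread& w : workers) w.join();

    std::set<int> unique;
    for (const std::vector<int>& ids : per_thread)
        unique.insert(ids.begin(), ids.end());
    return unique.size();
}
```

The wrapper only helps if every caller goes through it; a direct call to the unsafe routine from any thread reintroduces the race.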

  32. Activity 3 – Thread Safety • Use OpenMP framework to call library routines concurrently • Three library calls = 6 combinations to test • A:A, B:B, C:C, A:B, A:C, B:C
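
The six combinations are the unordered pairs of three routines, including each routine paired with itself. A small harness along these lines could run each pair concurrently under the analysis tool; it is only a sketch, using `std::thread` instead of OpenMP sections, and `Routine`, `run_all_pairs`, and the stub routines are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

struct Routine {
    std::string name;
    std::function<void()> call;
};

// Runs every unordered pair of routines concurrently, including a routine
// paired with itself, and returns the labels of the pairs that were run.
std::vector<std::string> run_all_pairs(const std::vector<Routine>& routines) {
    assert(!routines.empty());
    std::vector<std::string> tested;
    for (std::size_t i = 0; i < routines.size(); ++i) {
        for (std::size_t j = i; j < routines.size(); ++j) {
            std::thread first(routines[i].call);
            std::thread second(routines[j].call);
            first.join();
            second.join();
            tested.push_back(routines[i].name + ":" + routines[j].name);
        }
    }
    return tested;
}

// Demo with three stub routines standing in for library calls A, B, and C.
std::vector<std::string> demo_pairs() {
    std::atomic<int> calls{0};
    auto stub = [&calls] { calls.fetch_add(1); };
    std::vector<Routine> routines = {{"A", stub}, {"B", stub}, {"C", stub}};
    std::vector<std::string> tested = run_all_pairs(routines);
    assert(calls.load() == 2 * static_cast<int>(tested.size()));
    return tested;
}
```

For n routines the harness runs n(n+1)/2 pairs, which gives 6 for the three library calls in this activity. As with the OpenMP version, each routine still needs input data that exercises the relevant code paths.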

  33. Intel® Parallel Inspector: What’s Been Covered • Threading errors are easy to introduce • Debugging these errors by traditional techniques is hard • Intel® Parallel Inspector catches these errors • Errors do not have to occur to be detected • Greatly reduces debugging time • Improves robustness of the application

  34. Intel® Parallel Inspector
