
Concurrency Analysis Platform And Tools For Finding Concurrency Bugs



Presentation Transcript


  1. TL58 Concurrency Analysis Platform And Tools For Finding Concurrency Bugs  Thomas Ball Principal Researcher Microsoft Corporation  Madan Musuvathi Researcher Microsoft Corporation  Shaz Qadeer Senior Researcher Microsoft Corporation  Sebastian Burckhardt Researcher Microsoft Corporation

  2. Concurrency Is HARD! • Rare thread interleavings can result in bugs • These bugs are hard to find, reproduce, and debug • Heisenbugs: Observing the bug can “fix” it! • A huge productivity problem • Developers and testers can spend weeks chasing a single Heisenbug

  3. Main Takeaways • You can find and reproduce Heisenbugs • new automatic tool called CHESS • for Win32 and .NET • CHESS used extensively inside Microsoft • Parallel Computing Platform (PCP) • Singularity • Dryad/Cosmos • Releasing via DevLabs

  4. demo Why Is Concurrency Hard?  Madan Musuvathi Researcher Microsoft Corporation

  5. Concurrency Analysis Platform (CAP) • Goal: Drive a program along an interleaving of choice • Interleaving decided by user or by a program/tool • Today: Controlling/observing concurrency is difficult • Manual and intrusive process • Enables lots of concurrency tools: • Test a program along a set of interleavings • Reproduce Heisenbugs • Program understanding / debugging • ...
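A minimal sketch of what "driving a program along an interleaving of choice" can look like. It is written in Python and does not use the real CAP API, which the slides do not show. Threads are modeled as generators that yield at each scheduling point, and the caller supplies the exact order in which they run. The names `run_schedule`, `incrementer`, and `THREADS` are hypothetical and exist only for this illustration.

```python
from typing import Callable, Dict, Generator, List


def incrementer(shared: dict) -> Generator:
    """A toy 'thread': a non-atomic increment split into two schedulable steps."""
    tmp = shared["x"]       # step 1: read the shared counter
    yield                   # scheduling point
    shared["x"] = tmp + 1   # step 2: write back the incremented value
    yield                   # scheduling point


# Two threads running the same body (hypothetical example program).
THREADS: Dict[str, Callable[[dict], Generator]] = {"A": incrementer, "B": incrementer}


def run_schedule(threads: Dict[str, Callable[[dict], Generator]],
                 schedule: List[str]) -> dict:
    """Run the threads in exactly the order given by `schedule`, one step per entry."""
    shared = {"x": 0}
    live = {name: body(shared) for name, body in threads.items()}
    for name in schedule:
        try:
            next(live[name])  # execute one step of the chosen thread
        except StopIteration:
            pass              # thread already finished; ignore extra entries
    return shared


serial = run_schedule(THREADS, ["A", "A", "B", "B"])  # no interference
racy = run_schedule(THREADS, ["A", "B", "A", "B"])    # both read before either writes
```

Because the schedule is an explicit input, the lost update in `racy` happens on every run, which is the property the slide asks for: the interleaving is chosen by the user or a tool rather than by the operating system scheduler.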

  6. demo Taming Concurrency  Madan Musuvathi Researcher Microsoft Corporation

  7. CAP Architecture • Record the interleaving executed • Drive the program along an interleaving [Diagram labels: Unmanaged Program, Managed Program, Windows, .NET CLR, CAP; Monitors: Coverage, Repro, Memory Model bugs, Visualization, Debugging, Testing, Data races]

  8. CAP Specifics • Ability to explore all interleavings • Need to understand complex concurrency APIs (Win32 and System.Threading) • Threads, threadpools, locks, semaphores, async I/O, APCs, timers, … • Does not introduce false behaviors • Any interleaving produced by CAP is possible on the real scheduler

  9. Overview • Concurrency Analysis Platform (CAP) • CHESS: Find/reproduce Heisenbugs • Integration with Visual Studio • Demo on CCR Heisenbug • Future CAP tools • FeatherLite: Data-race detection • Sober: Memory-model bugs

  10. CHESS: Find And Reproduce Heisenbugs • CHESS runs the scenario in a loop: While(not done) { TestScenario() } • Every run takes a different interleaving • Every run is repeatable • Uses the CAP scheduler • To control and direct interleavings • Detect • Assertion violations • Deadlocks • Dataraces • Livelocks [Diagram labels: Program, TestScenario() { … }, CHESS, CAP, Win32/.NET, Kernel: Threads, Scheduler, Synchronization Objects]

  11. CHESS challenge: Programs have LOTS of interleavings • n threads, k steps each (Thread 1 … Thread n, each executing x = 1; … x = k;) • Number of executions: n^(nk) • Exponential in both n and k • For n=2, k=100: > # of atoms in the universe • Limits scalability to large programs • Goal: Scale CHESS to large programs (large k)

  12. Preemption Bounding • Focus on executions with small number of preemptions • Unexpected preemptions cause bugs
      Thread 1: x = 1; if (p != 0) { x = p->f; }
      Thread 2: p = 0;
      Interleaving shown: x = 1; if (p != 0) { → (preemption) → p = 0; → (non-preemption) → x = p->f; }
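A minimal sketch of preemption bounding on the slide's example, assuming Thread 1 runs `x = 1; if (p != 0) { x = p->f; }` and Thread 2 runs `p = 0;`. A context switch counts as a preemption only when the thread being switched away from could still run; a switch after a thread finishes is free. This is a Python model, not the CHESS implementation, and all names (`PROGRAM`, `schedules`, `run`, `find_bugs`, `NullDeref`) are hypothetical.

```python
class NullDeref(Exception):
    """Stands in for the crash caused by x = p->f when p == 0."""


def t1_assign(s):
    s["x"] = 1                          # x = 1;

def t1_check(s):
    s["checked"] = s["p"] is not None   # if (p != 0)

def t1_deref(s):
    if s["checked"]:
        if s["p"] is None:
            raise NullDeref("x = p->f with p == 0")
        s["x"] = s["p"]["f"]            # x = p->f;

def t2_clear(s):
    s["p"] = None                       # p = 0;


PROGRAM = [[t1_assign, t1_check, t1_deref], [t2_clear]]


def schedules(program, bound):
    """Yield (schedule, preemptions_used) for every schedule within the bound."""
    def rec(pcs, current, used, prefix):
        enabled = [t for t, steps in enumerate(program) if pcs[t] < len(steps)]
        if not enabled:
            yield tuple(prefix), used
            return
        for t in enabled:
            # Switching away from a thread that could still run is a preemption.
            cost = 1 if (current in enabled and t != current) else 0
            if used + cost > bound:
                continue
            pcs[t] += 1
            prefix.append(t)
            yield from rec(pcs, t, used + cost, prefix)
            prefix.pop()
            pcs[t] -= 1
    yield from rec([0] * len(program), None, 0, [])


def run(program, schedule):
    """Replay one schedule from the initial state."""
    state = {"x": 0, "p": {"f": 42}, "checked": False}
    pcs = [0] * len(program)
    for t in schedule:
        program[t][pcs[t]](state)
        pcs[t] += 1
    return state


def find_bugs(program, bound):
    """Return the schedules within the preemption bound that crash."""
    bugs = []
    for schedule, _ in schedules(program, bound):
        try:
            run(program, schedule)
        except NullDeref:
            bugs.append(schedule)
    return bugs
```

With a bound of 0 the two threads only run back to back and no crash is found. With a bound of 1 the search reaches the schedule in which Thread 2 is inserted between the null check and the dereference, which is the interleaving the slide highlights.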

  13. Managing Astronomical Number Of Interleavings • n threads, k steps each (Thread 1 … Thread n, each executing x = 1; … x = k;) • Number of interleavings: n^(nk) • Interleavings with c preemptions: (n^2 k)^c · n^n • For n=2, k=100, c=2: < 1 million interleavings • Analysis techniques reduce this further
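The arithmetic behind this slide can be checked directly. The sketch below assumes the garbled formulas on the slide read n^(nk) for the unbounded count and (n^2 k)^c · n^n for the count with c preemptions; the second factor in particular is a reading of the extracted text, not a confirmed formula. The function names are hypothetical.

```python
def unbounded_interleavings(n: int, k: int) -> int:
    """Slide's count for n threads of k steps each: n^(nk)."""
    return n ** (n * k)


def bounded_interleavings(n: int, k: int, c: int) -> int:
    """Slide's count with at most c preemptions, read as (n^2 * k)^c * n^n."""
    return (n * n * k) ** c * n ** n


# The slide's example: n = 2 threads, k = 100 steps, c = 2 preemptions.
bounded = bounded_interleavings(2, 100, 2)   # (4 * 100)^2 * 4 = 640000
unbounded = unbounded_interleavings(2, 100)  # 2^200
```

For fixed c the bounded count grows polynomially in k, while the unbounded count grows exponentially in k. That difference is what makes a small preemption bound practical for programs with many steps.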

  14. demo CHESS In VSTS  Thomas Ball Principal Researcher Microsoft Corporation

  15. demo Real Scenario: Heisenbug In CCR

  16. George Chrysanthakopoulos’ Challenge

  17. CHESS Internal Customers • Parallel Computing Platform • PLINQ: Parallel LINQ • CDS: Concurrent Data Structures • STM: Software Transactional Memory • TPL: Task Parallel Library • ConcRT: Concurrency RunTime • CCR: Concurrency Coordination Runtime • Dryad/Cosmos • Singularity (Research OS from MSR) • CHESS can systematically test the boot and shutdown process

  18. announcing CHESS at http://msdn.microsoft.com/devlabs/

  19. Overview • Concurrency Analysis Platform (CAP) • CHESS: Find/reproduce Heisenbugs • Integration with Visual Studio • Demo on CCR Heisenbug • Future CAP tools • FeatherLite: Data-race detection • Sober: Memory-model bugs

  20. FeatherLite: Lightweight data-race detection • Data-races: Access to data with insufficient synchronization • Data-races are a common source of concurrency errors
      Thread 1: EnterCS(cs) if (p != 0) { x = p->f; } LeaveCS(cs)
      Thread 2: p = 0;
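To make "insufficient synchronization" concrete, here is a small lockset-style check applied to a trace of the slide's example. This is a simplified version of the classic lockset idea (as in Eraser) and is not FeatherLite's algorithm, which the slides do not describe. The trace format and the names `find_races`, `RACY_TRACE`, and `FIXED_TRACE` are assumptions made for this illustration.

```python
def find_races(trace):
    """Report variables written at least once, touched by more than one thread,
    and not consistently protected by any single lock."""
    held = {}        # thread -> set of locks currently held
    candidates = {}  # variable -> locks held at every access so far
    accessors = {}   # variable -> threads that accessed it
    written = set()  # variables with at least one write
    for thread, op, target in trace:
        locks = held.setdefault(thread, set())
        if op == "acquire":
            locks.add(target)
        elif op == "release":
            locks.discard(target)
        else:  # "read" or "write"
            if target in candidates:
                candidates[target] &= locks   # keep only locks held every time
            else:
                candidates[target] = set(locks)
            accessors.setdefault(target, set()).add(thread)
            if op == "write":
                written.add(target)
    return sorted(v for v in candidates
                  if not candidates[v] and len(accessors[v]) > 1 and v in written)


# Thread 1 checks and dereferences p inside the critical section;
# Thread 2 clears p without taking the lock, as on the slide.
RACY_TRACE = [
    ("T1", "acquire", "cs"),
    ("T1", "read", "p"),      # if (p != 0)
    ("T1", "read", "p"),      # p->f
    ("T1", "write", "x"),     # x = ...
    ("T1", "release", "cs"),
    ("T2", "write", "p"),     # p = 0; no lock held
]

# Same program with Thread 2 also taking cs (a hypothetical fix).
FIXED_TRACE = RACY_TRACE[:5] + [
    ("T2", "acquire", "cs"),
    ("T2", "write", "p"),
    ("T2", "release", "cs"),
]
```

On `RACY_TRACE` the check reports `p`, because no single lock is held across all of its accesses. On `FIXED_TRACE` it reports nothing.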

  21. Sampling To Reduce Overhead • Existing data-race detection tools have a large runtime overhead • More than 10X slowdown • Process every memory access • Intelligent sampling algorithms • Process < 5% of the memory accesses • Less than 30% runtime overhead • Existing tools > 1000% overhead

  22. FeatherLite + CAP: "Active" data-race detection • Programs have “benign” data-races • Do not result in program crashes • Example: Updating statistical counters without holding a lock • FeatherLite drives the program along the two outcomes of the data-race

  23. Sober: Tool for finding memory-model errors • Expert programmers use “lock-free” techniques • Use low-level synchronizations, volatile variables, … • For performance • Such programs are exposed to memory-model issues • Compiler can reorder instructions • Hardware can reorder/delay memory accesses • Result: Hard-to-find bugs that are hard-to-understand

  24. Sober Quiz • Can both threads DoWork() at the same time?
      // initial state
      volatile bool flag1 = false; volatile bool flag2 = false;
      Thread 1: flag1 = true; if( !flag2 ) DoWork();
      Thread 2: flag2 = true; if( !flag1 ) DoWork();
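The slide leaves the quiz unanswered. One way to explore it is a toy model in which each thread's store may sit in a private store buffer before reaching shared memory, so the following load of the other flag can execute first. The sketch below enumerates every ordering of store, load, and flush events under that assumption and compares the result with a model where stores reach memory immediately. It is a simplified illustration, not the Sober tool, and it does not model any specific processor or the .NET memory model exactly. `EVENTS`, `valid`, and `outcomes` are hypothetical names.

```python
from itertools import permutations

# One store, one load, and one buffer flush per thread (thread ids 0 and 1).
EVENTS = [(t, kind) for t in (0, 1) for kind in ("store", "load", "flush")]


def valid(order):
    """Program order: a thread's store precedes its load and its own flush."""
    pos = {event: i for i, event in enumerate(order)}
    return all(pos[(t, "store")] < pos[(t, "load")] and
               pos[(t, "store")] < pos[(t, "flush")] for t in (0, 1))


def outcomes(store_buffers):
    """Return the set of (thread1_does_work, thread2_does_work) results."""
    results = set()
    for order in permutations(EVENTS):
        if not valid(order):
            continue
        memory = [False, False]    # flag1, flag2 as seen in shared memory
        buffered = [False, False]  # thread t has a store still in its buffer
        works = [False, False]
        for t, kind in order:
            if kind == "store":
                if store_buffers:
                    buffered[t] = True   # store is delayed in the buffer
                else:
                    memory[t] = True     # store is visible immediately
            elif kind == "flush":
                if buffered[t]:
                    memory[t] = True
                    buffered[t] = False
            else:
                # Load the other thread's flag from shared memory.
                works[t] = not memory[1 - t]
        results.add(tuple(works))
    return results


sequential = outcomes(store_buffers=False)
relaxed = outcomes(store_buffers=True)
```

In the immediate-visibility model the outcome where both threads call DoWork() never appears. With store buffers it does, because both loads can run while both stores are still buffered. This is the kind of behavior the preceding slide attributes to hardware that can reorder or delay memory accesses.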

  25. Conclusion • Concurrency Analysis Platform (CAP) for controlling thread interleavings • Enables lots of concurrency tools • CHESS – Find and reproduce Heisenbugs • Don’t stress, use CHESS • Look for download on http://msdn.microsoft.com/devlabs/ • Future CAP tools • FeatherLite: Lightweight data-race detection • Sober: Tool for finding memory-model bugs

  26. Evals & Recordings Please fill out your evaluation for this session at: This session will be available as a recording at: www.microsoftpdc.com

  27. Q&A Please use the microphones provided

  28. © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
