1 / 4

E81 CSE 532S: Advanced Multi-Paradigm Software Development

E81 CSE 532S: Advanced Multi-Paradigm Software Development. C++11 Concurrency Design. Chris Gill Department of Computer Science and Engineering Washington University in St. Louis cdgill@cs.wustl.edu. Dividing Work Between Threads. Static partitioning of data can be helpful

Download Presentation

E81 CSE 532S: Advanced Multi-Paradigm Software Development

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. E81 CSE 532S: Advanced Multi-Paradigm Software Development C++11 Concurrency Design Chris Gill Department of Computer Science and Engineering Washington University in St. Louis cdgill@cs.wustl.edu

  2. Dividing Work Between Threads • Static partitioning of data can be helpful • Makes threads (mostly) independent, ahead of time • Threads can read from and write to their own locations • Some partitioning of data is necessarily dynamic • E.g., Quicksort uses a pivot at run-time to split up data • May need to launch (or pass data to) a thread at run-time • Can also partition work by task-type • E.g., hand off specific kinds of work to specialized threads • E.g., a thread-per-stage pipeline that is efficient once primed • Number of threads to use is a key design challenge • E.g., std::thread::hardware_concurrency() is only a starting point (blocking, scheduling, etc. also matter)

  3. Factors Affecting Performance • Need at least as many threads as hardware cores • Too few threads makes insufficient use of the resource • Oversubscription increases overhead due to task switching • Need to gauge for how long (and when) threads are active • Data contention and cache ping-pong • Performance degrades rapidly as cache misses increas • Need to design for low contention for cache lines • Need to avoid false sharing of elements (in same cache line) • Packing or spreading out data may be needed • E.g., localize each thread’s accesses • E.g., separate a shared mutex from the data that it guards

  4. Additional Considerations • Exception safety • Affects both lock based and lock-free synchronization • Use std::packaged_taskand std::future to allow for an exception being thrown in a thread (see listing 8.3) • Scalability • How much of the code is actually parallizable? • Various theoretical formulas (including Amdahl’s) apply • Hiding latency • If nothing ever blocks you may not need concurrency • If something does, concurrency makes parallel progress • Improving responsiveness • Giving each thread its own task may simplify, speed up tasks

More Related