Thread-Specific Storage (TSS)

Covers thread-local storage in C++11, the Thread-Specific Storage (TSS) pattern, and TSS emulation for managing data that are logically global to a thread, along with implementation options and their measured costs.


Presentation Transcript


  1. Thread-Specific Storage (TSS)
     Chris Gill and Venkita Subramonian
     E81 CSE 532S: Advanced Multi-Paradigm Software Development

  2. Thread Local Storage in C++11
     • A variable can be declared thread_local as of C++11
       • Lifetime is the lifetime of the thread
     • Useful for data that are logically global to the thread
       • Good for avoiding passing references to it up and down call stack
       • E.g., if data are made extern, or static, or put in a namespace, etc.
     • Good fences make good neighbors
       • Not visible to other threads (unless a pointer/reference is given away)
     • What if there are many different thread-specific data?
       • If all threads use instances of all the same types all the time, can put them in a struct and make instances of the struct thread local
       • Otherwise, thread-specific storage (TSS) pattern can help
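A minimal sketch of the struct approach from this slide, assuming a hypothetical `ThreadContext` struct and helper names (`record_call`, `run_workers`) that are not part of the original deck. Each thread sees only its own `ctx`, so no context argument is passed down the call stack and no lock is taken.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical per-thread state, grouped into one struct.
struct ThreadContext {
    int error_code = 0;
    int calls = 0;
};

// One instance per thread: created for each thread that uses it,
// destroyed when that thread exits.
thread_local ThreadContext ctx;

// Called deep in a call stack; no context parameter is needed.
void record_call(int code) {
    ctx.error_code = code;
    ++ctx.calls;
}

// Starts n threads; each makes per_thread calls and then reports its own
// call count into its own slot of the result vector.
std::vector<int> run_workers(int n, int per_thread) {
    assert(n >= 0 && per_thread >= 0);
    std::vector<int> counts(n, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([i, per_thread, &counts] {
            for (int k = 0; k < per_thread; ++k) record_call(i);
            counts[i] = ctx.calls;  // reads this thread's copy only
        });
    }
    for (auto& t : workers) t.join();
    return counts;
}
```

Calling `run_workers(4, 100)` should leave every entry at 100, while the calling thread's own `ctx.calls` stays at its initial value because it never called `record_call`.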

  3. A More Complete and General Solution: Thread-Specific Storage (TSS) Pattern
     • Logically thread-global access point
     • Maps index to object
       • Index is a 2-tuple
       • e.g., an STL pair of <key, std::thread::id>
     • Avoids lock overhead
       • Separate copy per <key, std::thread::id>
     • Logically an m×n table
       • Sparse/dense, small/large
       • Implement accordingly
     [Figure: A TSS table points to different kinds of thread-specific objects. Columns: tid1, tid2, tid3, tid4. Rows: key1, key2, key3. Labels: TSS table, connections, errno values.]
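The table described above could be sketched roughly as follows. The class name `TssTable`, its `set`/`get` methods, and the `std::string` payload are assumptions made for illustration. The mutex here guards only the structure of the shared `std::map`, since concurrent insertion into a standard map is a data race; the per-object locking that the pattern avoids is not needed because each thread reaches only its own cell.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical TSS table: maps <key, thread id> to a per-thread value.
class TssTable {
public:
    using Key = int;

    // Stores value in the calling thread's cell for key.
    void set(Key key, std::string value) {
        std::lock_guard<std::mutex> guard(mutex_);
        table_[{key, std::this_thread::get_id()}] = std::move(value);
    }

    // Returns the calling thread's value for key, or "" if unset.
    std::string get(Key key) const {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = table_.find({key, std::this_thread::get_id()});
        return it == table_.end() ? std::string() : it->second;
    }

private:
    mutable std::mutex mutex_;  // guards the map structure only
    std::map<std::pair<Key, std::thread::id>, std::string> table_;
};

// A worker writes under the same key; the caller's cell is untouched.
std::string demo_isolation() {
    TssTable tss;
    tss.set(1, "main");
    std::thread worker([&tss] {
        assert(tss.get(1).empty());  // worker has no value yet for key 1
        tss.set(1, "worker");
    });
    worker.join();
    return tss.get(1);
}
```

Because the index includes the thread id, both threads use key 1 without seeing each other's value.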

  4. Alternative Table Implementations
     • 2-D array is good for many use-cases
       • Small #s of threads, keys
       • And/or densely populated
       • May avoid data races
     • Hash map, skip-list, etc. may be better for others
       • Large row/column sizes
       • Sparsely populated
       • But, adds some overhead
       • Data races may occur
     [Figure: Hash map holding entries <key1, tid2>, <key1, tid4>, <key3, tid1>, <key3, tid3>, <key3, tid4>.]
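As a rough illustration of why a dense 2-D array may avoid data races, the sketch below gives every thread its own column of a fixed-size table. The bounds `kMaxKeys` and `kMaxThreads`, the function `run_dense`, and the idea of handing each thread a dense slot number at creation are all assumptions for this example. No lock is taken because no two threads ever touch the same cell and the table never resizes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kMaxKeys = 4;     // assumed small, fixed key count
constexpr std::size_t kMaxThreads = 8;  // assumed upper bound on threads

// Dense table: one row per key, one column per thread slot.
using DenseTable = std::array<std::array<long, kMaxThreads>, kMaxKeys>;

// Each of n threads increments its own cell for key 0 iters times without
// locking; returns the sum over the n columns after all threads have joined.
long run_dense(std::size_t n, long iters) {
    assert(n <= kMaxThreads);
    DenseTable table{};  // zero-initialized
    std::vector<std::thread> workers;
    for (std::size_t slot = 0; slot < n; ++slot) {
        workers.emplace_back([&table, slot, iters] {
            for (long i = 0; i < iters; ++i) ++table[0][slot];
        });
    }
    for (auto& t : workers) t.join();
    long sum = 0;
    for (std::size_t slot = 0; slot < n; ++slot) sum += table[0][slot];
    return sum;
}
```

A hash map in the same role would need some protection, because an insertion by one thread can restructure storage that another thread is reading. Adjacent cells in the array may still share a cache line, which is a performance cost this sketch does not address.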

  5. TSS and Resource Indexing
     • Multiple object lookup keys
       • Each key in a thread is for a different object
     • Explicit tid indexing
       • Used when a thread needs to cross-reference another’s TSS
       • Watch out for race conditions
       • Avoid locking if at all possible
     • Benefit of thread id indexing
       • Threads remain mostly unaware of each other’s TSS resources
       • As if each were the only thread in the process that uses TSS
       • Unless a thread compares the thread id it is given with its own via std::this_thread::get_id()
     [Figure: Thread-specific objects, distinguished by keys and distinguished by thread ids.]
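Explicit tid indexing might look something like the sketch below, where the function names (`tss_put`, `tss_peek`, `is_self`) and the global map are hypothetical. A thread normally indexes with its own id implicitly; passing another thread's id is the cross-referencing case, and `is_self` shows the comparison against `std::this_thread::get_id()`. A mutex is used here for simplicity to keep the cross-thread read free of races.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <thread>
#include <utility>

std::mutex g_mutex;  // guards the shared map
std::map<std::pair<int, std::thread::id>, int> g_tss;  // <key, tid> -> value

// Implicit indexing: writes the caller's own cell for key.
void tss_put(int key, int value) {
    std::lock_guard<std::mutex> guard(g_mutex);
    g_tss[{key, std::this_thread::get_id()}] = value;
}

// Explicit indexing: reads the cell owned by thread tid; -1 if absent.
int tss_peek(int key, std::thread::id tid) {
    std::lock_guard<std::mutex> guard(g_mutex);
    auto it = g_tss.find({key, tid});
    return it == g_tss.end() ? -1 : it->second;
}

// True when tid names the calling thread.
bool is_self(std::thread::id tid) {
    return tid == std::this_thread::get_id();
}

// The caller cross-references a value that a worker stored under key 3.
int demo_cross_reference() {
    std::thread worker([] { tss_put(3, 42); });
    const std::thread::id worker_id = worker.get_id();  // capture before join
    worker.join();
    assert(!is_self(worker_id));
    return tss_peek(3, worker_id);
}
```

The worker's id is captured before `join()` because a joined `std::thread` object no longer reports the id of the thread it ran.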

  6. Distributable Thread (DT) TSS Variant
     • Key issues
       • Identity of the distributable thread abstraction (GUID)
       • Mapping and remapping DT to different local threads
         • E.g., when DT makes a remote call, release local thread to reactor
         • E.g., when DT makes a nested call back onto the same host
     [Figure: Host 1 and Host 2, each with an RTCORBA 2.0 Scheduler, linked by remote calls and returns. Bindings shown: <GUID2, TID2>, <GUID1, TID1>, <GUID1, TID1>, <GUID1, TID2>. Annotations: remote call carries DT’s parameters with it; binding of a single DT to different local OS threads.]

  7. Distributable Thread (DT) TSS Variant
     • A distributable thread can use thread-specific storage
       • Avoids locking of global data
     • Context: OS provided TSS is efficient, uses OS thread id
     • Problem: distributable thread may span OS threads
       • Difficult to access prior storage
     • Solution: TSS emulation
       • based on <GUID, tid> pair
       • also useful idea on platforms that don’t provide native TSS
     • Key question to answer
       • What is the cost of TSS emulation compared to the OS provided version of TSS?
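The slides do not show how the <GUID, tid> pair is used internally, so the following is only one assumed reading: the tid half records which distributable thread an OS thread is currently bound to, and the storage itself is keyed by GUID so that it survives rebinding. The names `Guid`, `DtTss`, `bind`, `set`, and `get` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>
#include <utility>

using Guid = std::uint64_t;  // hypothetical distributable-thread identity

class DtTss {
public:
    // Binds the calling OS thread to distributable thread guid.
    void bind(Guid guid) {
        std::lock_guard<std::mutex> guard(mutex_);
        binding_[std::this_thread::get_id()] = guid;
    }

    // Stores value under <guid bound to the caller, key>.
    void set(int key, int value) {
        std::lock_guard<std::mutex> guard(mutex_);
        storage_[{current_guid(), key}] = value;
    }

    // Returns the value for <guid bound to the caller, key>, or fallback.
    int get(int key, int fallback) const {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = storage_.find({current_guid(), key});
        return it == storage_.end() ? fallback : it->second;
    }

private:
    // Caller must hold mutex_ and must have called bind() first.
    Guid current_guid() const {
        auto it = binding_.find(std::this_thread::get_id());
        assert(it != binding_.end() && "OS thread is not bound to a DT");
        return it->second;
    }

    mutable std::mutex mutex_;
    std::map<std::thread::id, Guid> binding_;      // tid -> GUID
    std::map<std::pair<Guid, int>, int> storage_;  // <GUID, key> -> value
};

// One DT (GUID 7) runs first on one OS thread, then on another.
int demo_migration() {
    DtTss tss;
    std::thread first([&tss] { tss.bind(7); tss.set(1, 99); });
    first.join();
    int seen = -1;
    std::thread second([&tss, &seen] { tss.bind(7); seen = tss.get(1, -1); });
    second.join();
    return seen;
}
```

The second OS thread finds the value the first one stored, which is the access to prior storage that native, tid-only TSS makes difficult. The extra lookup and the lock are the kind of overhead the cost question on this slide is about.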

  8. TSS Emulation Costs (Mgeta, RTAS04)
     • Pentium tick timestamps
       • Nanosecond resolution on 2.8 GHz P4, 512 KB cache, 512 MB memory
       • RedHat 7.3, real-time class
     • Called create repeatedly
       • Then, called write/read repeatedly on one key
     • Upper graph shows scalability of key creation
       • Cost scales linearly with number of keys in OS, ACE TSS
       • Emulation costs ~2 µs more per key creation
     • Lower graph shows the emulated write costs ~1.5 µs, read ~0.5 µs more

  9. Conclusions
     • Benefits of using Thread-Specific Storage Pattern
       • Efficiency of access (no locking)
       • Reusability (via Wrapper Façade)
       • Ease of use (hides complexity)
     • Liabilities of the pattern
       • Potential cluttering of the TSS map
         • Objects not used by multiple threads don’t belong in the map
         • Putting them there wastes space, adds program complexity
       • “Yet another” factor obscuring system structure/behavior
         • E.g., have to understand map during multi-threaded debugging
       • Language-specific implementation options
         • May reduce portability
         • E.g., templates and operator overloading
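To make the Wrapper Façade, template, and operator-overloading points concrete, here is a small sketch of a hypothetical `Tss<T>` wrapper. It is not the implementation of ACE or any other library; the class, the `Counter` payload, and `demo_wrapper` are assumptions for illustration. The overloaded `operator->` hides the per-thread lookup, so client code reads as if it held an ordinary pointer.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical wrapper: each thread that dereferences it gets its own T.
template <typename T>
class Tss {
public:
    T* operator->() { return &local(); }
    T& operator*() { return local(); }

private:
    // Finds the calling thread's T, default-constructing it on first use.
    T& local() {
        std::lock_guard<std::mutex> guard(mutex_);  // guards the map only
        std::unique_ptr<T>& slot = objects_[std::this_thread::get_id()];
        if (!slot) slot.reset(new T());  // C++11 has no std::make_unique
        return *slot;  // heap object stays valid after the lock is released
    }

    std::mutex mutex_;
    std::map<std::thread::id, std::unique_ptr<T>> objects_;
};

struct Counter {
    int hits = 0;
};

// The caller and a worker use the same wrapper but separate Counters.
int demo_wrapper() {
    Tss<Counter> counter;
    counter->hits = 5;
    std::thread worker([&counter] {
        counter->hits += 1;  // the worker's own Counter starts at 0
        assert(counter->hits == 1);
    });
    worker.join();
    return counter->hits;
}
```

Entries in this sketch are never removed when a thread exits, which is a small instance of the map-clutter liability listed above.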
