ECE 259 / CPS 221 Advanced Computer Architecture II DMP: Deterministic Shared Memory Multiprocessing Joseph Devietti et al. Presenter : Tae Jun Ham 2012. 3. 19
Abstract • Most current shared memory multicore and multiprocessor systems are nondeterministic. • Nondeterminism makes debugging and testing hard. • Previous approaches were based on replay. • But replay is only useful for debugging. • Building on deterministic inter-thread communication, the paper proposes several ways to achieve deterministic shared memory multiprocessing.
Determinism • What is deterministic parallel execution? • A program executes multiple threads that communicate via shared memory • It should produce the same output whenever it is given the same program input • What causes nondeterminism? • Software sources: concurrent threads, the state of memory pages, power-saving modes, disk and I/O buffers, and some OS system calls. • Hardware sources: the state of caches, predictor tables, bus priority controllers, and bus arbiters; in other words, almost all microarchitectural structures.
DMP-ShTab • Communication-free regions: executed in parallel • Communication: executed serially • Rules • Without the token: a thread may read shared addresses and write to addresses it owns • With the token: a thread can perform any access
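The token rules above can be sketched as a small access check. This is only an illustrative model, not the paper's implementation: the names `SharingTable`, `allowed_without_token`, and `can_proceed` are hypothetical, and two choices are assumptions beyond the slide: an address missing from the table is treated as shared, and a thread may also read addresses it owns.

```python
from dataclasses import dataclass, field

SHARED = None  # hypothetical marker: the address has no single owner


@dataclass
class SharingTable:
    # Maps an address to the owning thread id, or SHARED.
    owner: dict = field(default_factory=dict)

    def allowed_without_token(self, tid, addr, is_write):
        # Assumption: an address not in the table counts as shared.
        own = self.owner.get(addr, SHARED)
        if own is SHARED:
            # Shared data may be read, but not written, without the token.
            return not is_write
        # Owned data: only the owner may access it without the token
        # (allowing owner reads as well as writes is an assumption).
        return own == tid


def can_proceed(table, tid, addr, is_write, has_token):
    # With the token a thread can perform any access.
    if has_token:
        return True
    return table.allowed_without_token(tid, addr, is_write)
```

For example, with `table = SharingTable({0x10: 0, 0x20: SHARED})`, thread 1 can read `0x20` without the token but must hold the token to write it, which is the point at which execution becomes serial.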
QB-SyncFollow & QB-Sharing • QB-SyncFollow: after an unlock, pass the token • QB-Sharing: after finishing work on shared data, pass the token
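As a rough sketch of the two quantum-building heuristics, the function below decides how far a thread runs before passing the token. Everything here is an assumption made for illustration: the function name `quantum_length`, the event strings `"insn"`, `"unlock"`, and `"shared_done"`, and the idea that the end of work on shared data is visible as an explicit event. The slide does not say how the real system detects these points.

```python
def quantum_length(events, max_size, sync_follow=False, sharing=False):
    """Return how many events a thread executes before passing the token.

    events      -- hypothetical per-thread event stream, e.g. "insn",
                   "unlock", "shared_done"
    max_size    -- default quantum size, counted in events
    sync_follow -- end the quantum right after an unlock
    sharing     -- end the quantum right after work on shared data finishes
    """
    for count, event in enumerate(events[:max_size], start=1):
        if sync_follow and event == "unlock":
            return count  # pass the token immediately after the unlock
        if sharing and event == "shared_done":
            return count  # pass the token once shared work is done
    # No heuristic fired: run the full quantum (or until the stream ends).
    return min(len(events), max_size)
```

With `events = ["insn", "unlock", "insn", "insn"]` and `max_size=4`, the default policy runs all four events, while `sync_follow=True` ends the quantum after the second event so the next thread receives the token earlier.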
Evaluation - Performance • Serial: slowdown grows linearly with the number of threads • ShTab: 38% • TM-Fwd: 21%
Evaluation - Quanta size sensitivity • In general, larger quanta are slower. • The Serial case is less sensitive to quanta size.
Evaluation - Heuristics on quanta size • Effective for ShTab. • SyncFollow benefits some workloads.
Evaluation - Sw-DMP • The authors state: "In summary, this data shows that Sw-DMP-ShTab does not unduly limit performance scalability for multithreaded applications."
Discussions • Can this system be deployed? • Too much performance overhead • Implementation complexity • Which one do you prefer: DMP or deterministic replay? • Possible power savings with DVFS?