
Applications of Non-Blocking Data Structures to Real-Time Systems

Seminar for the degree of Licentiate of Philosophy. Håkan Sundell, Computing Science, Chalmers University of Technology. ARTES project: "Applications of wait/lock-free protocols to real-time systems", started in March 1999.

Presentation Transcript


  1. Applications of Non-Blocking Data Structures to Real-Time Systems. Seminar for the degree of Licentiate of Philosophy. Håkan Sundell, Computing Science, Chalmers University of Technology.

  2. Background: ARTES project "Applications of wait/lock-free protocols to real-time systems". Started in March 1999. One active Ph.D. student. Project leader: Philippas Tsigas.

  3. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Shared Register; Software engineering part; Conclusions & Future Work.

  4. Real-Time Systems: Uni- or multiprocessor system. Interconnection network, e.g. the Controller Area Network (CAN). [Diagram: four CPUs connected by the network.]

  5. Real-Time Systems: Shared memory. [Diagram: CPUs with private caches attached to one memory] Uniform Memory Access (UMA). [Diagram: groups of CPUs on cache buses, each group with its own memory] Non-Uniform Memory Access (NUMA).

  6. Real-Time Systems: Cooperating tasks. Timing constraints. Inter-task communication: shared data objects. Needs synchronization. [Diagram: tasks T1, T2 and T3 with question marks between them.]

  7. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Shared Register; Software engineering part; Conclusions & Future Work.

  8. Synchronization: Synchronization using locks. Uses semaphores, spinning, disabling interrupts. Negative: blocking; priority inversion; risk of deadlock. Positive: execution time guarantees are easy to do, but pessimistic. Pattern: take lock ... do operation ... release lock.
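
The take lock / do operation / release lock pattern can be sketched in C11 with a spin lock built on the test-and-set primitive. This is a minimal illustration under stated assumptions, not code from the presentation: the names `take_lock`, `release_lock`, `locked_increment` and the shared counter are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical shared state protected by the lock. */
static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static long shared_counter = 0;

/* Take lock: spin until test-and-set reports that the flag was clear. */
void take_lock(void) {
    while (atomic_flag_test_and_set_explicit(&lock_flag, memory_order_acquire)) {
        /* busy-wait: another task holds the lock, so this task is blocked */
    }
}

/* Release lock: clear the flag so that one spinning task can proceed. */
void release_lock(void) {
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

/* Take lock ... do operation ... release lock. */
long locked_increment(void) {
    take_lock();
    long result = ++shared_counter;   /* the protected operation */
    release_lock();
    return result;
}
```

If the task holding the lock is preempted, every other task calling `take_lock` spins until it runs again, which is the blocking behaviour the slide lists as a drawback.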

  9. Non-blocking Synchronization: Lock-Free Synchronization. Retries until not interfered with by other operations. Usually detects interference by using some kind of shared variable indicating busy-state or similar. Pattern: change flag to unique value, or remember current state ... do the operation while preserving the active structure ... check for same value or state and then validate changes, otherwise retry.
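
The remember / operate / validate / retry pattern can be sketched with a C11 compare-and-swap loop. This is an illustrative sketch only; `lockfree_update` and `add_ten` are hypothetical names, and the shared object is reduced to a single integer.

```c
#include <stdatomic.h>

/* Apply op to the shared value without taking a lock.
   Returns the value that was successfully installed. */
int lockfree_update(atomic_int *shared, int (*op)(int)) {
    int seen = atomic_load(shared);      /* remember current state */
    int wanted;
    do {
        wanted = op(seen);               /* do the operation on a private copy */
        /* validate: install only if shared still equals seen;
           on failure seen is refreshed with the current value and we retry */
    } while (!atomic_compare_exchange_weak(shared, &seen, wanted));
    return wanted;
}

/* Example operation used for illustration. */
int add_ten(int v) { return v + 10; }
```

The loop has no upper bound on the number of retries, which matches the lack of execution time guarantees noted for lock-free synchronization.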

  10. Non-blocking Synchronization: Lock-Free Synchronization. Negative: no execution time guarantees; can continue forever and thus cause starvation. Positive: avoids blocking and priority inversion; avoids deadlock; fast execution on average.

  11. Non-blocking Synchronization: Uses atomic synchronization primitives. Uses shared memory. Wait-Free Synchronization: always finishes in a finite number of its own steps. Negative: complex algorithms; memory consuming. [Diagram labels: Test&Set, Compare&Swap, Copying, Helping, Announcing, Split operation, ???]

  12. Non-blocking Synchronization: Wait-Free Synchronization. Positive: execution time guarantees; fast execution; avoids blocking and priority inversion; avoids deadlock; avoids starvation; same implementation on both single- and multiprocessor systems.

  13. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Shared Register; Software engineering part; Conclusions & Future Work.

  14. Shared Data Objects: Correctness criterion for concurrent operations: linearizability. All concurrent executions can be transformed into an equivalent serial sequence of atomic operations preserving the partial order. [Diagram: overlapping Write and Read operations by tasks ti, tj and tk on a time axis t, mapped to a serial order.]

  15. Snapshot: A consistent momentary state of a set of several shared variables that are logically related. One reader (scanner): reads the whole set of variables in one atomic step. Many writers (updaters): each writes to only one variable at a time.

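  16. Snapshot: Correctness. Atomicity / linearizability criteria. [Diagrams: a Read overlapping two Writes to component ci on a time axis t, in three cases marked YES, YES and NO; the marked value is the one returned by the scanner.]

  17. Snapshot: Correctness. Atomicity / linearizability criteria. [Diagrams: a Read overlapping Writes to component ci, marked NO; Writes to components ci and cj, marked NO; the marked value is the one returned by the scanner.]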

  18. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Register; Software engineering part; Conclusions & Future Work.

  19. What are we evaluating: Wait-free snapshot algorithm by Ermedahl et al. 3 register copies for each component. Uses the Test&Set atomic primitive for synchronization. [Diagram legend: used by reader, used by writer.]

  20. Analysis: Real-Time System: measured schedulability. Created "realistic" scenarios on a theoretical 68020 uniprocessor system. Real RTOS parameters. Manual WCET analysis on cycle level. 1 scanner (5 components), 24 updaters (10 real-time tasks, 15 interrupts). Fixed-priority response time analysis. Schedulable without any synchronization. Adding lock/wait-free or semaphore synchronization.

  21. Analysis: Schedulability (%)

  22. Experiments: Simulation. RT-simulator written in Erlang by Ermedahl and Sjödin: fixed-priority preemptive scheduler; semaphores; messages. Subset of the scenarios used in the analysis.

  23. Experiments: Schedulability (%)

  24. Experiments: Multi-node. Simulation of CAN-bus, 1 MHz. 10 nodes connected using messages. Local snapshots on each node. 1 super-snapshot task on 1 node. Subset of the scenarios used for single-node analysis.

  25. Experiments: Rsnap for multi-node

  26. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Register; Software engineering part; Conclusions & Future Work.

  27. Timing Information: Previously used by Chen and Burns in 1999. Assuming a system with periodic fixed-priority scheduling. Notation from standard real-time response time analysis. Uses information about: periods, T; worst-case computation times, C; worst-case response times, R.
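
For reference, the standard fixed-priority response time recurrence that relates T, C and R is R_i = C_i + sum over higher-priority tasks j of ceil(R_i / T_j) * C_j, solved by fixed-point iteration. The sketch below is a textbook version under the assumption that each task's deadline equals its period; the type `RtTask` and the function `response_time` are hypothetical names, and the presentation's own analysis may include further terms such as blocking.

```c
#include <assert.h>
#include <stddef.h>

/* One periodic task: period T and worst-case computation time C. */
typedef struct {
    long period;
    long wcet;
} RtTask;

/* tasks[] is sorted by priority, highest first, so tasks[0..i-1] can
   preempt tasks[i]. Returns the worst-case response time R_i, or -1 if
   the iteration exceeds the period (assumed deadline = period). */
long response_time(const RtTask *tasks, size_t i) {
    long r = tasks[i].wcet;                     /* initial guess: R = C_i */
    for (;;) {
        long next = tasks[i].wcet;
        for (size_t j = 0; j < i; j++) {
            assert(tasks[j].period > 0);
            /* ceil(r / T_j): releases of task j within a window of length r */
            long releases = (r + tasks[j].period - 1) / tasks[j].period;
            next += releases * tasks[j].wcet;
        }
        if (next == r) return r;                /* fixed point reached */
        if (next > tasks[i].period) return -1;  /* misses its deadline */
        r = next;
    }
}
```

For three tasks with (T, C) = (7, 3), (12, 3) and (20, 5), the iteration gives response times 3, 6 and 20.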

  28. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Register; Software engineering part; Conclusions & Future Work.

  29. Snapshot: Back to Basics: Unbounded Memory Protocol. The reader increases the global index and scans backwards. [Diagram: one buffer per component c1 ... ci ... cc along a time axis t, positioned by Snapshotindex; legend: ? = previous values / nil, w = writer position, followed by nil slots.]
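
A single-threaded sketch of the idea on this slide: writers store into the slot selected by the global index, and the scanner advances the index and then scans each component backwards to the newest filled slot. Everything here is an assumption for illustration: the names, the fixed `SLOTS` array standing in for unbounded memory, the `NIL` sentinel, and the absence of atomic operations and memory ordering that a real concurrent implementation would need.

```c
#include <assert.h>

#define COMPONENTS 3
#define SLOTS 64        /* finite stand-in for the unbounded buffers */
#define NIL (-1)        /* empty slot; component values are assumed >= 0 */

static int buffer[COMPONENTS][SLOTS];
static int snapshot_index;

/* Clear all slots and give every component an initial value of 0. */
void snapshot_init(void) {
    snapshot_index = 0;
    for (int c = 0; c < COMPONENTS; c++) {
        for (int s = 0; s < SLOTS; s++) buffer[c][s] = NIL;
        buffer[c][0] = 0;
    }
}

/* Updater: write the component's slot at the current global index. */
void snapshot_update(int component, int value) {
    assert(component >= 0 && component < COMPONENTS && value >= 0);
    buffer[component][snapshot_index] = value;
}

/* Scanner: advance the global index so writers move to fresh slots,
   then scan backwards from the old index to the newest filled slot. */
void snapshot_scan(int out[COMPONENTS]) {
    int last = snapshot_index;
    assert(last + 1 < SLOTS);          /* the sketch runs out of "unbounded" memory here */
    snapshot_index = last + 1;
    for (int c = 0; c < COMPONENTS; c++) {
        int s = last;
        while (buffer[c][s] == NIL) s--;   /* slot 0 is always filled */
        out[c] = buffer[c][s];
    }
}
```

Because writers never touch slots below the index the scanner just left, the values the scanner reads stay fixed while it scans.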

  30. Snapshot: Bounded Memory: Cyclical Buffers. The needed buffer length depends on how fast the updaters are compared to the scanner. Each component can have a different buffer length.

  31. Timing Information: Bounding the needed buffer length for component k [formula], where Ts is the period of the snapshot task and Tw is the period of the writer tasks. Can be refined even further.

  32. Experiments: Using a Sun Enterprise 10000 multiprocessor computer. 1 scanner task and 10 updater tasks, one on each CPU. Comparing two wait-free snapshot algorithms: using timing information; using Test-and-Set synchronization.

  33. Experiments: Scenarios with different ratios between scanner/updater. Measuring response time for scan versus update operations.

  34. Experiments: Scan operation, Average Response Time

  35. Experiments: Update operation, Average Response Time

  36. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Shared Register; Software engineering part; Conclusions & Future Work.

  37. Shared Register: Target domain: shared memory (even without cache coherency). Wait-free atomic shared buffer by Vitanyi et al.: a matrix of 1-reader 1-writer registers; each register contains a value/tag pair encoded as one value. [Diagram: matrix with one row per writer and one column per reader, R11 R12 ... R21 R22 ...; register Rij is written by processor i and read by processor j and holds a tag and a value.]

  38. Shared Register: Algorithm: a reader scans its column for the highest tag and returns the corresponding value; a writer scans its column and writes the next tag together with the new value to its row. Unbounded maximum size for the tag field in the value/tag pair. Assume 8 writer tasks with a 10 ms period: the maximum tag after one hour is 2880000, which needs 22 bits!
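
The read and write steps described on this slide can be sketched as follows. This is a sequential illustration only, not the published construction: the names, the 4-processor matrix, and the 32/32-bit packing of tag and value are assumptions, and a real multiprocessor version needs atomic single-word register accesses plus a rule for tags written concurrently.

```c
#include <assert.h>
#include <stdint.h>

#define NPROC 4   /* hypothetical number of processors */

/* reg[i][j] is written only by processor i and read only by processor j.
   Each cell packs a tag (high 32 bits) and a value (low 32 bits). */
static uint64_t reg[NPROC][NPROC];

static uint64_t pack(uint32_t tag, uint32_t value) {
    return ((uint64_t)tag << 32) | value;
}
static uint32_t tag_of(uint64_t cell)   { return (uint32_t)(cell >> 32); }
static uint32_t value_of(uint64_t cell) { return (uint32_t)cell; }

/* Scan column j and return the cell carrying the highest tag. */
static uint64_t newest_in_column(int j) {
    uint64_t best = reg[0][j];
    for (int i = 1; i < NPROC; i++)
        if (tag_of(reg[i][j]) > tag_of(best)) best = reg[i][j];
    return best;
}

/* Reader: return the value stored with the highest tag in its column. */
uint32_t register_read(int reader) {
    assert(reader >= 0 && reader < NPROC);
    return value_of(newest_in_column(reader));
}

/* Writer: take the highest tag in its column, add one, and write the
   new tag/value pair to every register in its row. Returns the new tag. */
uint32_t register_write(int writer, uint32_t value) {
    assert(writer >= 0 && writer < NPROC);
    uint32_t next = tag_of(newest_in_column(writer)) + 1;
    for (int j = 0; j < NPROC; j++) reg[writer][j] = pack(next, value);
    return next;
}
```

Every write increases the tag by one, which is why the tag field grows without bound unless tags are recycled.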

  39. Timing Information: Analyzing the maximum possible difference between tags observable by a task at two consecutive invocations of the algorithm, in any possible execution: Tmax is the longest period; Rmax is the longest response time; Twr is the period of the writer tasks. Recycling tags: newer tags can restart from zero when we reach a certain tag value. In order to be able to decide whether newer tags are newer we need to have: [formula]. [Diagram: values v1 to v4 placed on a cyclic tag range from 0 to N.]
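
One common way to compare recycled tags is modular ("serial number") comparison, sketched below. It is not necessarily the condition used on the slide: the function name `tag_is_newer` and the requirement that any two tags a task can observe differ by less than N/2 are assumptions made for this illustration.

```c
#include <assert.h>

/* Returns 1 if tag a is newer than tag b when tags are recycled modulo n.
   Assumption (not from the slides): any two tags being compared are
   fewer than n/2 steps apart, so the shorter way round the cycle tells
   which one was issued later. */
int tag_is_newer(unsigned a, unsigned b, unsigned n) {
    assert(n >= 2 && a < n && b < n);
    unsigned ahead = (a + n - b) % n;   /* how far a is ahead of b, modulo n */
    return ahead != 0 && ahead < n / 2;
}
```

With n = 16, tag 1 counts as newer than tag 15 because it is two steps ahead after wrapping past zero. A timing-based bound on the observable tag difference is what lets n be chosen large enough for such a comparison to stay unambiguous.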

  40. Examples: Example task scenario on 8 processors: the unbounded algorithm would have reached tag 68400 in one hour, needing >16 bits.
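
The bit counts quoted for the unbounded tags can be checked with a few lines of C. The helper names are hypothetical, and the tag count assumes that every writer performs exactly one write per period and that each write raises the highest tag by one.

```c
#include <assert.h>
#include <stdint.h>

/* Number of tags generated by `writers` periodic writer tasks, each
   writing once per period, over the given duration. */
uint64_t tags_generated(uint64_t writers, uint64_t period_ms, uint64_t duration_ms) {
    assert(period_ms > 0);
    return writers * (duration_ms / period_ms);
}

/* Smallest number of bits that can represent max_tag. */
unsigned bits_needed(uint64_t max_tag) {
    unsigned bits = 0;
    while (max_tag != 0) {
        bits++;
        max_tag >>= 1;
    }
    return bits;
}
```

For 8 writers with a 10 ms period over one hour (3600000 ms), `tags_generated` gives 2880000 and `bits_needed` gives 22. For the tag 68400 in the example scenario, `bits_needed` gives 17, consistent with "more than 16 bits".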

  41. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Register; Software engineering part; Conclusions & Future Work.

  42. Background: Multithreaded programming needs communication. Communicating using shared data structures like stacks, queues, lists and so on. This needs synchronization! Locks (mutual exclusion) have several drawbacks, especially for real-time systems. Non-blocking solutions are often complex to implement and have non-standard interfaces.

  43. NOBLE: A Non-Blocking Inter-Process Communication Library. Designed with the following properties: Functionality: stacks, queues, lists, snapshot, register, ... with clear specifications. Programmer friendly: #include <noble.h>, NBL<function>. Easy to adapt existing solutions: provides locks as well as non-blocking synchronization.

  44. NOBLE: A Non-Blocking Inter-Process Communication Library. Designed with the following properties (cont.): Efficient: object-oriented design, "virtual functions and inheritance with base classes" in C. Portable: modular design, platform-dependent code separated. Adaptable for different programming languages: C, C++, standard dynamic linked library.

  45. Examples: #include <noble.h>. First create a global variable handling the shared data object, for example a stack: NBLStack *stack; stack = NBLStackCreateLF(10000); When some thread wants to do some operation: NBLStackPush(stack, item); or item = NBLStackPop(stack);

  46. Examples: When the data structure is not in use anymore: NBLStackFree(stack); To change the synchronization mechanism, only one line of code has to be changed! stack = NBLStackCreateLF(10000); is replaced with stack = NBLStackCreateLB();
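
How a one-line change can swap the whole implementation follows from the "virtual functions and base classes in C" design. The sketch below shows that general technique with a function-pointer table. It is not NOBLE's source or API: every `DemoStack...` name is hypothetical, and the two implementations (a bounded array and a linked list) contain no synchronization at all, they only stand in for the lock-free and lock-based variants.

```c
#include <stdlib.h>

typedef struct DemoStack DemoStack;

/* "Virtual function table": one entry per operation. */
typedef struct {
    int   (*push)(DemoStack *s, void *item);  /* 1 on success, 0 on failure */
    void *(*pop)(DemoStack *s);               /* NULL when empty */
    void  (*destroy)(DemoStack *s);
} DemoStackOps;

/* "Base class": every implementation starts with this header. */
struct DemoStack { const DemoStackOps *ops; };

/* Generic entry points called by application code. */
int   DemoStackPush(DemoStack *s, void *item) { return s->ops->push(s, item); }
void *DemoStackPop(DemoStack *s)              { return s->ops->pop(s); }
void  DemoStackFree(DemoStack *s)             { s->ops->destroy(s); }

/* Implementation 1: bounded array. */
typedef struct { DemoStack base; void **slots; size_t top, capacity; } ArrayStack;

static int array_push(DemoStack *s, void *item) {
    ArrayStack *a = (ArrayStack *)s;
    if (a->top == a->capacity) return 0;      /* full */
    a->slots[a->top++] = item;
    return 1;
}
static void *array_pop(DemoStack *s) {
    ArrayStack *a = (ArrayStack *)s;
    return a->top ? a->slots[--a->top] : NULL;
}
static void array_destroy(DemoStack *s) {
    ArrayStack *a = (ArrayStack *)s;
    free(a->slots);
    free(a);
}
static const DemoStackOps array_ops = { array_push, array_pop, array_destroy };

DemoStack *DemoStackCreateBounded(size_t capacity) {
    ArrayStack *a = malloc(sizeof *a);
    if (!a) return NULL;
    a->slots = malloc(capacity * sizeof *a->slots);
    if (!a->slots) { free(a); return NULL; }
    a->base.ops = &array_ops;
    a->top = 0;
    a->capacity = capacity;
    return &a->base;
}

/* Implementation 2: linked list. */
typedef struct Node { void *item; struct Node *next; } Node;
typedef struct { DemoStack base; Node *head; } LinkedStack;

static int linked_push(DemoStack *s, void *item) {
    LinkedStack *l = (LinkedStack *)s;
    Node *n = malloc(sizeof *n);
    if (!n) return 0;
    n->item = item;
    n->next = l->head;
    l->head = n;
    return 1;
}
static void *linked_pop(DemoStack *s) {
    LinkedStack *l = (LinkedStack *)s;
    Node *n = l->head;
    if (!n) return NULL;
    void *item = n->item;
    l->head = n->next;
    free(n);
    return item;
}
static void linked_destroy(DemoStack *s) {
    LinkedStack *l = (LinkedStack *)s;
    while (l->head) { Node *n = l->head; l->head = n->next; free(n); }
    free(l);
}
static const DemoStackOps linked_ops = { linked_push, linked_pop, linked_destroy };

DemoStack *DemoStackCreateLinked(void) {
    LinkedStack *l = malloc(sizeof *l);
    if (!l) return NULL;
    l->base.ops = &linked_ops;
    l->head = NULL;
    return &l->base;
}
```

Application code calls only `DemoStackPush`, `DemoStackPop` and `DemoStackFree`, so replacing `DemoStackCreateBounded(10000)` with `DemoStackCreateLinked()` is the only change needed to switch implementations.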

  47. Experiment: Set of 50000 random operations performed multithreaded on each data structure, with either low or high contention. Comparing the different synchronization mechanisms and implementations available. Varying the number of threads from 1 to 30. Performed on multiprocessors: Sun Enterprise 10000 with 64 CPUs, Solaris; Compaq PC with 2 CPUs, Win32.

  48. Experiments: Linked List (high)

  49. Status: Multiprocessor support: Sun Solaris (Sparc); Win32 (Intel x86); SGI (Mips), evaluation stage; Linux (Intel x86), evaluation stage. Extensive manual. Web site up and running: http://www.cs.chalmers.se/~noble

  50. Schedule: Introduction; Real-Time Systems; Synchronization; Shared Data Objects: Snapshots; Evaluation; The Effect of Using Timing Information; Snapshot; Register; Software engineering part; Conclusions & Future Work.
