Patterns for providing End-to-End Real-time Guarantees in DOC Middleware Irfan Pyarali irfan@cs.wustl.edu PhD Final Examination Advisors: Dr. Ron K. Cytron and Dr. Douglas C. Schmidt
Presentation Outline • Quick overview of RT-CORBA and thesis scope • Trace an invocation end-to-end • Identifying sources of unbounded priority inversion • Patterns for providing end-to-end real-time guarantees • RT-CORBA architecture • Empirical evaluation of end-to-end behavior
Real-time CORBA Overview • RT-CORBA adds QoS control to regular CORBA to improve application predictability • Bounding priority inversions • Managing resources end-to-end • Policies & mechanisms for resource configuration/control in RT-CORBA include: Processor Resources • Thread pools • Priority models • Portable priorities Communication Resources • Protocol policies • Explicit binding Memory Resources • Request buffering • These capabilities address some (but by no means all) important real-time application development challenges
Thesis Scope • End-to-end QoS guarantees require vertical and horizontal integration • Peer-to-Peer • Network adapter to Application layer • Thesis focus areas: • End-to-end priority propagation • Demultiplexing • Dispatching • Concurrency
Tracing an Invocation End-to-end
[Figure: Client ORB (Connection Cache, Memory Pool, Connector, Reply Demultiplexer, Reactor, condition variables CV1 and CV2) and Server ORB (Connection Cache, Memory Pool, Acceptor, Reactor, POA)]
1. Lookup connection to Server (S)
2. Lookup failed – make new connection to Server (S)
3. Add new connection to cache
4. Add new connection to Reactor
5. Accept new connection from Client (C)
6. Add new connection to Cache
7. Add new connection to Reactor
8. Wait for incoming events on the Reactor
9. Allocate memory to marshal data
10. Send data to Server – marking connection (S) busy
11. Wait in Reply Demultiplexer – some other thread is already leader, become follower
12. Read request header
13. Allocate buffer for incoming request
14. Read request data
15. Demultiplex request and dispatch upcall
16. Send reply to client
17. Wait for incoming events
18. Leader reads reply from server
19. Leader hands off reply to follower
20. Follower unmarshals reply
Identifying Sources of Unbounded Priority Inversion
[Figure: client ORB and server ORB components]
• Connection cache
  • Time required to send request is dependent on availability of network resources and the size of the request
  • Priority inheritance will help
  • Creating new connections can be expensive and unpredictable
• Memory Pool
  • Time required to allocate new buffer depends on pool fragmentation and memory management algorithm
  • Priority inheritance will help
• Reply Demultiplexer
  • If the leader thread is preempted by a thread of higher priority before the reply is handed off (i.e., while reading the reply or signaling the invocation thread), then unbounded priority inversion will occur
  • There is no chance of priority inheritance since signaling is done through condition variables
• Reactor
  • No way to distinguish a high priority client request from one of lower priority
• POA
  • Time required to demultiplex request may depend on server organization
  • Time required to dispatch request may depend on contention on dispatching table
  • Priority inheritance will help
CORBA Request Demultiplexing [Figure: a request carrying POA Id Fox/Simpsons/Family, Servant Id Homer, and Operation Name Doh! passes from ORB I/O through the ORB Core Layer, the POA Layer (Root POA; Fox and Disney; Simpsons, King of the Hill, Winnie the Pooh; Family, Townspeople), the Servant Layer (Homer, Lisa, Mr. Burns, Moe, Bobby, Hank, Pooh, Tigger), and the Skeleton Layer (operations such as Doh!, Drink Beer, Work at Nuclear Plant, Play Saxophone); five hash maps are consulted along the way]
Problems with Request Demultiplexing • Slow demultiplexing • Five hash lookups required • Unacceptable worst-case time • Hash map searches are O(n) worst-case • Lookup time dependent on • Height of POA hierarchy • Number of POAs • Number of Servants • Number of Operations • Required demultiplexing • Independent of server organization and configuration • Predictable, efficient and scalable
Skeleton Demultiplexing Forces • Skeleton names known at compile-time • Cannot change representation • CORBA requires names to be transmitted as strings • Solution • Use GPERF to generate a perfect hash function offline • Worst-case time O(1) [Figure: perfect hash map over Homer's operation names Bowl, Doh!, Drink Beer, Play trick on Ned, Work at Nuclear Plant]
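The offline/online split behind perfect hashing can be sketched in a few lines. This is a minimal illustration, not gperf's actual algorithm: it brute-forces a seed for a seeded FNV-1a style hash until every operation name lands in its own slot. The helper names (`fnv1a`, `build_perfect_table`, `lookup`) are hypothetical; the operation names are the ones shown on the slide.

```python
FNV_PRIME = 16777619


def fnv1a(name: str, seed: int) -> int:
    """32-bit FNV-1a style hash; `seed` replaces the usual offset basis."""
    h = seed & 0xFFFFFFFF
    for byte in name.encode("utf-8"):
        h = ((h ^ byte) * FNV_PRIME) & 0xFFFFFFFF
    return h


def build_perfect_table(names, max_seed=100_000):
    """Offline step: search for a seed that maps every name to its own slot."""
    size = len(names)
    for seed in range(max_seed):
        slots = [None] * size
        for name in names:
            i = fnv1a(name, seed) % size
            if slots[i] is not None:
                break  # collision: try the next seed
            slots[i] = name
        else:
            return seed, slots  # no collision for any name
    raise ValueError("no collision-free seed found; enlarge max_seed")


def lookup(seed, slots, name):
    """Online step: one hash and one string compare, with no probing."""
    i = fnv1a(name, seed) % len(slots)
    return i if slots[i] == name else None


# Operation names are fixed at compile time, so the table is built once.
OPERATIONS = ["Bowl", "Doh!", "Drink Beer", "Play trick on Ned", "Work at Nuclear Plant"]
SEED, SLOTS = build_perfect_table(OPERATIONS)
```

Because the name set never changes at run time, the search cost is paid once offline; each lookup then costs one hash over the name plus one comparison, independent of how many operations the interface defines.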
Servant Demultiplexing Forces • Servants can be added and removed on the fly • Can change representation • Object reference is ORB specific • Solution • Use Active Object Map • Worst-case time O(1) [Figure: active map over the servants Homer, Marge, Bart, Lisa, Maggie in the Family POA]
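An active map can be sketched as an array whose slot index is handed back to the caller and embedded in the ORB-specific object key, so a later lookup is a direct array access rather than a search. The class below is a minimal sketch under that assumption; the `(index, generation)` key layout and all method names are illustrative, not the ORB's actual interface.

```python
class ActiveMap:
    """Array-backed map whose keys are (index, generation) pairs."""

    def __init__(self):
        self._values = []       # slot -> stored object (None when free)
        self._generations = []  # slot -> generation counter
        self._free = []         # indices of free slots, reused LIFO

    def bind(self, value):
        """Store `value` and return the key to embed in the object reference."""
        if self._free:
            index = self._free.pop()
        else:
            index = len(self._values)
            self._values.append(None)
            self._generations.append(0)
        self._values[index] = value
        return (index, self._generations[index])

    def find(self, key):
        """O(1) lookup: a bounds check, a generation check, and an array read."""
        index, generation = key
        if 0 <= index < len(self._values) and self._generations[index] == generation:
            return self._values[index]
        return None

    def unbind(self, key):
        """Free the slot and bump its generation so stale keys stop matching."""
        if self.find(key) is None:
            return False
        index, _ = key
        self._values[index] = None
        self._generations[index] += 1
        self._free.append(index)
        return True
```

The generation counter is what makes slot reuse safe: a reference minted before a servant was removed carries the old generation and no longer resolves, even if the slot now holds a different servant.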
POA Demultiplexing Forces • POAs can be added and removed on the fly • Can change representation • Object reference is ORB specific • Solution • Flatten POA hierarchy • Logically still the same • Use Active Object Map • Worst-case time O(1) [Figure: active map over the flattened POA names Fox, Fox / Simpsons, Fox / Simpsons / Family, Fox / Simpsons / Townspeople, Fox / King of the Hill]
Revised Request Demultiplexing [Figure: the same ORB I/O, ORB Core, POA, Servant, and Skeleton layers, now using an active map for the POA Id, an active map for the Servant Id, and a perfect hash map for the Operation Name; example request key: 2:1, 0:5, Doh!; the POA Layer is flattened under the Root POA to Fox, Fox / Simpsons, Fox / Simpsons / Family, Fox / Simpsons / Townspeople, Fox / King of the Hill, Disney, Disney / Winnie the Pooh]
Summary of Revised Request Demultiplexing Pyarali, et al., Applying Optimization Principle Patterns to Real-time ORBs COOTS, May 1999
Reference Count during Dispatch Problem • Satisfy multi-threaded (MT) applications with stringent real-time requirements • Lock cannot be held for the duration of an upcall • The target element cannot be removed while an upcall is in progress • Other objects can be added or removed while an upcall is in progress • Bound all priority inversions • Increase concurrency, allowing simultaneous upcalls • Solution • Count the current number of upcalls on each entry • Events are dispatched as follows: • Acquire lock, locate entry, increase reference count, release lock • Perform upcall • Re-acquire lock, decrement reference count, release lock • Once reference count reaches zero, element can be safely removed • Reference count is greater than zero during upcall • Consequences • Upcalls can execute concurrently and can safely use the dispatching table • Priority inversion independent of upcall duration • Can collaborate in object life-cycle management • Requires two lock acquisitions per upcall Pyarali, et al., A Pattern Language for Efficient, Predictable, Scalable, and Flexible Dispatching Mechanisms for DOC Middleware, ISORC, March 2000
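The three dispatch steps above can be sketched as follows. This is a minimal illustration, assuming a dictionary-backed table and deferred removal; the class and method names are hypothetical and Python's `threading.Lock` stands in for whatever mutex the ORB uses.

```python
import threading


class DispatchTable:
    """Dispatching table that counts in-progress upcalls per entry."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # key -> [handler, refcount, removal_requested]

    def add(self, key, handler):
        with self._lock:
            self._entries[key] = [handler, 0, False]

    def remove(self, key):
        """Remove now if idle; otherwise defer until the last upcall returns."""
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return False
            if entry[1] == 0:
                del self._entries[key]
            else:
                entry[2] = True
            return True

    def dispatch(self, key, *args):
        # Step 1: acquire lock, locate entry, increase reference count, release lock.
        with self._lock:
            entry = self._entries.get(key)
            if entry is None or entry[2]:
                raise KeyError(key)  # unknown entry, or removal already requested
            entry[1] += 1
            handler = entry[0]
        try:
            # Step 2: perform the upcall with no lock held.
            return handler(*args)
        finally:
            # Step 3: re-acquire lock, decrement reference count, release lock.
            with self._lock:
                entry[1] -= 1
                if entry[1] == 0 and entry[2] and self._entries.get(key) is entry:
                    del self._entries[key]

    def __contains__(self, key):
        with self._lock:
            return key in self._entries
```

Since the lock is held only around the two short bookkeeping sections, an upcall may itself call `remove` on its own entry without deadlocking; the entry is then dropped when the reference count returns to zero.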
Half-Sync/Half-Async Thread Pool Design Design Overview • Single Acceptor endpoint • One Reactor for each priority level • Each lane has a queue • I/O and application processing are done in different threads
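The queue-per-lane idea can be sketched as below: an I/O-side caller enqueues each request on the queue of its lane, and worker threads belonging to that lane block only on their own queue. This is a simplified illustration with hypothetical names; it omits the Acceptor and the per-priority Reactors, and Python's `threading` module does not expose native thread priorities, which a real lane would set on its threads.

```python
import queue
import threading


class LanedThreadPool:
    """One request queue and one set of worker threads per priority lane."""

    _STOP = object()  # sentinel telling a worker to exit

    def __init__(self, lane_priorities, workers_per_lane=1):
        self._queues = {p: queue.Queue() for p in lane_priorities}
        self.handled = {p: [] for p in lane_priorities}  # per-lane log for the demo
        self._log_lock = threading.Lock()
        self._workers = []
        for priority in lane_priorities:
            for _ in range(workers_per_lane):
                t = threading.Thread(target=self._work, args=(priority,))
                t.start()
                self._workers.append((priority, t))

    def enqueue(self, priority, request):
        """Called by the I/O layer after it has read a request off the wire."""
        self._queues[priority].put(request)

    def _work(self, priority):
        """Application layer: block on this lane's queue only."""
        q = self._queues[priority]
        while True:
            request = q.get()
            if request is self._STOP:
                return
            result = request()  # the upcall
            with self._log_lock:
                self.handled[priority].append(result)

    def shutdown(self):
        """Let each lane drain its queue, then stop and join its workers."""
        for priority, _ in self._workers:
            self._queues[priority].put(self._STOP)
        for _, t in self._workers:
            t.join()
```

A short usage example: create `LanedThreadPool(["high", "low"])`, call `enqueue("high", some_callable)` from the I/O side, and call `shutdown()` when done. The extra queueing step between I/O and application threads is the cost this design pays relative to the Leader/Followers design, where the same thread does both.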
Performance of Memory Management Schemes • Performance order: Stack > TSS > Global • Predictability order: Stack = TSS > Global • Contention greatly affects Global Memory Pool • No effect on Stack and TSS Memory Pools
Leader/Followers Thread Pool Design Design Overview • Each lane has its own set of resources • Reactor/Acceptor • I/O and application processing are done in the same thread
Performance of Thread Pool Implementations Pyarali, et al., Optimizing Thread-Pool Strategies for Real-Time CORBA Optimizing Middleware ACM Workshop, June 2001
RT-CORBA Architecture [Figure: ORB A and ORB B, each with a Connector, a POA, a Low Priority Lane, and a High Priority Lane; every lane has its own Connection Cache, Memory Pool, Leader/Followers thread set, Acceptor, Reactor, and condition variables (CV1, CV2)]
Motivation for Real-time Experiments • Illustrate RT, deterministic, and predictable ORB behavior • Demonstrate end-to-end predictability by utilizing the ORB to • Propagate and preserve priorities • Exercise strict control over the management of resources • Avoid unbounded priority inversions • End-to-end predictability of timeliness in fixed priority CORBA • Respecting thread priorities between client and server for resolving resource contention during the request processing • Bounding the duration of thread priority inversions end-to-end • Bounding the latencies of operation invocations
Network Test Bed Description void method (in unsigned long work);
Description of Experiments • Increasing Workload in Vanilla CORBA • Increasing Workload in RT-CORBA With Lanes (Increasing Priority Increasing Rate) • Increasing Workload in RT-CORBA With Lanes (Increasing Priority Decreasing Rate) • Increasing Best-effort Work in Vanilla CORBA • Increasing Best-effort Work in RT-CORBA With Lanes • Increasing Workload in RT-CORBA Without Lanes
Experiment 1: Increasing Workload in Vanilla CORBA Experiment • Measure the disruption caused by Increasing Workload in Vanilla CORBA • Increasing Priority Increasing Rate Server • 3 threads Client • 3 Rate-based Invocation threads • High 75 Hertz • Medium 50 Hertz • Low 25 Hertz
Conclusions: Increasing Workload in Vanilla CORBA • As workload increases and system capacity decreases, the high priority 75 Hertz client is affected first, followed by the medium priority 50 Hertz client, and finally by the low priority 25 Hertz client • This happens because the server treats all clients equally • Behavior is unacceptable for a real-time system
Experiment 2: Increasing Workload in RT-CORBA With Lanes (Increasing Priority Increasing Rate) Experiment • Measure the disruption caused by Increasing Workload in RT-CORBA With Lanes • Increasing Priority Increasing Rate Server • 3 thread lanes • High / Medium / Low Client • 3 Rate-based Invocation threads • High 75 Hertz • Medium 50 Hertz • Low 25 Hertz
Results A: Increasing Workload in RT-CORBA With Lanes (Increasing Priority Increasing Rate) (Client and Server on same machine)
Results B: Increasing Workload in RT-CORBA With Lanes (Increasing Priority Increasing Rate) (Client and Server on remote machines)
Conclusions: Increasing Workload in RT-CORBA With Lanes (Increasing Priority Increasing Rate) • As workload increases and system capacity decreases, the low priority 25 Hertz client is affected first, followed by the medium priority 50 Hertz client, and finally by the high priority 75 Hertz client • This happens because the server gives higher priority clients preference over lower priority clients • When client and server are on separate machines, the lower priority client threads are able to sneak in some requests between the time a reply is sent to the high priority thread and the time a new request is received from it • Behavior is consistent for a real-time system
Experiment 3: Increasing Workload in RT-CORBA With Lanes (Increasing Priority Decreasing Rate) Experiment • Measure the disruption caused by Increasing Workload in RT-CORBA With Lanes • Increasing Priority Decreasing Rate Server • 3 thread lanes • High / Medium / Low Client • 3 Rate-based Invocation threads • High 25 Hertz • Medium 50 Hertz • Low 75 Hertz
Results: Increasing Workload in RT-CORBA With Lanes (Increasing Priority Decreasing Rate)
Conclusions: Increasing Workload in RT-CORBA With Lanes (Increasing Priority Decreasing Rate) • As workload increases and system capacity decreases, the low priority 75 Hertz client is affected first, followed by the medium priority 50 Hertz client, and finally by the high priority 25 Hertz client • This happens because the server gives higher priority clients preference over lower priority clients • Behavior is consistent for a real-time system
Experiment 4: Increasing Best-effort Work in Vanilla CORBA Experiment • Measure the disruption caused by Increasing Best-effort Work in Vanilla CORBA • Increasing Priority Increasing Rate Server • 4 threads Client • 3 Rate-based Invocation threads • High 75 Hertz • Medium 50 Hertz • Low 25 Hertz • Several Best-effort threads: continuous invocations Notes • System is running at capacity: any progress made by Best-effort threads will cause disruptions
Conclusions: Increasing Best-effort Work in Vanilla CORBA • All three priority-based clients suffer as best-effort clients are added to the system • This happens because the server treats all client threads equally • Behavior is unacceptable for a real-time system
Experiment 5: Increasing Best-effort Work in RT-CORBA With Lanes Experiment • Measure the disruption caused by Increasing Best-effort Work in RT-CORBA With Lanes • Increasing Priority Increasing Rate Server • 4 thread lanes • High / Medium / Low / Best-effort Client • 3 Rate-based Invocation threads • High 75 Hertz • Medium 50 Hertz • Low 25 Hertz • Several Best-effort threads: continuous invocations Notes • System is running at two levels • At capacity: any progress by Best-effort threads will cause disruptions • Just below capacity: Best-effort threads should be able to capture any slack in the system
Results A: Increasing Best-effort Work in RT-CORBA With Lanes, System Running at Capacity (Work = 30) (Client and Server on same machine)
Results B: Increasing Best-effort Work in RT-CORBA With Lanes, System Running Slightly Below Capacity (Work = 28) (Client and Server on same machine)
Results C: Increasing Best-effort Work in RT-CORBA With Lanes, System Running Slightly Below Capacity (Work = 28) (Client and Server on remote machines)
Conclusions: Increasing Best-effort Work in RT-CORBA With Lanes • Addition of best-effort client threads did not affect any of the three priority-based clients • Best-effort client threads were limited to picking up slack left in the system • As the number of best-effort client threads increases, throughput per best-effort client thread decreases, but the collective best-effort client throughput remains constant • When client and server are on separate machines, there is more slack in the system since all the client-side processing is done on another machine • Behavior is consistent for a real-time system
Experiment 6: Increasing Workload in RT-CORBA Without Lanes Experiment • Measure the disruption caused by Increasing Workload in RT-CORBA Without Lanes • Increasing Priority Increasing Rate Server • 3 threads in pool Client • 3 Rate-based Invocation threads • High 25 Hertz • Medium 50 Hertz • Low 75 Hertz Notes • Server pool priority will be varied • Low / Medium / High
Result A: Increasing Workload in RT-CORBA Without Lanes, Server Pool Priority = Low
Result B: Increasing Workload in RT-CORBA Without Lanes, Server Pool Priority = Medium
Result C: Increasing Workload in RT-CORBA Without Lanes, Server Pool Priority = High