Flexible Scheduling in Middleware for Distributed Real-Time Applications

Flexible Scheduling in Middleware for Distributed Rate-Based Real-Time Applications
Christopher D. Gill
Dissertation Supervisors: Dr. Ron K. Cytron, Dr. Douglas C. Schmidt
Department of Computer Science, Washington University, St. Louis, MO
cdgill@cs.wustl.edu
Tuesday, December 18, 2001

Motivation for Studying Adaptive Middleware
Trends
• Hardware keeps getting smaller, faster, & cheaper
• Software keeps getting larger, slower, & more expensive
Historical Challenges
• Building distributed systems is hard
• Building them on-time & under budget is even harder
New Challenges
• Many mission-critical distributed applications require real-time QoS guarantees
  • e.g., combat systems, online trading, telecom
• Building QoS-enabled applications manually is tedious, error-prone, & expensive
• Conventional middleware does not support real-time QoS requirements effectively
[Figure labels: Client, AbstractService, Service, Proxy]

Overview of Research Areas These technical challenges raise crucial systems issues in both theoretical and empirical dimensions

Motivating Applications • Boeing Bold Stroke Middleware • Infrastructure Platform • Operations well defined • Event-mediated middleware solution • Previous-generation systems static • Next-generation systems dynamic and adaptive

Limitations With Existing Approaches
Historically, each application has solved scheduling on its own
• Tedious, error-prone
• Costly over system lifecycles
• Single-paradigm approaches
Current middleware lacks hooks for key domain-specific features, e.g.:
• Optimized integration with higher level managers
• Hybrid static-dynamic scheduling strategies
• Strategies built from primitive elements
• Adaptive domain-specific optimizations

Research Contributions Systems Architecture and Framework • Extends open-source middleware Patterns for Adaptive Scheduling • Capture design experience and solutions Empirical Evaluation of Adaptive Scheduling • Practical benefits for real-world applications

Research Approach: the Kokyu Flexible Middleware Scheduling/Dispatching Framework
• Dispatcher is (re)configurable
  • Multiple priority lanes
  • Queue, thread, timers per lane
  • Starts repetitive timers once
  • Looks up lane on each arrival
• Application specifies characteristics
  • e.g., criticality, periods, dependencies
• Scheduler assigns rates & priorities per topology, scheduling policy
  • Defines necessary dispatch configuration
• Implicit projection
  • Of specific scheduling policy into generic dispatch infrastructure
[Figure labels: Scheduler (rate propagation, WCET propagation, selected rates, propagated rates, rate tuples, tuple visitor, operation visitors, sub-graph, rate and priority assignment policy); Dispatcher (dispatching configuration, static RMS, static mandatory, optional LLF, laxity, timers)]

Greater Utilization with Criticality Isolation

Safety: Meet Critical Deadlines
• Static scheduling cannot isolate critical operation deadlines in overload
[Figure: Static vs. Dynamic scheduling, compared at the same point in execution]

Solution: Support for Hybridizing Static & Dynamic Scheduling Heuristics
• Abstract mapping: operation characteristics → OS & middleware primitives
• Generalizes RMS, EDF, LLF, MUF, RMS+LLF, RMS+EDF, …
• Arbitrary composition of primitives
• Drives factory-driven dispatching module (re)configuration at run-time (work in progress)
[Figure labels: Dispatcher, dispatching configuration, MUF, laxity, critical, non-critical]

No One Strategy is Optimal

Case for Multi-Paradigm Scheduling
[Figure labels: Dynamic1, Dynamic2, Static]

Solution: Composition of Scheduling Heuristics from Dispatching Primitives
Gives Fine-Grained Control over Feasibility / Performance Trade-Off
[Figure labels: Dispatcher; RMS+LLF and MUF configurations; static and laxity primitives; feasibility, performance, mandatory, optional]

Improving the Bound on Adaptive Rescheduling

Benefits of Resource Adaptation in Middleware
Adaptation Preserves Feasibility in Changing Environment
• Skillfully done, it also improves performance
Middleware Offers Portable, Robust Solutions
• Can add new managers to a system, e.g., QuO, RTARM
However, a New Manager (e.g., RTARM) May Be a Black Box
• Determines the number of scheduler calls
• Determines the inputs to each call
Need Optimizations to Ensure Nimble Adaptation

Adaptation Time (RTARM as a Black Box)
• n = number of target region operations
• O(log n) or O(n) fits the measured time per call
• Number of calls is O(n)
• Time to adapt is thus O(n log n) or O(n^2)
• Either way, we can do better

Solution: Integrated Rate & Priority Selection
• Skillfully integrate rate and priority mechanisms
• At least as good overall: O(log n) per operation: comparison sort
• Better in special cases: O(1) per operation: radix sort
Retains Modular Policy while Supporting Optimization
[Figure labels: Scheduler, rate propagation, WCET propagation, selected rates, propagated rates, rate tuples, tuple visitor, operation visitors, sub-graph, rate and priority assignment policy]
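The two cost cases on this slide can be illustrated with a minimal Python sketch. It is not the Kokyu implementation: the function names, the dictionary of operation rates, and the `max_rate_hz` bound are assumptions made for illustration, and counting sort (a single-pass special case of radix sort) stands in for the radix sort named on the slide. Priority 0 is taken to be the highest, with higher rates receiving higher priority as in RMS.

```python
def priorities_by_comparison_sort(rates_hz):
    """Rank distinct rates with a comparison sort: O(n log n) overall."""
    distinct = sorted(set(rates_hz.values()), reverse=True)
    rank = {rate: prio for prio, rate in enumerate(distinct)}
    return {op: rank[rate] for op, rate in rates_hz.items()}


def priorities_by_counting_sort(rates_hz, max_rate_hz):
    """Rank distinct rates with one bucket per integer rate.

    Cost is O(n + max_rate_hz), i.e. O(1) per operation when the rate
    range is small and fixed (the special case assumed here).
    """
    present = [False] * (max_rate_hz + 1)
    for rate in rates_hz.values():
        present[rate] = True
    rank = {}
    for rate in range(max_rate_hz, 0, -1):  # highest rate first
        if present[rate]:
            rank[rate] = len(rank)
    return {op: rank[rate] for op, rate in rates_hz.items()}


# Hypothetical operations and rates, for illustration only.
ops = {"nav": 5, "hud": 20, "io": 10, "display": 20}
by_sort = priorities_by_comparison_sort(ops)
by_bucket = priorities_by_counting_sort(ops, max_rate_hz=20)
```

Both functions produce the same assignment; only the cost model differs, which is the point of keeping the policy modular while swapping the sorting mechanism.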

Unobtrusive Monitoring and Control Feedback

Solution: Kokyu Real-Time Metrics Monitoring Framework
• Consistent View of Time
• Customized Data Collection
• Logging and Visualization
• Feedback to Higher-Level Resource Managers
[Figure labels: FRAME MANAGER, DISPATCHER, OPERATIONS, PROBES, METRICS CACHE, SHARED MEMORY, STORAGE, METRICS MONITOR, REMOTE LOGGER, EMBEDDED BOARDS, REMOTE WORKSTATION, DOVE Browser (Java), QuO, RTARM]

Co-Scheduling Control Elements with Application

Inclusive Systems Approach
• Problem: separate treatment of control elements and application
• Goal: co-scheduling of control elements with the application
[Figure labels: Dispatcher; high, medium, low; App!, App?, Control!, Control?]

Solution: Use Empirical & A Priori Information to Co-Schedule Resource Mgrs & Applications
Preserve Assurances but Optimize Performance
Towards an Adaptive Control Decision Lattice
• RMS {App!, Control!,?, App?} | LLF {App!, Control!,?, App?}
• RMS {App!, Control!,?} LLF {App?} | LLF {App!, Control!,?} LLF {App?}
• RMS {App!, Control!} LLF {Control?, App?} | LLF {App!, Control!} LLF {Control?, App?}
• RMS {App!} LLF {Control!,?, App?} | LLF {App!} LLF {Control!,?, App?}
[Figure labels: Dispatcher; each lattice node annotated with feasibility and performance]

Experimental Test-Bed
Application
• Research Version of Operational Flight Program for AV-8B Aircraft
• Added navigation route computations to ramp non-critical load
• Added critical and non-critical computations to inject execution time jitter
Middleware
• The ACE ORB (TAO)
• TAO Real-Time Event Channel
• Kokyu Framework: Scheduling, Dispatching, and Metrics
Operating System and Hardware
• VxWorks RTOS on the PPC boards
• 200 MHz Motorola PPC 604 card
• Two 100 MHz Dy4-177 PPC 603 cards
• Dy4-783 memory mapped display processor
• Commercial VME-64 chassis with all boards
• Switched Ethernet, MIL-STD-1553 MUX Bus on one Dy4-177 card

Popular Scheduling Strategies Rate Monotonic Scheduling (RMS) • Assigns thread priorities by rate • Operations at each priority handled in FIFO order RMS + Minimum Laxity First (MLF) • Critical operations managed as in RMS • Non-critical operations managed in single lowest priority • Non-critical operations handled in minimum laxity (slack time) order Maximum Urgency First (MUF) • Thread priority per criticality level • Operations in each priority level handled in laxity order
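The three strategies above differ only in how they order ready operations, which a short Python sketch can make concrete. This is an illustrative model, not the Kokyu dispatcher: the `Op` fields, the key functions, and the sample operations are all hypothetical, and each sort key is a tuple whose first element plays the role of the priority lane while the rest orders operations inside the lane.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    rate_hz: int
    critical: bool
    arrival: int      # arrival sequence number (FIFO tie-breaker)
    deadline: float   # absolute deadline in ms
    remaining: float  # remaining execution time in ms


def laxity(op, now):
    """Slack time: how long the operation can wait and still finish."""
    return op.deadline - now - op.remaining


def rms_key(op, now):
    # Priority by rate (higher rate first), FIFO within a priority.
    return (-op.rate_hz, op.arrival)


def rms_mlf_key(op, now):
    # Critical operations as in RMS; all non-critical operations share
    # one lowest lane and are ordered by minimum laxity.
    if op.critical:
        return (0, -op.rate_hz, op.arrival)
    return (1, laxity(op, now), op.arrival)


def muf_key(op, now):
    # One lane per criticality level, laxity order inside each lane.
    return (0 if op.critical else 1, laxity(op, now), op.arrival)


def dispatch_order(ops, key, now=0.0):
    return [op.name for op in sorted(ops, key=lambda op: key(op, now))]


# Hypothetical ready set at time 0.
ready = [
    Op("nav", 5, False, 0, 200.0, 20.0),
    Op("hud", 20, True, 1, 50.0, 30.0),
    Op("io", 10, False, 2, 100.0, 10.0),
    Op("weapon", 5, True, 3, 200.0, 190.0),
]
rms_order = dispatch_order(ready, rms_key)
rms_mlf_order = dispatch_order(ready, rms_mlf_key)
muf_order = dispatch_order(ready, muf_key)
```

In this sample, the low-rate critical operation `weapon` runs last under RMS, second under RMS+MLF, and first under MUF, which is the kind of difference the benchmarks later in the talk measure.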

Experimental Benchmarks • Measure Performance of Heuristics over Execution Time Jitter & Load • Each numbered operating region is stable • Transitions between operating regions: changes in SRT load, HRT+SRT jitter • Performed using realistic hardware, OS, middleware, OFP application

Region 7 Performance (MUF does Better)

Region 8 Performance (RMS+MLF does Better)

Adaptation is Beneficial: Best Strategy Differs
[Figure: three panels, each with axis categories L, ML, MH, H]

Mining Adaptation Clues

Refining Adaptation Clues

Adaptation over Scheduling Heuristics • A Map of the Best Performing Heuristic over Execution Jitter & Load • RMS performs best if system is under-loaded (theory predicts this) • RMS+MLF performs best in overload if jitter is very high or very low • MUF performs best if jitter is moderate • A Basis for Adaptive Control • Run-time observable measure correlates with performance: operation latency • A simple automaton could be constructed
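One way the "simple automaton" mentioned above might look is sketched below in Python. It encodes only the map stated on this slide; the class name, the three jitter categories, and the boolean overload flag are assumptions, and how load and jitter would be derived from observed operation latency is left out.

```python
class HeuristicSelector:
    """Tracks the active heuristic and reports when it should change."""

    JITTER_LEVELS = ("low", "moderate", "high")  # hypothetical categories

    def __init__(self):
        self.active = "RMS"

    @classmethod
    def best_for(cls, overloaded, jitter):
        if jitter not in cls.JITTER_LEVELS:
            raise ValueError(f"unknown jitter level: {jitter!r}")
        if not overloaded:
            return "RMS"        # under-loaded: RMS performs best
        if jitter == "moderate":
            return "MUF"        # overload with moderate jitter
        return "RMS+MLF"        # overload with very low or very high jitter

    def observe(self, overloaded, jitter):
        """Update the state; return True if a reconfiguration is needed."""
        target = self.best_for(overloaded, jitter)
        changed = target != self.active
        self.active = target
        return changed
```

A caller would feed each new observation to `observe` and reconfigure the dispatcher only when it returns `True`, so stable operating regions cause no switching.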

My Contribution: A Unified Middleware Approach
• Real QoS problems require both theoretical and empirical perspectives
• Scheduling theory generalized over OS/middleware primitives → heuristic space
• Empirical study of specific heuristic (sub-)spaces is crucial
• Analogy: theory/studies of Ethernet behavior: binary exponential backoff vs. congestion collapse

Research Impacts and Collaborations

Future Research Objectives

Concluding Remarks
Empirical Evaluation
• Validates adaptive/hybrid scheduling approach
• Quantifies costs/benefits of discrete alternatives
• Powerful when combined with theoretical view
  • “Mining” technique for problems & properties
Composable Scheduling/Dispatching
• Enables domain-specific optimizations, especially when design decisions are aided by empirical data
Heuristic Space Experiments
• Offer a quantitative blueprint for adaptation
Open-Source Code
• All software described here that is uniquely a part of my research will be made available in the ACE_wrappers distribution
  • Kokyu framework (early 2002)
  • Dispatching for new TAO Event Channel

Thanks Mentors • Dr. Douglas C. Schmidt, Dr. David L. Levine, and Dr. Ron K. Cytron Colleagues and Collaborators • Faculty and Staff of WU CS and CoE • Dr. Douglas Niehaus • Mr. David Sharp, Mr. Bryan Doerr, Mr. Don Winter, Dr. David Corman, Dr. Doug Stuart, Mr. Brian Mendel, Mr. Greg Holtmeyer, Mr. Pat Goertzen, Ms. Jeanna Gossett, Ms. Amy Wright, Mr. Jim Urness, Mr. Tom Venturella, Mr. Russ Wolter • Mr. Kenneth Littlejohn (AFRL), Dr. Gary Koob (DARPA ITO) • Dr. Rakesh Jha, Mr. John Shackleton, Mr. Nigel Birch • Dr. Joseph Loyall, Dr. Richard Schantz, Dr. John Zinky, Dr. David Bakken • Dr. Ebrahim Moshiri, Mr. Malcolm Spence, Mr. Kevin Stanley Friends and Family • Members of the DOC Group • WU CS Graduate Students • My wife Barb and son Paul • My Mom Dr. Helen Gill, Dad Mr. David Gill and sister Ms. Sarah E. Gill • My parents and siblings-in-law

Additional Questions ? These slides: http://www.cs.wustl.edu/~cdgill/cdgill_defense.ppt http://www.cs.wustl.edu/~cdgill/cdgill_defense.pdf Contact Information: http://www.cs.wustl.edu/~cdgill cdgill@cs.wustl.edu

Backup Slides

Number of Calls is Linear

Time in Each Call Appears Linear

Problem: Limitations with Existing Approaches to Time Frame Management
Previously ad hoc
• Used different time types
  • i.e., gethrtime() vs. gettimeofday()
  • Fortunately mapped to same clock
• Brittle & non-portable solution
No logical time support end-to-end
• No common point of reference for time frames across subsystems
[Figure: Application, QuO, RT-ARM, Dispatcher, and Upcall Adapter with frame Start, End, and Id; harmonic frame rates with ids 0 at 1 Hz, 0 to 1 at 2 Hz, and 0 to 3 at 4 Hz]

Solution: Distinct Manager for Physical & Logical Time Frames
WSOA: Singleton Frame Manager
• Consistent low-level service
  • Frame start, end, id for each rate
• Register rates with Frame Manager
• Timer-based call to update frames
• Dispatcher gets deadline for period
  • i.e., for laxity, deadline queues
• Upcall adapter gets deadline for cancellation, metrics decisions
• Higher-level resource managers can also use Frame Manager for adaptation decisions
  • e.g., RT-ARM, QuO
• IDL interface allows remote calls
[Figure: FRAME MANAGER on an EMBEDDED BOARD with DISPATCHER, ADAPTER, RTARM, and QuO; table of RATE, ID, START, END for 20 Hz, 10 Hz, and 5 Hz frames]
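The frame bookkeeping described on this slide (start, end, and id per registered rate) can be sketched in a few lines of Python. This is an illustrative model rather than the WSOA Frame Manager: the class and method names are hypothetical, times are assumed to be integer milliseconds from a common origin, and frame ids are assumed to count up from that origin without wrapping.

```python
class FrameManager:
    """Gives every client the same frame (id, start, end) for a rate."""

    def __init__(self):
        self._period_ms = {}

    def register_rate(self, rate_hz):
        # Assumption: only rates with a whole-millisecond period.
        if rate_hz <= 0 or 1000 % rate_hz != 0:
            raise ValueError(f"unsupported rate: {rate_hz} Hz")
        self._period_ms[rate_hz] = 1000 // rate_hz

    def frame_at(self, rate_hz, now_ms):
        """Return (frame id, frame start, frame end) containing now_ms."""
        period = self._period_ms[rate_hz]
        frame_id = now_ms // period
        start = frame_id * period
        return frame_id, start, start + period


fm = FrameManager()
for rate in (20, 10, 5):
    fm.register_rate(rate)

# At t = 420 ms, each rate is in a different frame, but all three
# frames are derived from the same clock reading.
frame_20hz = fm.frame_at(20, 420)
frame_10hz = fm.frame_at(10, 420)
frame_5hz = fm.frame_at(5, 420)
```

A dispatcher could use the returned frame end as the deadline for laxity or deadline queues, and an upcall adapter could use the same value for cancellation decisions, which is the shared point of reference the slide asks for.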

Problem: Limitations with Existing Adaptive Feedback Collection Approaches
Solutions originally ad hoc
• Tedious, error prone, brittle
Lack of support for embedded, RT constraints
• Stringent memory limitations
• Narrow windows of harvest time
• Shared memory (e.g., VME) is attractive
• However, C++ semantics can cause problems
  • Virtual functions
  • Native pointers
  • Heap allocation
  • Address space mapping
[Figure: Process A and Process B, each with a METRICS CACHE and SHARED MEMORY]

Solution: Metrics Collection Within Embedded & Real-Time Constraints
WSOA: Shared Memory Capability
• Offset-based smart pointers
• Strategized allocators
• Supports run-time cache sizing and use from any address space
• System instrumented with C++ inline probe macros: write to a singleton metrics cache
Cache is Usable from Process Owned Memory
• Upcall Adapter
• Metrics Monitor, Logger → DOVE
[Figure: EMBEDDED BOARDS with DISPATCHER, OPERATION, UPCALL ADAPTER, METRICS CACHE, SHARED MEMORY, and METRICS MONITOR; probes record BEGIN, SUSPEND, RESUME, and END events for 20 Hz enqueue, 20 Hz dequeue, and process tile entries]
