Responsive Interactive Applications by Dynamic Mapping of Activation Trees February 20, 1998 Peter A. Dinda http://www.cs.cmu.edu/~pdinda School of Computer Science Carnegie Mellon University
Outline • Responsive interactive applications • Best effort real-time service • Dynamic mapping problem • History-based prediction approach • Statistical properties of host load • Algorithms and evaluation for simplified problem • Research plan • Application and network traces
Interactive Application Model [Figure labels: aperiodic user action, message, message handler mouse_click(), activation tree, feedback]
Acoustic Room Modeling [Figure labels: room model, modify model, physical simulation of wave equation, impulse responses, speakers, frequency response plots]
Other Applications • Image editing • The Adobe Photoshop universe • Computer aided design • Quake design optimization (Malcevic-97) • Computational steering • CUMULUS (Geist-96), CAVE (Disz-95), … • Games • DIS • Collaboration? • Collaborative Planning (Zinky-DUTC-95)
Responsiveness • Timely feedback to individual user actions • Bound: response time ≤ tmax • Jitter bound and resource usage hint • Bound: response time ≥ tmin • Example: image editor drawing tool
A Best Effort Real-time Service MAP procedure() IN [tmin, tmax] • Execute the activation tree rooted at procedure() so that tmin ≤ texec ≤ tmax • No guarantees • Responsiveness spec: bounds [tmin,tmax] • Performance metric: fraction of trees that meet their bounds
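The performance metric on this slide can be illustrated with a minimal sketch. The function name `fraction_meeting_bounds` and the sample execution times are hypothetical, chosen only for illustration; they do not come from the talk.

```python
def fraction_meeting_bounds(exec_times, t_min, t_max):
    """Fraction of activation trees whose execution time fell in [t_min, t_max]."""
    if not exec_times:
        return 0.0
    met = sum(1 for t in exec_times if t_min <= t <= t_max)
    return met / len(exec_times)


# Hypothetical execution times (seconds) of five activation trees.
times = [0.08, 0.12, 0.30, 0.05, 0.10]
score = fraction_meeting_bounds(times, t_min=0.06, t_max=0.15)
```

A tree that finishes too early (0.05 s here) counts as a miss just like one that finishes too late, because the service specifies a lower bound as well as an upper bound.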
Machine Model • Hosts on a LAN • No centralized or coordinated scheduling, or reservations • Other unrelated traffic exists • We are only a user • Remote execution facility • Can execute any procedure on any host • RPC, DSM, DCE, CORBA, DCOM, ... • Measurable - at least a good real-time clock exists (<1ms)
Execution Model [tmin,tmax] • Dynamically map nodes of the unfolding activation tree to the hosts • At each procedure call, choose which host is best suited to execute the call in order to meet the bounds on the tree
Dynamic Mapping Problem How do we map the nodes of the trees to the hosts so that the fraction of trees that satisfy their bounds is maximized?
Aspects of My Approach • History-based prediction • Decomposition of bounds • Adaptation of mapping algorithms during tree traversal Methods: Trace-driven simulation, Iterative refinement
History-based Prediction • For each host, H0 predicts whether it can meet the bounds, based on past local history, and then chooses one where it is possible • Execution times include both communication for the remote call and the actual computation [Figure: foo() is executing on H0 with bounds [tmin,tmax] and calls bar() with bounds [t’min,t’max], which can be mapped to H0, H1, or H2. H0 has a local history of execution times of bar() on each of the other hosts. History entries are labeled: time, duration, bounds.]
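One way to picture history-based prediction is the sketch below. It assumes the caller keeps a short window of observed execution times per (procedure, host) pair and prefers the host whose recent times most often fell within the requested bounds. The class name `HistoryPredictor`, the window size, and the "fraction within bounds" predictor are all illustrative assumptions, not the predictor used in the talk.

```python
from collections import defaultdict, deque


class HistoryPredictor:
    """Local history of execution times, keyed by (procedure, host)."""

    def __init__(self, window=16):
        # ASSUMPTION: a fixed-size sliding window; the talk does not specify one.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, procedure, host, exec_time):
        # exec_time includes communication for a remote call plus computation.
        self.history[(procedure, host)].append(exec_time)

    def hit_rate(self, procedure, host, t_min, t_max):
        samples = self.history.get((procedure, host))
        if not samples:
            return 0.0  # no history: treated as least promising in this sketch
        hits = sum(1 for t in samples if t_min <= t <= t_max)
        return hits / len(samples)

    def choose_host(self, procedure, hosts, t_min, t_max):
        # Ties are broken in favor of the earliest host in the list.
        return max(hosts, key=lambda h: self.hit_rate(procedure, h, t_min, t_max))


predictor = HistoryPredictor()
for host, observed in {"H0": [0.20, 0.25],
                       "H1": [0.09, 0.11, 0.10],
                       "H2": [0.10, 0.40]}.items():
    for t in observed:
        predictor.record("bar", host, t)

chosen = predictor.choose_host("bar", ["H0", "H1", "H2"], t_min=0.05, t_max=0.15)
```

With these made-up histories, H1 is chosen because all of its recorded times for bar() fall inside the bounds.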
Decomposition of Bounds • Choice of [t’min,t’max] for bar() depends on unvisited portion of the tree • Collect history of what fraction of time spent in foo() subtree was spent in bar() subtree • Choose fraction of bounds to give to bar() based on that history and current time [Figure: foo() with bounds [tmin,tmax] is partially executed and known; bar(), whose bounds [t’min,t’max] are to be chosen, is unexecuted but known; the remaining nodes are unexecuted and unknown.]
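A minimal sketch of bound decomposition follows. It assumes the child receives the historical mean fraction of the parent's bounds, with the upper bound capped by the time the parent has left. The function name `decompose_bounds` and this particular rule are assumptions made for illustration; the talk only states that the choice depends on the fraction history and the current time.

```python
def decompose_bounds(t_min, t_max, elapsed, fractions):
    """Derive child bounds [t'min, t'max] from parent bounds and history.

    fractions: past observations of (time in child subtree) / (time in parent subtree).
    elapsed:   time already spent in the parent subtree when the child is called.
    """
    if not fractions:
        raise ValueError("need at least one historical fraction")
    mean_fraction = sum(fractions) / len(fractions)
    remaining = max(t_max - elapsed, 0.0)
    # ASSUMPTION: scale both bounds by the mean fraction, cap by remaining time.
    child_max = min(mean_fraction * t_max, remaining)
    child_min = min(mean_fraction * t_min, child_max)
    return child_min, child_max


child_min, child_max = decompose_bounds(
    t_min=0.1, t_max=0.2, elapsed=0.05, fractions=[0.25, 0.5, 0.75]
)
```

Here bar() historically accounts for half of the time spent in foo(), so it is handed roughly half of each bound.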
Adaptation of Mapping Algorithms During Tree Traversal • Tune strategy to how deep we are in the tree and how far along in the traversal • Explore more aggressively early in the traversal, when the effect of a bad decision is easiest to overcome • Find interesting new hosts • Spend less time making mapping decision deep in the tree • More likely to remain on single host
Statistical Properties of Load • Load traces • First order properties of the traces • Self-Similarity • Epochal Behavior Goal: modeling of load
Load Traces • Digital Unix one minute load average • 1 Hz sample rate, one week traces • Wide variety of hosts (38 machines) • Production Cluster • 13 machines in the PSC’s Alpha Supercluster • Research Cluster • 8 machines in CMCL cluster • Compute Servers • 2 shared machines in CMCL lab • Desktops • 15 desktop workstations owned by CMCL members
Self-Similarity • Intuition: “looks the same” on all time scales • Autocorrelations similar across time scales • Power spectrum envelope decays differently from 1/f • Characterization: Hurst parameter • H Ranges from 0 to 1 • H<0.5 : negative near neighbor correlation - “choppy” • H=0.5 : no correlations • H>0.5 : positive near neighbor correlation - “smooth” • Methodology: Estimate Hurst Parameter • Four different, validated, estimators
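As an illustration of estimating the Hurst parameter, the sketch below uses the aggregated variance method: the variance of block means of a self-similar series scales as m^(2H-2) with block size m, so H can be read off the slope of a log-log fit. The talk does not say which four estimators it used; this method, the function name, and the block sizes are assumptions chosen for illustration.

```python
import math
import random
import statistics


def hurst_aggregated_variance(series, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate H from the slope of log(variance of block means) vs. log(block size)."""
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        var = statistics.pvariance(means)
        if var > 0:
            xs.append(math.log(m))
            ys.append(math.log(var))
    if len(xs) < 2:
        raise ValueError("series too short for the chosen block sizes")
    # Ordinary least-squares slope of ys against xs.
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1.0 + slope / 2.0


# Uncorrelated noise should give an estimate near H = 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_noise = hurst_aggregated_variance(noise)
```

Applied to a load trace sampled at 1 Hz, an estimate clearly above 0.5 would indicate the positive near-neighbor correlation described on the slide.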
What does Self-Similarity Tell Us? • Load is complex, but not random • History matters • Smoothing may be misguided • Load should frighten modelers • Conventional stochastic process models are wrong • Long memory stochastic process models are desirable but may be impractical
Epochal Behavior • Local frequency content of load signal is stable for long periods of time with abrupt transitions • “Spectrogram shows wide vertical bands” • Not the same as seasonality
What Does Epochal Behavior Tell Us? • Suggests decomposition of load trace modeling problem • Segmentation problem to find epochs • Modeling problem within each epoch
Algorithms and Evaluation for Simplified Problem • Map only leaf nodes • Ignore communication • for I=1 to N do MAP leaf_procedure() IN [tmin,tmax] end
RangeCounter(W): A Near Optimal Algorithm • Each host has a quality level Q and a window of the last W execution times (W is small) • Choose host with highest quality level, and age quality levels of all hosts: Q=Q-1 • If bounds are met, increase host’s quality level by the inverse of our confidence in it: • If bounds are not met, reduce host’s quality level by half: Q=Q/2
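The update rules above can be sketched as follows. The slide's formula for the increase on success is missing from this transcript, so the sketch takes it as a caller-supplied `reward` function. The default used here (the inverse of the fraction of the last W times that met the bounds) is a hypothetical placeholder, not the formula from the talk.

```python
from collections import deque


def default_reward(window_times, t_min, t_max):
    # ASSUMPTION: confidence = fraction of windowed times within bounds.
    # This is a placeholder for the formula elided in the source.
    hits = sum(1 for t in window_times if t_min <= t <= t_max)
    return len(window_times) / hits


class RangeCounter:
    def __init__(self, hosts, window=4, reward=default_reward):
        self.quality = {h: 0.0 for h in hosts}
        self.windows = {h: deque(maxlen=window) for h in hosts}
        self.reward = reward

    def choose(self):
        # Pick the host with the highest quality, then age every host: Q = Q - 1.
        host = max(self.quality, key=self.quality.get)
        for h in self.quality:
            self.quality[h] -= 1.0
        return host

    def update(self, host, exec_time, t_min, t_max):
        self.windows[host].append(exec_time)
        if t_min <= exec_time <= t_max:
            self.quality[host] += self.reward(self.windows[host], t_min, t_max)
        else:
            self.quality[host] /= 2.0  # Q = Q/2, as stated on the slide


rc = RangeCounter(["A", "B"])
first = rc.choose()
rc.update(first, 0.10, 0.05, 0.15)   # bounds met
second = rc.choose()
rc.update(second, 0.50, 0.05, 0.15)  # bounds missed
```

Aging every host on each call lets a host that has not been tried recently eventually lose its lead, so the algorithm keeps re-examining alternatives.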
Load Trace-based Simulation • Exec time computed from load trace using a simple, validated model • Mapping algorithms are given bounds, select a host, then are told exec time • Simulator computes performance of • Algorithm under test • Optimal (precognizant) algorithm • Random mapping • Individual host mappings
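The simulation loop can be sketched as below. The talk's execution time model is not given in this transcript, so `exec_time` uses a placeholder (nominal time scaled by one plus the load), which is an assumption. The trace values and function names are likewise hypothetical.

```python
import random


def exec_time(nominal, load):
    # ASSUMPTION: placeholder model, not the validated model from the talk.
    return nominal * (1.0 + load)


def simulate(traces, nominal, t_min, t_max, choose, observe=None):
    """Fraction of calls meeting bounds for a mapping policy choose(i) -> host."""
    n_calls = min(len(t) for t in traces.values())
    met = 0
    for i in range(n_calls):
        host = choose(i)                      # algorithm selects a host first
        t = exec_time(nominal, traces[host][i])
        if observe is not None:
            observe(host, t)                  # ...and is then told the exec time
        met += t_min <= t <= t_max
    return met / n_calls


def optimal_fraction(traces, nominal, t_min, t_max):
    """Precognizant baseline: a call succeeds if any host would meet the bounds."""
    n_calls = min(len(t) for t in traces.values())
    met = sum(
        any(t_min <= exec_time(nominal, trace[i]) <= t_max for trace in traces.values())
        for i in range(n_calls)
    )
    return met / n_calls


traces = {"H0": [0.0, 2.0, 0.1, 3.0], "H1": [2.0, 0.2, 2.5, 3.0]}
hosts = list(traces)
rng = random.Random(0)

always_h0 = simulate(traces, 0.1, 0.05, 0.15, choose=lambda i: "H0")
random_map = simulate(traces, 0.1, 0.05, 0.15, choose=lambda i: rng.choice(hosts))
optimal = optimal_fraction(traces, 0.1, 0.05, 0.15)
```

The individual host mapping and the random mapping give lower reference points, while the precognizant result is an upper bound that no online algorithm can exceed on the same traces.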
Scope of Evaluation • 9 mapping algorithms • 6 different groups of hosts • Chosen from 38 hosts • 1 week, 1 Hz load trace from each host • 648 different cases • Combinations of nominal time and bounds • 100,000 calls for each case
Research Plan Extend current results to the full dynamic mapping problem • Extend simulation environment to include communication and activation trees • Trace collection (Activation trees, network) • A trace for everything • Trace characterization (models, benchmarks) • Simulator extension • Develop algorithm • Evaluate with benchmarks • Incorporate into real system
Activation Tree Traces • Representation of real program runs • Each node annotated with compute time, and what data it references • Delay requirements
Network Traces • Realistic communication times • Packet traces on Ethernet with tcpdump • Simple broadcast networks seem too limiting • Remos • Existing trace databases