Exploring Adaptive Applications and Virtualized Audio in Distributed Computing Systems

An Introduction to the Prescience Lab Peter A. Dinda Prescience Lab Department of Computer Science Northwestern University http://plab.cs.northwestern.edu

Outline • Motivations • Questions • Projects • Conclusions

How do we deliver arbitrary amounts of computational power to ordinary people? Assumptions: Shared computing environments, Limited utility of reservations

How do we deliver arbitrary amounts of computational power to ordinary people? Distributed and Parallel Computing Interactive Applications

How do we build adaptive distributed interactive applications effectively? How does the demand for resources in these applications vary over time? How does the supply of resources vary over time? How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?

How do we build adaptive distributed interactive applications effectively? • Applications • Virtualized Audio • Immersive audio • Interactive visualization of massive datasets • Frameworks • Virtuoso • Grid computing using virtual machines • Dv

Virtualized Audio (with Dong Lu, Curtis Barrett) Distributed Computational Resources Other Users or Audio Sources Microphones, Headphones GPS, head-tracking Wireless connectivity Limited local computation

Virtualized Audio: Interactive Auralization Listener Performer Room Virtual Listening Room Virtual Performer Sound Field 2 Auralization HRTF Listener at Virtual Location Headphones • Auralization injects performer into listener’s space • Auralization adapts as listener moves or room changes • Recomputes impulse responses

Architecture of Interactive Auralization • User-driven Immersive Audio Client • Sends current spatial model and source/sink positions • Receives binaural audio output (left and right channels) • Streaming Audio Service • Sources 1 through n • Scalable Audio Filtering Service • Master filtering server (filter configuration) • One filtering server per source • Mixing servers feeding the left and right channels • Scalable Real-time Simulation Server • Filter generation via parallel FD simulations • Impulse response filters characterize user’s space

Adaptation in Virtualized Audio • Numerous mechanisms • Sampling rate, impulse response length, algorithm for computing impulse response, filter approximations, server selection, … • Can vary computational load over many orders of magnitude • Compute/communicate ratio is huge • How do we use these mechanisms to achieve consistent real-time response?
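A minimal sketch of how two of the mechanisms listed above (sampling rate and impulse response length) change the filtering load. It assumes plain direct-form convolution as a stand-in for the filtering service; the function names and the example rates and lengths are hypothetical, not taken from the system.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR filtering: y[n] = sum over k of h[k] * x[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def truncate_response(impulse_response, max_taps):
    """Filter approximation (assumed): keep only the first max_taps taps."""
    return impulse_response[:max_taps]


def multiply_adds_per_second(sample_rate_hz, taps):
    """Cost of direct-form filtering for one source and one output channel."""
    return sample_rate_hz * taps


# Illustrative settings only: a short filter at a low rate versus a
# one-second impulse response at a high rate.
low = multiply_adds_per_second(8_000, 64)
high = multiply_adds_per_second(44_100, 44_100)
```

With these illustrative settings the two configurations differ by more than three orders of magnitude in multiply-adds per second, before counting the cost of recomputing the impulse response itself.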

Virtuoso (with Renato Figueiredo, Jose Fortes, Ananth Sundararaj, Ashish Gupta) • Make Grids like PCs • User gets raw machine(s) • Machine appears to be on his network • User can install what he needs as owner • Lower level of abstraction • Classic virtual machine monitors • Virtual networking • Middleware support • Instantiation, migration of machines • Connectivity to remote files, machines • Resource control

Classic Virtual Machine: VMware

Why Virtual Networking? • A running machine is suddenly plugged into your network. What happens? • Does it get an IP address? • Is it a routable address? • Does the firewall let its traffic through? • To any port? Virtual machine hostile environment

A Simple Layer 2 Virtual Network Client Server VM monitor SSH Remote VM Virtual NIC Physical NIC Physical NIC Friendly Local Network Hostile Remote Network

A Simple Layer 2 Virtual Network Client Server Bridge Bridge VM monitor SSH Tunnel Remote VM Virtual NIC Physical NIC Physical NIC Friendly Local Network Hostile Remote Network

Bootstrapping the Virtual Network • Star topology always possible • TCP session from client must have been possible • Better topology may be possible • Depends on security at each site • Topology may change • Virtual machines can migrate • Bootstrap to higher layers • Virtual filesystems

How does the demand for resources vary over time? How does the supply of resources vary over time? • Resource demand in interactive applications • Instrumented games, preceding applications, … • Not much is known here • Resource supply in distributed environments • URGIS • Grid Information based on the relational data model • GridG • Clairvoyance • Online resource prediction for hosts and networks • Tsunami • Wavelet-based approaches to information dissemination • Diffusion • Zero-cost information dissemination

URGIS (with Beth Plale, Dong Lu) • Unified Relational Grid Information Services • GIS based on the relational data model • Leverage results from database community • Northwestern work: MySQL, Oracle RDBMSes • Compositional queries • Application-specific information aggregation • Like decision support queries (TPC-H) • Support for information of varying dynamicity • Varying update rates and freshness requirements • Seamless inclusion of streaming data • A common data model and query language • Powerful, high level, declarative, easy-to-optimize

Compositional Queries • “Find four different hosts with a total memory between 512 MB and 1 GB” • “Find all available sensors and predictors that provide information about the network path between a and b” • “Tell me when the load on any of these four hosts diverges from the average by more than 50%”

Example
select host1.name, host2.name, host3.name, host4.name,
       hd1.mem + hd2.mem + hd3.mem + hd4.mem as TotalMem
from hosts as host1, hostdata as hd1,
     hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3,
     hosts as host4, hostdata as hd4
where host1.ip = hd1.ip and host2.ip = hd2.ip
  and host3.ip = hd3.ip and host4.ip = hd4.ip
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem >= 512
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem <= 1024
  and host1.ip != host2.ip and host1.ip != host3.ip and host1.ip != host4.ip
  and host2.ip != host3.ip and host2.ip != host4.ip
  and host3.ip != host4.ip
order by TotalMem desc
limit 10

Time-bounded, non-deterministic queries
nondeterministically
select host1.name, host2.name, host3.name, host4.name,
       hd1.mem + hd2.mem + hd3.mem + hd4.mem as TotalMem
from hosts as host1, hostdata as hd1,
     hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3,
     hosts as host4, hostdata as hd4
where host1.ip = hd1.ip and host2.ip = hd2.ip
  and host3.ip = hd3.ip and host4.ip = hd4.ip
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem >= 512
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem <= 1024
  and host1.ip != host2.ip and host1.ip != host3.ip and host1.ip != host4.ip
  and host2.ip != host3.ip and host2.ip != host4.ip
  and host3.ip != host4.ip
order by TotalMem desc
limit 10
in less than 5 seconds
using heuristic prefer_depth_first

Implementation of Non-deterministic, Time-bounded Queries • Random number associated with each row in each table (on insert) • Query is rewritten to incorporate random ranges on the input tables • Range lengths chosen to meet deadline • This is not trivial and we don’t have this translation yet • Heuristics not yet incorporated • Hopefully RDBMS-independent
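The random-range rewrite can be sketched as follows, assuming SQLite as a stand-in RDBMS and a two-host version of the query to keep it short. The `rnd` column name, the table contents, and the fixed `fraction` are assumptions for illustration; choosing the range length from a deadline, the part described above as not yet solved, is not attempted here.

```python
import random
import sqlite3


def build_db(num_hosts, seed=0):
    """Create toy hosts/hostdata tables; each host row gets a random number."""
    rng = random.Random(seed)
    db = sqlite3.connect(":memory:")
    db.execute("create table hosts (ip integer primary key, name text, rnd real)")
    db.execute("create table hostdata (ip integer primary key, mem integer)")
    for ip in range(num_hosts):
        db.execute("insert into hosts values (?, ?, ?)",
                   (ip, f"host{ip}", rng.random()))
        db.execute("insert into hostdata values (?, ?)",
                   (ip, rng.choice([128, 256, 512])))
    return db


# The original join plus one random-range predicate per input table.
QUERY = """
select host1.name, host2.name, hd1.mem + hd2.mem as TotalMem
from hosts as host1, hostdata as hd1, hosts as host2, hostdata as hd2
where host1.ip = hd1.ip and host2.ip = hd2.ip
  and host1.ip != host2.ip
  and hd1.mem + hd2.mem >= 512 and hd1.mem + hd2.mem <= 1024
  and host1.rnd between ? and ?
  and host2.rnd between ? and ?
order by TotalMem desc limit 10
"""


def sampled_query(db, fraction, rng):
    """Run the query over a random slice of each hosts table.

    fraction is the range length (0 < fraction <= 1); a smaller value
    means fewer candidate rows and therefore less join work.
    """
    params = []
    for _ in range(2):
        lo = rng.random() * (1.0 - fraction)
        params += [lo, lo + fraction]
    return db.execute(QUERY, params).fetchall()


db = build_db(200)
rows = sampled_query(db, fraction=0.2, rng=random.Random(1))
```

Each call draws fresh ranges, so repeated calls can return different valid answers, which is the non-deterministic behavior the query syntax asks for.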

RGIS1 Non-deterministic Query Performance 100,000 hosts Find n hosts with a total memory of 1 GB

RGIS1 Non-deterministic Query Performance 100,000 hosts Find 2 hosts with a total memory of 1 GB

Clairvoyance (with Jason Skicewicz, Yi Qiao) • Measure, Characterize, Predict, and Disseminate information about dynamic resource supply • Resource signals • Discrete-time signals strongly correlated with resource supply • Currently, univariate, working on multivariate • Currently • Host load • Windows performance counters (using WatchTower) • Network flow bandwidth and latency (using Remos) • Any text-based source • Online predictive modeling • Simple models (MEAN, BESTMEAN, BESTMEDIAN, LAST…) • Box/Jenkins Models (AR, MA, ARMA, ARIMA,…) • Fractional ARIMAs • Nonlinear modeling (TARs, Wavelet-decompositions)
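A rough sketch of one-step-ahead prediction on a resource signal, in the spirit of the simple and Box/Jenkins models listed above. These are generic textbook versions (last value, windowed mean, and an AR(1) fit from the lag-1 autocorrelation), not the lab's implementations; all function names are hypothetical.

```python
def predict_last(history):
    """LAST-style predictor: the next value equals the most recent one."""
    return history[-1]


def predict_mean(history, window):
    """Windowed-mean predictor: average of the most recent `window` samples."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def fit_ar1(history):
    """Estimate the mean and lag-1 coefficient of an AR(1) model."""
    mu = sum(history) / len(history)
    centered = [x - mu for x in history]
    var = sum(c * c for c in centered)
    if var == 0:
        return mu, 0.0
    cov = sum(a * b for a, b in zip(centered, centered[1:]))
    return mu, cov / var


def predict_ar1(history):
    """AR(1) predictor: pull the last value toward the mean by phi."""
    mu, phi = fit_ar1(history)
    return mu + phi * (history[-1] - mu)


def mean_squared_error(predictor, signal, warmup):
    """Replay the signal, predicting each sample from the samples before it."""
    errors = [(predictor(signal[:t]) - signal[t]) ** 2
              for t in range(warmup, len(signal))]
    return sum(errors) / len(errors)
```

Comparing `mean_squared_error` across predictors on a recorded host load trace is one simple way to choose among them for a given signal.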

RPS Toolkit • Extensible toolkit for implementing resource signal prediction systems [CMU-CS-99-138] • Growing: RTA, RTSA, Wavelets, GUI, etc • Easy “buy-in” for users • C++ and sockets (no threads) • Prebuilt prediction components • Libraries (sensors, time series, communication)

Measurement and Prediction

Multiscale Network Prediction • Large, recent study of predictability • Hundreds of NLANR and other traces • Mostly WANs • Different resolutions • Binning and low-pass via wavelets • Sweet Spot • Predictability often maximized at particular resolution

Multiresolution Prediction Example

Tsunami (with Jason Skicewicz) • Efficient dissemination of resource signals • Wavelet-based methods for characterization, modeling, and prediction • Tsunami toolkit will ship with the next RPS release

The Tension Video App Sensor Fine-grain measurement … Resource-appropriate measurement Grid App Resource Signal (periodic sampling) Example: host load Coarse-grain measurement

Proposed System Application Sensor Network Stream Interval Level 0 Level 0 Wavelet Transform Inverse Wavelet Transform Level L Level M-1 Level M Application receives levels based on its needs

Delay • Transforms introduce sample delay • Depends on number of levels and type of filter used • Exponential in the number of levels • Affects both streaming and block transforms • Seemingly inherent for wavelets • Exploit prediction • Limited success • Exploit “wavelet-like” decompositions • Trade-off between reconstruction accuracy and delay • Existing theory. Our evaluation not done yet.
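To make the delay argument concrete, here is a small sketch using an unnormalized Haar transform as an assumed example filter (the slides do not say which wavelet is used). One coefficient at level L summarizes 2^L input samples, so a streaming sensor cannot emit it until that many samples have arrived. Function names are hypothetical, and the input length is assumed to be a multiple of 2^levels.

```python
def haar_step(samples):
    """One Haar level: pairwise averages (approximation) and half-differences (detail)."""
    pairs = list(zip(samples[0::2], samples[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_decompose(samples, levels):
    """Return the coarsest approximation and the detail signals, finest first."""
    approx = list(samples)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, applying detail levels coarsest first."""
    for detail in reversed(details):
        approx = [v for a, d in zip(approx, detail) for v in (a + d, a - d)]
    return approx


def samples_per_coefficient(levels):
    """Input samples needed before one coefficient at this level can be emitted."""
    return 2 ** levels


signal = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
approx, details = haar_decompose(signal, levels=3)
```

An application that only needs a coarse view can reconstruct with the finest detail signals replaced by zeros, which corresponds to receiving only some of the levels.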

Wavelets and Prediction • Predict each level of transformed signal separately • “Detail signals” • Surprisingly ineffective in practice • Whitens the signal • “Approximation signals” • Smoothing, used in network prediction work discussed earlier • Reasonably effective, worth pursuing

Diffusion (with Brian Cornell, Jack Lange) • Efficient dissemination of resource signals • Piggyback additional information on existing packet transfers • No additional packets • Packet size unchanged • Evaluations with traces, Minet • Implementation as Linux kernel module • >=86 bits per packet possible • 17 bits per packet verified Zero Cost Information Dissemination

Diffusion Implementation Sensor App App Consumer Transport Transport Network Network Header Editing Data Extraction Data Link Data Link Physical Physical Sensor data piggybacked on application packets

SpyTalk

How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply? • Application-level performance predictions • Running Time Advisor • Confidence interval for running time of a task on a particular host • Message Time Advisor • Confidence interval for transfer time of a message • Adaptation advisors • Real-time Scheduling Advisor • Choose the host, from a set, on which a task is most likely to meet its deadline • Real-time responsiveness requirement • Service for interactive applications
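One way the two advisors above could fit together, sketched under a strong assumption: each host's predicted running time is summarized by a mean and a standard deviation and treated as normally distributed. The function names, the normality assumption, and the example numbers are all hypothetical and are not claimed to match the actual advisors.

```python
import math


def confidence_interval(mean, std, z=1.96):
    """Symmetric interval around the predicted running time (about 95% for z = 1.96)."""
    return (mean - z * std, mean + z * std)


def prob_meets_deadline(mean, std, deadline):
    """P(running time <= deadline) under the assumed normal model."""
    if std == 0:
        return 1.0 if mean <= deadline else 0.0
    return 0.5 * (1.0 + math.erf((deadline - mean) / (std * math.sqrt(2.0))))


def choose_host(predictions, deadline):
    """Pick the host whose task is most likely to finish by the deadline.

    predictions maps host name -> (mean, std) of predicted running time in seconds.
    """
    return max(predictions,
               key=lambda host: prob_meets_deadline(*predictions[host], deadline))


# Made-up predictions for three hosts, in seconds.
predictions = {"a": (10.0, 1.0), "b": (8.0, 5.0), "c": (12.0, 0.5)}
best = choose_host(predictions, deadline=11.0)
```

Note that host "b" has the lowest expected running time but is not chosen: its wide interval makes it less likely to meet the deadline than the slower but steadier host "a".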

Running Time Advisor

Real-time Scheduling Advisor

How do we build adaptive distributed interactive applications effectively? How does the demand for resources in these applications vary over time? How does the supply of resources vary over time? How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?

How do we deliver arbitrary amounts of computational power to ordinary people? Distributed and Parallel Computing Interactive Applications

Future Directions • Continue pushing on projects discussed • New directly related projects • Interactive hierarchical visualization of huge datasets • Resource demand characterization, modeling, and prediction • Other directions • Intrusion detection using signal processing

For More Information • Peter Dinda • http://www.cs.northwestern.edu/~pdinda • Prescience Lab • http://plab.cs.northwestern.edu
