An Introduction to the Prescience Lab. Peter A. Dinda Prescience Lab Department of Computer Science Northwestern University http://plab.cs.northwestern.edu. Outline. Motivations Questions Projects Conclusions.
An Introduction to the Prescience Lab Peter A. Dinda Prescience Lab Department of Computer Science Northwestern University http://plab.cs.northwestern.edu
Outline • Motivations • Questions • Projects • Conclusions
How do we deliver arbitrary amounts of computational power to ordinary people? Assumptions: Shared computing environments, Limited utility of reservations
How do we deliver arbitrary amounts of computational power to ordinary people? Distributed and Parallel Computing Interactive Applications
How do we build adaptive distributed interactive applications effectively? How does the demand for resources in these applications vary over time? How does the supply of resources vary over time? How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?
How do we build adaptive distributed interactive applications effectively? • Applications • Virtualized Audio • Immersive audio • Interactive visualization of massive datasets • Frameworks • Virtuoso • Grid computing using virtual machines • Dv
Virtualized Audio (with Dong Lu, Curtis Barrett) Distributed Computational Resources Other Users or Audio Sources Microphones, Headphones GPS, head-tracking Wireless connectivity Limited local computation
Virtualized Audio: Interactive Auralization Listener Performer Room Virtual Listening Room Virtual Performer Sound Field 2 Auralization HRTF Listener at Virtual Location Headphones • Auralization injects performer into listener’s space • Auralization adapts as listener moves or room changes • Recomputes impulse responses
Architecture of Interactive Auralization • User-driven Immersive Audio Client • Current spatial model and source/sink positions • Binaural audio output (left channel, right channel) • Scalable Audio Filtering Service • Master filtering server • Filter configuration • Filtering servers (Source 1 … Source n) • Mixing servers • Streaming Audio Service • Scalable Real-time Simulation Server • Filter generation • Parallel FD simulations • Impulse response filters characterize user’s space
Adaptation in Virtualized Audio • Numerous mechanisms • Sampling rate, impulse response length, algorithm for computing impulse response, filter approximations, server selection, … • Can vary computational load over many orders of magnitude • Compute/communicate ratio is huge • How do we use these mechanisms to achieve consistent real-time response?
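To make the "many orders of magnitude" point concrete, here is a rough back-of-the-envelope sketch. It assumes direct time-domain FIR filtering, where each output sample costs one multiply-add per impulse response tap; the function name and the two operating points are illustrative assumptions, not figures from the project.

```python
def direct_filter_cost(sample_rate_hz: float, ir_seconds: float, sources: int = 1) -> float:
    """Multiply-adds per second for direct (time-domain) FIR filtering.

    Assumption: one multiply-add per impulse response tap per output sample,
    and one independent filter per audio source.
    """
    taps = sample_rate_hz * ir_seconds  # impulse response length in samples
    return sample_rate_hz * taps * sources


# Two hypothetical operating points an adaptive system might switch between.
high = direct_filter_cost(44_100, 1.0, sources=4)  # CD-rate audio, 1 s response
low = direct_filter_cost(8_000, 0.1, sources=1)    # telephone-rate audio, 0.1 s response

print(f"high quality: {high:.2e} multiply-adds/s")
print(f"low quality:  {low:.2e} multiply-adds/s")
print(f"ratio:        {high / low:.0f}x")
```

Because the cost grows with the square of the sampling rate and linearly with response length and source count, modest changes to each knob multiply into a range of roughly three orders of magnitude in this example.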
Virtuoso (with Renato Figueiredo, Jose Fortes, Ananth Sundararaj, Ashish Gupta) • Make Grids like PCs • User gets raw machine(s) • Machine appears to be on his network • User can install what he needs as owner • Lower level of abstraction • Classic virtual machine monitors • Virtual networking • Middleware support • Instantiation, migration of machines • Connectivity to remote files, machines • Resource control
Why Virtual Networking? • A running machine is suddenly plugged into your network. What happens? • Does it get an IP address? • Is it a routable address? • Does the firewall let its traffic through? • To any port? Virtual machine hostile environment
A Simple Layer 2 Virtual Network Client Server VM monitor SSH Remote VM Virtual NIC Physical NIC Physical NIC Friendly Local Network Hostile Remote Network
A Simple Layer 2 Virtual Network Client Server Bridge Bridge VM monitor SSH Tunnel Remote VM Virtual NIC Physical NIC Physical NIC Friendly Local Network Hostile Remote Network
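One way a bridge could carry layer 2 frames through a tunnel is sketched below. A tunnel such as SSH delivers a byte stream with no frame boundaries, so each Ethernet frame needs delimiting. The 2-byte length prefix and the names `encapsulate` and `Decapsulator` are assumptions made for illustration; the slides do not describe the actual wire format.

```python
import struct


def encapsulate(frame: bytes) -> bytes:
    """Prefix a raw Ethernet frame with its length (16-bit, network byte order)."""
    if len(frame) > 0xFFFF:
        raise ValueError("frame too large for a 16-bit length prefix")
    return struct.pack("!H", len(frame)) + frame


class Decapsulator:
    """Reassembles whole frames from arbitrary chunks of a byte stream."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add received bytes; return every frame that is now complete."""
        self._buf.extend(chunk)
        frames = []
        while len(self._buf) >= 2:
            (length,) = struct.unpack_from("!H", self._buf, 0)
            if len(self._buf) < 2 + length:
                break  # the rest of this frame has not arrived yet
            frames.append(bytes(self._buf[2:2 + length]))
            del self._buf[:2 + length]
        return frames


# Usage: two frames sent back to back, received in awkwardly sized chunks.
frame_a = b"\xff" * 6 + b"\x02" * 6 + b"\x08\x06" + b"arp-payload"
frame_b = b"\x02" * 6 + b"\x04" * 6 + b"\x08\x00" + b"ip-payload"
stream = encapsulate(frame_a) + encapsulate(frame_b)

receiver = Decapsulator()
received = receiver.feed(stream[:5]) + receiver.feed(stream[5:20]) + receiver.feed(stream[20:])
```

The bridge on each side would read frames from its local interface, call `encapsulate`, and write the result to the tunnel; the far side feeds whatever bytes arrive into a `Decapsulator` and injects the completed frames.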
Bootstrapping the Virtual Network • Star topology always possible • TCP session from client must have been possible • Better topology may be possible • Depends on security at each site • Topology may change • Virtual machines can migrate • Bootstrap to higher layers • Virtual filesystems
How does the demand for resources vary over time? How does the supply of resources vary over time? • Resource demand in interactive applications • Instrumented games, preceding applications, … • Not much is known here • Resource supply in distributed environments • URGIS • Grid Information based on the relational data model • GridG • Clairvoyance • Online resource prediction for hosts and networks • Tsunami • Wavelet-based approaches to information dissemination • Diffusion • Zero-cost information dissemination
URGIS (with Beth Plale, Dong Lu) • Unified Relational Grid Information Services • GIS based on the relational data model • Leverage results from database community • Northwestern work: MySQL, Oracle RDBMSes • Compositional queries • Application-specific information aggregation • Like decision support queries (TPC-H) • Support for information of varying dynamicity • Varying update rates and freshness requirements • Seamless inclusion of streaming data • A common data model and query language • Powerful, high level, declarative, easy-to-optimize
Compositional Queries • “Find four different hosts with a total memory between 512 MB and 1 GB” • “Find all available sensors and predictors that provide information about the network path between a and b” • “Tell me when the load on any of these four hosts diverges from the average by more than 50%”
Example
select host1.name, host2.name, host3.name, host4.name,
       hd1.mem + hd2.mem + hd3.mem + hd4.mem as TotalMem
from hosts as host1, hostdata as hd1,
     hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3,
     hosts as host4, hostdata as hd4
where host1.ip = hd1.ip and host2.ip = hd2.ip
  and host3.ip = hd3.ip and host4.ip = hd4.ip
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem >= 512
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem <= 1024
  and host1.ip != host2.ip and host1.ip != host3.ip and host1.ip != host4.ip
  and host2.ip != host3.ip and host2.ip != host4.ip
  and host3.ip != host4.ip
order by TotalMem desc
limit 10
Time-bounded, non-deterministic queries
nondeterministically
select host1.name, host2.name, host3.name, host4.name,
       hd1.mem + hd2.mem + hd3.mem + hd4.mem as TotalMem
from hosts as host1, hostdata as hd1,
     hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3,
     hosts as host4, hostdata as hd4
where host1.ip = hd1.ip and host2.ip = hd2.ip
  and host3.ip = hd3.ip and host4.ip = hd4.ip
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem >= 512
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem <= 1024
  and host1.ip != host2.ip and host1.ip != host3.ip and host1.ip != host4.ip
  and host2.ip != host3.ip and host2.ip != host4.ip
  and host3.ip != host4.ip
order by TotalMem desc
limit 10
in less than 5 seconds
using heuristic prefer_depth_first
Implementation of Non-deterministic, Time-bounded Queries • Random number associated with each row in each table (on insert) • Query is rewritten to incorporate random ranges on the input tables • Range lengths chosen to meet deadline • This is not trivial and we don’t have this translation yet • Heuristics not yet incorporated • Hopefully RDBMS-independent
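The random-range rewriting can be sketched with SQLite from Python. Everything specific here is an assumption for illustration: the `rnd` column name, the two-host version of the query, and the fixed `range_len`. In particular, this sketch does not attempt the hard part the slide mentions, which is choosing range lengths to meet a deadline.

```python
import random
import sqlite3


def build_db(n_hosts: int, seed: int = 0) -> sqlite3.Connection:
    """Create an in-memory hostdata table with a random number stored on each row."""
    rng = random.Random(seed)
    db = sqlite3.connect(":memory:")
    db.execute("create table hostdata (ip text primary key, mem integer, rnd real)")
    db.executemany(
        "insert into hostdata values (?, ?, ?)",
        [
            (f"10.0.{i // 256}.{i % 256}", rng.choice([128, 256, 512, 1024]), rng.random())
            for i in range(n_hosts)
        ],
    )
    db.execute("create index hostdata_rnd on hostdata (rnd)")
    return db


def find_pair(db: sqlite3.Connection, total_mem: int, range_len: float, rng: random.Random):
    """Look for two distinct hosts whose memory sums to total_mem.

    Each input table is restricted to a random window of width range_len over
    the rnd column, so the join only examines a random sample of the rows.
    """
    lo1 = rng.uniform(0.0, 1.0 - range_len)  # window for the first table instance
    lo2 = rng.uniform(0.0, 1.0 - range_len)  # independent window for the second
    return db.execute(
        """
        select hd1.ip, hd2.ip, hd1.mem + hd2.mem as TotalMem
        from hostdata as hd1, hostdata as hd2
        where hd1.rnd between ? and ?
          and hd2.rnd between ? and ?
          and hd1.ip != hd2.ip
          and hd1.mem + hd2.mem = ?
        limit 1
        """,
        (lo1, lo1 + range_len, lo2, lo2 + range_len, total_mem),
    ).fetchone()


db = build_db(2000)
row = find_pair(db, total_mem=1024, range_len=0.1, rng=random.Random(1))
print(row)
```

Repeating the call with a fresh random generator state samples different windows, which is what makes the answer non-deterministic. A smaller `range_len` examines fewer rows and so returns sooner, at the risk of finding nothing.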
RGIS1 Non-deterministic Query Performance • 100,000 hosts • Find n hosts with a total memory of 1 GB
RGIS1 Non-deterministic Query Performance • 100,000 hosts • Find 2 hosts with a total memory of 1 GB
Clairvoyance (with Jason Skicewicz, Yi Qiao) • Measure, Characterize, Predict, and Disseminate information about dynamic resource supply • Resource signals • Discrete-time signals strongly correlated with resource supply • Currently, univariate, working on multivariate • Currently • Host load • Windows performance counters (using WatchTower) • Network flow bandwidth and latency (using Remos) • Any text-based source • Online predictive modeling • Simple models (MEAN, BESTMEAN, BESTMEDIAN, LAST…) • Box/Jenkins Models (AR, MA, ARMA, ARIMA,…) • Fractional ARIMAs • Nonlinear modeling (TARs, Wavelet-decompositions)
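As a rough illustration of the simple end of the model list, the sketch below implements a LAST predictor, a windowed-mean predictor, and a window search that picks the mean with the lowest one-step-ahead error. Treating that search as what BESTMEAN does is an assumption on my part, and none of these function names come from the actual toolkit.

```python
from statistics import fmean


def predict_last(history):
    """LAST: the next value is predicted to equal the most recent one."""
    return history[-1]


def predict_mean(history, window):
    """Windowed mean: average of the most recent `window` samples."""
    return fmean(history[-window:])


def one_step_mse(signal, predictor, warmup):
    """Mean squared one-step-ahead error, predicting signal[t] from signal[:t]."""
    errors = [(predictor(signal[:t]) - signal[t]) ** 2 for t in range(warmup, len(signal))]
    return fmean(errors)


def best_mean_window(signal, max_window):
    """Return the window length in 1..max_window with the lowest one-step MSE."""
    return min(
        range(1, max_window + 1),
        key=lambda w: one_step_mse(signal, lambda h: predict_mean(h, w), warmup=max_window),
    )


# Usage on a toy "resource signal" that alternates between idle and busy.
load = [0.0, 1.0] * 20
print("LAST MSE:", one_step_mse(load, predict_last, warmup=3))
print("best mean window:", best_mean_window(load, max_window=3))
```

On this alternating signal LAST is always wrong by a full unit, while a two-sample mean halves the error, which is the kind of difference an online system can measure and act on when choosing a model.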
RPS Toolkit • Extensible toolkit for implementing resource signal prediction systems [CMU-CS-99-138] • Growing: RTA, RTSA, Wavelets, GUI, etc • Easy “buy-in” for users • C++ and sockets (no threads) • Prebuilt prediction components • Libraries (sensors, time series, communication)
Multiscale Network Prediction • Large, recent study of predictability • Hundreds of NLANR and other traces • Mostly WANs • Different resolutions • Binning and low-pass via wavelets • Sweet Spot • Predictability often maximized at particular resolution
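The idea of a resolution "sweet spot" can be sketched as follows. The signal is binned at several resolutions and a predictability measure is computed at each. The measure used here, one-step error of a LAST predictor divided by the signal variance, is an assumption chosen for simplicity and is not claimed to be the metric used in the study; the synthetic signal is likewise made up.

```python
import math
from statistics import fmean, pvariance


def bin_signal(samples, bin_size):
    """Average consecutive, non-overlapping groups of bin_size samples."""
    n_bins = len(samples) // bin_size
    return [fmean(samples[i * bin_size:(i + 1) * bin_size]) for i in range(n_bins)]


def predictability_ratio(signal):
    """MSE of a LAST predictor divided by signal variance (lower is more predictable)."""
    errors = [(signal[t - 1] - signal[t]) ** 2 for t in range(1, len(signal))]
    return fmean(errors) / pvariance(signal)


def sweep(samples, bin_sizes):
    """Predictability ratio at each resolution."""
    return {b: predictability_ratio(bin_signal(samples, b)) for b in bin_sizes}


# Synthetic trace: a slow trend plus fast alternating noise.
trace = [math.sin(2 * math.pi * t / 64) + (0.5 if t % 2 else -0.5) for t in range(512)]
ratios = sweep(trace, [1, 2, 4, 8, 16])
for bin_size, ratio in ratios.items():
    print(f"bin size {bin_size:2d}: ratio {ratio:.3f}")
```

In this toy trace the finest resolution is dominated by noise, moderate binning removes it, and very coarse binning starts to undersample the trend, so the ratio is lowest at an intermediate bin size.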
Tsunami (with Jason Skicewicz) • Efficient dissemination of resource signals • Wavelet-based methods for characterization, modeling, and prediction • Tsunami toolkit will ship with the next RPS release
The Tension Video App Sensor Fine-grain measurement … Resource-appropriate measurement Grid App Resource Signal (periodic sampling) Example: host load Coarse-grain measurement
Proposed System Application Sensor Network Stream Interval Level 0 Level 0 Wavelet Transform Inverse Wavelet Transform Level L Level M-1 Level M Application receives levels based on its needs
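A minimal sketch of the level-selection idea, using the Haar wavelet in its unnormalized averaging form. The choice of Haar, the block (rather than streaming) formulation, and the function names are assumptions for illustration; the slides do not say which filters the system uses.

```python
def haar_step(signal):
    """One level: pairwise averages (approximation) and half-differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def decompose(signal, levels):
    """Return (coarsest approximation, [finest detail, ..., coarsest detail])."""
    if len(signal) % (1 << levels):
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def reconstruct(approx, details, keep):
    """Invert the transform using only the `keep` coarsest detail levels.

    Detail levels the application did not receive are treated as zero.
    """
    out, levels = list(approx), len(details)
    for lvl in range(levels - 1, -1, -1):
        received = (levels - 1 - lvl) < keep
        detail = details[lvl] if received else [0.0] * len(details[lvl])
        nxt = []
        for a, d in zip(out, detail):
            nxt.extend((a + d, a - d))
        out = nxt
    return out


# Usage: a sensor publishes all levels; each application subscribes to what it needs.
load = [4, 6, 10, 12, 8, 8, 0, 2]
approx, details = decompose(load, levels=3)
coarse_view = reconstruct(approx, details, keep=0)  # coarse-grain consumer
medium_view = reconstruct(approx, details, keep=1)
full_view = reconstruct(approx, details, keep=3)    # fine-grain consumer
```

A consumer that takes only the approximation sees one value per 8 samples, so its bandwidth cost is an eighth of the full stream, while a consumer that takes every level recovers the original signal exactly.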
Delay • Transforms introduce sample delay • Depends on number of levels and type of filter used • Exponential in the number of levels • Affects both streaming and block transforms • Seemingly inherent for wavelets • Exploit prediction • Limited success • Exploit “wavelet-like” decompositions • Trade-off between reconstruction accuracy and delay • Existing theory. Our evaluation not done yet.
Wavelets and Prediction • Predict each level of transformed signal separately • “Detail signals” • Surprisingly ineffective in practice • Whitens the signal • “Approximation signals” • Smoothing, used in network prediction work discussed earlier • Reasonably effective, worth pursuing
Diffusion (with Brian Cornell, Jack Lange) • Efficient dissemination of resource signals • Piggyback additional information on existing packet transfers • No additional packets • Packet size unchanged • Evaluations with traces, Minet • Implementation as Linux kernel module • >=86 bits per packet possible • 17 bits per packet verified Zero Cost Information Dissemination
Diffusion Implementation Sensor App App Consumer Transport Transport Network Network Header Editing Data Extraction Data Link Data Link Physical Physical Sensor data piggybacked on application packets
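The header-editing step might look something like the sketch below, which hides 16 bits of sensor data in the IPv4 Identification field and refreshes the header checksum. The choice of that field is purely an assumption for illustration; the slides do not say which header fields Diffusion edits. Overwriting Identification is only plausible for packets that will not be fragmented.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """Standard IPv4 header checksum: one's-complement sum of 16-bit words."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def embed(header: bytes, data16: int) -> bytes:
    """Write data16 into the Identification field (bytes 4-5); packet size is unchanged."""
    if len(header) != 20:
        raise ValueError("this sketch handles only 20-byte headers without options")
    if not 0 <= data16 <= 0xFFFF:
        raise ValueError("data must fit in 16 bits")
    edited = bytearray(header)
    struct.pack_into("!H", edited, 4, data16)
    struct.pack_into("!H", edited, 10, 0)  # zero the checksum before recomputing
    struct.pack_into("!H", edited, 10, ipv4_checksum(bytes(edited)))
    return bytes(edited)


def extract(header: bytes) -> int:
    """Read the piggybacked 16 bits back out at the consumer."""
    return struct.unpack_from("!H", header, 4)[0]


# Usage: a hypothetical TCP packet header with the don't-fragment flag set.
original = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40,          # version/IHL, TOS, total length
    0, 0x4000,            # identification, flags (DF) and fragment offset
    64, 6, 0,             # TTL, protocol (TCP), checksum placeholder
    bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
)
edited = embed(original, 0xBEEF)
print(hex(extract(edited)))
```

The sender's application traffic is untouched apart from the edited field, so no extra packets are generated and packet sizes stay the same, which matches the two constraints listed on the Diffusion slide.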
How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply? • Application-level performance predictions • Running Time Advisor • Confidence interval for running time of a task on a particular host • Message Time Advisor • Confidence interval for transfer time of a message • Adaptation advisors • Real-time Scheduling Advisor • Choose the host, from a set of candidates, on which a task is most likely to meet its deadline • Real-time responsiveness requirement • Service for interactive applications
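The host-selection step can be sketched as below, assuming each host's predicted running time is summarized by a mean and a standard deviation and treated as normally distributed. That distributional assumption, the input format, and the function names are mine; the slides only say that the advisors produce confidence intervals.

```python
from statistics import NormalDist


def deadline_probability(mean_s: float, stdev_s: float, deadline_s: float) -> float:
    """P(running time <= deadline) under an assumed normal model (stdev must be > 0)."""
    return NormalDist(mean_s, stdev_s).cdf(deadline_s)


def choose_host(predictions: dict, deadline_s: float):
    """Pick the host with the highest chance of finishing before the deadline.

    predictions maps host name -> (predicted mean, predicted stdev), in seconds.
    Returns (host, probability).
    """
    best = max(predictions, key=lambda h: deadline_probability(*predictions[h], deadline_s))
    return best, deadline_probability(*predictions[best], deadline_s)


# Usage with made-up predictions for three hosts and an 11-second deadline.
predictions = {
    "host-a": (10.0, 1.0),   # slightly slow but steady
    "host-b": (8.0, 5.0),    # fastest on average, highly variable
    "host-c": (12.0, 0.5),   # steady but too slow
}
host, probability = choose_host(predictions, deadline_s=11.0)
print(host, round(probability, 3))
```

Note that the host with the lowest predicted mean is not chosen here: its high variance makes a deadline miss more likely, which is why the advisor needs a confidence interval and not just a point estimate.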
How do we build adaptive distributed interactive applications effectively? How does the demand for resources in these applications vary over time? How does the supply of resources vary over time? How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?
How do we deliver arbitrary amounts of computational power to ordinary people? Distributed and Parallel Computing Interactive Applications
Future Directions • Continue pushing on projects discussed • New directly related projects • Interactive hierarchical visualization of huge datasets • Resource demand characterization, modeling, and prediction • Other directions • Intrusion detection using signal processing
For More Information • Peter Dinda • http://www.cs.northwestern.edu/~pdinda • Prescience Lab • http://plab.cs.northwestern.edu