Research Proficiency Exam Survey of State-of-the-art in Inter-VM Communication Mechanisms Jian Wang
Talk Outline • Introduction • Shared memory research • Scheduler optimization research • Challenges and problems
Motivation: Why Inter-VM Communication? • Virtualization technology mainly focuses on building an isolation barrier between co-located VMs. • However, applications often wish to communicate across this isolation barrier. • E.g., high-performance grid applications, web services, virtual network appliances, transaction processing, graphics rendering. [Figure: Virtual Machine A and Virtual Machine B on one Physical Machine, above the Hypervisor (or Virtual Machine Monitor)]
Why not just use TCP or UDP? • Transparent to applications BUT • High communication overhead between co-located VMs
Xen and Xen Networking Subsystem [Figure: communication data path between co-located VMs; each packet (PKT) travels through Domain 0]
Packet routed [Figure: VM 1 puts the packet into a page and asks Xen to transmit it; Xen/Domain-0 swap or copy pages to deliver the packet to VM 2]
Basic Idea: Shared Memory Advantages of using Shared Memory: • No need for per-packet processing • Pages reused in circular buffer • Writes are visible immediately • Fewer hypercalls (only for signaling)
With Shared Memory [Figure: VM 1 allocates one pool of pages and asks Xen to share them with VM 2]
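In Xen terms, "ask Xen to share pages" maps onto the grant table (which lets a peer domain map specific frames) and the signaling maps onto an event channel. The following is a minimal sketch, assuming a Linux guest kernel module, of granting one page of the shared pool to a co-located peer and notifying it; the peer's domain ID, the out-of-band exchange of the grant reference (e.g., via XenStore), the event-channel binding that produces tx_irq, and all error handling are assumed or omitted, and the exact grant-table helpers vary across kernel versions.

/* Minimal sketch: grant one page of the shared pool to a co-located peer
 * and signal it over an event channel. tx_irq is assumed to have been set
 * up elsewhere (e.g., via bind_interdomain_evtchn_to_irqhandler()). */
#include <linux/errno.h>
#include <linux/gfp.h>
#include <xen/grant_table.h>
#include <xen/events.h>
#include <xen/page.h>

static void *ring_page;        /* one page of the shared circular buffer      */
static grant_ref_t ring_gref;  /* grant reference advertised to the peer      */
static int tx_irq;             /* IRQ bound to the inter-domain event channel */

static int share_ring_page(domid_t peer_domid)
{
    int gref;

    ring_page = (void *)__get_free_page(GFP_KERNEL);
    if (!ring_page)
        return -ENOMEM;

    /* Let the peer map this frame read/write; the returned grant reference
     * still has to reach the peer out of band (e.g., through XenStore). */
    gref = gnttab_grant_foreign_access(peer_domid, virt_to_gfn(ring_page), 0);
    if (gref < 0)
        return gref;
    ring_gref = gref;
    return 0;
}

static void signal_peer(void)
{
    /* Only this notification involves the hypervisor; the payload itself
     * is read and written directly in the shared pages. */
    notify_remote_via_irq(tx_irq);
}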
Shared Memory End Goals • 1. Performance: high throughput, low latency, and acceptable CPU consumption. • 2. Transparency: don't change the app; don't change the kernel. • 3. Dynamism: on-the-fly setup/teardown of channels, auto discovery, migration support.
Hypervisor Scheduler [Figure: scheduling timeline t1 to t4 in which Dom1, DomX, Dom2, DomX, Dom1, ... run in turn]
Scheduler induced delays [Figure: a JBoss tier sending query1/query2 to a DB tier and receiving reply1/reply2. Running on dedicated servers, the gap between query and reply is network latency; running on a consolidated server, scheduler-induced delays are added on top]
Hypervisor Scheduler • Lack of communication awareness in the VCPU scheduler • Lack of knowledge of the timing requirements of tasks/applications within each VM • Absence of support for real-time inter-VM interactions • Unpredictability of current VM scheduling mechanisms
End Goals for Scheduler Optimizations • Low latency • Independent of other domains’ workloads • Predictable
XenSocket Xiaolan Zhang, Suzanne McIntosh • Shared memory between two domains • One-way communication pipe • Below the socket layer • Bypasses the TCP/IP stack • No auto discovery, no migration support, no transparency
XenSocket setup [Figure: comparing standard socket calls with XenSocket's. Server: socket(); bind(sockaddr_inet with local port #); listen(); accept() vs. socket(); bind(sockaddr_xen with remote VM #), after which the system returns a grant # for the client. Client: socket(); connect(sockaddr_inet with remote address and remote port #) vs. socket(); connect(sockaddr_xen with remote VM # and remote grant #)]
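Read as code, the flow on this slide looks roughly like the sketch below. It is illustrative only: AF_XEN, struct sockaddr_xen, and its fields are hypothetical stand-ins for the real XenSocket definitions, and the out-of-band handoff of the grant # from server to client is left out.

/* Illustrative sketch of the XenSocket bind/connect flow; the address
 * family value, sockaddr layout, and field names are hypothetical. */
#include <sys/socket.h>
#include <stdint.h>

#define AF_XEN 42                      /* placeholder protocol family number */

struct sockaddr_xen {
    sa_family_t sxe_family;            /* AF_XEN */
    uint16_t    remote_domid;          /* peer VM's domain ID */
    uint32_t    remote_gref;           /* grant # (client side; the server
                                          obtains it from bind and hands it
                                          to the client out of band)        */
};

/* Server: bind to the peer domain; the kernel allocates the shared buffer
 * and produces a grant # for the client. */
static int xensock_bind(uint16_t peer_domid)
{
    struct sockaddr_xen a = { .sxe_family = AF_XEN, .remote_domid = peer_domid };
    int s = socket(AF_XEN, SOCK_STREAM, 0);

    if (s < 0 || bind(s, (struct sockaddr *)&a, sizeof(a)) < 0)
        return -1;
    return s;
}

/* Client: connect with (peer domid, grant #); subsequent write()s move data
 * through the one-way shared-memory pipe, bypassing the TCP/IP stack. */
static int xensock_connect(uint16_t peer_domid, uint32_t gref)
{
    struct sockaddr_xen a = { .sxe_family = AF_XEN,
                              .remote_domid = peer_domid,
                              .remote_gref  = gref };
    int s = socket(AF_XEN, SOCK_STREAM, 0);

    if (s < 0 || connect(s, (struct sockaddr *)&a, sizeof(a)) < 0)
        return -1;
    return s;
}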
XWAY Kangho Kim, Cheiyol Kim • Bi-directional communication • Transparent to applications • Below the socket layer • Significant kernel modifications, no migration support, TCP only
XWAY channel [Figure: Domain A and Domain B joined by an event channel and by shared send and receive queues (SQ and RQ), each ring tracked by head and tail pointers]
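Each direction of such a channel is essentially a single-producer, single-consumer ring indexed by the head and tail pointers shown in the figure. Below is a minimal sketch of one such queue, assuming the whole structure lives in a page mapped by both domains; the field names, sizes, and byte-stream payload are simplifications, not XWAY's actual layout.

/* One direction of the channel: a single-producer, single-consumer ring in
 * shared memory. The producer advances head, the consumer advances tail;
 * the peer is woken through the event channel after a successful put. */
#include <stdint.h>
#include <stdatomic.h>

#define RING_BYTES 4096

struct xring {                       /* lives in memory shared by both VMs */
    _Atomic uint32_t head;           /* producer writes, consumer reads    */
    _Atomic uint32_t tail;           /* consumer writes, producer reads    */
    uint8_t data[RING_BYTES];
};

/* Producer side (e.g., the sender's SQ): copy the payload in, then publish
 * the new head so the peer sees the bytes only after they are written. */
static int xring_put(struct xring *r, const void *buf, uint32_t len)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (RING_BYTES - (head - tail) < len)
        return -1;                                   /* not enough free space */
    for (uint32_t i = 0; i < len; i++)
        r->data[(head + i) % RING_BYTES] = ((const uint8_t *)buf)[i];
    atomic_store_explicit(&r->head, head + len, memory_order_release);
    return 0;                     /* caller then signals the peer via the event channel */
}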
Inter-VM Communication (IVC) Wei Huang, Matthew Koop • IVC library providing efficient intra-physical-node communication through shared memory • Provides auto discovery and migration support • Neither user transparency nor kernel transparency is fully supported; only the MPI protocol is supported
IVC • IVC consists of two parts: • A user space communication library • A kernel driver • Uses a general socket style interface.
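As a rough picture of what "a general socket style interface" means for a library like this, a caller might do something like the following; every ivc_* name and signature here is an illustrative guess, not the published IVC API.

/* Hypothetical usage of a socket-style intra-node communication library.
 * None of these identifiers are the real IVC API; they only illustrate the
 * shape of the interface (connect, read/write, close before migration). */
#include <stddef.h>
#include <sys/types.h>

typedef struct ivc_channel ivc_channel_t;            /* opaque handle (hypothetical) */

ivc_channel_t *ivc_connect(int peer_domid);           /* map shared buffers via kernel driver */
ssize_t        ivc_write(ivc_channel_t *c, const void *buf, size_t len);
ssize_t        ivc_read(ivc_channel_t *c, void *buf, size_t len);
void           ivc_close(ivc_channel_t *c);           /* torn down before migration */

static int send_hello(int peer_domid)
{
    ivc_channel_t *c = ivc_connect(peer_domid);
    if (!c)
        return -1;
    ssize_t n = ivc_write(c, "hello", 5);   /* goes over shared memory, not TCP */
    ivc_close(c);
    return n == 5 ? 0 : -1;
}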
MMNet Prashanth Radhakrishnan, Kiran Srinivasan • Maps in the entire physical memory of the peer VM • Zero copy between guest kernels • On-the-fly setup/teardown of channels not supported • VMs need to fully trust each other, which is not practical
XenLoop Overview Jian Wang, Kartik Gopalan • Enables direct traffic exchange between co-located VMs • Transparency for Applications and Libraries • Kernel Transparency • Automatic discovery of co-located VMs • On-the-fly setup/teardown of XenLoop channels • Migration transparency
XenLoop Architecture [Figure: in each guest (Virtual Machine A and Virtual Machine B), the XenLoop layer sits below the socket, transport, and network layers and above the netfront driver; a netfilter hook captures and examines outgoing packets; co-located peers exchange data over lockless producer-consumer circular buffers in shared memory (an OUT FIFO from A to B and an IN FIFO from B to A); a one-bit bidirectional event channel notifies the other endpoint that data is available in the FIFO; a domain discovery module runs alongside the software bridge in Domain 0]
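The interception point is an ordinary Linux netfilter hook. The sketch below, written against the current kernel's netfilter API rather than the one XenLoop originally targeted, shows the idea: steal packets whose destination is a co-located guest and push them into the shared FIFO. is_colocated_peer() and xenloop_enqueue() are hypothetical stand-ins for XenLoop's discovery table and FIFO code, and the chosen hook point is an assumption.

/* Sketch: divert outgoing IPv4 packets destined for a co-located VM into the
 * XenLoop FIFO; everything else continues down the normal netfront path. */
#include <linux/types.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <linux/ip.h>
#include <linux/skbuff.h>
#include <net/net_namespace.h>

bool is_colocated_peer(__be32 daddr);                    /* discovery-table lookup (hypothetical) */
void xenloop_enqueue(struct sk_buff *skb, __be32 daddr); /* copy into OUT FIFO, signal peer        */

static unsigned int xenloop_out_hook(void *priv, struct sk_buff *skb,
                                     const struct nf_hook_state *state)
{
    struct iphdr *iph = ip_hdr(skb);

    if (is_colocated_peer(iph->daddr)) {
        xenloop_enqueue(skb, iph->daddr);   /* data travels via shared memory      */
        return NF_STOLEN;                   /* bypass bridge/netfront entirely     */
    }
    return NF_ACCEPT;                       /* non-co-located traffic: normal path */
}

static const struct nf_hook_ops xenloop_ops = {
    .hook     = xenloop_out_hook,
    .pf       = NFPROTO_IPV4,
    .hooknum  = NF_INET_POST_ROUTING,       /* assumed interception point */
    .priority = NF_IP_PRI_FIRST,
};

/* Registered from the module init path, e.g.:
 *     nf_register_net_hook(&init_net, &xenloop_ops);
 */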
Performance VS Fairness • Preferentially scheduling communication-oriented domains introduces short-term unfairness • Address inter-VM communication characteristics
Xen&Co Sriram Govindan, Arjun R Nath • Prefer VM with most pending network packets • Both to be sent and received • Predict pending packets • Receive prediction • Send prediction • Fairness • Still preserve reservation guarantees over a coarser time scale – PERIOD
Xen & Co Packet Reception Domain 1 Domain 2 Domain n … Guest Domains domain1.pending-- Hypervisor Packet arrive at the NIC Domain0.pending-- Domain0.pending++ NIC Domain0 Domain1.pending++ Interrupt Schedule Domain 1. Now, schedule domain0. 29
Scheduling I/O in VMM Diego Ongaro, Alan L. Cox • Boosting I/O domains • Used when an idle domain is sent a virtual interrupt • Run-queue ordering • Within each state, sorts domains by credits remaining • Tickling too soon • Don’t tickle while sending virtual interrupts
Task-aware Scheduling Hwanju Kim, Hyeontaek Lim • Use task info to determine whether a domain that receives an event notification is I/O-bound • Give the domain a partial boost if it is I/O-bound • Partial boosting • A partially boosted VCPU can preempt a running VCPU and handle the pending event • Whenever it is inferred to be non-I/O-bound, the VMM revokes the CPU from the partially boosted VCPU • Use correlation information to predict whether an event is directed at I/O tasks • Block I/O • Network I/O
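Condensed to C-like pseudocode, the partial-boost lifecycle looks roughly as follows; the correlation test, the preempt/revoke hooks, and the structures are simplified stand-ins for the mechanism described in the paper, not its actual code.

/* Sketch of the partial-boost lifecycle: boost a VCPU when an arriving event
 * correlates with an I/O-bound task inside it, revoke the boost as soon as
 * the VMM infers the VCPU is doing non-I/O-bound work. */
struct guest_vcpu {
    int partially_boosted;
};

void preempt_and_run(struct guest_vcpu *v);   /* run v ahead of the current VCPU (hypothetical) */
void revoke_cpu(struct guest_vcpu *v);        /* return the CPU to the preempted VCPU (hypothetical) */

/* A block or network I/O event arrives for v; event_for_io_task comes from the
 * correlation between the event source and the tasks running inside v. */
static void on_event(struct guest_vcpu *v, int event_for_io_task)
{
    if (event_for_io_task) {
        v->partially_boosted = 1;
        preempt_and_run(v);                   /* handle the pending event promptly */
    }
}

/* The VMM later infers that v is no longer I/O-bound. */
static void on_inferred_non_io_bound(struct guest_vcpu *v)
{
    if (v->partially_boosted) {
        v->partially_boosted = 0;
        revoke_cpu(v);
    }
}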
Atomic Inter-VM Control Transfer Jian Wang, Kartik Gopalan • Dom2 cannot get a time slice as soon as it needs one [Figure: scheduling timeline t1 to t4 in which Dom1, DomX, Dom2, DomX, Dom1, ... run in turn]
AICT [Figure: within one 30 ms time slice, one-way AICT transfers control from Dom1 to Dom2; two-way AICT transfers control from Dom1 to Dom2 and back to Dom1] • Basic Idea • Donate unused time slices to the target domain • Proper Accounting • When the source domain donates a time slice to the target guest, charge credits to the source domain instead of the target domain.
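A compact way to express the donation and accounting rule is sketched below; run_for_ms() and credits_for_ms() are hypothetical helpers, and the structures are simplified, not the actual AICT implementation.

/* Sketch of one-way AICT: the source domain hands the unused remainder of
 * its 30 ms slice to the target, and the credits consumed during that time
 * are charged to the source rather than the target. */
struct dom_sched {
    int credits;          /* proportional-share credits (as in Xen's credit scheduler) */
    int slice_ms_left;    /* unused portion of the current time slice */
};

void run_for_ms(struct dom_sched *d, int ms);   /* dispatch d immediately for ms (hypothetical) */
int  credits_for_ms(int ms);                    /* credit cost of ms of CPU time (hypothetical) */

static void aict_donate(struct dom_sched *src, struct dom_sched *dst)
{
    int donated = src->slice_ms_left;

    src->slice_ms_left = 0;
    run_for_ms(dst, donated);                  /* target runs now, with no run-queue wait */
    src->credits -= credits_for_ms(donated);   /* charge the source, not the target */
}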
Challenges • Real-time guarantees • Coordinate with the guest scheduler • Compositional VM systems [Figure: a multi-tier service composed of a Web Server, an Application Server, and a Database Server running in Dom1, Dom2, and Dom3]
Conclusions For co-located inter-VM communication • Shared memory greatly improves performance • Optimizing the scheduler brings significant benefits
The End Thank You. Questions?
XenLoop Performance [Figure: Netperf UDP_STREAM throughput results]
XenLoop Performance (contd.) Migration Transparency [Figure: throughput as the VMs move between co-located and separated placements, demonstrating migration transparency]
Future Work • Compatibility with routed-mode Xen setup • Implemented; under testing • Packet interception between the socket and transport layers • Do this without changing the kernel • Will reduce 4 copies to 2 (as others do), significantly improving bandwidth performance • XenLoop for Windows guests? • A Windows-to-Linux XenLoop channel • XenLoop architecture is mostly OS agnostic