
Research Proficiency Exam



  1. Research Proficiency Exam Survey of State-of-the-art in Inter-VM Communication Mechanisms Jian Wang

  2. Talk Outline • Introduction • Shared memory research • Scheduler optimization research • Challenges and problems

  3. Motivation: Why Inter-VM Communication? • Virtualization technology is mainly focused on building the isolation barrier between co-located VMs. • However, applications often need to talk across this isolation barrier. • E.g., high-performance grid apps, web services, virtual network appliances, transaction processing, graphics rendering. [Diagram: Virtual Machine A and Virtual Machine B on one physical machine, both running above the hypervisor (or virtual machine monitor)]

  4. Why not just use TCP or UDP? • Transparent to applications BUT • High communication overhead between co-located VMs

  5. Xen and Xen Networking Subsystem [Diagram: the communication data path between co-located VMs, with each packet (PKT) traversing Domain 0]

  6. Packet Routed [Diagram: VM 1 → Domain-0 → VM 2, mediated by Xen] • Put the packet into a page • Ask Xen to transmit the pages • Ask Xen to swap/copy the pages

  7. Basic Idea: Shared Memory Advantages of using Shared Memory: • No need for per-packet processing • Pages reused in circular buffer • Writes are visible immediately • Fewer hypercalls (only for signaling)
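A minimal sketch (not from the talk) of the lockless single-producer/single-consumer ring that such shared-memory channels typically build in a shared page; all names and sizes are illustrative:

    /* Single-producer/single-consumer ring in a shared page.
     * No locks: the producer writes only head, the consumer writes
     * only tail.  Writes become visible to the peer immediately; a
     * hypercall is needed only to signal "data available". */
    #include <stdint.h>

    #define RING_SIZE 4096u                 /* one shared page */

    struct ivc_ring {
        volatile uint32_t head;             /* free-running producer index */
        volatile uint32_t tail;             /* free-running consumer index */
        uint8_t data[RING_SIZE];
    };

    /* Producer side: returns 0 on success, -1 if the ring is full. */
    static int ring_put(struct ivc_ring *r, const void *buf, uint32_t len)
    {
        uint32_t head = r->head, tail = r->tail;

        if (RING_SIZE - (head - tail) < len)
            return -1;                      /* full: peer must drain first */
        for (uint32_t i = 0; i < len; i++)
            r->data[(head + i) % RING_SIZE] = ((const uint8_t *)buf)[i];
        __sync_synchronize();               /* publish data before head */
        r->head = head + len;
        return 0;
    }

The ring's pages are reused indefinitely, which is why per-packet page swapping and its hypercalls disappear.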

  8. With Shared Memory [Diagram: VM 1 and VM 2 above Xen] • Allocate one pool of pages • Ask Xen to share the pages
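For the setup step itself, a hedged sketch of granting a page to a peer domain from a Linux guest follows; gnttab_grant_foreign_access() is a real pvops-kernel helper, but names and signatures vary across kernel versions, so treat this as illustrative:

    /* Grant a peer domain access to one page (Linux Xen guest).
     * The returned grant reference is advertised out of band
     * (e.g. via XenStore) so the peer can map the page. */
    #include <linux/errno.h>
    #include <linux/gfp.h>
    #include <linux/mm.h>
    #include <xen/grant_table.h>
    #include <xen/page.h>

    static int share_page_with_peer(domid_t peer)
    {
        struct page *page = alloc_page(GFP_KERNEL);
        int gref;

        if (!page)
            return -ENOMEM;

        /* last argument 0 = grant read/write access to the peer */
        gref = gnttab_grant_foreign_access(peer, xen_page_to_gfn(page), 0);
        if (gref < 0)
            __free_page(page);
        return gref;    /* peer maps it with GNTTABOP_map_grant_ref */
    }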

  9. Shared Memory End Goals • 1. Performance: high throughput, low latency, and acceptable CPU consumption. • 2. Transparency: don't change the app; don't change the kernel. • 3. Dynamism: on-the-fly setup/teardown of channels, auto discovery, migration support.

  10. Hypervisor Scheduler [Timeline diagram: time slices t1, t2, t3, t4 with domains scheduled in turn: Dom1, DomX, Dom2, DomX, Dom1, ...; Dom1 and Dom2 are the communicating pair]

  11. Scheduler-Induced Delays [Diagram: JBoss sends query1/query2 to a DB and waits for reply1/reply2. Running on dedicated servers, only network latency separates query and reply; running on a consolidated server, scheduler-induced delays are added on top.]

  12. Hypervisor Scheduler • Lack of communication awareness in the VCPU scheduler • Lack of knowledge of the timing requirements of tasks/applications within each VM • Absence of support for real-time inter-VM interactions • Unpredictability of current VM scheduling mechanisms

  13. End Goals for Scheduler Optimizations • Low latency • Independent of other domains’ workloads • Predictable

  14. Shared Memory Research

  15. XenSocket (Xiaolan Zhang, Suzanne McIntosh) • Shared memory between two domains • One-way communication pipe • Below the socket layer • Bypasses the TCP/IP stack • No auto discovery, no migration support, no transparency

  16. XenSocket API [Table: standard sockets vs. XenSocket]
  • Server, INET: socket(); bind(sockaddr_inet); listen(); accept(); addressing: local port #
  • Server, Xen: socket(); bind(sockaddr_xen); addressing: remote VM #; the system returns a grant # for the client
  • Client, INET: socket(); connect(sockaddr_inet); addressing: remote address, remote port #
  • Client, Xen: socket(); connect(sockaddr_xen); addressing: remote VM #, remote grant #
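A hedged sketch of what this call sequence looks like in C; AF_XEN and the sockaddr_xen layout below are illustrative reconstructions of the interface described on the slide, not a portable API:

    /* Illustrative XenSocket usage.  AF_XEN and struct sockaddr_xen
     * are reconstructions for exposition, not a stable ABI. */
    #include <sys/socket.h>
    #include <stdint.h>

    #define AF_XEN 32                      /* assumed family number */

    struct sockaddr_xen {
        sa_family_t sxen_family;           /* AF_XEN */
        uint16_t    remote_domid;          /* peer VM # */
        uint32_t    remote_gref;           /* grant # of the shared pool */
    };

    /* Receiver: bind() allocates the shared pool; the resulting
     * grant # is handed to the sender out of band. */
    int make_receiver(uint16_t peer_domid)
    {
        struct sockaddr_xen a = { AF_XEN, peer_domid, 0 };
        int s = socket(AF_XEN, SOCK_STREAM, 0);
        bind(s, (struct sockaddr *)&a, sizeof(a));
        return s;                          /* read() drains the pipe */
    }

    /* Sender: connect() maps the advertised grant and opens the
     * one-way pipe toward the receiver. */
    int make_sender(uint16_t peer_domid, uint32_t gref)
    {
        struct sockaddr_xen a = { AF_XEN, peer_domid, gref };
        int s = socket(AF_XEN, SOCK_STREAM, 0);
        connect(s, (struct sockaddr *)&a, sizeof(a));
        return s;                          /* write() feeds the pipe */
    }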

  17. XWAY (Kangho Kim, Cheiyol Kim) • Bi-directional communication • Transparent to applications • Below the socket layer • Significant kernel modifications, no migration support, TCP only

  18. XWAY Channel [Diagram: a channel between Domain A and Domain B consisting of an event channel plus paired send queues (SQ) and receive queues (RQ), each ring carrying its own head and tail pointers]
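An illustrative C layout of one endpoint of such a channel (field and type names are mine, not XWAY's):

    /* One endpoint of a bi-directional shared-memory channel:
     * my SQ is the peer's RQ and vice versa, so two rings plus one
     * event channel give full-duplex transport below the socket layer. */
    #include <stdint.h>

    struct xway_queue {
        volatile uint32_t head;     /* producer index */
        volatile uint32_t tail;     /* consumer index */
        uint8_t data[8192];         /* payload area in shared pages */
    };

    struct xway_channel {
        struct xway_queue *sq;      /* we produce here; peer consumes */
        struct xway_queue *rq;      /* peer produces here; we consume */
        uint32_t evtchn;            /* Xen event channel for wakeups */
    };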

  19. Inter-VM Communication (IVC) (Wei Huang, Matthew Koop) • IVC library providing efficient intra-physical-node communication through shared memory • Provides auto discovery and migration support • Neither user transparency nor kernel transparency fully supported; only the MPI protocol is supported

  20. IVC • IVC consists of two parts: • A user-space communication library • A kernel driver • Uses a general socket-style interface.
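A hypothetical sketch of what a socket-style exchange over such a library looks like; ivc_connect/ivc_write/ivc_read/ivc_close are placeholder names standing in for the real API:

    /* Placeholder socket-style interface for a shared-memory
     * library in the style of IVC; none of these names are real. */
    #include <stddef.h>

    typedef struct ivc_conn ivc_conn_t;           /* opaque channel */

    ivc_conn_t *ivc_connect(int peer_vm_id);      /* set up shared pages */
    int  ivc_write(ivc_conn_t *c, const void *buf, size_t len);
    int  ivc_read(ivc_conn_t *c, void *buf, size_t len);
    void ivc_close(ivc_conn_t *c);                /* tear down, e.g. so a
                                                     VM can migrate away */

    void example(void)
    {
        ivc_conn_t *c = ivc_connect(2);           /* peer VM id 2 */
        char reply[4];

        ivc_write(c, "ping", 4);
        ivc_read(c, reply, sizeof(reply));
        ivc_close(c);
    }

The kernel driver behind the library does the actual page-sharing work; the library exposes only the byte-stream view, which is how the MPI layer plugs into it.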

  21. MMNet (Prashanth Radhakrishnan, Kiran Srinivasan) • Maps in the entire physical memory of the peer VM • Zero copy between guest kernels • On-the-fly setup/teardown of channels not supported • In their model, VMs need to fully trust each other, which is not practical.

  22. MMNet [Architecture diagram]

  23. XenLoop Overview (Jian Wang, Kartik Gopalan) • Enables direct traffic exchange between co-located VMs • Transparency for applications and libraries • Kernel transparency • Automatic discovery of co-located VMs • On-the-fly setup/teardown of XenLoop channels • Migration transparency

  24. XenLoop Architecture [Diagram: in each of Virtual Machine A and Virtual Machine B the stack is Applications, Socket Layer, Transport Layer, Network Layer, XenLoop Layer, Netfront; Domain 0 runs the software bridge and domain discovery. Between A and B sit lockless producer-consumer circular buffers (an OUT FIFO A-to-B and an IN FIFO B-to-A) plus an event channel: a one-bit bidirectional channel to notify the other endpoint that data is available in the FIFO. A netfilter hook captures and examines outgoing packets.]
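The interception point is a standard Linux netfilter hook. A sketch follows; the registration shown uses the real (modern) kernel API, while xenloop_peer() and fifo_send() are hypothetical helpers standing in for XenLoop's lookup table and FIFO path:

    /* Divert outgoing IPv4 packets addressed to co-located guests
     * into the shared-memory FIFO; everything else continues down
     * the normal netfront/bridge path. */
    #include <linux/ip.h>
    #include <linux/netfilter.h>
    #include <linux/netfilter_ipv4.h>
    #include <linux/skbuff.h>

    int xenloop_peer(__be32 daddr);      /* hypothetical: co-located? */
    int fifo_send(struct sk_buff *skb);  /* hypothetical: enqueue on FIFO */

    static unsigned int xenloop_out(void *priv, struct sk_buff *skb,
                                    const struct nf_hook_state *state)
    {
        struct iphdr *iph = ip_hdr(skb);

        if (xenloop_peer(iph->daddr) && fifo_send(skb) == 0)
            return NF_STOLEN;            /* consumed by the XenLoop channel */
        return NF_ACCEPT;                /* fall through to netfront */
    }

    static struct nf_hook_ops xenloop_ops = {
        .hook     = xenloop_out,
        .pf       = NFPROTO_IPV4,
        .hooknum  = NF_INET_POST_ROUTING,
        .priority = NF_IP_PRI_FIRST,
    };
    /* at module init: nf_register_net_hook(&init_net, &xenloop_ops); */

Because the hook sits below the network layer, applications and the kernel's socket/transport code are untouched, which is where the transparency claims above come from.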

  25. Scheduler Optimization Research

  26. Preferentially Scheduling Communication-Oriented Domains • Introduces short-term unfairness: performance vs. fairness • Addresses inter-VM communication characteristics

  27. Xen&Co (Sriram Govindan, Arjun R. Nath) • Prefer the VM with the most pending network packets • Both to be sent and to be received • Predict pending packets • Receive prediction • Send prediction • Fairness • Still preserves reservation guarantees over a coarser time scale, the PERIOD
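A toy sketch of that selection rule (illustrative types only, not Xen scheduler code):

    /* Pick the runnable domain with the most predicted pending
     * packets, but never one that has exhausted its CPU reservation
     * for the current PERIOD, so coarse-grained fairness survives. */
    #include <stddef.h>

    struct dom {
        int  pending;       /* predicted packets to send + receive */
        long cpu_used;      /* CPU consumed in the current PERIOD */
        long reservation;   /* CPU share guaranteed per PERIOD */
    };

    struct dom *pick_next(struct dom *runq, int n)
    {
        struct dom *best = NULL;

        for (int i = 0; i < n; i++) {
            if (runq[i].cpu_used >= runq[i].reservation)
                continue;   /* out of reservation: no preference */
            if (!best || runq[i].pending > best->pending)
                best = &runq[i];
        }
        return best;        /* NULL: fall back to the default policy */
    }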

  28. Xen&Co Packet Reception [Diagram: a packet arriving at the NIC raises an interrupt in the hypervisor; the pending counters of Domain 0 and the destination Domain 1 are incremented (Domain0.pending++, Domain1.pending++); scheduling Domain 0 and then Domain 1 drains them again (Domain0.pending--, domain1.pending--).]

  29. Scheduling I/O in VMM (Diego Ongaro, Alan L. Cox) • Boosting I/O domains: used when an idle domain is sent a virtual interrupt • Run-queue ordering: within each state, sort domains by credits remaining • Tickling too soon: don't tickle while sending virtual interrupts

  30. Task-aware Scheduling (Hwanju Kim, Hyeontaek Lim) • Use task info to determine whether a domain that receives an event notification is I/O-bound • Give the domain a partial boost if it is I/O-bound • Partial boosting • A partially boosted VCPU can preempt a running VCPU and handle the pending event • Whenever it is inferred to be non-I/O-bound, the VMM revokes the CPU from the partially boosted VCPU • Use correlation information to predict whether an event is directed to I/O tasks • Block I/O • Network I/O
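A compressed sketch of the boost/revoke state machine (all names illustrative):

    /* Partial boosting in miniature: boost a VCPU on an event the
     * correlation history attributes to an I/O-bound task, and
     * revoke the CPU as soon as inference flips to non-I/O-bound. */
    enum task_kind { IO_BOUND, NON_IO_BOUND };

    struct vcpu {
        int boosted;                   /* currently partially boosted? */
    };

    void on_event(struct vcpu *v, enum task_kind inferred)
    {
        if (inferred == IO_BOUND)
            v->boosted = 1;            /* may preempt the running VCPU */
    }

    void on_inference(struct vcpu *v, enum task_kind inferred)
    {
        if (v->boosted && inferred == NON_IO_BOUND)
            v->boosted = 0;            /* revoke: back to normal credits */
    }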

  31. Atomic Inter-VM Control Transfer (AICT) (Jian Wang, Kartik Gopalan) • Problem: Dom2 cannot get a time slice as early as it is needed [Timeline diagram: Dom1, DomX, Dom2, DomX, Dom1, ... across time slices t1, t2, t3, t4]

  32. AICT [Diagram: within one time slice (30 ms), one-way AICT transfers control from Dom1 to Dom2; two-way AICT transfers Dom1 to Dom2 and back to Dom1] • Basic idea: donate unused time slices to the target domain • Proper accounting: when the source domain donates a time slice to the target guest, charge credits to the source domain instead of the target domain.
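A minimal sketch of that accounting rule (illustrative types, not the Xen credit scheduler):

    /* Record the donor as the slice's owner so that, however the
     * slice is consumed, credits are charged to the source domain
     * rather than the target it was donated to. */
    struct domain {
        long credits;             /* credit-scheduler balance */
    };

    struct slice {
        struct domain *owner;     /* who pays for this slice */
        long remaining_us;        /* time left in the slice */
    };

    /* One-way AICT: the target starts running at once, but billing
     * stays with the source. */
    void aict_donate(struct slice *s, struct domain *source)
    {
        s->owner = source;
    }

    /* At the next accounting tick, charge the recorded owner. */
    void account(struct slice *s, long used_us)
    {
        s->remaining_us -= used_us;
        s->owner->credits -= used_us;
    }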

  33. Challenges • Real-time guarantees • Coordination with the guest scheduler • Compositional VM systems [Diagram: a web server, application server, and database server pipeline spread across Dom1, Dom2, and Dom3]

  34. Conclusions For co-located inter-VM communication: • Shared memory greatly improves performance • Optimizing the scheduler yields substantial benefits

  35. The End Thank You. Questions?

  36. Backup slides

  37. XenLoop Performance [Plot: Netperf UDP_STREAM throughput]

  38. XenLoop Performance (contd.)

  39. XenLoop Performance (contd.)

  40. XenLoop Performance (contd.) [Plot: migration transparency; throughput as the VMs start separated, become co-located, and are separated again]

  41. Future Work • Compatibility with routed-mode Xen setup • Implemented; under testing. • Packet interception between socket and transport layers • Do this without changing the kernel. • Will reduce 4 copies to 2 (as in other systems), significantly improving bandwidth performance. • XenLoop for Windows guests? • Windows ↔ Linux XenLoop channel • XenLoop architecture is mostly OS-agnostic.
