Prioritizing Local Inter-Domain Communication in Xen Sisu Xi, Chong Li, Chenyang Lu, and Christopher Gill Cyber-Physical Systems Laboratory Washington University in St. Louis IEEE/ACM International Symposium on Quality of Service, 2013
Motivation
• Multiple computing elements
• Cost! Weight! Power!
• Communicate via dedicated network or real-time networks
• Use fewer computing platforms to integrate independently developed systems via virtualization
Physically Isolated Hosts -> Common Computing Platforms
Network Communication -> Local Inter-Domain Communication
How to guarantee QoS with virtualization?
System Model and Contributions • We focus on • Xen as the underlying virtualization software • Single core for each virtual machine on a multi-core platform • Local Inter-Domain Communication (IDC) • No modification to the guest domain besides the Xen patch • Contributions • Real-Time Communication Architecture (RTCA) in Xen • Reduces high priority IDC latency from ms to us in the presence of low priority IDC
Background – Xen Overview
[Architecture diagram: guest Domains 1 and 2 (applications A and B) use netfront drivers; Domain 0 runs the netback driver, softnet_data, and the NIC driver; the VMM scheduler places all VCPUs onto the physical cores, and the NIC connects to the hardware.]
Part I – VMM Scheduler: Limitations
• Default credit scheduler: schedules VCPUs in round-robin order
• RT-Xen scheduling framework: schedules VCPUs by priority
• Server-based mechanism: each VCPU has a (budget, period)
• However, if the execution time is < 0.5 ms, the VCPU budget is not consumed
• Solution: dual quanta, ms for scheduling and us for time accounting
"RT-Xen: Towards Real-Time Hypervisor Scheduling in Xen", ACM International Conference on Embedded Software (EMSOFT), 2011
"Realizing Compositional Scheduling through Virtualization", Real-Time and Embedded Technology and Applications Symposium (RTAS), 2012
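To make the dual-quanta idea concrete, below is a minimal user-space C sketch, assuming a simple fixed-priority server model; it is an illustration only, not the RT-Xen source, and all names (vcpu_server, account_runtime, pick_next) and values are hypothetical. The scheduler is invoked at millisecond granularity, but budgets are charged in microseconds, so a run shorter than 0.5 ms still consumes budget.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

struct vcpu_server {
    int      priority;          /* smaller value = higher priority        */
    uint64_t period_us;         /* replenishment period, in microseconds  */
    uint64_t budget_us;         /* budget per period, in microseconds     */
    uint64_t remaining_us;      /* budget left in the current period      */
    uint64_t next_replenish_us; /* start of the next period               */
};

/* Charge the exact run time in microseconds; a run shorter than the
 * scheduler's millisecond quantum is still accounted for. */
static void account_runtime(struct vcpu_server *v, uint64_t ran_us)
{
    v->remaining_us = (ran_us >= v->remaining_us) ? 0 : v->remaining_us - ran_us;
}

/* Refill the budget whenever a new period starts. */
static void maybe_replenish(struct vcpu_server *v, uint64_t now_us)
{
    while (now_us >= v->next_replenish_us) {
        v->remaining_us = v->budget_us;
        v->next_replenish_us += v->period_us;
    }
}

/* Invoked at millisecond granularity: pick the highest-priority VCPU
 * that still has budget left, or NULL to idle. */
static struct vcpu_server *pick_next(struct vcpu_server *vcpus, size_t n,
                                     uint64_t now_us)
{
    struct vcpu_server *best = NULL;
    for (size_t i = 0; i < n; i++) {
        maybe_replenish(&vcpus[i], now_us);
        if (vcpus[i].remaining_us > 0 &&
            (best == NULL || vcpus[i].priority < best->priority))
            best = &vcpus[i];
    }
    return best;
}

int main(void)
{
    struct vcpu_server vcpus[2] = {
        { .priority = 0, .period_us = 10000, .budget_us = 4000,
          .remaining_us = 4000, .next_replenish_us = 10000 },
        { .priority = 1, .period_us = 20000, .budget_us = 10000,
          .remaining_us = 10000, .next_replenish_us = 20000 },
    };

    /* One scheduling decision at t = 0, then charge a 300 us run;
     * with ms-granularity accounting this run would be lost. */
    struct vcpu_server *v = pick_next(vcpus, 2, 0);
    account_runtime(v, 300);
    printf("VCPU prio %d has %llu us of budget left\n",
           v->priority, (unsigned long long)v->remaining_us);
    return 0;
}
```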
Part I – VMM Scheduler: Evaluation
[Setup diagram: Linux 3.4.2 guests; a packet is sent every 10 ms between the measured pair of domains, 5,000 data points; Dom 3 to Dom 10 generate interference at 100% CPU; Dom 0 through Dom 10 are pinned across cores C0 to C5; VMM scheduler: RT-Xen vs. Credit.]
When Domain 0 is not busy, the VMM scheduler dominates the IDC performance for higher priority domains
Part I – VMM Scheduler: Enough?
[Setup diagram: Dom 0 through Dom 5 pinned across cores C0 to C5, interfering domains at 100% CPU, all managed by the VMM scheduler.]
Part II – Domain 0: Background
[Diagram: a single netback[0] kernel thread runs rx_action() and tx_action(), serving the netif TX/RX rings of all guests (Domain 1 and 2 with applications A and B, Domain m and n with C and D); all received packets go into one softnet_data queue.]
• Packets are fetched in a round-robin order
• All domains share one queue in softnet_data
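To illustrate why this design suffers priority inversion, here is a toy user-space C simulation with made-up domain and packet counts (it is not the actual netback or softnet_data code): the guest netifs are drained round-robin into one shared FIFO queue, so a high priority packet that arrives behind a burst of low priority IDC is processed only after everything already queued ahead of it.

```c
#include <stdio.h>

#define QLEN 256

static int shared_queue[QLEN];   /* the single softnet_data queue; entries are domain ids */
static int head = 0, tail = 0;

static void enqueue(int dom) { shared_queue[tail++ % QLEN] = dom; }

int main(void)
{
    int pending[4] = {0, 0, 50, 50};   /* dom2 and dom3 each have 50 low priority packets */

    /* Low priority rings are drained round-robin into the shared queue... */
    for (int more = 1; more; ) {
        more = 0;
        for (int d = 0; d < 4; d++)
            if (pending[d] > 0) { pending[d]--; enqueue(d); more = 1; }
    }

    /* ...then a single high priority packet arrives from dom1. */
    enqueue(1);

    /* FIFO processing: count how many packets precede the high priority one. */
    int position = 0;
    while (head < tail) {
        int dom = shared_queue[head++ % QLEN];
        position++;
        if (dom == 1) {
            printf("high priority packet processed as packet #%d of %d\n",
                   position, tail);
            break;
        }
    }
    return 0;
}
```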
Part II – Domain 0: RTCA
[Diagram: netback[0] still runs rx_action() and tx_action() over the guests' netif TX/RX rings (applications A and B in Domains 1, 2, m, and n), but softnet_data now holds separate queues.]
• Packets are fetched by priority, up to a batch size
• Queues in softnet_data are separated by priority
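For contrast, the following sketch, under the same toy assumptions (hypothetical names and counts, not the actual RTCA patch), illustrates the RTCA fetch policy: netback always serves the highest-priority domain with pending packets, fetches at most batch_size packets before re-checking priorities, and softnet_data keeps one queue per priority level.

```c
#include <stdio.h>

#define NUM_DOMS  4
#define NUM_PRIOS 4

static int pending[NUM_DOMS]  = {0, 1, 50, 50};  /* dom1 holds one high priority packet */
static int priority[NUM_DOMS] = {0, 0, 2, 3};    /* 0 = highest priority                */
static int softnet_q[NUM_PRIOS];                 /* one queue per priority level        */

/* Highest-priority domain that still has packets waiting, or -1. */
static int pick_dom(void)
{
    int best = -1;
    for (int d = 0; d < NUM_DOMS; d++)
        if (pending[d] > 0 && (best < 0 || priority[d] < priority[best]))
            best = d;
    return best;
}

int main(void)
{
    const int batch_size = 1;   /* small batch => tighter bound on priority inversion */
    int d, slot = 0;

    while ((d = pick_dom()) >= 0) {
        /* Fetch at most batch_size packets from the chosen domain,
         * then go back and re-evaluate priorities. */
        for (int i = 0; i < batch_size && pending[d] > 0; i++) {
            pending[d]--;
            softnet_q[priority[d]]++;            /* per-priority softnet_data queue */
            if (priority[d] == 0)
                printf("high priority packet fetched in slot %d\n", slot + 1);
            slot++;
        }
    }
    printf("total packets moved: %d\n", slot);
    return 0;
}
```

With batch_size set to 1, the high priority packet is moved in the very first slot instead of waiting behind the low priority backlog; a larger batch size lets more low priority packets be fetched before priorities are re-checked, which matches the latency trade-off reported on the following slides.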
Part II – Domain 0: Evaluation Setup
[Setup diagram: Dom 1 and Dom 2 form the base pair, sending a packet every 10 ms (5,000 data points); the remaining domains generate light, medium, and heavy interference at 100% CPU; domains pinned across cores C0 to C5; Domain 0 kernel: Original vs. RTCA; scheduling by the VMM scheduler.]
Part II – Domain 0: Latency
IDC latency between Domain 1 and Domain 2 in the presence of low priority IDC (us)
• When there is no interference, IDC performance is comparable
• The original Domain 0 performs poorly in all cases, due to priority inversion within Domain 0
• RTCA with batch size 1 performs best: it eliminates most of the priority inversions
• RTCA with larger batch sizes performs worse under IDC interference
By reducing priority inversion in Domain 0, RTCA can effectively mitigate the impact of low priority IDC on the latency of high priority IDC
Part II – Domain 0: Throughput iPerf Throughput between Dom 1 and Dom 2 A small batch size leads to significant reduction in high priority IDC latency and improved IDC throughput under interfering traffic
Other Approaches and Future Work
• Shared memory approaches [XWAY, XenLoop, XenSocket]
• Require modification to the guest OS or applications
• Traffic Control in Linux [www.lartc.org]
• Applied within one device; cannot be directly applied to IDC
• Future Work
• Multi-core VM scheduling
• Network Interface Card (NIC)
• Rate control
• Coordination with the VMM scheduler
Conclusion
• VMM scheduler alone cannot guarantee IDC latency
• RTCA: Real-Time Communication Architecture
• RTCA + RT-Xen reduces high priority IDC latency from ms to us in the presence of low priority IDC
• https://sites.google.com/site/realtimexen/
Thank You! Questions?
Why IDC? Why Xen?
• Embedded Systems • Integrated Modular Avionics • ARINC 653 Standard
• Honeywell claims that IMA design can save 350 pounds of weight on a narrow-body jet: equivalent to two adults
• http://www.artist-embedded.org/docs/Events/2007/IMA/Slides/ARTIST2_IMA_WindRiver_Wilson.pdf
• ARINC 653 Hypervisor: VanderLeest S.H., Digital Avionics Systems Conference (DASC), 2010
• Full Virtualization based ARINC 653 partition: Sanghyun Han, Digital Avionics Systems Conference (DASC), 2011
End-to-End Task Performance
• Dom 1 & Dom 2: 60% CPU each
• Dom 3 to Dom 10: 10% CPU each, 4 pairs bouncing packets
[Setup diagram: tasks T1(10, 2), T2(20, 2), T3(20, 2), and T4(30, 2) are distributed across the domains; interference ranges from light to heavy; Dom 11 to Dom 13 add 100% CPU load; domains pinned across cores C0 to C5; Domain 0: Original vs. RTCA; VMM scheduler: Credit vs. RT-Xen.]
End-to-End Task Performance By combining the RT-Xen VMM scheduler and the RTCA Domain 0 kernel, we can deliver end-to-end real-time performance to tasks involving both computation and communication
(1). Xen Virtual Network
• Transparent • Isolation • General • Migration
• Performance • Data Integrity • Multicast
[Stack diagram: an application in Domain-U opens socket(AF_INET, SOCK_DGRAM, 0) or socket(AF_INET, SOCK_STREAM, 0) and calls sendto()/recvfrom(); packets traverse the INET/TCP/UDP/IP stack and the netfront driver in Domain-U, then the netback driver in Domain-0, across the VMM.]
(2). XWay, VEE’08
• Transparent? • Performance • Dynamic Create/Destroy • Live Migration
• Connect Overhead • Patch Guest OS • No UDP • Complicated
[Stack diagram: applications in Domain-U still use socket(AF_INET, SOCK_DGRAM, 0) / socket(AF_INET, SOCK_STREAM, 0) with sendto()/recvfrom(); an XWAY switch below INET directs TCP traffic to the XWAY protocol and XWAY driver, while UDP and IP traffic continue through netfront, all above the VMM.]
(3). XenSocket, Middleware’07 (IBM)
• No Modification to OS/Xen • One-way Communication • Performance • Transparent
[Stack diagram: in addition to the normal socket(AF_INET, SOCK_DGRAM, 0) / socket(AF_INET, SOCK_STREAM, 0) calls, an application can open socket(AF_XEN, …); the AF_Xen family sits alongside the INET/TCP/UDP/IP/netfront stack, above the VMM.]
(4). XenLoop, HPDC’08 (Binghamton)
• No Modification to OS/Xen • Transparent • Performance • Migration
• Overhead • Isolation? • Dynamic teardown?
[Stack diagram: applications use the ordinary socket(AF_INET, SOCK_DGRAM, 0) / socket(AF_INET, SOCK_STREAM, 0) API with sendto()/recvfrom(); XenLoop sits below the INET/TCP/UDP/IP stack and above netfront, all above the VMM.]
[Backup diagram: a priority-kthreads design with multiple netback kernel threads; prioritized netback kthreads serve the netif TX/RX rings of Domain-1, Domain-2, …, Domain-m, Domain-n and softnet_data, while the netback kthread paired with the NIC driver runs at the highest priority.]