Differentiated Services == Differentiated Scheduling

The role of the Nova scheduler in managing Quality of Service
Gary Kotton - VMware
Gilad Zlotkin - Radware

Enterprise Ready OpenStack
Migrating existing mission-critical and performance-critical enterprise applications requires:
→ High service levels
  • Availability
  • Performance
  • Security
→ Compliance with existing architectures
  • Multi-tier
  • Fault tolerance models

Service Level for Applications
• Availability
• Performance
  • Transaction Latency (Sec)
  • Transaction Load/Bandwidth (TPS)
• Security
  • Data Privacy
  • Data Integrity
  • Denial of Service
What does all this have to do with the Nova scheduler?

High Availability Models
• Availability Zone Redundancy → The “cloud” way
• Server Redundancy → The “classic” way
• Both Server and Zone Redundancies → The “enterprise” disaster recovery way

Availability Zone Redundancy
[Diagram: global load balancing across availability zones AZ1 and AZ2, with load balancers LB1-LB2, web servers WS1-WS4, and databases DB1-DB2]

Server Redundancy
[Diagram: load balancers LB1-LB2, web servers WS1-WS3, and databases DB1-DB2]

Server and Zone Redundancies
[Diagram: global load balancing across availability zones AZ1 and AZ2, with load balancers LB1-LB4, web servers WS1-WS6, and databases DB1-DB4]

Network Availability
VMware’s NSX, for example
[Diagram: LB1-LB2, WS1-WS3, and DB1-DB2 on a logical network over a transport network, with a controller cluster]

Load Balancer Availability
Radware’s Alteon Load Balancer, for example
[Diagram: LB1 and LB2 as an active/standby pair with WS1-WS3, labeled with auto failover, configuration synchronization, and persistency state synchronization]

Group Scheduling
• Group together VMs to provide a certain service
• Enables scheduling policies per group/sub-group
• Provides a multi-VM application designed for fault tolerance and high performance

Example
Bad placement: if a host goes down, the entire service is down!
Placement strategy - anti-affinity: achieving fault tolerance

Placement Strategies
• Availability - anti-affinity
  • VMs should be placed in different 'failure domains' (e.g., on different hosts) to ensure application fault tolerance
• Performance
  • Network proximity
    • Group members should be placed as closely as possible to one another on the network (same 'connectivity domain') to ensure low latency and high performance
  • Host Capability
    • IO-Intensive, Network-Intensive, CPU-Intensive, ...
  • Storage Proximity
• Security - Resource Isolation/Exclusivity
  • Host, Network, ...

Anti Affinity
• Havana: anti-affinity per group
  • nova boot --hint group=WS[:anti-affinity] --image ws.img --flavor 2 --num 3 WSi
• “Instance Groups”
  • Properties:
    • Policies - for example, anti-affinity
    • Members - the instances that are assigned to the group
    • Metadata - key-value pairs
  • Sadly, did not make the Havana release
  • Work continues in Icehouse with extended functionality
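The anti-affinity policy can be illustrated with a small host filter. This is only a sketch under stated assumptions, not the Nova scheduler's implementation: the function names `filter_anti_affinity` and `place_group` and the host names are hypothetical, and host selection is deliberately naive (first valid host).

```python
from typing import Dict, List, Optional, Set


def filter_anti_affinity(hosts: List[str], group_hosts: Set[str]) -> List[str]:
    """Keep only hosts that do not already run a member of the group."""
    return [h for h in hosts if h not in group_hosts]


def place_group(instances: List[str], hosts: List[str]) -> Optional[Dict[str, str]]:
    """Place each instance on a distinct host; return None if that is impossible."""
    placement: Dict[str, str] = {}
    for instance in instances:
        # Hosts already used by this group are excluded (hypothetical filter step).
        candidates = filter_anti_affinity(hosts, set(placement.values()))
        if not candidates:
            return None  # fewer hosts than group members
        placement[instance] = candidates[0]  # naive choice: first valid host
    return placement


placement = place_group(["WS1", "WS2", "WS3"], ["host-a", "host-b", "host-c"])
print(placement)
```

With three web servers and three hosts, each instance lands on its own host, so losing one host takes down at most one group member.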

Network Proximity (Same Rack)

Host Capabilities
- IO intensive
- CPU intensive
- Network intensive
→ “Smart resource placement” - Yathi Udupi and Debo Dutta (Cisco)
→ “Host Capabilities” - Don Dugger (Intel)

Storage Proximity
• Schedule instances to have affinity to Cinder volumes
→ “Scheduling Across Services” - Boris Pavlovic (Mirantis) and Alex Glikson (IBM)
→ “Smart resource placement” - Yathi Udupi and Debo Dutta (Cisco)

Resource Exclusivity
• Network Isolation: Neutron, for example VMware’s NSX
• Host Allocation: enable a user to have a pool of hosts for exclusive use
→ “Private Clouds - Whole Host Allocation” - Phil Day (HP), Andrew Laski (Rackspace)

Additional Scheduling Topics
→ “Scheduler Performance” - Boris Pavlovic (Mirantis)
→ “Methods to Improve DB Host Statistics” - Shane Wang and Lianhau Lu (Intel)
→ “Scheduler Metrics - Relationship with Ceilometer” - Paul Murray (HP)
→ “Multiple Scheduler Policies” - Alex Glikson (IBM)

Icehouse
• Expand on “Instance Group” support
• Topology of resources and relationships between them
  • Debo Dutta and Yathi Udupi (Cisco)
  • Mike Spreitzer (IBM)
  • Gary Kotton (VMware)

API - Aiming for I1
• Proposed API (Nova Extension)
  • id - a unique UUID
  • name - human-readable name
  • tenant_id - the ID of the tenant that owns the group
  • policies - a list of policies for the group (anti-affinity, network proximity, and host capabilities)
  • metadata - a way to store arbitrary key-value pairs on a group
  • members - UUIDs of all of the instances that are members of the group
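The proposed fields can be pictured as a simple record. The sketch below is an assumption made for illustration and not the actual Nova extension: the class name `InstanceGroup`, the field types, and the sample values are hypothetical; only the field names come from the slide.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InstanceGroup:
    """Hypothetical in-memory model of the proposed instance group."""
    name: str                                                  # human-readable name
    tenant_id: str                                             # tenant that owns the group
    policies: List[str] = field(default_factory=list)          # e.g. ["anti-affinity"]
    metadata: Dict[str, str] = field(default_factory=dict)     # arbitrary key-value pairs
    members: List[str] = field(default_factory=list)           # instance UUIDs
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # unique UUID


# A new group starts with a policy and no members.
group = InstanceGroup(name="WS", tenant_id="tenant-1", policies=["anti-affinity"])
print(group)
```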

Flow
• Group will be created with no members
• Group will have a policy
• Group ID will be used for scheduling
  • Passed as a hint
• Scheduler will update members
• Pending support for group of groups
• Group membership will be removed when instance is deleted
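The flow can be walked through with an in-memory sketch. Everything here is an assumption for illustration: the `groups` and `instance_hosts` stores and the functions `create_group`, `schedule`, and `delete_instance` are hypothetical stand-ins, not Nova APIs, and host selection simply takes the first host that satisfies the policy.

```python
import uuid
from typing import Dict, List

# Hypothetical in-memory stand-ins for the group store and instance-to-host map.
groups: Dict[str, dict] = {}
instance_hosts: Dict[str, str] = {}


def create_group(name: str, policy: str) -> str:
    """Create a group with a policy and no members; return its ID."""
    group_id = str(uuid.uuid4())
    groups[group_id] = {"name": name, "policies": [policy], "members": []}
    return group_id


def schedule(instance_id: str, hints: Dict[str, str], hosts: List[str]) -> str:
    """Pick a host using the group hint, then record the new member."""
    group = groups[hints["group"]]
    candidates = list(hosts)
    if "anti-affinity" in group["policies"]:
        taken = {instance_hosts[m] for m in group["members"]}
        candidates = [h for h in hosts if h not in taken]
    if not candidates:
        raise RuntimeError("no host satisfies the group policy")
    instance_hosts[instance_id] = candidates[0]
    group["members"].append(instance_id)  # the scheduler updates membership
    return candidates[0]


def delete_instance(instance_id: str) -> None:
    """Remove the instance and drop it from any group it belongs to."""
    instance_hosts.pop(instance_id, None)
    for group in groups.values():
        if instance_id in group["members"]:
            group["members"].remove(instance_id)


gid = create_group("WS", "anti-affinity")
schedule("ws-1", {"group": gid}, ["host-a", "host-b"])
schedule("ws-2", {"group": gid}, ["host-a", "host-b"])
delete_instance("ws-1")
print(groups[gid]["members"], instance_hosts)
```

Keeping membership updates inside the scheduling step mirrors the slide's point that the caller only passes the group ID as a hint and never edits the member list directly.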

Summary
Migrating existing mission-critical and performance-critical enterprise applications requires:
High service levels → Group Scheduling Policies
• Availability → Anti-Affinity
• Performance → Proximity / Host Capability
• Security → Resource Exclusivity

Q&A
Thank You
Gary Kotton: gkotton@vmware.com
Gilad Zlotkin: gzlotkin@radware.com
