OpenStack Resource Scheduler
Nova Scheduler • Takes a VM instance request and determines where it should run • Interacts with other components through the message queue • Makes decisions by collecting information about compute resources
FilterScheduler • The scheduling process is divided into the following phases: • Getting the current state of all compute nodes: this generates a list of candidate hosts • The filtering phase generates a list of suitable hosts by applying filters • The weighting phase sorts the hosts according to their weighted cost scores, which are computed by applying cost functions
Filtering • The filters pass only hosts that: • Have not already been attempted for this scheduling request (RetryFilter) • Are in the requested availability zone (AvailabilityZoneFilter) • Have sufficient RAM available (RamFilter) • Are capable of servicing the request (ComputeFilter) • Satisfy the extra specs associated with the instance type (ComputeCapabilitiesFilter) • Satisfy any architecture, hypervisor type, or virtual machine mode specified in the instance's image properties (ImagePropertiesFilter)
Standard filters • AllHostsFilter: no operation, passes all available hosts • ImagePropertiesFilter: passes hosts that can support the image properties specified for the instance • AvailabilityZoneFilter: passes hosts matching the availability zone specified in the instance properties • ComputeCapabilitiesFilter: passes hosts that can create the specified instance type • CoreFilter: passes hosts with a sufficient number of CPU cores • JsonFilter: allows a simple JSON-based grammar for selecting hosts • And others…
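Filters are pluggable Python classes. A minimal sketch of a custom filter, assuming the BaseHostFilter interface of the Icehouse/Juno era (the host_passes signature, the free_ram_mb attribute, and the 2 GB threshold are assumptions to verify against your Nova version):

    # my_ram_filter.py -- illustrative custom filter, not part of Nova
    from nova.scheduler import filters

    class MinFreeRamFilter(filters.BaseHostFilter):
        """Pass only hosts with at least 2 GB of free RAM."""

        def host_passes(self, host_state, filter_properties):
            # host_state carries the collected compute-node state;
            # filter_properties carries the request spec and scheduler hints.
            return host_state.free_ram_mb >= 2048

Custom filters are enabled alongside the standard ones through the scheduler filter list in nova.conf (scheduler_default_filters in releases of that era).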
Weights • A way to select the most suitable host from the group of valid hosts by assigning a weight to each host in the list • Weighers • RAMWeigher: hosts are weighted and sorted so that the host with the most free RAM wins • MetricsWeigher: computes the weight from the compute node's various metrics • IoOpsWeigher: computes the weight from the compute node's workload; by default, lightly loaded hosts are preferred
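Weighers follow the same plug-in pattern. A minimal sketch of a custom weigher, assuming the BaseHostWeigher interface of the same era (the _weigh_object hook and the free_disk_mb attribute are assumptions to verify against your Nova version):

    # my_disk_weigher.py -- illustrative custom weigher, not part of Nova
    from nova.scheduler import weights

    class FreeDiskWeigher(weights.BaseHostWeigher):
        """Prefer hosts with more free disk; a larger value means a better host."""

        def _weigh_object(self, host_state, weight_properties):
            # Raw values are normalized and combined with the other enabled
            # weighers (scaled by configurable multipliers) before sorting.
            return host_state.free_disk_mb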
Host selection • Finds acceptable hosts by repeatedly filtering and weighing, once per requested instance • Virtually consumes the chosen host's resources after the filter scheduler selects it, so that subsequent instances in the same request see the updated state (see the sketch below)
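Putting the phases together, a schematic of the filter-weigh-consume loop (illustrative only, not Nova's actual code; enabled_filters, enabled_weighers, and consume_from_request stand in for the real handler objects):

    # Schematic FilterScheduler loop for a multi-instance request.
    def schedule_instances(hosts, enabled_filters, enabled_weighers,
                           request_spec, num_instances):
        selected = []
        for _ in range(num_instances):
            # Filtering phase: keep only hosts that pass every enabled filter.
            candidates = [h for h in hosts
                          if all(f.host_passes(h, request_spec)
                                 for f in enabled_filters)]
            # Weighting phase: pick the candidate with the best combined weight.
            best = max(candidates,
                       key=lambda h: sum(w.weigh(h, request_spec)
                                         for w in enabled_weighers))
            # Virtually consume the chosen host's resources so the next
            # instance in the same request sees the updated free RAM/disk/CPU.
            best.consume_from_request(request_spec)
            selected.append(best)
        return selected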
Host aggregates • A mechanism to further partition an availability zone • Only visible to administrators • Allow administrators to assign key-value metadata to groups of machines • Each node can belong to multiple aggregates • Usage • Advanced scheduling (see the sketch below) • Hypervisor resource pools • Logical groups for migration
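A minimal sketch of aggregate-based scheduling with python-novaclient, assuming admin credentials, a 2.x-era client, and the AggregateInstanceExtraSpecsFilter being enabled; the aggregate name, host name, endpoint, and flavor values are placeholders:

    # Illustrative only -- verify the client constructor and extra-spec
    # scoping against your OpenStack release.
    from novaclient import client

    nova = client.Client("2", "admin", "ADMIN_PASSWORD", "admin",
                         "http://keystone:5000/v2.0")

    # Create an aggregate inside an availability zone and add a compute node.
    agg = nova.aggregates.create("ssd-hosts", "nova")
    nova.aggregates.add_host(agg, "compute-01")

    # Assign key-value metadata to the aggregate.
    nova.aggregates.set_metadata(agg, {"ssd": "true"})

    # A flavor whose extra spec matches the aggregate metadata; with the
    # AggregateInstanceExtraSpecsFilter enabled, instances of this flavor
    # land only on hosts in the "ssd-hosts" aggregate.  Some releases expect
    # the scoped key shown here, others also accept a plain "ssd" key.
    flavor = nova.flavors.create("ssd.small", ram=2048, vcpus=1, disk=20)
    flavor.set_keys({"aggregate_instance_extra_specs:ssd": "true"})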
Schedulers of other services • Storage • Cinder volume scheduler • Network • DHCP agent scheduler • L3 agent scheduler • LBaaS agent scheduler
Scheduler as a Service (Gantt) • Still in incubation • Objectives • Provide scheduling as a service with its own API • Dynamic scheduling
IBM Platform Resource Scheduler • Provides dynamic resource management for IBM OpenStack clouds • Automated management • Reduced infrastructure costs • Improved application performance and high availability • Higher quality of service • More flexible resource selection • Intelligent placement: automated, runtime resource optimization • Included as an optional scheduler and optimization service in CMO 4.2 • Included as a chargeable add-on product for IBM SmartCloud Orchestrator 2.4 • Fully compatible with the Nova APIs; fits seamlessly into OpenStack environments • Part of the IBM SDE portfolio
Platform Resource Scheduler vs. Community Scheduler • Keeps 100% compatibility with the community scheduler • Supports all existing and planned features of the community scheduler • The interface and architecture follow OpenStack's extension mechanisms • Contributes PRS features back to the community in a timely manner to protect PRS technology • Provides enterprise-level features and quality • More metrics, including user-defined metrics, available as placement constraints • Placement policies for aggregates (packing, striping, CPU/memory load balance, user defined) to reduce infrastructure cost and improve resource utilization • Placement policies for Heat and server groups (topology-aware affinity/anti-affinity, maximum loss per node failure) for optimal application QoS • Placement policies and constraints honored throughout the VM/application lifecycle • Runtime optimization policies to improve resource utilization and application QoS • VM HA policy to improve application availability • Hypervisor maintenance policy to reduce administration effort • Faster, more scalable, and more stable, with product-level quality of service
Flexible Resource Selection • Extension to OpenStack's VM placement "hints" • Uses the Platform Resource Scheduler (PRS) resource requirements language to enhance placement hints; no change to the OpenStack APIs • Supports almost 20 operators, including arithmetic, relational, and logical operators, and in particular string matching by regular expression

Workflow (from the slide's diagram of the OpenStack manager, the PRS scheduler plug-in, and the Nova compute nodes):
0. Set the initial placement policy: Pack, Stripe, Memory Balance, etc.
1. Request a VM with PRS placement hints:
# nova boot --image 70ca6868-6c0d-44ca-ab0e-4a96e16d5b88 --flavor 1 --hint \
  query="['and', ['>=', 'memFree', '1024'], ['>', 'diskFree', '102400']]" myVM1
2. PRS gets the metric data and the request from OpenStack.
3. The hints, the global policy, and user-defined filters are considered to calculate the placement; the result is returned to OpenStack.
4. The VM is placed according to the plan.
OpenStack Runtime Management • Migration (re-placement) of VMs is driven by Platform Resource Scheduler (PRS) runtime policies • Runtime policies get metrics from PRS and take actions through the OpenStack API • Migration honors the global placement policies: Pack, Stripe, Memory Balance, VM High Availability, Hypervisor Maintenance, etc.

Workflow (from the slide's diagram of the runtime policy engine, the PRS scheduler plug-in, and the Nova compute nodes):
0. Set the global placement policy: Pack, Stripe, Memory Balance, VM High Availability, Hypervisor Maintenance, etc.
1. The runtime policy engine gets metric data from PRS.
2. It analyzes the data and, if needed, determines how to remediate.
3. It requests a migration.
4. The scheduler plug-in requests a placement.
5. The initial hints and the global policy are considered to calculate the placement; the result is returned to OpenStack.
6. The VM(s) are migrated to remediate the problem.
Heat Holistic Scheduling • Extension to the OpenStack Heat template • New Heat resource "IBM::Policy::Group" to specify application placement policies: topology-aware affinity, anti-affinity, and maximum failure per host • Schedules the VMs of a whole Heat application in a holistic way

Policy group excerpt (reconstructed from the slide):

  tier2_policy_group:
    type: IBM::Policy::Group
    properties:
      policies:
        - type: AntiAffinity
          mode: hard
          topology:
            name: Availability
            level: rack
      relationships:
        - peer: tier1_policy_group
          policy:
            type: Affinity
            mode: hard
            topology:
              name: Availability
              level: host
  tier1_policy_group:
    type: IBM::Policy::Group
    properties:
      policies:
        - type: AntiAffinity
          mode: hard
          topology:
            name: Availability
            level: rack

Workflow (from the slide's diagram of the Heat manager, the Nova manager, their PRS scheduler plug-ins, and the Nova compute nodes):
0. The infrastructure topology is defined in PRS.
1. Request a stack with the PRS "IBM::Policy::Group" resource:
# heat stack-create -f PolicyGroup.yaml app1
2. The Heat scheduler plug-in requests placement for the VMs of the whole application.
3. The VMs of the whole application are scheduled together; the global policy and filters are considered as well, and the result is kept in PRS.
4. Heat creates each individual VM with the stack and resource id.
5. Nova queries PRS for the placement plan of the individual VM.
6. The VM is placed according to the plan.
Use Case 1 • A stack template with two groups: Affinity between the groups and Anti-Affinity within each group • A user-defined topology with 6 nodes and 3 racks

Template excerpt (reconstructed from the slide):

  ...
  tier1_policy_group:
    type: IBM::Policy::Group
    properties:
      policies:
        - type: Anti-Affinity
          mode: hard
          topology:
            name: Availability
            level: rack
  tier2_policy_group:
    type: IBM::Policy::Group
    properties:
      policies:
        - type: Anti-Affinity
          mode: hard
          topology:
            name: Availability
            level: rack
      relationships:
        - peer: tier1_policy_group
          policy:
            type: Affinity
            mode: hard
            topology:
              name: Availability
              level: host
  ...

(The slide also shows the resulting scheduler decision on the 6-node, 3-rack topology.)
Use Case 2 • A stack template with one group and multiple policies: Anti-Affinity and maximum resource lost per node failure within the group • A user-defined topology with 6 nodes and 3 racks

Template excerpt (reconstructed from the slide):

  ...
  my_policy_group:
    type: IBM::Policy::Group
    properties:
      policies:
        - type: ResourceLostPerNodeFailure
          mode: hard
          topology:
            name: Availability-zone
            level: availability-zone
          percentage: 50
        - type: AntiAffinity
          mode: hard
          topology:
            name: Availability-zone
            level: host
  my_asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      resource:
        type: OS::Nova::Server
        properties:
          image: { get_param: image }
          flavor: { get_param: flavor }
          scheduler_hints: { group_policy: { get_resource: my_policy_group } }
      min_size: 1
      desired_capacity: 4
      max_size: 10
  ...

(The slide also shows the resulting scheduler decision on the 6-node, 3-rack topology.)