Monitoring and Analyzing Your OpenStack Cloud
May 2019
Ifat Afek, Nokia
Martin Chacon Piza, Fujitsu EST GmbH

Hands-on!
• Get an available IP: <link to list of devstack IPs>
• Instructions etherpad: https://etherpad.openstack.org/p/monasca-vitrage-lab

Agenda
• Introduction: Monitoring & Analyzing in OpenStack
• Monasca
• Vitrage

Monitoring and Analyzing in OpenStack
[Diagram; labels: System Metrics; Monasca (Monitoring); libvirt, Docker, Kubernetes, Ceph, etc.; Vitrage (Root Cause Analysis); Nova, Cinder, K8s, Heat, Neutron]

Monitoring and Analyzing in OpenStack
• Basic implementation: in progress, will be ready in Train
• https://review.opendev.org/#/c/622899/
• Full configuration support
• Design in progress
• https://review.opendev.org/#/c/627180/
• You are welcome to share your thoughts!
• To be discussed at the self-healing SIG PTG

What is Monasca?
• Monitoring-as-a-Service solution based on a REST API
• Multi-tenancy based on Keystone authentication
• Highly performant, scalable, fault-tolerant and capable of big data retention
• Metrics storage, retrieval and statistics, plus an alarm/thresholding engine
• Notification system
• Real-time event stream processing
• Consolidates multiple monitoring systems into a single solution
• Extensible, based on a micro-services message bus architecture

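Since Monasca is driven through a REST API with Keystone tokens, a metric submission can be sketched roughly as below. This is a minimal illustration, not the lab's own script: the host name, port 8070, the metric name and the dimension are assumptions, and the token placeholder must be replaced by a real Keystone token. The sketch only builds the request; the final send is left commented out.

```python
import json
import time
import urllib.request

# Assumed endpoint for illustration; a real deployment publishes its own
# Monasca URL in the Keystone service catalog.
MONASCA_URL = "http://monasca.example.com:8070/v2.0"


def build_metric(name, value, dimensions, timestamp_ms=None):
    """Return a metric body with name, dimensions, timestamp (ms) and value."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        "name": name,
        "dimensions": dict(dimensions),
        "timestamp": timestamp_ms,
        "value": float(value),
    }


def build_request(metric, token, base_url=MONASCA_URL):
    """Wrap the metric in a POST request authenticated with a Keystone token."""
    return urllib.request.Request(
        base_url + "/metrics",
        data=json.dumps(metric).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


if __name__ == "__main__":
    # Illustrative metric name and dimension (assumptions).
    metric = build_metric("cpu.user_perc", 93.5, {"hostname": "compute-1"})
    request = build_request(metric, token="<keystone-token>")
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request)  # uncomment to actually send the metric
```

Multi-tenancy comes from the token: the same request issued with a token scoped to a different project would store the metric under that project.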
What is Vitrage?
The OpenStack Root Cause Analysis service, used for organizing, analyzing and expanding OpenStack alarms & events.
• Holistic and complete view of the system
• Root Cause Analysis
• Identify affected resources

Vitrage CLI
Areas covered: Topology, Alarms, Management
• topology show
• resource list / count / show
• alarm list / count / show / history
• rca show
• template list / add
• webhook list / add
• event post
• service list

Vitrage Templates
• Human-readable YAML files
• Condition -> Actions scenarios

Example conditions:
• alarm [on] host
• alarm [on] host AND host [contains] instance
• host_alarm [on] host AND host [contains] instance AND instance_alarm [on] instance

Available actions:
• Notify Nova: call nova service-force-down
• Deduced Alarm: raise an alarm on another resource
• Deduced State: modify the state of the resource
• Execute a Mistral workflow
• Mark Causal Relationship

Vitrage Templates
• Version 3: shorter and much simpler

metadata:
  ...
entities:
  - entity: ...
  - entity: ...
scenarios:
  scenario:
    condition: <if the statement is true, do the actions>
    actions:
      - action: ...

Vitrage Templates - Example

metadata:
  version: 3
  type: standard
  name: cpu problem
  description: Monasca high CPU load alarm affects instances
entities:
  host_alarm:
    type: monasca
    name: high_cpu_load
  instance:
    type: nova.instance
  host:
    type: nova.host
scenarios:
  - condition: host_alarm [on] host AND host [contains] instance
    actions:
      - set_state:
          state: SUBOPTIMAL
          target: host
      - set_state:
          state: SUBOPTIMAL
          target: instance
      - raise_alarm:
          target: instance
          alarm_name: CPU performance degradation
          severity: WARNING
          causing_alarm: host_alarm

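To make the scenario's semantics concrete, here is a toy sketch of how the condition "host_alarm [on] host AND host [contains] instance" selects entities and which actions follow. It is not Vitrage's evaluation engine: the in-memory graph, the node ids (alarm-1, compute-1, vm-a, ...) and the function names are hypothetical, and matching is done on entity type only.

```python
# Hypothetical entity graph: node id -> entity type.
NODES = {
    "alarm-1": "monasca",        # stands in for the high_cpu_load alarm
    "compute-1": "nova.host",
    "compute-2": "nova.host",
    "vm-a": "nova.instance",
    "vm-b": "nova.instance",
    "vm-c": "nova.instance",
}

# Edges as (source, relationship, target).
EDGES = {
    ("alarm-1", "on", "compute-1"),
    ("compute-1", "contains", "vm-a"),
    ("compute-1", "contains", "vm-b"),
    ("compute-2", "contains", "vm-c"),
}


def match_scenario(nodes, edges):
    """Yield (alarm, host, instance) bindings that satisfy
    host_alarm [on] host AND host [contains] instance."""
    for alarm, rel, host in sorted(edges):
        if rel != "on" or nodes[alarm] != "monasca" or nodes[host] != "nova.host":
            continue
        for src, rel2, instance in sorted(edges):
            if src == host and rel2 == "contains" and nodes[instance] == "nova.instance":
                yield alarm, host, instance


def apply_actions(nodes, edges):
    """Apply the template's actions to every match: two set_state, one raise_alarm."""
    states, deduced_alarms = {}, []
    for alarm, host, instance in match_scenario(nodes, edges):
        states[host] = "SUBOPTIMAL"
        states[instance] = "SUBOPTIMAL"
        deduced_alarms.append({
            "target": instance,
            "alarm_name": "CPU performance degradation",
            "severity": "WARNING",
            "causing_alarm": alarm,
        })
    return states, deduced_alarms


if __name__ == "__main__":
    states, deduced_alarms = apply_actions(NODES, EDGES)
    print(states)
    for deduced in deduced_alarms:
        print(deduced)
```

In this toy graph only compute-1 carries the alarm, so compute-1, vm-a and vm-b become SUBOPTIMAL and a deduced alarm is raised on each of the two instances, while compute-2 and vm-c are untouched.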
Come Join Us!
Discussion at the self-healing SIG PTG: Thursday afternoon

Vitrage
• Wiki page: https://wiki.openstack.org/wiki/Vitrage
• Email: openstack-discuss@lists.openstack.org with the [vitrage] tag
• IRC channel: #openstack-vitrage

Monasca
• Wiki page: https://wiki.openstack.org/wiki/Monasca
• Email: openstack-discuss@lists.openstack.org with the [monasca] tag
• IRC channel: #openstack-monasca
• Meetings are held on #openstack-monasca at 15:00 UTC on Wednesdays and at 7:00 UTC on Thursdays