Using TOSCA Requirements/Capabilities
Monitoring Use Case (Primer Considerations)
Proposal by CA Technologies, IBM, SAP, Vnomic
Monitoring
• A monitoring tool to gather data
  • Deployed with the service or separately
  • ‘Slotted in’ from the infrastructure
• Additional information needed to specify how to treat the data
  • Threshold values for alarm situations
  • SLA calculation details
  • Value projections
  • Collection interval, persisting the data to satisfy its use
• Monitoring tools may vary with provider capabilities and customer requirements
Example Monitoring Use Case
A loan approval web service requires three VMs: a database server, an application server and an HTTP server. The service SLA requires a 3-second turnaround on all transactions. To maintain this SLA, all three VMs must have less than 50% memory utilization, and database file system usage must be under 80% of capacity. CPU utilization of each VM should be under 70% on average over a three-minute time window.
• Each VM must have:
  • A memory utilization monitor that issues an alarm when memory utilization is over 50%
  • A CPU utilization monitor that issues an alarm when CPU utilization is over 70%
  • A file system monitor for DB file system usage (provided as persistent data)
  • An application-specific transaction monitor that extracts average transaction times from the application (provided as persistent data)
  • A connection/response monitor that checks the availability and connect response of the HTTP server
• The monitoring tool is configured to:
  • Calculate projections on when usage on the database system will exceed 80%
  • Perform SLA calculations based on availability of the HTTP server and measured transaction response times from the application
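The projection step in this use case (estimating when database file system usage will exceed 80%) can be illustrated with a simple trend fit. The sketch below is one possible approach, not something the proposal prescribes: it assumes a straight-line least-squares fit, and the function name `project_threshold_crossing` and the sample readings are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def project_threshold_crossing(
    samples: Sequence[Tuple[float, float]], threshold: float = 80.0
) -> Optional[float]:
    """Estimate the time at which usage reaches `threshold`.

    `samples` holds (time, usage_percent) pairs. A straight line is fitted
    by ordinary least squares. None is returned when there are fewer than
    two samples, when all samples share one timestamp, or when the fitted
    trend is flat or falling (no crossing is projected).
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    return (threshold - intercept) / slope


# Hypothetical daily readings: usage grows two points per day from 60%.
history = [(0, 60.0), (1, 62.0), (2, 64.0), (3, 66.0)]
crossing_day = project_threshold_crossing(history)
```

A real monitoring tool may use a different model (for example one that accounts for periodic load); the point is only that a projection needs the persisted sample history plus a threshold.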
Monitoring
• Three levels
  • Metric
  • Monitor
  • Probe
Metric
• Metric: single point data collection
  • Memory util, transaction time, response time …
• Name (UID?)
• Description
• Type (counter, gauge, percentage …)
• Unit (number/bytes …)
• Data type (int, float, real …)
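The attributes listed for a Metric map naturally onto a small record type. The following is a minimal sketch under assumptions: the class names `Metric` and `MetricType`, the field names, and the example values are illustrative and do not come from the TOSCA specification or the proposal.

```python
from dataclasses import dataclass
from enum import Enum


class MetricType(Enum):
    """Metric kinds named on the slide (the list there is open-ended)."""
    COUNTER = "counter"
    GAUGE = "gauge"
    PERCENTAGE = "percentage"


@dataclass(frozen=True)
class Metric:
    """A single point of data collection (hypothetical field names)."""
    name: str
    description: str
    metric_type: MetricType
    unit: str
    data_type: type  # Python type used to hold a reading, e.g. int or float

    def coerce(self, raw):
        """Convert a raw reading to the metric's declared data type."""
        return self.data_type(raw)


memory_util = Metric(
    name="memory-utilization",
    description="Share of physical memory in use",
    metric_type=MetricType.PERCENTAGE,
    unit="percent",
    data_type=float,
)
reading = memory_util.coerce("42.5")
```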
Monitor
• Collection of metrics
  • Same sampling interval
  • Last N samples
• Name, UID?
• Interval: seconds
• Duration: time/cycle
• Samples[N]: (timestamp, value)
• Stats: sum, average, mean, median …
• Local/remote
• Synthetic (no, how …)
• Alarm source/capability
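The "last N samples" and "Stats" items suggest a bounded sample buffer with summary statistics computed on demand. The sketch below assumes a single metric per monitor for brevity; the class name `Monitor`, its constructor parameters, and the sample values are hypothetical.

```python
from collections import deque
from statistics import mean, median
from typing import Deque, Dict, Tuple


class Monitor:
    """Keeps the last N (timestamp, value) samples for one metric."""

    def __init__(self, name: str, interval_seconds: float, max_samples: int):
        self.name = name
        self.interval_seconds = interval_seconds
        # deque(maxlen=N) silently discards the oldest sample once full.
        self.samples: Deque[Tuple[float, float]] = deque(maxlen=max_samples)

    def record(self, timestamp: float, value: float) -> None:
        self.samples.append((timestamp, value))

    def stats(self) -> Dict[str, float]:
        """Return sum, average and median over the retained samples."""
        values = [v for _, v in self.samples]
        if not values:
            return {}
        return {
            "sum": sum(values),
            "average": mean(values),
            "median": median(values),
        }


cpu = Monitor("cpu-utilization", interval_seconds=30, max_samples=3)
for t, v in [(0, 10.0), (30, 20.0), (60, 30.0), (90, 70.0)]:
    cpu.record(t, v)
summary = cpu.stats()  # computed over the last three samples only
```

Because the buffer is bounded, statistics always describe the most recent window, which is what an alarm on an averaged value would need.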
Probe
• Monitoring service
• Controls multiple Monitors
• Persistence
  • Local
  • Requirement (for data logger)
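One way to picture a Probe is as a scheduler that samples each of its Monitors when the Monitor's interval has elapsed and hands every reading to a data logger for persistence. This is only a sketch of that idea: `Probe`, `MonitorSlot`, `add_monitor`, `poll` and the `log` callable are hypothetical names, and the readers return fixed values in place of real measurements.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Record = Tuple[str, float, float]  # (monitor name, timestamp, value)


@dataclass
class MonitorSlot:
    interval: float                # seconds between samples
    read: Callable[[], float]      # returns the current metric value
    next_due: float = 0.0          # earliest time of the next sample


@dataclass
class Probe:
    log: Callable[[Record], None]  # stands in for the data logger
    slots: Dict[str, MonitorSlot] = field(default_factory=dict)

    def add_monitor(self, name: str, interval: float,
                    read: Callable[[], float]) -> None:
        self.slots[name] = MonitorSlot(interval, read)

    def poll(self, now: float) -> int:
        """Sample every monitor that is due; return how many were sampled."""
        taken = 0
        for name, slot in self.slots.items():
            if now >= slot.next_due:
                self.log((name, now, slot.read()))
                slot.next_due = now + slot.interval
                taken += 1
        return taken


records: List[Record] = []
probe = Probe(log=records.append)
probe.add_monitor("memory", 30.0, lambda: 41.5)
probe.add_monitor("cpu", 60.0, lambda: 12.0)
first = probe.poll(0.0)    # both monitors are due
second = probe.poll(30.0)  # only the 30-second memory monitor is due
```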
To Consider
• Defining triggers for
  • Alarms
  • Plans (scaling)
• Separate capability?
  • SLA, Trending, Projections, etc.
• Relation: monitoredBy
• Granularity of capabilities
  • Expose Metric, Monitor as well?
• Optional/Required: some or possibly all of the monitoring is optional with regard to being able to deploy the service
For instance: monitoring disk usage
First of all we need to identify the disk. This would be a combination of the computer it resides on and the file system mount point. Naming things can be the most complicated part… Monitoring could be done with a local piece of code or remotely. I have used the local approach below, but I believe there would be a difference only at the Probe level.

Metric
• % disk usage: this would, for instance, mean creating/enabling a profile for the disk in the local piece of monitoring code.

Monitor
• Interval: the interval is specified for all disk profiles being monitored.
• Alarm threshold: for the alarm situation, this may require a setting in the profile created previously.
• Number of samples: may be specified together with the interval above. For the alarm situation, the average over the samples is used.
• Persisting the data: again, this may just be a profile change to collect persistent data for this metric. With TOSCA we might opt not to have this ‘persisting’ concept but rather specify some pointers about what the data will be used for, for instance:
  • ‘Available for SLA’: at the Probe level one could establish a collection of SLAs from the Monitors and specify the time range needed.
  • ‘Projection based on 2 weeks of data’

Probe
• Local collection: specifies that a local probe is used to provide the metric. This implies that the monitoring infrastructure must be in place.
• Artifact: the piece of code required to do the specific monitoring. How to place/instantiate it would depend on the monitoring tool.
• SLA period
• SLA algorithm: to combine the SLA series from the monitors into one overall SLA.
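The local approach to the disk usage metric, and the rule that the alarm uses the average over the samples, can be sketched as follows. This assumes the local monitoring code can simply query the file system by mount point; `disk_usage_percent` and `average_alarm` are hypothetical helper names, while `shutil.disk_usage` is the Python standard library call.

```python
import shutil
from statistics import mean
from typing import Sequence


def disk_usage_percent(mount_point: str) -> float:
    """Return used space on the file system at `mount_point`, in percent."""
    usage = shutil.disk_usage(mount_point)  # named tuple: total, used, free
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def average_alarm(samples: Sequence[float], threshold: float) -> bool:
    """Alarm when the average over the retained samples exceeds `threshold`."""
    return bool(samples) and mean(samples) > threshold


# One live reading for the file system holding the current directory.
current = disk_usage_percent(".")

# Hypothetical sample series checked against the 80% limit of the use case.
alarm_raised = average_alarm([70.0, 85.0, 90.0], threshold=80.0)
alarm_quiet = average_alarm([70.0, 75.0, 90.0], threshold=80.0)
```

Averaging over the samples keeps a single spike (the 90% reading in the second series) from raising an alarm on its own.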
Monitoring Capabilities
Provided by SP or customer

<NodeTemplate id="uid-host" nodeType="ex:monitor">
  <Capabilities>
    <Capability id="uid-memory-monitor" name="myMemory">
      <Properties>
        <Name>memory-peak-working-set</Name>
        <Units>UNIT-KB</Units>
        <Interval>30</Interval>
        <Duration>100</Duration>
        …
      </Properties>
    </Capability>
    <Capability id="uid-compute-monitor" name="myCompute">
      <Properties>
        <Name>CPU-usage</Name>
        <Name>CPU-peak</Name>
        …
      </Properties>
    </Capability>
  </Capabilities>
</NodeTemplate>

Callouts:
• Default settings of the monitor (the Properties values of a Capability)
• Multiple metrics (several Name entries in one Capability)
Requesting Monitoring
Defined by Capability

<Requirements>
  <Requirement ref="uid-memory-monitor" name="memory1">
  </Requirement>
  <Requirement ref="uid-compute-monitor" name="CPU1">
    <Properties>
      <Interval>60</Interval>
    </Properties>
  </Requirement>
  <Requirement ref="uid-filesystem-monitor" name="FS3">
    <Properties>
      …
    </Properties>
  </Requirement>
</Requirements>

Callout:
• Property override (Properties given on a Requirement replace the Capability defaults)
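The property override can be read as a merge: start from the defaults on the referenced Capability, then apply whatever the Requirement specifies. The sketch below shows that reading with Python's standard `xml.etree.ElementTree`. The helper names `properties_of` and `effective_properties` are hypothetical, and pairing the memory monitor defaults with an Interval override of 60 is an illustrative assumption rather than a case taken from the slides.

```python
import xml.etree.ElementTree as ET
from typing import Dict

CAPABILITY_XML = """<Capability id="uid-memory-monitor" name="myMemory">
  <Properties>
    <Name>memory-peak-working-set</Name>
    <Units>UNIT-KB</Units>
    <Interval>30</Interval>
    <Duration>100</Duration>
  </Properties>
</Capability>"""

# Hypothetical requirement that overrides only the sampling interval.
REQUIREMENT_XML = """<Requirement ref="uid-memory-monitor" name="memory1">
  <Properties>
    <Interval>60</Interval>
  </Properties>
</Requirement>"""


def properties_of(element: ET.Element) -> Dict[str, str]:
    """Collect child elements of <Properties> as a tag -> text mapping.

    Repeated tags (such as several <Name> entries) would collapse to the
    last one; this sketch assumes one value per property.
    """
    props = element.find("Properties")
    if props is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in props}


def effective_properties(capability_xml: str, requirement_xml: str) -> Dict[str, str]:
    """Merge capability defaults with the requirement's overrides."""
    capability = ET.fromstring(capability_xml)
    requirement = ET.fromstring(requirement_xml)
    if requirement.get("ref") != capability.get("id"):
        raise ValueError("requirement does not reference this capability")
    merged = properties_of(capability)
    merged.update(properties_of(requirement))  # requirement values win
    return merged


settings = effective_properties(CAPABILITY_XML, REQUIREMENT_XML)
```

A Requirement with no Properties element, like memory1 on the slide, would simply inherit every default.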