www.oasis-open.org • January 14th, 11:00 AM – 12:30 PM, Morning Session • SLA • Service Contracts • Cross-domain services • Non-Functional Properties of Services • Tracking/Auditing – Charge Back • Measuring Services SLA Compliance
Measuring Services SLA Compliance • Need to differentiate between a service SLA and measuring service SLA compliance • Services may need a service-compliance interface (testing to verify claims against the SLA) • Relationship to service composition • WSLA, service testing and monitoring • Relation to non-functional properties
Scope of Typical Telecom SLA • Network SLA compliance measurement is required whenever network service performance is based on a contractual agreement. • Performance is used here to mean any number of attributes of that network. • Usually the SLA domain is defined with demarcation points in the network, and these are usually placed at the boundaries of the control domains (typically the edge NEs). • Although an SLA is usually a comprehensive contract covering multiple items (e.g. spans), each item must be measured individually and constitutes its own SLA. • Services measured can include: • Response, capacity, security, dependability, flexibility, cost, etc. • The level metric can be a specific value or a range, relevant to the service; levels might be defined as ≤ 10 ms, ≥ 5 Mbps, etc. • Agreements define the services, levels, measurements, and consequences of exceeding, meeting, or missing defined levels. These might include: • monetary compensation • contractual changes • publicity • liability • Authority for measuring SLA compliance: • Service consumer • Service provider • Independent third party
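The per-item compliance check described above can be sketched in code. This is an illustrative sketch only: the metric names, thresholds, and `SlaItem` type are invented for the example, not taken from any telecom SLA standard.

```python
# Hypothetical per-item SLA compliance check: each contracted item is
# measured individually and constitutes its own SLA.
from dataclasses import dataclass

@dataclass
class SlaItem:
    metric: str           # e.g. "delay_ms" or "throughput_mbps" (illustrative)
    bound: float          # contracted level
    upper_is_limit: bool  # True: measured value must stay at or below bound

    def compliant(self, measured: float) -> bool:
        return measured <= self.bound if self.upper_is_limit else measured >= self.bound

# Levels as in the slide: delay <= 10 ms, throughput >= 5 Mbps
items = [SlaItem("delay_ms", 10.0, True), SlaItem("throughput_mbps", 5.0, False)]
measured = {"delay_ms": 8.2, "throughput_mbps": 6.1}
results = {i.metric: i.compliant(measured[i.metric]) for i in items}
print(results)  # each item reported separately
```

Each item reports compliance on its own, so a single breached span can trigger its own consequences without the whole contract failing.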
Minimal Aspects Approach • It can be argued that there are only three primary aspects to any SLA: • Performance • Delay; delay variation (jitter) • Connectivity - no connection, no performance • Dependability - no connection, no performance • Availability • Reliability • Data integrity • Errors • Sequencing (delivery order) • Privacy - ethical/legal requirements • Cost • Security • Theft of service • Dependability • Reliability • Interoperability • Scalability • Flexibility • What are the essential points for an SLA for a telecom service?
Composed services and their part in Web Services Service Level Agreements (WSLA) • Need a taxonomy or ontology of service behaviors • Need an approach to calculating behaviors of composed services • Service failure is one of many identified behaviors
Background: Orchestration as a New Programming Paradigm • SOA promotes the concept of combining services through orchestration - invoking services in a defined sequence to implement a business process • Orchestration compounds the task of testing and managing the quality of the deployed services • Testing composite services in SOA environment is a discipline which is still at an early stage of study • Describing and usefully modeling the individual and combined behaviors - needed to offer Service Level Agreements (SLA) - is at an even earlier stage
Testing Composed Services • It's fairly straightforward to test the operation of a device or system if we control all the parts. • When we start offering orchestrated services as a product, the services we are using may be outside our control. • For example, consider these well-known components: • Google mapping service • Amazon S3 storage service • Mobile operator's location service
Testing Composed Services (2) • With orchestrated services, there is never a complete "box" we can test • With orchestration as the new programming paradigm, testing becomes a much bigger problem • Failures of orchestrated services are often "Heisenbugs" - impervious to conventional debugging and generally non-reproducible • Offering a WSLA based on testing alone, without reliable knowledge of component service behaviors, may be risky
Web Services SLA (WSLA) • Concerned with behaviors of the message flows and services spanning the end-to-end business transaction • Clients can develop testing strategies that stress the service to ensure that the service provider has met the contracted WSLA commitment • Composed services make offering a WSLA more risky • [Diagram: a client's message flows cross the network to provider Z's Web service, which invokes service X (provider X) and service Y (provider Y); the WSLA spans the end-to-end path]
How can WSLAs be derived from behaviors of component services? • Need to develop a model of the behavioral attributes of the individual component Web Services which contribute to the overall behavior of an orchestrated or composed Web Service. • Need to model the combination of individual service behavioral models
Web Services behaviors • Behaviors may be described and quantified for each Web Service • May be combined by a “calculus of behaviors” when multiple services are composed • Behavior parameters may become a part of the service description, perhaps in WSDL.
Web Services behaviors (2) • To develop a Service Level Agreement (SLA) for a composed service (Z), we need to have relevant behavior descriptions for the individual services (X and Y) • We also need a deep understanding of how to combine the descriptions of X and Y to calculate results for Z • [Diagram: composed service Z built from component services X and Y]
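For one behavior, availability, the combination step can be made concrete. This is a minimal sketch under strong assumptions the slides do not make for us: Z invokes X and Y in sequence, and their failures are independent, so availabilities multiply.

```python
# Illustrative "calculus" for one behavior: availability of a serial
# composition, assuming independent component failures.
def serial_availability(*component_availabilities):
    p = 1.0
    for a in component_availabilities:
        p *= a
    return p

a_x, a_y = 0.999, 0.995   # published by X and Y (invented figures)
a_z = serial_availability(a_x, a_y)
print(a_z)                # ≈ 0.994005 - worse than either component
```

Even this trivial rule shows why exposure matters: Z cannot compute `a_z` unless X and Y publish `a_x` and `a_y`, and the rule itself changes if Z retries, calls in parallel, or the failures are correlated.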
Web Services behaviors (3) • For each behavior, the challenges include the following: • How may service X’s and service Y’s behavior be characterized? • How may those characterizations be formalized and advertised by X and Y? • How may Z incorporate X’s and Y’s characterizations and then advertise the result? • Z itself might become a component of an even larger service and therefore needs to advertise its own characteristics. It also needs this characterization to offer an SLA to consumers.
Web Services behaviors (4) • Each behavior may have its own ontology, measures, and calculus for combining those measures when services are composed. • [Diagram: X and Y each expose a local ontology, abstracted and mapped into a Z-specific ontology; this analysis is needed for each behavior of services X, Y and Z]
Web Services behaviors (5) • Ten behavior examples • Availability and Reliability • Performance • Management • Failure • Security • Privacy, confidentiality and integrity • Scalability • Execution • Internationalization • Synchronization • Let’s focus on a few of these behaviors… Source: “Advertising Service Properties,” unpublished paper by C. Hobbs, J. Bell, P. Sanchez
Availability and Reliability • “Availability” is the percentage of client requests to which the server responds within the time it advertised. • “Reliability” is the percentage of such server responses which return the correct answer. • In some applications availability is more important than reliability • Many protocols used within the Internet, for example, are self-correcting and an occasional wrong answer is unimportant. The failure to give any answer, however, can cause a major network upheaval.
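The two definitions above can be computed directly from a log of observations. The data below is invented; each entry records whether the server responded within its advertised time and, if so, whether the answer was correct.

```python
# Toy computation of availability and reliability as defined on the slide:
# availability = timely responses / all requests
# reliability  = correct answers / timely responses
requests = [
    (True, True), (True, True), (True, False), (False, False), (True, True),
]
responded = [r for r in requests if r[0]]
availability = len(responded) / len(requests)
reliability = sum(1 for r in responded if r[1]) / len(responded)
print(availability, reliability)  # 0.8 0.75
```

Note that reliability is conditioned on having responded at all, so a service can be highly reliable yet poorly available, or the reverse, which is exactly the trade-off the next slides explore.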
Availability and Reliability (2) • In other applications reliability is more important than availability • If the service which calculates a person's annual tax return occasionally does not respond, it's not a major problem - the user can try again • If that service does respond, but with a wrong answer which is submitted to the tax authorities, it could be disastrous
Availability and Reliability (3) • Services are built with either availability or reliability in mind, with clients accepting that no service can ever be 100% available or 100% reliable. • In combining services X and Y into a composite service Z, it is necessary to combine the underlying availability and reliability models and predict Z’s model. • To do so without manual intervention, X’s and Y’s models must be exposed.
Availability and Reliability (4) • Availability and reliability models are often expressed as Markov Models or Petri Nets, which are easy to combine in a hierarchical way. • Major issues: • Agreeing upon the semantics of the states in the Markov model or places in the Petri nets • Finding a way for X and Y to publish the models in a standard form.
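The simplest instance of such a model is a two-state (up/down) continuous-time Markov chain, whose steady-state availability has a closed form. The rates below are invented for illustration; real published models would need the agreed state semantics the slide calls for.

```python
# Two-state Markov availability model: steady-state P(up) = mu / (lambda + mu),
# where lambda is the failure rate and mu is the repair rate.
def steady_state_availability(failure_rate, repair_rate):
    return repair_rate / (failure_rate + repair_rate)

lam = 1 / 1000.0   # one failure per 1000 hours (illustrative)
mu = 1 / 2.0       # two-hour mean time to repair (illustrative)
print(steady_state_availability(lam, mu))  # ≈ 0.998
```

Hierarchical combination then means feeding each component's model into the composite's, which is only mechanical once the states mean the same thing in every published model.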
Availability and Reliability (5) • Currently, apart from raw percentage figures, there is no method for describing these models • Percentage time when the server is unavailable? • Percentage of requests to which it does not reply? • Different clients may experience these differently • A server which is unavailable from 00:00 to 04:00 every day can be 100% available to a client that only tries to access it in the afternoons.
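The slide's own example, a server down 00:00 to 04:00 daily, can be encoded by hour to show how two clients measure different availabilities from the same outage schedule. The access windows are illustrative.

```python
# Client-relative availability: the same daily outage looks different
# depending on when a client actually tries to access the service.
def available(hour):
    return not (0 <= hour < 4)   # server down 00:00-04:00 every day

afternoon_hits = [available(h) for h in range(12, 18)]   # afternoon-only client
all_day_hits = [available(h) for h in range(24)]         # around-the-clock probe
print(sum(afternoon_hits) / len(afternoon_hits))  # 1.0 for the afternoon client
print(sum(all_day_hits) / len(all_day_hits))      # ≈ 0.83 measured all day
```

A single percentage therefore underspecifies the model: the outage distribution over time matters as much as its total.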
Availability and Reliability (6) • If X and Y are distributed, then it is possible, following network failures, that for some customers, Z can access X but not Y and for others Y but not X. • The assessment of Z’s availability may be hard to quantify, so it may be difficult for Z to offer a meaningful WSLA.
Failure • The failure models of X and Y may be very different: • X fails cleanly and may, because of its idempotency, immediately be called again • Y has more complex failure modes • Z will add its own failure modes to those of X and Y • Predicting the outcome could be very difficult • The complexity is increased because many developers do not understand failure modeling and, even were models to be published, their combination would be difficult due to their stochastic nature.
Failure (2) • One approach to describing a service’s failure model: • Service publishes the exceptions that it can raise and associates the required consumer behavior with each • “Exception D may be thrown when the database is locked by another process. Required action is to try again after a random backoff period of not less than 34ms.” • “Crash-only” failure model is a simple starting point for building a taxonomy of failure behavior. This work is just beginning.
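The published behavior for the hypothetical "Exception D" above can be sketched as consumer-side code. Everything here is an assumption for illustration: `RuntimeError` stands in for the service's published exception, and the attempt limit is invented.

```python
# Sketch of the required consumer behavior: retry after a random backoff
# of not less than 34 ms, as the (hypothetical) exception description mandates.
import random
import time

def call_with_backoff(service, min_backoff_s=0.034, max_backoff_s=0.1, attempts=5):
    for attempt in range(attempts):
        try:
            return service()
        except RuntimeError:           # stand-in for published "Exception D"
            if attempt == attempts - 1:
                raise                  # give up after the final attempt
            time.sleep(random.uniform(min_backoff_s, max_backoff_s))

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("database locked")
    return "ok"

print(call_with_backoff(flaky))  # succeeds on the third attempt
```

The point is that the failure description is machine-actionable: the consumer code follows directly from what the service publishes, which is what a taxonomy of failure behavior would standardize.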
Scalability • A behavioral description and WSLA for the composite service Z must include its scalability • How many simultaneous service instances can it support? • What service request rate does it handle? etc. • These parameters will almost certainly differ between the component services X and Y, and will need to be published by those services. • X and Y are presumably not dedicated solely to Z, so the actual load being applied to X and Y at any given time is unknown to the provider of Z, making the scalability of Z even harder to determine.
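A first-order composition rule for one scalability parameter, sustainable request rate, is simply the minimum over the pipeline. The figures and the rule itself are illustrative; as the slide notes, the real headroom of X and Y is unknown to Z's provider.

```python
# Illustrative scalability composition: Z's sustainable request rate is
# bounded by its slowest component (assuming every request touches each one).
def composed_capacity(published_rates):
    return min(published_rates.values())

rates = {"X": 500.0, "Y": 120.0, "Z_own_logic": 800.0}  # requests/s, invented
print(composed_capacity(rates))  # 120.0 - Y is the bottleneck
```

Since X and Y also serve other consumers, their published rates are upper bounds, not guarantees, so any WSLA Z derives this way needs a safety margin or live load information.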
Web Services behaviors (again) • Ten behavior examples • Availability and Reliability • Performance • Management • Failure • Security • Privacy, confidentiality and integrity • Scalability • Execution • Internationalization • Synchronization • We described a few of these behaviors… • Can we use them to build WSLAs?
Web Service Level Agreement (WSLA) • Based on behaviors and descriptors for these behaviors. • Example: Failure model • Is transaction half-performed? • Is it re-wound? • These behaviors and descriptors are not available in the WS description, in WSDL • No performance info • Not even price!
Web Service Level Agreements (2) • Business acceptance of composed services for business-critical operations depends on a service provider's ability to offer a WSLA • Uptime, response time, etc. • Offering a WSLA depends on the ability to compose the WSLA-related behaviors of the individual services • This information needs to be available via WSDL or a similar source • Should include test vectors to test the SLA claims • The ability to determine and offer a WSLA commitment is a limiting factor for orchestration
Web Service Level Agreements (3) • Need a more precise way to express the parameters of behaviors • Availability – What is 99.97% uptime? • Several milliseconds outage each minute? • Several minutes planned downtime each month? • Failure model – Crash-only as the simplest, lowest layer or level of failure in a future full failure model. • Eight other SLA-related behaviors listed here – each has a complex semantic for description and composition
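The "what is 99.97% uptime?" question above comes down to arithmetic: the same downtime budget can be spent in radically different patterns. The 30-day month is an assumption made for the calculation.

```python
# Two very different outage patterns that both satisfy "99.97% uptime"
# over a 30-day month.
month_s = 30 * 24 * 3600                    # seconds in a 30-day month
budget_s = month_s * (1 - 0.9997)           # ≈ 777.6 s of allowed downtime

per_minute_outage = budget_s / (month_s / 60)
print(round(per_minute_outage * 1000, 1))   # ~18 ms lost every minute
print(round(budget_s / 60, 1))              # or ~13 minutes down once a month
```

A client doing sub-second transactions suffers badly under the first pattern and barely notices the second; a batch job scheduled into the maintenance window sees the opposite. Hence the need for a more precise parameter language than a single percentage.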
www.oasis-open.org • 11:00 AM – 12:30 PM • SLA • Service Contracts • Cross-domain services • Non-Functional Properties of Services • Tracking/Auditing – Charge Back
Non-Functional Properties (NFP) of Services • OMG RFI • The use of distributed services, in different contexts and by different stakeholders who have no control over those services, raises challenges in predicting the behavior of the resulting composite service and its associated Service Level Agreements (SLAs) without resorting to tight binding to the underlying services
What is a Service? • A mechanism to enable access to one or more capabilities provided by an entity (the service provider) for use by others • Services are opaque; i.e., only information required by service consumers is exposed
What are non-functional properties of a service? • Non-functional properties are a subset of the service description that specifies behavior not related to the purpose of the service. Non-functional properties of a service include - among others - response time, cost, availability, reliability, security, and scalability. Consumers of these services can establish selection criteria to select (or search for) a service with the desired NFP (or NFPs). • Service composition is the main target of the NFP RFI • In the OMG RFI: • Service Level Agreement (SLA) • The Service Provider may guarantee the Service Consumer a certain level of service in return for a specified payment. An SLA specifies negotiated and mutually agreed-upon service level commitments, including the conditions of the outsourcing service to be provided, quality of service, how this quality is measured, and what happens if the service quality is not met. Typically, an SLA addresses both the functional and non-functional aspects (aka NFPs) of the service to be provided.
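Selection by NFPs, as described above, can be sketched as filtering advertised service descriptions against consumer thresholds. The field names and data are invented, not from the OMG RFI or any WSDL extension.

```python
# Hypothetical NFP-based service selection: a consumer keeps only the
# services whose advertised NFPs satisfy its criteria.
services = [
    {"name": "A", "response_time_ms": 40, "availability": 0.999, "cost": 0.02},
    {"name": "B", "response_time_ms": 15, "availability": 0.990, "cost": 0.05},
]
criteria = {
    "response_time_ms": lambda v: v <= 50,
    "availability": lambda v: v >= 0.995,
}
matches = [s["name"] for s in services
           if all(check(s[key]) for key, check in criteria.items())]
print(matches)  # ["A"] - B is fast but not available enough
```

This only works if the NFPs are formally represented and published in a machine-readable form, which is precisely what the RFI's key questions probe.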
RFI Key Questions • Provide, in order of significance, a list of NFPs, such as reliability and scalability, which you think are relevant to domains such as Telecom mission-critical applications, etc. • For each NFP identified as part of question 1, please answer the following questions: • Do you have a formal representation of this NFP? If yes, please describe it. • How do you express a composite service's NFP in terms of the aggregated NFPs of each service from which it is composed? • How do you measure (or monitor) the composite service's NFP to ensure compliance with its service contract? • Describe the benefits of standardizing some of the above-mentioned solutions. Are any of them proprietary? Open source? Other? • What published standards (from W3C, OASIS, TMF, etc.) do you think should be taken into consideration to represent, aggregate, measure or monitor this NFP? • Identify the NFP-related modeling approaches you use to support service operations including discovery, access, and execution. • Identify the NFP-related modeling approaches you use to support proper handling (i.e., definition, representation, aggregation, etc.) of a service's NFPs. • Are there any available tools (commercial, open source, in-house) that you use to formally represent, compose, and monitor these NFPs? For each of the tools, indicate which NFPs you think are well treated and what kind of support is provided for each (i.e., presentation, composition, and monitoring) • What other issues/questions about non-functional properties need to be addressed? Please describe in full, including supporting rationale.