Achieving self-healing in service delivery software systems by means of case-based reasoning

Authors: Stefania Montani, Cosimo Anglano. Presented by Tony Schneider Pr. Outline: Introduction, Background, CBR Implementation, Experiment / Cavy, Results.


Presentation Transcript


  1. Achieving self-healing in service delivery software systems by means of case-based reasoning
     Stefania Montani, Cosimo Anglano
     Presented by Tony Schneider Pr

  2. Introduction
     • Background
     • CBR Implementation
     • Experiment / Cavy
     • Results

  3. Autonomic Systems Overview
     • Goal is for the system to manage itself
     • System needs to exhibit:
       • Self-Configuration
       • Self-Optimization
       • Self-Protection
       • Self-Healing

  4. Self-Healing
     • “Service Delivery Systems” (SDS)
       • Aimed at delivering 24/7 services
       • These services are prone to breakage
     • Service failures
       • Software, Hardware, Network
     • Failures can’t be handled manually
     • Need to repair the system autonomously

  5. Self-Healing

  6. Self-Healing
     • Internalization
       • The Self-Healing Engine is integrated with the software
       • Not extendable
       • Depends on specific applications
     • Externalization
       • Great for retrofitting current systems
       • Allows a general method for SDS self-healing

  7. Self-Healing
     • Problems with the current approach
       • MAPE model assumes prior knowledge of the system
       • Knowledge base is problematic
         • Building it is large, time-consuming, and laborious work
         • Need to keep it up to date
     • Build the knowledge base automatically
       • How?

  8. Case-based Reasoning
     • Case-Based Reasoning (CBR)
       • Uses previous experience for problem solving
       • Retrieves cases similar to the current problem
       • Reuses past successful solutions
       • Revises the retrieved solution if necessary
       • Retains the current case

  9. Case-based Reasoning
     • The case base represents “knowledge” in the MAPE model
     • Each case represents a previous problem and its solution
     • Implicit versus explicit knowledge
       • Explicit: rules & models
       • Implicit: unstructured & based on experience
     • Implicit knowledge tends to be easier to acquire and more conducive to limited interaction

  10. Case-based Reasoning
      • Cases are stored by identifying application features
        • The problem
        • The applied solution
        • The outcome of the solution
      • Prevents the bottleneck present in other learning methods
        • E.g., online reinforcement learning
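As a minimal sketch of the three-part case described on this slide, a stored case could be modelled as a small record. The class name, field names, and example feature names below are illustrative assumptions, not taken from the presented system.

```python
from dataclasses import dataclass
from typing import Dict, Union

# A feature value may be boolean, numeric, symbolic, or missing (None).
FeatureValue = Union[bool, float, str, None]


@dataclass
class Case:
    """One stored failure episode (all names here are hypothetical)."""
    problem: Dict[str, FeatureValue]  # observed failure features
    solution: str                     # repair action that was applied
    outcome: bool                     # True if the repair restored the service


# Example with made-up feature and action names.
example = Case(
    problem={"web_tier_reachable": False, "db_tier_reachable": True},
    solution="restart_web_server",
    outcome=True,
)
```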

  11. Case-based Reasoning
      • CBR relies on large numbers of past cases
      • Pros
        • Methods improve with time and experience
        • Large systems are host to recurrent problems
      • Cons
        • Need to store the data
        • Need to populate the knowledge base

  12. Case-based Reasoning
      • To reiterate: CBR is a methodology designed to assist in the repair of failed systems
      • Questions so far?

  13. System Overview
      • SDS is treated as a black box
      • Self-healing CBR is entirely external to the SDS
        • Controls the health of the SDS
      • Components of CBR reflected in MAPE
        • Analysis <-> Retrieval
        • Planning <-> Revise
        • Knowledge <-> Case base

  14. System Overview: MAPE Revised
      (Figure panels: Old Model; Revised for CBR)

  15. System Overview: MAPE Revised
      • Four Additions
        • Monitoring
        • Case Preparation
        • Service Restoration
        • Repair Module

  16. System Overview: MAPE Revised
      • Application Agnostic Portion
        • Doesn’t rely on specific environment variables
      • Application Specific Portion
        • Relies on the data from the application
      • Both
        • Interface between the two layers
      • The managed element is completely external to the healing system

  17. System Overview
      • Assumptions
        • Bad solutions have no effect on the SDS state. Likewise, good solutions don’t produce faults.
        • Deadlines for producing case solutions aren’t fixed
        • Every stored case has a unique solution
        • No transient faults (occur only once)
        • No intermittent faults (appear, disappear, then reappear)

  18. CBR Cycle: Retrieve - Reuse/Revise - Retain
      • Every stored case is representative of some past failure
      • Need to find the case that best approximates the current failure
      • Find the average distance between features, using a per-feature distance d_f(x, y):
        • 1 if x or y is missing
        • overlap(x, y) if f is a symbolic feature
        • [expression missing in the transcript] if f is a linear feature
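The retrieval distance on this slide can be sketched as follows. The transcript does not preserve the expression for linear features, so this sketch assumes a range-normalised absolute difference (as in the Heterogeneous Euclidean-Overlap Metric); that choice, the 0/1 definition of overlap, and all function and feature names are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple, Union

Value = Union[bool, float, str, None]
Range = Tuple[float, float]


def overlap(x: Value, y: Value) -> float:
    """Overlap distance for symbolic features: 0 if equal, 1 otherwise."""
    return 0.0 if x == y else 1.0


def feature_distance(x: Value, y: Value, value_range: Optional[Range] = None) -> float:
    """Distance for one feature; value_range marks the feature as linear."""
    if x is None or y is None:
        return 1.0  # missing value: maximal distance, as stated on the slide
    if value_range is None:
        return overlap(x, y)  # symbolic feature
    lo, hi = value_range
    if hi == lo:
        return overlap(x, y)  # degenerate range: fall back to equality check
    # ASSUMPTION: normalised absolute difference, not confirmed by the slides.
    return min(1.0, abs(x - y) / (hi - lo))


def average_distance(current: Dict[str, Value], stored: Dict[str, Value],
                     ranges: Dict[str, Range]) -> float:
    """Mean per-feature distance over all features seen in either case."""
    features = set(current) | set(stored)
    if not features:
        return 1.0
    total = sum(
        feature_distance(current.get(f), stored.get(f), ranges.get(f))
        for f in features
    )
    return total / len(features)
```

A feature absent from either case is treated as missing, so it contributes the maximal distance of 1.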

  19. CBR Cycle: Retrieve - Reuse/Revise - Retain
      • Apply the retrieved cases’ solutions in order of best average distance
      • Repeat for all retrieved cases until the problem is solved
      • Also covers cases with multiple solutions (just use the best choice)
      • What if no solution works?
        • Ask a human
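The reuse/revise step above can be sketched as a loop over retrieved solutions, ordered from closest case to farthest, with a human fallback. The function name, the callback signatures, and the decision to skip already-tried solutions are assumptions for illustration only.

```python
from typing import Callable, Iterable, Tuple


def attempt_repairs(
    ranked: Iterable[Tuple[float, str]],
    apply_solution: Callable[[str], bool],
    ask_human: Callable[[], str],
) -> Tuple[str, bool]:
    """Try solutions from the closest case to the farthest.

    ranked holds (average_distance, solution) pairs. apply_solution returns
    True when the service is restored. Returns (solution, solved_automatically).
    """
    tried = set()
    for _distance, solution in sorted(ranked, key=lambda pair: pair[0]):
        if solution in tried:
            continue  # do not re-apply a repair that already failed
        tried.add(solution)
        if apply_solution(solution):
            return solution, True
    # No stored solution worked: hand the failure to a human operator.
    return ask_human(), False
```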

  20. CBR Cycle: Retrieve - Reuse/Revise - Retain
      • Just saves the case to the knowledge base
        • The problem
        • The solution
        • The outcome
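A minimal sketch of the retain step, assuming a single table with one row per case. The presented system is described as using MySQL; this sketch substitutes Python's built-in sqlite3 module so that it runs without a server, and the table layout and function names are hypothetical.

```python
import json
import sqlite3


def open_case_base(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a case base; the schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "problem TEXT NOT NULL, "      # JSON-encoded failure features
        "solution TEXT NOT NULL, "     # repair action that was applied
        "outcome INTEGER NOT NULL)"    # 1 if the repair worked, else 0
    )
    return conn


def retain(conn: sqlite3.Connection, problem: dict, solution: str, outcome: bool) -> int:
    """Store the problem, the solution, and the outcome; return the new row id."""
    cur = conn.execute(
        "INSERT INTO cases (problem, solution, outcome) VALUES (?, ?, ?)",
        (json.dumps(problem, sort_keys=True), solution, int(outcome)),
    )
    conn.commit()
    return cur.lastrowid
```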

  21. Odds and Ends
      • System initialization
        • Bootstrap phase
      • Prototyping
        • Makes a general case out of several similar cases in the case base
        • Solves the storage space problem
        • Takes the implicit knowledge and creates explicit knowledge
        • Used after the case base has grown
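The slides do not say how prototypes are computed. Purely as an illustration of the idea, the sketch below groups cases that share a solution and keeps only the feature values on which every case in the group agrees. The grouping rule and all names are assumptions, not the presented method.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_prototypes(cases: List[Tuple[dict, str]]) -> Dict[str, dict]:
    """Collapse cases into one generalised case per solution.

    cases is a list of (problem_features, solution) pairs. The result maps
    each solution to the feature values shared by all cases that used it.
    """
    groups = defaultdict(list)
    for problem, solution in cases:
        groups[solution].append(problem)

    prototypes = {}
    for solution, problems in groups.items():
        first = problems[0]
        # Keep a feature only if every case in the group has the same value.
        prototypes[solution] = {
            feature: value
            for feature, value in first.items()
            if all(feature in p and p[feature] == value for p in problems)
        }
    return prototypes
```

Under this rule the stored data shrinks from one row per failure to one row per distinct repair, which is one way to read the storage-saving claim on the slide.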

  22. CBR questions?
      • That wraps up the CBR portion. Any questions?

  23. Experimental Setup
      • Implemented the CBR-based system in Java
      • MySQL for the case base storage
      • Used with an SDS testbed, “Cavy”
      • Cavy
        • Configures, deploys, and operates SDS testbeds
        • Framework that surrounds the healing engine
        • Injects faults into testbed components

  24. Cavy Components
      • Fault managers
      • Diagnoser
      • Service Monitor
      • Interrogator
      • Repairer
      • Injector

  25. Cavy Components
      • Basically...
        • The injector breaks the system
        • The service monitor sees the fault
        • The diagnoser finds a similar FS pair
        • The interrogator receives the solution
        • The repairer tries each solution until one works

  26. Cavy Components
      • Cavy implements pieces of the self-healing architecture
        • Interrogator: Application agnostic pieces
        • Fault repairer: Application specific pieces
        • Service monitor: Monitor
        • Fault managers: Repair

  27. The Experiment
      • RUBiS
        • Mimics eBay
        • Two tiers
          • Customers interact with the web server on the first
          • Database stored on the second
      • Several services are tested
        • Register, Browse, Sell, Home

  28. The Experiment
      • Potential RUBiS failures (each can apply to either tier)
        • Network problems
        • Configuration problems
        • System restart
      • 10 failure descriptors
        • Boolean values
        • Represent failed pieces of the system
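With all ten descriptors Boolean, the average overlap distance between two failures reduces to the fraction of descriptors on which they differ. The sketch below ranks stored cases on that basis; the descriptor vectors and repair names are made up for illustration and are not the descriptors used in the experiment.

```python
from typing import List, Sequence, Tuple


def mismatch_fraction(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of Boolean descriptors on which two failures differ."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def rank_cases(
    current: Sequence[bool],
    case_base: List[Tuple[Sequence[bool], str]],
) -> List[Tuple[float, str]]:
    """Return (distance, solution) pairs, closest stored case first."""
    scored = [(mismatch_fraction(current, desc), sol) for desc, sol in case_base]
    # sorted() is stable, so equally distant cases keep their stored order.
    return sorted(scored, key=lambda pair: pair[0])


# Hypothetical 10-descriptor failure: only the first descriptor is set.
current_failure = [True] + [False] * 9
stored_cases = [
    ([False] * 10, "noop"),
    ([True] + [False] * 9, "reset_network"),
    ([True, True] + [False] * 8, "restart_web"),
]
ranking = rank_cases(current_failure, stored_cases)
```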

  29. Initial Base Case (constructed by a human)
      (Figure: Automatically generated case)

  30. Initial Base Case (constructed by a human)
      (Figure: Distances between current failure and base case)

  31. Second Case

  32. Results
      • Continued like this for 3 days
      • Of 1016 cases, fewer than 11 needed human intervention
      • Prototypes functioned correctly
      • Reduced the size of the database
      • Handled new faults without human intervention
      • Narrowed down the possible failures to 9 prototype cases
      • Showed that “complex” problems were just simultaneous simple problems

  33. Future Work
      • Use in real-world applications
      • Working around the given assumptions
      • Use of prototyping/generalization
      • Combine CBR with other knowledge sources
      • Combine CBR with some other methodology

  34. Conclusion
      • CBR is a good solution to self-healing
      • Repair procedure is triggered by service failures
      • No structured knowledge needed
      • Worked well even with novel faults
