Self Healing and Dynamic Construction Framework: Applications of the Software Matrix • Anirudha Krishna • Advisor: • Dr. James Fawcett • 9 Dec 2005
Goals of the Research • Exploring ideas to simplify design of complex systems • Reuse • Configurability • Automation • Build a Framework to enable systems to Self Heal • Explore the usability of the Software Matrix in this context
Self Healing and Dynamic Construction • Self Healing – System that is able to recover from failure without external intervention • Dynamic Construction – System that can add functionality to itself at run time • Software Matrix – Cell based development model focused on reuse
Matrix Framework • Framework for Reuse • Integrated Network of Cells • Dynamic Application Composition • Mediator based Communication
Matrix Framework • Five Elements of the Software Matrix • Cell • Mediator • Message Passing Support • Executive • Network Support
Matrix Cell • Four Components • Cell ID • Message Passing Structure • Capability List • Processing
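The four components of a Cell can be pictured with a small sketch. This is only an illustration: the class name `Cell`, its field names, and the dictionary message shape are assumptions, not the thesis implementation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical Cell holding the four components named on the slide."""
    cell_id: str                                       # Cell ID
    capabilities: list = field(default_factory=list)   # Capability List
    inbox: deque = field(default_factory=deque)        # Message Passing Structure

    def post(self, message: dict) -> bool:
        """Queue a message only if its type is in the capability list."""
        if message.get("type") not in self.capabilities:
            return False
        self.inbox.append(message)
        return True

    def process(self) -> list:
        """Processing: drain the inbox (a real Cell would do application work)."""
        handled = []
        while self.inbox:
            msg = self.inbox.popleft()
            handled.append(f"{self.cell_id} handled {msg['type']}")
        return handled
```

A Mediator could use the capability list to decide which Cell receives a given message type, which is why `post` rejects types the Cell does not advertise.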
Matrix Messaging • Mediator centric Communication • XML Based Messages • Synchronous and Asynchronous Message Passing • Common Local and Network Message Formats
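One way to picture a common local and network message format is a single XML envelope whose destination element is optional. The element and attribute names below (`message`, `type`, `sender`, `destination`, `body`) are assumptions made for this sketch; the slides do not give the actual schema.

```python
import xml.etree.ElementTree as ET


def build_message(msg_type, sender, body, destination=None):
    """Serialize a message; a local message simply omits <destination>."""
    root = ET.Element("message", {"type": msg_type, "sender": sender})
    if destination is not None:
        ET.SubElement(root, "destination").text = destination
    ET.SubElement(root, "body").text = body
    return ET.tostring(root, encoding="unicode")


def parse_message(xml_text):
    """Parse the envelope back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "type": root.get("type"),
        "sender": root.get("sender"),
        "destination": root.findtext("destination"),  # None for local messages
        "body": root.findtext("body"),
    }
```

Because the same envelope serves both cases, a Mediator could treat a missing destination as "deliver locally" and hand anything else to the Network Cell.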
Matrix Executive • Directory Watcher • Reflection to load Cells • Common Interface – Register Function • Start – Cell Initialization • Application Startup
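The Executive's load step can be sketched as follows, with Python's `importlib` standing in for reflection. The function name `load_cells`, the convention that each Cell file exposes a `register(registry)` function, and the dictionary registry are all assumptions for illustration. The sketch performs a single directory scan; a Directory Watcher would call it again whenever a file appears.

```python
import importlib.util
from pathlib import Path


def load_cells(directory, registry):
    """Load every *.py file in `directory` and call its register() function."""
    loaded = []
    for path in sorted(Path(directory).glob("*.py")):
        # Build a module object from the file and execute it.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Common interface: a module counts as a Cell only if it offers register().
        register = getattr(module, "register", None)
        if callable(register):
            register(registry)
            loaded.append(path.stem)
    return loaded
```

Files without a `register` function are executed but not counted as Cells, so a production version would likely want to restrict which files it loads.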
Additions to the Matrix – Network Support • Replaceable Network Cell • Provides both Server and Receiver Functions • Provision for specifying Network Destinations for Messages • Asynchronous and Synchronous communication • Concept of Matrix Node
Self Healing Architecture • Matrix Style Cell Based Framework • Recover from Cell / Node Failures • Three main components – • Default Handler – Detect Failures • Address Server – Locate Message Handlers • Repository Server – Restore Failed Cells • Configurable to required level of functionality
Address Server • Maintains an Address table of Message Handlers and their locations • Contains logic for evaluating a request as a Cell failure or lack of Addressing Information • Requests reload from Repository and manages installation of new Cell • Provides capability for recovering from Repository Server Failures
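The decision between "Cell failure" and "lack of Addressing Information" might look like the sketch below. The rule used here is an assumption: if the requester reports the same address the table still holds (or the table has no entry), the request is treated as a Cell failure and a reload is triggered; otherwise the requester was merely out of date. The names `AddressServer`, `resolve`, and `reload_cell` are hypothetical.

```python
class AddressServer:
    """Hypothetical address table mapping message types to node addresses."""

    def __init__(self, reload_cell):
        self.table = {}
        # Assumed callback: asks the Repository to reinstall the handler
        # for a message type and returns the address of the new location.
        self.reload_cell = reload_cell

    def register(self, message_type, address):
        self.table[message_type] = address

    def resolve(self, message_type, failed_address=None):
        known = self.table.get(message_type)
        if known is not None and known != failed_address:
            # The requester only lacked current addressing information.
            return known
        # No handler is known, or the known location is the one that failed:
        # treat it as a Cell failure and restore the Cell elsewhere.
        new_address = self.reload_cell(message_type)
        self.table[message_type] = new_address
        return new_address
```

Under this rule, repeated failure reports for an address that has already been replaced return the new location without triggering a second reload.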
Default Handler • Receives Messages from • Failed deliveries – Network / Local • Destination not specified • Maintains cache of Network Addresses • Discovers new Address either from local cache or by contacting the Address Server
Default Handler • Forwards the failed message to the new destination • Supports Recovery from Address Server Failures • Provides information on Message Handling capabilities of the node
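The Default Handler behaviour described on the two slides above (check the local cache, fall back to the Address Server, then forward the failed message) can be sketched like this. The class and method names, and the two injected callables `ask_address_server` and `send`, are assumptions for illustration only.

```python
class DefaultHandler:
    """Hypothetical handler for messages whose delivery failed."""

    def __init__(self, ask_address_server, send):
        self.cache = {}                                # message type -> address
        self.ask_address_server = ask_address_server   # (type, failed_address) -> address
        self.send = send                               # (address, message) -> bool

    def handle_failed(self, message, failed_address=None):
        msg_type = message["type"]
        address = self.cache.get(msg_type)
        # The cache is useless if it is empty or still points at the failed node.
        if address is None or address == failed_address:
            address = self.ask_address_server(msg_type, failed_address)
            self.cache[msg_type] = address
        # Forward the original message to the new destination.
        return self.send(address, message)
```

Only the first failed message for a given type contacts the Address Server; later ones are forwarded from the cache.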
Repository • Maintains a static store of all Cells used by the Application • Responds to queries requesting Message Handlers by identifying and retrieving the required File • Provides support for runtime startup after Failure
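A minimal sketch of the Repository query path, assuming Cells are stored as files in one directory and indexed by the message type they handle. The class name `Repository`, the `add` and `fetch` methods, and the index layout are hypothetical.

```python
from pathlib import Path


class Repository:
    """Hypothetical static store of Cell files, indexed by message type."""

    def __init__(self, store_dir):
        self.store_dir = Path(store_dir)
        self.index = {}  # message type -> file name of the Cell that handles it

    def add(self, message_type, file_name):
        self.index[message_type] = file_name

    def fetch(self, message_type):
        """Return (file name, file bytes) for the handler, or None if unavailable."""
        file_name = self.index.get(message_type)
        if file_name is None:
            return None
        path = self.store_dir / file_name
        if not path.is_file():
            return None
        return file_name, path.read_bytes()
```

The returned bytes could then be shipped to the target node and dropped into the directory its Executive watches.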
Application Cell Failure • Failure of Node detected by Network Cell • Message forwarded to Default Handler • Request to Address Server generates Reload from Repository • Default Handler forwards original message to new location
Framework Cell Failure • Applications are restored by the Framework • What if a Framework Cell fails?
Restoring Framework Components • Possibility that Address Server or Repository fails • Two options • Maintain mirror servers • Enable framework to heal itself • Advantages and Disadvantages to each scheme • Data Restoration on startup
Framework Configurability • Ability to utilize only as much of the Framework as required • Complete Self Healing framework consisting of Address Server, Default Handlers, Repository, File Handlers and Network Cells • Minimal system consisting of just a pared-down Default Handler
Test System – Simulated Radar Management • Three components – • Radar Display • Data Analysis • Field Console • Automating Recovery from Failure • Failure Test
Simulated Radar Management • Operational Radar System
Simulated Radar Management • Normal Operation on Four Machines • Failure of the Data Analysis Module – repaired by the Framework • Failure of a Framework Component – Address Server • Address Server restored • Data Analysis restored
Timing Test • Test Results – one time recovery delay before normal operation resumes
Thesis Conclusions • Key Feature – reduce complexity • Reduction of complexity in application design complemented by simplifying tools used • Raise Abstraction to Cell level through efficient reuse of existing Cells • Use of automation to build more robust applications
Thesis Conclusions • Simple Self Healing is possible • Dynamic Construction over a distributed system can work effectively • Efficient reuse can minimize the coding effort to allow applications to be built by putting files together
Future Work • Removing the Mediator centric messaging from production quality Matrix Applications • Generic Multithreaded Server Cell design for high bandwidth Applications • Multiple Service Providers • Adding other automation services – self configuring, self protecting, self optimizing • Sophisticated Network Cells