Hybrid Systems. Hybridization. Integrated architectures for machine learning have been shown to provide performance improvements over single representation architectures.
Hybridization • Integrated architectures for machine learning have been shown to provide performance improvements over single representation architectures. • Integration, or hybridization, is achieved using a spectrum of module or component architectures, ranging from those sharing independently functioning components to architectures in which different components are combined in inherently inseparable ways. • In this presentation we briefly survey prototypical integrated architectures.
Combinations • The combination of knowledge based systems, neural networks and evolutionary computation forms the core of an emerging approach to building hybrid intelligent systems capable of reasoning and learning in an uncertain and imprecise environment.
Current Progress • In recent years multiple module integrated machine learning systems have been developed to overcome the limitations inherent in single component systems. • Integrations of neural networks (NN), fuzzy logic (FL) and global optimization algorithms have received considerable attention [Abr] but increasing attention is being paid to integrations with case based reasoning (CBR) and rule induction (RI) [Mar, Pren].
Primary Components • The full spectrum of knowledge representation in such systems is not confined to the primary components. • For example, in CBR systems although much knowledge resides in the case library significant problem solving knowledge may reside in secondary technologies such as in the similarity metric used to retrieve problem solution pairs from the case library, in the adaptation mechanisms used to improve an approximate solution and in the case library maintenance mechanisms.
MultiComponents • Although it is possible to generalize about the relative utilities of these component types based on the primary knowledge representation mechanisms, these generalizations may not remain valid in particular cases, depending on the characteristics of the secondary mechanisms employed. • Table 1 attempts to gauge the relative utilities of single-component systems based on the primary knowledge representation.
Degree of Integration • Besides differing in the types of component systems employed, different integrated architectures have emerged in a rather ad hoc way (Abraham [Abr]). • The least integrated architectures consist of independent components communicating with each other on a side-by-side basis. • More integration is shown in transformational or hierarchical systems, in which one technique may be used for development and another for delivery, or one component may be used to optimize the performance of another component. • More fully integrated architectures combine different effects to produce a balanced overall computational model.
Transformational, hierarchical and integrated • Abraham categorizes such systems as transformational, hierarchical and integrated. In a transformational integrated system, one type of component may be used to produce another, which is the functional system. • For example, a rule based system may be used to set the initial conditions for a neural network solution to a problem. • Thus, to create a modern intelligent system it may be necessary to make a choice of complementary techniques.
Stand Alone Models • Independent components that do not interact • Solving problems that have naturally independent components, e.g., decision support and categorization
Transformational • Expert systems with neural networks • Knowledge from the ES is used to set the initial conditions and training set of the NN
Hierarchical Hybrid • An ANN uses a GA to optimize its topology, and its output is fed into an ES, which creates the desired output or explanation
Integrated – Fused Architectures • Combine different techniques in one computational model • Share data structures and knowledge representations • Extended range of capabilities – e.g., classification with explanation, or, adaptation with classification
Fused Architecture The architecture consists of four components and the environment. The performance element (PE) is the actual controller. The learning element (LE) updates the knowledge in the PE. The LE has access to the environment, the past states and the performance measure. The critic examines the external performance and provides feedback to the LE. The critic faces the problem of converting an external reinforcement into an internal one. The problem generator contributes to exploring the problem space in an efficient way. The framework does not specify the techniques.
System Types for Hybridization • Knowledge-based Systems and if-then rules • CBR Systems • Evolutionary Intelligence and Genetic algorithms • Artificial Neural Networks and Learning • Fuzzy Systems • PSO Systems
Knowledge in Intelligent Systems • In rule induction systems knowledge is represented explicitly by if-then rules that are obtained from example sets. • In neural networks knowledge is captured in synaptic weights in systems of neurons that capture categorizations in data sets. • In evolutionary systems knowledge is captured in evolving pools of selected genes and in heuristics for selection of more adapted chromosomes. • In case based systems knowledge is primarily stored in the form of case histories that represent previously developed problem-solution pairs. • In PSO systems the knowledge is stored in the particle swarms.
Table 1 (Adapted from [Abr, Jac] and [Neg]). A comparison of the utility of case based reasoning systems (CBR), rule induction systems (RI), neural networks (NN), genetic algorithms (GA) and fuzzy systems (FS), with 1 representing a low and 4 representing a high utility.
Interpretability • Synaptic weights in trained neural networks are not easy to interpret, which causes particular difficulties if explanations are required. • Genetic algorithms model natural genetic adaptation to changing environments and thus are inherently adaptable and learn well. • They are not easily interpretable, however, because although the knowledge resides partly in the selection mechanism, it is for the most part deeply embedded within a population of adapted genes.
Adaptability • Case based systems are adaptable because changing the case library may be sufficient to port a system to a related area. If changes need to be made to the similarity metric or the adaptation mechanism or if the case structure needs to be changed much more work may be required.
Learnability • Fuzzy rule based systems offer more options through which learnability may be more easily achieved. • Fuzzy rules may be fine-tuned by adjusting the shapes of the fuzzy sets according to user feedback [Abi].
Rules and cases • Rule based systems employ an easily comprehensible but rigid representation of expert knowledge; such systems may afford better interpretation mechanisms. • Similarly, recent research [SØR] shows that explanation techniques for large case bases are most promising, while case based learning and maintenance can often be very efficient because of the transparency of typical case libraries.
Example Neural Expert Systems
Can we combine advantages of ANNs with other IS systems to create more powerful and effective systems?
Neural expert systems • Expert systems rely on logical inferences and decision trees and focus on modelling human reasoning. Neural networks rely on parallel data processing and focus on modelling a human brain. • Expert systems treat the brain as a black-box. Neural networks look at its structure and functions, particularly at its ability to learn. • Knowledge in a rule-based expert system is represented by IF-THEN production rules. Knowledge in neural networks is stored as synaptic weights between neurons.
In expert systems, knowledge can be divided into individual rules and the user can see and understand the piece of knowledge applied by the system. • In neural networks, one cannot select a single synaptic weight as a discrete piece of knowledge. Here knowledge is embedded in the entire network; it cannot be broken into individual pieces, and any change of a synaptic weight may lead to unpredictable results. A neural network is, in fact, a black-box for its user.
Can we combine advantages of expert systems and neural networks to create a more powerful and effective expert system? A hybrid system that combines a neural network and a rule-based expert system is called a neural expert system (or a connectionist expert system).
The heart of a neural expert system is the inference engine. It controls the information flow in the system and initiates inference over the neural knowledge base. A neural inference engine also ensures approximate reasoning.
Approximate reasoning • In a rule-based expert system, the inference engine compares the condition part of each rule with data given in the database. When the IF part of the rule matches the data in the database, the rule is fired and its THEN part is executed. Precise matching is required (the inference engine cannot cope with noisy or incomplete data). • Neural expert systems use a trained neural network in place of the knowledge base. The input data does not have to precisely match the data that was used in network training. This ability is called approximate reasoning.
Rule extraction • Neurons in the network are connected by links, each of which has a numerical weight attached to it. • The weights in a trained neural network determine the strength or importance of the associated neuron inputs.
Trained Neural Network To Identify Flying Objects Is there any way that we could interpret the values in the weights in a meaningful way?
Algorithm • By attaching a corresponding question to each input neuron, we can enable the system to prompt the user for initial values of the input variables: • Neuron: Wings; Question: Does the object have wings? • Neuron: Tail; Question: Does the object have a tail? • Neuron: Beak; Question: Does the object have a beak? • Neuron: Feathers; Question: Does the object have feathers? • Neuron: Engine; Question: Does the object have an engine? • Score 1 for yes, -1 for no and 0 for unknown. • Use a sign function as the activation and interpret 0 for no and 1 for yes.
Exercise: Neuro-rule inference • If we set each input of the input layer to either +1 (true), -1 (false), or 0 (unknown), we can give a semantic interpretation for the activation of any output neuron. • For example, if the object has Wings (+1), Beak (+1) and Feathers (+1), but does not have Engine (-1), what can we conclude about the object being a bird, a plane or a glider, applying a threshold of 0 and using the sign function as an activation function?
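The exercise can be explored with a few lines of Python. This is only a sketch under stated assumptions: the network figure did not survive in this transcript, so the weights below are illustrative. Their magnitudes are taken from the worked confidence example for the Bird neuron, but the signs and the assignment of each magnitude to an input are guesses, and `BIRD_WEIGHTS`, `net_input` and `sign` are hypothetical names.

```python
# Hypothetical weights for the Bird output neuron (ASSUMED signs and mapping;
# only the magnitudes 0.8, 0.2, 2.2, 2.8, 1.1 appear in the slides).
BIRD_WEIGHTS = {"wings": -0.8, "tail": -0.2, "beak": 2.2,
                "feathers": 2.8, "engine": -1.1}

def net_input(weights, inputs):
    """Weighted sum of the inputs; any input not supplied counts as 0 (unknown)."""
    return sum(w * inputs.get(name, 0) for name, w in weights.items())

def sign(x, threshold=0.0):
    """Sign activation: +1 (true) if x exceeds the threshold, otherwise -1 (false)."""
    return 1 if x > threshold else -1

# Wings, Beak and Feathers are present, Engine is absent, Tail is unknown.
observed = {"wings": +1, "beak": +1, "feathers": +1, "engine": -1}
net = net_input(BIRD_WEIGHTS, observed)
bird = sign(net)
print(f"net input = {net:.1f}, Bird = {bird}")
```

With these assumed weights the Bird neuron fires. The Plane and Glider neurons would be evaluated the same way, each with its own weight dictionary.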
Algorithm for Extracting Confidence • Heuristic: known greater than unknown. • An inference can be made if the known net weighted input to a neuron is greater than the sum of the absolute values of the weights of the unknown inputs:
$$\sum_{i=1}^{n} x_i w_i > \sum_{j=1}^{n} |w_j|$$
where $i \in \text{known}$, $j \notin \text{known}$ and $n$ is the number of neuron inputs.
Class Exercise: Confidence in Neural Rules • In the neural rules below, suppose that you find an increasing amount of information about an object: 1. It has feathers. 2. It has feathers and a beak. 3. It has feathers, a beak and wings. • At what point, according to the above algorithm, can the inference be made that the object is a bird? • How much difference does the knowledge about wings make?
Enter initial value for the input Feathers: +1 • KNOWN = 1 × 2.8 = 2.8 • UNKNOWN = 0.8 + 0.2 + 2.2 + 1.1 = 4.3 • KNOWN < UNKNOWN
Enter initial value for the input Beak: +1 • KNOWN = 1 × 2.8 + 1 × 2.2 = 5.0 • UNKNOWN = 0.8 + 0.2 + 1.1 = 2.1 • KNOWN > UNKNOWN • CONCLUDE: Bird is TRUE
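The two steps of this worked example can be reproduced in code. The following is a minimal sketch, not the presenter's implementation: `confidence_check` is a hypothetical helper, and the weight dictionary assumes which magnitude belongs to which input and what sign it carries (the slides give only the magnitudes used in the sums).

```python
# ASSUMED weights for the Bird neuron: magnitudes from the worked example,
# signs and input names are illustrative guesses.
BIRD_WEIGHTS = {"wings": -0.8, "tail": -0.2, "beak": 2.2,
                "feathers": 2.8, "engine": -1.1}

def confidence_check(weights, known_inputs):
    """Apply the 'known greater than unknown' heuristic to one neuron.

    known_inputs maps input name -> +1 or -1; names that are absent are unknown.
    Returns (known, unknown, can_infer).
    """
    known = sum(w * known_inputs[name]
                for name, w in weights.items() if name in known_inputs)
    unknown = sum(abs(w)
                  for name, w in weights.items() if name not in known_inputs)
    return known, unknown, known > unknown

step1 = confidence_check(BIRD_WEIGHTS, {"feathers": +1})
step2 = confidence_check(BIRD_WEIGHTS, {"feathers": +1, "beak": +1})
for label, (known, unknown, ok) in (("Feathers", step1), ("Feathers + Beak", step2)):
    print(f"{label}: KNOWN = {known:.1f}, UNKNOWN = {unknown:.1f}, infer = {ok}")
```

Because the unknown side uses absolute values, the first two steps do not depend on the assumed signs. The third step of the class exercise (adding wings) does, so it is left to the reader with the real weights.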
A Set of rules can be mapped into a multi-layer neural network architecture • The weights between the layers represent rule certainties • After establishing the initial structure of the ANN a training algorithm may be applied. • After training the weights may be used to refine the initial set of rules.
Evolutionary neural networks • Although neural networks are used for solving a variety of problems, they still have some limitations. • One of the most common is associated with neural network training. The back-propagation learning algorithm cannot guarantee an optimal solution. In real-world applications, the back-propagation algorithm might converge to a set of sub-optimal weights from which it cannot escape. As a result, the neural network is often unable to find a desirable solution to a problem at hand.
Another difficulty is related to selecting an optimal topology for the neural network. The “right” network architecture for a particular problem is often chosen by means of heuristics, and designing a neural network topology is still more art than engineering. • Genetic algorithms are an effective optimisation technique that can guide both weight optimisation and topology selection.
The second step is to define a fitness function for evaluating the chromosome’s performance. This function must estimate the performance of a given neural network. We can apply here a simple function defined by the sum of squared errors. • The training set of examples is presented to the network, and the sum of squared errors is calculated. The smaller the sum, the fitter the chromosome. The genetic algorithm attempts to find a set of weights that minimises the sum of squared errors.
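A fitness function of this kind might look as follows. This is a sketch under assumptions the slides do not confirm: the chromosome is taken to be a flat list of weights for a network with one hidden layer of sigmoid neurons and a single output, and the layout, `forward`, `sum_squared_error` and the XOR training set are all illustrative choices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(chromosome, x, n_hidden=2):
    """Decode a flat weight list into a one-hidden-layer network and run it.

    ASSUMED layout: for each hidden neuron, len(x) weights followed by a bias;
    then n_hidden output weights followed by an output bias.
    """
    n_in, pos, hidden = len(x), 0, []
    for _ in range(n_hidden):
        weights, bias = chromosome[pos:pos + n_in], chromosome[pos + n_in]
        hidden.append(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias))
        pos += n_in + 1
    out_w, out_b = chromosome[pos:pos + n_hidden], chromosome[pos + n_hidden]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

def sum_squared_error(chromosome, training_set):
    """Fitness measure: the smaller the sum, the fitter the chromosome."""
    return sum((target - forward(chromosome, x)) ** 2
               for x, target in training_set)

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
untrained = [0.0] * 9                                   # every output is 0.5
handmade = [20, 20, -10, 20, 20, -30, 20, -20, -10]     # OR, AND, OR-and-not-AND
print(sum_squared_error(untrained, XOR), sum_squared_error(handmade, XOR))
```

A genetic algorithm would rank chromosomes by this value, preferring the hand-made weight set over the all-zero one because its error is far smaller.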
The third step is to choose the genetic operators: crossover and mutation. A crossover operator takes two parent chromosomes and creates a single child with genetic material from both parents. Each gene in the child’s chromosome is represented by the corresponding gene of the randomly selected parent. • A mutation operator selects a gene in a chromosome and adds a small random value between -1 and 1 to each weight in this gene.
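The two operators can be sketched as below. The representation is an assumption: a chromosome is a list of genes and each gene is a list of weights (for example, the incoming weights of one neuron), and the mutation range is read as -1 to 1. `crossover` and `mutate` are hypothetical names.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """Build one child; each gene is copied from a randomly selected parent."""
    return [list(rng.choice((gene_a, gene_b)))
            for gene_a, gene_b in zip(parent_a, parent_b)]

def mutate(chromosome, rng=random):
    """Pick one gene and add a random value in [-1, 1] to each of its weights."""
    child = [list(gene) for gene in chromosome]   # copy, leave the parent intact
    k = rng.randrange(len(child))
    child[k] = [w + rng.uniform(-1.0, 1.0) for w in child[k]]
    return child

rng = random.Random(42)                 # seeded only to make the demo repeatable
mum = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
dad = [[-1.0, -1.0], [-2.0, -2.0], [-3.0, -3.0]]
child = crossover(mum, dad, rng)
mutant = mutate(child, rng)
print(child, mutant)
```

Copying whole genes rather than single weights keeps the weights of one neuron together. That is a design choice of this sketch and may differ from the encoding used in the original first step, which is not included in this transcript.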