ICT619 Intelligent Systems Topic 4: Artificial Neural Networks
Artificial Neural Networks PART A • Introduction • An overview of the biological neuron • The synthetic neuron • Structure and operation of an ANN • Problem solving by an ANN • Learning in ANNs • ANN models • Applications PART B • Developing neural network applications • Design of the network • Training issues • A comparison of ANN and ES • Hybrid ANN systems • Case Studies
Developing neural network applications Neural Network Implementations Three possible practical implementations of ANNs are: • A software simulation program running on a digital computer • A hardware emulator connected to a host computer - called a neurocomputer • True electronic circuits
Software Simulations of ANN • Currently the cheapest and simplest implementation method for ANNs - at least for general purpose use. • Simulates parallel processing on a conventional sequential digital computer • Replicates temporal behaviour of the network by updating the activation level and output of each node for successive time steps • These steps are represented by iterations or loops • Within each loop, the updates for all nodes in a layer are performed.
Software simulations of ANN (cont’d) • In multilayer ANNs, processing for a layer is completed and its output used to calculate states of the nodes in the following layer • Typical additional features of ANN simulators • Configuring the net according to a chosen architecture and node operational characteristic • Implementation of training phase using a chosen training algorithm • Tools for visualising and analysing behaviour of nets • ANN simulators are written in high-level languages such as C, C++ and Java.
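The layer-by-layer update described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular simulator: the sigmoid transfer function, the `simulate` helper and the hand-picked 2-2-1 weights are all assumptions made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate(layers, inputs):
    """Propagate inputs through the net one layer at a time.

    layers: list of layers; each layer is a list of (weights, bias) per node.
    """
    activations = list(inputs)
    for layer in layers:                       # one loop iteration per layer
        next_activations = []
        for weights, bias in layer:            # update every node in this layer
            net = sum(w * a for w, a in zip(weights, activations)) + bias
            next_activations.append(sigmoid(net))
        activations = next_activations         # outputs feed the following layer
    return activations

# Hypothetical 2-2-1 network with hand-picked weights (illustration only).
net = [
    [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.5)],  # hidden layer: two nodes
    [([1.0, 1.0], 0.0)],                       # output layer: one node
]
output = simulate(net, [1.0, 0.0])
```

The sequential loops stand in for what would be parallel node updates in a biological or hardware network.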
Advantages and possible problems with software simulators • Main attraction of ANN simulators is the relatively low cost and wide availability of ready-made commercial packages • They are also compact, flexible and highly portable. • Writing your own simulator requires programming skills and would be time consuming (except that you don't have to now!) • Training of ANNs using software simulators can be slow for larger networks (more than a few hundred nodes)
Commercially available neural net packages • Prewritten shells with convenient user interfaces • Cost a few hundred to tens of thousands of dollars • Allow users to specify the ANN design and training parameters • Usually provide graphic interfaces to enable monitoring of the net’s training and operation • Likely to provide interfacing with other software systems such as spreadsheets and databases.
Neurocomputers • Dedicated special-purpose digital computers (also known as accelerator boards) • Optimised to perform operations common in neural network simulation • Act as a coprocessor to a host computer and are controlled by a program running on the host. • Can be tens to thousands of times faster than simulators • Systems are available with approx. 1000 million interconnection updates per second (IPS) for networks with 8,192 neurons, e.g., the ACC Neural Network Processor
Neurocomputers Genobyte's CAM-Brain Machine was developed between 1997 and 2000
True Networks in Hardware • Closer to biological neural networks than simulations • Consist of synthetic neurons actually fabricated on silicon chips • Commercially available hardwired ANNs are limited to a few thousand neurons per chip (figures more than five years old). • Chips are connected in parallel to achieve larger networks. • Problems: interconnection and interference, fixed-valued weights (work is progressing on modifiable synapses).
Neural Network Development Methodology • Aims to add structure and organisation to ANN applications development in order to reduce cost and increase accuracy, consistency, user confidence and user-friendliness • Split development into the following phases: • The Concept Phase • The Design Phase • The Implementation Phase • The Maintenance Phase
Neural Network Development Methodology - the Concept Phase Involves • Validating the proposed application • Selecting an appropriate neural paradigm. Application validation Problem characteristics suitable for neural network application are: • Data intensive • Multiple interacting parameters • Incomplete, erroneous, noisy data • Solution function unknown or expensive • Requires flexibility, generalisation, fault-tolerance, speed
ANN Development Methodology - the Concept Phase (cont’d) • Common examples of applications with the above attributes are • pattern recognition (e.g., printed or handwritten characters, consumer behaviour, risk patterns) • forecasting (e.g., stock market) • signal (audio, video, ultrasound) processing • Problems not suitable for ANN-based solutions include those where: • A mathematically accurate and precise solution is available • A solution involving deduction and step-wise logic is appropriate • The application involves explanation or reporting • One application area that is unsuitable for ANNs is resource management, e.g., inventory, accounts, sales data analysis
Selecting an ANN paradigm • Decision based on comparison of application requirements to capabilities of different paradigms, e.g., the multilayer perceptron is well known for its pattern recognition capabilities, while the Kohonen net is better suited to applications involving data clustering • Choice of paradigm is also influenced by the training method that can be employed, e.g., supervised training needs an adequate number of input/correct-output pairs, and training may take a relatively long time • Technical and economic feasibility assessments should be carried out to complete the concept phase
The Design Phase • The design phase specifies initial values and conditions at the node, network and training levels • Decisions to be made at the node level include: • Types of input – binary (0,1), bipolar (-1,+1), trivalent (-1, 0, +1), discrete, continuous-valued • Transfer function – step (threshold), hyperbolic tangent or sigmoid; consider using lookup tables to speed up calculations • Decisions to be made at the network architecture level • The number and size of layers and their connectivity (fully interconnected, or sparsely interconnected, feedforward or recurrent, other?)
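As a rough illustration of the node-level choices above, the following Python sketch defines the three transfer functions named on the slide and a lookup-table version of the sigmoid. The table range (-8 to 8) and resolution (0.01) are arbitrary assumptions chosen for the example, not values from the source.

```python
import math

def step(x, threshold=0.0):
    """Step (threshold) function: fires when the net input reaches the threshold."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

# Precomputed sigmoid table (assumed range and resolution).
LO, HI, N = -8.0, 8.0, 1601
STEP = (HI - LO) / (N - 1)
TABLE = [sigmoid(LO + i * STEP) for i in range(N)]

def sigmoid_lookup(x):
    """Approximate sigmoid by the nearest table entry, clamping outside the range."""
    if x <= LO:
        return TABLE[0]
    if x >= HI:
        return TABLE[-1]
    return TABLE[round((x - LO) / STEP)]
```

The lookup trades a small approximation error for avoiding an exponential on every node update; whether that is a net gain depends on the platform.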
The Design Phase (cont’d) • 'Size' of a layer is the number of nodes in the layer • For the input layer, size is determined by the number of data sources (input vector components) and possibly by any mathematical transformations applied to them • The number of nodes in the output layer is determined by the number of classes or decision values to be output • Finding the optimal size of the hidden layer needs some experimentation • Too few nodes will produce inadequate mapping, while too many may result in inadequate generalisation
The Design Phase (cont’d) Connectivity • Connectivity determines the flow of signals between neurons in the same or different layers • Some ANN models, such as the multilayer perceptron, have only interlayer connections - there is no intralayer connection • The Hopfield net is an example of a model with intralayer connections
The Design Phase (cont’d) Feedback • There may be no feedback of output values, as in the multilayer perceptron, or • There may be feedback, as in a recurrent network such as the Hopfield net • Other design questions include • Setting of parameters for the learning phase, e.g., stopping criterion, learning rate. • Possible addition of noise to speed up training.
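To make the idea of feedback concrete, here is a small Hopfield-style sketch in which node outputs are fed back as inputs until the state stops changing. The Hebbian outer-product weight rule, the bipolar coding and the six-element pattern are assumptions for illustration; the slides name the Hopfield net but do not specify its training rule.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from bipolar (-1/+1) patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                     # no self-connections
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, max_sweeps=10):
    """Repeatedly feed node outputs back until no node changes."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            net = sum(w[i][j] * state[j] for j in range(len(state)))
            new = 1 if net >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:                        # stable state reached
            break
    return state

stored = [1, -1, 1, -1, 1, -1]
w = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]                   # last element corrupted
recovered = recall(w, noisy)
```

In this example the corrupted last element is driven back to the stored value because every other node feeds its output into it.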
The Implementation Phase Typical steps: • Gathering the training set • Selecting the development environment • Implementing the neural network • Testing and debugging the network Gathering the training set • Aims to get the right type of data, in an adequate amount and in the right format
Gathering training data (cont’d) • How much data to gather? • Increasing the amount of data increases training time but may help earlier convergence • Quality is more important than quantity • Collection of data • Potential sources - historical records, instrument readings, simulation results • Preparation of data • Involves preprocessing including scaling, normalisation, binarisation, mapping to logarithmic scale, etc.
Gathering training data (cont’d) • Type of data to collect • Should be representative of the given problem, including routine, unusual and boundary-condition cases • A mix of good as well as imperfect data, but not ambiguous or too erroneous. • Amount of data to gather • Increasing the amount of data increases training time but may help earlier convergence • Quality is more important than quantity
Gathering training data (cont’d) • Collection of data • Potential sources - historical records, instrument readings, simulation results • Preparation of data • Involves preprocessing including normalisation and possible binarisation
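The preprocessing steps mentioned for data preparation (scaling or normalisation, binarisation, mapping to a logarithmic scale) might look like the following in Python. The function names, the min-max form of normalisation and the sample readings are assumptions for illustration.

```python
import math

def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale values into the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:                           # avoid division by zero
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

def binarise(values, threshold):
    """Map each value to 1 if it reaches the threshold, otherwise 0."""
    return [1 if v >= threshold else 0 for v in values]

def log_scale(values):
    """Map strictly positive values onto a base-10 logarithmic scale."""
    return [math.log10(v) for v in values]

readings = [10.0, 100.0, 1000.0]               # hypothetical instrument readings
scaled = min_max_scale(readings)
bits = binarise(readings, 100.0)
logs = log_scale(readings)
```

Here the log mapping spreads the three readings evenly, whereas plain min-max scaling leaves the two smaller readings bunched near zero.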
Selecting the development environment Hardware and software aspects • Hardware requirements based on • speed of operation • memory and storage capacity • software availability • cost • compatibility • The most popular platforms are workstations and high-end PCs (with accelerator board option)
Selecting the development environment Two options in choosing software • Custom-coded simulators – which require more expertise on the part of the user but provide maximum flexibility • Commercial development packages – which are usually easy to use because of a more sophisticated interface
Selecting the development environment (cont’d) • Selection of hardware and software environment is usually based on the following considerations: • ANN paradigm to be implemented • Speed in training and recall • Transportability • Vendor support • Extensibility • Price
Implementing the neural network Common steps involved are: • Selection of appropriate neural paradigm • Setting network size • Deciding on the learning algorithm • Creation of screen displays • Determining the halting criteria • Collecting data for training and testing • Data preparation including preprocessing • Organising data into training and test sets
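The last step, organising data into training and test sets, can be sketched as a shuffled split. The 80/20 proportion, the fixed seed and the `split_data` name are assumptions for the example; the source does not prescribe a ratio.

```python
import random

def split_data(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold back a fraction of them for testing."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (input vector, target) pairs.
examples = [([i, i + 1], i % 2) for i in range(10)]
train_set, test_set = split_data(examples)
```

Shuffling before splitting avoids a test set made up only of the most recently collected cases.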
Implementation - Training • Training the net, which consists of • Loading the training set • Initialisation of network weights – usually to small random values • Starting the training process • Monitoring the training process until training is completed • Saving of weight values in a file for use during operation mode
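The training steps listed above can be walked through with a deliberately tiny example. This sketch trains a single threshold unit with the perceptron learning rule on the logical AND function, which is an assumption chosen for brevity (the slides do not fix a network or algorithm here). The JSON file format and all function names are likewise illustrative.

```python
import json
import os
import random
import tempfile

def predict(weights, bias, inputs):
    """Threshold unit: output 1 when the weighted sum reaches 0."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if net >= 0 else 0

def train(training_set, rate=0.1, max_epochs=200, seed=0):
    rng = random.Random(seed)
    n_inputs = len(training_set[0][0])
    # Initialise the weights to small random values.
    weights = [rng.uniform(-0.05, 0.05) for _ in range(n_inputs)]
    bias = rng.uniform(-0.05, 0.05)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for inputs, target in training_set:
            delta = target - predict(weights, bias, inputs)
            if delta != 0:                     # adjust weights only on a mistake
                errors += 1
                weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
                bias += rate * delta
        print(f"epoch {epoch}: {errors} misclassified")  # monitoring
        if errors == 0:                        # halting criterion
            break
    return weights, bias, errors

def save_weights(path, weights, bias):
    with open(path, "w") as f:
        json.dump({"weights": weights, "bias": bias}, f)

def load_weights(path):
    with open(path) as f:
        data = json.load(f)
    return data["weights"], data["bias"]

# Load the training set (logical AND), train, then save and reload the weights.
and_set = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias, errors = train(and_set)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "weights.json")
    save_weights(path, weights, bias)
    loaded_weights, loaded_bias = load_weights(path)
```

In operation mode only `load_weights` and `predict` would be needed; the training code is not part of the deployed system.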
Implementation – Training (cont’d) Possible problems arising during training • Failure to converge to a set of optimal weight values • Further weight adjustments fail to reduce output error because the net is stuck in a local minimum • Remedied by resetting the learning parameters and reinitialising the weights • Overtraining • Net fails to generalise, i.e., fails to classify less than perfect patterns • A mix of good and imperfect patterns for training helps
Implementation – Training (cont’d) • Training results may be affected by the method of presenting the data set to the network. • Adjustments may be made by varying the layer sizes and fine-tuning the learning parameters. • To ensure optimal results, several variations of a neural network may be trained and each tested for accuracy
Implementation - Testing and Debugging Testing can be done by: 1. Observing operational behaviour of the net. 2. Analysing actual weights 3. Study of network behaviour under specific conditions Observing operational behaviour • Network treated as a black box and its response to a series of test cases is evaluated Test data • Should contain training cases as well as new cases • Routine, unusual as well as boundary condition cases should be tried
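Black-box testing of the kind described above amounts to running test cases through the net and comparing responses with expected outputs. In the sketch below, `threshold_net` is a hypothetical stand-in for a trained network, and the test cases are invented to cover routine, boundary and unusual inputs.

```python
def evaluate(predict, test_cases):
    """Run every test case through the net and collect the failures."""
    failures = []
    for inputs, expected in test_cases:
        actual = predict(inputs)               # only the response is observed
        if actual != expected:
            failures.append((inputs, expected, actual))
    accuracy = 1.0 - len(failures) / len(test_cases)
    return accuracy, failures

def threshold_net(inputs):
    """Hypothetical stand-in for a trained net: fires when inputs sum to 1.5."""
    return 1 if sum(inputs) >= 1.5 else 0

test_cases = [
    ([0, 0], 0),                               # routine case
    ([1, 1], 1),                               # routine case
    ([0.75, 0.75], 1),                         # boundary-condition case
    ([2.0, -1.0], 1),                          # unusual case
]
accuracy, failures = evaluate(threshold_net, test_cases)
```

Keeping the failing cases, not just the accuracy figure, gives a starting point for the debugging steps that follow.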
Implementation - Testing and Debugging (cont’d) Testing by weight analysis • Weights entering and exiting nodes are analysed for relatively small and large values • If significant errors are detected in testing, debugging involves examining • the training cases for representativeness, accuracy and adequacy of number • learning algorithm parameters such as the rate at which weights are adjusted • neural network architecture, node characteristics, and connectivity • training set-network interface, user-network interface
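Weight analysis can be as simple as scanning the saved weights for values that are unusually small or large in magnitude. The thresholds (0.01 and 5.0), the link naming and the sample weights below are arbitrary assumptions; suitable cut-offs depend on the network and its input scaling.

```python
def flag_weights(weights, small=0.01, large=5.0):
    """Report links whose weight magnitude is unusually small or large.

    weights: dict mapping (source node, target node) to the weight value.
    """
    report = {"small": [], "large": []}
    for link, value in sorted(weights.items()):
        if abs(value) < small:
            report["small"].append(link)       # link contributes almost nothing
        elif abs(value) > large:
            report["large"].append(link)       # link may dominate its target node
    return report

# Hypothetical weights read back from a trained net.
weights = {("x1", "h1"): 0.8, ("x2", "h1"): 0.001, ("h1", "y"): -7.5}
report = flag_weights(weights)
```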
The Maintenance Phase Consists of • placing the neural network in an operational environment with possible integration • periodic performance evaluation, and maintenance • Although often designed as stand-alone systems, some neural network systems are integrated with other information systems using: • Loose-coupling – preprocessor, postprocessor, distributed component • Tight-coupling or full integration as embedded component
The Maintenance Phase (cont’d) [Figure: possible ANN operational environments]
System evaluation • Continual evaluation is necessary to • ensure satisfactory performance in solving dynamic problems • check for damaged or retrained networks. • Evaluation can be carried out by reusing original test procedures with current data.
ANN Maintenance Involves modification necessitated by • Decreasing accuracy • Enhancements System modification falls into two categories involving either data or software. • Data modification steps: • Training data is modified or replaced • Network retrained and re-evaluated.
ANN Maintenance (cont’d) • Software changes include changes in • Interfaces • cooperating programs • the structure of the network. • If the network is changed, part of the design and most of the implementation phase may have to be repeated. • Backup copies should be used for maintenance and research.
A comparison of ANN and ES Similarities between ES and ANN • Both aim to create intelligent computer systems by mimicking human intelligence, although at different levels • Design process of neither ES nor ANN is automatic • Knowledge extraction in ES is a time and labour intensive process • ANNs are capable of learning but selection and preprocessing of data have to be done carefully.
A comparison of ANN and ES (cont’d) Differences between ANN and ES • Differ in aspects of design, operation and use • Logic vs. brain • ES simulate the human reasoning process based on formal logic • ANNs are based on modelling the brain, both in structure and operation • Sequential vs. parallel • The nature of processing in ES is sequential • ANNs are inherently parallel
A comparison of ANN and ES (cont’d) • External and static vs. internal and dynamic • Learning is performed external to the ES • ANN itself is responsible for its knowledge acquisition during the training phase. • Learning is always off-line in ES - knowledge remains static during operation • Learning in ANNs, although mostly off-line, can be on-line • Deductive vs. inductive inferencing • Knowledge in an ES is always used in a deductive reasoning process • An ANN constructs its knowledge base inductively from examples, and uses it to produce decisions through generalisation
A comparison of ANN and ES (cont’d) • Knowledge representation: explicit vs. implicit • ES store knowledge in explicit form, so it is possible to inspect and modify individual rules • ANN knowledge is stored implicitly in the interconnection weight values • Design issues: simple vs. complex • Technical side of ES development is relatively simple, without difficult design choices. • ANN design process is often one of trial and error
A comparison of ANN and ES (cont’d) • User interface: white box vs. black box • ES have explanation capability • Difficulty in interpreting an ANN's knowledge-base effectively makes it a black box to the user • State of maturity and recognition: well-established vs. early • ES already well established as a methodology in commercial applications • ANN recognition and development tools at a relatively early stage.
Hybrid systems • Neuro-symbolic computing utilises the complementary nature of computing in neural networks (numerical) and expert systems (symbolic). • Neuro-fuzzy systems combine neural networks with fuzzy logic • ANNs can also be combined with genetic algorithm methodology Hybrid ES-ANN systems • The strengths of the ES can be utilised to overcome the weaknesses of an ANN based system and vice versa. • For example, the ANN’s extraction of knowledge from data can be combined with the ES’s explanation capability
Hybrid ES-ANN systems • Rule extraction by inference justification in an ANN • MACIE, an ANN based decision support system described in (Gallant 1993) • Extracts a single rule that justifies an inference in an ANN • Inference in an ANN is represented by output of a single node • This output is based upon incomplete input values fed from a number of nodes as shown in the diagram below.
Hybrid ES-ANN systems (cont’d) • A node $u_i$ is defined to be a contributing node to node $u_j$ if $w_{ij} u_i \ge 0$.
Hybrid ES-ANN systems (cont’d) • In this example, the contributing variables are {u2, u3, u5, u6 }. • The rule produced in this example is: IF u6 = Unknown AND u2 = TRUE AND u3 = FALSE AND u5 = TRUE THEN conclude u7 = TRUE.
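The selection of contributing variables can be sketched as follows, assuming the contributing condition is that the product of weight and node value is greater than or equal to zero. That reading is consistent with u6 = Unknown (value 0) being listed as contributing. The weights below are hypothetical, chosen only so that the result matches the example. The coding TRUE = +1, FALSE = -1, Unknown = 0 and the ordering of conditions by weight magnitude are also assumptions. This is not a full implementation of MACIE.

```python
VALUE_NAMES = {1: "TRUE", -1: "FALSE", 0: "Unknown"}

def contributing_nodes(weights, activations):
    """Nodes whose weighted value does not oppose a positive inference."""
    return [n for n in activations if weights[n] * activations[n] >= 0]

def build_rule(weights, activations, target):
    net = sum(weights[n] * activations[n] for n in activations)
    conclusion = "TRUE" if net > 0 else "FALSE"
    # Assumed ordering: largest weight magnitude first.
    nodes = sorted(contributing_nodes(weights, activations),
                   key=lambda n: -abs(weights[n]))
    conditions = " AND ".join(
        f"{n} = {VALUE_NAMES[activations[n]]}" for n in nodes)
    return f"IF {conditions} THEN conclude {target} = {conclusion}"

# Hypothetical weights on the links into u7, and current node values.
weights = {"u1": -2, "u2": 3, "u3": -3, "u4": 1, "u5": 2, "u6": 4}
activations = {"u1": 1, "u2": 1, "u3": -1, "u4": -1, "u5": 1, "u6": 0}

contributing = contributing_nodes(weights, activations)
rule = build_rule(weights, activations, "u7")
```

With these assumed values, u1 and u4 are excluded because their weighted values are negative, leaving the four variables named in the example.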
Hybrid ES-ANN systems (cont’d) • One approach to hybrid systems divides a problem into tasks suitable for either ES or ANN • These tasks are then performed by the appropriate methodology • One example of such a system (Caudill 1991) is an intelligent system for delivering packages • ES performs the task of producing the best loading strategy for packages into trucks • ANN works out the best route for delivering the packages efficiently.
Hybrid ES-ANN systems (cont’d) • Hybrid ES-ANN systems with ANNs embedded within expert systems • ANN used to determine which rule to fire, given the current state of facts. • Another approach to hybrid ES-ANN uses an ANN as a preprocessor • One or more ANNs produce classifications. • Numerical outputs produced by ANN are interpreted symbolically by an ES as facts • ES applies the facts for deductive reasoning
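The preprocessor arrangement can be illustrated with a toy pipeline: numerical ANN outputs are thresholded into symbolic facts, and a small forward-chaining rule engine reasons over them. The 0.5 threshold, the fact names and the rules are all invented for this sketch.

```python
def outputs_to_facts(outputs, threshold=0.5):
    """Interpret each ANN output at or above the threshold as a symbolic fact."""
    return {name for name, value in outputs.items() if value >= threshold}

def forward_chain(facts, rules):
    """Fire rules repeatedly until no new fact can be deduced."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical classifier outputs and rule base.
ann_outputs = {"high_vibration": 0.91, "overheating": 0.78, "low_pressure": 0.12}
rules = [
    (frozenset({"high_vibration", "overheating"}), "bearing_fault"),
    (frozenset({"bearing_fault"}), "schedule_maintenance"),
    (frozenset({"low_pressure"}), "check_pump"),
]
facts = forward_chain(outputs_to_facts(ann_outputs), rules)
```

The expert system never sees the raw numbers, so its rules stay readable and can be inspected or changed independently of the network.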
Case Study Case: Application of ANNs in bankruptcy prediction (Coleman et al, AI Review, Summer 1991, in Zahedi 1993) • Predicts banks that are certain to fail within a year • The prediction is given to bank examiners dealing with the bank in question. • Developed by NeuralWare’s Application Development Services and Support Group (ADSS) • Software used - the NeuralWorks Professional neural network development system. • Uses the standard backpropagation (multilayer perceptron) network.
Case Study (cont’d) • ANN has 11 inputs, each a ratio developed by Peat Marwick. • Inputs connected to a single hidden layer, which in turn is connected to a single node in the output layer. • Network outputs a single value denoting whether the bank would or would not fail within that calendar year • Employed the hyperbolic-tangent transfer function and a proprietary error function created by the ADSS staff. • Trained on a set of 1,000 examples, 900 of which were viable banks and 100 of which were banks that had actually gone bankrupt • Training consisted of about 50,000 iterations of the training set. • Correctly predicted 50% of the viable banks and 99% of the banks that actually failed.
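The architecture described in the case study (11 inputs, one hidden layer, one output node, hyperbolic-tangent transfer function) has the following general shape. The hidden layer size of 6, the random untrained weights and the function names are assumptions, since the source does not give them. The proprietary error function and the training procedure are not reproduced.

```python
import math
import random

N_INPUTS = 11   # one per financial ratio, as stated in the case study
N_HIDDEN = 6    # assumed; the source does not state the hidden layer size

def init_network(seed=0):
    """Create small random weights; the last entry of each row is a bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS + 1)]
              for _ in range(N_HIDDEN)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN + 1)]
    return hidden, output

def forward(network, ratios):
    """Return the single output value in (-1, 1) for 11 input ratios."""
    hidden, output = network
    if len(ratios) != N_INPUTS:
        raise ValueError("expected 11 input ratios")
    h = [math.tanh(sum(w * x for w, x in zip(ws[:-1], ratios)) + ws[-1])
         for ws in hidden]
    return math.tanh(sum(w * a for w, a in zip(output[:-1], h)) + output[-1])

network = init_network()
score = forward(network, [0.5] * N_INPUTS)      # placeholder ratio values
```

With tanh at the output, one natural convention is to read positive values as one class and negative values as the other, but the mapping actually used in the study is not stated in the source.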
REFERENCES • AI Expert (special issue on ANN), June 1990. • BYTE (special issue on ANN), Aug. 1989. • Caudill, M., "The View from Now", AI Expert, June 1992, pp. 27-31. • Dhar, V., & Stein, R., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997. • Kirrmann, H., "Neural Computing: The new gold rush in informatics", IEEE Micro, June 1989, pp. 7-9. • Lippmann, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-21. • Lisboa, P. (Ed.), Neural Networks: Current Applications, Chapman & Hall, 1992. • Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems, Addison-Wesley, 2005.