Neural Networks: An Introduction and Overview Jim Ries NLM Predoctoral Fellow JimR@acm.org 6/13/2000
Introduction • Provide an intuitive feel for what NNs are and the problems for which they are an appropriate tool. • NOT overwhelm you with mathematics. • Caveat: I'm not an NN researcher; just an interested outsider (like most of you).
Topics of Discussion • What are Neural Networks? • Training • History • Alternative Methods • Applications • Conclusions • Questions
What are Neural Nets? • A mechanism for approximating a function, given some sample or "training" data. • A mechanism for classifying, clustering, or recognizing patterns in data. • These two broad applications are essentially the same: a classifier is just a function whose output is a discrete number indicating a class or cluster (see the sketch below).
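A minimal sketch of this equivalence, using a hypothetical stand-in for a trained network: reading the raw outputs gives function approximation, while taking the index of the largest output turns the same network into a classifier.

```python
# Hypothetical stand-in for a trained network mapping one feature to two scores.
def f(x):
    return [1.0 - x, x]

# Function approximation: read the continuous outputs directly.
print(f(0.3))                                   # [0.7, 0.3]

# Classification: the index of the largest output is the class/cluster label.
print(max(range(2), key=lambda i: f(0.3)[i]))   # 0
```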
What are Neural Nets? (cont.) • Rosenblatt's Perceptron: a network of processing elements (PEs). [Diagram: inputs x1 … xn feed processing elements a1 … am, which produce outputs Y1 … Yp]
What are Neural Nets? (cont.) • Additional layer(s) can be added. [Diagram: inputs x1 … xn feed a hidden layer h1 … hm, which feeds processing elements a1 … am and outputs Y1 … Yp]
What are Neural Nets? (cont.) • A "node" (PE) is typically represented as a function. • Simple node functions can be "trained" (updated) quickly to fit a curve to data, but cannot fit complex data well (e.g., linear functions can never approximate quadratics). • With richer node functions (typically Radial Basis Functions), a network can be a Universal Approximator; see the sketch below.
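A minimal sketch of two common node functions, with illustrative weights and inputs: a sigmoid PE computes a weighted sum of its inputs and squashes it, while a radial basis function unit responds most strongly near its center.

```python
import math

def sigmoid_pe(x, w, b):
    # Processing element: weighted sum of inputs passed through a sigmoid.
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-s))

def rbf_pe(x, center, width):
    # Radial basis function unit: response decays with distance from its center.
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-d2 / (2.0 * width ** 2))

print(sigmoid_pe([0.5, -1.0], w=[2.0, 1.0], b=0.1))        # ~0.52
print(rbf_pe([0.5, -1.0], center=[0.0, 0.0], width=1.0))   # ~0.54
```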
Training • With the simple Perceptron model, we can train by adjusting the weights on the inputs whenever the output does not match the training data. • The amount of adjustment made at each training iteration is called the "learning rate". A sketch of this update rule follows.
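A minimal sketch of the perceptron learning rule (all names and data are illustrative): the weights move by learning_rate * error * input whenever the output disagrees with the target.

```python
def step(s):
    return 1 if s >= 0 else 0

def train_perceptron(samples, n_inputs, learning_rate=0.1, epochs=20):
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = step(sum(wi * xi for wi, xi in zip(w, x)) + b)
            error = target - y            # -1, 0, or +1
            w = [wi + learning_rate * error * xi for wi, xi in zip(w, x)]
            b += learning_rate * error
    return w, b

# Learn logical AND (linearly separable, so the perceptron can fit it).
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(data, n_inputs=2)
print([step(sum(wi * xi for wi, xi in zip(w, x)) + b) for x, _ in data])  # [0, 0, 0, 1]
```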
Training (cont.) • With one or more hidden layers, training requires some sort of propagation algorithm. • Backpropagation is commonly used and is an extension of the "Minimum Disturbance Algorithm" (a backpropagation sketch follows the algorithm below):
Training (cont.) Minimum Disturbance Algorithm:
1) Apply an example and propagate the inputs to the output.
2) Count the number of incorrect output units.
3) For the output units, repeat a number of times: select previously unselected units whose activation is closest to zero and change their weights; if this yields fewer errors, keep the new weights, else restore the old ones.
4) Repeat step 3 for all layers.
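A minimal backpropagation sketch for a single hidden layer, assuming sigmoid units and squared error (this illustrates backpropagation itself, not the Minimum Disturbance Algorithm): the output error is propagated backward through the chain rule to update both weight layers.

```python
import math, random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train_backprop(samples, n_in, n_hidden, lr=0.5, epochs=5000, seed=0):
    rnd = random.Random(seed)
    # One weight row per hidden unit; the extra slot in each row is a bias weight.
    w1 = [[rnd.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rnd.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, target in samples:
            xb = x + [1.0]  # append constant bias input
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w2, hb)))
            # Backward pass: delta terms from the chain rule.
            dy = (target - y) * y * (1.0 - y)
            dh = [dy * w2[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            w2 = [w + lr * dy * v for w, v in zip(w2, hb)]
            for j in range(n_hidden):
                w1[j] = [w + lr * dh[j] * v for w, v in zip(w1[j], xb)]
    return w1, w2

def predict(x, w1, w2):
    xb = x + [1.0]
    hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1] + [1.0]
    return sigmoid(sum(w * v for w, v in zip(w2, hb)))

# XOR is not linearly separable, so it needs the hidden layer.
xor = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w1, w2 = train_backprop(xor, n_in=2, n_hidden=4)
print([round(predict(x, w1, w2)) for x, _ in xor])  # should approach [0, 1, 1, 0]
```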
Training (cont.) • Overfitting - the network fits the training data but does not approximate the real-world function. • Ways to avoid overfitting: • Regularization (assumes the real function is "smooth") • Early stopping (see the sketch below) • Curvature-driven methods
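A minimal early-stopping sketch (the callables and parameter names here are illustrative): training halts once the error on held-out validation data stops improving for a set number of epochs, even if the training error is still falling.

```python
def train_with_early_stopping(train_one_epoch, validation_error,
                              max_epochs=1000, patience=10):
    # train_one_epoch(): runs one pass over the training data.
    # validation_error(): returns error on data never used for training.
    best_err = float("inf")
    epochs_since_best = 0
    for _ in range(max_epochs):
        train_one_epoch()
        err = validation_error()
        if err < best_err:
            best_err = err
            epochs_since_best = 0   # a checkpoint of the weights would be saved here
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break               # validation error has stalled: stop training
    return best_err
```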
History • Early 1960's - Rosenblatt's Perceptron (Rosenblatt, F., Principles of Neurodynamics, New York: Spartan Books, 1962). • Late 1960's - Minsky (Minsky, M. and Papert, S., Perceptrons, MIT Press, Cambridge, 1969). • 1970's & early 1980's - largely empty of NN activity, in part due to Minsky and Papert's critique.
History (cont.) • Late 1980's - NNs re-emerge with Rumelhart and McClelland (Rumelhart, D., McClelland, J., Parallel and Distributed Processing, MIT Press, Cambridge, 1988). • Since PDP there has been an explosion of NN literature.
Alternative Methods • Classical statistical methods • Fail in on-line (incremental) scenarios • Are not universal approximators (e.g., linear regression) • Often assume a normal distribution • The symbolic approach • Expert Systems • Mathematical Logic (e.g., Prolog) • Schemas, Frames, or Scripts
Alternative Methods (cont.) • NNs are the "Connectionist" approach. • Encoding the data for a network can be a "creative" endeavor. • Ensemble approaches • Bayesian Networks • Fuzzy NNs
Applications • Control • Forecasting • Fast approximation of exact algorithms (e.g., NeuroBlast) • Compression • Cognitive Modeling
Conclusions • NNs are useful for a wide variety of tasks, but care must be taken to choose the correct algorithms for a given problem domain. • NNs are not a panacea, and other approaches may be more appropriate for particular problems.