Explore artificial neural networks for supervised learning, from the basic perceptron to training multi-layer networks. See how ANNs are loosely modeled on human neurons and how they relate to modern AI. Learn about the perceptron algorithm, backpropagation, and handling data that is not linearly separable.
Artificial Neural Networks
• Artificial neural networks are (among other things) another technique for supervised learning, alongside k-Nearest Neighbor and decision trees
• As with those methods, a model is fit to training data and then evaluated on test data for classification
Human neuron
• Dendrites pick up signals from other neurons
• When the signals arriving at the dendrites reach a threshold, a signal is sent down the axon to the synapses
Connection with AI
• Most modern AI takes the view of "systems that act rationally"
• Implementing neurons in a computer is closer to "systems that think like humans"
• Why artificial neural networks, then?
  • They are "universal" function fitters
  • Potential for massive parallelism
  • Some amount of fault tolerance
  • Trainable by inductive learning, like other supervised learning techniques
Perceptron Example
• A perceptron for tumor diagnosis: three input units, # of tumors ($w_1 = -0.1$), avg area ($w_2 = 0.9$), and avg density ($w_3 = 0.1$), feed a single output unit
• The output unit produces 1 = malignant or 0 = benign
The Perceptron: Input Units
• Input units correspond to the features of the original problem
• If a feature is numeric, it is often scaled to lie between $-1$ and $1$
• If a feature is discrete, one input node is often created per category (see the sketch below)
• Alternatively, assign the categories values on a single node, which imposes an ordering on them
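A minimal sketch of these two encodings; the feature ranges and category sets here are made up for illustration:

```python
def scale(x, lo, hi):
    """Rescale a numeric feature from [lo, hi] into [-1, 1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def one_hot(value, categories):
    """One input node per category: 1 for the matching node, 0 elsewhere."""
    return [1.0 if c == value else 0.0 for c in categories]

print(scale(7.5, 0.0, 10.0))                       # 0.5
print(one_hot("green", ["red", "green", "blue"]))  # [0.0, 1.0, 0.0]
```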
The Perceptron: Weights
• Weights represent the importance of each input unit
• Combined with the input units, they feed the output unit
• The output unit receives as input the weighted sum $\sum_{i=1}^{n} w_i x_i$
The Perceptron: Output Unit
• The output unit applies an activation function to its input to decide what the correct output is
• Sample activation function, a step with threshold $t$: output $1$ if $\sum_i w_i x_i > t$, else output $0$
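Putting the pieces together for the tumor example above. The weights come from the example; the input values and the threshold $t = 0.5$ are made-up numbers for illustration:

```python
# Tumor perceptron: weighted sum of inputs, then a step activation.
def step(z, t):
    """Step activation: 1 (malignant) if z exceeds threshold t, else 0 (benign)."""
    return 1 if z > t else 0

weights  = [-0.1, 0.9, 0.1]   # w1: # of tumors, w2: avg area, w3: avg density
features = [2.0, 0.8, 0.3]    # hypothetical (scaled) input values

z = sum(w * x for w, x in zip(weights, features))  # weighted sum = 0.55
print(step(z, t=0.5))                              # 0.55 > 0.5, so output 1
```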
Simplifying the threshold
• Managing a separate threshold is cumbersome
• Incorporate it as a "virtual" weight: add a constant input $x_0 = 1$ with weight $w_0 = -t$, so the test becomes $\sum_{i=0}^{n} w_i x_i > 0$
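The same decision as above with the threshold folded in as a virtual weight:

```python
# Fix x0 = 1, set w0 = -t, and compare the weighted sum against 0 instead of t.
weights  = [-0.5, -0.1, 0.9, 0.1]  # w0 = -t = -0.5, then w1..w3 as before
features = [1.0, 2.0, 0.8, 0.3]    # x0 = 1 is the constant bias input

z = sum(w * x for w, x in zip(weights, features))
print(1 if z > 0 else 0)           # 0.55 - 0.5 = 0.05 > 0, so output 1
```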
How to learn the right weights?
• We need to redefine the perceptron slightly
• The step function is no good for learning: we need something differentiable
• Replace it with a sigmoid approximation
Sigmoid function
• $\sigma(x) = \frac{1}{1 + e^{-bx}}$ is a good, differentiable approximation to the step function
• As $b \to \infty$, the sigmoid approaches the step function
• We'll just take $b = 1$ for simplicity
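A quick numeric illustration of the steepness parameter (the sample points are arbitrary):

```python
import math

# Sigmoid with steepness b: sigma(x) = 1 / (1 + exp(-b*x)).
def sigmoid(x, b=1.0):
    return 1.0 / (1.0 + math.exp(-b * x))

# As b grows, the curve sharpens toward the step function.
for b in (1, 5, 100):
    print(b, [round(sigmoid(x, b), 3) for x in (-1.0, -0.1, 0.1, 1.0)])
# b = 1:   gentle slope, values well inside (0, 1)
# b = 100: essentially 0 or 1 everywhere, like a step
```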
Computing weights
• Think of weight learning as gradient descent: the weights are the variables, and we minimize the error on the training set, e.g. the squared error $E = \frac{1}{2}\sum_d (y_d - o_d)^2$ over training examples $d$
• For a sigmoid unit, $\sigma'(z) = \sigma(z)(1 - \sigma(z))$, which gives the per-example update $w_i \leftarrow w_i + \alpha\,(y - o)\,o\,(1 - o)\,x_i$ (a worked sketch follows)
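A minimal gradient-descent sketch for a single sigmoid unit. The training data (logical AND), learning rate, and epoch count are illustrative choices:

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Gradient descent on E = 1/2 * (y - o)^2 per example, o = sigmoid(sum_i w_i x_i).
# dE/dw_i = -(y - o) * o * (1 - o) * x_i, using sigma'(z) = sigma(z)(1 - sigma(z)).
random.seed(1)
data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]  # AND; x[0] = 1 is the bias
w = [random.uniform(-0.5, 0.5) for _ in range(3)]
alpha = 0.5

for epoch in range(5000):
    for x, y in data:
        o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for i in range(len(w)):
            w[i] += alpha * (y - o) * o * (1 - o) * x[i]

for x, y in data:
    o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    print(x[1:], round(o, 2))  # outputs should approach 0, 0, 0, 1
```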
Can appropriate weights always be found?
• ONLY IF the data is linearly separable, i.e. a single hyperplane can separate the classes
• XOR, for example, is not linearly separable, so no single perceptron can compute it
What if data is not linearly separable? Use a neural network.
• Insert a layer of hidden units $V_j$ between the input units and the output unit $O$
• Each hidden unit is a perceptron
• The output unit is another perceptron that takes the hidden units' outputs as its inputs (a training sketch follows below)
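Below is a minimal sketch of such a two-layer network trained by backpropagation on XOR, the classic non-linearly-separable example. The hidden-layer size, learning rate, and epoch count are made-up choices:

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Two-layer network: n_hidden sigmoid perceptrons feeding one sigmoid output unit.
random.seed(0)
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR
n_hidden = 3
W_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]  # bias + 2 inputs
w_o = [random.uniform(-1, 1) for _ in range(n_hidden + 1)]                  # bias + hidden units
alpha = 0.5

for epoch in range(10000):
    for x, y in data:
        xi = [1] + x                                        # prepend bias input x0 = 1
        v = [sigmoid(sum(w * a for w, a in zip(row, xi))) for row in W_h]
        vo = [1] + v                                        # bias + hidden outputs
        o = sigmoid(sum(w * a for w, a in zip(w_o, vo)))
        delta_o = (y - o) * o * (1 - o)                     # output unit's error term
        for j in range(n_hidden):                           # backpropagate to hidden units
            delta_j = v[j] * (1 - v[j]) * w_o[j + 1] * delta_o
            for i in range(3):
                W_h[j][i] += alpha * delta_j * xi[i]
        for j in range(n_hidden + 1):
            w_o[j] += alpha * delta_o * vo[j]

for x, y in data:
    xi = [1] + x
    v = [sigmoid(sum(w * a for w, a in zip(row, xi))) for row in W_h]
    print(x, round(sigmoid(sum(w * a for w, a in zip(w_o, [1] + v))), 2))
# On most runs the outputs come out near 0, 1, 1, 0; backprop can also get
# stuck in a local minimum, one of the issues raised on the next slide.
```

The hidden units each carve out a hyperplane in the input space, and the output unit combines them, which is how the network gets past the single perceptron's linear-separability limit.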
Neural Networks and machine learning issues
• Neural networks can represent any training set, if enough hidden units are used
• How long do they take to train?
• How much memory do they need?
• Does backprop find the best set of weights?
• How do we deal with overfitting?
• How do we interpret the results?