Explore the functioning of artificial neural networks in comparison to real neurons and brains. Learn about recognition tasks, network architectures, and information representation in the brain, shedding light on the complex interplay of neurons.
Lecture 25. Artificial Intelligence: Recognition Tasks (S&G, §13.4)
The Cognitive Inversion
• Computers can do some things very well that are difficult for people
  • e.g., arithmetic calculations
  • playing chess & other board games
  • doing proofs in formal logic & mathematics
  • handling large amounts of data precisely
• But computers are very bad at some things that are easy for people (and even some animals)
  • e.g., face recognition & general object recognition
  • autonomous locomotion
  • sensory-motor coordination
• Conclusion: brains work very differently from Von Neumann computers
The 100-Step Rule
• Typical recognition tasks take less than one second
• Neurons take several milliseconds to fire
• Therefore there can be at most about 100 sequential processing steps
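A rough version of the arithmetic behind the rule (the exact timings here are illustrative assumptions): with about one second available for the task and roughly 10 milliseconds per neuron firing, 1000 ms ÷ 10 ms ≈ 100 sequential steps.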
Two Different Approaches to Computing: Von Neumann computation is narrow but deep; neural computation is shallow but wide.
How Wide? Retina: 100 million receptors; optic nerve: one million nerve fibers.
A Very Brief Tour of Real Neurons (and Real Brains)
Left Hemisphere
Typical Neuron
Grey Matter vs. White Matter (fig. from Carter 1998)
Neural Density in Cortex
• 148 000 neurons / sq. mm
• Hence, about 15 million / sq. cm
Relative Cortical Areas
Information Representation in the Brain
Macaque Visual System (fig. from Van Essen et al. 1992)
Hierarchy of Macaque Visual Areas (fig. from Van Essen et al. 1992)
Bat Auditory Cortex (figs. from Suga 1985)
Real Neurons
Typical Neuron
Dendritic Trees of Some Neurons (fig. from Truex & Carpenter 1964)
• inferior olivary nucleus
• granule cell of cerebellar cortex
• small cell of reticular formation
• small gelatinosa cell of spinal trigeminal nucleus
• ovoid cell, nucleus of tractus solitarius
• large cell of reticular formation
• spindle-shaped cell, substantia gelatinosa of spinal cord
• large cell of spinal trigeminal nucleus
• putamen of lenticular nucleus
• double pyramidal cell, Ammon’s horn of hippocampal cortex
• thalamic nucleus
• globus pallidus of lenticular nucleus
Layers and Minicolumns (fig. from Arbib 1995, p. 270)
Neural Networks in Visual System of Frog (fig. from Arbib 1995, p. 1039)
Frequency Coding (fig. from Anderson, Intr. Neur. Nets)
Variations in Spiking Behavior
Chemical Synapse (fig. from Anderson, Intr. Neur. Nets)
• Action potential arrives at synapse
• Ca ions enter cell
• Vesicles move to membrane, release neurotransmitter
• Transmitter crosses cleft, causes postsynaptic voltage change
Neurons are Not Logic Gates
• Speed
  • electronic logic gates are very fast (nanoseconds)
  • neurons are comparatively slow (milliseconds)
• Precision
  • logic gates are highly reliable digital devices
  • neurons are imprecise analog devices
• Connections
  • logic gates have few inputs (usually 1 to 3)
  • many neurons have >100 000 inputs
Artificial Neural Networks
Typical Artificial Neuron (figure): inputs, connection weights, threshold, output
Typical Artificial Neuron (figure): net input, summation, activation function
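Read together, the two figures describe a simple threshold unit. In symbols (the notation is assumed here for illustration, not taken from the slides): the summation computes the net input as the weighted sum w1·x1 + w2·x2 + … + wn·xn of the inputs, and the activation function compares it with the threshold θ, giving output 1 when the net input exceeds θ and 0 otherwise.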
Operation of Artificial Neuron (1): inputs (0, 1, 1), weights (2, –1, 4), threshold 2. Net input = 0×2 + 1×(–1) + 1×4 = 3; since 3 > 2, the output is 1.
Operation of Artificial Neuron (2): inputs (1, 1, 0), weights (2, –1, 4), threshold 2. Net input = 1×2 + 1×(–1) + 0×4 = 1; since 1 < 2, the output is 0.
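A minimal Python sketch of the threshold unit worked through in the two examples above (the function name is an assumption; the weights and threshold are the ones used on the slides):

```python
def threshold_neuron(inputs, weights, threshold):
    """Output 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net > threshold else 0

weights = [2, -1, 4]

print(threshold_neuron([0, 1, 1], weights, 2))  # net = 0 - 1 + 4 = 3 > 2, so output 1
print(threshold_neuron([1, 1, 0], weights, 2))  # net = 2 - 1 + 0 = 1 < 2, so output 0
```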
Feedforward Network (figure): an input layer, one or more hidden layers, and an output layer.
Feedforward Network In Operation (1-8) (figures): the input is presented to the input layer, activation propagates forward through the hidden layers one layer at a time, and the output layer finally produces the network's output.
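A minimal sketch of the forward propagation animated in slides (1)-(8): activation enters at the input layer and is pushed through each layer of connection weights in turn until the output layer responds. The layer sizes, random weights, and sigmoid activation function are illustrative assumptions, not values from the lecture:

```python
import math
import random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def forward(inputs, layers):
    """Propagate an input vector through a list of weight matrices, one per layer."""
    activation = inputs
    for weights in layers:                            # one step per layer of connections
        activation = [sigmoid(sum(w * a for w, a in zip(row, activation)))
                      for row in weights]             # each row of weights feeds one neuron
    return activation

random.seed(0)
# 3 inputs -> 4 hidden neurons -> 2 outputs, with random illustrative weights
layers = [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)],
          [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]]
print(forward([0.0, 1.0, 1.0], layers))
```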
Comparison with Non-Neural Net Approaches
• Non-NN approaches typically decide output from a small number of dominant factors
• NNs typically look at a large number of factors, each of which weakly influences output
• NNs permit:
  • subtle discriminations
  • holistic judgments
  • context sensitivity
Connectionist Architectures
• The knowledge is implicit in the connection weights between the neurons
• Items of knowledge are not stored in dedicated memory locations, as in a Von Neumann computer
• “Holographic” knowledge representation:
  • each knowledge item is distributed over many connections
  • each connection encodes many knowledge items
• Memory & processing is robust in the face of damage, errors, inaccuracy, noise, …
Differences from Digital Calculation
• Information represented in continuous images (rather than language-like structures)
• Information processing by continuous image processing (rather than explicit rules applied in individual steps)
• Indefiniteness is inevitable (rather than definiteness assumed)
Supervised Learning
• Produce desired outputs for training inputs
• Generalize reasonably & appropriately to other inputs
• Good example: pattern recognition
• Neural nets are trained rather than programmed
  • another difference from Von Neumann computation
Learning for Output Neuron (1): inputs (0, 1, 1), weights (2, –1, 4), threshold 2; desired output 1. Net input = 3 > 2, so the actual output is 1, which matches the desired output. No change!
Learning for Output Neuron (2): same inputs and weights, but desired output 0. Net input = 3 > 2 gives output 1, so the output is too large. The weights on the active inputs are reduced (–1 becomes –1.5 and 4 becomes 3.25); the net input is now 1.75 < 2 and the output becomes 0.
Learning for Output Neuron (3): inputs (1, 1, 0), weights (2, –1, 4), threshold 2; desired output 0. Net input = 1 < 2, so the actual output is 0, which matches the desired output. No change!
Learning for Output Neuron (4): same inputs and weights, but desired output 1. Net input = 1 < 2 gives output 0, so the output is too small. The weights on the active inputs are increased (2 becomes 2.75 and –1 becomes –0.5); the net input is now 2.25 > 2 and the output becomes 1.
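A minimal sketch of this kind of weight adjustment, written as the standard perceptron-style rule: leave the weights alone when the output is correct, decrease the weights on active inputs when the output is too large, and increase them when it is too small. The learning rate of 0.5 is an assumption; the slides above adjust individual weights by slightly different amounts:

```python
def train_step(inputs, weights, threshold, desired, rate=0.5):
    """One supervised update of a threshold neuron's weights."""
    net = sum(x * w for x, w in zip(inputs, weights))
    output = 1 if net > threshold else 0
    error = desired - output          # +1: output too small, -1: too large, 0: correct
    if error != 0:
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights, output

# Example (2) above: inputs (0, 1, 1), desired output 0, but the neuron fires.
weights = [2.0, -1.0, 4.0]
for _ in range(3):
    weights, output = train_step([0, 1, 1], weights, threshold=2, desired=0)
print(weights, output)   # the weights on the two active inputs are reduced until the unit turns off
```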
Credit Assignment Problem (figure): the weights feeding the output layer can be corrected from the desired output, but how do we adjust the weights of the hidden layers?
Back-Propagation: Forward Pass (figure): the input is propagated forward through the input, hidden, and output layers while the desired output is held for comparison.
Back-Propagation: Actual vs. Desired Output (figure): the actual output produced by the forward pass is compared with the desired output.
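A minimal scalar sketch of the two back-propagation slides above, using the smallest possible network (one input, one hidden unit, one output): run the forward pass, then compare the actual output with the desired output. The weights and the sigmoid activation are illustrative assumptions, not values from the lecture:

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

x = 1.0                            # input presented to the network
w_hidden, w_output = 0.6, -0.4     # assumed connection weights
desired = 1.0                      # desired (training) output

hidden = sigmoid(w_hidden * x)         # forward pass: input layer -> hidden unit
actual = sigmoid(w_output * hidden)    # forward pass: hidden unit -> output
error = desired - actual               # actual vs. desired output
print(f"hidden={hidden:.3f}  actual={actual:.3f}  error={error:.3f}")
```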