ARTIFICIAL INTELLIGENCE: THE MAIN IDEAS. OLLI COURSE SCI 102 Tuesdays, 11:00 a.m. – 12:30 p.m. Winter Quarter, 2013 Higher Education Center, Medford Room 226. Nils J. Nilsson. nilsson@cs.stanford.edu http://ai.stanford.edu/~nilsson/. Course Web Page: www.sci102.com/.
ARTIFICIAL INTELLIGENCE: THE MAIN IDEAS OLLI COURSE SCI 102 Tuesdays, 11:00 a.m. – 12:30 p.m. Winter Quarter, 2013 Higher Education Center, Medford Room 226 Nils J. Nilsson nilsson@cs.stanford.edu http://ai.stanford.edu/~nilsson/ Course Web Page: www.sci102.com/ For information about parking near the HEC, go to: http://www.ci.medford.or.us/page.asp?navid=2117 There are links on that page to parking rules and maps.
PART ONE (Continued): REACTIVE AGENTS Perception Action Selection Memory
But Some Are Not Very User-Friendly Fair Isaac Experience
Models of the Cortex Using Deep, Hierarchical Neural Networks All connections are bi-directional
Two Pioneers in Using Networks to Model the Cortex Hierarchical Temporal Memory Jeff Hawkins Geoffrey Hinton
More About Jeff Hawkins’s Ideas http://www.numenta.com/htm-overview/education/HTM_CorticalLearningAlgorithms.pdf
Dileep George’s Hierarchical Temporal Memory (HTM) Model A “Convolutional” Network George is a founder of the startup Vicarious http://vicarious.com/team.html
A “Mini-Column” of the Neo-Cortex From: “HIERARCHICAL TEMPORAL MEMORY” http://www.numenta.com/htm-overview/education/HTM_CorticalLearningAlgorithms.pdf
Figure 10. Columnar organization of the microcircuit. George, Dileep and Hawkins, Jeff (2009) Towards a Mathematical Theory of Cortical Micro-circuits. PLoS Comput Biol 5(10): e1000532. doi:10.1371/journal.pcbi.1000532 http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000532
Figure 9. A laminar biological instantiation of the Bayesian belief propagation equations used in the HTM nodes. George D, Hawkins J (2009) Towards a Mathematical Theory of Cortical Micro-circuits. PLoS Comput Biol 5(10): e1000532. doi:10.1371/journal.pcbi.1000532 http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000532
Letting Networks “Adapt” to Their Inputs All connections are bi-directional Massive number of inputs Weight Values Become Those For Extracting “Features” of Inputs Honglak Lee, et al., “Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations,” Proceedings of the 26th Annual International Conference on Machine Learning, 2009
Hubel & Wiesel’s “Detector Neurons” David Hubel, Torsten Wiesel Short bar of light projected onto a cat’s retina Response of a single neuron in the cat’s visual cortex (as detected by a micro-electrode in the anaesthetized cat)
Use of Deep Networks With Unsupervised Learning All connections are bi-directional First Layer Learns “Building-Block” Features Common to Many Images
Second Layer Learns Features Common Just to Cars, Faces, Motorbikes, and Airplanes cars, faces, motorbikes, airplanes
Third Layer Learns How to Combine the Features of the Second Layer Into a Representation of the Input cars, faces, motorbikes, airplanes
The Net Can Make Predictions About Unseen Parts of the Input
“Building High-level Features Using Large Scale Unsupervised Learning” Quoc V. Le, et al. (Google and Stanford) 1,000 Google Computers, 1,000,000,000 Connections
Large Scale Unsupervised Learning (Continued) Recognizes 22,000 object categories Unsupervised learning for three days 10 million 200x200 pixel images downloaded from the Internet (stills from YouTube) a “cat neuron” a “face neuron”
One Result 81.7% accuracy in detecting faces out of 13,026 faces in a test set For more information about these experiments at Google/Stanford, see: http://research.google.com/archive/unsupervised_icml2012.html
Using Models (i.e., Memory) Can Make Agents Even More Intelligent Perception Action Selection Model of World (e.g., a map)
Types of Models Maps Memory of Previous States List of State-Action Pairs
Models can be pre-installed or learned
Learning and Using Maps where am I? where is everything else? Neato Robot Vacuum
S-R Rules Using “State” of the Agent Perception determines the “state” of the world Action Selection Library of States and Actions (Memory): IF state1, THEN action a; IF state2, THEN action b; . . .
Ways to Represent States: Lists of numbers, such as (1,7,3,4,6); Arrays (shown as a figure on the slide); “Statements,” such as Color(Walls, LightBlue), Shape(Rectangular), . . .
Library of States & Actions: (1,7,3,4,6) → a; (1,6,2,8,7) → b; (4,5,1,8,5) → c; . . . ; (7,4,8,9,2) → k. Input (present state): (1,5,2,8,6). The action is taken from the Closest Match.
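The closest-match lookup on this slide can be sketched in a few lines of Python. This is only an illustration: the slide does not say how "closest" is measured, so the Euclidean distance used here is an assumption, the function names are hypothetical, and the elided rows of the library are left out.

```python
import math

# Library of states and actions as listed on the slide (elided rows omitted).
LIBRARY = [
    ((1, 7, 3, 4, 6), "a"),
    ((1, 6, 2, 8, 7), "b"),
    ((4, 5, 1, 8, 5), "c"),
    ((7, 4, 8, 9, 2), "k"),
]


def distance(s, t):
    """Euclidean distance between two state vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(s, t)))


def select_action(present_state, library=LIBRARY):
    """Return (action, stored_state) for the stored state nearest the input."""
    stored_state, action = min(
        library, key=lambda entry: distance(present_state, entry[0])
    )
    return action, stored_state


action, match = select_action((1, 5, 2, 8, 6))
print(action, match)  # prints: b (1, 6, 2, 8, 7)
```

With the slide's input (1,5,2,8,6), the nearest stored state under this metric is (1,6,2,8,7), so the agent would take action b.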
Example: Face Recognition Using a large database containing many, many images of faces, a small set of “building-block” faces is computed: The average of all faces: http://cognitrn.psych.indiana.edu/nsfgrant/FaceMachine/faceMachine.html
Familiar Uses of “Building Blocks” A Musical Tone Consists of “Harmonics”
Library of Known Faces (Represented as composites of the building-block faces): Sam (2,2,-2,0,0,1,2,2,-1,2,2,-1,0,2,0); Joe (0,0,1,0,0,-2,-2,0,-1,-2,-2,-1,2,-1,0); Sue (-3,2,1,1,-2,1,-2,3,0,0,0,-4,-3,2,-2); Mike (4,1,3,-1,4,0,4,4,1,4,4,-4,4,-4,-4); Plus Thousands More
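To make "composite of the building-block faces" concrete, here is a minimal sketch of how a list of weights could be turned back into an image. It assumes the usual convention that a face is the average face plus a weighted sum of the building blocks; the slides do not spell out the formula, and the 4-pixel "images" and the function name `compose` are invented purely for illustration.

```python
def compose(average, building_blocks, weights):
    """Return average + sum_i weights[i] * building_blocks[i], pixel by pixel."""
    if len(building_blocks) != len(weights):
        raise ValueError("need exactly one weight per building block")
    face = list(average)
    for block, weight in zip(building_blocks, weights):
        for pixel, value in enumerate(block):
            face[pixel] += weight * value
    return face


# Hypothetical 4-pixel "images"; real building-block faces have thousands of pixels.
average_face = [10.0, 10.0, 10.0, 10.0]
blocks = [
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 2.0, 0.0, -2.0],
]

print(compose(average_face, blocks, [2, -1]))  # prints: [12.0, 8.0, 8.0, 12.0]
```

Under this convention, the short weight list (here `[2, -1]`) is all that needs to be stored for each known face, which is why the library entries are just 15 numbers each.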
Face Recognition Library of Known Faces: Sam (2,2,-2,0,0,1,2,2,-1,2,2,-1,0,2,0); Joe (0,0,1,0,0,-2,-2,0,-1,-2,-2,-1,2,-1,0); Sue (-3,2,1,1,-2,1,-2,3,0,0,0,-4,-3,2,-2); Mike (4,1,3,-1,4,0,4,4,1,4,4,-4,4,-4,-4). Query Face (present state), represented as a composite of the building-block faces: (-2,2,1,1,-2,1,-2,3,1,0,0,-4,-3,2,-2). Sue is the Closest Match
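The match on this slide can be checked with a short Python sketch. Two things here are assumptions rather than facts from the slides: the pairing of names with vectors (read off the slide layout) and the use of squared Euclidean distance as the measure of closeness. The helper names are hypothetical.

```python
# Known faces as weight vectors over the building-block faces (pairing assumed).
KNOWN_FACES = {
    "Sam": (2, 2, -2, 0, 0, 1, 2, 2, -1, 2, 2, -1, 0, 2, 0),
    "Joe": (0, 0, 1, 0, 0, -2, -2, 0, -1, -2, -2, -1, 2, -1, 0),
    "Sue": (-3, 2, 1, 1, -2, 1, -2, 3, 0, 0, 0, -4, -3, 2, -2),
    "Mike": (4, 1, 3, -1, 4, 0, 4, 4, 1, 4, 4, -4, 4, -4, -4),
}
QUERY = (-2, 2, 1, 1, -2, 1, -2, 3, 1, 0, 0, -4, -3, 2, -2)


def squared_distance(u, v):
    """Sum of squared differences between two weight vectors (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def rank_matches(query, library):
    """Return (distance, name) pairs sorted from closest to farthest."""
    return sorted((squared_distance(query, vec), name) for name, vec in library.items())


ranking = rank_matches(QUERY, KNOWN_FACES)
print(ranking[0])  # prints: (2, 'Sue')
```

The query differs from Sue's vector in only two positions, by 1 each, which is why she comes out far ahead of the other three under this metric.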
Another Kind of Model: A table of states and actions and “values”
Why have values for multiple actions instead of just noting the best action? Because the values in the table can be changed (learned) depending on experience! REINFORCEMENT LEARNING (Another Point of Contact with Brains)
Pioneers in the Use of Reinforcement Learning in AI Andy Barto Rich Sutton Chris Watkins
But the Mouse Doesn’t Have a Map of the Maze (Like We Do) Instead it remembers the states it visits and assigns their actions random initial values
It Can Change the Values in the Table The First Step: (state1, up) gets initial random value 3
There is only one action possible (up), and the mouse ends up in state2. In state2 there are 3 actions, each with a random initial value
Now the mouse updates the value of (state1, up) in its table 5 value propagates backward (possibly with some loss)
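A single backward update of this kind might look like the sketch below. The slides do not give the exact rule, so this assumes the simplest one: the value of the action just taken is replaced by the best value available in the state it led to, multiplied by a "loss factor". The values for state2 and the names `back_up` and `loss_factor` are invented for illustration.

```python
def back_up(table, state, action, next_state, reward=0.0, loss_factor=0.9):
    """Overwrite table[state][action] with reward + loss_factor * best next value."""
    next_values = table.get(next_state)
    best_next = max(next_values.values()) if next_values else 0.0
    table[state][action] = reward + loss_factor * best_next
    return table[state][action]


# state1 has one action with initial value 3, as on the slide.
# The three values for state2 are hypothetical.
table = {
    "state1": {"up": 3.0},
    "state2": {"left": 2.0, "up": 5.0, "right": 4.0},
}

print(back_up(table, "state1", "up", "state2", loss_factor=1.0))  # no loss
print(back_up(table, "state1", "up", "state2", loss_factor=0.9))  # some loss
```

A loss factor below 1 makes values shrink with distance from wherever they originated, so states closer to a good outcome end up with higher values than states farther away.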
Sooner or later, the mouse stumbles into the goal and gets a “reward”
The reward value is propagated backward 99 value propagates backward (with some loss)
And So On . . . With a Lot of Exploration, the Mouse Learns the Maze
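The whole process can be sketched as a small program. This is not the course's own code: it uses a tabular Q-learning-style update (in the spirit of Watkins's work), and the 3x3 maze, the reward of 100, and every parameter name and value are assumptions chosen for illustration. As on the slides, the mouse has no map; it creates random initial values for a state's actions the first time it visits that state.

```python
import random

# Hypothetical 3x3 maze: S = start, G = goal, # = wall, . = open floor.
MAZE = ["S.#",
        ".##",
        "..G"]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def find(symbol):
    """Return the (row, column) of the first cell containing symbol."""
    for r, row in enumerate(MAZE):
        for c, cell in enumerate(row):
            if cell == symbol:
                return (r, c)
    raise ValueError(f"{symbol!r} not in maze")


def step(state, action):
    """Move one cell; bumping into a wall or the edge leaves the mouse in place."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    inside = 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])
    return (r, c) if inside and MAZE[r][c] != "#" else state


def values_for(table, state, rng):
    """Look up a state's action values, creating random ones on first visit."""
    if state not in table:
        table[state] = {a: rng.random() for a in MOVES}
    return table[state]


def learn(episodes=500, reward=100.0, loss_factor=0.9, rate=0.5, explore=0.2, seed=0):
    """Run many trips through the maze, updating the value table after each move."""
    rng = random.Random(seed)
    start, goal = find("S"), find("G")
    table = {}
    for _ in range(episodes):
        state = start
        for _ in range(100):  # cap on moves per trip
            values = values_for(table, state, rng)
            if rng.random() < explore:
                action = rng.choice(list(MOVES))      # explore at random
            else:
                action = max(values, key=values.get)  # use the best known action
            nxt = step(state, action)
            if nxt == goal:
                target = reward
            else:
                target = loss_factor * max(values_for(table, nxt, rng).values())
            # Move the stored value part of the way toward the target.
            values[action] += rate * (target - values[action])
            state = nxt
            if state == goal:
                break
    return table


def greedy_path(table, limit=20):
    """Follow the highest-valued action from the start until the goal (or limit)."""
    state, goal, path = find("S"), find("G"), []
    while state != goal and len(path) < limit:
        action = max(table[state], key=table[state].get)
        path.append(action)
        state = step(state, action)
    return path


table = learn()
print(greedy_path(table))
```

After training, following the highest-valued action from each state should trace the route from S to G, with values that grow as the mouse gets closer to the reward.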