CS 4700: Foundations of Artificial Intelligence Prof. Carla P. Gomes gomes@cs.cornell.edu Module: Neural Networks Expressiveness of Perceptrons (Reading: Chapter 20.5)
Expressiveness of Perceptrons. What hypothesis space can a perceptron represent? It can represent basic Boolean functions such as AND, OR, and NOT, and even more complex Boolean functions such as the majority function. But can it represent any arbitrary Boolean function?
Expressiveness of Perceptrons. A threshold perceptron returns 1 iff the weighted sum of its inputs (including the bias term) is positive, i.e., iff w0 + w1*x1 + … + wn*xn > 0. In other words, it returns 1 iff the input lies on one side of the hyperplane this sum defines. Perceptron = linear separator: a linear discriminant function, or linear decision surface. The weights determine the slope and the bias determines the offset.
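As a concrete illustration (a minimal sketch, not code from the course materials; the weight values are made up for the example), the decision rule above can be written in a few lines of Python:

```python
# Minimal sketch of a threshold perceptron as described above.
# weights[0] plays the role of the bias, paired with a fixed dummy input x0 = 1.

def perceptron_output(weights, inputs):
    """Return 1 iff the weighted sum of the inputs (bias included) is positive."""
    total = weights[0]                      # bias term: w0 * x0 with x0 = 1
    for w, x in zip(weights[1:], inputs):
        total += w * x
    return 1 if total > 0 else 0

# Example with hand-picked (illustrative, not learned) weights:
print(perceptron_output([-0.5, 1.0, 1.0], [0, 1]))   # -> 1
```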
Linear Separability. Consider an example with two inputs, x1 and x2. A trained network can be viewed as defining a "separation line" in the (x1, x2) plane. What is its equation? (Figure: points labeled + plotted in the x1, x2 plane, with a line separating the two classes; perceptron used for classification.)
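A worked answer to the question above (the derivation is not spelled out on the slide, but it follows directly from the threshold condition, writing the bias as w0): the boundary is where the weighted sum equals zero, w0 + w1*x1 + w2*x2 = 0, i.e., x2 = -(w1/w2)*x1 - w0/w2 (assuming w2 ≠ 0). So the weights set the slope of the separation line and the bias sets its offset.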
Linear Separability: OR. (Figure: the four input points in the (x1, x2) plane; only (0,0) is labeled -, the other three are +. Can a single line separate them?)
Linear Separability: AND. (Figure: only (1,1) is labeled +, the other three points are -. Can a single line separate them?)
Linear Separability: XOR. (Figure: (0,1) and (1,0) are labeled +, while (0,0) and (1,1) are labeled -. Can a single line separate them?)
Linear Separability: XOR. Not linearly separable. (Figure: no single line can split the + points (0,1), (1,0) from the - points (0,0), (1,1).) Minsky & Papert (1969). Bad news: perceptrons can only represent linearly separable functions.
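To make "linearly separable" concrete, here is a small sketch (hand-chosen weights, my own illustrative values rather than anything from the slides) showing that a single threshold unit suffices for OR and for AND:

```python
# OR and AND are linearly separable, so one threshold unit each is enough.
# The weights below are hand-chosen illustrative values.

def threshold_unit(w0, w1, w2, x1, x2):
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        or_out  = threshold_unit(-0.5, 1, 1, x1, x2)   # fires iff x1 + x2 > 0.5
        and_out = threshold_unit(-1.5, 1, 1, x1, x2)   # fires iff x1 + x2 > 1.5
        print(x1, x2, "OR:", or_out, "AND:", and_out)
```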
Linear Separability: XOR. Consider a threshold perceptron for the logical XOR function (two inputs), with output 1 iff w1*x1 + w2*x2 > T. Our examples are:
(1) x1=0, x2=0, label 0
(2) x1=1, x2=0, label 1
(3) x1=0, x2=1, label 1
(4) x1=1, x2=1, label 0
Given these examples, the perceptron must satisfy the following inequalities:
From (1): 0 + 0 ≤ T, so T ≥ 0
From (2): w1 + 0 > T, so w1 > T
From (3): 0 + w2 > T, so w2 > T
From (4): w1 + w2 ≤ T
But adding (2) and (3) gives w1 + w2 > 2T ≥ T (since T ≥ 0), which contradicts (4). So XOR is not linearly separable.
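The argument above can also be checked mechanically. The sketch below (my own verification, not part of the slides) searches a coarse grid of weights and thresholds: OR and AND find separating parameters on the grid, XOR does not:

```python
# Brute-force check: is there any (w1, w2, T) on a coarse grid such that
# "output 1 iff w1*x1 + w2*x2 > T" reproduces the given Boolean function?

import itertools

def separable(target):
    grid = [v / 4 for v in range(-8, 9)]            # candidate values -2.0 .. 2.0
    for w1, w2, T in itertools.product(grid, repeat=3):
        if all((1 if w1 * x1 + w2 * x2 > T else 0) == target(x1, x2)
               for x1 in (0, 1) for x2 in (0, 1)):
            return True
    return False

print("OR separable: ", separable(lambda a, b: a | b))   # True
print("AND separable:", separable(lambda a, b: a & b))   # True
print("XOR separable:", separable(lambda a, b: a ^ b))   # False on this grid
```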
Convergence of Perceptron Learning Algorithm. The perceptron learning algorithm converges to a consistent function, if…
• … the training data are linearly separable
• … the step size is sufficiently small
• … there are no "hidden" units
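The learning rule itself is not restated on this slide; the sketch below shows the standard perceptron update, w <- w + alpha * (target - prediction) * input, trained on the (linearly separable) OR function. The step size and epoch count are arbitrary choices of mine:

```python
# Sketch of the standard perceptron learning rule:
# for each example, nudge each weight by alpha * (target - prediction) * input.

def train_perceptron(examples, alpha=0.1, epochs=100):
    w = [0.0, 0.0, 0.0]                       # [bias, w1, w2]
    for _ in range(epochs):
        for (x1, x2), target in examples:
            pred = 1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0
            err = target - pred
            w[0] += alpha * err               # bias input is fixed at 1
            w[1] += alpha * err * x1
            w[2] += alpha * err * x2
    return w

# OR is linearly separable, so the algorithm converges to a consistent function.
or_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(or_examples))
```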
The perceptron learns the majority function easily; decision-tree learning (DTL) is hopeless at it.
DTL learns the restaurant function easily; the perceptron cannot represent it.
Good news: adding a hidden layer allows more target functions to be represented. Minsky & Papert (1969)
Multi-layer Perceptrons (MLPs) • Single-layer perceptrons can only represent linear decision surfaces. • Multi-layer perceptrons can represent non-linear decision surfaces.
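For example (hand-set weights, my own construction rather than a figure from the slides), one hidden layer of two threshold units is enough to represent XOR:

```python
# XOR represented as (x1 OR x2) AND NOT (x1 AND x2), using two hidden threshold units.
# All weights below are chosen by hand for illustration.

def step(s):
    return 1 if s > 0 else 0

def xor_mlp(x1, x2):
    h1 = step(x1 + x2 - 0.5)          # hidden unit 1: OR
    h2 = step(x1 + x2 - 1.5)          # hidden unit 2: AND
    return step(h1 - h2 - 0.5)        # output: OR and not AND

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_mlp(x1, x2))   # 0 0->0, 0 1->1, 1 0->1, 1 1->0
```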
Bad news: no algorithm for learning in multi-layered networks, and no convergence theorem, was known in 1969! Minsky & Papert (1969): “[The perceptron] has many features to attract attention: its linearity; its intriguing learning theorem; its clear paradigmatic simplicity as a kind of parallel computation. There is no reason to suppose that any of these virtues carry over to the many-layered version. Nevertheless, we consider it to be an important research problem to elucidate (or reject) our intuitive judgment that the extension is sterile.” Minsky & Papert (1969) pricked the neural network balloon… they almost killed the field. Rumors say these results may have killed Rosenblatt…. The "winter" of neural networks, 1969-1986.
Two major problems they saw were:
• How can the learning algorithm apportion credit (or blame) for an incorrect classification to individual weights, when there may be a (sometimes very) large number of them?
• How can such a network learn useful higher-order features?
The “Bible” (1986). Good news: successful credit-apportionment learning algorithms were developed soon afterwards (e.g., back-propagation). Still successful, in spite of the lack of a convergence theorem.
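For concreteness, here is a minimal back-propagation sketch (my own illustration of the credit-apportionment idea, not code from the book or the slides; the architecture, learning rate, and initialization are arbitrary choices): a one-hidden-layer sigmoid network trained on XOR.

```python
# Minimal back-propagation sketch: squared-error gradient descent on a
# one-hidden-layer sigmoid network, trained on the XOR examples.

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

alpha = 0.5
for _ in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: apportion error ("credit") to each weight via the chain rule
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= alpha * h.T @ d_out
    b2 -= alpha * d_out.sum(axis=0)
    W1 -= alpha * X.T @ d_h
    b1 -= alpha * d_h.sum(axis=0)

print(np.round(out, 2))   # typically close to the XOR targets [[0], [1], [1], [0]]
```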