An Unsupervised Connectionist Model of Rule Emergence in Category Learning
Rosemary Cowell & Robert French
LEAD-CNRS, Dijon, France
EC FP6 NEST Grant 516542
Figure: No. of features attended to. Labels: “Eureka moment”, “No Eureka moment”, Rule.
Goal: To develop an unsupervised learning system from which simple rules emerge.
• Young infants do not receive “rule instruction” for category learning.
• Animals have even less rule instruction than human infants.
• We are not claiming that ALL rule learning occurs in this manner, but rather that some, especially for young infants, does.
Kohonen Network
Diagram: a new object with features f1, f2, f3 is connected to three category nodes (Category A, Category B, Category C) by the weights w11 to w33.
Is (f1, f2, f3) closest to (w11, w21, w31), or (w12, w22, w32), or (w13, w23, w33)?
If (w13, w23, w33) is the winner, we modify it to be a little closer to (f1, f2, f3) than before.
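The winner-selection and update step described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Euclidean distance measure, the learning rate of 0.1, the numeric values, and the function names `closest_node` and `move_winner` are all assumptions made for the example.

```python
import math

def closest_node(features, weights):
    """Return the index of the weight vector nearest to the input.
    Euclidean distance is an assumption; the slides only say "closest"."""
    distances = [math.dist(features, w) for w in weights]
    return distances.index(min(distances))

def move_winner(features, weights, learning_rate=0.1):
    """Find the winning node and move its weights a little toward the input."""
    winner = closest_node(features, weights)
    weights[winner] = [w + learning_rate * (f - w)
                       for f, w in zip(features, weights[winner])]
    return winner

# Hypothetical weight vectors for Category A, B and C (made-up values).
weights = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]
new_object = [0.9, 1.0, 0.8]          # (f1, f2, f3)

winner = move_winner(new_object, weights)
print(winner, weights[winner])        # winning index and its updated weights
```

Only the winner's weight vector changes here; the other two are left untouched, which matches the "if this is the winner, we modify it" wording of the slide.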
Figure sequence: data plotted as Life expectancy against No. of calories consumed per day, with the weight vectors (w11, w21, w31), (w12, w22, w32), (w13, w23, w33).
The new weight vectors are now the centroids of each category.
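One way to see why the weight vectors can end up at the category centroids: if the winner is moved toward each input with a learning rate of 1/n, where n is the number of inputs that node has won so far, its weight vector is exactly the running mean of those inputs. The sketch below assumes that learning-rate schedule, two-dimensional inputs, and made-up unitless data points; none of these come from the slides.

```python
import math

def train_to_centroids(points, weights):
    """Winner-take-all updates with a 1/n learning rate (an assumption).
    Each weight vector becomes the mean of the points it has won."""
    counts = [0] * len(weights)
    for p in points:
        dists = [math.dist(p, w) for w in weights]
        k = dists.index(min(dists))          # index of the winning node
        counts[k] += 1
        rate = 1.0 / counts[k]
        weights[k] = [w + rate * (x - w) for x, w in zip(p, weights[k])]
    return weights

# Made-up points forming two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (3.0, 3.0), (3.2, 2.8), (2.8, 3.2)]
weights = train_to_centroids(points, [[0.0, 0.0], [4.0, 4.0]])
print(weights)   # each weight vector sits at the mean of its cluster
```

With a fixed learning rate instead of 1/n, the weights would hover near the centroids rather than land on them exactly.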
Let’s see how we could extract rules from this kind of system.
A first idea: a competitive learning network. A copy of the Kohonen network watches the weights of the original Kohonen network.
Category-Determining Features
A category-determining feature is one that is sufficient to categorize an object, e.g., if f3 is present, then the object is in Category C and not in Categories A or B. In other words, if the object is in Category C, f3 will match w33, but will not match w31 or w32.
Category-Irrelevant Features
A category-irrelevant feature is one that is not sufficient to categorize an object because it is shared by objects in two or more categories, e.g., if f3 is present, then the object may be in Category A or C. In other words, f3 matches both w31 and w33.
A concrete example: a Cat-Bird-Bat categorizer
Category nodes: Bird, Bat, Cat. Input stimulus features: eyes, wings, beak.
A category-determining feature is one that is sufficient to categorize an object, e.g., if ‘beak’, then ‘bird’ and not ‘bat’ or ‘cat’.
A category-irrelevant feature is one that is not sufficient to categorize an object because it is shared by two or more categories, e.g., if ‘wings’, then ‘bird’ or ‘bat’.
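The two definitions can be made concrete with a small sketch over the cat-bird-bat example. The binary weight table, the 0.5 threshold, and the function name `classify_feature` are hypothetical and chosen only to mirror the slides' examples (beak for birds only, wings for birds and bats, eyes for all three).

```python
# Hypothetical weights: 1 means the feature is typical of that category.
weights = {
    "eyes":  {"bird": 1, "bat": 1, "cat": 1},
    "wings": {"bird": 1, "bat": 1, "cat": 0},
    "beak":  {"bird": 1, "bat": 0, "cat": 0},
}

def classify_feature(category_weights, threshold=0.5):
    """Label a feature by how many categories it matches.
    Exactly one match: category-determining. Otherwise it is not
    sufficient to pick a category, so it is labelled category-irrelevant."""
    matches = [c for c, w in category_weights.items() if w > threshold]
    if len(matches) == 1:
        return "category-determining", matches
    return "category-irrelevant", matches

for feature, row in weights.items():
    print(feature, classify_feature(row))
```

Here the labels are read directly off a known weight table; the point of the model in the slides is to discover them without such a table being handed over.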
Diagram: each feature (eyes, wings, beak) has a weight to each category node (Bat, Bird, Cat).
How do we isolate the category-determining features from category-irrelevant features?
One answer: competition between the weights in a separate network, the Rule Network, that is a copy of the Kohonen network.
The Rule Network
The Rule Network is a separate competitive network with a weight configuration that matches that of the Kohonen network. It “watches” the Kohonen Network to find rule-determining features. How?
Diagram: each feature (eyes, wings, beak) has a weight to each category node (Bat, Bird, Cat).
We consider the weights coming from each feature to be in competition.
The results of the competition: some weights in the Rule Network have been pushed down by mutual competition, while one weight has won the competition and is now much stronger.
The network has found a category-determining feature for birds.
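The slides do not give the competition rule, so the following is only one possible sketch of what "mutual competition" among the weights leaving a single feature could look like. The subtractive inhibition rule, the inhibition strength of 0.5, the three iterations, the starting weights, and the function name `compete` are all assumptions.

```python
def compete(row, inhibition=0.5, steps=3):
    """Assumed rule: each weight is reduced in proportion to the summed
    strength of the rival weights leaving the same feature (floored at 0)."""
    weights = dict(row)
    for _ in range(steps):
        total = sum(weights.values())
        weights = {c: max(0.0, w - inhibition * (total - w))
                   for c, w in weights.items()}
    return weights

# Hypothetical starting weights copied from a trained Kohonen network.
rule_weights = {
    "eyes":  {"bird": 1.0, "bat": 1.0, "cat": 1.0},
    "wings": {"bird": 1.0, "bat": 1.0, "cat": 0.0},
    "beak":  {"bird": 1.0, "bat": 0.0, "cat": 0.0},
}

after = {feature: compete(row) for feature, row in rule_weights.items()}
for feature, row in after.items():
    print(feature, row)
```

Under this rule the beak-to-bird weight has no rivals and keeps its full strength, while the shared eyes and wings weights suppress one another. Note that the surviving weight ends up stronger only relative to the others; this sketch does not amplify it.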
Yes, but in a biologically plausible model, what could it possibly mean for “synaptic weights to be in competition”?
Using noise, we will implement a mechanism that is equivalent to weight competition.
Revised architecture of the model, without weight competition
The activations in the primary part of the network are echoed in the extension layers.
Kohonen Network: forms category representations on the basis of perceptual similarity.
Extension layers: extract rules, by discovering which of the stimuli’s features are sufficient to determine category membership.
The neurobiologically plausible Kohonen Network
Kohonen Network: a spreading activation, biologically plausible implementation
Diagram: the input stimulus (f1, f2, f3) is connected to three output nodes (Category A, Category B, Category C) by the weights w11 to w33.
Is the Cat A node most active? (f1·w11 + f2·w21 + f3·w31)
Or, is the Cat B node most active? (f1·w12 + f2·w22 + f3·w32)
Or, is the Cat C node most active? (f1·w13 + f2·w23 + f3·w33)
If (f1·w12 + f2·w22 + f3·w32) is the largest, the Cat B node is the winner.
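The activation comparison above amounts to a weighted sum per output node followed by picking the largest. A minimal sketch, with made-up weight and stimulus values and hypothetical function names `activations` and `pick_winner`:

```python
def activations(features, weights):
    """weights[i][j] connects feature i to output node j (w11 is weights[0][0]).
    Each node's activation is the sum of feature * weight over its inputs."""
    n_nodes = len(weights[0])
    return [sum(f * weights[i][j] for i, f in enumerate(features))
            for j in range(n_nodes)]

def pick_winner(features, weights):
    """Return the index of the most active output node and all activations."""
    acts = activations(features, weights)
    return acts.index(max(acts)), acts

# Made-up weights: rows are features f1..f3, columns are Cat A, B, C.
weights = [
    [0.9, 0.1, 0.2],   # w11 w12 w13
    [0.1, 0.8, 0.3],   # w21 w22 w23
    [0.2, 0.7, 0.9],   # w31 w32 w33
]
stimulus = [0.0, 1.0, 1.0]             # (f1, f2, f3)

index, acts = pick_winner(stimulus, weights)
print("ABC"[index], acts)              # the winning category and the activations
```

With these values the Cat B sum is the largest, so node B wins, as in the slide's example.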
Diagram: inputs f1, f2, f3 feeding a row of output nodes.
• The winner activates itself highly
• … activates its near neighbours a little
• … inhibits distant nodes
• Learning is Hebbian: it depends on the activation of the sending and receiving nodes.
• Next time a similar stimulus is presented, the same output node wins.
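The bullets above can be sketched as a neighbourhood activation profile followed by a Hebbian weight change. The one-dimensional row of output nodes, the activation values (1.0, 0.3, -0.1), the neighbourhood radius, the learning rate, and both function names are assumptions made for illustration.

```python
def output_activations(winner, n_nodes, near=1):
    """Assumed profile: winner highly active, near neighbours a little,
    distant nodes inhibited (negative activation)."""
    acts = []
    for j in range(n_nodes):
        distance = abs(j - winner)
        if distance == 0:
            acts.append(1.0)
        elif distance <= near:
            acts.append(0.3)
        else:
            acts.append(-0.1)
    return acts

def hebbian_update(features, weights, out_acts, rate=0.1):
    """Hebbian rule: the change in weights[i][j] is proportional to the
    product of sending activation (feature i) and receiving activation (node j)."""
    return [[w + rate * f * a for w, a in zip(row, out_acts)]
            for f, row in zip(features, weights)]

features = [1.0, 0.0]                       # only the first input is active
weights = [[0.5] * 5, [0.5] * 5]            # 2 inputs x 5 output nodes
acts = output_activations(winner=2, n_nodes=5)
new_weights = hebbian_update(features, weights, acts)
print(new_weights[0])                       # weights from the active input
```

The weight from the active input to the winner grows the most, which is why a similar stimulus tends to activate the same output node next time; weights from the inactive input do not change.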
The Rule Units are a separate layer of nodes whose activation echoes that in the Kohonen network. The Rule Units learn to map input stimuli onto category representations using only rule-determining features. How?
Diagram labels: categories, features.