Chapter 9 Generalization, Discrimination, and the Representation of Similarity
9.1 Behavioral Processes • When Similar Stimuli Predict Similar Consequences • When Similar Stimuli Predict Different Consequences • Unsolved Mysteries—Why Are Some Feature Pairs Easier to Discriminate Between Than Others? • When Dissimilar Stimuli Predict the Same Consequence • Learning and Memory in Everyday Life— Discrimination and Stereotypes in Generalizing about Other People
Generalization and Discrimination • Generalization—transfer of past learning to new situations/problems. • Responding to one stimulus (S) as a result of training with another; influenced by similarity to the training stimulus. • Specificity—deciding how narrowly a rule applies. • Generality—deciding how broadly a rule applies. • Discrimination—recognition of differences between stimuli.
When Similar Stimuli Predict Similar Consequences • Generalization gradient—graph showing how physical changes in stimuli correspond to behavioral response changes. • In Guttman and Kalish study: • Pigeons learned to peck a yellow light (training S) for food. • Gradient shows how often they subsequently pecked different color shades (Fig 9.1). • Gradient width illustrates level of S generalization.
(Fig 9.1) Stimulus Generalization Gradients in Pigeons Adapted from Guttman & Kalish, 1956, pp. 79–88.
What Causes Generalization Gradients? • Is it discrimination error? • Is it logical inference about shared consequences? • Shepard (1987): identify regions of shared consequence; assume all possible regions, small and large; average probabilistically over all of them. • Result: the standard exponentially declining gradient. • Shepard argues: “View exp-declining gradients as representing attempt to predict, based on past experience, how likely it is that what is true about the consequences of one stimulus will also be true of other similar stimuli.”
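The averaging step in Shepard's argument can be checked numerically. The sketch below is only an illustration under assumptions that are not on the slide: stimuli lie on a single dimension, every consequential region is an interval that contains the training stimulus, and interval sizes are weighted by an Erlang density (`rate**2 * s * exp(-rate * s)`). The function names and parameters are hypothetical. Under those assumptions the average works out to `exp(-rate * distance)`, which is the exponentially declining shape.

```python
import math

def overlap(distance, size):
    """Chance that a point `distance` away lies inside an interval of length
    `size` that is known to contain the training stimulus (the interval's
    position is assumed uniform)."""
    return max(0.0, 1.0 - distance / size)

def generalization(distance, rate=1.0, max_size=60.0, steps=60000):
    """Average the overlap over all region sizes (midpoint rule), weighting
    size s by the assumed Erlang density rate**2 * s * exp(-rate * s)."""
    ds = max_size / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        weight = rate ** 2 * s * math.exp(-rate * s)
        total += overlap(distance, s) * weight * ds
    return total

# Compare the averaged value with a pure exponential at a few distances.
for d in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"distance {d:.1f}: averaged {generalization(d):.4f}, "
          f"exp(-d) {math.exp(-d):.4f}")
```

No single region size produces the exponential; each size contributes a straight-line falloff, and the curve only appears after averaging over small and large regions together.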
Generalization as a Search for Similar Consequences • Consequential region—all stimuli with the same results as the training stimulus, as mapped on a generalization gradient. • For example, the pigeon has a moderate expectation to get food from pecking a yellow-range light (given Fig 9.1).
The Challenge of Incorporating Similarity into Learning Models • Discrete-component representation— representation in which each individual stimulus (or stimulus feature) corresponds to its own node or “component.” • Simplest possible scheme to represent stimuli. • Fig 9.2 uses discrete-component representations. • Shows an unrealistic generalization gradient.
(Fig 9.2) Stimulus Generalization Model Using Discrete-Component Representations
Limitations of Discrete-Component Representations • Representations are applicable to situations in which stimuli are dissimilar and little generalization would occur. • Fail when stimuli have high degree of physical similarity. • Note: Different representations in different contexts provide different patterns of similarity. • Representations are context-specific.
(Fig 9.3) Generalization Gradient Produced by Discrete-Component Network of Fig 9.2 *Shows no response to the yellow-orange light (despite its similarity to the previously trained yellow light); responds only to the trained “yellow” stimulus. Fails to show a smooth generalization gradient like that shown in Fig 9.1.
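The all-or-none gradient described above can be reproduced with a few lines of code. This is a minimal sketch, not the textbook's network: the five color labels, the learning rate, and the use of a simple error-correcting (delta-rule) update are assumptions made for illustration.

```python
# Hypothetical stimulus set; each color gets exactly one input node.
COLORS = ["green", "yellow-green", "yellow", "yellow-orange", "orange"]

def one_hot(color):
    """Discrete-component code: only the node for this color is active."""
    return [1.0 if c == color else 0.0 for c in COLORS]

def respond(weights, inputs):
    """Output node activity: weighted sum of the active input nodes."""
    return sum(w * x for w, x in zip(weights, inputs))

def train(weights, inputs, target, rate=0.2, trials=50):
    """Error-correcting update applied to the active inputs on every trial."""
    for _ in range(trials):
        error = target - respond(weights, inputs)
        for i, x in enumerate(inputs):
            weights[i] += rate * error * x
    return weights

# Train on yellow only, then test every color.
weights = train([0.0] * len(COLORS), one_hot("yellow"), target=1.0)
gradient = {c: round(respond(weights, one_hot(c)), 3) for c in COLORS}
print(gradient)
```

Because yellow shares no node with any other color, nothing learned about yellow can transfer, so every untrained color produces a response of zero.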
Shared Elements and Distributed Representations • Thorndike (law of effect), Estes (stimulus sampling theory), Rumelhart (connectionist models) contributed to a contemporary associative-learning model. • Conceptualized with distributed representations (overlapping pools of stimulus nodes). • Similar stimuli activate common elements; something learned about one stimulus transfers to other stimuli that activate the same nodes.
(Figure) Thorndike and Estes Shared Elements: overlapping pools of elements for Orange and Yellow. Network model follows…
Shared Elements and Distributed Representations • Fig 9.4a–d shows a network model using distributed representations. • Nodes laid out in topographic representation (nodes responding to physically similar stimuli placed beside each other in the model). • 9.4a shows the model (which is only slightly more complicated than Fig 9.2). • 9.4b shows the outcome in distributed weights after many acquisition trials.
Shared Elements and Distributed Representations • 9.4c shows response strength from a stimulus (yellow/orange) test. • 9.4d shows the weaker response to a more varied stimulus (orange) test. • Such a distributed representation model better matches real life gradients, much like Fig 9.1 (see Fig 9.5).
(Fig 9.5) Stimulus Generalization Gradient Produced by Distributed Representation Model of Fig 9.4
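A comparable sketch shows why shared elements give a graded gradient. The assumptions here are illustrative and not taken from Fig 9.4: five hypothetical color labels in topographic order, each stimulus activating its own node plus its immediate neighbors (`spread=1`), and a delta-rule update.

```python
# Hypothetical stimulus set, ordered so that neighbors are physically similar.
COLORS = ["green", "yellow-green", "yellow", "yellow-orange", "orange"]

def distributed(color, spread=1):
    """Distributed code: the color's own node and its neighbors are active."""
    center = COLORS.index(color)
    return [1.0 if abs(i - center) <= spread else 0.0
            for i in range(len(COLORS))]

def respond(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

def train(weights, inputs, target, rate=0.2, trials=50):
    """Error-correcting update shared among all active nodes."""
    for _ in range(trials):
        error = target - respond(weights, inputs)
        for i, x in enumerate(inputs):
            weights[i] += rate * error * x
    return weights

# Train on yellow only, then test every color.
weights = train([0.0] * len(COLORS), distributed("yellow"), target=1.0)
gradient = {c: round(respond(weights, distributed(c)), 3) for c in COLORS}
print(gradient)
```

Yellow-orange shares two of yellow's three active nodes and orange shares one, so the response falls off step by step with distance from the trained color instead of dropping straight to zero.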
When Similar Stimuli Predict Different Consequences • Two substances that initially appear similar may become distinguishable over time. • Example: • Gooseberries look like green grapes. If you are allergic to gooseberries, you learn to distinguish them from green grapes (discrimination).
Discrimination Training and Learned Specificity • The weaker the generalization, the stronger the discrimination. • Discrimination = differential responding to two stimuli. • Discrimination can be trained; in discrimination training, two different (but similar) stimuli are presented on each trial. • The steeper (and skinnier) the gradient, the higher the discrimination.
Discrimination Training and Learned Specificity • Fig 9.6 shows adapted results from a classic 1962 experiment (Jenkins and Harrison studied tone discrimination in pigeons). • One gradient represents the test pattern for pigeons that heard a 1000 Hz tone before they pecked and received food. • The other gradient represents the generalization for pigeons that were also intermittently exposed to a similar 950 Hz tone without food. • Which is the control group? Experimental group?
(Fig 9.6) Generalization Gradients for Tones of Different Frequencies Adapted from Jenkins and Harrison, 1962.
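The effect of adding a non-reinforced tone can be illustrated with a shared-elements network. This is a rough sketch and not a fit to the Jenkins and Harrison data: the test frequencies other than 1000 Hz and 950 Hz, the neighbor-overlap coding, the delta-rule update, and the clipping of negative outputs to zero are all assumptions.

```python
# Hypothetical test tones (Hz); only 1000 and 950 come from the slide.
FREQS = [850, 900, 950, 1000, 1050, 1100, 1150]

def encode(freq, spread=1):
    """Each tone activates its own node and its immediate neighbors."""
    center = FREQS.index(freq)
    return [1.0 if abs(i - center) <= spread else 0.0
            for i in range(len(FREQS))]

def net(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

def train(trials, rate=0.1, epochs=500):
    """Delta-rule training on (frequency, target) pairs, presented in turn."""
    weights = [0.0] * len(FREQS)
    for _ in range(epochs):
        for freq, target in trials:
            x = encode(freq)
            error = target - net(weights, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights

def gradient(weights):
    """Response to every test tone; negative outputs are shown as zero."""
    return {f: round(max(0.0, net(weights, encode(f))), 3) for f in FREQS}

# Group 1: 1000 Hz tone followed by food. Group 2: same, plus 950 Hz without food.
control = gradient(train([(1000, 1.0)]))
discrimination = gradient(train([(1000, 1.0), (950, 0.0)]))
print("1000 Hz only      :", control)
print("1000 Hz vs. 950 Hz:", discrimination)
```

With 1000 Hz training alone, the 950 Hz tone inherits most of the response through shared nodes. After discrimination training, the response to 950 Hz falls to about zero while the response to 1000 Hz stays high, so the gradient is much steeper between the two tones.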
Unsolved Mysteries—Why Are Some Feature Pairs Easier to Discriminate between Than Others? • Some pairs of stimulus features are separable, such as brightness and hue. • Other feature pairs are perceived holistically, such as brightness and saturation. • Understanding the nature of feature pairs relates to stimulus generalization.
Negative Patterning: Differentiating Configurations from Their Individual Components • Negative patterning occurs when we respond positively to two stimuli presented separately, but we respond negatively to the compound (i.e., the combination). • Example: • Mom at home? Eat dinner in the kitchen. Dad at home? Eat dinner in the kitchen. Both Mom and Dad at home? Don’t eat dinner in the kitchen (Eat in the dining room).
Negative Patterning • Rats, monkeys, and humans learn negative patterning tasks. • Rabbits can learn to blink to either a tone or a light, and to not blink to a simultaneous tone and light.
Negative Patterning in Rabbit Eyeblink Conditioning Adapted from Kehoe, 1988, Figure 9.
Negative Patterning • Single-layer network models using discrete-component representations cannot learn negative patterning.
Negative Patterning • Fig 9.11 shows a multi-layer network model for negative patterning. • It includes extra nodes that fire only when two or more specific features are present. • In Fig 9.11, a configural node for “tone + light” will fire only if both inputs are active.
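The difference between the two kinds of network can be seen in a small simulation. This is a sketch under assumptions, not the model in Fig 9.11: the configural node is computed as the product of the tone and light inputs, and both networks are trained with a delta rule using hypothetical settings.

```python
# Negative patterning: respond to each cue alone, not to the compound.
TRIALS = [
    ("tone", (1.0, 0.0), 1.0),      # tone alone -> blink
    ("light", (0.0, 1.0), 1.0),     # light alone -> blink
    ("compound", (1.0, 1.0), 0.0),  # tone + light -> no blink
]

def respond(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

def components_only(inputs):
    """Single-layer code: one node per cue."""
    return inputs

def with_configural(inputs):
    """Adds a node that is active only when tone and light are both on."""
    tone, light = inputs
    return (tone, light, tone * light)

def train(encode, n_nodes, rate=0.1, epochs=2000):
    """Delta-rule training; returns the final response to each trial type."""
    weights = [0.0] * n_nodes
    for _ in range(epochs):
        for _, inputs, target in TRIALS:
            x = encode(inputs)
            error = target - respond(weights, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return {name: respond(weights, encode(inputs))
            for name, inputs, _ in TRIALS}

single_layer = train(components_only, 2)
configural = train(with_configural, 3)
print("components only:", single_layer)
print("with configural:", configural)
```

In the components-only network the compound response is always the sum of the two component responses, so it can never be lower than either one. The configural node can take a negative weight that cancels that sum, which lets the network respond to each cue alone but not to the pair.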
Configural Learning in Categorization • Configural tasks require sensitivity to combinations of stimulus cues, above and beyond what is known about stimulus components. • Configural nodes can be applied to categorization learning, where humans learn to classify stimuli into categories. • e.g., diagnosis from symptoms.
Configural Learning in Categorization • Fig 9.12a–c shows a configural-node model of category learning. • 9.12a shows the model. • In 9.12b, fever and soreness together (without ache) predict the disease. • Dilemma = combinatorial explosion. • 9.12c is a simpler, more flexible alternative model.
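The combinatorial explosion is easy to quantify. The sketch below assumes one configural node for every combination of two or more features, which is the worst case; the function name is hypothetical.

```python
import math

def configural_nodes(n_features):
    """Number of nodes needed if every combination of two or more
    features gets its own configural node."""
    return sum(math.comb(n_features, k) for k in range(2, n_features + 1))

# The closed form is 2**n - n - 1: all subsets minus singles and the empty set.
for n in (3, 5, 10, 20):
    assert configural_nodes(n) == 2 ** n - n - 1
    print(f"{n} features -> {configural_nodes(n)} configural nodes")
```

Three symptoms need only four configural nodes, but the count roughly doubles with each added feature, which is why a model with a fixed node for every combination does not scale.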
When Dissimilar Stimuli Predict the Same Consequence • Co-occurrence of stimuli may increase generalization. • Example: • If you like the cookies at a new bakery, you may like their brownies.
Sensory Preconditioning: Similar Predictions for Co-occurring Stimuli • Sensory Preconditioning—conditioning without an explicit US. • Prior presentation of compound stimuli results in later tendency for learning about one stimulus to generalize to the other.
Sensory Preconditioning *Example* • Step 1: (tone, light) • Step 2: (light, puff) • CR eyeblink should develop over acquisition trials. • Step 3: (tone alone) • If CR eyeblink occurs, we call this phenomenon “sensory preconditioning.” • Illustrates the generalizability of a stimulus’s power! The tone was never presented as a cue for the puff!
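The three steps can be walked through with a toy calculation. This is only one simple way to think about the result, an associative-chain account (tone calls up light, light predicts puff), and it is an assumption of this sketch, not a claim about how the chapter models sensory preconditioning. All names and parameters are hypothetical.

```python
def acquire(strength, rate=0.2, trials=30):
    """Association strength growing toward 1.0 over repeated pairings."""
    for _ in range(trials):
        strength += rate * (1.0 - strength)
    return strength

def tone_test(preexposed):
    """Predicted eyeblink strength to the tone alone in Step 3."""
    # Step 1: (tone, light) pairings; skipped for a comparison group.
    tone_to_light = acquire(0.0) if preexposed else 0.0
    # Step 2: (light, puff) pairings; the tone is never presented here.
    light_to_puff = acquire(0.0)
    # Step 3: the tone can only reach the puff by way of the light.
    return tone_to_light * light_to_puff

print("with Step 1 pairings   :", round(tone_test(True), 3))
print("without Step 1 pairings:", round(tone_test(False), 3))
```

The comparison group makes the logic of the design explicit: without the Step 1 pairings there is no route from tone to puff, so any blink to the tone in Step 3 must come from the earlier co-occurrence of tone and light.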
Acquired Equivalence: Novel Similar Predictions Based on Prior Similar Consequences • Acquired equivalence: prior training in stimulus equivalence increases the amount of generalization between two stimuli, even if the stimuli are superficially dissimilar. • In the Hall study, pigeons learned that dissimilar colors, each paired separately with the same color, had the same result. • The pigeons then demonstrated this generalization in a new situation.
Learning and Memory in Everyday Life— Discrimination and Stereotypes in Generalizing about Other People • Category formation is a basic cognitive process. • Rational generalizations let us tentatively generalize individual outcomes from previous experiences. • Stereotyping is denying exceptions for individuals from a group for which we may hold oversimplified beliefs. • Attempts to justify unfair treatment.
9.1 Interim Summary • Generalization = transfer of past learning to new situations and problems. • Requires finding balance between specificity (knowing how narrowly a rule applies) and generality (knowing how broadly the rule applies). • Discrimination = recognizing differences between stimuli; knowing which to prefer. • Understanding similarity is essential to understand generalization and discrimination.
9.1 Interim Summary • Discrete-component representations: assign each stimulus (or feature) to its own node. • Applicable to situations in which similarity among features is small enough that there is negligible transfer of response from one to another. • Distributed representations: incorporate idea of shared elements. • Allow creation of psychological models with concepts represented as patterns of activity over many nodes; provide ability to model stimulus similarity and generalization.
9.1 Interim Summary • We tend to assume that patterns formed from compound cues will have consequences that parallel (or even combine) what we know about the individual cues. • However, some discriminations require sensitivity to the configurations of stimulus cues above and beyond what is known about the individual stimulus cues.
9.1 Interim Summary • Animals and people can learn to generalize between stimuli that have no physical similarity but that do have a history of co-occurrence or of predicting the same outcome.