Conditioning
Bear with me. Bare with me. Beer with me. Stay focused.
Learning
Typically this subsides as this is learned.
• A. Two-process learning (Rescorla-Solomon 67)
  • fast: fear and arousal
  • slow: adaptive behavioral responses
• B. Three-process learning
  • A
  • declarative memory (as opposed to procedural)
• C. More-than-three-process learning
  • A
  • declarative memory
  • episodic memory
  • semantic memory
  • more stuff
Conditional and Unconditional Training
• US → UR: innate
• CS → UR/CR: learned
• US = “Reinforcer”
• Delay procedure: easier
• Trace procedure: harder
Classical and Operant
• Classical: delivery of the reinforcer is contingent on the occurrence of a stimulus (the CS).
• Operant: delivery of the reinforcer is contingent on the occurrence of a designated response.
• (Diagrams: innate links from the US; learned links from the CS and from S1 to the UR/CR and the Action.)
• CC predicts that the animal will produce the UR/CR while performing the desired action, but does not explain why the animal learns to select the action.
Selectionist View
• Selectionist principles
  • Behaviors are varied, selected, and retained in a process similar to the natural selection of species
  • Only overt behaviors can be reinforced by the environment
  • The selection principle is based on behavioral discrepancy
Behavioral Discrepancy
Behavioral discrepancy is the change in an ongoing behavior produced by the eliciting stimulus.
Example: Presentation of food produces salivation, which would not otherwise occur.
Unified Selection Principle
Whenever a behavioral discrepancy occurs, an environment-behavior relation is selected that consists, other things being equal, of all those stimuli occurring immediately before the discrepancy and all those responses occurring immediately before and at the same time as the elicited response.
Under this principle there is no difference between classical and operant conditioning as far as learning goes.
Conditioning Phenomena
(Table: columns Name, Set I, Set II, Test; rows Pavlovian, Overshadowing, Inhibitory, Blocking, Upwards unblocking, Downwards unblocking.)
It goes on...
Conditioning/Selection Models
• Trial-by-trial
  • Probabilistic (Dayan-Long, Cheng-Novick)
  • … and not (Rescorla-Wagner)
  • NN (Donahoe)
• Moment-by-moment
  • Sutton-Barto
  • Mignault
  • Schmajuk (NN)
• ~ A bazillion others...
S1 and S2 processing should happen at roughly the same time, so almost all models suggest a multiplicative relationship between the levels of S1 and S2.
Rescorla-Wagner model
• Trial based
• Based on the net prediction of the reward
• Learning only happens when a prediction discrepancy is detected
• Falls straight out of ML estimation of association strength
• Is essentially the delta rule
• (Equation terms labeled on the slide: net prediction, association strength update, reward, stimulus eligibility)
• Problems:
  • Does not deal well with overshadowing and downwards unblocking...
  • Does not depend on the temporal relations between stimuli
  • Does not explain re-acquisition rate
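The trial-based delta rule above can be sketched in its common textbook form, ΔV_i = α (reward − Σ_j V_j), applied to every stimulus present on the trial. This is only a minimal sketch: the function names, the shared learning rate `alpha`, and the blocking schedule are illustrative assumptions, not taken from the slides.

```python
def rescorla_wagner_trial(V, present, reward, alpha=0.1):
    """Apply one Rescorla-Wagner (delta-rule) update in place.

    V       -- dict mapping stimulus name -> association strength
    present -- stimuli present on this trial
    reward  -- magnitude of the US on this trial (0.0 if omitted)
    alpha   -- learning rate (hypothetical value, shared by all stimuli)
    """
    present = list(present)
    # Net prediction: summed strength of every stimulus that is present.
    net_prediction = sum(V.get(s, 0.0) for s in present)
    # Learning is driven only by the prediction discrepancy.
    discrepancy = reward - net_prediction
    for s in present:
        V[s] = V.get(s, 0.0) + alpha * discrepancy
    return V


def run_blocking(n_trials=200):
    """Train A alone, then the compound A+B, with the same reward."""
    V = {}
    for _ in range(n_trials):
        rescorla_wagner_trial(V, ["A"], 1.0)
    for _ in range(n_trials):
        rescorla_wagner_trial(V, ["A", "B"], 1.0)
    return V
```

Because A already predicts the reward by the time B is introduced, the discrepancy is near zero in the second phase and B acquires almost no strength, which is the blocking effect.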
Sutton-Barto model
• Real-time model
• Combines Y theory with RW model
• Time-derivative model
• Presumes that all stimuli produce +V at the onset and -V at the offset
• Deals with secondary conditioning
• (Equation term labeled on the slide: sum of all the associative strengths at a given time)
• Problems:
  • Does not model inter-stimulus intervals (ISI), where the efficiency of training should decrease with increased ISI
  • Does not deal with reacquisition
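A heavily simplified sketch of a time-derivative update of this kind follows. It assumes a single CS, uses the previous step's output in place of a proper output trace, and gives the stimulus an exponentially decaying eligibility trace. The function names, the parameter values (`lr`, `decay`), and the stimulus timing are all illustrative assumptions rather than anything specified on the slide.

```python
def sutton_barto_trial(V, cs, us, lr=0.1, decay=0.5):
    """Run one trial of a simplified time-derivative update for one CS.

    V      -- current associative strength of the CS
    cs, us -- equal-length sequences of 0/1 stimulus levels per time step
    lr     -- learning rate (hypothetical value)
    decay  -- eligibility-trace decay per step (hypothetical value)
    """
    trace = 0.0   # eligibility of the CS, built from its past presence
    y_prev = 0.0  # output at the previous time step
    for x, r in zip(cs, us):
        # Output: associative strength of the active CS plus the US itself.
        y = V * x + r
        # Reinforcement is the change in output, so a stimulus contributes
        # +V at its onset and -V at its offset.
        V += lr * (y - y_prev) * trace
        trace = decay * trace + (1.0 - decay) * x
        y_prev = y
    return V


def train(n_trials=100):
    """CS for two steps, followed immediately by a four-step US."""
    cs = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
    us = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
    V = 0.0
    for _ in range(n_trials):
        V = sutton_barto_trial(V, cs, us)
    return V
```

In this sketch the CS becomes excitatory because its eligibility is still high at US onset and has mostly decayed by US offset. With other timings the sign of learning can change, so the outcome is sensitive to the chosen schedule.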
Temporal Difference model
• Is related to the SB model (and the RW model)
• Models reward in small discrete intervals
• Models second-order conditioning
• Based on the assumption that the goal of learning is to accurately predict the future US levels
• (Equation term labeled on the slide: discounted prediction of the future reward, with V for the predicted values of S)
• Problems:
  • No model of attention, salience, configuration etc...
  • No indirect associations modeled (sensory preconditioning)
  • Problems with downwards unblocking
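The idea of predicting discounted future reward over small discrete intervals can be sketched with a tabular TD(0) update, δ = r + γ V(t+1) − V(t). This sketch assumes that each time step since CS onset has its own predicted value. The function name `td_train` and the values of `alpha`, `gamma`, and the reward timing are illustrative assumptions.

```python
def td_train(n_steps=6, reward_step=5, n_trials=500, alpha=0.1, gamma=0.9):
    """Learn per-time-step predictions of discounted future reward.

    V[t] is the prediction at step t after CS onset; the slot after the
    last step is terminal and stays at 0.
    """
    V = [0.0] * (n_steps + 1)
    for _ in range(n_trials):
        for t in range(n_steps):
            r = 1.0 if t == reward_step else 0.0
            # TD error: reward plus discounted next prediction, minus
            # the current prediction.
            delta = r + gamma * V[t + 1] - V[t]
            V[t] += alpha * delta
    return V[:n_steps]
```

After training, the prediction at the rewarded step approaches 1 and earlier steps approach the reward discounted by `gamma` per intervening step, so the value at CS onset approaches `gamma ** reward_step`.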
Statistical models
This results in exactly the RW model with ML. This is EM. Similar to comparator models of conditioning (whatever they are). Has problems with inhibitory conditioning.
Dayan & Long’s model. Models the conditioning phenomena. Does not consider associability (eligibility in SB) and attention. No distinction between preparatory and consummatory conditioning.
NN models
Warning: a personal opinion!
• Everything is a neural net - things happen naturally
• The weights propagate and this forms the dynamics of the Stimulus-Stimulus interactions
(Diagram: S1 and S2 feed into a box labeled “Stuff happens here”, which produces the Response.)
Whatever…
Bruce’s favorite model
• Models time and rate of CS and reinforcement
• Time-scale invariant
• Non-associative framework
• (Equation terms labeled on the slide: rates of reinforcement; cumulative number of reinforcements in presence of Sn; cumulative duration of the conjunction of S1 and Sn; cumulative duration of Sn)
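One way to read the quantities listed above is as a small linear system. If the rates attributed to individual stimuli are assumed to add, then the reinforcements counted in the presence of each stimulus satisfy N_i = Σ_j rate_j × T_ij, where T_ij is the cumulative duration of the conjunction of stimuli i and j. The two-stimulus sketch below solves that system directly. The additivity reading, the function name `estimate_rates`, and the example numbers are assumptions for illustration, not a reconstruction of the slide's equation.

```python
def estimate_rates(n_a, n_b, t_aa, t_ab, t_bb):
    """Estimate the reinforcement rate attributable to stimuli a and b.

    n_a, n_b -- cumulative reinforcements in the presence of a and of b
    t_aa     -- cumulative duration of a
    t_bb     -- cumulative duration of b
    t_ab     -- cumulative duration of the conjunction of a and b

    Solves  n_a = rate_a * t_aa + rate_b * t_ab
            n_b = rate_a * t_ab + rate_b * t_bb
    by Cramer's rule.
    """
    det = t_aa * t_bb - t_ab * t_ab
    if det == 0:
        raise ValueError("stimuli always co-occur; rates cannot be separated")
    rate_a = (n_a * t_bb - t_ab * n_b) / det
    rate_b = (t_aa * n_b - t_ab * n_a) / det
    return rate_a, rate_b


# Hypothetical session: a background present for 100 s with 10
# reinforcements, and a CS present for 20 of those seconds with 6.
background_rate, cs_rate = estimate_rates(10, 6, 100, 20, 20)
```

In this hypothetical session most of the reinforcement rate is credited to the CS rather than the background, even though the background was present for every reinforcement.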
References
• Dayan, P., and Abbott, L. F. (2000?). Theoretical Neuroscience. In Print??? (http://www.gatsby.ucl.ac.uk/~dayan/book/)
• Dayan, P., and Long, T. (1998?). Statistical Models of Conditioning. NIPS10.
• Gallistel, C. R., and Gibbon, J. (2000). Time, Rate and Conditioning. Psychological Review, in print.
• Pavlov, I. P. (1927). Conditioned Reflexes. Oxford: Oxford University Press.
• Mignault, A., and Marley, A. A. J. (1997). A Real-Time Neuronal Model of Classical Conditioning. Adaptive Behavior 6(1): 3 - 61.
• Rescorla, R. A. (1988). Behavioral studies of Pavlovian conditioning. Annual Review of Neuroscience 11: 329 - 352.
• Rescorla, R. A., and Solomon, R. L. (1967). Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning. Psychological Review 74: 151 - 182.
• Rescorla, R. A., and Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black and W. F. Prokasy, Eds., Classical Conditioning, vol. 2, Current Research and Theory. New York: Appleton-Century-Crofts, pp. 54 - 99.
• Roitblat, H. L., and Meyer, J.-A. Comparative Approaches to Cognitive Science. MIT Press.
• Schmajuk, N. A. (1997). Animal Learning and Cognition: A Neural Network Approach.
• Skinner, B. F. (1938). The Behavior of Organisms. New York: Appleton-Century-Crofts.
• Sutton, R. S., and Barto, A. G. (1990). Computational Neuroscience: Foundations of Adaptive Networks. MIT Press.
• Thorndike, E. L. (1911). Animal Intelligence: Experimental Studies. New York: Macmillan.
• Wilson, R. A., and Keil, F. (1999). The MIT Encyclopedia of Cognitive Sciences. MIT Press. MITECS (http://cognet.mit.edu/MITECS)