Connectionism and Models of Memory and Amnesia

Connectionism and models of memory and amnesia Jaap Murre University of Amsterdam murre@psy.uva.nl http://www.memory.uva.nl

The French neurologist Ribot discovered more than 100 years ago that in retrograde amnesia one tends to lose recent memories. Memory loss gradients in RA are called Ribot gradients.

Overview • Catastrophic interference and hypertransfer • Brief review of neuroanatomy • Outline of the TraceLink model • Some simulation results of neural network model, focussing on retrograde amnesia • Recent work: • Mathematical point-process model • Concluding remarks

Catastrophic interference • Learning new patterns in backpropagation will overwrite all existing patterns • Rehearsal is necessary • McCloskey and Cohen (1989), Ratcliff (1990) • This is not psychologically plausible
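
The effect described on this slide can be illustrated with a very small network. The following is only a sketch under stated assumptions: the patterns, layer sizes, learning rate, and number of epochs are invented for illustration and are not taken from McCloskey and Cohen (1989) or Ratcliff (1990). It trains a three-layer backpropagation network on list A, then on list B alone, and compares the error on list A before and after.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyMLP:
    """Minimal three-layer backpropagation network (no biases), for illustration only."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.w2 = rng.normal(0.0, 0.5, (n_hidden, n_out))

    def forward(self, x):
        h = sigmoid(x @ self.w1)
        return h, sigmoid(h @ self.w2)

    def train(self, x, t, epochs, lr=0.5):
        # Full-batch gradient descent on the summed squared error.
        for _ in range(epochs):
            h, y = self.forward(x)
            delta_out = (y - t) * y * (1.0 - y)
            delta_hid = (delta_out @ self.w2.T) * h * (1.0 - h)
            self.w2 -= lr * h.T @ delta_out
            self.w1 -= lr * x.T @ delta_hid

    def error(self, x, t):
        return float(np.mean((self.forward(x)[1] - t) ** 2))

# Hypothetical binary patterns (assumptions, not data from the literature).
a_x = np.array([[1, 0, 0, 1, 0, 1, 0, 0],
                [0, 1, 0, 0, 1, 0, 0, 1],
                [0, 0, 1, 0, 0, 1, 1, 0]], dtype=float)
a_t = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0]], dtype=float)
b_x = np.array([[1, 1, 0, 0, 0, 0, 1, 0],
                [0, 0, 1, 1, 0, 0, 0, 1],
                [0, 1, 0, 0, 1, 1, 0, 0]], dtype=float)
b_t = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 0, 1, 1]], dtype=float)

net = TinyMLP(n_in=8, n_hidden=10, n_out=4, rng=np.random.default_rng(0))
net.train(a_x, a_t, epochs=3000)      # Phase 1: learn list A
err_a_before = net.error(a_x, a_t)
net.train(b_x, b_t, epochs=3000)      # Phase 2: learn list B only, no rehearsal of A
err_a_after = net.error(a_x, a_t)     # Phase 3: retest list A
print(f"error on A before B: {err_a_before:.4f}, after B: {err_a_after:.4f}")
```

Because list B is trained without any rehearsal of list A, nothing constrains the weights to keep the list A mappings, so the error on list A is expected to rise after phase 2.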

Osgood surface (1949) • Paired-associates in lists A and B will interfere strongly if the stimuli are similar but the responses vary • If stimuli are different, little interference (i.e., forgetting) occurs • Backpropagation also shows odd behavior if stimuli vary but responses are similar in lists A and B (hypertransfer)

Hypertransfer

Stimuli   Target responses   Learned responses

Phase 1: Learning list A (after three learning trials)
rist      munk               twup
gork      gomp               toup
wemp      twub               twup

Phase 2: Learning interfering list B (after five learning trials)
yupe      munk               muup
maws      gomp               twup
drin      twub               twub

Phase 3: Retesting on list A
rist      munk               goub
gork      gomp               tomp
wemp      twub               twub

Problems with sequential learning in backpropagation • Reason 1: Strongly overlapping hidden-layer representations • Remedy 1: reduce the hidden-layer representations • French, Murre (semi-distributed representations)

Problems with sequential learning in backpropagation • Reason 2: Satisfying only immediate learning constraints • Remedy 2: Rehearse some old patterns, when learning new ones • Murre (1992): random rehearsal • McClelland, McNaughton and O’Reilly (1995): interleaved learning
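
Remedy 2 can be sketched as a training schedule in which every batch of new patterns is mixed with a few previously learned ones. The function below is a hypothetical illustration, not the procedure of Murre (1992) or of McClelland, McNaughton and O’Reilly (1995); the name `rehearsal_batch` and the parameter `n_rehearse` are assumptions. The word pairs are the list A and list B items from the hypertransfer example.

```python
import random

# Stimulus-response pairs from the hypertransfer example.
LIST_A = [("rist", "munk"), ("gork", "gomp"), ("wemp", "twub")]
LIST_B = [("yupe", "munk"), ("maws", "gomp"), ("drin", "twub")]

def rehearsal_batch(new_items, old_items, n_rehearse, rng):
    """Return the new items mixed with a random sample of old items.

    n_rehearse is the (assumed) number of old patterns rehearsed per batch.
    """
    k = min(n_rehearse, len(old_items))
    batch = list(new_items) + rng.sample(list(old_items), k)
    rng.shuffle(batch)  # interleave old and new items in random order
    return batch

rng = random.Random(0)
for pair in rehearsal_batch(LIST_B, LIST_A, n_rehearse=2, rng=rng):
    print(pair)
```

Each batch built this way would be passed to the learning rule in place of list B alone, so that the old associations keep contributing to the weight updates.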

Final remarks on sequential learning • Two-layer ‘backpropagation’ networks do show plausible forgetting • Other learning networks do not exhibit catastrophic interference: ART, CALM, Kohonen Maps, etc. • Catastrophic interference is not an inevitable property of learning neural networks; it mainly affects backpropagation • The brain does not do backpropagation and therefore does not suffer from this problem

Models of amnesia and memory in the brain • TraceLink • Point-process model • Chain-development model

Neuroanatomy of amnesia • Hippocampus • Adjacent areas such as entorhinal cortex and parahippocampal cortex • Basal forebrain nuclei • Diencephalon

The position of the hippocampus in the brain

Hippocampal connections

Hippocampus has an excellent overview of the entire cortex

Trace-Link model: structure

System 1: Trace system • Function: Substrate for bulk storage of memories, ‘association machine’ • Corresponds roughly to neocortex

System 2: Link system • Function: Initial ‘scaffold’ for episodes • Corresponds roughly to hippocampus and certain temporal and perhaps frontal areas

System 3: Modulatory system • Function: Control of plasticity • Involves at least parts of the hippocampus, amygdala, fornix, and certain nuclei in the basal forebrain and in the brain stem

Stages in episodic learning

Dreaming and consolidation of memory “We dream in order to forget” • Theory by Francis Crick and Graeme Mitchison (1983) • Main problem: Overloading of memory • Solution: Reverse learning leads to removal of ‘obsessions’

Dreaming and memory consolidation • When should this reverse learning take place? • During REM sleep • Normal input is deactivated • Semi-random activations from the brain stem • REM sleep may have lively hallucinations

Consolidation may also strengthen memory • This may occur during deep sleep (as opposed to REM sleep) • Both hypothetical processes may work together to achieve an increase in the definition of representations in the cortex

Recent data by Matt Wilson and Bruce McNaughton (1994) • 120 neurons in rat hippocampus • PRE: Slow-wave sleep before being in the experimental environment (cage) • RUN: While in the experimental environment • POST: Slow-wave sleep after having been in the experimental environment

Wilson and McNaughton data • PRE: Slow-wave sleep before being in the experimental environment (cage) • RUN: While in the experimental environment • POST: Slow-wave sleep after having been in the experimental environment

Some important characteristics of amnesia • Anterograde amnesia (AA) • Implicit memory preserved • Retrograde amnesia (RA) • Ribot gradients • Pattern of correlations between AA and RA • No perfect correlation between AA and RA

[Figure: normal forgetting, retrograde amnesia, and anterograde amnesia shown on a time axis running from past to present, with the lesion marked x]

An example of retrograde amnesia patient data Kopelman (1989) News events test

Retrograde amnesia • Primary cause: loss of links • Ribot gradients • Shrinkage

Anterograde amnesia • Primary cause: loss of modulatory system • Secondary cause: loss of links • Preserved implicit memory

Semantic dementia • The term was adopted recently to describe a new form of dementia, notably by Julie Snowden et al. (1989, 1994) and by John Hodges et al. (1992, 1994) • Semantic dementia is almost a mirror-image of amnesia

Neuropsychology of semantic dementia • Progressive loss of semantic knowledge • Word-finding problems • Comprehension difficulties • No problems with new learning • Lesions mainly located in the infero-lateral temporal cortex but (early in the disease) with sparing of the hippocampus

No consolidation in semantic dementia • Severe loss of trace connections • Stage-2 learning proceeds as normal • Stage-3 learning strongly impaired • Non-rehearsed memories will be lost

Semantic dementia in TraceLink • Primary cause: loss of trace-trace connections • Stage-3 (and 4) memories cannot be formed: no consolidation • The preservation of new memories will be dependent on constant rehearsal

Connectionist implementation of the TraceLink model With Martijn Meeter from the University of Amsterdam

Some details of the model • 42 link nodes, 200 trace nodes • for each pattern • 7 nodes are active in the link system • 10 nodes in the trace system • Trace system has lower learning rate than the link system
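
The pattern sizes on this slide can be made concrete with a short sketch. The node counts (42 and 200) and the numbers of active nodes (7 and 10) come from the slide; choosing the active nodes uniformly at random is an assumption made here for illustration, and `random_pattern` is a hypothetical helper name.

```python
import numpy as np

N_LINK, N_TRACE = 42, 200   # nodes in the link and trace systems (from the slide)
K_LINK, K_TRACE = 7, 10     # active nodes per pattern (from the slide)

def random_pattern(rng):
    """Return one sparse binary pattern as (link_vector, trace_vector).

    Assumption: active nodes are drawn uniformly at random without replacement.
    """
    link = np.zeros(N_LINK, dtype=int)
    trace = np.zeros(N_TRACE, dtype=int)
    link[rng.choice(N_LINK, size=K_LINK, replace=False)] = 1
    trace[rng.choice(N_TRACE, size=K_TRACE, replace=False)] = 1
    return link, trace

rng = np.random.default_rng(0)
link, trace = random_pattern(rng)
print(link.sum() / N_LINK, trace.sum() / N_TRACE)  # fraction of active nodes per system
```

With these numbers a pattern activates one sixth of the link nodes but only one twentieth of the trace nodes, so the trace representation is the sparser of the two.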

How the simulations work: One simulated ‘day’ • A new pattern is activated • The pattern is learned • Because of low learning rate, the pattern is not well encoded at first in the trace system • A period of ‘simulated dreaming’ follows • Nodes are activated randomly by the model • This random activity causes recall of a pattern • A recalled pattern is then learned extra
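
The steps of one simulated day can be sketched as follows. This is a simplified illustration, not the implemented TraceLink model: Hebbian outer-product learning, the k-winners recall rule, the learning-rate values, and the number of dream cycles are all assumptions. Only the network sizes and the fact that the trace system learns more slowly than the link system come from the slides.

```python
import numpy as np

N_LINK, N_TRACE = 42, 200
K_LINK, K_TRACE = 7, 10
LINK_RATE, TRACE_RATE = 0.5, 0.05   # assumed values; only LINK_RATE > TRACE_RATE is given

def sparse(n, k, rng):
    """Random binary vector of length n with k active nodes."""
    v = np.zeros(n)
    v[rng.choice(n, size=k, replace=False)] = 1.0
    return v

def simulate_day(w_lt, w_tt, rng, n_dreams=5):
    """Learn one new pattern, then run n_dreams cycles of simulated dreaming.

    w_lt: link-to-trace weights, w_tt: trace-to-trace weights (updated in place).
    Returns the trace pattern and the number of consolidation events.
    """
    # A new pattern is activated and learned (weakly in the trace system).
    link, trace = sparse(N_LINK, K_LINK, rng), sparse(N_TRACE, K_TRACE, rng)
    w_lt += LINK_RATE * np.outer(link, trace)
    w_tt += TRACE_RATE * np.outer(trace, trace)

    consolidated = 0
    for _ in range(n_dreams):
        cue = sparse(N_LINK, K_LINK, rng)       # random activation of link nodes
        drive = cue @ w_lt                      # input reaching the trace system
        if drive.max() <= 0.0:
            continue                            # the cue did not recall anything
        recalled = np.zeros(N_TRACE)
        recalled[np.argsort(-drive)[:K_TRACE]] = 1.0   # k most strongly driven nodes
        w_tt += TRACE_RATE * np.outer(recalled, recalled)  # recalled pattern is learned extra
        consolidated += 1
    return trace, consolidated

rng = np.random.default_rng(0)
w_lt = np.zeros((N_LINK, N_TRACE))
w_tt = np.zeros((N_TRACE, N_TRACE))
trace, n_events = simulate_day(w_lt, w_tt, rng)
print("consolidation events on day 1:", n_events)
```

In this sketch each successful recall adds another increment to the trace-trace weights of the recalled pattern, which is how repeated dreaming can gradually make a memory less dependent on the link system.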

(Patient data) Kopelman (1989) News events test

A simulation with TraceLink

Frequency of consolidation of patterns over time

Strongly and weakly encoded patterns • Mixture of weak, middle and strong patterns • Strong patterns had a higher learning parameter (cf. longer learning time)

Transient Global Amnesia (TGA) • (Witnessed onset) of severe anterograde and retrograde amnesia • Resolves within 24 hours • Retrograde amnesia may have Ribot gradients • Hippocampal area is most probably implicated

Transient Global Amnesia (TGA)

Other simulations • Focal retrograde amnesia • Levels of processing • Semantic dementia • Implicit memory • More subtle lesions (e.g., only within-link connections, cf. CA1 lesions)

The Memory Chain Model: a very abstract neural network With Antonio Chessa from the University of Amsterdam

Abstracting TraceLink (level 1) • Model formulated within the mathematical framework of point processes • Generalizes TraceLink’s two-store approach to multiple neural ‘stores’ • trace system • link system • working memory, short-term memory, etc. • A store corresponds to a neural process or structure

Learning and forgetting as a stochastic process: 1-store example • A recall cue (e.g., a face) may access different aspects of a stored memory • If a point is found in the neural cue area, the correct response (e.g., the name) can be given [Figure labels: Learning, Forgetting, Successful Recall, Unsuccessful Recall]

Jo Brand Neural network interpretation

Single-store point process [Figure labels: link system, retrieval, survival probability, μ, a] • The expected number of points in the cue area after learning is called μ • This μ is directly increased by learning and also by more effective cueing • At each time step, points die • The probability of survival of a point is denoted by a
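
A worked version of the single-store case may help. Write mu for the expected number of points in the cue area after learning and a for the per-step survival probability. After t time steps the expected number of surviving points is mu * a**t. If one additionally assumes that the number of points is Poisson distributed (an assumption made for this sketch, not stated on the slide) and that recall succeeds whenever at least one point is found, the recall probability is 1 - exp(-mu * a**t). The function names below are hypothetical.

```python
import math

def expected_points(mu, a, t):
    """Expected number of points left in the cue area after t time steps."""
    return mu * a ** t

def recall_probability(mu, a, t):
    """Probability of finding at least one point, assuming a Poisson count."""
    return 1.0 - math.exp(-expected_points(mu, a, t))

# Illustrative parameter values (assumptions, not fitted to any data).
mu, a = 2.0, 0.9
for t in (0, 1, 5, 20):
    print(t, round(recall_probability(mu, a, t), 3))
```

In this sketch a larger mu (more learning or a more effective cue) raises the whole curve, while a smaller a makes it fall off faster.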

Some aspects of the point process model • Model of simultaneous learning and forgetting • Clear relationship between signal detection theory (d'), recall (p), savings (Ebbinghaus’ Q), and Crovitz-type distribution functions • Multi-trial learning and multi-trial savings • Currently applied to over 250 experiments in learning and forgetting, dating back to 1885
