AI Principles, Semester 2, , Biological Intelligence II

Recap of Biological Intelligence I: two ways to think about levels of description
First, levels of description can correspond to nearly decomposable systems implemented on top of each other: artificial neural networks (ANNs) at one level, Production Systems at the level above, and logical (rational) operations at the level above that. A weakness of this view is that the systems may be far from decomposable.
Second, levels of description may be given in terms of implementational systems, or as algorithms described in abstract information-processing terms, or at the computational level, which is the level of observable external behaviour.

Classical connectionism: Artificial Neural Networks
Many use the back-propagation learning algorithm, which is not considered biologically plausible.
Some ANNs may be considered to be at an implementational level, and hence at a lower level of description in Newell’s (1990) hierarchy.
However, as Rumelhart and McClelland note, many connectionist models can be considered to be at the same algorithmic level as most Production System models of cognition.

ACT-R (adaptive control of thought - rational)
AI = algorithms, representations and architectures.
ACT-R is a leading cognitive architecture. It supports a number of subsystems, each with its own representations, within a single architecture.
It explains (predicts) a lot of human behaviour, both in experiments and in naturalistic settings such as using cockpits or computers.
Its operation can be seen in imaging experiments.

ACT-R (adaptive control of thought - rational)
Components: goal setting, long-term declarative memory, central Production System, sensory subsystems, motor subsystems.
Between the systems are buffers that hold information for a set amount of time and then let it decay, like forgetting. The buffers therefore act like short-term memory.
We can speculate that the contents of the buffers are the mental contents that a human is conscious of.
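The buffer idea described above (hold information for a set time, then let it decay) can be illustrated with a toy sketch. This is not how ACT-R itself implements buffers; the `Buffer` class, the `hold_time` parameter and the example chunk are all hypothetical, chosen only to make the short-term-memory analogy concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    """Toy buffer: holds one chunk for a fixed time, then forgets it."""
    hold_time: float               # assumed lifetime of a chunk, in seconds
    chunk: Optional[dict] = None   # current contents, or None if empty
    stored_at: float = 0.0         # time at which the chunk was stored

    def put(self, chunk: dict, now: float) -> None:
        # Overwrite whatever the buffer held before.
        self.chunk = chunk
        self.stored_at = now

    def read(self, now: float) -> Optional[dict]:
        # Contents older than hold_time count as decayed (forgotten).
        if self.chunk is not None and now - self.stored_at > self.hold_time:
            self.chunk = None
        return self.chunk


retrieval = Buffer(hold_time=2.0)
retrieval.put({"isa": "count-order", "first": 3, "second": 4}, now=0.0)
early = retrieval.read(now=1.0)   # within the hold time: chunk still present
late = retrieval.read(now=5.0)    # past the hold time: chunk has decayed
```

Here `early` is the stored chunk and `late` is `None`, mirroring the idea that buffer contents are available only briefly.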

ACT-R - what do the productions look like?
(P initialize-addition
   =goal>
      ISA         add
      arg1        =num1
      arg2        =num2
      sum         nil
  ==>
   =goal>
      sum         =num1
      count       0
   +retrieval>
      isa         count-order
      first       =num1
)
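As a rough reading aid, the production above can be paraphrased in Python: the part before `==>` is a condition tested against the goal buffer, and the part after it modifies the goal and requests a retrieval from declarative memory. The function below is a hand-written sketch, not the ACT-R API; representing chunks as dictionaries and `nil` as `None` is an assumption made for illustration.

```python
def initialize_addition(goal):
    """Toy paraphrase of the initialize-addition production.

    Returns (new_goal, retrieval_request) if the condition matches,
    otherwise None (the production does not fire).
    """
    # Condition side: the goal must be an 'add' chunk with no sum yet.
    if goal.get("isa") != "add" or goal.get("sum") is not None:
        return None
    # Both arguments must be present so that =num1 and =num2 can bind.
    if goal.get("arg1") is None or goal.get("arg2") is None:
        return None
    num1 = goal["arg1"]  # binds =num1 (=num2 is matched but not used below)

    # Action side: update the goal buffer, then request a retrieval.
    new_goal = dict(goal, sum=num1, count=0)
    retrieval_request = {"isa": "count-order", "first": num1}
    return new_goal, retrieval_request


goal = {"isa": "add", "arg1": 5, "arg2": 2, "sum": None}
result = initialize_addition(goal)
```

Once the production has fired, `sum` is no longer empty, so the same production cannot match the updated goal again.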

ACT-R and the brain
Neuro-imaging studies of people undertaking cognitive tasks have allowed different subsystems of the ACT-R architecture to be localised to specific brain regions.
Long-term declarative memory = across the cortex
Goal setting = prefrontal cortex
Central Production System = basal ganglia
Sensory subsystems and motor subsystems

Newell test for a theory of cognition
1 - Arbitrary function of the environment
2 - Operate in real time
3 - Functional, adaptive, rational behaviour
4 - Possess a vast knowledge base
5 - Success in dynamic environments
6 - Integrate diverse knowledge
7 - Use (natural) language
8 - Self-aware
9 - Able to learn from its environment
10 - Acquire abilities through development
11 - Arise through evolution
12 - Be realisable within the brain

1 - Behave as an arbitrary function of the environment
Is it computationally universal?
This is the criterion that Newell (1990) states as the principal evidence that humans are at least partly symbol systems.
ACT-R is a hybrid system that can accomplish symbolic computations, and so scores highly on this criterion.
Current connectionist models are less convincing, but a key issue is that future connectionist models may be able to perform symbolic-type computations in a way that maintains the advantages of analog, distributed representations (see O’Reilly’s paper, which is discussed in relation to criterion 6).
Classical connectionism: mixed, ACT-R: better

2 - Operate in real time
For any of the 12 abilities described in Newell’s test, merely possessing the ability is no good if the agent cannot demonstrate it in a timely fashion.
It is unclear how connectionist models might be assessed in terms of timing; many are offline models (as opposed to online models that can interact dynamically with the world).
To capture all the aspects of timing for a task, you need to capture all the aspects of the task, including the perceptual and motor aspects.
These peripheral aspects of the architecture are much more strongly developed in ACT-R, but this is probably because it is a single model. When connectionist modelling gives rise to large integrated architectures this may change.
Classical connectionism: worse, ACT-R: best

3 - Exhibit rational, i.e. effective, adaptive behaviour
Does the system yield functional behaviour in the real world?
Both systems use statistical methods to capture regularities in the environment.
Both systems allow for emergence rather than just hard-coding arbitrary constraints.
(This criterion arose from Newell’s criticism of some older models of things like short-term memory, which hard-coded capacity limitations so that they could reproduce empirical observations from real people, even if the models would perform more adaptively with greater capacity.)
Classical connectionism: better, ACT-R: better

4 - Use vast amounts of knowledge about the environment
How does the size of the knowledge base affect performance?
How well does performance scale up as the size of the knowledge base increases?
Connectionist systems scale up badly, but ACT-R is limited, like all declarative systems, by issues such as the Frame Problem.
Classical connectionism: worse, ACT-R: mixed

5 - Success in dynamic environments
ALVINN (an ANN): good at driving on straight stretches of highway, bad at dealing with unpredictable situations.
The reactive/deliberative (prepared/deliberative) trade-off.
Linking perception to action.
ACT-R: driving, air traffic control, control of power plants, game playing, collaborative problem solving with humans.
Classical connectionism: mixed, ACT-R: better

6 - Integrate diverse knowledge
This criterion was originally described by Newell as the need for symbols and abstraction, but describing a requirement that way is too loaded.
Anderson and Lebiere’s solution is to frame this criterion in terms of the function that Newell’s test requires of symbols.
For Newell, a key function of symbols is distal access, that is, getting information quickly and efficiently between different cognitive subsystems.
Newell (1990) and Anderson and Lebiere (2003) both conclude that symbols (of the type used in programming languages such as POP11, LISP or PROLOG) are required to carry out this function.
A future form of connectionism may not only come up with a distributed form of representation that can act as symbols do in ACT-R; this distributed representation may also overcome problems with current symbolic computation (O’Reilly 2006).
Classical connectionism: worse, ACT-R: mixed

6 - Integrate diverse knowledge - O’Reilly (2006)
O’Reilly (2006, conclusion on page 94):
“Scientists are always concerned about strongly differentiating theoretical positions: the long dominance and current disfavour of the computer metaphor for understanding the mind has led the new generation of biological neural network theorists to emphasise the graded, analog, distributed character of the brain. It is clear that the brain is much more like a social network than a digital computer, with learning, memory and processing all being performed locally through graded communication between interconnected neurons. These neurons build up strong, complex ‘relationships’ over a long period of time; a neuron buried deep in the brain can only function by learning which of the other neurons it can trust to convey useful information.

6 - Integrate diverse knowledge - O’Reilly (2006)
In contrast, a digital computer functions like the post office, routing arbitrary symbolic packages between passive memory structures, without consideration for the content of these packages. This affords arbitrary flexibility (any symbol is as good as any other), but at some cost: When everything is arbitrary, then it is difficult to encode the subtle and complex relationships present in our commonsense knowledge of the real world. In contrast, the highly social neural networks of the brain are great at keeping track of “who’s who and what’s what,” but they lack flexibility, treating a new symbol like a stranger crashing a party. The digital features of the PFC and associated areas help to broaden the horizons of naturally parochial neural networks. The dynamic gating mechanisms work more like a post-office, with the basal ganglia reading the zip code of which PFC strip to update, whereas the PFC cares more about the content of the package. Furthermore, the binary rule-like representations in the PFC are more symbol-like. Thus, perhaps a fuller understanding of this synthesis of analog and digital computation will finally unlock the mysteries of human intelligence.”

7 - Language
At one time, language use was a prime example of a domain thought difficult for associative theories of cognition such as connectionism.
However, numerous examples of connectionist successes with language use have now been developed:
Over-generalisations learnt from experience (e.g. in past-tense learning)
Syntactic parsing
Classical connectionism: better, ACT-R: worse

8 - Self-awareness - consciousness
Neither framework makes a great impact on this requirement.
Recurrent connectionist networks may be a starting point for self-awareness, and the buffers in ACT-R may be a starting point for consciousness, but it is early days for both frameworks.
Classical connectionism: worse, ACT-R: worse

9 - Learning
Learning is a strength of both connectionism and ACT-R, and the two approaches possess complementary strengths.
ACT-R does better on cognitive skills and list learning.
Connectionism does better on perceptual and motor learning and semantic memory (see the model of the hippocampus in criterion 12).
Classical connectionism: better, ACT-R: better

10 - Development
Connectionism makes a clear stand on the empiricist-nativist debate, rejecting representational nativism.
How do the symbols in ACT-R first come about in the course of development?
Classical connectionism: better, ACT-R: worse

11 - Arise through evolution
Neither framework makes a great impact on this requirement.
Classical connectionism: worst, ACT-R: worst

12 - Realisability within the brain
Simulation of the hippocampus demonstrates connectionism’s real strength in meeting this criterion.
Classical connectionism: best, ACT-R: worse
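To get an overview of the twelve verdicts, the grades listed under each criterion above can be collected and counted. The sketch below only tallies how often each grade occurs for each framework; it deliberately avoids turning grades into numeric scores, since any such weighting would be an assumption not made in these slides. The names `GRADES`, `conn_tally` and `actr_tally` are illustrative.

```python
from collections import Counter

# criterion number -> (classical connectionism grade, ACT-R grade),
# copied from the verdict line of each criterion above.
GRADES = {
    1: ("mixed", "better"),
    2: ("worse", "best"),
    3: ("better", "better"),
    4: ("worse", "mixed"),
    5: ("mixed", "better"),
    6: ("worse", "mixed"),
    7: ("better", "worse"),
    8: ("worse", "worse"),
    9: ("better", "better"),
    10: ("better", "worse"),
    11: ("worst", "worst"),
    12: ("best", "worse"),
}

# Count how many criteria received each grade, per framework.
conn_tally = Counter(conn for conn, _ in GRADES.values())
actr_tally = Counter(actr for _, actr in GRADES.values())

# Criteria on which the two frameworks received different grades.
differing = [n for n, (conn, actr) in GRADES.items() if conn != actr]

print("Classical connectionism:", dict(conn_tally))
print("ACT-R:", dict(actr_tally))
print("Criteria with different grades:", differing)
```

The two frameworks end up with the same distribution of grades but on different criteria, which fits the closing point that they capture different sides of the same phenomena.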

Can you think of any further criteria for the Newell test?
Emotion
Multiple tasks
Distractibility
Meta-cognition
More naturalistic behaviours (rather than psychological experiments)
Perception and action

Conclusion and the future
ACT-R and other symbolic systems are more mature in their level of development than many connectionist models.
O’Reilly’s work is just one recent example of a large architecture. What will the future hold?
O’Reilly and the ACT-R group are collaborating. The two approaches may not be mutually exclusive, but may capture different sides of the same set of phenomena.
