Sentence semantics, word meaning, and nonlinear dynamics Hermann Moisl Newcastle University
Understanding word meaning in natural language is fundamental to understanding how language achieves its primary function, which is to convey information about attitudes to the world between and among speakers. The reason for this emerges from the principle of compositionality, which states that the meaning of a sentence is a function of the meanings of its constituent words and their combination into syntactic structures. Any satisfactory theory of natural language meaning therefore has to include an account of the nature of word meaning. A huge amount of work has been done on theories of natural language syntax and sentence semantics, but until recently there has been relatively little on word meaning. The present discussion is intended as a contribution to such an account.
In cognitive linguistics, language is an emergent property of general cognition rather than a functional module as in generative linguistics. The meanings of linguistic expressions are based on concepts learned via interactions of cognitive agents with their environments. Word meaning, in particular, is seen as association of words with the concepts they denote. The present discussion proposes a quantitative model of word meaning based on interaction with the environment.
Specifically, word meaning is modelled by a nonlinear dynamical artificial neural network in which the meaning of a word is a fixed-point attractor whose location in mental state space is learned from the external physical environment. The discussion is in three main parts:
1. Motivation for the proposed dynamical model
2. Model outline
3. Extension to sentence meaning
Motivation
In the 1933 film Duck Soup, Chico Marx asked: ‘Who are you going to believe? Me, or your own eyes?’ The motivation for the model proposed here is what I see with my own eyes.
Motivation
1. Why a dynamical system model?
The principles of system identification indicate that the mathematics of nonlinear dynamics is the obvious theoretical framework for modelling lexical meaning. System identification in science and engineering poses the following question: given a black box system, that is, a system whose internal workings are unavailable for inspection but whose input-output behaviour is observable, and given also a sufficiently large input-output data set, what mechanism is in the box?
Motivation
The answer is that, in principle, there is an arbitrary number of candidate mechanisms which can generate the observed input-output behaviour. The only way to be sure of what mechanism is ACTUALLY generating the behaviour is to look inside the box. For cognition and language more specifically, the black box is the human head. Numerous mechanisms have been proposed for the human language faculty by linguists on the basis of observed linguistic behaviour.
Motivation
When, however, one looks inside the box, as neuroscience has increasingly allowed one to do, one finds:
- A brain consisting of a large number of interconnected objects, neurons, each with a relatively simple input-output behaviour.
- The relationship between neuronal input and output is nonlinear.
- There are pervasive feedback connections among neurons.
- The operation of the brain unfolds over time.
- The brain receives input from and sends output to the external environment and modifies its behaviour in response to those inputs and outputs.
There is a mathematical theory for modelling such systems: nonlinear dynamics.
Motivation
In terms of that theory, the brain is a high-dimensional, non-autonomous, adaptive, nonlinear dynamical system. Assuming, one hopes uncontroversially, that the brain causes cognition, if one wants to model how it ACTUALLY does this rather than how it MIGHT, nonlinear dynamical systems theory is the obvious place to start. It may be that the brain implements a so-called ‘classical’ cognitive architecture, that is, a Turing computational architecture in which recursive symbol strings are algorithmically transformed. This has been the standard view in cognitive science and generative linguistics for decades, but one must remember that it is only a hypothesis, and its validity is an empirical matter.
Motivation
2. Why an artificial neural network (ANN) model?
To some approximation, ANN architectures are like brain architectures, and are therefore the obvious thing to use, on the Chico Marx principle.
2. The model
Cognitive as opposed to truth-conditional theories of meaning in language are those which attempt to connect words with other cognitive entities variously described as ‘concepts’ or ‘thoughts’. Linguistics has historically had little to say on the nature of concepts and thoughts or about how they are connected to words. Cognitive science and AI have, on the other hand, been much concerned with these things in their discussions of the nature of mental representations.
2. The model
The most convincing model of lexical meaning is that pioneered by the philosopher Charles Peirce (1839-1914), and developed most recently by Burton-Roberts (2014), in which:
- Words and concepts are representations of the environment physically implemented in the brain.
- The meaning of a word is not the concept it denotes, as most existing models see it, but is rather a function: meaning = f(word, concept)
This part of the talk proposes implementations of word and concept representations and of the meaning function.
2. The model: The nature of nonlinear dynamical systems
Dynamical systems are mathematical structures used to model physical phenomena that change over time. The state of the phenomenon at any given time t is modelled as an n-dimensional vector of values, where each vector element describes a different aspect of the phenomenon. This vector is the dynamical system’s mathematical state corresponding to the phenomenon’s physical one. The space of possible system states is called the state space of the dynamical system. The evolution of the system’s state is a trajectory through the n-dimensional space of possible system states.
2. The model: The nature of nonlinear dynamical systems
A rule describes where in the state space the system is at a given time t(i). For continuous time, that rule takes the form of a set of differential equations, and for discrete time a set of difference equations. For nonlinear systems the equations are nonlinear. Most natural systems are too complex to permit the equations to be written down explicitly, so a model which obeys the (unknown) equations is inferred from input-output data. The model proposed here is a nonlinear, discrete-time, learning dynamical system.
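As an illustration of a discrete-time nonlinear difference equation and the trajectory it generates, here is a minimal Python sketch. The update rule x(t+1) = tanh(W x(t) + b), the weights W, the bias b and the starting states are all assumptions chosen for the example, not part of the talk's model.

```python
import math

def step(state, weights, bias):
    """Apply the difference equation x(t+1) = tanh(W x(t) + b) once."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, state)) + b)
        for row, b in zip(weights, bias)
    ]

def trajectory(initial_state, weights, bias, n_steps):
    """Return the list of states visited from t(0) to t(n_steps)."""
    states = [list(initial_state)]
    for _ in range(n_steps):
        states.append(step(states[-1], weights, bias))
    return states

# Hypothetical 2-dimensional system. The absolute row sums of W are below 1,
# so the map is a contraction and every trajectory settles on the same
# fixed-point attractor, whatever the starting state.
W = [[0.5, -0.3], [0.2, 0.4]]
b = [0.1, -0.2]
path = trajectory([1.0, -1.0], W, b, 100)
other = trajectory([-0.5, 0.8], W, b, 100)
print(path[-1], other[-1])
```

The two printed states should agree to many decimal places, which is what convergence on a point attractor means in this small case.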
2. The model: description
It is here proposed that the biological neural mechanism which generates word and concept representations in humans can usefully be modelled by a class of nonlinear dynamical systems which are adaptive and therefore able to learn such representations: artificial neural networks (ANN). As the name indicates, an ANN is an artificial device that emulates the structure and dynamics of biological brains to some degree of approximation. There are many possible ANN architectures, and several would be suitable here. The architecture in what follows, the multilayer perceptron (MLP), was selected for simplicity and convenience of exposition.
2. The model: description
Structurally, MLPs consist of more or less numerous interconnected artificial neurons, or ‘units’. The units are partitioned into three types: input units that receive signals from an environment, output units that make signals available to the environment, and ‘hidden’ units that are internal to the network and not directly accessible. Each unit has at least one and typically many connections through which it receives signals.
2. The model: description
The aggregate of these signals at any time t elicits a response from the unit that, in the case of input and hidden units, is propagated to other units in the network along connections, and in the case of output units is made available to the environment. The nature of the response is a function f of the input signals. The choice of function is essentially arbitrary, and it can be linear or nonlinear. A frequently used nonlinear one is the sigmoid: y = 1 / (1 + e^{-x})
2. The model: description
Any connection between units may be more or less efficient in transmitting signals; there is typically a significant variation in efficiency among the connections in a network. Signals applied at the input units are propagated through the net and emerge at the output units, usually transformed in some way. The transformation of the input signals is conditioned by the nature of the constituent units’ response to input signals, and by the pattern of interconnection among the units together with the efficiencies of those connections.
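The response of a single unit can be sketched in a few lines of Python. The sketch assumes that the aggregate of incoming signals is a sum weighted by the connection efficiencies, which is the usual choice for MLPs; the function names and the numerical values are hypothetical.

```python
import math

def sigmoid(x):
    """The sigmoid response function y = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def unit_response(signals, efficiencies):
    """Response of one unit: sigmoid of the efficiency-weighted sum of its inputs."""
    aggregate = sum(s * w for s, w in zip(signals, efficiencies))
    return sigmoid(aggregate)

# Three incoming signals and three made-up connection efficiencies.
response = unit_response([0.2, 0.9, 0.4], [1.5, -2.0, 0.7])
print(response)
```

Because the sigmoid squashes any aggregate into the interval (0, 1), the unit's response is a nonlinear function of its inputs.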
2. The model: description
The connection efficiencies are typically not specified by the designer, but are learned from input-output training data pairs by a learning rule specified by the designer. This learning rule modifies the connection efficiencies over more or less numerous time steps until modifications stop, that is, until the input-output associations have been learned. For each input-output pair, therefore, the ANN dynamical system converges on a point attractor. This point attractor is the vector of activation values of the n hidden layer units in n-dimensional space.
2. The model: description
The proposed model is based on the network architecture exemplified in the foregoing discussion: an autoassociative multi-layer perceptron, or aMLP. There are three layers of units:
- An input layer consisting of n units, each of which receives a value from the corresponding element of an n-dimensional vector representing some aspect of the environment.
- An output layer of n units.
- A hidden layer of m < n units.
2. The model: description
Starting with some arbitrary configuration of connection strengths, the aim is to configure the connections so that, for any vector vi chosen from a set V = {v1…vk} of input vectors, the output will be vi. In other words, the aim is to configure the network to autoassociate input vectors. This is done by training using some learning rule such as backpropagation: a vector vi ∈ V is presented to the input units, and the connections are adjusted so that the output at the output units comes closer to the input. Over a greater or lesser number of training steps, adjusting the connections at each step, the autoassociation of the entire set V is learned.
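The training procedure just described can be sketched as follows. This is a minimal pure-Python illustration, assuming sigmoid units, a squared-error objective and plain online backpropagation; the set V, the layer sizes, the learning rate and the number of epochs are made-up values, and nothing here is claimed to reproduce the network used in the talk.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(v, w_hid, w_out):
    """Propagate v through the net; the last weight in each row acts on a bias of 1."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, v + [1.0]))) for row in w_hid]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output

def train_amlp(V, m, epochs=2000, rate=0.5, seed=0):
    """Train an n-m-n autoassociative MLP on the vectors in V by backpropagation."""
    rng = random.Random(seed)
    n = len(V[0])
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n + 1)] for _ in range(m)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(m + 1)] for _ in range(n)]
    for _ in range(epochs):
        for v in V:
            hidden, output = forward(v, w_hid, w_out)
            # Output deltas: error times the derivative of the sigmoid.
            d_out = [(o - t) * o * (1 - o) for o, t in zip(output, v)]
            # Hidden deltas: output deltas passed back through the output weights.
            d_hid = [h * (1 - h) * sum(d_out[k] * w_out[k][j] for k in range(n))
                     for j, h in enumerate(hidden)]
            for k in range(n):
                for j, h in enumerate(hidden + [1.0]):
                    w_out[k][j] -= rate * d_out[k] * h
            for j in range(m):
                for i, x in enumerate(v + [1.0]):
                    w_hid[j][i] -= rate * d_hid[j] * x
    return w_hid, w_out

def total_error(V, w_hid, w_out):
    """Summed squared difference between each input vector and its reconstruction."""
    return sum((o - t) ** 2 for v in V for o, t in zip(forward(v, w_hid, w_out)[1], v))

# Hypothetical input set V: four 4-dimensional vectors, hidden layer of m = 2 units.
V = [[0.9, 0.1, 0.1, 0.1], [0.1, 0.9, 0.1, 0.1],
     [0.1, 0.1, 0.9, 0.1], [0.1, 0.1, 0.1, 0.9]]
untrained = train_amlp(V, m=2, epochs=0)   # same seed, so same starting weights
w_hid, w_out = train_amlp(V, m=2)
error_before = total_error(V, *untrained)
error_after = total_error(V, w_hid, w_out)
hidden, output = forward(V[0], w_hid, w_out)
print(error_before, error_after, hidden)
```

The `hidden` vector returned by `forward` is the lower-dimensional representation of the input; it is this vector, not the output, that the model treats as the learned representation.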
2. The model: description
It is important here to note that, unlike computational models in which the connection between a symbol and what it represents is arbitrary, here it is not. These networks create a lower-dimensional representation of the input, and the characteristics of the input determine the representation, i.e., the relationship between the representation and what it represents is non-random. The hidden layer activation pattern places the representation of each input at a different location in the m-dimensional state space of the hidden layer, where m is the number of hidden units.
2. The model: description
For example, if the input was a phonetic description of the word ‘man’, say, the learning dynamics of the network would locate its representation in its m-dimensional hidden layer state space, where m is the number of hidden units. This is shown for m = 3 in the learning trace opposite.
2. The model: description
A phonetic description of the word ‘house’ would be assigned to a different location in the space.
2. The model: description
This mechanism grounds symbols because:
1. The representations it generates are non-arbitrary: they are causally determined by the characteristics of the inputs.
2. For any set of inputs, the similarity relations among the representations preserve the similarity relations among the inputs.
In other words, the structure of the representation manifolds for words and visuals in the model’s hidden layer representational space mirrors the structure of environmental reality.
2. The model: description
This can be seen by comparing cluster analyses of the visual input set used for exemplification here and the corresponding representation set generated by the neural network.
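A simple numerical check of similarity preservation, offered as a stand-in for the cluster analyses rather than a reproduction of them, is to correlate the pairwise distances among the inputs with the pairwise distances among their representations. In the sketch below the input vectors and the representation vectors are invented for illustration; in practice the representations would be hidden layer activations read off a trained network.

```python
import math
from itertools import combinations

def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair of vectors, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def similarity_preservation(inputs, representations):
    """Correlation between input-space and representation-space distances."""
    return pearson(pairwise_distances(inputs), pairwise_distances(representations))

# Hypothetical 4-dimensional inputs and made-up 2-dimensional representations.
inputs = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]]
representations = [[0.1, 0.1], [0.4, 0.1], [0.5, 0.5], [0.9, 0.9]]
score = similarity_preservation(inputs, representations)
print(score)
```

A score near 1 would indicate that inputs which are close together receive representations which are close together, which is the property claimed in point 2 above.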
2. The model: description
Word meaning was earlier defined as a function: meaning = f(word, concept)
Fixed-point dynamical models for word and visual concept representation have been defined. It remains to describe the fixed-point dynamical model for the meaning function.
2. The model: description
- An auditory aMLP generates non-arbitrary representations of spoken words.
- A visual aMLP generates non-arbitrary representations of visual inputs; this could, of course, be any other sensory input modality, or a combination of all of them.
- An associative MLP (note: not an autoassociative MLP) learns to associate the representations generated by the auditory and visual aMLPs.
2. The model: description
The associative MLP implements the meaning function such that, for any (auditory, visual) pair, the activation configuration of the hidden layer is the meaning of the word corresponding to the auditory input.
2. The model: outline
As before, learning of each auditory-visual association converges on a fixed point in the association network’s state space. The trace opposite shows the convergence for one and three words.
3. Extension to sentence semantics
How does this model of word meaning extend to sentence meaning? It is possible in principle to interpret the physical configurations of the association network hidden layer as grounded symbols and to incorporate them into the compositional phrase structure of conventional Turing-computation models of sentence meaning. But there is a simpler way to do it, and one that sits more comfortably with nonlinear dynamical modelling.
3. Extension to sentence semantics
Given a sequence of (word, visual) pairs, say ‘the cat sat on the mat’, and given that the pairs are presented at discrete time steps t(1), t(2)…t(n), the meaning of a word at step t(i) is now a function not only of the current word and concept representations at t(i) but also of the meaning at t(i-1), which was itself a function of the word and concept representations at t(i-1) together with the meaning at t(i-2), and so on. In other words, the meaning of a word at time step t(i) is a function of the word and concept representations at t(i) together with the history of meanings from the start of the sequence to t(i), or, in yet other words, the meaning of the sequence is cumulative.
3. Extension to sentence semantics
The input sequence of (word, concept) representations drives the corresponding sequence of meanings through a trajectory of fixed points in the n-dimensional meaning space. The point in the trajectory at step t(i) is the meaning of the sequence at that step.
3. Extension to sentence semantics
Moreover, each input sequence generates a different trajectory in the meaning space.
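The cumulative dependence of the meaning at t(i) on the meaning at t(i-1) can be sketched with a simple recurrent update. This is an assumption-laden illustration, not the talk's trained network: the tanh update rule, the random fixed weights, the 3-dimensional meaning space and the (word, concept) vectors in `lexicon` are all hypothetical, and the sketch applies one update per step instead of training to a fixed point at each step.

```python
import math
import random

def make_weights(rows, cols, rng):
    """Hypothetical fixed connection strengths; a real model would learn them."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def meaning_step(word, concept, prev_meaning, w_in, w_rec):
    """Meaning at t(i) as a function of word and concept at t(i) and meaning at t(i-1)."""
    x = word + concept  # concatenate the two representation vectors
    return [
        math.tanh(
            sum(w * v for w, v in zip(w_in[j], x))
            + sum(w * h for w, h in zip(w_rec[j], prev_meaning))
        )
        for j in range(len(prev_meaning))
    ]

def meaning_trajectory(pairs, w_in, w_rec, dim):
    """Drive the meaning state through the sequence and record each point visited."""
    meaning = [0.0] * dim
    path = []
    for word, concept in pairs:
        meaning = meaning_step(word, concept, meaning, w_in, w_rec)
        path.append(meaning)
    return path

rng = random.Random(0)
DIM = 3
w_in = make_weights(DIM, 4, rng)     # 2 word values + 2 concept values per step
w_rec = make_weights(DIM, DIM, rng)  # feedback from the previous meaning

# Made-up (word, concept) representation pairs for three words.
lexicon = {
    "cat": ([0.9, 0.1], [0.8, 0.2]),
    "sat": ([0.2, 0.7], [0.3, 0.6]),
    "mat": ([0.5, 0.5], [0.1, 0.9]),
}
path_a = meaning_trajectory([lexicon[w] for w in ["cat", "sat", "mat"]], w_in, w_rec, DIM)
path_b = meaning_trajectory([lexicon[w] for w in ["mat", "sat", "cat"]], w_in, w_rec, DIM)
print(path_a[-1], path_b[-1])
```

The two sequences contain the same words in a different order, yet they trace different paths through the meaning space, because each step depends on the history that precedes it.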
3. Extension to sentence semantics
Interpreted computationally, this model is finite state, and like all finite state models it makes no reference to recursive complex phrase structure. Chomsky rejected finite state models as observationally, descriptively, and explanatorily inadequate at the very outset of the generative linguistics paradigm in the 1950s, and they have remained rejected ever since. The argument against observational and descriptive adequacy applies only on the assumption that the class of natural languages permits unbounded-length strings, that is, that the natural languages are infinite string sets. Empirically, natural language strings are not unbounded, nor, in a finite universe, can they be in principle, so the natural languages are in fact finite string sets.
3. Extension to sentence semantics
A standard result in automata theory is that every finite language can be defined by a finite state machine. This means that there can be no objection in principle to finite state models of natural language. A corollary is that, for any given finite set of strings, there is no way of determining which class of grammars or automata in the Chomsky hierarchy generated it. For any set of natural language strings, therefore, the choice of grammatical or automata class used to model it is unconstrained. Which one chooses depends on what one finds explanatorily adequate.
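The automata-theoretic result can be made concrete with a short sketch that builds a deterministic finite state machine accepting exactly a given finite set of sentences. The construction used here is the standard trie (prefix tree) one; the function names and the three-sentence toy language are assumptions made for the example.

```python
def build_dfa(language):
    """Build a trie-shaped finite state machine for a finite set of sentences."""
    transitions = [{}]  # transitions[state][word] -> next state; state 0 is the start
    accepting = set()
    for sentence in language:
        state = 0
        for word in sentence.split():
            if word not in transitions[state]:
                transitions.append({})
                transitions[state][word] = len(transitions) - 1
            state = transitions[state][word]
        accepting.add(state)  # the state reached at the end of a sentence accepts
    return transitions, accepting

def accepts(dfa, sentence):
    """Return True if the machine ends in an accepting state after reading the sentence."""
    transitions, accepting = dfa
    state = 0
    for word in sentence.split():
        if word not in transitions[state]:
            return False
        state = transitions[state][word]
    return state in accepting

# A toy finite language of three sentences.
language = ["the cat sat on the mat", "the cat sat", "the dog sat on the mat"]
dfa = build_dfa(language)
print(accepts(dfa, "the cat sat on the mat"), accepts(dfa, "the mat sat on the cat"))
```

Since the construction adds at most one state per word of each sentence, it terminates for any finite language, which is all the result requires.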
3. Extension to sentence semantics
Because finite state machines assign the same, strictly sequential structure to all strings, they are not explanatorily adequate; that level of adequacy requires recursive complex phrase structure. Does this compromise the proposed finite state model of sentence meaning? The answer depends on what kind of explanation one wants:
- Recursive complex phrase structure is intuitively appealing. However, nature is as it is, and is not constrained to work in ways which are intuitively appealing to humans.
- An explanation in terms of nonlinear dynamics is equally valid: nonlinear dynamics is being used to model other highly complex systems like the weather, population dynamics, and the world economy, so why not language?