Brains, logic and computational models. Włodzisław Duch Department of Informatics, Nicolaus Copernicus University, Toruń, Poland Google: W. Duch Argumentation as a cognitive process, Torun, May 2008
Problem • How do brains, using massively parallel computations, perform logical thinking? • Boltzmann (1899): “All our ideas and concepts are only internal pictures”. • Wittgenstein (1922) in Tractatus: thoughts are pictures of how things are in the world, propositions point to pictures. • Kenneth Craik (1943): the mind constructs "small-scale models" of reality that it uses to anticipate events, to reason, and to underlie explanation. • Piaget: humans develop a context-free deductive reasoning scheme at the level of elementary FOL. • Johnson-Laird (1983): mental models are psychological representations of real, hypothetical or imaginary situations. • Sure, but how is it really done by the brain?
Plan • Brains, minds and logic: what really happens at the physical level and how to conceptualize it? • Neuron level: perceptrons & hard logical problems. • Right hemisphere and creativity. • Mental models. • Mapping mental to physical. • Platonic model of the mind. • Inverse base rates categorization experiments: psychology vs. neurodynamics.
Our fields of research Computational Intelligence. An International Journal (1984) + 10 other journals with "Computational Intelligence" in the title; D. Poole, A. Mackworth & R. Goebel, Computational Intelligence - A Logical Approach (OUP 1998), a GOFAI book on logic and reasoning. • CI: lower cognitive functions, perception, signal analysis, action control, sensorimotor behavior. • AI: higher cognitive functions, thinking, reasoning, planning etc. • Neurocognitive informatics: brain processes can be a great inspiration for AI algorithms, if only we could understand them ... What are neurons doing? Perceptrons, the basic units in multilayer perceptron networks, use threshold logic. What are the networks doing? Specific transformations, memory, estimation of similarity. How do higher cognitive functions map to brain activity?
Neurons & logic Neurons are very complex biological units, sending spikes. The simplest abstraction of their function: voting devices. IF a θ% majority votes "yes" THEN say Yes. IF M of N conditions are true THEN Yes - equivalent to N!/(M!(N-M)!) propositional statements. Some voters may have more power, so vote X_i has weight W_i. Step neuron: Θ(W·X) ∈ {0, 1}; graphical representation of the discriminant function g_W(X) = Θ(W·X). Nature prefers continuous functions: if a smooth sigmoidal output σ(W·X) is used, such a unit is called a "perceptron". Combining perceptrons into multilayer perceptron networks (MLPs) approximates arbitrary decision functions: the MLP is a "universal approximator".
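A minimal sketch of this threshold-logic voting in code (numpy; the weights and thresholds are illustrative choices, not taken from the talk):

```python
# A neuron as a weighted voting device: step (threshold) vs. sigmoidal output.
import numpy as np

def step_neuron(w, x, theta=0.0):
    """M-of-N / threshold logic: fire iff the weighted vote W.X crosses theta."""
    return 1 if np.dot(w, x) >= theta else 0

def perceptron(w, x, theta=0.0):
    """Smooth (sigmoidal) version: graded output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) - theta)))

# "IF 2 of 3 conditions are true THEN Yes": equal weights, theta = 2.
w = np.ones(3)
for x in [(1, 1, 0), (1, 0, 0), (0, 1, 1)]:
    print(x, step_neuron(w, np.array(x), theta=2))   # prints 1, 0, 1
```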
Are we really so good? Surprise! Almost nothing can be learned using MLP networks and other similar tools! This follows from the "no free lunch" and "ugly duckling" theorems, as well as from common practice.
Human categorization Categorization is quite basic; many psychological models/experiments. Multiple brain areas are involved in different categorization tasks. Classical experiments on rule-based category learning: Shepard, Hovland and Jenkins (1961), replicated by Nosofsky et al. (1994). Problems of increasing complexity; results determined by logical rules. 3 binary-valued dimensions: shape (square/triangle), color (black/white), size (large/small). 4 objects in each of the two categories presented during learning. Type I - categorization using one dimension only. Type II - two dimensions are relevant, including the exclusive-or (XOR) problem. Types III, IV, and V - intermediate complexity between Types II and VI; all 3 dimensions relevant, "single dimension plus exception" type. Type VI - most complex: 3 dimensions relevant, enumerate, no simple rule. Difficulty (number of errors made): Type I < II < III ~ IV ~ V < VI. For n bits there are 2^n binary strings 0011…01; how complex are the rules (logical categories) that human/animal brains can still learn?
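Some of these problem types can be stated compactly in code; a sketch of the stimulus space, with an assumed assignment of dimensions to bit positions:

```python
# Shepard-Hovland-Jenkins stimulus space: 3 binary dimensions give 8 objects,
# 4 per category. Which dimension is which bit is an illustrative assumption.
import itertools

stimuli = list(itertools.product([0, 1], repeat=3))  # (shape, color, size)

type_I  = {s: s[0] for s in stimuli}                 # one relevant dimension
type_II = {s: s[0] ^ s[1] for s in stimuli}          # XOR, third bit irrelevant
type_VI = {s: s[0] ^ s[1] ^ s[2] for s in stimuli}   # parity: no simple rule

for s in stimuli:
    print(s, type_I[s], type_II[s], type_VI[s])
```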
Neurons learning complex logic Boolean functions are difficult to learn, requiring combinatorial complexity; similarity is not useful - for parity, all nearest neighbors are from the wrong class. MLP networks have difficulty learning functions that are highly non-separable. Example: 2-4D parity problems. Neural logic can solve them without counting; find a good point of view: projection on W = (1, 1, ..., 1) gives clusters with 0, 1, 2, ..., n bits. The solution requires abstract imagination + easy categorization.
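A quick check of the projection claim, assuming the standard 0/1 bit coding: every cluster produced by projecting on W = (1, ..., 1) is parity-pure.

```python
# Projecting n-bit strings on the diagonal W = (1,...,1) counts the set bits;
# parity classes then alternate along a single line, so the non-separable
# problem becomes easy interval-wise categorization.
import itertools

n = 4
clusters = {}
for x in itertools.product([0, 1], repeat=n):
    y = sum(x)                                  # projection on W = (1,...,1)
    clusters.setdefault(y, set()).add(sum(x) % 2)
print(clusters)   # {0:{0}, 1:{1}, 2:{0}, 3:{1}, 4:{0}} - one parity per cluster
```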
Abstract imagination Transformation of visual stimuli to a space where clustering is easy. Intuitive answers, as propositional rules may be difficult to formulate. Here fuzzy XOR (stimuli color/size/shape are not identical) has been transformed by two groups of neurons that react to similarity. Network-based intuition: they know the answer, but cannot say it ... If the image of the data forms non-separable clusters in the inner (hidden) space, network outputs will often be wrong.
Biological justification • A single cortical column may learn to respond to stimuli with complex logic, resonating in different ways. • A second column may easily learn that such different reactions have the same meaning: inputs x_i and training targets y_j are the same => Hebbian learning ΔW_ij ~ x_i y_j => identical weights. • Effect: the same linear projection y = W·X, but inhibition turns off one perceptron when the other is active. • Simplest solution: oscillators based on a combination of two neurons, σ(W·X-b) - σ(W·X-b'), give localized projections! • We have used them in the MLP2LN architecture for extraction of logical rules from data. • Note: k-separability learning is not a multistep output neuron; targets are not known, and same-class vectors may appear in different intervals! • New algorithms are needed to learn it!
[Figure: network for 4-separable problems - inputs X1..X4 feed a linear unit y = W·X with weights ±1, followed by sigmoidal window units σ(βy+θ_1), σ(βy+θ_2), σ(βy+θ_4).] Network solution Can one learn the simplest model for an arbitrary Boolean function? 2-separable (linearly separable) problems are easy; non-separable problems may be broken into k-separable ones, k>2. Neural architecture for k=4 intervals, or 4-separable problems. Blue: sigmoidal neurons with threshold; brown: linear neurons. An abstraction of the cortical column.
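A hand-wired sketch of this architecture for 4-bit parity (weights, window positions and sigmoid steepness are set by hand as illustrative assumptions, not learned):

```python
# k-separability for 4-bit parity: one linear unit y = W.X, then sigmoidal
# "window" units (each a difference of two sigmoids) that open around the
# odd bit-counts y = 1 and y = 3.
import itertools
import numpy as np

def sigma(z, beta=20.0):
    return 1.0 / (1.0 + np.exp(-beta * z))

def window(y, lo, hi):
    """Soft indicator of lo < y < hi, built from two sigmoids."""
    return sigma(y - lo) - sigma(y - hi)

w = np.ones(4)
for x in itertools.product([0, 1], repeat=4):
    y = np.dot(w, x)
    out = window(y, 0.5, 1.5) + window(y, 2.5, 3.5)   # odd-count windows
    print(x, sum(x) % 2, round(out))                  # output matches parity
```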
Parity n=9: simple gradient learning. [Figure: quality-index plot, not reproduced.]
Symbols in the brain Organization of the word recognition circuits in the left temporal lobe has been elucidated using fMRI experiments (Cohen et al. 2004). How do words that we hear, see or are thinking of activate the brain? Seeing words: orthography, phonology, articulation, semantics. The lateral inferotemporal multimodal area (LIMA) reacts to auditory & visual stimulation, and has cross-modal phonemic and lexical links. The adjacent visual word form area (VWFA) in the left occipitotemporal sulcus is unimodal. Likely: a homolog of the VWFA in the auditory stream, the auditory word form area, located in the left anterior superior temporal sulcus. Large variability in the location of these regions in individual brains. Left hemisphere: precise representations of symbols, including phonological components; right hemisphere: sees clusters of concepts.
Words in the brain Psycholinguistic experiments show that most likely categorical, phonological representations are used, not the acoustic input. Acoustic signal => phonemes => words => semantic concepts. Phonological processing precedes semantic processing by ~90 ms (from N200 ERPs). F. Pulvermüller (2003) The Neuroscience of Language. On Brain Circuits of Words and Serial Order. Cambridge University Press. Action-perception networks inferred from ERP and fMRI. Phonological neighborhood density = the number of words that are similar in sound to a target word; similar = similar pattern of brain activations. Semantic neighborhood density = the number of words that are similar in meaning to a target word.
Creativity with words The simplest testable model of creativity: • create interesting novel words that capture some features of products; • understand new words that cannot be found in the dictionary. The model is inspired by the putative brain processes at work when new words are invented, starting from some keywords priming the auditory cortex. Phonemes (allophones) are resonances; ordered activation of phonemes will activate both known words and their combinations; context + inhibition in the winner-takes-most process leaves only a few candidate words. Creativity = network + imagination (fluctuations) + filtering (competition). Imagination: chains of phonemes activate both word and non-word representations, depending on the strength of the synaptic connections. Filtering: based on associations, emotions, phonological/semantic density. Examples: discoverity = {disc, disco, discover, verity} (echoes discovery, creativity, verity); digventure = {dig, digital, venture, adventure} - new! Server: http://www-users.mat.uni.torun.pl/~macias/mambo/index.php
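For illustration only, a letter-level stand-in for the phoneme-chain idea: overlap-based blending of keywords generates candidate novel words, with the competition/filtering stage left out.

```python
# Crude analogue of phoneme-chain activation: blend two keywords wherever a
# suffix of one matches a prefix of the other. The real model operates on
# phonemes and filters candidates by network dynamics, not string overlap.
def blends(a, b, min_overlap=2):
    out = set()
    for k in range(min_overlap, min(len(a), len(b))):
        if a[-k:] == b[:k]:
            out.add(a + b[k:])
    return out

print(blends("discover", "verity"))   # {'discoverity'}
print(blends("dig", "adventure"))     # no overlap -> set()
```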
Problems requiring insights Given 31 dominoes and a chessboard with 2 opposite corners removed, can you cover the whole board with dominoes? Analytical solution: try all combinations. Does not work ... too many combinations to try. [Figure: chessboard with two corners removed; a domino covers one black and one white square.] A logical, symbolic approach has little chance to create the proper activations in the brain that link new ideas: otherwise there would be too many associations, making thinking difficult. Insight <= right hemisphere, meta-level representations without phonological (symbolic) components ... counting? Each domino covers one black and one white square, but the two removed corners share a color, so a full covering is impossible.
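The counting argument can be verified mechanically:

```python
# A domino always covers one black and one white square; removing two
# opposite corners removes two squares of the same color, so 31 dominoes
# can never tile the mutilated board.
black = sum((r + c) % 2 == 0 for r in range(8) for c in range(8))
white = 64 - black                # 32 and 32 on the full board
black -= 2                        # corners (0,0) and (7,7): both "black"
print(black, white)               # 30 vs 32 -> no perfect matching exists
```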
Insights and brains Activity of the brain while solving problems that required insight, or that could be solved in a schematic, sequential way, has been investigated. E.M. Bowden, M. Jung-Beeman, J. Fleck, J. Kounios, "New approaches to demystifying insight", Trends in Cognitive Sciences, 2005. After solving a problem presented in a verbal way, subjects indicated themselves whether they had an insight or not. Increased activity of the right hemisphere anterior superior temporal gyrus (RH-aSTG) was observed during initial solving efforts and insights. About 300 ms before insight a burst of gamma activity was observed, interpreted by the authors as "making connections across distantly related information during comprehension ... that allow them to see connections that previously eluded them".
Insight interpreted What really happens? My interpretation: • LH-STG represents concepts, S = start, F = final; • understanding, solving = transition, step by step, from S to F; • if no connection (transition) is found, this leads to an impasse; • RH-STG 'sees' LH activity on a meta-level, clustering concepts into abstract categories (cosets, or constrained sets); • a connection between S and F is found in RH, leading to a feeling of vague understanding; • a gamma burst increases the activity of LH representations for S, F and intermediate configurations; a feeling of imminent solution arises; • a stepwise transition between S and F is found; • finding the solution is rewarded by emotions during the Aha! experience; they are necessary to increase plasticity and create permanent links.
Mental models Kenneth Craik, 1943 book "The Nature of Explanation"; G.-H. Luquet attributed mental models to children in 1927. P. Johnson-Laird, 1983 book and papers. Imagination: mental rotation, time ~ angle, about 60°/sec. Internal models of relations between objects, hypothesized to play a major role in cognition and decision-making. AI: direct representations are very useful, but direct in some aspects only! Reasoning: imagining relations, "seeing" a mental picture - semantic? Systematic fallacies: a sort of cognitive illusion. • If the test is to continue, then the turbine must be rotating fast enough to generate emergency electricity. • The turbine is not rotating fast enough to generate this electricity. • What, if anything, follows? The Chernobyl disaster ... If A=>B, then ~B => ~A, but only about 2/3 of students answer correctly.
Mental models summary The mental model theory is an alternative to the view that deduction depends on formal rules of inference. • MMs represent explicitly what is true, but not what is false; this may lead a naive reasoner into systematic error. • A large number of complex models => poor performance. • A tendency to focus on a few possible models => erroneous conclusions and irrational decisions. Cognitive illusions are just like visual illusions. M. Piattelli-Palmarini, Inevitable Illusions: How Mistakes of Reason Rule Our Minds (1996); R. Pohl, Cognitive Illusions: A Handbook on Fallacies and Biases in Thinking, Judgement and Memory (2005). Amazing, but mental models theory ignores everything we know about learning in any form! How and why do we reason the way we do? I'm innocent! My brain made me do it!
Mental models Easy reasoning A=>B, B=>C, so A=>C: • All mammals suck milk. • Humans are mammals. • => Humans suck milk. A simple associative process, easy to simulate ... but almost no one can draw a conclusion from: • All academics are scientists. • No wise man is an academic. • What can we say about wise men and scientists? Surprisingly, only ~10% of students get it right, even after days of thinking. No simulations explain why some mental models are so difficult. Why is it so hard? What really happens in the brain? Try to find a new point of view to illustrate it (a brute-force model enumeration is sketched below).
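A mental-models-style brute-force check (assuming existential import, i.e. at least one academic exists; universes of up to 3 individuals suffice for this syllogism):

```python
# Enumerate small "worlds" of individuals with (Academic, Scientist, Wise)
# flags, keep those satisfying both premises, and see which candidate
# conclusions hold in every surviving model.
import itertools

def models(n_max=3):
    for n in range(1, n_max + 1):
        for world in itertools.product(
                itertools.product([0, 1], repeat=3), repeat=n):
            ok = all((not a or s) and not (w and a) for (a, s, w) in world)
            if ok and any(a for (a, _, _) in world):   # existential import
                yield world

candidates = {
    "No wise man is a scientist":
        lambda w: all(not (wi and si) for (_, si, wi) in w),
    "Some scientists are not wise men":
        lambda w: any(si and not wi for (_, si, wi) in w),
}
for name, concl in candidates.items():
    verdict = "follows" if all(concl(m) for m in models()) else "does not follow"
    print(name, "->", verdict)
```

Only "Some scientists are not wise men" holds in every model - the conclusion most subjects fail to find.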
Physics and psychology R. Shepard (BBS, 2001): psychological laws should be formulated in appropriate abstract psychological spaces. • Physics - macroscopic properties result from microscopic interactions. • Description of movement - invariant in appropriate spaces: • Euclidean 3D => Galilean transformations; • (3+1) pseudo-Euclidean space => Lorentz x-t transformations; • Riemannian curved space => laws invariant in accelerating frames. • Psychology - behavior and categorization result from neurodynamics. • Neural networks: microscopic description, too difficult to use. • Find psychological spaces that result from neural dynamics and allow one to formulate general behavioral laws.
Laws of generalization Shepard (1987), universal law of generalization. Tenenbaum, Griffiths (2001), a Bayesian framework unifying the set-theoretic approach (introduced by Tversky, 1977) with Shepard's ideas. Generalization gradients tend to fall off approximately exponentially with distance in an appropriately scaled psychological space. Distance - from MDS maps of perceived similarity of stimuli. G(D), the probability of generalizing a response learned at D=0 to a stimulus at distance D, falls off exponentially with distance for many visual/auditory tasks. Roger Shepard (1987): "What is required is not more data or more refined data but a different conception of the problem".
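For reference, the law in its usual exponential form (the scale parameter σ is free and task-dependent; it is not specified on the slide):

```latex
G(D) \;=\; e^{-D/\sigma}, \qquad D = d_{\mathrm{psych}}(x, y)
```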
P-spaces Psychological spaces: how to visualize inner life? K. Lewin, The conceptual representation and the measurement of psychological forces (1938), cognitive dynamic movement in phenomenological space. George Kelly (1955): personal construct psychology (PCP), geometry of psychological spaces as alternative to logic. A complete theory of cognition, action, learning and intention. PCP network, society, journal, software … quite active group. Many things in philosophy, dynamics, neuroscience and psychology, searching for new ways of understanding cognition, are relevant here.
P-space definition P-space: a region in which we may place and classify elements of our experience; constructed and evolving, "a space without distance", divided by dichotomies. • P-spaces should have (Shepard 1957-2001): • minimal dimensionality; • distances that monotonically decrease with increasing similarity. • This may be achieved using non-metric multidimensional scaling (MDS), reproducing similarity relations in low-dimensional spaces (a metric MDS sketch follows below). • Many object recognition and perceptual categorization models assume that objects are represented in a multidimensional psychological space; similarity between objects ~ 1/distance in this space. • Can one describe the state of mind in a similar way?
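A minimal metric (Torgerson) MDS sketch in numpy; Shepard-style psychological scaling is non-metric, but the metric version illustrates the same step from distances to low-dimensional coordinates:

```python
# Classical MDS: from a distance matrix to low-D coordinates whose
# pairwise distances approximate the originals.
import numpy as np

def classical_mds(D, dim=2):
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]           # keep top eigenvalues
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# three points on a line: distances 1, 1, 2
D = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]], float)
print(classical_mds(D, dim=1).ravel())   # collinear coords, up to sign/shift
```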
Object recognition Natural object recognition (S. Edelman, 1997) Second-order similarity in low-dimensional (<300) space is sufficient. Population of columns as weak classifiers working in chorus - stacking.
A bit of neurodynamics Amit's group, 1997-2001: simplified spiking-neuron models of column activity during learning. Stage 1: single columns respond to some feature. Stage 2: several columns respond to different features. Stage 3: correlated activity of many columns appears. Formation of new attractors => formation of mind objects. PDF: p(activity of columns | presented features).
From neurodynamics to P-spaces Walter Freeman: model of olfaction in rabbits, 5 types of odors, 5 types of behavior, a very complex model in between. Simplified models: H. Liljenström. Modeling input/output relations with some internal parameters. Attractors of dynamics in high-dimensional space => via fuzzy symbolic dynamics allow one to define probability densities (PDFs) in feature spaces (sketched below). Mind objects - created from fuzzy prototypes/exemplars.
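A sketch of the fuzzy-symbolic-dynamics step; the trajectory, reference states and widths below are synthetic stand-ins, not Freeman's model:

```python
# Map a high-dimensional activity trajectory to a 2D P-space via Gaussian
# membership in two reference ("attractor") states.
import numpy as np

rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=(200, 50)), axis=0) * 0.05  # fake 50-D activity

mu1, mu2 = traj[40], traj[160]          # two reference states along the path

def membership(x, mu, sigma=2.0):
    return np.exp(-np.sum((x - mu) ** 2) / (2 * sigma ** 2))

pspace = np.array([[membership(x, mu1), membership(x, mu2)] for x in traj])
print(pspace.shape)                     # (200, 2): the trajectory in P-space
```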
Brain to Mind Transform brain activity into mind activity! Brain-computer interfaces: a popular subject. • Identify the space of signals that are relevant to the inner experience (e.g. intentions for actions). • Transform brain signals to this space. • Interpret brain activity as intentions. • Examples: • BCI intentions for movements. • Recent fMRI experiments: is the subject computing a+b, or a-b? • Our attempts to understand EEG trajectories as a reflection of mind events.
Brain-like computing • I can see, hear and feel only my brain states! Ex: change blindness. • Cognitive processes operate on highly processed sensory data. • Redness, sweetness, itching, pain ... are all physical states of brain tissue. Brain states are physical, spatio-temporal states of neural tissue. In contrast to computer registers, brain states are dynamical, and thus contain in themselves many associations and relations. The inner world is real! Mind is based on relations among the brain's states. Computers and robots do not have an equivalent of such WM.
Static Platonic model: motivation Plato believed in reality of mind, ideal forms recognized by intellect. A useful metaphor: perceived mind content is like a shadow of ideal, real world of objects projected on the wall of a cave. (drawing: Marc Cohen) Real mind objects: shadows of neurodynamics? Neurocognitive science: show how to do it!
Static Platonic model Newton introduced space-time, the arena for physical events. Mind events need psychological spaces. • Goal: integrate neural and behavioral information in one model; create a model of mental processes at an intermediate level between psychology and neuroscience. • Static version: short-term response properties of the brain, behavioral (sensorimotor) or memory-based (cognitive). • Approach: • simplify neural dynamics, find invariants (attractors), characterize them in psychological spaces; • use behavioral data, represent them in psychological space. • Applications: object recognition, psychophysics, category formation in low-D psychological spaces, case-based reasoning.
Learning complex categories Recap: the Shepard, Hovland & Jenkins (1961) rule-based category learning tasks described earlier (Types I-VI over 3 binary dimensions), with difficulty ordering Type I < II < III ~ IV ~ V < VI.
Canonical neurodynamics. What happens in the brain during category learning? Complex neurodynamics <=> simplest, canonical dynamics. For all logical functions one may write corresponding equations. [The equations for XOR (Type II problems) and the corresponding feature space for the relevant dimensions A, B were given in a figure not reproduced here.]
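Since the original equations are not reproduced, the following is only an assumed illustration of what "canonical dynamics" for XOR could look like: gradient flow on an energy whose minima are exactly the XOR-consistent states (in ±1 coding, out = -a·b):

```python
# Assumption, not the model from the talk: E(a,b,c) = (c + a*b)^2
# + lam * (c^2 - 1)^2 has minima exactly at c = -a*b for a, b in {-1, +1},
# so relaxing c by gradient descent computes XOR as attractor dynamics.
import numpy as np

def relax(a, b, steps=2000, eta=0.01, lam=0.5):
    c = 0.1                                    # output unit, weakly initialized
    for _ in range(steps):
        grad = 2 * (c + a * b) + 4 * lam * c * (c ** 2 - 1)
        c -= eta * grad                        # flow toward the nearest attractor
    return c

for a, b in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print((a, b), round(relax(a, b)))          # equal inputs -> -1, unequal -> +1
```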
Inverse base rates Relative frequencies (base rates) of categories are used for classification: if, on a list of diseases and symptoms, disease C associated with symptoms (PC, I) is 3 times more common than R, then symptoms PC => C and I => C (base rate effect). Predictions contrary to the base rates: inverse base rate effects (Medin, Edelson 1988): although PC + I + PR => C (60% of answers), PC + PR => R (60% of answers). Why such answers? Psychological explanations are not convincing. Are the effects due to the neurodynamics of learning? I am not aware of any dynamical models of such effects.
IBR neurocognitive explanation Psychological explanation: J. Kruschke, Base Rates in Category Learning (1996). PR is attended to because it is a distinctive symptom, although PC is more common. Basins of attractors - neurodynamics; PDFs in the P-space {C, R, I, PC, PR}. PR + PC activation leads more frequently to R because the basin of the attractor for R is deeper. Construct the neurodynamics, get the PDFs. Unfortunately these processes are in 5D. Prediction: weak effects due to the order and timing of presentation, (PC, PR) vs. (PR, PC), due to trapping of the mind state by different attractors.
[Two figures, not reproduced: "Learning" and "Probing", each contrasting the neurocognitive and the psychological points of view.]
Mental model dynamics Why is it so hard to draw conclusions from: • All academics are scientists. • No wise man is an academic. • What can we say about wise men and scientists? All A are S, ~W is A; relation S <=> W? What happens with the neural dynamics? The basin of A is larger than that of B, as B is a subtype of A and thus has to inherit most properties associated with A; the attractor for B has to lie within A. Thinking of B makes it hard to think of A, as the mind state is trapped in B's deeper basin. [Figure: basins of attractors for the 3 concepts involved - Academics, Scientists, Wise men; the basin for "Wise men" has an unknown relation to the other basins.]
Some connections Geometric/dynamical ideas related to mind may be found in many fields. Neuroscience: D. Marr (1970) "probabilistic landscape"; C.H. Anderson, D.C. van Essen (1994): superior colliculus PDF maps; S. Edelman: "neural spaces", object recognition - a global representation space approximates the Cartesian product of spaces that code object fragments; representation of similarities is sufficient. Psychology: K. Lewin, psychological forces; G. Kelly, Personal Construct Psychology; R. Shepard, universal invariant laws; P. Johnson-Laird, mental models. Folk psychology: to put in mind, to have in mind, to keep in mind (mindmap), to make up one's mind, be of one mind ... (space).
More connections AI: problem spaces - reasoning, problem solving, SOAR, ACT-R; little work on continuous mappings (MacLennan) instead of symbols. Engineering: system identification, internal models inferred from input/output observations - this may be done without any parametric assumptions if a number of identical neural modules are used! Philosophy: P. Gärdenfors, Conceptual Spaces; R.F. Port, T. van Gelder (eds.), Mind as Motion (MIT Press 1995). Linguistics: G. Fauconnier, Mental Spaces (Cambridge U.P. 1994); mental spaces and non-classical feature spaces; J. Elman, language as a dynamical system; J. Feldman, neural basis; stream of thoughts, a sentence as a trajectory in P-space. Psycholinguistics: T. Landauer, S. Dumais, Latent Semantic Analysis, Psych. Rev. (1997); semantics for a 60k-word corpus requires about 300 dimensions (a toy LSA sketch follows below).
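A toy LSA sketch (numpy SVD on a 4-word, 4-document count matrix; the corpus and the choice of k are of course illustrative):

```python
# Latent Semantic Analysis in miniature: SVD of a word-by-document count
# matrix; keeping the top k singular vectors gives the low-dimensional
# semantic space (~300 dims for the 60k-word corpus in Landauer & Dumais).
import numpy as np

words = ["brain", "neuron", "logic", "proof"]
X = np.array([[2, 3, 0, 0],        # rows: words; columns: 4 toy documents
              [3, 2, 0, 1],
              [0, 1, 3, 2],
              [0, 0, 2, 3]], float)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
W = U[:, :k] * S[:k]               # word vectors in the k-D semantic space

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cos(W[0], W[1]), cos(W[0], W[3]))   # brain~neuron >> brain~proof
```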
Conclusions Understanding of reasoning requires a model of brain processes => mind => logic and reasoning. Psychological interpretations and models are illusory! They provide a wrong conceptualization of real processes. Simulations of the brain may lead to mind functions, but without conceptual understanding. Complex neurodynamics => dynamics in P-spaces. Low-dimensional representations of mental events are needed. Is this a good bridge between mind and brain? Mind models, psychology, logic ... do not touch the truth. • Open questions/problems/hopes: • P-spaces may be high-dimensional, hard to visualize. • How to describe our inner experience? (Hurlburt & Schwitzgebel 2007) • At the end of the road it is likely that: • the mind-body problem will dissolve; • a physics-like theory of events in mental spaces will be possible, including all higher cognitive functions.
Thank you for lending your ears ... Google: W. Duch => Papers/presentations/projects