Neural Networks for High-Level Intelligence and Cognition

Włodzisław Duch (Google: Duch), Department of Informatics, Nicolaus Copernicus University, Toruń, Poland. IJCNN’2007, Orlando, Florida, August 14, 2007

Promise • Mind as a shadow of neurodynamics: geometrical model of mind processes, psychological spaces providing inner perspective as an approximation to neurodynamics. • Intuition: learning from partial observations, solving problems without explicit reasoning (and combinatorial complexity) in an intuitive way. • Neurocognitive linguistics: how to find neural pathways in the brain. • Creativity & word games.

Motivation & possibilities To reach human-level intelligence we need to go beyond pattern recognition, memory and control. How to reach this level? • Top down style, inventing principles: laminar & complementary computing (Grossberg), chaotic attractors (Freeman & Kozma), AMD (John Weng), confabulation (Hecht-Nielsen), dynamic logic and mental fields (Perlovsky), mnemonic equations (Caianello) ... • Bottom up style, systematic approximations, scaling up: neuromorphic systems, CCN (Izhikevich), Ccortex, O’Reilly ... • Designs for artificial brains based on cognitive/affective architectures • Integration of perception, affect and cognition, large-scale semantic memory models, implementing control/attention mechanisms.

Exponential growth of power From R. Kurzweil, The Law of Accelerating Returns By 2020 PC computers will match the raw speed of brain operations! Singularity is coming?

Attractor neural networks and concept formation in psychological spaces: mind from brain? Włodzisław Duch, Department of Informatics, Nicholas Copernicus University, Toruń, Poland. www.phys.uni.torun.pl/~duch Bioinspired Computational Models of Learning and Memory, Lejondal Castle, Sept. 2002

Mind the Gap Gap between neuroscience and psychology: cognitive science is at best an incoherent mixture of various branches. • Is a satisfactory understanding of the mind possible? • Roger Shepard, Toward a universal law of generalization for psychological science (Science, Sept. 1987) • “What is required is not more data or more refined data but a different conception of the problem”. • Mind is what the brain does, a potentially conscious subset of brain processes. • How to approximate the dynamics of the brain to get a satisfactory (geometric) picture of the mind?

P-spaces Psychological spaces: K. Lewin, The conceptual representation and the measurement of psychological forces (1938), cognitive dynamic movement in phenomenological space. George Kelly (1955), personal construct psychology (PCP), geometry of psychological spaces as alternative to logic. A complete theory of cognition, action, learning and intention. PCP network, society, journal, software …

Static Platonic model Newton introduced space-time, an arena for physical events. Mind events need psychological spaces. • Goal: integrate neural and behavioral information in one model, create a model of mental processes at an intermediate level between psychology and neuroscience. • Static version: short-term response properties of the brain, behavioral (sensorimotor) or memory-based (cognitive). • Applications: object recognition, psychophysics, category formation in low-D psychological spaces, case-based reasoning. • Approach: • simplify neural dynamics, find invariants (attractors), characterize them in psychological spaces; • use behavioral data, represent them in psychological space.

Platonic mind model. Feature detectors/effectors: topographic maps. Objects in long-term memory (parietal, temporal, frontal): local P-spaces. Mind space (working memory, prefrontal, parietal): construction of mind space features/objects using attention mechanisms. Feelings: gradients in the global space.

More neurodynamics Amit group, 1997-2001, simplified spiking neuron models of column activity during learning. Stage 1: single columns respond to some feature. Stage 2: several columns respond to different features. Stage 3: correlated activity of many columns appears. Formation of new attractors => formation of mind objects. PDF: p(activity of columns | presented features)

Category learning Large field, many models. Classical experiments: Shepard, Hovland and Jenkins (1961), replicated by Nosofsky et al. (1994). Problems of increasing complexity; results determined by logical rules. 3 binary-valued dimensions: shape (square/triangle), color (black/white), size (large/small). 4 objects in each of the two categories presented during learning. Type I - categorization using one dimension only. Type II - two dimensions are relevant (XOR problem). Types III, IV, and V - complexity intermediate between Types II and VI. All 3 dimensions relevant, "single dimension plus exception" type. Type VI - most complex, 3 dimensions relevant, logic = enumerate stimuli in each of the categories. Difficulty (number of errors made): Type I < II < III ~ IV ~ V < VI
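
The category structures described above can be written down explicitly. The sketch below is only an illustration: the 0/1 coding of shape, color and size, and the particular dimensions chosen as relevant, are assumptions and not the stimulus assignment used in the experiments. It encodes Type I (one dimension), Type II (XOR of two dimensions) and Type VI (parity of all three), then checks which dimensions actually matter for each.

```python
from itertools import product

# All 8 stimuli: three binary dimensions (shape, color, size) coded as 0/1.
# The coding itself is an assumption made for illustration.
STIMULI = list(product((0, 1), repeat=3))

def type_i(s):
    """Type I: the category depends on a single dimension."""
    return s[0]

def type_ii(s):
    """Type II: XOR of two dimensions, the third is irrelevant."""
    return s[0] ^ s[1]

def type_vi(s):
    """Type VI: parity of all three dimensions."""
    return s[0] ^ s[1] ^ s[2]

def relevant_dimensions(label):
    """Indices of dimensions whose flip changes the label of some stimulus."""
    relevant = []
    for d in range(3):
        for s in STIMULI:
            flipped = list(s)
            flipped[d] ^= 1
            if label(s) != label(tuple(flipped)):
                relevant.append(d)
                break
    return relevant

def category_sizes(label):
    """Number of stimuli in category 0 and in category 1."""
    ones = sum(label(s) for s in STIMULI)
    return (len(STIMULI) - ones, ones)
```

For these three functions `relevant_dimensions` returns `[0]`, `[0, 1]` and `[0, 1, 2]`, and every structure splits the 8 stimuli into two categories of 4 objects.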

Canonical dynamics What happens in the brain during category learning? Complex neurodynamics <=> simplest, canonical dynamics. For all logical functions one may write corresponding equations. For XOR (type II problems) equations are: Corresponding feature space for relevant dimensions A, B

Inverse base rates Relative frequencies (base rates) of categories are used for classification: if on a list of diseases and symptoms disease C, associated with (PC, I) symptoms, is 3 times more common than R, then symptoms PC => C, I => C (base rate effect). Predictions contrary to the base rates: inverse base rate effects (Medin, Edelson 1988). Although PC + I + PR => C (60% of answers), PC + PR => R (60% of answers). Why? Psychological explanations are not convincing. Effects due to the neurodynamics of learning? I am not aware of any dynamical models of such effects.

IBR explanation Psychological explanation: J. Kruschke, Base Rates in Category Learning (1996). PR is attended to because it is a distinct symptom, although PC is more common. Basins of attractors - neurodynamics; PDFs in P-space {C, R, I, PC, PR}. PR + PC activation leads more frequently to R because the basin of attractor for R is deeper. Construct neurodynamics, get PDFs. Unfortunately these processes are in 5D. Prediction: weak effects due to order and timing of presentation (PC, PR) and (PR, PC), due to trapping of the mind state by different attractors.

Learning Table columns: point of view, neurodynamics, psychology.

Probing Table columns: point of view, neurodynamics, psychology.

Intuition Intuition is also a concept difficult to grasp, but commonly believed to play an important role in business and other decision making; “knowing without being able to explain how we know”. Sinclair & Ashkanasy (2005): intuition is a “non-sequential information-processing mode, which comprises both cognitive and affective elements and results in direct knowing without any use of conscious reasoning”. First tests of intuition were introduced by Westcott (1961); now 3 tests are used: Rational-Experiential Inventory (REI), Myers-Briggs Type Indicator (MBTI) and Accumulated Clues Task (ACT). Different intuition measures are not correlated, showing problems in constructing a theoretical concept of intuition. Significant correlations were found between the REI intuition scale and some measures of creativity. Intuition may result from implicit learning of complex similarity-based evaluations that are difficult to express in a symbolic (logical) way. Intuition in chess has been studied in detail.

Intuitive thinking Question in qualitative physics: if R2 increases, R1 and Vt are constant, what will happen with current and V1, V2? Geometric representation of facts: + increasing, 0 constant, - decreasing. Ohm’s law V=I×R; Kirchhoff’s V=V1+V2. True: (I-,V-,R0), (I+,V+,R0); false: (I+,V-,R0). 5 laws: 3 Ohm’s & 2 Kirchhoff’s. All laws A=B+C, A=B×C, A^{-1}=B^{-1}+C^{-1} have identical geometric interpretation! 13 true, 14 false facts; simple P-space, complex neurodynamics.

Intuitive reasoning 5 laws are simultaneously fulfilled, all have the same representation: Question: If R2=+, R1=0 and V=0, what can be said about I, V1, V2? Find missing value giving F(V=0, R, I, V1, V2, R1=0, R2=+) > 0. Suppose that variable X = +, is it possible? Not if F(V=0, R, I, V1, V2, R1=0, R2=+) = 0, i.e. one law is not fulfilled. If nothing is known, 111 consistent combinations out of 2187 (5%) exist. Intuitive reasoning, no manipulation of symbols; heuristics: select variable giving unique answer. Soft constraints or semi-quantitative => small |FSM(X)| values.
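
The counting argument can be checked by brute force. The sketch below is an illustration, not the original model: it assumes the five laws are those of two resistors in series (V=I×R, V1=I×R1, V2=I×R2, V=V1+V2, R=R1+R2) and replaces the function F by a plain consistency test on the trends of seven variables, each coded as -1, 0 or +1.

```python
from itertools import product

SIGNS = (-1, 0, 1)  # decreasing, constant, increasing
NAMES = ("V", "R", "I", "V1", "V2", "R1", "R2")

# Assumed set of laws (series circuit); each triple reads A = B + C or A = B * C.
LAWS = (("V", "I", "R"), ("V1", "I", "R1"), ("V2", "I", "R2"),
        ("V", "V1", "V2"), ("R", "R1", "R2"))

def consistent(a, b, c):
    """Qualitative test of A = B + C (or A = B * C for positive quantities).

    If B and C change in opposite directions, any trend of A is possible;
    otherwise A must follow the common trend of B and C.
    """
    if b * c < 0:
        return True
    expected = b if b != 0 else c
    return a == expected

def all_laws_hold(state):
    """True if the trends in `state` violate none of the assumed laws."""
    return all(consistent(state[a], state[b], state[c]) for a, b, c in LAWS)

def solutions(known):
    """All assignments of trends that agree with `known` and satisfy every law."""
    free = [name for name in NAMES if name not in known]
    found = []
    for values in product(SIGNS, repeat=len(free)):
        state = {**known, **dict(zip(free, values))}
        if all_laws_hold(state):
            found.append(state)
    return found

true_facts = sum(consistent(a, b, c) for a, b, c in product(SIGNS, repeat=3))
answer = solutions({"R2": 1, "R1": 0, "V": 0})
```

Under these assumptions a single law admits 13 of the 27 possible facts, `len(solutions({}))` gives 111 of the 3^7 = 2187 combinations, and the question above has a single consistent answer: I decreases, V1 decreases, V2 increases (and R increases).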

Brains and understanding General idea: when the text is read and analyzed, activation of the semantic subnetwork spreads; new words automatically assume meanings that increase overall activation, or the consistency of interpretation. Many variants, all depend on the quality of the semantic network, some include explicit competition among network nodes. • How to approximate this process in computer models? • How to use it for medical text understanding, correlate information from texts and genomic research? • How to build a practical system? • How to improve the training of MDs by understanding the learning processes? • Work in CCHMF, with John Pestian and Pawel Matykiewicz.
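
A minimal sketch of the spreading-activation idea, under strong assumptions: the network, its weights, the `decay` factor and the number of steps are all hypothetical, and competition between nodes is reduced to picking the most active candidate meaning. Context words are activated, activation flows along weighted links, and the ambiguous word takes the sense that ends up most active.

```python
# Hypothetical toy network: node -> {neighbor: link weight}.
TOY_NETWORK = {
    "fever": {"cold/illness": 1.0},
    "cough": {"cold/illness": 1.0},
    "winter": {"cold/temperature": 1.0},
    "cold/illness": {"fever": 1.0, "cough": 1.0},
    "cold/temperature": {"winter": 1.0},
}

def spread(graph, seeds, steps=2, decay=0.5):
    """Propagate activation from the seed nodes along weighted links."""
    activation = {node: 0.0 for node in graph}
    for node in seeds:
        activation[node] = 1.0
    for _ in range(steps):
        incoming = {node: 0.0 for node in graph}
        for node, level in activation.items():
            for neighbor, weight in graph[node].items():
                incoming[neighbor] += decay * weight * level
        for node in graph:
            activation[node] += incoming[node]
    return activation

def best_sense(graph, context, senses):
    """Choose the candidate sense that is most active given the context words."""
    activation = spread(graph, context)
    return max(senses, key=lambda sense: activation[sense])
```

With this toy network, the context `["fever", "cough"]` selects `"cold/illness"` and the context `["winter"]` selects `"cold/temperature"`.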

Insights and brains Activity of the brain while solving problems that required insight and that could be solved in schematic, sequential way has been investigated. E.M. Bowden, M. Jung-Beeman, J. Fleck, J. Kounios, “New approaches to demystifying insight”, Trends in Cognitive Sciences, 2005. After solving a problem presented in a verbal way subjects indicated themselves whether they had an insight or not. An increased activity of the right hemisphere anterior superior temporal gyrus (RH-aSTG) was observed during initial solving efforts and insights. About 300 ms before insight a burst of gamma activity was observed, interpreted by the authors as “making connections across distantly related information during comprehension ... that allow them to see connections that previously eluded them”.

Insight interpreted What really happens? My interpretation: • LH-STG represents concepts, S=Start, F=final • understanding, solving = transition, step by step, from S to F • if no connection (transition) is found this leads to an impasse; • RH-STG ‘sees’ LH activity on meta-level, clustering concepts into abstract categories (cosets, or constrained sets); • connection between S and F is found in RH, leading to a feeling of vague understanding; • gamma burst increases the activity of LH representations for S, F and intermediate configurations; • stepwise transition between S and F is found; • finding the solution is rewarded by emotions during the Aha! experience; they are necessary to increase plasticity and create permanent links.

Memory & creativity Creative brains accept more incoming stimuli from the surrounding environment (Carson 2003), with low levels of latent inhibition responsible for filtering stimuli that were irrelevant in the past. “Zen mind, beginner’s mind” (S. Suzuki) – learn to avoid habituation! Creative mind maintains complex representation of objects and situations. Pair-wise word association technique may be used to probe if a connection between different configurations representing concepts in the brain exists. A. Gruszka, E. Nęcka, Creativity Research Journal, 2002. Presentation sequence: Word 1, priming word (0.2 s), Word 2. Words may be close (easy) or distant (difficult) to connect; priming words may be helpful or neutral; helpful words are related semantically or phonologically (hogse for horse); neutral words may be nonsensical or just not related to the presented pair. Results for groups of people of low/high creativity are surprising …

Creativity & associations Hypothesis: creativity depends on the associative memory, ability to connect distant concepts together. Results: creativity is correlated with greater ability to associate words & susceptibility to priming, distal associations show longer latencies before decision is made. • Neutral priming is strange! • for close words and nonsensical priming words creative people do worse than less creative; in all other cases they do better. • for distant words priming always increases the ability to find association, the effect is strongest for creative people. Latency times follow this strange pattern. Conclusions of the authors: More synaptic connections => better associations => higher creativity. Results for neutral priming are puzzling.

Words in the brain The cell assembly model of language has strong experimental support; F. Pulvermüller (2003) The Neuroscience of Language. On Brain Circuits of Words and Serial Order. Cambridge University Press. Acoustic signal => phonemes => words => semantic concepts. Semantic activations are seen 90 ms after phonological in N200 ERPs. Perception/action networks, results from ERP & fMRI. Phonological density of words = # words that sound similar to a given word, that is create similar activations in phonological areas. Semantic density of words = # words that have similar meaning, or similar extended activation network.

Words: simple model Goals: • make the simplest testable model of creativity; • create interesting novel words that capture some features of products; • understand new words that cannot be found in the dictionary. Model inspired by the putative brain processes when new words are being invented. Start from keywords priming auditory cortex. Phonemes (allophones) are resonances, ordered activation of phonemes will activate both known words as well as their combinations; context + inhibition in the winner-takes-most leaves one or a few words. Creativity = imagination (fluctuations) + filtering (competition) Imagination: many chains of phonemes activate in parallel both words and non-words reps, depending on the strength of synaptic connections. Filtering: associations, emotions, phonological/semantic density.
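
A toy version of "imagination + filtering" can be sketched with strings. This is only an illustration under stated assumptions: letters stand in for phonemes, "imagination" is reduced to joining a prefix of one keyword with a suffix of another where they share at least `min_overlap` letters, and "filtering" is reduced to rejecting known words and candidates of unsuitable length. The function names and thresholds are hypothetical.

```python
def blends(first, second, min_overlap=2):
    """Join a prefix of `first` with a suffix of `second` at a shared letter chain."""
    found = set()
    for end in range(min_overlap, len(first) + 1):
        for k in range(min_overlap, end + 1):
            junction = first[end - k:end]          # last k letters of the prefix
            start = second.find(junction)
            while start != -1:                     # every place the chain occurs
                found.add(first[:end] + second[start + k:])
                start = second.find(junction, start + 1)
    return found

def invent(keywords, known_words, min_len=5, max_len=12):
    """Imagination: blend all ordered keyword pairs. Filtering: drop known words
    and candidates that are too short or too long."""
    candidates = set()
    for a in keywords:
        for b in keywords:
            if a != b:
                candidates |= blends(a, b)
    return sorted(word for word in candidates
                  if word not in known_words and min_len <= len(word) <= max_len)
```

For example, `blends("time", "imagination")` contains `"timagination"`, because the prefix "tim" and the word "imagination" share the chain "im". A more faithful filter would score candidates by associations and by phonological or semantic density, which this sketch does not attempt.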

Creating new words A real letter from a friend: I am looking for a word that would capture the following qualities: portal to new worlds of imagination and creativity, a place where visitors embark on a journey discovering their inner selves, awakening the Peter Pan within. A place where we can travel through time and space (from the origin to the future and back), so, its about time, about space, infinite possibilities. FAST!!! I need it sooooooooooooooooooooooon.
creativital, creatival (creativity, portal), used in creatival.com
creativery (creativity, discovery), creativery.com (strategy+creativity)
discoverity = {disc, disco, discover, verity} (discovery, creativity, verity)
digventure = {dig, digital, venture, adventure} still new!
imativity (imagination, creativity); infinitime (infinitive, time)
infinition (infinitive, imagination), already a company name
journativity (journey, creativity)
learnativity (taken, see http://www.learnativity.com)
portravel (portal, travel); sportal (space, sport, portal), taken
timagination (time, imagination); timativity (time, creativity)
tivery (time, discovery); trime (travel, time)

Word games Word games were popular before computer games. They are essential to the development of analytical thinking. Until recently computers could not play such games. The 20 questions game may be the next great challenge for AI, because it is more realistic than the unrestricted Turing test; a World Championship with human and software players (in Singapore)? Finding the most informative questions requires knowledge and creativity. Performance of various models of semantic memory and episodic memory may be tested in this game in a realistic, difficult application. Asking questions to understand precisely what the user has in mind is critical for search engines and many other applications. Creating large-scale semantic memory is a great challenge: ontologies, dictionaries (WordNet), encyclopedias, MindNet (Microsoft), collaborative projects like ConceptNet (MIT) …
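
One common way to make "most informative question" concrete is to choose the yes/no feature that leaves the least expected uncertainty about the remaining candidates. The sketch below assumes a hypothetical toy memory (concept => set of features), equally likely candidates, a non-empty candidate list and noise-free answers; it is not the selection rule of any particular system.

```python
from math import log2

# Hypothetical toy semantic memory: concept -> set of features.
TOY_MEMORY = {
    "salamander": {"animal", "amphibian", "has spots"},
    "frog": {"animal", "amphibian"},
    "horse": {"animal", "mammal"},
    "cat": {"animal", "mammal"},
}

def entropy(n):
    """Entropy in bits of a uniform choice among n candidates."""
    return log2(n) if n > 0 else 0.0

def expected_remaining_entropy(candidates, memory, feature):
    """Average uncertainty left after asking whether the concept has `feature`."""
    total = len(candidates)
    yes = sum(1 for c in candidates if feature in memory[c])
    no = total - yes
    return (yes / total) * entropy(yes) + (no / total) * entropy(no)

def best_question(candidates, memory):
    """Feature whose answer is expected to narrow the candidates the most."""
    features = sorted({f for c in candidates for f in memory[c]})
    return min(features,
               key=lambda f: expected_remaining_entropy(candidates, memory, f))
```

In this toy memory asking about "animal" is useless (all four candidates remain, 2 bits of uncertainty), while "amphibian" or "mammal" halves the candidates and leaves 1 bit.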

Semantic memory architecture (diagram labels): Query; Semantic memory; Applications, e.g. 20 questions game; Humanized interface; Store; Part of speech tagger & phrase extractor; verification; On line dictionaries; Parser; Manual.

Puzzle generator Semantic memory may be used to automatically invent a large number of word puzzles that the avatar presents. This application selects a random concept from all concepts in the memory and searches for a minimal set of features necessary to uniquely define it; if many subsets are sufficient for unique definition one of them is selected randomly. It is an Amphibian, it is orange and has black spots. What do you call this animal? A Salamander. It has charm, it has spin, and it has charge. What is it? If you do not know, ask Google! Quark page comes at the top …
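
The selection procedure can be sketched as an exhaustive search over feature subsets of increasing size. The toy memory below and the function names are hypothetical, and the brute-force search is an assumption made for brevity; a large-scale memory would need a smarter search.

```python
import random
from itertools import combinations

# Hypothetical toy semantic memory: concept -> set of features.
TOY_MEMORY = {
    "salamander": {"amphibian", "orange", "black spots"},
    "frog": {"amphibian", "green"},
    "ladybird": {"insect", "orange", "black spots"},
    "quark": {"charm", "spin", "charge"},
    "electron": {"spin", "charge", "lepton"},
}

def minimal_definitions(concept, memory):
    """All smallest feature subsets that no other concept in memory satisfies."""
    features = sorted(memory[concept])
    others = [feats for name, feats in memory.items() if name != concept]
    for size in range(1, len(features) + 1):
        unique = [set(subset) for subset in combinations(features, size)
                  if not any(set(subset) <= other for other in others)]
        if unique:
            return unique
    return []  # the concept cannot be told apart from some other concept

def make_puzzle(memory, rng=random):
    """Pick a random concept and one of its minimal definitions as clues."""
    concept = rng.choice(sorted(memory))
    options = minimal_definitions(concept, memory)
    if not options:
        return None
    return concept, sorted(rng.choice(options))
```

In this toy memory "salamander" needs two clues ("amphibian" together with either "orange" or "black spots"), while "quark" is already identified by "charm" alone, so a minimal puzzle would drop the spin and charge clues.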

Few conclusions Neurocognitive informatics: inspirations beyond perceptron ... Sydney Lamb, Rice Uni, wrote a general book (1999) on the neural basis of language. How to create practical large-scale algorithms? Various approximations to knowledge representation in brain networks are studied: the use of a priori knowledge based on reference vectors, formation of graphs of consistent concepts in spreading activation networks, ontology & semantic-based enhancements + specific relations. Clustering/categorization quality has been used to discover which semantic types are useful (selecting categories of features), to expand and reduce the concept space, and to discover useful “pathways of the brain”. Can one identify specific clinotypes in summary discharges? Can they be used to improve training of young MDs? Sessions on Medical Text Analysis and billing annotation challenge, April 1-5, 2007, IEEE CIDM, Honolulu, showed that human level competence in some text analysis tasks can be reached!

Thank you for lending your ears ... Google: W. Duch => Papers. Duch W, Intuition, Insight, Imagination and Creativity, IEEE Computational Intelligence Magazine 2(3), August 2007, pp. 40-52
