Cognitive informatics: HITs, DREAMs & Perfect Babies. Włodzisław Duch Department of Informatics, Nicolaus Copernicus University Department of Computer Science, School of Computer Engineering, Nanyang Technological University, Singapore Google: Duch
Cognitive informatics: HITs, DREAMs & Perfect Babies Włodzisław Duch Department of Informatics, Nicolaus Copernicus University Department of Computer Science, School of Computer Engineering, Nanyang Technological University, Singapore Google: Duch A*STAR Cognitive Science Symposium, Sept. 2005
Plan: Cognitive Science at the foundations of science. Cognitive architectures. Engineering vision. Humanized Interfaces (HITs). DREAMs. Perfect Babies. In the year 2015 ... Bye bye
Toruń Nicolaus Copernicus: born in 1473 in Toruń
Are we any wiser now? Belief in miracle cures: is it only a placebo effect? Is homeopathy effective? No, but it is profitable … Is Traditional Chinese Medicine effective? Don’t know. Be open, but skeptical! Examine your cognition! Cognitive science: foundation for understanding. How do we know anything? Franz-Joseph Gall (1758-1828) discovered 26 "organs" on the surface of the brain which affect the contour of the skull, including a "murder organ" present in murderers. Phrenology became a widely accepted theory, and cranioscopy was even more popular than psychological testing is today; psychograph machines were sold until 1937. Thousands of observations supported phrenology!
Can we trust our senses? Our minds? Harry McGurk and John MacDonald, “Hearing lips and seeing voices”, Nature 264, 746-748 (1976) We all know visual illusions, but know much less about cognitive illusions. N. Nicholls, Bulletin of the American Meteorological Society (1999). “These illusions, and ways to avoid them impacting on decision making, have been studied in the fields of law, medicine, and business. The relevance of some of these illusions to climate prediction is discussed here. The optimal use of climate predictions requires providers of forecasts to understand these difficulties and to make adjustments for them in the way forecasts are prepared and disseminated.”
Cognitive math and physics G. Lakoff & R. E. Núñez, “Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being”, Basic Books 2000. Check Wikipedia article on Cognitive Mathematics. How can a number express a concept? How can mathematical formulas and equations express general ideas that occur outside of mathematics, ideas like recurrence, change, proportions, self-regulating processes, and so on? How can "abstract" mathematics be understood? What cognitive mechanisms are used in mathematical understanding? Can we understand the meaning of Euler’s formula $e^{i\pi} + 1 = 0$? There is no agreement on the meaning of quantum mechanics. Can a cognitive approach solve the problems in understanding physics?
Cognitive everything Has memory always worked as it does now? Ancient texts suggest significant differences. Dragon sightings in the 11th century, UFOs in the 20th. Important for understanding history, ancient beliefs and religions, but also witness reliability, and even one’s own trustworthiness (false memory syndromes). M. Persinger, "Neuropsychological Bases of God Beliefs" (1987). ShaktiLight, a transcranial magnetic stimulation device, induces epiphanies, out-of-body experiences etc. in many people. Blanke O, Landis T, Spinelli L, Seeck M, Out-of-body experience and autoscopy of neurological origin. Brain 127: 243-258, 2004. Fairly common experiences that few people talk about. Extensive brain imaging studies show disintegration of personal/extrapersonal space due to dysfunction of the temporoparietal junction.
A few more CS applications Animal sociobiology is a result of specific brain organization forming a basis of moral behavior. Understanding origins of morality and ethics led to a conference on neuroethics at Stanford University in 2002. Empathy, the basis of compassion, is closely related to our ability to imitate and understand other people using mirror neurons. Neuroaesthetics tries to understand what seems beautiful to our brains and why, why art and music are common in all human societies, and what we can learn about the brain from artists’ experiments. S. Zeki, Inner Vision: An Exploration of Art and the Brain (1999). Neuroaesthetics Institute (UCL), UC Berkeley Department, Harvard’s Institute for Music and Brain Science … journals, conferences, courses.
Cognitive revolution In many fields of science and art we are going through a Cognitive Revolution! How about business and engineering?
CS & Business Neuromarketing: brain imaging tells us more than we know ourselves about our minds … Recent fMRI evidence for the effectiveness of branding: people like Pepsi but choose Coca-Cola. Understanding human decision-making is of primary importance for the business and financial world. Gerald Zaltman, How Customers Think: Essential Insights into the Mind of the Market (Harvard Business School Press), 2003. A cognitive approach to understanding how, and why, customers buy: a great introduction to CS for business people.
Cognitive Informatics Cognitive = the mental process by which knowledge is acquired. Informatics = The use of science, computer science, information and other technologies to provide data, information and knowledge to the individual and the organization. School of Informatics, University of Edinburgh: "The study of the structure, behavior, and interactions of natural and engineered computational systems", representation, processing, and communication of information, including cognitive and social aspects. Central notion: transformation of information, by computation or communication, in organisms or in computing artifacts. Cognitive Informatics: AI, Cog Sci, computing … focusing on the way humans understand and solve problems.
Human-Computer Interaction Human factors: how to build a Human-Computer Interface that would be easy to use? How to test for information overload? EPIC (Executive Process Interaction Control), a human multitask simulator by D.E. Kieras (University of Michigan, 1997). Although quite useful for explaining skill acquisition, it does not capture human executive processes correctly.
Cognitive robotics & complex devices In fact all complex devices need artificial minds to communicate with us effectively. Smart phones will soon have hundreds of functions, but the complexity of their use should be hidden from us. Human-Computer Interaction becomes a central engineering problem. Robots need artificial minds, cognitive and affective control.
Humanoid robotics Robots need artificial minds, cognitive and affective control. Toys – the AIBO family is quite advanced: over 100 words, face/voice recognition, 6 weeks to raise, self-charging. Most advanced humanoid robots: Sony Qrio (standing up, dancing, running, directing an orchestra …), Honda P3, Honda Asimo, Mitsubishi Heavy Industries Wakamaru, the first commercially sold household robot (Sept 2005)! Qrio: predicts its next movement in real time, shifts center of gravity in anticipation, very complex motor control, but few cognitive functions. Wakamaru: recognizes faces, orients itself towards people and greets them, recognizes 10,000 words but does not understand much. Artificial minds in robots and complex devices are still a dream …
Neuro-Robotics: using robots to investigate the brain EU IST-FET PALOMA Project – developmental approach. Validating a step-wise learning theory for grasping and manipulation. • Retina-like vision system: 2 cameras • Anthropomorphic neck & head: 7 d.o.f., 7 proprioceptive sensors • Biomechatronic hand: 10 d.o.f., 16 proprioceptive sensors, 135 tactile sensors • Anthropomorphic robot arm: 8 d.o.f., 16 (8+8) proprioceptive sensors
Artificial Minds AMs need: some perception, inner representation of the world, language abilities, situated cognition, behavioral control. Cognitive architecture: the highest-level controller, responsible for executive functions (corresponding to frontal lobes in the brain). Functions: recognition (patterns, situations, events) and categorization, different types of memory, use of associative recall, decision making, conflict resolution, improve with experience (learn), select information (pay attention to relevant inputs), anticipate, predict and monitor events, plan and solve problems, reason and maintain beliefs, search for additional knowledge and communicate with specialized agents that may find it ... Only a few such large-scale architectures exist. AM: software and robotic agents that humans can talk to & relate to in a similar way as they relate to other humans.
Cognitive architecture: SOAR Symbolic, rule-based architectures based on a theory of cognition. A. Newell, Unified Theories of Cognition (Harvard University Press, 1990). SOAR: architecture based on universal theory of cognition, more than 25 years of development, community of over 100 researchers. Behavior = Architecture x Content • NL-Soar, a theory of human natural language (NL) comprehension and generation, explained many results from psycholinguistics. • SCA, a theory of symbolic concept learning. • NOVA, a theory of visual attention matching human data. • NTD-Soar, a computational theory of the perceptual, cognitive, and motor actions performed by the NASA Test Director (NTD). • Instructo-Soar, a computational theory of how people learn through interactive instruction … and many others.
Steve = Soar Training Expert for Virtual Environments. STEVE, an autonomous educational agent based on SOAR, helping students to learn how to operate and maintain complex equipment. Virtual 3D environment – graphics design relatively easy. Hard: understanding student’s mind; use of natural language, understanding questions; monitoring performance of students in virtual space. Used in Intelligent Forces project. New type of intelligent tutor but not easy to create … SOAR-EPIC combinations
Cognitive architecture: ACT-R ACT-R, a cognitive architecture by John Anderson (CMU), has over 50 active groups in the world, including NASA, Naval Research, Sandia Labs. ACT-R is a framework for creating models that incorporate ACT-R's view of cognition and add new assumptions about the particular task. These assumptions can be tested by comparing the results of the model with experiments, for example time, accuracy or brain imaging data. The Atomic Components of Thought, by J.R. Anderson & C. Lebiere (LEA 1998). ACT-R has been used successfully to create models in domains such as: learning and memory, problem solving and decision making, language and communication, perception and attention, cognitive development etc. Applications: in human-computer interaction, to produce user models that can assess different computer interfaces; in education (cognitive tutoring systems), to "guess" the difficulties that students may have and provide focused help; computer-generated cognitive agents that inhabit training environments; in neuropsychology, to interpret fMRI data.
Brain-inspired architectures G. Edelman (Neurosciences Institute) & collaborators, created a series of Darwin automata, brain-based devices, “physical devices whose behavior is controlled by a simulated nervous system”. • The device must engage in a behavioral task. • The device’s behavior must be controlled by a simulated nervous system having a design that reflects the brain’s architecture and dynamics. • The device’s behavior is modified by a reward or value system that signals the salience of environmental cues to its nervous system. • The device must be situated in the real world. Darwin VII consists of: a mobile base equipped with a CCD camera and IR sensor for vision, microphones for hearing, conductivity sensors for taste, and effectors for movement of its base, of its head, and of a gripping manipulator having one degree-of-freedom; 53K mean firing +phase neurons, 1.7 M synapses, 28 brain areas.
Engineering vision 1. What would be the biggest engineering achievement? To see results of your research used by almost everyone on Earth. 2. What is it that everyone is using? Portable phones and remote controls for TVs and other devices, getting smarter but also more complex and difficult to use every year. 3. What is the stumbling block in developing the new generation of even more useful smart phones that can do all we ask for? Their complexity, the difficulty of using all their functions fully, the need for tedious programming: in short, human-machine communication. 4. What is the solution? Humanized InTerfaces (HITs) that would communicate with us in a natural way, ask a minimum number of questions if the commands are ambiguous and do what we ask for: control household devices, help us to remember, communicate, access information and services, advise, educate, play word games ...
Cognitive informatics again • Creation of such HITs with Artificial Minds interfaces is the greatest challenge to computer engineering. • All other layers of software have become more or less standardized, from BIOS hardware level to the graphics API for Windows user interfaces. • Creation of an extensible platform for natural perception, language processing and behavioral modeling is the single most important subject left in computer engineering. • This requires a concentrated effort of many people in an area that should be best called “cognitive informatics”: understanding how humans perceive, create their inner world, communicate and act, and creating artifacts that behave as “artificial minds”, that understand and interact with us in a similar way as people do. • Cognitive architectures created so far are a good beginning, but are not sufficiently flexible to model many tasks.
Humanized InTerfaces (HIT) C2I = Center for Computational Intelligence, SCE NTU Flagship project, principal investigators: Wlodzislaw Duch & Michel Pasquier + 15 other staff members so far …
HIT: definition and goals HIT is a computer/phone interface that can interact in a natural way with the user and accept natural input in the form of: • speech and sound commands; text commands; • visual input, reading text (OCR), recognizing gestures, lip movement. HIT should have a robust understanding of user intentions for selected applications. HIT should respond and behave in a natural way. It may take the form of a simulated talking head that the user can relate to, an android head, or a robotic pet. Major goals of the HIT project: • develop modular extensible software/hardware platform for HITs; • create interactive word games, information retrieval and other applications on PCs; • extend HIT functionality adding new interactivity & behavior; • move it to portable devices (PDAs/phones) & broadband services.
HIT: motivation • HIT software/hardware/services may find their way to a billion portable devices/phones in a few years' time. The value of telephone ringtones alone in 2003 was S$5 bln. New telephone functions include: camera, speech recognition, on-line translation, interactive games and educational software. • Complexity of devices: only a small fraction of the functions of electronic devices, such as PDAs, phones, cameras, or new TVs, is used; new humanized interfaces that will help users are needed. • Many applications in education, entertainment, services; talking heads connected to knowledge bases are already used in E-commerce. • Creating HITs is a great computer engineering challenge: like building a rocket, it requires integration of many technologies and research themes, and moves research to a higher level. 17 SCE staff members expressed their interest and formulated HIT subprojects. • A test-bed is urgently needed to experiment with such technologies.
HIT: state of the art HIT may draw from results of many large frontier programs, such as: Microsoft Research, offering free speech recognition/synthesis tools and publishing work on Attentional User Interface (AUI) project. DARPA’s Cognitive Information Processing Technology (call 6/2003). European Union’s Cognition Unit (started 10/2004) programs that have a goal to create artificial cognitive systems with human-like qualities. Intel has projects in natural interfaces, providing free libraries for speech, vision, machine learning methods and anticipatory computing. Talking heads already answer questions on Web pages for car, telecom, banks, pharmaceutical & other companies. Animated personal assistants work as memory enhancements and information sources, news, weather, show times, reviews, sports access... Services answering questions in natural languages are coming: AskJeeves and 82ask give answers (human) to any question! But ... HITs are not yet robust, are still very primitive in all respects, with limited interaction with the user, poor learning abilities, no anticipation ...
HIT related areas [diagram labels]: Learning; Affective computing; T-T-S synthesis; Brain models; Behavioral models; Speech recognition; HIT projects; Cognitive Architectures; Talking heads; AI; Robotics; Cognitive science; Graphics; Lingu-bots; A-Minds; VR avatars; Knowledge modeling; Info-retrieval; Working Memory; Episodic Memory; Semantic memory
HIT: proposed approach [diagram labels: Web/text/databases interface; Text to speech; NLP functions; Natural input modules; Graphical talking head; Behavior control; Cognitive functions; Control of devices; Affective functions; Specialized agents] Proposed platform: core functions: limited speech, graphics, and natural language processing (NLP) + extended functions: perceptual, cognitive, affective, specialized agents, behavioral. Challenge and opportunity is to build a modular platform for HIT on a PC, with 3D graphical head, robust speech recognition, memory, reasoning + cognitive abilities, and move it to new phones/broadband services. Uniqueness: nothing like that exists, requires a large-scale effort, integration and extension of many existing projects; collaboration with telecom and software industry, great student training.
HIT sub-areas Core functions (5 experts) include HIT platform design/integration, robust limited vocabulary speech recognition, basic vision and learning; object and face tracking, synthesis of auditory and visual sensory signals, basic responses, control and talking head graphics. Extended functions (13 experts) include synchronization of lip motion, analyzing prosody in speech, control of facial expressions, emotion analysis from face video, cognitive and affective functions, natural language analysis and context identification, cognitive learning, reasoning and theories of mind (user model), learning rules for behavior, integration with robot control, visual understanding, and hardware architecture design for selected applications. Applications (5 experts) will initially include creation of specialized agents, implementing trivia/quiz/educational and entertainment games, word games, 20 questions game, medical information presentation, office/household automation, multi-lingual message (SMS, chat) understanding, an intelligent tutor to promote learning using appropriate gestures & more.
HIT benchmarks First year: we should be able to demonstrate basic HIT platform with: • 3D graphic head with control over rich facial movements coupled with • visual object and face tracking coordinated with basic responses; • robust limited vocabulary speech recognition for playing word games; • speech synthesis and copycatting; • following simple spoken commands; • interface with WWW search agents finding and presenting information; • full specifications for adding and interacting with HIT functions. Second year: initial applications and extensions of the HIT platform, work on the natural language processing and cognitive modeling: • demonstrate HIT playing the 20 questions game, selecting most informative questions that help to define the subject uniquely; • playing educational word games implementing trivia and quizzes; • medical information presentation system answering simple questions; • use of HIT in the NTU virtual campus; • organize an internal competition for new applications of HIT; • preliminary designs for moving the HIT platform to portable devices.
HIT extensions & management Third year and later: Improve existing platform and applications, move them to portable devices, add new functions needed for specific applications. For example, auditory and visual scene analysis, fusion of sensory signals, testing of perceptual modules inspired by neuroscience, applications of HIT for programming household devices, tutor programs to teach problem solving for some local courses. A competition for adding such new functions and applications to the HIT platform will be made at NTU; this requires creation of test data, scenarios of use, providing specification for the interface, selection/sponsoring of the best proposed approach, and integration in the HIT. Parallel project submitted to A*STAR: Developmental Robot-Embedded Artificial Mind (DREAM), aimed at controlling real android head, focused on comparison of general cognitive architectures; includes collaboration with 3 leading groups in cognitive modeling (Berkeley, Michigan and Memphis University), plus KAIST (Korea) on natural perception.
Word games Word games were popular before computer games took over. Word games are essential to the development of analytical thinking skills. Until recently computer technology was not sufficient to play such games. The 20 questions game may be the next great challenge for AI, much easier for computers than the Turing test; a World Championship with human and software players? Finding the most informative questions requires understanding of the world. Performance of various models of semantic memory and episodic memory may be tested in this game in a realistic, difficult application. Asking questions to understand precisely what the user has in mind is critical for search engines and many other applications. Creating large-scale semantic memory is a great challenge: ontologies, dictionaries (WordNet), encyclopedias, collaborative projects (ConceptNet) …
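One way to read "finding the most informative questions" is as a greedy information-gain search: ask about the feature whose yes/no answer splits the remaining candidates most evenly. The sketch below is only an illustration under that assumption; the slides do not specify the selection method, and the toy knowledge base `CONCEPTS` and the helper names are hypothetical.

```python
import math

# Hypothetical toy knowledge base: concept -> features known to be true.
CONCEPTS = {
    "salamander": {"animal", "amphibian", "has legs"},
    "frog": {"animal", "amphibian", "has legs", "jumps"},
    "snake": {"animal", "reptile"},
    "eagle": {"animal", "bird", "has legs", "flies"},
}

def answer_entropy(feature, candidates, concepts):
    """Entropy (in bits) of the yes/no answer, assuming equally likely candidates."""
    yes = sum(1 for c in candidates if feature in concepts[c])
    p = yes / len(candidates)
    if p in (0.0, 1.0):
        return 0.0  # the answer is already known, so the question is useless
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def best_question(candidates, concepts):
    """Pick the feature whose answer is expected to be most informative."""
    features = sorted(set().union(*(concepts[c] for c in candidates)))
    return max(features, key=lambda f: answer_entropy(f, candidates, concepts))

def update(candidates, feature, answer_is_yes, concepts):
    """Keep only the candidates consistent with the user's answer."""
    return [c for c in candidates if (feature in concepts[c]) == answer_is_yes]
```

Starting from all four concepts, `best_question` asks about "amphibian", since it splits the candidates two against two; asking about "animal" would tell the program nothing. A realistic semantic memory would also need graded or uncertain feature values, which this sketch ignores.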
[Diagram: semantic memory architecture. Labels: Query; Semantic memory; Applications, e.g. 20 questions game; Humanized interface; Store; Part-of-speech tagger & phrase extractor; verification; On-line dictionaries; Parser; Manual]
Puzzle generator Semantic memory may invent a large number of word puzzles that the avatar presents. The application selects a random concept from all concepts in the memory and searches for a minimal set of features necessary to uniquely define it. If many subsets are sufficient for unique definition, one of them is selected randomly. It is an Amphibian, it is orange and has black spots. What do you call this animal? A Salamander. It has charm, it has spin, and it has charge. What is it? If you do not know, ask Google! Quark page comes at the top …
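The puzzle-generation procedure described on this slide can be sketched as a brute-force search over feature subsets of increasing size. This is a minimal illustration, not the project's implementation: the toy `MEMORY` table and the function names are hypothetical, and exhaustive search is only practical for concepts with a handful of features.

```python
import random
from itertools import combinations

# Hypothetical toy semantic memory: concept -> set of features.
MEMORY = {
    "salamander": {"amphibian", "orange", "black spots"},
    "frog": {"amphibian", "green"},
    "tiger": {"mammal", "orange", "black stripes"},
    "ladybug": {"insect", "red", "black spots"},
}

def minimal_defining_sets(concept, memory):
    """Return all smallest feature subsets that no other concept fully shares."""
    features = sorted(memory[concept])
    for size in range(1, len(features) + 1):
        hits = [
            set(subset)
            for subset in combinations(features, size)
            if all(not set(subset) <= other
                   for name, other in memory.items() if name != concept)
        ]
        if hits:
            return hits  # stop at the first (smallest) size that works
    return []  # the concept cannot be told apart from some other concept

def make_puzzle(memory, rng=random):
    """Pick a random concept and one of its minimal defining feature sets."""
    concept = rng.choice(sorted(memory))
    options = minimal_defining_sets(concept, memory)
    if not options:
        return None
    return concept, sorted(rng.choice(options))
```

In this toy memory no single feature identifies the salamander, but any two of its three features do, so the generator picks one such pair at random as the clue.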
Developmental Robot-Embedded Artificial Mind (DREAM) Ng Geok See, Wlodzislaw Duch, Michel Pasquier, Quek Hiok Chai, Abdul Wahab, Shi Daming, John Heng. Centre for Computational Intelligence, Nanyang Technological University
Objectives and Motivations • To develop a robotic head endowed with a complex cognitive processor that recognizes and interacts with humans using natural means of communication. • To facilitate integration of research areas in perception (signal processing, computer vision), real-time control, natural language processing and cognitive modeling. • To create new applications for educational games, office assistants, chatterbot interfaces, etc. • To provide training of graduate students in a new field of high potential importance for Singapore's economy. • This project is at the heart of cognitive robotics, a very important field not yet pursued in Singapore.
Related work Android receptionists: Inkha at King’s College London and Valerie, the Carnegie Mellon University receptionist. Although the functionality of these receptionists is very limited, these projects have been very popular, including a note in “Nature”.
DREAM architecture [diagram labels: Web/text/databases interface; Text to speech; NLP functions; Natural input modules; Talking head; Behavior control; Cognitive functions; Control of devices; Affective functions; Specialized agents] DREAM concentrates on the cognitive functions + real time control; we plan to adopt software from the HIT project for perception, NLP, and other functions.
Research Partners Main goal: integration of perception with cognitive architectures, tests of 4 architectures and creation of new architecture for android head control. • Brain Science Research Center of the Korea Advanced Institute of Science and Technology headed by Soo-Young Lee. Need his expertise for perception and brain-inspired computational models. • SOAR by John Laird, University of Michigan. His experience will be helpful in interfacing SOAR with natural perception and motor control of our android. • The “Conscious” Software Research Group headed by Stan Franklin, Institute of Intelligent Systems, University of Memphis; we will use his Intelligent Distribution Agent (IDA) for solving problems that require complex reasoning. • Shruti by Lokendra Shastri, International Computer Science Institute, Berkeley; we will use Shruti for understanding natural language commands and for formation of episodic memories as a result of interaction with users.
Some applications • DREAM will serve as an application platform for human-computer interactions, for example in projects where evolution of language and development of mind through natural interactions with people is investigated (at present text-based interfaces are used). • Interface to word games, such as trivia games, educational quizzes, games requiring reasoning, and the 20 questions game. • Natural interface to information resources, such as structured internet databases (MIT Start system), encyclopedias, or transportation. • Receptionist, science museums, office mates ... • An interface to control household devices. • Taking care of old people, reminding them of their schedule, talking to them to enhance their memories and collect their life stories (in collaboration with Alzheimer clinic in Bad Aibling, Bayern, Germany). • Creating an alter-ego, by mimicking behavior of people, understanding their stories (forming episodic memories) and asking additional questions to gather more information. • Creating realistic instincts for robotic pets.
Emovere: A Neuro-Cognitive Computational Framework for Research on Emotions David Cho, Quek Hiok Chai, W. Duch, Ng Geok See, A. Wahab, John Taylor (King’s College) and Looi Chee Kit (NIE/LSL)
Motivations • Humans can express emotions easily, but from a computational point of view emotions are not easy to understand. • Emotions are an important factor in intelligent behavior, including problem solving; they can help to focus attention on correct reasoning. • Understanding how to capture real emotions in an artificial system is a challenging problem. • Research on computational approaches to emotions is a state-of-the-art basic research topic. • HIT, DREAM and intelligent tutor projects will benefit from affective computing.
What is Emotion? “The core of an emotion is readiness to act in a certain way; it is an urgency, or prioritization, of some goals and plans rather than others. Emotions can interrupt ongoing action; also they prioritize certain kinds of social interaction, prompting, for instance, cooperation or conflict.” From N.H. Frijda, The Emotions, Cambridge, 1986.
Objectives • Neurocognitive computational framework for emotion • Encoding of Emotional Tags • Encoding of Episodic Memory using Emotional Tags and Emotional Expressions. • Information flow between memory modules, computational model similar to the brain info flow • Fusion of Multiple Modal Inputs • Facial, gesture, body posture, prosody and lexical content in speech. • Application • Implementation and validation of the model in an Intelligent Tutoring System.
Emotions using visual cues: face, gesture, and body posture. • Emotional face processing • Feedforward sweep through primary visual cortices ending up in associative cortices. • Projections at various levels of the visual (primary) and the associative cortices to the amygdala. • Activation of the prefrontal cortex initiates a re-prioritization of the salience of this face within the prefrontal cortex area. • Amygdala generates or simulates a motor response, effectively providing a simulation of the other person’s emotional state. • Emotional gesture processing is similar to face processing. • Emotional body posture understanding: • Body movements accompany specific emotions. • Coding schemata for the analysis of body movements and postures will be investigated.
Emotions using auditory cues: linguistic and prosody • Speech carries a significant amount of information about the emotional state of the speaker in the form of its prosody or paralinguistic content. • Temporal recurrent spiking networks have already been used to identify prosodic attitudes; even though only fundamental frequency was used, 6 attitudes were distinguished with 82% accuracy. • Primary and high level auditory cortices are involved in the extraction and perceptual processing of various prosodic cues. • The amygdala and the pre-frontal cortex appear to be responsible for translating these prosodic cues into emotional information regarding the speech source.
Affect-based Cognitive Skill Instruction in an Intelligent Tutoring System • Intelligent Tutoring Systems (ITS) • Integrating characteristics of human tutoring into ITS performance. • Providing the student with a more personalized and friendly environment for learning according to his/her needs and progress. • A platform to extend the emotional modeling to real life experiments with affect-driven instruction. • Will provide a reference for the use of affect in intelligent tutoring systems.
IDoCare: Infant Development and Care for development of perfect babies! Problem: about 5-10% of all children have a developmental disability that causes problems in their speech and language development. In the USA congenital hearing loss is identified at 2½ years of age! Solution: permanent monitoring of babies in the crib, stimulation, recording and analysis of their responses, providing guidelines for their perceptual and cognitive development, calling for expert help if needed. Key sensors: suction response (basic method in developmental psychology), motion detectors, auditory and visual monitoring. Potential: market for baby monitors (Sony, BT...) is billions of $; so far they only let parents hear or see the baby and play ambient music. W. Duch, D.L. Maskell, M.B. Pasquier, B. Schmidt, A. Wahab School of Computer Engineering, Nanyang Technological University