
Baby’s Eye View: Temporal Dynamics of Rapid Visual Object Learning

We hope to show that infants might be able to gather enough information to learn to locate faces very quickly. To this end, we will gather visual information and other information available to infants.


Presentation Transcript


• We hope to show that infants might be able to gather enough information to learn to locate faces very quickly.
• To this end, we will gather visual information and other information available to infants.
• We will present a computational learning system that shows that using only this limited amount of information, faces can be reliably located in images.

• 3700 images collected over 90 minutes of interaction.
• No experimenter intervention.
• Variety of lighting and background conditions.
• No post-processing of images (rectification, etc.).

• To collect data from a Baby’s Eye View, we created BEV, a simple baby robot.
• BEV has two sensors:
  - A microphone in the chest to detect overall volume.
  - An IEEE 1394 webcam in the forehead, capturing unrectified 320x240 pixel images.
• BEV has one actuator:
  - A monaural speaker in the chest, for vocalization.

• BEV was attached to a Mac PowerBook G4 laptop that ran contingency detection software and stored data continuously for 88 minutes while the experiment was in progress, making moment-to-moment decisions about how to best vocalize in order to detect people.
• BEV used her speaker to utter baby sounds collected from the Internet. There were five sounds, ranked in level of excitement from high -> low by the experimenters. These were uttered when high -> low levels of contingency were detected respectively.
• 9 subjects were asked to interact with BEV so as to make her excited.
• An image was added to the dataset whenever a) a vocalization was made, and b) BEV was 97.5% confident that a person was present or absent.
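The collection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the software that ran on BEV: the function names, the sound identifiers, the assumption that contingency is expressed as a value in [0, 1], and the equal-width binning of that value into five sounds are all hypothetical. Only the 97.5% confidence threshold and the five ranked sounds come from the poster.

```python
# Hypothetical sketch of BEV's vocalization choice and image-labelling rule.
# Only the 97.5% threshold and the five ranked sounds come from the poster;
# every name and the equal-width binning below are assumptions.

CONFIDENCE_THRESHOLD = 0.975

# Five baby sounds ranked by excitement, from high to low (names invented).
SOUNDS = ["excited_5", "excited_4", "excited_3", "excited_2", "excited_1"]


def choose_sound(contingency_level: float) -> str:
    """Pick a sound so that higher contingency gives a more excited sound.

    Assumes contingency_level lies in [0, 1]; values outside are clipped.
    """
    clipped = min(max(contingency_level, 0.0), 1.0)
    rank = min(int((1.0 - clipped) * len(SOUNDS)), len(SOUNDS) - 1)
    return SOUNDS[rank]


def weak_label(p_person: float, vocalized: bool):
    """Return 1 (person present), 0 (person absent), or None (skip image).

    An image is stored only when a vocalization was made and the detector
    is at least 97.5% confident either way.
    """
    if not vocalized:
        return None
    if p_person >= CONFIDENCE_THRESHOLD:
        return 1
    if p_person <= 1.0 - CONFIDENCE_THRESHOLD:
        return 0
    return None
```

For example, `weak_label(0.99, True)` yields a positive label, while an uncertain frame such as `weak_label(0.5, True)` is skipped, which keeps ambiguous images out of the dataset.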
Baby’s Eye View: Temporal Dynamics of Rapid Visual Object Learning
Nicholas Butko ♦ Dept. of Cognitive Science ♦ UCSD ♦ nbutko@cogsci.ucsd.edu
Ian Fasel, Javier Movellan ♦ Institute for Neural Computation ♦ {ianfasel, movellan}@mplab.ucsd.edu

Motivation
We set out to explore the nature of the visual information that neonate infants have available to them. Is it enough to learn detailed object categories reliably? If so, this provides evidence that an alternative hypothesis to the dominant paradigm is feasible, viz. that infants may not be born with the ability to recognize conspecifics.

Current Hypotheses
• Social Hypothesis: Infants are genetically predisposed to look at things that look like human faces.
“We wish to propose the general term CONSPEC to refer to a unit of mental architecture in any species that ... contains structural information concerning the visual characteristics of conspecifics.” [bold emph. added] --Morton & Johnson, Psych. Review, 1991
• Sensory Hypothesis: Infants look for general visual features, which are shared by faces. --Kleiner & Banks, Experimental Psych., Human Perception & Performance 1987

Rapid Learning Hypothesis
Infants quickly become interested in certain aspects of the visual scene presented to them, and learn to attend to specific salient things.

Evidence for Rapid Learning
• Bushnell et al. 1989: 2-day-old infants fixate longer on images of their mothers than on images of other women with similar hair color and facial complexion.

Schematic Generalization
“We would not expect the experience of the mother’s face to transfer to the two-dimensional schematic stimuli used with newborns.” --Morton & Johnson, Psych. Review, 1991

Social Contingency
• Watson (1972) found that two-month-old infants exhibit social responses to contingent mobiles, indicating that infants use contingency as a means of identifying caregivers.
• Movellan & Watson (1985) found that ten-month-old infants are highly optimized detectors of contingency.
• Movellan (2002) developed a model of this optimal contingency detection based on the principles of information maximization and optimal control.
• For this experiment we used auditory contingencies as a cue for the presence of a person. However, other cues like touch or uninitiated motion may be more appropriate for neonates, and should produce similar results.

Segmental Boltzmann Processes
• Fasel & Movellan (2006) developed a novel visual learning algorithm called “Segmental Boltzmann processes” (SBP).
• The algorithm is weakly supervised. It requires one label per image, indicating whether an object of a category of interest is in that image with a probability better than chance.
• From this weak label, the algorithm learns to localize the object of interest in novel images, or to indicate that the object is not present.
• The algorithm is a probabilistic model that looks for “objects”: clusters of pixels that are codependent but independent of the rest of the image.
• Segmental Boltzmann Processes can be viewed as a connectionist architecture, simulating 4,000,000 neurons running in real time (30 frames per second).
• Segmental Boltzmann Processes are ideal for multimodal learning in which a secondary modality can provide a better-than-chance label about the presence of an object in the visual field.

[Figure panels, images not included: Very little information required; Performance of one BEV-Trained SBP learner on Johnson Stimuli; Performance of all BEV-Trained SBP learners on Johnson Stimuli; Max Posterior & Posterior Probability Maps; Evidence for Conspecific Processing [Johnson et al., Cognition, 1991]. Axis labels: 0 Minutes, 3 Minutes, 6 Minutes (of 90 Minutes); (of 3700 Images).]
[Other poster section headings: Computational Model; Can Faces Be Learned?; Baby Robot BEV; Contingency Detection and Data Collection; Hyper Adaptation?]

BEV Dataset
• Contingency Detected: 18% - No face; 4% - No Person
• No Contingency Detected: 17% - Face; 20% - Person
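One way to read the dataset percentages above is as label noise: the fraction of images in each contingency group whose content contradicts the weak label. The poster does not spell this reading out, so it is an assumption, as are all names in the sketch below. Under that reading, a short check shows that the contingency label is far more often right than wrong about faces, which is the "better than chance" condition the learning algorithm requires.

```python
# Assumed reading of the poster's figures: share of images in each
# contingency group whose content contradicts the weak label.
LABEL_NOISE = {
    "contingency_detected": {"no_face": 0.18, "no_person": 0.04},
    "no_contingency_detected": {"face": 0.17, "person": 0.20},
}


def face_rate_by_label(noise):
    """Return (P(face | contingency), P(face | no contingency))."""
    p_face_given_pos = 1.0 - noise["contingency_detected"]["no_face"]
    p_face_given_neg = noise["no_contingency_detected"]["face"]
    return p_face_given_pos, p_face_given_neg


def is_informative(p_face_given_pos, p_face_given_neg):
    """A weak label carries information if faces are more likely under it."""
    return p_face_given_pos > p_face_given_neg


p_pos, p_neg = face_rate_by_label(LABEL_NOISE)
print(f"P(face | contingency)    = {p_pos:.2f}")
print(f"P(face | no contingency) = {p_neg:.2f}")
print("label informative:", is_informative(p_pos, p_neg))
```

The overall label accuracy cannot be computed from these figures alone, because the poster text does not give the number of images in each group.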
