Active vision system for embodied intelligence based on retina sampling model and hierarchical representation
Janusz A. Starzyk, Xinming Yu
Ohio University, Athens, OH

INTRODUCTION
● With retina sampling, the vision system receives more useful information.
● Table 1 compares the retina sampling model (human vision) with uniform sampling (computer vision).
● The resolution (density of the sampling points) in the central part of the retina sampling is much higher than that of uniform sampling.

The retina structure is fundamental to the human vision system, which is much more efficient than any current robotic vision system.
● Photoreceptors (cones and rods) are concentrated around the fovea, giving the highest resolution on the target.
● The retina processes the sampled scene using ganglion cells and sends activation through the optic nerve and the LGN to the primary visual cortex (V1).
● The neurons in V1 fire in groups that respond to different visual features from the retina.

Fig. 1: Retina structure

Table 1: Comparison of human vision vs. computer vision (percentage of sampling points)
Region               Retina sampling (human vision)   Uniform sampling (computer vision)
Part inside blue     31%                              4%
Part inside black    52%                              14%
Part inside red      63%                              25%
Part inside green    78%                              50%
Whole range          100%                             100%

Retina sampling: Model of data Collection
The retina, unlike a camera, does not simply send a picture to the brain. It spatially encodes (compresses) the image to fit the limited capacity of the optic nerve.
● The photoreceptors are not evenly distributed in the retina. Most of them are concentrated on or around the fovea.
● The 1D probability distribution curve is shown in Fig. 3.
● The cortex receives distorted images, which are sharper in the fovea area.
● The fovea is the reference point of gaze shifting and focuses on the most interesting part of the scene.

The retina sampling model uses a prespecified sampling density. Fig. 2 shows the density distribution curves of the cones in the retina. Fig. 4 shows the sampling points of the retina model, with higher density in the center than on the periphery.

When this artificial retina sampling is applied to a visual scene, the vision system receives much more data from the object in focus and still has peripheral vision.

Fig. 2: Cone densities in human retina [1]
Fig. 3: PDF of the photoreceptors (cones and rods) [2]
Fig. 4: Sampling points for retina model
Fig. 5: An example of retina sampling. Original resolution: 900x900; resolution after sampling: 60x60

Correlation-based Connection
● Linsker obtained useful features in the visual field with a fixed connectivity model and noise input for self-organizing training [3]. The disadvantages of the fixed connectivity model are:
◘ It may not deliver connections to remote but correlated areas of the visual field. Fig. 7 shows that such remote but correlated areas exist.
◘ It may not result in useful features on higher levels.
◘ The local connectivity region is set arbitrarily.
● In the primary visual cortex (V1), neurons are activated by stimuli from similar groups of inputs.
● Connections built from the correlation of the input reflect observed relations in the real world. Fig. 6 shows the correlations based on real images.

In the active vision system, we apply a connection mechanism based on the correlation between input neurons' activation and the activation of local winners.
● Correlation between input neurons' activation
◘ Use real image data instead of noise.
◘ Use images organized in time sequences to obtain feedback connections for invariance building.
◘ Process the input data from layer N-1 and calculate the correlations.
◘ For each neuron, find the best-correlated set of neurons and create connections to those neurons.
● Local winners are used to adjust the connection weights
◘ The local winners are activated (e.g. the green one in layer N).
◘ The weights of connections to the neighbors of a local winner are adjusted.
◘ The local winners help their neighbors to fire together (horizontal red arrows are excitatory).
◘ All winner sets in layer N (local winners and their neighbors) are used to activate layer N+1.
◘ Use Oja's learning rule [4] to adjust the weights of connections to the winners.

Procedure of the weight adjustment: activate layer N-1 → find the strongest winners in layer N → excite the neighbors as co-winners → adjust weights for all activated neurons → activate layer N+1.

● Correlation-based sparse connections are used to mimic the neuron connections in V1.
● Neurons that are locally correlated connect to the same group of neurons in the higher layer.
● Winners and their neighbors fire and have their weights adjusted together, for smooth processing and increased robustness.

Fig. 6: Correlation of the input data
Fig. 7: Correlation-based connections with remote but correlated area
Fig. 8: The excitation of local winner and its neighbors

An active servo system, shown in Fig. 9, is being built with real-time video input to demonstrate the active vision system for embodied intelligence. Both the retina sampling model and the correlation-based connections are used with the servo system.
◘ A webcam captures the visual data.
◘ The raw data is uniformly sampled (320x240 pixels). The retina model processes it first and compresses it to 40x30 with little data loss in the center.
◘ With the compressed data and the correlation-based sparse connections, the active vision system processes the real-time input, finds the interesting object, and generates the object coordinates.
◘ The servo system receives the real-time coordinates and follows the object with a laser pointer.

Fig. 9: Servo system
Fig. 10: Servo system working with the active vision system to follow the object in view

Building up memories from environment
Hierarchical representation learning is based on external reinforcement for primitive goals and on an internal goal creation system for abstract goals and internal rewards.

Fig. 11: The pathways through which the system is built up from interactions with the unspecified environment

In learning, it is not easy to obtain examples of desired behavior that are both correct and representative of all the situations in which the agent has to act. Reinforcement learning (RL) is a good choice for learning in an unspecified external environment. As shown in Fig. 12, the agent (A) receives data from the environment (E), consisting of input (I) and reward (R), and sends an action (M) back to the environment. With the aid of the reward, the agent learns which actions yield the maximum reward.

Fig. 12: The reinforcement learning model

The goal creation system provides a mechanism that organizes the learning of intentional representations and of associations between sensory and motor pathways. When an agent realizes that a specific action resulted in a desirable effect related to the current goal, it stores a representation of the perceived object involved in that action and learns associations between the sensory and motor pathways.

CONCLUSIONS
• An active vision system for embodied intelligence based on the retina sampling model and hierarchical representation is developed.
• The retina sampling model mimics the efficiency of the human vision system.
• A hierarchical representation is built up with sparse connections, which are generated locally from the correlation of the neurons' activity.
• Using the goal creation system learning scheme, the active vision system can learn complex knowledge.
• Goals evolve from simple ones through interaction with the environment.
• Such organization of the learning process is conducive to the creation of a general intelligence with a self-organizing structure and dynamic goals.

BIBLIOGRAPHY
[1] Curcio, C.A., Sloan, K.R. Jr, Packer, O., Hendrickson, A.E. & Kalina, R.E., "Distribution of cones in human and monkey retina: individual variability and radial asymmetry", Science 236, pp. 579-582, 1987.
[2] Riedel G., Physiology of Human Cells. Available: http://www.aberdeen.ac.uk/sms/ugradteaching/course.php?ID=10
[3] Linsker R., "From Basic Network Principles to Neural Architecture: Emergence of Spatial-Opponent Cells", Proc. National Academy of Sciences, Vol. 83, pp. 7508-7512, 1986.
[4] Oja E., "Simplified neuron model as a principal component analyzer", Journal of Mathematical Biology 15 (3), pp. 267-273, 1982.
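
The retina sampling model described in the poster can be illustrated with a minimal Python sketch. It is not the authors' implementation: the poster takes its sampling density from the cone density curves (Fig. 2), whereas this sketch assumes a simple power-law warp of each image axis. The names retina_axis, retina_sample and central_fraction, and the parameter power, are hypothetical.

```python
import math


def retina_axis(n_out, n_in, power=2.0):
    """Return n_out source indices along one axis, densest near the center.

    `power` is an assumed warp exponent (not from the poster): 1.0 gives
    uniform sampling, larger values put more sampling points in the center.
    """
    indices = []
    for i in range(n_out):
        u = (2 * i + 1) / n_out - 1.0           # output cell center in (-1, 1)
        x = math.copysign(abs(u) ** power, u)   # warp: small steps near 0
        src = int((x + 1.0) / 2.0 * n_in)       # map back to a source index
        indices.append(min(max(src, 0), n_in - 1))
    return indices


def retina_sample(image, out_rows, out_cols, power=2.0):
    """Sample a 2D list `image` (list of rows) on the warped grid."""
    rows = retina_axis(out_rows, len(image), power)
    cols = retina_axis(out_cols, len(image[0]), power)
    return [[image[r][c] for c in cols] for r in rows]


def central_fraction(indices, n_in):
    """Fraction of sampling points that fall in the central half of the axis."""
    lo, hi = n_in / 4, 3 * n_in / 4
    return sum(lo <= i < hi for i in indices) / len(indices)
```

For example, retina_sample(image, 60, 60) reduces a 900x900 image to 60x60, the resolutions quoted for Fig. 5. Comparing central_fraction(retina_axis(60, 900), 900) with the power=1.0 case shows the effect summarized in Table 1: the warped grid places a larger share of its sampling points in the center than uniform sampling does.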
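
The correlation-based connection step (process the input data from layer N-1, calculate the correlations, and connect each neuron to its best-correlated set) might be sketched as follows. This is an illustration under stated assumptions, not the poster's code: it uses the Pearson correlation over a sequence of frames, treats every input neuron as a seed, and keeps a fixed number k of connections per neuron. The function names and k are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlated_connections(activations, k):
    """Map each neuron to the k neurons whose activity correlates best with it.

    activations[t][i] is the activation of neuron i in layer N-1 at frame t.
    Distance between neurons plays no role, so remote but correlated
    neurons can be connected.
    """
    n_neurons = len(activations[0])
    series = [[frame[i] for frame in activations] for i in range(n_neurons)]
    connections = {}
    for i in range(n_neurons):
        scores = [(pearson(series[i], series[j]), j)
                  for j in range(n_neurons) if j != i]
        scores.sort(reverse=True)               # highest correlation first
        connections[i] = [j for _, j in scores[:k]]
    return connections
```

Because the selection depends only on correlation, two neurons that are far apart in the visual field but fire together (the situation shown in Fig. 7) end up connected, which a fixed local connectivity region would not allow.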
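
The weight-adjustment procedure (find the winner in layer N, excite its neighbors as co-winners, adjust the weights of all activated neurons) can be sketched with the standard form of Oja's rule [4], w ← w + η·y·(x − y·w) with y = w·x. The poster's own equations are not available here, so this form is an assumption, as are the 1D neighborhood, the single strongest winner per step, and the names train_step, radius and eta.

```python
def activation(w, x):
    """Linear neuron output y = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def oja_update(w, x, eta=0.1):
    """One step of Oja's rule: w + eta * y * (x - y * w)."""
    y = activation(w, x)
    return [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]


def train_step(weights, x, radius=1, eta=0.1):
    """Update the strongest neuron in layer N and its neighbors in place.

    weights[i] is the weight vector of neuron i; neighbors are the neurons
    within `radius` positions of the winner (an assumed 1D layout).
    Returns the index of the winner.
    """
    outputs = [activation(w, x) for w in weights]
    winner = max(range(len(weights)), key=lambda i: outputs[i])
    lo = max(0, winner - radius)
    hi = min(len(weights), winner + radius + 1)
    for i in range(lo, hi):                     # winner plus co-winners
        weights[i] = oja_update(weights[i], x, eta)
    return winner
```

With a unit-length input and a small eta, repeated calls to oja_update pull a weight vector toward the input direction while keeping its length bounded, which is why the rule needs no separate normalization step. Larger inputs require a smaller eta for the iteration to stay stable.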
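
The reinforcement learning loop of Fig. 12 (the agent A receives input I and reward R from the environment E and sends action M back) can be made concrete with a toy example. The poster does not name an RL algorithm, so this sketch swaps in a simple epsilon-greedy action-value learner on an environment with fixed rewards. The classes Environment and Agent, the reward values, and all parameters are hypothetical.

```python
import random


class Environment:
    """Toy environment E: each action M yields a fixed reward R."""

    def __init__(self, rewards):
        self.rewards = rewards

    def step(self, action):
        observation = 0                 # input I: one constant state in this toy
        return observation, self.rewards[action]


class Agent:
    """Agent A: keeps a running mean reward per action (epsilon-greedy)."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.values = [0.0] * n_actions
        self.counts = [0] * n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def best_action(self):
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def act(self):
        untried = [a for a, n in enumerate(self.counts) if n == 0]
        if untried:                     # try every action once first
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))   # explore
        return self.best_action()                         # exploit

    def learn(self, action, reward):
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(env, agent, steps):
    """Run the A-E loop and return the total reward collected."""
    total = 0.0
    for _ in range(steps):
        action = agent.act()
        _observation, reward = env.step(action)   # observation unused in this toy
        agent.learn(action, reward)
        total += reward
    return total
```

After run(Environment([0.2, 1.0, 0.5]), agent, 100), agent.best_action() returns the index of the highest reward. The goal creation system described in the poster goes further than this sketch: it also generates internal rewards for abstract goals, which a fixed external reward table does not model.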