IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots
Gianluca Baldassarre, Marco Mirolli, Francesco Mannella, Vincenzo Fiore, Stefano Zappacosta, Daniele Caligiore, Fabian Chersi, Vieri Santucci, Simona Bosco

Outline
• The figures of the project
• The project vision
• The 3 pillars of the project idea + 4 S/T objectives
• WP3: Experiments
• WP4: Abstraction
• WP5: Intrinsic motivations
• WP6: Hierarchical architectures
• WP7: Integration and demonstrators
• Conclusions

The figures of the project
• Integrated project
• Call: Cognitive Systems, Interactions and Robotics
• EU funding: 5.9 million euros
• 7 (8) partners
• Start: May 2009
• End: April 2013

Vision: the problem
• How can we create “truly intelligent” robots?
  • Versatile: have many goals; re-use skills
  • Robust: function in different conditions, with noise
  • Autonomous: learning is paramount
• Weng, McClelland, Pentland, Sporns, Stockman, Sur, Thelen (Science, 2001):
  • …knowledge-based systems (e.g. production systems)…
  • …learning systems focussed on single tasks (e.g. RL)…
  • …evolutionary systems…
• Important results, but limited autonomy and scalability…
• …on the contrary, organisms do scale, are flexible, and are robust!

Vision: the idea
• Why are organisms so special?
• Looking at children…

Vision: the idea
Ingredients:
• Powerful abstractions: “elephant on table leg”, “it slides down”
• Explore
• Record interesting states
• Intrinsic motivations (interesting states, learning rates):
  • motivate to reproduce states (goals)
  • guide learning of skills
• Skills are re-used and composed:
  • to explore
  • to produce new skills
• Science: which brain and behavioural mechanisms are behind these processes?
• Technology: can we reverse engineer them? Can we design algorithms with a similar power?

Vision: 2 promises
• Science: we can understand organisms
• Technology: we can develop a new methodology for designing robots, in particular robots that:
  • learn actions cumulatively…
  • …re-use them to build other actions…
  • …on the basis of intrinsic motivations…
  • …and achieve externally assigned goals with them.

Vision: how we will do it: 3 pillars + 4 S/T objectives
[Diagram: from Science to Technology, and from Technology to Science]
4 S/T objectives:
1. Empirical investigations: monkeys, children, adults, Parkinson patients
2. Computational bio-constrained models: mechanisms underlying brain and behaviour
3. Machine-learning models: powerful algorithms and architectures
4. Two robotic demonstrators: CLEVER-B, CLEVER-K
3 pillars:
• WP4: Abstraction and attention (suitable representations)
• WP5: Intrinsic motivations (focussing learning)
• WP6: Hierarchical architectures to support cumulative learning

The project WPs
[Same diagram, annotated with the work packages]
• WP3: Empirical investigations
• WP4: Abstraction and attention
• WP5: Intrinsic motivations
• WP6: Hierarchical architectures to support cumulative learning
• WP7: Two robotic demonstrators (CLEVER-B, CLEVER-K)

WP3: Experiments and mechatronic board

WP3: “Joystick experiment” background
• USFD (Peter Redgrave & Kevin Gurney)
• Actions → novel outcomes → dopamine → BG (basal ganglia) learning
Redgrave & Gurney, 2006, Nature Rev. Neuroscience

WP3: Empirical Experiments: “Joystick experiment”
• Method:
  • Adult humans and Parkinsonian patients
  • Joystick manoeuvring (gesture, location, timing) of a cursor on a screen to obtain a reinforcement or a salient event
• For studying: actions → novel outcomes → dopamine → BG learning

WP3: Empirical Experiments: “Board experiment”
• UCBM-LBRB (Eugenio Guglielmelli): mechatronic board, intelligent sensors
• UCBM-LDN (Flavio Keller): children
• CNR-ISTC-UCP (Elisabetta Visalberghi): monkeys
• Goals: (a) investigating the properties of stimuli causing intrinsic motivations; (b) acquisition of skills based on intrinsic motivations
[Figure labels: tactile sensors; inertial/magnetic unit + battery + wireless]
Sabbatini, Stammati, Tavares, Visalberghi, 2007, Amer. J. Primatology
Campolo, Taffoni, Schiavone, Formica, Guglielmelli, Keller, 2009, Int. J. Social Robotics

WP4: Abstraction

WP4 Abstraction: motor, perception, attention, vergence
• Abstraction is a key ingredient for intrinsic motivations and hierarchical actions
• Motor: key in hierarchies
• Perceptual: key in intrinsic motivations, e.g. retina images would always be novel without abstraction
• Attention/vergence: two key forms of abstraction

WP4 Intrinsic motivations for developing vergence and perceptual abstraction
• FIAS (Jochen Triesch)
• E.g.: a reward when the target is fixated with both eyes drives the development of vergence
• Similar mechanisms to develop perceptual abstraction
Franz & Triesch, 2007, ICDL; Weber & Triesch, 2009, IJCNN

WP5: Novelty detection

WP5 Intrinsic (extrinsic) motivations
• Intrinsic motivations (skill/knowledge acquisition):
  • Psychology: motivate actions for their own sake
  • Drive actions whose effects are an increase in: (a) knowledge or prediction ability; (b) competence to do
  • Stop driving actions once the knowledge/competence is acquired
• Extrinsic motivations (e.g. food, sex, money):
  • Psychology (Berlyne, White, Deci & Ryan): motivate actions to achieve specific goals
  • Drive actions whose effects directly increase fitness
  • Return whenever the homeostatic needs they are associated with recur

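The contrast above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the intrinsic signal is the absolute prediction error of a simple delta-rule predictor and that the extrinsic drive is a thresholded homeostatic need; the class, function and parameter names are hypothetical and are not taken from the project.

```python
class PredictionErrorMotivation:
    """Hypothetical knowledge-based intrinsic reward: the surprise of a predictor."""

    def __init__(self, learning_rate=0.2):
        self.learning_rate = learning_rate
        self.prediction = {}  # action -> predicted outcome (a scalar)

    def reward(self, action, outcome):
        predicted = self.prediction.get(action, 0.0)
        error = outcome - predicted
        # Delta rule: move the prediction towards the observed outcome.
        self.prediction[action] = predicted + self.learning_rate * error
        # The intrinsic reward is the size of the surprise before learning.
        return abs(error)


def extrinsic_drive(need_level, threshold=0.5):
    """Hypothetical extrinsic drive: active whenever a need exceeds a threshold."""
    return max(0.0, need_level - threshold)


motivation = PredictionErrorMotivation()
# The same action with the same outcome, repeated 30 times.
rewards = [motivation.reward("press_button", 1.0) for _ in range(30)]
```

In this sketch `rewards` shrinks towards zero as the outcome becomes predictable, so the intrinsic signal stops driving the action once the knowledge is acquired, whereas `extrinsic_drive` becomes positive again each time the need level rises above the threshold.
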
WP5 Intrinsic motivations
• CNR-LOCEN (Gianluca Baldassarre, Marco Mirolli)
• Young robot: low level of hierarchy develops skills based on evolved ‘reinforcers’ (knowledge-based intrinsic motivations)
• Young robot: high level of hierarchy selects skills which produce the highest surprise (competence-based intrinsic motivations)
• Adult robot: high level of hierarchy performs skill composition to achieve salient goals (external rewards, fitness measure)
[Figures: young robot results, before and after learning; adult robot results; child robot task and adult robot tasks]
Schembri, Mirolli, Baldassarre, 2007, ICDL, ECAL, EPIROB

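The idea of a high level that selects the skill producing the highest surprise can be sketched as follows. This is a strong simplification under stated assumptions, not the architecture of the cited work: surprise is taken to be the error of a per-skill success predictor, selection is greedy, and the two skills and the `attempt` function are hypothetical.

```python
import random


class SurpriseSelector:
    """Hypothetical selector that favours the skill with the highest recent surprise."""

    def __init__(self, skills, rate=0.1, seed=0):
        self.rate = rate
        self.expected_success = {s: 0.0 for s in skills}
        self.surprise = {s: 1.0 for s in skills}  # optimistic start: try everything
        self.rng = random.Random(seed)

    def choose(self):
        best = max(self.surprise.values())
        candidates = [s for s, v in self.surprise.items() if v == best]
        return self.rng.choice(candidates)  # break ties at random

    def update(self, skill, success):
        error = float(success) - self.expected_success[skill]
        # Learn to predict the outcome of the skill.
        self.expected_success[skill] += self.rate * error
        # Running average of the absolute prediction error (the surprise).
        self.surprise[skill] += self.rate * (abs(error) - self.surprise[skill])


def attempt(skill):
    """Toy world (assumption): one skill always works, the other never does."""
    return skill == "learnable"


selector = SurpriseSelector(["impossible", "learnable"])
counts = {"impossible": 0, "learnable": 0}
for _ in range(100):
    skill = selector.choose()
    selector.update(skill, attempt(skill))
    counts[skill] += 1
```

In this toy run the selector spends more trials on the skill whose outcome it still has to learn to predict than on the one that never yields anything unexpected, and it loses interest in both as the surprise fades.
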
WP5 Novelty detection with habituable neural networks
• UU (Ulrich Nehmzow)
• Task: find novel elements in the world
• Image pre-processing (abstraction)
• Habituable neural network
[Figure from Marsland et al., 2005, J. Rob. Aut. Sys.]
Neto & Nehmzow, 2007, Rob. & Aut. Syst.

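A habituating novelty filter of the kind named above can be sketched in a few lines. The sketch assumes a nearest-prototype memory in place of a neural network and a first-order habituation rule with illustrative constants; `radius`, `alpha` and `tau` are hypothetical parameters, and none of this reproduces the cited implementations.

```python
import math


class HabituatingNoveltyFilter:
    """Hypothetical novelty filter: stored prototypes habituate when re-seen."""

    def __init__(self, radius=0.5, alpha=1.05, tau=3.33):
        self.radius = radius    # inputs closer than this match a prototype
        self.alpha = alpha      # recovery strength towards the resting value 1.0
        self.tau = tau          # time constant of habituation
        self.prototypes = []    # stored (abstracted) feature vectors
        self.habituation = []   # efficacy per prototype, 1.0 = not habituated

    def _match(self, features):
        for index, prototype in enumerate(self.prototypes):
            if math.dist(prototype, features) <= self.radius:
                return index
        return None

    def novelty(self, features):
        """Return a novelty score (1.0 = never seen) and habituate to the input."""
        index = self._match(features)
        if index is None:
            self.prototypes.append(tuple(features))
            self.habituation.append(1.0)
            index = len(self.prototypes) - 1
        score = self.habituation[index]
        # Discrete-time habituation under a constant stimulus of strength 1.
        self.habituation[index] = score + (self.alpha * (1.0 - score) - 1.0) / self.tau
        return score


novelty_filter = HabituatingNoveltyFilter()
scores = [novelty_filter.novelty((0.0, 0.0)) for _ in range(5)]
fresh = novelty_filter.novelty((5.0, 5.0))
```

Repeated presentations of the same features give falling scores, while a previously unseen input such as `(5.0, 5.0)` is again reported as fully novel.
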
WP5 Intrinsic motivations based on information theory
• IDSIA (Juergen Schmidhuber)
• Theoretical ML, robotics, information-theoretic intrinsic motivations
• ‘Data compression improvement’ = intrinsic motivation
Schmidhuber, 2009, Journal of SICE

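The equation ‘data compression improvement’ = intrinsic motivation can be made concrete with a toy example. The sketch assumes a very simple compressor, a symbol-frequency model with add-one smoothing, and measures the reward as the number of bits saved on the history after the model has learned from the newest observation; the function names and the model are assumptions chosen for brevity.

```python
import math
from collections import Counter


def code_length(history, counts, alphabet_size):
    """Bits needed to encode the history under a smoothed frequency model."""
    total = sum(counts.values())
    bits = 0.0
    for symbol in history:
        probability = (counts[symbol] + 1) / (total + alphabet_size)
        bits -= math.log2(probability)
    return bits


def compression_progress(history, counts, new_symbol, alphabet_size):
    """Intrinsic reward: bits saved on the history by learning from new_symbol.

    Note: history and counts are updated in place.
    """
    history.append(new_symbol)
    before = code_length(history, counts, alphabet_size)
    counts[new_symbol] += 1  # the "learning" step of this toy model
    after = code_length(history, counts, alphabet_size)
    return before - after


history, counts = [], Counter()
# A perfectly regular stream over a two-symbol alphabet.
rewards = [compression_progress(history, counts, "a", 2) for _ in range(50)]
```

On this regular stream the reward is positive while the model is still improving and then shrinks towards zero: data that are already well compressed stop being interesting.
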
WP6: Hierarchical architectures

WP6 Hierarchical architectures
Cumulative learning needs hierarchical architectures:
• To avoid catastrophic forgetting
• To find solutions by ‘composing skills’: dirty but fast solutions, then refine
• Because the brain is hierarchical
• Because the brain has a (soft) modularity at all levels
[Figure from Fuster, 2001, Neuron]
McGovern, Sutton, Fagg

WP6 Intrinsic motivations, hierarchical RL (options)
• UMASS (Andrew Barto)
• Intrinsically Motivated Reinforcement Learning
• HRL: options theory
Sutton et al., Option theory; Simsek & Barto, 2006, ICML; Singh, Barto & Chentanez, 2004, NIPS

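In the options framework an option is a triple: an initiation set, an intra-option policy and a termination condition. The sketch below shows that data structure and how an option is executed until it terminates; the corridor world, the `go_to_door` option and all function names are hypothetical examples, not part of the cited work.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class Option:
    name: str
    can_start: Callable[[int], bool]          # initiation set
    policy: Callable[[int], int]              # intra-option policy: state -> action
    stop_probability: Callable[[int], float]  # termination condition


def run_option(option, state, step, rng, max_steps=100):
    """Follow the option's policy until its termination condition fires."""
    if not option.can_start(state):
        raise ValueError(f"option {option.name!r} cannot start in state {state}")
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        # Termination is sampled in the state the option has just reached.
        if rng.random() < option.stop_probability(state):
            break
    return state


def corridor_step(state, action):
    """Toy corridor (assumption): positions 0..10, actions are -1 or +1."""
    return min(10, max(0, state + action))


go_to_door = Option(
    name="go_to_door",
    can_start=lambda s: s < 5,                            # only left of the door
    policy=lambda s: 1,                                   # always step right
    stop_probability=lambda s: 1.0 if s == 5 else 0.0,    # stop at the door
)

final_state = run_option(go_to_door, 0, corridor_step, random.Random(0))
```

A higher level can then treat `go_to_door` as a single temporally extended action, which is what allows learned skills to be re-used and composed.
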
WP6 Bio-inspired / bio-constrained hierarchical reinforcement learning
• CNR-LOCEN (Gianluca Baldassarre & Marco Mirolli)
• Piaget's theory: actions support the learning of other actions
• Camera, dynamic arm, reaching tasks
• Continuous state/action reinforcement learning
• Hierarchical RL: segmentation, Piaget
Caligiore, Borghi, Parisi, Mirolli, Baldassarre, ongoing

WP6 Development of sensorimotor mappings in robots
• AU (Mark Lee)
• Developmental psychology and robotics
• Staged development of sensorimotor behaviour
• LCAS: Lift Constraint, Act, and Saturate
Lee, Meng, Chao, 2007, Adaptive Behaviour; Lee, Meng, Chao, 2007, Rob. & Auton. Sys.

WP7: Integration

WP7 CLEVER-K: Kitchen scenario
• 3 iCub robots from IIT (Giorgio Metta)
• Leave a robot alone for a month or so…
• …interacting with the environment…
• …on the basis of intrinsic motivations…
• …and it will build up a repertoire of actions incrementally.
• Come back and assign it a goal (e.g. by reward)…
• …and it will learn to accomplish it very quickly.
• Main responsible: IDSIA, UU

WP7 CLEVER-B: Board scenario
• Main responsible: AU, CNR-LOCEN

Conclusions: A timely project
• Timely research goals: intrinsic motivations, hierarchical architectures
• Within important trends:
  • developmental robotics
  • computational system neuroscience
  • emotions/motivations
• In synergy with various events: EpiRob, ICDL, J. of Autonomous Mental Development
• In line with EU calls: “Cognitive Systems, Interactions and Robotics”
• First EU Integrated Project wholly focussed on these topics
www.im-clever.eu