Developmental Artificial Intelligence
27 March 2014
Olivier.georgeon@liris.cnrs.fr
http://www.oliviergeorgeon.com
Outline
• Organization in time and space.
• Formalism for spatio-temporal coupling.
• Cognitive architecture.
• Demonstrations.
• Exercise: implement your self-programming agent (follow-up).
Reminder of theoretical ideas
• The objective is to learn (discover, organize, and exploit) regularities of interaction in time and space, so as to satisfy innate criteria (survival, curiosity, etc.).
• To autonomously construct an ontology of reality from the experience of interaction.
• To detect and respond to increasingly sophisticated affordances (self-programming).
Trace-Based Reasoning
[Diagram: the trace of enacted interactions is matched against previous episodes and abstracted hierarchically over time; at each cycle the agent proposes, selects, and tries to enact an interaction, at increasing levels of abstraction.]
Example 2
Examples of learned behaviors
Example 2
Spatio-sequential regularity learning
Concept of affordance
• A property of an object or an environment that allows an individual to perform an action.
• “To afford” an action: “objects push us into doing” (Heinz Werner).
• Affordances are properties of the coupling between the agent and the environment.
• We know the world in terms of possibilities of interaction.
Formalism
• Traditional formalization: the agent sends an Action a ∈ A to the environment and receives an Observation o ∈ O.
• Learning by experiencing (Radical Interactionism): the agent chooses an Experience e ∈ E and the environment returns a Result r ∈ R. Interactions are pairs: I = E × R. Equivalently, the agent intends an interaction i ∈ I and receives the enacted interaction e ∈ I. X denotes the set of composite interactions. (A minimal coupling sketch follows.)
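As a minimal sketch of this coupling, written as interfaces for clarity (the exercise code later in this deck uses classes with the same method signatures):

// Sketch of the Radical Interactionism coupling: the agent chooses an
// experience; the environment returns a result; the enacted interaction
// is the pair (experience, result).
interface Agent {
    Experience chooseExperience(Result previousResult);
}
interface Environnement {
    Result giveResult(Experience experience);
}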
Spatial Radical Interactionism
• How to learn the existence of possibly persistent entities in the environment?
• How to adapt to different categories of entities?
• The agent intends an interaction i ∈ I; the environment returns the enacted interaction e ∈ I, together with a spatial position σ and a spatial transformation τ (see the sketch below).
• τ represents the vestibular system; it can be implemented through an accelerometer.
• σ represents position information (eye convergence, interaural phase difference, etc.).
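A minimal sketch of the spatial feedback, with assumed field names (this class is not part of the exercise code):

// Sketch (assumed names): in Spatial RI, the environment returns the
// enacted interaction together with spatial information.
class SpatialFeedback {
    Interaction enactedInteraction; // the enacted interaction e
    float[] sigma;                  // spatial position of the interaction
    float[] tau;                    // spatial transformation of the agent (e.g., from an accelerometer)
}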
Spatial Example 1
[Diagram: Agent and Environment; intended interaction; enacted interaction: ……; spatial position σ = (1,0); spatial transformation τ = (0,0).]
Recursivity problem
[Diagram: the decisional mechanism intends a composite interaction icd ∈ Xd and receives the enacted composite interaction ecd ∈ Xd from the environment “known” at time td, which wraps the primitive coupling (intended ipj ∈ I, enacted epj ∈ I, spatial position σ, spatial transformation τ) around the real environment.]
How to maintain recursivity?
Spatial Example 1
Enactive Cognitive Architecture
• The agent programs itself through the experience of interaction.
• The architecture does not program itself.
• (“Kantian space”; e.g., Buzsaki 2013, Space, time, and memory.)
Inspiration from neurosciences
More inspiration from neurosciences?
Cotterill, R. (2001). Progress in Neurobiology.
ECA agent example 2
From “drives” to “goals”
[Diagram: interactions that are afforded versus interactions that are simulated.]
Exercise 3
Exercise
• Two possible experiences: E = {e1, e2}.
• Two possible results: R = {r1, r2}.
• Four possible interactions: E × R = {i11, i12, i21, i22}.
• Environments (the two simplest are sketched after this list):
• environment0: e1 -> r1, e2 -> r2 (i12 and i21 are never enacted).
• environment1: e1 -> r2, e2 -> r1 (i11 and i22 are never enacted).
• environment2: e1 -> r2, e2 -> r2.
• environment3 (defined below).
• Motivational systems:
• motivation0: v(i12) = v(i22) = 1, v(i11) = v(i21) = -1.
• motivation1: etc.
• Implement an agent that learns to enact positive interactions without knowing its motivations and its environment a priori.
• Write a report of behavioral analysis based on activity traces.
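As announced above, a sketch of the two simplest environments, following the class skeleton of the next slide (e1, e2, r1, r2 are the static instances declared in the main class):

// environment0: e1 -> r1, e2 -> r2 (i12 and i21 are never enacted).
class Env0 extends Environnement {
    public Result giveResult(Experience e) {
        if (e == e1) return r1;
        else return r2;
    }
}
// environment1: e1 -> r2, e2 -> r1 (i11 and i22 are never enacted).
class Env1 extends Environnement {
    public Result giveResult(Experience e) {
        if (e == e1) return r2;
        else return r1;
    }
}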
Implementation
public static Experience e1 = new Experience();
public static Experience e2 = new Experience();
public static Result r1 = new Result();
public static Result r2 = new Result();
public static Interaction i11 = new Interaction(e1, r1, 1); // etc. for the other interactions

public static void main(String[] args)
    Agent agent = new Agent3(); // or Agent1(), Agent2()
    Environnement env = new Env3(); // or Env1(), Env2()
    for (int i = 0; i < 10; i++)
        e = agent.chooseExperience(r);
        r = env.giveResult(e);
        System.out.println(e + "," + r + "," + value); // trace the enacted interaction

Class Agent (subclasses Agent1, Agent2, Agent3)
    public Experience chooseExperience(Result r)
Class Environnement (subclasses Env1, Env2, Env3)
    public Result giveResult(Experience e)
Class Experience, Class Result
Class Interaction(experience, result, value)
    public int getValue()
Analysis of activity traces
Motivation1, Environment 2. Motivation1, Environment 0. Motivation1, Environment 1.
e1,r2,1 learn e1r1-e1r2,0
e1,r1,-1 learn e1r2-e1r1,0
e1,r1,-1 learn e1r1-e1r1,-2
e2,r2,1 learn e1r1-e2r2,0
e2,r1,-1 learn e2r2-e2r1,0
e2,r1,-1 learn e2r1-e2r1,-2
e1,r2,1 learn e2r1-e1r2,0
e2,r2,1 learn e1r2-e2r2,2
e1,r2,1 learn e2r2-e1r2,2
e2,r2,1
e1,r1,1 (×10)
e1,r2,-1
e2,r1,1 (×9)
Environment 3
• Behaves like environment0 during the first 10 cycles, then like environment1.
• Implementation (transcribed in Java below):
if (step < 10)
    if (experiment == e1) then result = r1
    if (experiment == e2) then result = r2
else
    if (experiment == e1) then result = r2
    if (experiment == e2) then result = r1
step++
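A direct Java transcription of this pseudocode, as a sketch (assuming the Environnement base class and the static e1, e2, r1, r2 of the implementation slide):

class Env3 extends Environnement {
    private int step = 0;
    public Result giveResult(Experience experiment) {
        Result result;
        if (step < 10) {
            // behaves like environment0
            result = (experiment == e1) ? r1 : r2;
        } else {
            // behaves like environment1
            result = (experiment == e1) ? r2 : r1;
        }
        step++;
        return result;
    }
}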
Principle of Agent 3
[Diagram: at the end of each step, the agent learns the composite interaction (it-1, it); at the beginning of the next step, the composite interactions whose pre-interaction matches the last enacted interaction, e.g. (i11,i12) and (i11,i11), are activated; each proposes the experience of its post-interaction; the agent chooses an experience (e.g. e1), tries to enact it, and the cycle repeats. Time axis: past, present, future.]
Implementation of Agent 2
• At the end of time step t:
• Record or reinforce the composite interaction ic = ⟨it-1, it⟩, with pre(ic) = it-1, post(ic) = it, and a weight (see the sketch after this list).
• If ic already belongs to the set of known interactions It, then weight++.
• At the beginning of time step t:
• Construct the list At of activated composite interactions: At = { i ∈ It | pre(i) = it-1 }.
• For each activated composite interaction ic in At, create a proposition for post(ic).experience with proclivity ic.weight × post(ic).valence.
• For each experience, sum up the proclivity of all its propositions.
• Choose the experience that has the highest total proclivity.
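A sketch of the end-of-step learning mechanism described above (knownInteractions, a map from labels to interactions, and incrementWeight are assumed helpers; the reference code may organize this differently):

// Called at the end of time step t with the two last enacted interactions.
Interaction learnCompositeInteraction(Interaction preInteraction, Interaction postInteraction) {
    String label = "<" + preInteraction.getLabel() + "," + postInteraction.getLabel() + ">";
    Interaction composite = knownInteractions.get(label);
    if (composite == null) {
        // record a new composite interaction, created with weight 1
        composite = new Interaction(preInteraction, postInteraction);
        knownInteractions.put(label, composite);
    } else {
        // reinforce an already known composite interaction: weight++
        composite.incrementWeight();
    }
    return composite;
}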
Class Interaction
// attributes:
Experience experience;
Result result;
int value;
String label;
Interaction preInteraction;
Interaction postInteraction;
int weight;
A composite interaction has a weight, a pre-interaction, and a post-interaction (a complete declaration is sketched below).
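A possible complete declaration, combining primitive and composite interactions in one class as the attribute list above suggests (the getValence convention for composites is one common choice, not necessarily the reference one):

class Interaction {
    Experience experience;        // primitive interactions only
    Result result;                // primitive interactions only
    int value;                    // valence of a primitive interaction
    String label;
    Interaction preInteraction;   // composite interactions only
    Interaction postInteraction;  // composite interactions only
    int weight;                   // reinforcement count of a composite interaction

    // primitive interaction
    Interaction(Experience experience, Result result, int value) {
        this.experience = experience;
        this.result = result;
        this.value = value;
    }
    // composite interaction, created with an initial weight of 1
    Interaction(Interaction preInteraction, Interaction postInteraction) {
        this.preInteraction = preInteraction;
        this.postInteraction = postInteraction;
        this.weight = 1;
    }
    Experience getExperience() { return experience; }
    Interaction getPostInteraction() { return postInteraction; }
    int getWeight() { return weight; }
    void incrementWeight() { weight++; }
    String getLabel() { return label; }
    int getValence() {
        // one common convention: a primitive's valence is its value, and a
        // composite's valence is the sum of the valences of its parts
        if (preInteraction == null) return value;
        return preInteraction.getValence() + postInteraction.getValence();
    }
}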
Decision mechanism
List<Proposition> propositions = new ArrayList<Proposition>();
for (Interaction activatedInteraction : getActivatedInteractions()) {
    // each activated composite interaction proposes the experience of its
    // post-interaction, with proclivity weight * valence
    Proposition proposition = new Proposition(
        activatedInteraction.getPostInteraction().getExperience(),
        activatedInteraction.getWeight() * activatedInteraction.getPostInteraction().getValence());
    int index = propositions.indexOf(proposition);
    if (index < 0)
        propositions.add(proposition);
    else // the same experience was already proposed: sum the proclivities
        propositions.get(index).addProclivity(activatedInteraction.getWeight()
            * activatedInteraction.getPostInteraction().getValence());
}
Collections.sort(propositions); // decreasing proclivity (see compareTo below)
if (propositions.size() > 0)
    proposedExperience = propositions.get(0).getExperience();
Class Proposition
class Proposition implements Comparable<Proposition> {
    // attributes:
    Experience experience;
    int proclivity;
    // constructor:
    Proposition(Experience experience, int proclivity) {
        this.experience = experience;
        this.proclivity = proclivity;
    }
    // methods:
    public int compareTo(Proposition proposition) {
        // sort by decreasing proclivity
        return new Integer(proposition.getProclivity()).compareTo(proclivity);
    }
    public boolean equals(Object otherProposition) {
        // two propositions are equal if they propose the same experience
        return ((Proposition) otherProposition).getExperience() == this.experience;
    }
    void addProclivity(int proclivity) {
        this.proclivity += proclivity;
    }
    Experience getExperience() { return experience; }
    int getProclivity() { return proclivity; }
}