SB-CoRLA: Schema-Based Constructivist Robot Learning Architecture

SB-CoRLA: Schema-Based Constructivist Robot Learning Architecture
Yifan Tang
DiLab

Agenda
• Introduction: background, research objectives, and motivation
• Related work
• Approach
• Simulations and results
• Conclusion and future work
[Figure: SB-CoRLA overview, showing offline evolutionary learning (EL), the general and specific SCS repositories, and online solution searching (ECA)]


Goal of my research
• To develop an architecture that enables continuous robot learning
    • To extend the existing ASyMTRe architecture to enable constructivist robot learning
    • A method to learn new knowledge and skills based upon past experience
• To explore the robot team solution search problem in a different way
    • The task allocation problem is an NP-hard search problem

Why is robot learning beneficial?
• Tasks that have been solved before, or new tasks that need the same set of skills
    • When the designer does not want to reinvent the wheel
• Development of new behaviors
    • When the robot adapts to the environment through interaction
• Unknown environment
    • When human expertise is not sufficient
• Biological inspiration
    • Learning is widely observed in the biological world
    • Learning reduces genetic material

My dissertation builds upon past research on software reconfiguration (ASyMTRe)
• ASyMTRe: Automated Synthesis of Multi-robot Task solutions through software Reconfiguration [Parker, F. Tang, 2006]
• Inspired by the theory of information invariants [Donald 1994, 1997] and schema theory [Lyons and Arbib 1989, 2003]
• Automatically connects schemas by matching information types to generate a task solution
• Enables the robots to share sensory, computational, and effector capabilities

Example of reconfiguring interconnections of schemas; team task: go to goal
• Case 1: R1: laser localization; R2: no environmental sensors
    • Solution: R1 uses its laser to calculate its own global position and to find the relative position of R2. R1 communicates the global position of R2 to R2. R1 goes to the goal on its own. R2 goes to the goal based on assistance from R1.
• Case 2: R1: GPS; R2: camera
    • Solution: R1 uses GPS to localize and communicates its own position to R2. R2 uses its camera to determine its location relative to R1, then calculates its own global position. R1 goes to the goal on its own. R2 goes to the goal based on assistance from R1, plus its own relative positioning calculation.
[Figure: schema connection diagrams for R1 and R2 in each case (laser, GPS, camera; schemas cs1, cs2, ps1, ps3, ps4, ms1)]
L. E. Parker and F. Tang, "Building Multi-Robot Coalitions through Automated Task Solution Synthesis," Proceedings of the IEEE, special issue on Multi-Robot Systems, vol. 94, no. 7, 2006, pp. 1289-1305.

Formal problem definition for ASyMTRe
• Set of n robots, R = {R1, R2, …, Rn}
• Set of Information Types, F = {f1, f2, …}
• Environmental Sensors, ES = {es1, es2, …}
    • Input: physical sensor signal
    • Output:
• Perceptual Schemas, PS = {ps1, ps2, …}
    • Input:
    • Output:
• Communication Schemas, CS = {cs1, cs2}
    • Input:
    • Output:
• Motor Schemas, MS = {ms1, ms2, …}
    • Input:
    • Output:
[Figure: schema connections (ES, PS, CS, MS) within robot i]

Example of reconfiguring interconnections of schemas; team task: go to goal
• Case 1: Fully Capable Robots
    • Robot 1 -> Laser Scanner + Map
    • Robot 2 -> Laser Scanner + Map
• Case 2: Communicate Own Current Position
    • Robot 1 (Helper) -> Laser Scanner + Map
    • Robot 2 (Needy) -> Camera
• Case 3: Communicate Other's Current Position
    • Robot 1 (Helper) -> Laser Scanner + Map and Camera
    • Robot 2 (Needy) -> nil
L. E. Parker, Chandra, and Tang, "Enabling Autonomous Sensor-Sharing for Tightly-Coupled Cooperative Tasks," 3rd NRL International Workshop on Multi-Robot Systems, March 2005.
Chandra, "Software Reconfigurability for Heterogeneous Robot Cooperation," UTK M.S. thesis, Spring 2004.

Case 1: fully capable robots
• Robot 1 -> Laser Scanner + Map
• Robot 2 -> Laser Scanner + Map
[Figure: Red (Laser) and Blue (Laser) robots; labels: Own Pos, Own Pos]

ASyMTRe-derived schema configurations for case 1
[Figure: schema configurations for the Blue and Red robots, drawn over ES1, ES2, Map, PS1-PS5, CS1, CS2, and MS]

Case 2: helper robot communicates own global position
• Robot 1 (Helper) -> Laser Scanner + Map
• Robot 2 (Needy) -> Camera
[Figure: Helper (Laser) and Needy (Camera) robots; labels: Rel Pos + Own Pos, Own Pos]

ASyMTRe-derived schema configurations for case 2
[Figure: schema configurations for the Helper (laser, ES1) and the Needy robot (camera, ES2), drawn over ES1, ES2, Map, PS1-PS5, CS1, CS2, and MS]

Case 3: helper robot communicates other robot's global position
• Robot 1 (Helper) -> Laser Scanner + Map and Camera
• Robot 2 (Needy) -> nil
[Figure: Helper (Camera, Laser) and Needy robots; labels: Rel Pos + Own Pos, Own Pos]

ASyMTRe-derived schema configurations for case 3
[Figure: schema configurations for the Helper and Needy robots, drawn over ES1, ES2, Map, PS1-PS5, CS1, CS2, and MS]

Research objectives and inspiration
• Research objective: to extend ASyMTRe to enable constructivist learning
    • To learn collections of schemas ("chunks", or "SCS") constructively, in order to store knowledge from previous search processes and to improve the efficiency of future searches
• Inspiration: Piaget's child development theory
    • Assimilation: Reorganize existing knowledge and skills to reflect novelties in the environment
    • Accommodation: Modify existing knowledge and skills to adjust to novelties in the environment

Key contributions
• Facilitates constructivist learning
    • Includes both assimilation and accommodation in the new SB-CoRLA architecture
    • Learns schema chunks
• Enables more efficient searches
    • Re-uses schema chunks
    • Finds online team solutions more quickly, rather than searching exhaustively over all possible solutions first
[Figure: a first-level chunk and a second-level chunk]

Related Work
• Schema Theory
    • Lyons and Arbib (1989 and 2003)
    • Arkin (1998)
• Information Invariants
    • Donald et al. (1994 and 1997)
• Constructivist Learning
    • Drescher (1991)
    • Chaput (2004)
• ASyMTRe
    • F. Tang and Parker (2005 and 2006)

Recall the basic components in ASyMTRe
• Schema
    • Represents a basic robot capability
    • Categorized as perceptual schema (PS), motor schema (MS), or communication schema (CS)
• Information type (i.e., semantic content)
    • Each schema requires and produces information types
    • Inputs and outputs of schemas can be connected if their information types match
• ASyMTRe automatically connects the schemas to generate a task solution
[Figure: example schema connections using GPS and camera (cs1, cs2, ps1, ps4, ms1)]
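The matching rule above can be sketched in a few lines of Python. This is a minimal illustration, not the ASyMTRe implementation: the `Schema` class, the `connect_all` helper, and every schema and information-type name below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Hypothetical schema record: a name plus required/produced info types."""
    name: str
    inputs: frozenset   # information types the schema requires
    outputs: frozenset  # information types the schema produces


def connectable(producer, consumer):
    """Information types that producer outputs and consumer needs."""
    return set(producer.outputs & consumer.inputs)


def connect_all(schemas):
    """List every (producer, consumer, info_type) link among the schemas."""
    links = []
    for src in schemas:
        for dst in schemas:
            if src is dst:
                continue
            for info in sorted(connectable(src, dst)):
                links.append((src.name, dst.name, info))
    return links


# Assumed example: a GPS perceptual schema feeding a motor schema and a
# communication schema. All names are illustrative only.
gps = Schema("ps_gps", frozenset({"gps_signal"}),
             frozenset({"own_global_position"}))
goto = Schema("ms_goto", frozenset({"own_global_position"}),
              frozenset({"motor_command"}))
send = Schema("cs_send", frozenset({"own_global_position"}),
              frozenset({"message"}))

links = connect_all([gps, goto, send])
for link in links:
    print(link)
```

Here a link exists only where an output type of one schema equals an input type of another, which is the sense in which information types "match" on the slide.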

The special terms in SB-CoRLA
• Sensori-Computational System (SCS)
    • SCS = "Chunk"
    • First-level chunk
    • Second-level chunk
    • Higher-level chunk
• SCS repository
• CA: Centralized ASyMTRe
• RA: Randomized ASyMTRe
• EL: Evolutionary Learning
• ECA: Extended Centralized ASyMTRe
[Figure: first-level chunk example]

The SB-CoRLA architecture
[Figure: SB-CoRLA architecture. Components: Task definition, Parsing, Offline Evolutionary Learning (EL), Harvesting, Evaluation, General SCS Repository, Specific SCS Repository, Online Solution Searching (ECA), Online Goal-Directed Feedback-Based Learning, Robot Team. Processes: Assimilation/Chunking, Accommodation, Feedback.]

Assimilation in SB-CoRLA
[Figure: SB-CoRLA architecture diagram, assimilation path: Task definition, Parsing, Offline Evolutionary Learning (EL), Harvesting, Evaluation, General and Specific SCS Repositories, Online Solution Searching (ECA), Robot Team]

Learning chunks (off-line)
[Figure: SB-CoRLA architecture diagram, assimilation path]

Saving chunks (off-line)
[Figure: SB-CoRLA architecture diagram, assimilation path]

Using chunks (online)
[Figure: SB-CoRLA architecture diagram, assimilation path]

Find chunks from an EL solution
[Figure: an EL solution with First-level Chunk 1 marked]

Find chunks from an EL solution
[Figure: an EL solution with First-level Chunk 2 marked]

Combine chunks
[Figure: First-level Chunk 1 and First-level Chunk 2 combined into a Second-level Chunk]

Learning chunks (off-line)
[Figure: SB-CoRLA architecture diagram, assimilation path]

Evolutionary Learning
[Figure: SB-CoRLA architecture diagram, assimilation path]

What are the reasons for choosing evolutionary learning for schema chunking?
• The current ASyMTRe search algorithm does not include learning
    • Does not produce schema chunks for constructivist learning
    • Has difficulty discovering certain large-team solutions because its heuristics search for small-team solutions first
    • Regenerates each solution from the beginning
• Evolutionary learning enables constructivist learning
    • Solutions evolve with increasing fitness values
    • Learns highly fit schema chunks
    • Reuses schema chunks in new task assignments

Approach for ensuring solution quality: compare three search algorithms
• Centralized ASyMTRe search algorithm (CA), previously developed
    • A two-step, anytime algorithm
    • Greedy search that prefers small team sizes and lower-cost solutions for individual robots
• Randomized ASyMTRe search algorithm (RA), new
    • Similar to CA
    • Randomized search
• Evolutionary Learning search algorithm (EL), new
    • Uses a genetic algorithm to evolve populations of team solutions

Centralized ASyMTRe search algorithm (CA)
for each robot team of n robots, with up to m available schemas per robot and up to p inputs per schema:
    generate a list of k potential solutions (O(nmp))
    sort the potential solutions in ascending order of cost (O(k log k)), where cost = w_c * (c / c_max) + w_p * (1 - p)
    sort the robots in ascending order of available schemas (O(n log n))
end for
for each robot team sequence (O(n!)):
    for each robot in the sequence (O(n)):
        attempt to assign a potential solution to this robot (O(q))
        if the robot cannot do the task by itself:
            attempt to find another robot that can provide help (O(nq))
    if all robots can do the task and the cost of the solution is lower than existing solutions:
        record this solution
end for
[Figure: robot team R1, R2, R3, …, Rn and potential solutions L1, L2, L3, …, Lk, assigned in the order 1, 2, 3, 4]
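The cost-based sorting step of CA can be illustrated with a short Python sketch. The slide does not define its symbols, so this assumes that c is a raw solution cost, c_max the largest cost among the candidates, and p a success probability in the formula's (1 - p) term. The weights and the three candidate solutions are made-up values.

```python
def solution_cost(c, c_max, p, w_c=0.5, w_p=0.5):
    """cost = w_c * (c / c_max) + w_p * (1 - p), as on the CA slide.

    Assumption: c is a raw cost and p a success probability; the slide
    itself does not define them. The default weights are arbitrary.
    """
    return w_c * (c / c_max) + w_p * (1.0 - p)


# Hypothetical potential solutions L1..L3 with made-up cost and probability.
candidates = [
    {"name": "L1", "c": 4.0, "p": 0.90},
    {"name": "L2", "c": 1.0, "p": 0.60},
    {"name": "L3", "c": 2.0, "p": 0.95},
]

c_max = max(s["c"] for s in candidates)

# Ascending order of cost, so the cheapest candidate is tried first.
ranked = sorted(candidates,
                key=lambda s: solution_cost(s["c"], c_max, s["p"]))
order = [s["name"] for s in ranked]
print(order)
```

With these numbers the costs are 0.55 (L1), 0.325 (L2), and 0.275 (L3), so a greedy pass would try L3 first.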

Randomized ASyMTRe search algorithm (RA)
for each robot team of n robots, with up to m available schemas per robot and up to p inputs per schema:
    generate a list of k potential solutions (O(nmp))
end for
for each robot team sequence (O(n!)):
    for each robot in the sequence (O(n)):
        attempt to assign a random potential solution to this robot (O(q))
        if the robot cannot do the task by itself:
            attempt to find another random robot that can provide help (O(nq))
    if all robots can do the task and the cost of the solution is lower than existing solutions:
        record this solution
end for
[Figure: robot team R1, R2, R3, …, Rn and potential solutions L1, L2, L3, …, Lk, assigned in the order 2, 4, 3, 1]

Evolutionary Learning search algorithm (EL)
for each robot team of n robots, with up to m available schemas per robot:
    initialize the first population of size p by randomly connecting the schemas via matching information types [O((n m^2 + n^2) p)]
    evaluate each of the p individual solutions in the population
    calculate fitness F = w_c * (1 - c/c_max) + w_x * (1 - x/x_max) + w_q * (q/q_max) + w_u * (u/n) [O(n^2 m^2)]
    for g_max generations, perform:
        fitness-proportionate selection or tournament selection [O(p)]
        pairwise single-point crossover at crossover rate γ [O(n^2 m^2 p)]
        single-point mutation at mutation rate δ [O(nmp)]
        prune solutions and calculate their fitness values [O(n^2 m^2 p)]
        record the solution with the best fitness value
        stop if there is no fitness improvement for a pre-defined number of generations
end for
[Figure: crossover of Solution1 and Solution2 (schemas S1-S7) producing Solution1' and Solution2']
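The fitness formula and the fitness-proportionate selection step can be sketched in Python as follows. The slide does not say what c, x, q, and u measure, so the sketch treats them as opaque numeric measurements with their normalizers, takes n as the number of robots, and uses equal weights as an arbitrary assumption. The function names are hypothetical.

```python
import random


def fitness(c, x, q, u, n, c_max, x_max, q_max,
            w=(0.25, 0.25, 0.25, 0.25)):
    """F = w_c(1 - c/c_max) + w_x(1 - x/x_max) + w_q(q/q_max) + w_u(u/n).

    The meaning of c, x, q, and u is not given on the slide; they are
    passed through as plain numbers. Equal weights are an assumption.
    """
    w_c, w_x, w_q, w_u = w
    return (w_c * (1 - c / c_max) + w_x * (1 - x / x_max)
            + w_q * (q / q_max) + w_u * (u / n))


def roulette_select(population, scores, rng):
    """Fitness-proportionate selection: pick one individual with
    probability proportional to its (non-negative) fitness score."""
    return rng.choices(population, weights=scores, k=1)[0]


example = fitness(c=2, x=1, q=3, u=2, n=2, c_max=4, x_max=2, q_max=4)
print(example)

rng = random.Random(0)
# With all weight on one individual, selection is deterministic.
picked = roulette_select(["a", "b", "c"], [0.0, 1.0, 0.0], rng)
print(picked)
```

Terms with a "1 - ratio" form reward small values, while the q and u terms reward large ones, so every term lies between 0 and 1 when the inputs stay within their normalizers.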

EL: the graph
• A graph, in adjacency list format, is used to represent an individual team task solution.
• Each generation in the EL process contains a number of individual team task solutions, i.e., graphs.
• Each graph has as many nodes as the number of schemas in the robot team that is currently assigned a task.
• The edges among the graph nodes represent schemas connected with each other via matching information types, and therefore indicate information flows.
[Figure: example solution graph from ES to Goal]
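A minimal adjacency-list sketch of such a solution graph is shown below, in Python. The `SolutionGraph` class, its methods, and the node and information-type names are hypothetical; the source only states that nodes are schemas and edges are information flows between matching types.

```python
class SolutionGraph:
    """Hypothetical adjacency-list graph: one node per schema, one directed
    edge per information flow between schemas."""

    def __init__(self, schema_names):
        # Each node maps to a list of (destination, information type) pairs.
        self.adj = {name: [] for name in schema_names}

    def connect(self, src, dst, info_type):
        """Add a directed edge carrying info_type from src to dst."""
        self.adj[src].append((dst, info_type))

    def reaches(self, start, goal):
        """Depth-first search: does information flow from start to goal?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(dst for dst, _ in self.adj[node])
        return False


# Assumed example chain: sensor -> perceptual schema -> motor schema.
g = SolutionGraph(["es1", "ps1", "ms1", "cs1"])
g.connect("es1", "ps1", "laser_scan")
g.connect("ps1", "ms1", "own_global_position")

print(g.reaches("es1", "ms1"))
print(g.reaches("ms1", "es1"))
```

A reachability check like `reaches` is one plausible way to test whether a sensor's output actually feeds a motor schema; the source does not describe how its evaluation step does this.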

EL: the process
• Initialization
• Evaluation
• Selection
• Crossover
• Mutation
• Pruning
[Figure: example solution graph from ES to Goal]
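The stages listed above can be sketched as a generic genetic-algorithm loop in Python. This is a deliberate simplification, not the EL implementation: individuals here are bit lists rather than schema graphs, selection is a size-2 tournament, and the problem-specific pruning stage is omitted. The `evolve` function and all of its parameter defaults are assumptions.

```python
import random


def evolve(fitness, genome_len, pop_size=20, g_max=50,
           gamma=0.8, delta=0.05, patience=10, seed=0):
    """Toy GA over bit lists (genome_len >= 2). gamma is the crossover
    rate and delta the mutation rate, following the EL slide's symbols."""
    rng = random.Random(seed)
    # Initialization: random bit strings stand in for random schema graphs.
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    best, best_fit, stale = None, float("-inf"), 0
    for _ in range(g_max):
        # Evaluation: score every individual and record the best so far.
        scores = [fitness(g) for g in pop]
        top = max(range(pop_size), key=lambda i: scores[i])
        if scores[top] > best_fit:
            best, best_fit, stale = pop[top][:], scores[top], 0
        else:
            stale += 1
            if stale >= patience:
                break  # no improvement for `patience` generations

        def pick():
            # Selection: tournament between two distinct individuals.
            a, b = rng.sample(range(pop_size), 2)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        children = []
        while len(children) < pop_size:
            p1, p2 = pick()[:], pick()[:]
            # Crossover: single cut point, applied with probability gamma.
            if rng.random() < gamma:
                cut = rng.randint(1, genome_len - 1)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # Mutation: flip one gene, applied with probability delta.
            for child in (p1, p2):
                if rng.random() < delta:
                    i = rng.randrange(genome_len)
                    child[i] = 1 - child[i]
                children.append(child)
        pop = children[:pop_size]
    return best, best_fit


# Toy objective: count of ones in the genome.
best, best_fit = evolve(sum, genome_len=8)
print(best, best_fit)
```

In the real architecture a pruning step would follow mutation and fitness would come from the team-solution formula; both are left out here to keep the loop structure visible.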
