This study aims to present a solution for the learning of non-linear path tracking controllers for quad-rotors using Reinforcement Learning. It also proposes a method for coordinating the movements of a quad-rotor team to build multiple truss-like structures cooperatively.
Learning of Coordination of a Quad-Rotors Team for the Construction of Multiple Structures. Sérgio Ronaldo Barros dos Santos. Supervisor: Cairo Lúcio Nascimento Júnior.
Objective • Present a solution for learning non-linear path tracking controllers for a quad-rotor using Reinforcement Learning. • Propose a method for learning to coordinate the movements of a quad-rotor team so that it builds multiple truss-like structures cooperatively.
Overview • The system is composed of: • A team of three quad-rotors; • Each quad-rotor has a decentralized low-level controller and a collision avoidance algorithm implemented on-board. • A set of parts (beams and columns) for truss-like structures, equipped with magnetic nodes.
Overview • A task planner based on Reinforcement Learning that coordinates the team of quad-rotors while they build a target structure. • A vision system that provides real-time pose estimates of the quad-rotors, the parts, and the assembly points.
First Stage: Low Level Controllers • Implement a control algorithm based on Reinforcement Learning that adapts well to different flight conditions, so that the aircraft can track a defined trajectory. • The training phase takes into account: • Different trajectories; • The transport of loads; • The presence of wind effects.
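To make the idea of learning a tracking controller under wind disturbances concrete, here is a minimal sketch using tabular Q-learning on a toy one-dimensional cross-track error. The slides do not say which Reinforcement Learning algorithm was used, so the algorithm choice, the discretized error grid, the gust model, the reward, and all names (`step`, `train`, `greedy_policy`) are assumptions made purely for illustration. This is not the controller from the study, which was trained on the non-linear X-Plane model.

```python
import random

# Hypothetical toy problem: discretized cross-track error and corrective commands.
ERRORS = range(-3, 4)      # error relative to the reference path (arbitrary units)
ACTIONS = (-1, 0, 1)       # shift the error down, hold, shift the error up


def step(error, action, rng, wind_prob=0.2):
    """Toy dynamics: the action shifts the error; a random gust may push it."""
    gust = rng.choice((-1, 1)) if rng.random() < wind_prob else 0
    nxt = max(-3, min(3, error + action + gust))
    return nxt, -abs(nxt)  # reward penalizes distance from the path


def train(episodes=5000, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(e, a): 0.0 for e in ERRORS for a in ACTIONS}
    for _ in range(episodes):
        e = rng.choice(list(ERRORS))          # random initial tracking error
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)       # explore
            else:
                a = max(ACTIONS, key=lambda x: q[(e, x)])  # exploit
            nxt, r = step(e, a, rng)
            target = r + gamma * max(q[(nxt, x)] for x in ACTIONS)
            q[(e, a)] += alpha * (target - q[(e, a)])
            e = nxt
    return q


def greedy_policy(q):
    """Pick the highest-valued action for each discretized error."""
    return {e: max(ACTIONS, key=lambda a: q[(e, a)]) for e in ERRORS}


policy = greedy_policy(train())
```

With these settings the greedy policy is expected to command a correction toward the path at the largest errors. A real tracking controller would need a continuous state (position, attitude, velocities) and function approximation rather than a small table.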
First Stage: Low Level Controllers • The controllers are derived off-line using the non-linear X-Plane model and the learning algorithm. • The controllers are then ported to an actual aircraft using the Real Time Workshop and Quarc Design. • The experimental results were submitted to the 2012 IEEE International Conference on Systems, Man, and Cybernetics.
Second Stage: Task Planner • Learn a set of optimal actions for each quad-rotor, such that the target structure can be built cooperatively. • The assembly plan will be learned off-line in a simulator. • The obtained solutions will be validated experimentally. • The learned assembly plan should place parts without resulting in deadlock conditions. • The structure must be dynamically stable during the assembly.
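One simple way to picture the ordering requirement on an assembly plan is a precedence check: a part may be placed only after the parts it rests on. The sketch below treats stability as nothing more than this support relation, which is a strong simplification and an assumption of this example. The part names and the `supports` mapping are hypothetical and do not come from the slides.

```python
def is_feasible_order(order, supports):
    """Return True if every part is placed after all parts it rests on.

    order    -- sequence of part names in placement order
    supports -- dict mapping a part to the set of parts that must already
                be in place (parts missing from the dict need no support)
    """
    placed = set()
    for part in order:
        if not supports.get(part, set()) <= placed:
            return False       # a supporting part is still missing
        placed.add(part)
    return True


# Hypothetical two-bay truss: each beam rests on two columns.
supports = {
    "beam_1": {"col_1", "col_2"},
    "beam_2": {"col_2", "col_3"},
}

good_plan = ["col_1", "col_2", "beam_1", "col_3", "beam_2"]
bad_plan = ["col_1", "beam_1", "col_2", "col_3", "beam_2"]
```

A learned plan could be screened with such a check in the simulator before experimental validation. Deadlocks caused by vehicles blocking each other would need a separate check.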
Second Stage: Task Planner • To avoid collisions, the quad-rotors must keep a minimum distance from one another while executing their maneuvers. • The system must allow several quad-rotors to pick up parts simultaneously from the supply bins.
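The minimum distance condition can be sketched as a pairwise check over the vehicle positions. The positions and the threshold `d_min` below are illustrative values, not parameters from the study, and the function names are hypothetical.

```python
import itertools
import math


def min_pairwise_distance(positions):
    """Smallest Euclidean distance between any two vehicles."""
    return min(math.dist(p, q) for p, q in itertools.combinations(positions, 2))


def satisfies_separation(positions, d_min):
    """True if every pair of quad-rotors is at least d_min apart."""
    return min_pairwise_distance(positions) >= d_min


# Hypothetical (x, y, z) positions in metres for a team of three quad-rotors.
team = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (0.0, 2.0, 1.2)]
ok = satisfies_separation(team, d_min=1.0)
```

In a planner, a check of this kind could be evaluated at every step of a simulated maneuver, with violations rejected or penalized.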
Second Stage: Task Planner • Possible optimization criteria: • Minimize the total cost of construction: - Minimize the construction time of the structure, taking into account the limitations of the quad-rotor actuators; - Maximize the operation time of the quad-rotors by minimizing energy consumption.
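One common way to combine such criteria is a weighted sum of construction time and energy use. The slides do not state how the criteria are combined, so the weighted-sum form, the weights, and the plan figures below are assumptions for illustration only. Actuator limitations are not modeled here.

```python
def construction_cost(time_s, energy_j, w_time=1.0, w_energy=0.01):
    """Weighted sum of build time and energy use (weights are hypothetical)."""
    return w_time * time_s + w_energy * energy_j


def best_plan(plans, **weights):
    """Return the candidate plan with the lowest weighted cost."""
    return min(
        plans,
        key=lambda p: construction_cost(p["time_s"], p["energy_j"], **weights),
    )


# Made-up candidate plans: A is faster, B uses less energy.
plans = [
    {"name": "A", "time_s": 300.0, "energy_j": 9000.0},
    {"name": "B", "time_s": 360.0, "energy_j": 5000.0},
]

fastest_biased = best_plan(plans)                 # default weights favour time
energy_biased = best_plan(plans, w_energy=0.05)   # heavier energy weight
```

Changing the weights shifts the preferred plan between the faster and the more energy-efficient candidate, which is the trade-off the two sub-criteria describe.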
Thank You!