Design of an adaptive robot controller for a predator-prey task using e-puck robots

Design of an adaptive robot controller for a predator-prey task using e-puck robots MEng Project

The Goal To design an adaptive robot controller capable of performing a predator-prey task in a reconfigurable maze

Project Specifics • Hardware: camera; IR sensors; maze; obstacles • Software: programming in C; Player/Stage

Project Breakdown • The project can be broken down into 4 separate problems: • Obstacle Avoidance – how to prevent a robot from colliding with obstacles in the maze; • Object Identification – how the robots can identify each other and some additional objects; • Predator-Prey Task – how to make a predator chase a prey and a prey escape from a predator; • Evolution – how to evolve the controllers so that their performance increases. Evolution can also give a controller some adaptability.

Behaviour-based Architectures Traditional BB

Obstacle Avoidance Infra-red sensors

Obstacle Avoidance Rules included

Exploring

Exploring • Monitor the turn rate of the robot over a given period of time; • If the variance of the turn rate is below a defined threshold, the robot must perform a turn; • Otherwise, it maintains its course.
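The exploring rule can be sketched in C as follows. This is a minimal illustration, not the project's actual code: the function names, the use of the population variance, and the idea of passing the recent turn-rate samples as an array are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Population variance of n turn-rate samples.
 * Hypothetical helper; returns 0 when there are no samples. */
double turnrate_variance(const double *samples, size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    double mean = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean += samples[i];
    }
    mean /= (double)n;

    double var = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = samples[i] - mean;
        var += d * d;
    }
    return var / (double)n;
}

/* True when the robot has been steering too uniformly over the
 * monitoring window and should therefore perform a turn. */
bool should_force_turn(const double *samples, size_t n, double threshold)
{
    return turnrate_variance(samples, n) < threshold;
}
```

A robot that has driven with an almost constant turn rate yields a variance near zero and is told to turn; a robot that has been weaving yields a large variance and keeps its course.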

Exploring • Bigger chance of sweeping the whole maze; • Increased number of different perspectives.

Object Identification • First solution: • Predator should identify yellow objects as its food; • Prey should identify green objects as its food and yellow objects as predators. • A problem arises: how to isolate an object of a certain colour in an image? • Image subtraction was the first solution implemented.

Object Identification Which images should be subtracted?

Object Identification Red Channel Blue Channel Green Channel

Object Identification    Red - Blue Blue - Green Green - Blue    Red - Green Blue - Red Green - Red

Object Identification Segmenting the chosen images using the Otsu Method • Due to poor results, green is ruled out as a possible colour for the prey’s food. Blue and red are the new candidates.
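For reference, the Otsu Method picks the grey level that maximises the between-class variance of the image histogram. The sketch below is a generic C version for an 8-bit single-channel image (for example, one of the subtracted images); the function name and the flat pixel buffer are assumptions, not the project's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Otsu threshold for an 8-bit image stored as a flat buffer.
 * Pixels <= the returned value form one class, pixels above it the other. */
int otsu_threshold(const uint8_t *pixels, size_t n)
{
    unsigned long hist[256] = {0};
    for (size_t i = 0; i < n; i++) {
        hist[pixels[i]]++;
    }

    double total_sum = 0.0;
    for (int v = 0; v < 256; v++) {
        total_sum += (double)v * (double)hist[v];
    }

    double weight_bg = 0.0; /* number of pixels at or below t */
    double sum_bg = 0.0;    /* sum of grey levels at or below t */
    double best_var = -1.0;
    int best_t = 0;

    for (int t = 0; t < 256; t++) {
        weight_bg += (double)hist[t];
        if (weight_bg == 0.0) {
            continue;       /* no pixels in the lower class yet */
        }
        double weight_fg = (double)n - weight_bg;
        if (weight_fg == 0.0) {
            break;          /* every pixel is already in the lower class */
        }
        sum_bg += (double)t * (double)hist[t];

        double mean_bg = sum_bg / weight_bg;
        double mean_fg = (total_sum - sum_bg) / weight_fg;
        double diff = mean_bg - mean_fg;
        double between = weight_bg * weight_fg * diff * diff;

        if (between > best_var) {
            best_var = between;
            best_t = t;
        }
    }
    return best_t;
}
```

Note that the method always returns some threshold, even for an image that contains only one class of pixels.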

Object Identification Problem found: when no coloured objects are present in the image, the subtraction and Otsu Method give false results.

Object Detection New solution: include an mbed board with 4 bright blue LEDs.

Object Identification A new fixed threshold of 205, determined experimentally, is established. • New feature: food is available for 3 minutes and unavailable for 20 seconds, cyclically.

Object Identification Red Channel Blue Channel Green Channel

Object Identification Looking at the images from all 3 channels, one can see that the yellow object is very dark in the blue channel but bright in both the red and green channels. Each pixel is therefore processed individually: • If both the red and green components of a pixel are 40% larger than the blue component, then that pixel is considered to be yellow.
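The per-pixel yellow rule could look like the C sketch below. Assumptions, not taken from the project: the function names, the interleaved 8-bit RGB buffer layout, and reading "40% larger" as "exceeds the blue component by more than 40%" (a strict comparison, which also keeps pure black pixels from matching).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* r > 1.4 * b and g > 1.4 * b, written with integers as
 * 10 * r > 14 * b to avoid floating point on the robot. */
bool is_yellow(uint8_t r, uint8_t g, uint8_t b)
{
    return 10u * r > 14u * b && 10u * g > 14u * b;
}

/* Build a binary mask (1 = yellow, 0 = other) from an interleaved
 * RGB image of n_pixels pixels. Assumed layout: R, G, B, R, G, B, ... */
void threshold_yellow(const uint8_t *rgb, size_t n_pixels, uint8_t *mask)
{
    for (size_t i = 0; i < n_pixels; i++) {
        const uint8_t *p = &rgb[3 * i];
        mask[i] = is_yellow(p[0], p[1], p[2]) ? 1 : 0;
    }
}
```

Because the rule compares channels against each other rather than against a fixed level, a white or grey pixel (all three channels similar) is rejected regardless of its brightness.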

Object Identification Results of the yellow object identification

Object Identification How to retrieve information from the thresholded images?

Object Identification • A scope is then defined for both food searching and predator avoidance. • Food scope – if the target’s centre of mass is within the scope, no turning occurs. This gives the controller additional stability. • Avoid scope – if the centre of mass of the identified object is outside the scope, no turning takes place. If it is within the scope, the robot must turn to escape. It is equivalent to the field of view in the natural world.
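The centre-of-mass and scope logic can be sketched in C as below. Everything named here is hypothetical: the sketch assumes a binary mask image, a scope modelled as a band of a given pixel width centred on the image, and that the left side of the image corresponds to the robot's left.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { TURN_NONE, TURN_LEFT, TURN_RIGHT } turn_t;

/* Horizontal centre of mass of the set pixels in a binary mask.
 * Returns false (and leaves *cx untouched) when the mask is empty. */
bool mask_centroid_x(const uint8_t *mask, int width, int height, double *cx)
{
    unsigned long count = 0;
    unsigned long sum_x = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (mask[y * width + x]) {
                count++;
                sum_x += (unsigned long)x;
            }
        }
    }
    if (count == 0) {
        return false;
    }
    *cx = (double)sum_x / (double)count;
    return true;
}

/* True when cx lies inside a band of scope_width pixels centred on the image. */
bool in_scope(double cx, int image_width, int scope_width)
{
    double centre = (image_width - 1) / 2.0;
    double offset = cx < centre ? centre - cx : cx - centre;
    return offset <= scope_width / 2.0;
}

/* Food scope: no turning while the target is inside the scope,
 * otherwise turn towards it. */
turn_t food_turn(double cx, int image_width, int scope_width)
{
    if (in_scope(cx, image_width, scope_width)) {
        return TURN_NONE;
    }
    return cx < (image_width - 1) / 2.0 ? TURN_LEFT : TURN_RIGHT;
}

/* Avoid scope: ignore objects outside the scope, turn away from those inside. */
turn_t avoid_turn(double cx, int image_width, int scope_width)
{
    if (!in_scope(cx, image_width, scope_width)) {
        return TURN_NONE;
    }
    return cx < (image_width - 1) / 2.0 ? TURN_RIGHT : TURN_LEFT;
}
```

The two scopes use the same geometric test with opposite consequences: inside the food scope means "hold course", inside the avoid scope means "turn away".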

Predator-Prey Task • The main premises of the task are: • The predator must chase the prey, and get as close to it as possible; • The prey must try to escape from the predator; • The prey, like the predator, also feeds. In its case, the food is the blue light source.

Predator-Prey Task • An energy level was included in each controller to reflect the events of feeding and dying. Therefore, new specifications are made: • Each robot starts off with a random energy level between 80 and 100%; • If the robot’s energy falls below a defined Hungry Level, the robot becomes hungry and starts to look for food. The predator begins searching for the prey, and the prey begins searching for the blue light source; • When a prey is caught by the predator, its energy level becomes 0% (death); • When a predator catches a prey, it eats until it is no longer hungry; • A prey, when eating, should take approximately 3 seconds to gain 20 percentage points of energy. It should eat until it is no longer hungry; • To be allowed to “eat”, the robot must be no more than 5 cm from the food source.
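A minimal C sketch of the energy rules above follows. The struct, the function names, the linear eating rate, and the 100% cap are assumptions made for illustration; how energy is spent over time is not specified here and is left out.

```c
#include <stdbool.h>
#include <stdlib.h>

#define EAT_RATE_PER_S      (20.0 / 3.0) /* 20 percentage points in about 3 s */
#define MAX_EAT_DISTANCE_CM 5.0
#define MAX_ENERGY          100.0        /* assumed upper bound */

typedef struct {
    double energy;       /* current energy, in percent */
    double hungry_level; /* below this the robot looks for food */
} robot_energy_t;

/* Random starting energy between 80 and 100 percent. */
double initial_energy(void)
{
    return 80.0 + 20.0 * ((double)rand() / (double)RAND_MAX);
}

bool is_hungry(const robot_energy_t *r)
{
    return r->energy < r->hungry_level;
}

/* One eating step of dt_s seconds. Returns false, changing nothing,
 * when the food is too far away or the robot is not hungry. */
bool eat_step(robot_energy_t *r, double distance_cm, double dt_s)
{
    if (distance_cm > MAX_EAT_DISTANCE_CM || !is_hungry(r)) {
        return false;
    }
    r->energy += EAT_RATE_PER_S * dt_s;
    if (r->energy > MAX_ENERGY) {
        r->energy = MAX_ENERGY;
    }
    return true;
}

/* A prey caught by the predator dies: its energy drops to 0%. */
void kill_robot(robot_energy_t *r)
{
    r->energy = 0.0;
}
```

Calling `eat_step` repeatedly from the control loop reproduces "eat until no longer hungry", since the function stops adding energy once the Hungry Level is reached.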

Predator-Prey Task • Additional features included: • Randomness in rebirth – when a robot dies, it is reborn after a few seconds. There is a 51% chance it will be reborn with the same role it had before, and a 49% chance it will be reborn with the other role (i.e., predator becomes prey, and vice versa); • “Cannibalism” – one predator might decide to eat another predator. This is possible because of randomness in rebirth; • Role-changing – if a predator that is very low on energy attacks a prey that has a very high energy value, they change roles; • 360° “sweep” – the robot comes to a halt and rotates 360 degrees on the spot, hoping to find food.

Predator-Prey Task Communication between robots

Evolution

Evolution • After a robot dies, the following processes take place: • Evaluation – if the controller’s fitness value for the role it is playing is equal to or greater than the best fitness value found up to that point, the controller is selected. Otherwise, the controller is discarded and the process skips to Mutation. • Selection – if the controller is selected, its parameters become the best for that role and are stored in the robot’s memory. • Mutation – the best set of parameters for the role being played is loaded from the robot’s memory and slightly changed. Each parameter gets its value changed by an amount in [-RANGE; RANGE]. • The range is defined for each parameter as a percentage of its value, so the larger the parameter, the wider the range. • After these tasks are performed, the robot is reborn with its new set of parameters and competes until it runs out of energy. The process is then repeated.
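The evaluate, select, and mutate cycle can be sketched in C as below. This is an illustration under stated assumptions: the function names are hypothetical, the perturbation is drawn uniformly (the distribution is not specified above), and each range is given as a fraction of the parameter value (for example 0.1 for ±10%).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Uniform random number in [-1, 1]. */
static double uniform_pm1(void)
{
    return 2.0 * ((double)rand() / (double)RAND_MAX) - 1.0;
}

/* Evaluation + Selection: keep the candidate if its fitness is equal to
 * or greater than the best found so far for this role. */
bool evaluate_and_select(const double *params, double fitness,
                         double *best_params, double *best_fitness, size_t n)
{
    if (fitness < *best_fitness) {
        return false;               /* discarded; caller goes on to mutate */
    }
    for (size_t i = 0; i < n; i++) {
        best_params[i] = params[i];
    }
    *best_fitness = fitness;
    return true;
}

/* Mutation: perturb each stored best parameter by a random amount within
 * +/- range_fraction[i] of its own value. */
void mutate(const double *best_params, const double *range_fraction,
            double *child, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        child[i] = best_params[i] * (1.0 + range_fraction[i] * uniform_pm1());
    }
}
```

Since mutation always starts from the stored best set, a discarded controller leaves no trace, and each robot effectively runs a one-parent hill climb per role.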

Evolution How to design the fitness function? First of all, it is important to be aware that if one chooses the fractional configuration for the fitness function: • The variables should be included in the function as shown. • The three variables chosen to compose the fitness function are: • Number of times eaten by the predator; • Number of times that the robot fed; • Ratio between the chases in which the robot caught food and the total chases that the robot performed.

Evolution Keeping the fitness model in mind, the first value has to decrease the fitness, so it goes in the denominator. The other two contribute positively to the robot’s performance, and therefore should go in the numerator of the fraction model.
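One possible shape for such a fractional fitness function is sketched below in C. The exact expression is not given in the text, so this is only an assumed form: the two positive terms are summed in the numerator, and 1 is added to the denominator so that a robot that was never eaten does not cause a division by zero. The function name is hypothetical.

```c
/* Assumed fractional fitness:
 *   (times fed + chase success ratio) / (1 + times eaten)
 * The sum in the numerator and the "+ 1" in the denominator are
 * illustrative choices, not taken from the project. */
double fitness_value(unsigned times_eaten, unsigned times_fed,
                     unsigned successful_chases, unsigned total_chases)
{
    double ratio = 0.0;
    if (total_chases > 0) {
        ratio = (double)successful_chases / (double)total_chases;
    }
    return ((double)times_fed + ratio) / (1.0 + (double)times_eaten);
}
```

With this form, a robot that fed 4 times and succeeded in half of its chases scores 4.5 if it was never eaten and 2.25 if it was eaten once, so being caught halves the score.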

Evolution • The parameters chosen to be included in the Evolutionary Algorithm are as follows: • Prey’s Food Gain – gain associated with rule 20, if prey; • Prey’s Avoid Gain – gain associated with rule 21, if prey; • Predator’s Food Gain – gain associated with rule 20, if predator; • Predator’s Avoid Gain – gain associated with rule 21, if predator; • Size of “food scope” – width of the scope previously mentioned; • Size of “avoid scope” – width of the “field of view”; • Hungry Level – energy level below which the robot becomes hungry; • Energy Danger Level – energy level below which the robot knows it is about to die; • Variance Threshold – the limit that regulates the exploratory behaviour of the robot.

Evolution Each robot then has two different sets of parameters: the ones assigned to the predator and the ones assigned to the prey. Since two robots participate in the experiment, and they evolve differently, 4 sets of parameters will evolve in different ways.

Evolution Standard Configuration refers to when robot A is the predator and robot B the prey. Swapped Configuration refers to when robot B is the predator and robot A the prey.

Results The controllers were allowed to run for a few generations, and an interesting result came up: Standard Configuration Swapped Configuration

Results The prey’s fitness value increased dramatically because a prey-prey scenario occurred. Without anything to decrease its fitness value, the parameters’ path of evolution became corrupted. Such an environment would make the prey even less suited to competing with the predator. Therefore, to prevent both prey-prey and predator-predator scenarios, randomness in rebirth and cannibalism were excluded from the project. Even so, the 4 sets of parameters all get a chance to evolve, since role-changing still occurs when a weak predator catches a strong prey.

Evolution Evolution was restarted, and tests were carried out. During the rest of the project, the controllers were left to evolve without any interference. The Standard Configuration prey evolved over 141 generations, and the predator over 29. The Swapped Configuration prey evolved over 72 generations, and the predator over 19. It is important to be aware that all sets of parameters evolve from the same seed. The original set of parameters was defined through some basic experimentation.

Results The fitness values of each generation Standard Configuration

Results The fitness values of each generation Swapped Configuration

Results To keep it simple, the results will now focus on the Standard Configuration, which evolved over more generations.
