This is a lecture seven of `Biologically Inspired Computing’

Biologically Inspired Computing: Evolutionary Algorithms: Encodings, Operators and Related Issues: Timetables and Groups This is a lecture seven of `Biologically Inspired Computing’

Encoding / Representation Maybe the main issue in (applying) EC Note that: • Given an optimisation problem to solve, we need to find a way of encoding candidate solutions • There can be many very different encodings for the same problem • Each way affects the shape of the landscape and the choice of best strategy for climbing that landscape.

E.g. encoding a timetable I 4, 5, 13, 1, 1, 7, 13, 2 Exam2 in 5th slot Exam1 in 4th slot Etc … • Generate any string of 8 numbers between 1 and 16, • and we have a timetable! • Fitness may be <clashes> + <consecs> + etc … • Figure out an encoding, and a fitness function, and • you can try to evolve solutions.

Mutating a Timetable with Encoding 1 4, 5, 13, 1, 1, 7, 13, 2 Using straightforward single-gene mutation Choose a random gene

Mutating a Timetable with Encoding 1 4, 5, 6 , 1, 1, 7, 13, 2 Using straightforward single-gene mutation One mutation changes position of one exam

Alternative ways to do it This is called a `direct’ encoding. Note that: • A random timetable is likely to have lots of clashes. • The EA is likely (?) to spend most of its time crawling through clash-ridden areas of the search space. • Is there a better way?

Constructive Methods Problems like timetabling, scheduling, and other `logistics’ activities are often `solved’ in practice via constructive heuristics, These are also called greedy heuristics. A constructive method is a technique that builds a single solution step by step, trying to be clever (often) about each step.

Examples Prim’s algorithm for building the minimal spanning tree (see an earlier lecture) is an example. Djikstra’s shortest path algorithm is also an example. In both of these cases, the optimal solution is guaranteed to be found, since MST and SP are easy problems. But usually we see constructive methods used to give very fast `OK’ solutions to hard problems.

A constructive method for the TSP Start with a random current city c; mark c as visited: Initialise Tour = {} (empty) Repeat ncities-1 times: choose, BTR, the closest unvisited city to c (call it d) add the edge cd to Tour mark d as visited Let d be the current city Try it yourself a few times. Can you construct examples where this will give a very poor result?

A constructive method for exam timetabling Repeat nexams times: choose an exam, e, randomly. let V be the set of valid timeslots for e – I.e. slots it can go in without introducing a clash. If V is empty, mark e as unplaced Else choose random slot t from V, and assign e to t. Is this how people do timetabling, or is there an even better way?

A (usually) better constructive method for exam timetabling Assign a difficulty score to each exam – e.g. this could be the number of other exams with which it clashes. Repeat nexams times: choose an unscheduled exam e with highest difficulty,BTR. Find V, the set of slots it can go in without introducing a clash. If V is empty, mark e as unplaced Else for each slot in V, find its usability score – e.g. the number of unplaced exams that could go in that slot without introducing a clash Choose a slot t with minimal usability score. Assign e to t.

Back to encoding … We can use constructive methods as encodings in the following sort of way; this is sometimes called a `hybrid’ approach. The EA searches through the space of orderings of items (e.g. exams to schedule, edges to put in a graph, etc…). When evaluating fitness, a constructive method builds a solution using the ordering provided in the chromosome, and then evaluates fitness in the normal way.

Encoding a timetable II E4 E6 E5 E2 E3 E8 E7 4, 5, 13, 1, 1, 7, 13, 2 Etc … E1 Use the 13th clash-free slot for exam3 Use the 5th clash-free slot for exam2 Use the 4th clash-free slot for exam1 Suppose these groups would clash {E1, E2}, {E1,E3}, {E2, E6}, {E2,E7}, {E2,E8}, {E3,E5}, {E3,E6}, {E4,E6}, {E4, E7}, {E5, E7}, {E5, E8}, {E6, E8}

Mutation withEncoding II E4 E6 E5 E2 E3 E8 E7 4, 5, 13, 1, 1, 7, 13, 2 E1 Use the 13th clash-free slot for exam3 Use the 5th clash-free slot for exam2 Use the 4th clash-free slot for exam1 Suppose these groups would clash {E1, E2}, {E1,E3}, {E2, E6}, {E2,E7}, {E2,E8}, {E3,E5}, {E3,E6}, {E4,E6}, {E4, E7}, {E5, E7}, {E5, E8}, {E6, E8}

Mutation withEncoding II E4 E6 E2 E3 E8 E5 4, 5, 13, 1, 14, 7, 13, 2 E1 E7 Use the 13th clash-free slot for exam3 Use the 5th clash-free slot for exam2 Use the 4th clash-free slot for exam1 Suppose these groups would clash {E1, E2}, {E1,E3}, {E2, E6}, {E2,E7}, {E2,E8}, {E3,E5}, {E3,E6}, {E4,E6}, {E4, E7}, {E5, E7}, {E5, E8}, {E6, E8}

Think about these things How could you design a `smart’ mutation operator for the direct timetable encoding? (hint – when you’ve randomly chosen a gene to mutate, can you do better than give it a random new slot?) How could you design a smart mutation operator for the indirect timetable encoding? (hint – hard)

Direct vs Indirect Encodings Direct: • straightforward genotype (encoding)  phenotype (individual) mapping • Easy to estimate effects of mutation • Fast interpretation of chromosome (hence speedier fitness evlaluation) Indirect/Hybrid: • Easier to exploit domain knowledge – (e.g. use this in the constructive heuristic) • Hence, possible to `encode away’ undesirable features • Hence, can seriously cut down the size of the search space • But, slow interpretation • Neighbourhoods are highly rugged.

Back to Bin-Packing The bin-packing encoding used in your assignment is a direct one. But there are some well-known constructive heuristics for bin-packing: the following ones are used when the bins have fixed capaities, and the problem is to pack the items into the smallest number of bins: First-fit-random (FFR): Repeat nitems times: Choose an item i randomly and place it in the first bin in which it will fit. First-fit-descending (FFD): Order the items from heaviest to lightest (BTR) For each item in order: place it into the first bin in which it will fit. How might you invent an indirect encoding for bin-packing?

An important aside about constructive methods Some Constructive Heuristics are deterministic. I.e. they give the same answer each time. Some are stochastic – I.e. they may give a different solution in different runs. Usually, if we have a deterministic constructive method such as FFD, we can engineer a stochastic version of it. E.g. instead of choosing the next-lightest item in each step, we might choose randomly between the lightest three unplaced items. When applying EAs, it is often found that a stochastic constructive heuristic is very useful for building an initial population. But care has to be taken with such an approach – why?

Crossover – some issues Consider our direct encoding for timetabling: Suppose this is a perfect solution with no clashes: Parent1: 1, 2, 5, 2, 2, 5, 2, 1, 2 And so is this: Parent2: 3, 4, 2, 4, 4, 2, 4, 3, 4 Consider a two-point crossover of them, such as: Child: 1, 2, 5, 4, 4, 2, 4, 3, 2 Would you expect this to be a perfect solution?

Crossover – some issues Probably not: let’s look at the parents in terms of the groupings into slots: Parent1: slot1 (e1, e8); slot2 (e2, e4, e5, e7, e9); slot5(e3, e6) Parent2: slot3 (e1, e8); slot4 (e2, e4, e5, e7, e9); slot2 (e3, e6) These parents are exactly the same in terms of the way exams are grouped together, and this is probably what accounts for their good fitness. I.e. it is a good idea to have e2, e4, e5, e7 and e9 in the same slot, etc. Child: slot1 (e1), slot2 (e2, e6, e9); slot 4 (e4, e5, e7); slot5 (e3) Our use of a standard `k-ary encoding’ crossover operator has disrupted these groupings.

Grouping Falkenauer (see paper on my teaching site – this one is examinable reading for the MScs, and recommended for the UGs) was the first to come up with a highly `tailored’ approach to applying an EA, in this case to the bin-packing problem. He used specialised initialisation, encoding, mutation, crossover, and fitness evaluation methods. His bin-packing work is generally a good example of how to design an EA so it works as well as it can on a particular problem. Of interest here is the encoding he used combined with the crossover operator – this type of encoding/operator combination has become common in cases where the problem at hand involves finding good `groups’ of some kind or other.

Group Based Encoding and Crossover Simplified from Falkenauer, a group-based encoding is simply a direct encoding of groups. E.g. for bin-packing, where we are trying to minimise the number of bins, and have 9 items, two chromosomes might be: P1: (3, 8, 2) – (1, 4) – (6, 7, 9) – (5) P2: (1, 6, 7, 9) – (3, 8, 2) – (4, 5) The chromosomes are simply appropriate datastructures that can hold a variable number of groups. The ordering of items within groups doesn’t matter. Notice that the underlying encoding can just be the correct one. The only really key point is that the crossover operator should work in the way described next.

Group Based Crossover Take two parents: P1: (3, 8, 2) – (1, 4) – (6, 7, 9) – (5) P2: (1, 6, 7, 9) – (3, 8, 2) – (4, 5) Start constructing a child C, which at first is a copy of P1: C: (3, 8, 2) – (1, 4) – (6, 7, 9) – (5) Now choose a random group from P2, and add it to the child: C: (1, 6, 7, 9) (3, 8, 2) – (1, 4) – (6, 7, 9) – (5)

Original parents P1: (3, 8, 2) – (1, 4) – (6, 7, 9) – (5) P2: (1, 6, 7, 9) – (3, 8, 2) – (4, 5) Child currently: C: (1, 6, 7, 9) (3, 8, 2) – (1, 4) – (6, 7, 9) – (5) Now remove all previous groups from the child that contain duplicated items. C: (1, 6, 7, 9) – ( 3, 8, 2) – (5) (note that group (1, 4) is gone, because it contained a duplicated `1’) We now have some missing items – take each one in turn, and add it back to the groups using a suitable heuristic (e.g. FFD). In this case, we have only lost the 4 – suppose in this problem we find it fits into the (3, 8, 2) group – we now end up with: C: (1, 6, 7, 9) – ( 3, 8, 2, 4) – (5)

Notes on group based crossover The intuition behind crossover is that: • The parents are presumably good (they were selected, after all • So the parents have good combinations of genes • If we combine good combinations of genes from different parents, we may get even better combinations. Fine, but we have realised it is important to have an idea, for a given problem, of what these combinations might look like. In grouping based problems, and considering the direct encoding, the combinations likely to be important are groups of genes with the same value. These are disrupted badly by ordinary crossover, but preserved with slight variation by group-based crossover

Next lecture Further encodings: Rules, Schedules, Trees, and more Trees

This is a lecture seven of `Biologically Inspired Computing’

This is a lecture seven of `Biologically Inspired Computing’

Presentation Transcript

CS101 Introduction to Computing Lecture 2 Evolution of Computing

Biologically Inspired Noise-Robust Speech Recognition for Both Man and Machine

Creative analogies: Learning about and learning through biologically inspired design

Biologically Inspired Computing, Nanoelectronic (Molecular Scale) Architectures Dr. Dan Hammerstrom http://web.cecs.pdx.

A BIOLOGICALLY INSPIRED PURSUIT PROBLEM: CHIMPANZEES AND COLOBUS MONKEYS

Biologically Inspired Computing: Introduction

Biologically Inspired Intelligent Systems

Mathematical Operations Using Bacteria

Programming Methodology for Biologically-Inspired Self-Assembling Systems

Biologically Inspired Intelligent Systems

Biologically Inspired Computation

Evolutionary Learning

Biologically Inspired Intelligent Systems

Biologically Inspired Intelligent Systems

Biologically Inspired Intelligent Systems

Biologically Inspired Computing: Introduction to Evolutionary Algorithms

Biologically Inspired Computation

Cellular Automata: Some Applications in Detail

Biologically Inspired Computation

Biologically Inspired Computation

CS101 Introduction to Computing Lecture 42 The Computing Profession

Bioinspired Computing Lecture 4