COM3: Data-driven optimization via search heuristics, Day 2
Dr. Richard Allmendinger, richard.allmendinger@manchester.ac.uk
The Jyväskylä Summer School, 7th–18th August 2017
Easy problem vs. difficult problem
[Figure: a small routing network with edge weights, contrasting an easy instance with a difficult one]
Past trips, customer requests, traffic data, etc.
Data-driven optimization
Traditionally
• It was assumed that the quality of candidate solutions can be computed using a closed-form function
• Models of decision-making under uncertainty assume perfect information, i.e. accurate values for the system parameters and specific probability distributions for the random variables
Data-driven optimization
• Closed-form functions may not be available
• Precise knowledge of models is rarely available in practice
Data-driven optimization uses historical data and/or observations of the random variables as direct inputs to the optimization problem
Typical challenges in (data-driven) optimization include
• Constraints
• Uncertainty about data
• Dynamic/online decision making
• Multiple conflicting objectives
• Limited data/data is expensive to obtain
• Huge amounts of data
• User preferences may need to be accounted for
How to solve (data-driven) optimization problems?
• Systematic enumeration (brute force)
• Problem-specific, dedicated algorithms
• Generic methods for exact optimization
• Heuristic methods (combined with data pre-processing and/or analytics)
Data-driven heuristic optimization (DDHO) field
[Venn diagram: DDHO at the intersection of operations research, machine learning, statistics, and applications]
Mathematical optimization
Optimization refers to choosing the best element from some set of available alternatives
Some formal problem definitions… Continuous fence problem Knapsack problem Traveling salesman problem
Many types of optimization models
[Figure: classification tree — optimization problem → convex vs non-convex, each → linear vs non-linear; constrained vs unconstrained]
Many more problem classifications, e.g.:
• Deterministic vs stochastic
• One or multiple objective functions
• Computational vs physical evaluations
• Online vs offline optimization
What else did we do on Day 1...
• Combinatorial vs continuous problems
• Mathematical programming vs heuristics
• Global optimization vs local optimization
• Problem vs problem instance
• Decision problem vs optimization problem
• P vs NP
• Time complexity vs worst-case time complexity
Agenda for today
Finally, we will look at some optimization algorithms, with a focus on combinatorial optimization:
• Greedy algorithms
• Dynamic programming
• Approximation algorithms
• Local search
• Simulated annealing
• Population-based search heuristics
All of these algorithms could be used in your assignment (4.d), in particular greedy algorithms, local search, simulated annealing and population-based heuristics!
Combinatorial optimization
• Discrete decision variables
• Search space often finite, but typically too vast for a brute-force approach → more efficient algorithms needed
Algorithms for combinatorial optimization
• Typically problem-specific, but some general concepts are used repeatedly, such as
• Greedy algorithms
• Integer linear programming
• Branch-and-bound
• Dynamic programming
• Heuristics (e.g. local search, nature-inspired algorithms)
Greedy algorithms
“A greedy algorithm is an algorithmic paradigm that follows the problem solving heuristic of making the locally optimal choice at each stage with the hope of finding a global optimum.” Wikipedia https://en.wikipedia.org/wiki/Greedy_algorithm
Can you think of any advantages and/or disadvantages of greedy algorithms?
− In general, a greedy strategy does not produce an optimal solution
+ May yield locally optimal solutions that approximate a globally optimal solution in reasonable time
Greedy algorithms Goal of the next slides: Build up an understanding about simple but important optimization problems and simple heuristics to tackle these problems
Knapsack-type problems…
Example 1: Coin changing
Goal: Given the currency denominations 1, 2, 5, 10, 20, 50, 100, 200, pay a certain amount to the customer using as few coins as possible
Example: 34¢
Cashier's algorithm: While the total has not been reached, add a coin of the largest value that does not take us past the amount to be paid
Example: €2.89
Can you think of a greedy algorithm to solve this problem? Try it out for an amount of €2.89 (optional: write down your algorithm in the form of pseudocode)
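The cashier's algorithm above can be sketched in a few lines of Python (function and variable names are my own):

```python
def cashier_change(amount, denominations):
    """Greedy cashier's algorithm: repeatedly take the largest
    denomination that does not exceed the remaining amount."""
    coins = []
    remaining = amount
    for coin in sorted(denominations, reverse=True):
        while coin <= remaining:
            coins.append(coin)
            remaining -= coin
    if remaining:  # can happen if the smallest denomination is > 1
        raise ValueError("no exact change possible")
    return coins

# EUR 2.89 = 289 cents with the euro denominations
print(cashier_change(289, [1, 2, 5, 10, 20, 50, 100, 200]))
# -> [200, 50, 20, 10, 5, 2, 2]  (7 coins)
```
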
Example 1: Coin changing Exercise: Provide the formal definition of the change-making problem
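One possible formalization (the notation \(c_i\), \(A\), \(x_i\) is mine, not from the slides): given \(k\) denominations \(c_1, \dots, c_k\) and an amount \(A\), minimize the number of coins paid,

```latex
\min_{x \in \mathbb{Z}_{\ge 0}^{k}} \; \sum_{i=1}^{k} x_i
\quad \text{subject to} \quad \sum_{i=1}^{k} c_i x_i = A,
```

where \(x_i\) denotes the number of coins of denomination \(c_i\) used.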
Example 1: Coin changing
Two issues with the Cashier's algorithm:
• It is optimal for standard coin sets, but not for arbitrary sets of denominations
• For example, given the U.S. postage denominations 1, 10, 21, 34, 70, 100, 350, 1225, 1500, dispense an amount to the customer using the fewest number of stamps
• Cashier's algorithm: 140¢ = 100 + 34 + 1 + 1 + 1 + 1 + 1 + 1
• Optimal: 140¢ = 70 + 70
• It may not even lead to a feasible solution if the smallest denomination is > 1
• For example, given the denominations 7, 8, 9:
• Cashier's algorithm: 15¢ = 9 + ???
• Optimal: 15¢ = 7 + 8
Does the Cashier's algorithm perform well in all situations?
Example 2: Interval scheduling
• Job j starts at sj and finishes at fj
• Two jobs are compatible if they don't overlap
• Goal: find a maximum subset of mutually compatible jobs
Example 2: Interval scheduling
• Job j starts at sj and finishes at fj
• Two jobs are compatible if they don't overlap
• Goal: find a maximum subset of mutually compatible jobs
Real-world scenario: modify the assignment problem such that you have one cluster only and can do one project at a time…
Can you think of a greedy algorithm to solve this problem?
Example 2: Interval scheduling
Greedy template: Consider jobs in some natural order. Then take each job provided it is compatible with the ones already taken.
• [Earliest start time] Consider jobs in ascending order of sj
Counterexample
Example 2: Interval scheduling
Greedy template: Consider jobs in some natural order. Then take each job provided it is compatible with the ones already taken.
• [Shortest interval] Consider jobs in ascending order of fj − sj
Counterexample
Example 2: Interval scheduling
Greedy template: Consider jobs in some natural order. Then take each job provided it is compatible with the ones already taken.
• [Fewest conflicts] For each job j, count the number of conflicting jobs cj. Schedule in ascending order of cj
Counterexample
Example 2: Interval scheduling
Greedy template: Consider jobs in some natural order. Then take each job provided it is compatible with the ones already taken.
• [Earliest finish time] Consider jobs in ascending order of fj
Optimal algorithm for this problem type
Example 2: Interval scheduling
Optimal algorithm, earliest-finish-time first:
• Keep track of the job j* that was added last to A
• Job j is compatible with A iff sj ≥ fj*
• Sorting by finish time takes O(n log n) time
Interesting question: Can you somehow adapt this algorithm to the allocation problem in your assignment?
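The earliest-finish-time-first rule can be sketched in Python as follows (the job data below is made up for illustration, not a slide instance):

```python
def interval_scheduling(jobs):
    """Earliest-finish-time-first greedy for interval scheduling.
    jobs: list of (start, finish) pairs. Returns a maximum set of
    mutually compatible jobs."""
    selected = []
    last_finish = float("-inf")
    for s, f in sorted(jobs, key=lambda job: job[1]):  # ascending finish time
        if s >= last_finish:  # compatible with the last job added
            selected.append((s, f))
            last_finish = f
    return selected

jobs = [(0, 6), (1, 4), (3, 5), (3, 8), (4, 7), (5, 9), (6, 10), (8, 11)]
print(interval_scheduling(jobs))
# -> [(1, 4), (4, 7), (8, 11)]
```

The sort dominates the running time, giving the O(n log n) bound mentioned above.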
Example 3: Traveling salesman problem
• Given n cities and a network of connections and distances between cities, dij ∈ ℝ
• Goal: Find the shortest round trip
[Figure: a 5-city network with edge distances]
Can you think of a greedy algorithm to solve this problem?
Example 3: Traveling salesman problem
• Given n cities and a network of connections and distances between cities, dij ∈ ℝ
• Goal: Find the shortest round trip
Nearest-neighbour algorithm: Start with an initial city, then always choose as the next city the closest city that has not been visited yet.
Starting city: 2 → next closest city: 4 → 5 → 3 → 1
f(x) = 1 + 2 + 2 + 16 + 12 = 33
Is the path discovered by this algorithm always the same given a particular network?
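A generic nearest-neighbour sketch in Python; the 4-city distance matrix below is a made-up example on a complete graph, not the slide's instance:

```python
def nearest_neighbour_tour(dist, start):
    """Greedy nearest-neighbour heuristic for the TSP.
    dist: symmetric distance matrix dist[i][j] on a complete graph.
    Returns (tour as a list of city indices, total round-trip length)."""
    n = len(dist)
    tour = [start]
    unvisited = set(range(n)) - {start}
    length = 0.0
    while unvisited:
        current = tour[-1]
        nxt = min(unvisited, key=lambda c: dist[current][c])  # closest unvisited
        length += dist[current][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
    length += dist[tour[-1]][start]  # close the round trip
    return tour, length

dist = [[0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0]]
print(nearest_neighbour_tour(dist, 0))
# -> ([0, 1, 3, 2], 80.0)
```

Note that the resulting tour generally depends on the starting city (and on how ties are broken), which answers the question on the slide: no, it is not always the same.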
Example 4: 0-1 knapsack problem
• Given are n items with weights wi ∈ ℝ and values vi ∈ ℝ
• Also given is a weight restriction W ∈ ℝ
• Goal: fill the knapsack so as to maximize the total value
Done yesterday
Exercise: Come up with a greedy algorithm for the knapsack problem and write down its pseudocode
• Don't expect to come up with an algorithm that finds the optimal solution reliably (it is an NP-hard problem…)
• Think about which greedy choices you can make when you have to construct a solution to the problem
Example 4: 0-1 knapsack problem Exercise: Come up with a greedy algorithm for the knapsack problem • Rank items by value/weight ratio: vi/wi • (Alternatively rank items by weight or value only) • Consider items in order of decreasing ratio • Pack as many items as possible
Example 4: 0-1 knapsack problem
Exercise: Come up with a greedy algorithm for the knapsack problem
• Rank items by value/weight ratio vi/wi
• Consider items in order of decreasing ratio
• Pack as many items as possible

0-1-Knapsack(v, w, W)
  sort items by vi/wi in decreasing order
  load := 0; knapsack := ∅; i := 1
  while load < W and i ≤ n do
    if wi ≤ W − load then
      knapsack := knapsack ∪ {i}
      load := load + wi
    i := i + 1
  return knapsack

[Worked example on the slide: total load = 8, total profit = 15]
Example 4: 0-1 knapsack problem Note: • This greedy algorithm is not optimal for the 0-1 knapsack problem • Optimal (but computationally expensive) alternatives are dynamic programming or branch & bound
Agenda for today Finally we will look at some optimization algorithms with focus on combinatorial optimization • Greedy algorithms • Dynamic programming • Approximation algorithms • Local search • Simulated annealing • Population-based search heuristics
Dynamic programming
Greedy algorithm: Build up a solution incrementally, myopically optimizing some local criterion
Dynamic programming (DP): Break up a problem into a series of overlapping subproblems, and build up solutions to larger and larger subproblems
• DP also makes sure that subproblems are not solved repeatedly but only once, by keeping the solutions of simpler subproblems in memory (“trading space vs. time”)
• DP is an exact method, i.e. in comparison to the greedy approach, it always solves a problem to optimality
• It is applicable only to problems with a certain (decomposable/multi-stage) structure → Bellman equation
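The “trading space vs. time” idea in miniature — a memoized Fibonacci (a classic illustration, not an example from the slides): caching each subproblem's result once turns an exponential recursion into a linear one.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # store each subproblem's result: space traded for time
def fib(n):
    # the naive recursion would recompute the same fib(k) exponentially often;
    # with the cache, every subproblem is solved exactly once
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))  # -> 102334155, instantly
```
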
Dynamic programming (DP)
Richard Bellman pioneered the systematic study of DP in the 1950s
Etymology:
• Dynamic programming = planning over time
• The Secretary of Defense was hostile to mathematical research
• Bellman sought an impressive name to avoid confrontation
Dynamic programming DP has been used in many areas, such as • Bioinformatics • Control theory • Information theory • Operations research • Computer science: theory, graphics, AI, compilers, systems,… Some famous DP algorithms: • Unix diff for comparing two files • Viterbi for hidden Markov models • De Boor for evaluating spline curves • Smith-Waterman for genetic sequence alignment • Bellman-Ford for shortest path routing in networks
Example: DP for the 0-1 knapsack problem
• Given are n items with weights wi ∈ ℝ and values vi ∈ ℝ
• Also given is a weight restriction W ∈ ℝ
• Goal: fill the knapsack so as to maximize the total value
Knapsack instance (weight limit W = 11)
Different possible greedy strategies, such as:
• Greedy by value: Repeatedly add the item with maximum vi: {5}, total profit = 28
• Greedy by weight: Repeatedly add the item with minimum wi: {1, 2, 3}, total profit = 25
• Greedy by ratio: Repeatedly add the item with maximum ratio vi/wi: {5}, total profit = 28
Observation: None of the greedy algorithms is optimal
Example: DP for the 0-1 knapsack problem
Questions to answer with DP:
a) What could be the subproblems?
b) How can subproblems be solved with the help of smaller ones? [write down the Bellman equation!]
c) How can the Bellman equation be implemented?
a) V(i,w): maximum total value across the items 1,…,i with weight limit w
b) Bellman equation: current value = immediate reward + future value. We can make the optimal choice of packing item i or not for a knapsack of weight limit w if we know the optimal choices for items 1,…,i−1:
V(i,w) = V(i−1, w) if wi > w
V(i,w) = max{ V(i−1, w), vi + V(i−1, w − wi) } otherwise
Example: DP for the 0-1 knapsack problem
c) How to implement the Bellman equation?
• A recursive implementation of the Bellman equation is simple, but the same V(i,w) might be computed more than once
• To avoid computing subproblems more than once, we can store their results, e.g. in a matrix with rows i (subset of items 1,…,i) and columns w (weight limit)
• Example entry: V(3,4) = maximum total value across the items 1, 2, 3 with weight limit w = 4
Example: DP for the 0-1 knapsack problem
How to implement the Bellman equation?
[Table: the DP matrix V, with rows i = 0,…,n (subset of items 1,…,i) and columns w = 0,…,W (weight limit), is filled in step by step on the slides]
• Initialization: V(i,w) = 0 if i = 0 or w = 0
• Every other entry is computed from the row above: either copy V(i−1,w), or pack item i and add vi to V(i−1, w − wi) — the slides show this for v1 = 1, v2 = 6, v3 = 18, …
• V(n,W) is the maximum total value across all items with weight limit W
How to determine which items should actually be packed? Trace back through the matrix: item i was packed for weight limit w iff V(i,w) ≠ V(i−1,w). For this instance, the trace-back yields:
x1 = 0, x2 = 0, x3 = 1, x4 = 1, x5 = 0
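The table-filling and trace-back steps above can be sketched as a bottom-up DP in Python. The instance below is a reconstruction: v1 = 1, v2 = 6, v3 = 18, v5 = 28 and W = 11 appear on the slides, while the remaining values and weights are the standard textbook instance consistent with those numbers — treat them as an assumption.

```python
def knapsack_dp(values, weights, W):
    """Bottom-up DP for the 0-1 knapsack problem.
    V[i][w] = maximum total value across items 1,...,i with weight limit w.
    Returns (optimal value, chosen 0-based item indices)."""
    n = len(values)
    V = [[0] * (W + 1) for _ in range(n + 1)]  # row/column 0 initialized to 0
    for i in range(1, n + 1):
        for w in range(W + 1):
            V[i][w] = V[i - 1][w]  # option 1: skip item i
            if weights[i - 1] <= w:  # option 2: pack item i if it fits
                V[i][w] = max(V[i][w],
                              values[i - 1] + V[i - 1][w - weights[i - 1]])
    # trace back: item i was packed iff V[i][w] differs from V[i-1][w]
    chosen, w = [], W
    for i in range(n, 0, -1):
        if V[i][w] != V[i - 1][w]:
            chosen.append(i - 1)
            w -= weights[i - 1]
    return V[n][W], sorted(chosen)

# reconstructed instance (partly assumed): items 1..5
values, weights = [1, 6, 18, 22, 28], [1, 2, 5, 6, 7]
print(knapsack_dp(values, weights, 11))
# -> (40, [2, 3]), i.e. x3 = x4 = 1 as on the slide
```

The table costs O(nW) time and space, versus the exponential blow-up of recomputing subproblems recursively.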