Neural Heuristics For Problem Solving: Using ANNs to Develop Heuristics for the 8-Puzzle, by Bambridge E. Peterson
What is a problem? (informal) • question to be answered • paradox to be resolved • obstacle to be overcome • goal to be achieved • crisis to be averted • challenge to be met
What is a problem? (formal) • Formulate problem as a graph search • Initial state (question), goal state (answer) • Actions - allowable actions for a given state • Transition function - T(S,A) - given a state S and action A, return the resulting state S’ when A is performed in S • Goal test - function to test whether we’ve reached the goal • Path-cost function - keeps track of path cost • (from Artificial Intelligence: A Modern Approach, 3rd Edition by Russell and Norvig)
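The formulation above can be sketched for the 8-puzzle as follows. This is a minimal illustration, assuming the flat tuple encoding that appears later in the slides (9 stands for the blank); the names `actions`, `result`, `is_goal` and `step_cost` are illustrative and are not taken from the original code.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 9)  # 9 marks the blank (assumed encoding)

# Index offsets on a flat 3x3 board for each direction the blank can move.
MOVES = {"up": -3, "down": 3, "left": -1, "right": 1}


def actions(state):
    """Return the moves of the blank that are legal in `state`."""
    row, col = divmod(state.index(9), 3)
    legal = []
    if row > 0:
        legal.append("up")
    if row < 2:
        legal.append("down")
    if col > 0:
        legal.append("left")
    if col < 2:
        legal.append("right")
    return legal


def result(state, action):
    """Transition function T(S, A): swap the blank with the neighbouring tile."""
    blank = state.index(9)
    target = blank + MOVES[action]
    board = list(state)
    board[blank], board[target] = board[target], board[blank]
    return tuple(board)


def is_goal(state):
    """Goal test: tiles 1..8 in order with the blank in the last cell."""
    return state == GOAL


def step_cost(state, action):
    """Every move costs 1, so the path cost equals the number of moves."""
    return 1
```

States are immutable tuples so they can be stored directly in an explored set or used as dictionary keys.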
Graph Search • Idea: • Use an explored set to keep track of expanded nodes • Use a frontier to store successor nodes still to be expanded • Many search algorithms differ in how they store nodes in the frontier • [Figure: example weighted graph with start node S and goal node G]
Graph Search • Some examples: • Breadth-first search • Depth-first search • Iterative deepening • Uniform cost • Greedy best-first • A* • Iterative-deepening A* • [Figure: example weighted graph with start node S and goal node G]
Graph Search • A* search • Order the frontier (a priority queue) by the cost function f(n) = g(n) + h(n) • g(n) is the path cost to reach node n • h(n) is the heuristic function: the estimated distance from n to the goal • A* graph search is optimal if h(n) is consistent (a consistent heuristic is also admissible) • [Figure: example weighted graph with start node S and goal node G]
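A compact sketch of A* graph search with a priority queue ordered by f(n) = g(n) + h(n). The function signature and the toy graph are hypothetical (the graph is not the one in the slide's figure), and skipping already-explored nodes is only safe under the assumption that h is consistent.

```python
import heapq
from itertools import count


def a_star(start, is_goal, successors, h):
    """Return (path_cost, nodes_expanded), or None if the goal is unreachable.

    successors(state) must yield (next_state, step_cost) pairs.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(h(start), next(tie), 0, start)]  # entries: (f, tie, g, state)
    best_g = {start: 0}
    explored = set()
    while frontier:
        _, _, g, state = heapq.heappop(frontier)
        if state in explored:
            continue  # stale entry: a cheaper path was expanded earlier
        if is_goal(state):
            return g, len(explored)
        explored.add(state)
        for nxt, cost in successors(state):
            new_g = g + cost
            if nxt not in explored and new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h(nxt), next(tie), new_g, nxt))
    return None


# Toy weighted graph (hypothetical) with a consistent heuristic table.
GRAPH = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 3)], "G": []}
H = {"S": 5, "A": 4, "B": 3, "G": 0}

toy_result = a_star("S", lambda s: s == "G", lambda s: GRAPH[s], lambda s: H[s])
```

On the toy graph the cheapest route is S, A, B, G with cost 6, found after expanding three nodes, so `toy_result` is `(6, 3)`. The second element of the return value corresponds to the "# states explored" statistic recorded later in the slides.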
Heuristics in Graph Search • What is a heuristic? • A general rule of thumb for solving a problem, usually developed through experience • What is an admissible heuristic? • A heuristic that never overestimates the path cost to the goal • What is a consistent heuristic? • One whose estimate never drops by more than the step cost: h(n) ≤ c(n, a, n') + h(n') for every successor n' of n, so f(n) never decreases along a path (monotone) • Why use heuristics? • Brute-force search is slow when the state space is large • A good heuristic reduces the number of nodes that must be explored
N-Puzzle • n = i^2 - 1 for a positive integer i (8-puzzle: i = 3) • sliding block puzzle on an i x i grid • n tiles, 1 ‘blank space’ • start in a random state • can move one tile at a time • the tile exchanges places with the ‘blank’ space • can only move up, down, left, right • 8-puzzle example (right) • goal state is numbers 1 through n in order, left to right, top to bottom
N-Puzzle Heuristics • Why use heuristics? The N-puzzle is a good example • 8-puzzle: 9!/2 = 181,440 reachable states • 15-puzzle: 16!/2, approximately 1.05 * 10^13 (about 10 trillion) states • 24-puzzle: 25!/2, approximately 7.76 * 10^24 states • Have fun with brute-force search in this state space • Something more ‘clever’ than the brute-force approach is needed... • Manhattan Distance: the sum, over all tiles, of the city-block distance between a tile's current position and its position in the goal state • Misplaced tiles: the total number of tiles not in their goal-state position
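Both heuristics can be written in a few lines. This sketch assumes the flat encoding used elsewhere in the slides, with the blank stored as the largest value (9 for the 8-puzzle); the function names are illustrative.

```python
def manhattan_distance(state, side=3):
    """Sum of city-block distances of every tile from its goal cell."""
    blank = side * side
    total = 0
    for index, tile in enumerate(state):
        if tile == blank:
            continue  # the blank is not a tile and is not counted
        goal_index = tile - 1  # tile t belongs in cell t - 1 (0-based)
        total += abs(index // side - goal_index // side)  # row distance
        total += abs(index % side - goal_index % side)    # column distance
    return total


def misplaced_tiles(state, side=3):
    """Number of tiles (blank excluded) that are not in their goal cell."""
    blank = side * side
    return sum(1 for i, t in enumerate(state) if t != blank and t != i + 1)
```

For the example state from the training-data slide, `manhattan_distance((8, 7, 1, 2, 9, 6, 3, 4, 5))` gives 18, which matches the MD value recorded there, and `misplaced_tiles` gives 7 for the same state.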
Symbolic vs. Subsymbolic • Symbolic: symbols + rules for their arrangement in space and transformation in time (syntax), which is a general definition of language • Infinitely many meaningful arrangements can be generated from a finite set of symbols • Natural languages • Formal languages • Manhattan Distance is a symbolic heuristic • Subsymbolic: connectionist • Parallel distributed processing • Simultaneous processing among multiple parallel channels • Can we use machine learning to develop heuristics? • Subsymbolic heuristics, aka “Neural Heuristics”... So the goal is to develop a ‘better’ heuristic for the 8-puzzle...
Generating Training Data • generated 20,000 solved instances of the 8-puzzle • used Python to generate the states and solve them with the A* algorithm • stored the instances in MongoDB as well as in a .txt file for processing in Octave • Note: a puzzle can be represented internally as a vector, e.g. (3, 8, 2, 4, 5, 6, 1, 7, 9), using 9 to represent the blank space. Obviously only certain operations can be performed...
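The slides do not say how the random states were produced. One common approach, shown here purely as an assumption, is to shuffle the nine values and keep only solvable permutations: on a 3x3 board a state can reach the goal exactly when the number of inversions among the tiles (blank ignored) is even. The helper names are hypothetical.

```python
import random


def is_solvable(state):
    """True if a 3x3 state (blank stored as 9) can reach the goal state."""
    tiles = [t for t in state if t != 9]
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    return inversions % 2 == 0


def random_solvable_state(rng):
    """Shuffle until a solvable permutation appears (half of them are)."""
    state = list(range(1, 10))
    while True:
        rng.shuffle(state)
        if is_solvable(state):
            return tuple(state)


# Example: draw a few reproducible states with a seeded generator.
rng = random.Random(0)
samples = [random_solvable_state(rng) for _ in range(5)]
```

Each sampled state would then be solved with A* to obtain the optimal path cost used as a training target.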
Training Data: Fields and Example • State n: 8, 7, 1, 2, 9, 6, 3, 4, 5 • # states explored: 1571 • # nodes added to frontier: 2448 • MD heuristic: 18 • Path cost: 24 • Time (on my machine): 94928 microseconds
Neural Heuristics • The idea... • Train various MLP networks with backpropagation • The goal is function approximation (regression) • Train the network with different targets: • the optimal solution cost • the difference between the optimal solution cost and the Manhattan distance of the state • perhaps another...
Neural Network Input • The 9-element input state S was transformed into an 81-element vector of 1’s and 0’s: bit 9k + (t - 1) equals 1 if and only if S[k] = t (positions k counted from 0, values t from 1 to 9) • 3-element example: [2, 1, 3] = [0 1 0 1 0 0 0 0 1] • 3-element example: [3, 2, 1] = [0 0 1 0 1 0 1 0 0] • Tried this because of the following paper: • “Likely-Admissible and Sub-Symbolic Heuristics”
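The encoding can be sketched as below. The indexing convention (0-based positions, 1-based values) is inferred from the two 3-element examples on the slide, and `encode_state` is an illustrative name.

```python
def encode_state(state):
    """One-hot encode a state of n cells into a list of n * n bits.

    Bit n * k + (t - 1) is 1 if and only if state[k] == t, where k is the
    0-based position and t is the 1-based value stored there.
    """
    n = len(state)
    bits = [0] * (n * n)
    for k, t in enumerate(state):
        bits[n * k + (t - 1)] = 1
    return bits


# The two small examples from the slide, plus a full 8-puzzle state.
small_a = encode_state([2, 1, 3])
small_b = encode_state([3, 2, 1])
full = encode_state((8, 7, 1, 2, 9, 6, 3, 4, 5))
```

`small_a` and `small_b` reproduce the slide's bit vectors, and `full` has 81 entries of which exactly 9 are set, one per board position.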
Neural Networks (cont.) • # hidden units (single hidden layer): 5, 10 and 15 • learning rate set at 0.1 • momentum 0.8 • Number of epochs: 500-1000, 64 samples an epoch • used tanh activation function for the hidden layer • sigmoid activation function on the output
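A forward pass for a network of this shape (tanh hidden layer, sigmoid output) might look like the sketch below. It uses only the standard library to stay self-contained; the weight layout, the initialisation range and the rescaling by `MAX_COST` are assumptions, since a sigmoid output lies in (0, 1) and the slides do not say how targets were scaled. The value 31 is the longest optimal solution length for the 8-puzzle.

```python
import math
import random

MAX_COST = 31  # assumed scaling constant; not stated in the slides


def init_weights(n_in, n_hidden, rng):
    """Small random weights; every row ends with a bias term."""
    w_hidden = [
        [rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
        for _ in range(n_hidden)
    ]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Return the sigmoid output of a one-hidden-layer MLP for input x."""
    # zip stops at the shorter sequence, so the trailing bias is added separately.
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
        for row in w_hidden
    ]
    z = sum(w * v for w, v in zip(w_out, hidden)) + w_out[-1]
    return 1.0 / (1.0 + math.exp(-z))


def predict_cost(x, w_hidden, w_out):
    """Map the (0, 1) network output back to a path-cost estimate."""
    return MAX_COST * forward(x, w_hidden, w_out)


# Example: an untrained 81-10-1 network applied to an all-zero input.
w_hidden, w_out = init_weights(81, 10, random.Random(0))
untrained_estimate = predict_cost([0] * 81, w_hidden, w_out)
```

With trained weights loaded from a file, `predict_cost` could serve as h(n) inside A*; the loops could equally be written as matrix products.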
Neural Networks (cont.) • 13,000 samples used for the training set • 2,000 samples for tuning • 3,000 for testing the results of the trained MLP • 3,000 for ‘official’ testing in Python using A* • saved weights in a .txt file • tested in Python using NumPy
Preliminary Results • A bit disappointing so far... • For the 3,000 remaining testing samples, I compared the stats between the Manhattan distance heuristic and the various neural heuristics developed in training • h*: heuristic developed with the optimal path cost as target • h*_md: heuristic developed with the optimal path cost minus the Manhattan distance as target • h*_md_avg: mean of the two heuristics above
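One way the three heuristics could be assembled at search time is sketched below. It assumes that the network trained on the residual target is added back onto the Manhattan distance when used as a heuristic; the slides do not confirm this, and all function and parameter names are hypothetical.

```python
def make_heuristics(predict_cost, predict_residual, manhattan):
    """Build h*, h*_md and h*_md_avg from two trained predictors.

    predict_cost(state)     -> estimate of the optimal path cost
    predict_residual(state) -> estimate of (optimal path cost - Manhattan distance)
    manhattan(state)        -> Manhattan distance of the state
    """

    def h_star(state):
        return predict_cost(state)

    def h_star_md(state):
        # Assumed usage: add the learned correction to the Manhattan distance.
        return manhattan(state) + predict_residual(state)

    def h_star_md_avg(state):
        return 0.5 * (h_star(state) + h_star_md(state))

    return h_star, h_star_md, h_star_md_avg


# Stub predictors standing in for trained networks (made-up constants).
h_star, h_star_md, h_star_md_avg = make_heuristics(
    predict_cost=lambda s: 22.0,
    predict_residual=lambda s: 6.0,
    manhattan=lambda s: 18,
)
```

None of these heuristics is guaranteed to be admissible, because a regression network can overestimate the true cost, so A* with them may return non-optimal paths.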
Preliminary Results • A bit disappointing so far... • Examples: • Using the MD heuristic, it takes less than 1 second to solve 10 8-puzzle examples. The average number of nodes explored for these examples is 963, with 1,508 nodes added to the frontier • For the same puzzles, using h*, it took over 2 minutes to solve them, with an average of 27,000 nodes explored and 40,000 added to the frontier • Something isn’t right here...
Next Up • Double check code for errors • Try 9-h-1 topology, using just the state input without transformation into bit vector • SVM - give Support Vector Machine a crack at it • Discuss with Professor Hu • Still a week left!