Explore a new explanation of why minimax works effectively in game-tree search, challenging traditional assumptions and offering a real-valued minimax model to overcome inherent evaluation errors. Dive into the strategies of game-playing programs and the complexities of heuristic evaluation functions.
Why Minimax Works: An Alternative Explanation. Mitja Luštrek¹, Ivan Bratko² and Matjaž Gams¹. ¹Jožef Stefan Institute, Department of Intelligent Systems; ²University of Ljubljana, Faculty of Computer and Information Science
Plan of talk • Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation
Search in game trees • Levels of the tree alternate: us to move, them to move
Search in game trees • v1 and v2 are evaluated heuristically, with some error • Backed up: v = max(v1, v2) = true value + error
Searching deeper • Evaluate heuristically at the bottom of the search • Back up level by level to the root • Is the root value now more trustworthy?
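To make the scheme on the last two slides concrete, here is a minimal Python sketch (not from the paper): the leaves of a small hand-made tree stand for the search horizon, they are evaluated with normally distributed error, and the noisy values are backed up through alternating max and min levels. The nested-list tree, the noise level and the function name are illustrative assumptions.

```python
import random

def minimax(node, maximizing, noise=0.1):
    """Toy search on a game tree given as nested lists. Leaves stand
    for the search horizon: they hold true values and are 'evaluated'
    with normally distributed error; the noisy values are then backed
    up through alternating max and min levels."""
    if not isinstance(node, list):                # leaf: heuristic evaluation
        return node + random.gauss(0.0, noise)
    values = [minimax(child, not maximizing, noise) for child in node]
    return max(values) if maximizing else min(values)

# Root is a max node ("us to move"); its children are min nodes ("them to move").
tree = [[0.32, 0.41], [0.70, 0.15]]
print(minimax(tree, maximizing=True))
```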
The minimax pathology • Conventional wisdom in game-playing practice: the deeper a program searches, the better it plays. • Early mathematical analyses of minimax [Nau 1979; Beal 1980] produced some surprising results. • According to these theoretical models, minimax SHOULD NOT work! • Minimaxing amplifies the error of the heuristic evaluation function. • The deeper a game-playing program searches, the worse it plays! • Nau: “Performing worse by working harder!” • This is called the minimax pathology.
Problems with these theoretical analyses • General impression: Something must have been wrong with these analyses! • Were the assumptions made by these mathematical models realistic?
Beal’s early assumptions • 1. Uniform branching factor • 2. Position values are binary: loss or win • 3. The proportion of wins for the side to move is constant throughout the game tree • 4. Position values within a level are independent of each other • 5. Static heuristic evaluation error is independent of the depth of the node • Note: none of these looks very unrealistic
Modifying Beal’s assumptions • Subsequent analyses by various authors modified the assumptions so that the pathology disappeared
Successful attempts to explain the pathology • The pathology disappeared when assuming: • positions close to each other have similar values [Bratko and Gams 1982; Beal 1982; Nau 1982; Scheucher and Kaindl 1998; Luštrek 2004] • early terminations [Pearl 1983] • a geometrically distributed branching factor [Michon 1983] • Which explanation is most natural? • Which conditions for the absence of the pathology are really necessary?
This paper: an alternative explanation • Is there a more fundamental explanation, one that makes the fewest assumptions? Is there something fundamental about the minimax relation that makes minimaxing successful in practice? • This paper: yes, there is! • It has to do with Beal’s assumption 5. • This paper: estimate positions with real values (as game-playing programs do); surprisingly, Beal’s assumption 5 then turns out not to be tenable!
Two-value and real-value errors • Two-value error: mistaking a loss for a win, or vice versa • Real-value error: the numerical difference between the heuristic and the true value (e.g. 0.32 instead of 0.41)
Beal’s assumption 5: P(two-value heuristic error) is constant with level • Our assumption: the distribution of the real-value heuristic error (noise) is constant with level • (Figure: P(binary error) and real-value error noise plotted against depth)
Summary of this paper’s findings • Our assumption: real-value error distribution at bottom level of search is constant throughout game tree • Beal’s assumption 5: two-value error distribution at bottom level of search is constant throughout game tree • These assumptions look equivalent, BUT surprisingly they are NOT! • These assumptions are in fact INCOMPATIBLE: minimax relation between true values of positions in game tree does not permit both two-value error and real-value error to be constant
Summary of this paper’s findings • When real-value heuristic error is constant, backed-up heuristic values become more reliable with increased depth of search - i.e. no pathology! • Corresponding backed-up binary values also become more reliable with depth (binary values obtained from real values through thresholding)
Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation
Why multiple/real values? • Necessary in games where the final outcome is multi-valued (Othello, tarok). • Used by humans and game-playing programs. • Seem unnecessary in games where the outcome is a loss, a win or perhaps a draw (chess, checkers). • But: • in a losing position against a fallible and unknown opponent, the outcome is uncertain; • in a winning position, a perfect two-valued evaluation function will not lose, but it may never win, either. • Multiple values are required to model uncertainty and to maintain a direction of play towards an eventual win.
A real-valued minimax model • Aims to be a real-valued version of Beal’s model: • 1. uniform branching factor; • 2. position values are real numbers; • 3. if the real values are converted to losses and wins, the proportion of losses for the side to move is constant throughout the game tree; • 4. position values within a level are independent of each other; • 5. static heuristic evaluation error is independent of the depth of the node (the error is normally distributed noise).
Building of a game tree • True values at the leaves: distributed uniformly in [0, 1] • True values backed up level by level to the root • Search to a chosen depth • Heuristic values at the search depth = true values + normally distributed noise • Heuristic values backed up to the root
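A minimal Python sketch of how such a tree could be generated under the assumptions above: uniform branching, leaf values uniform in [0, 1], true values backed up by minimax, Gaussian noise added at the search horizon. The function names, the level-by-level representation and the decision not to clip noisy values to [0, 1] are my own choices, not the authors' code.

```python
import random

def build_true_values(branching, depth):
    """True values for a uniform game tree, stored level by level
    (level 0 = root, 'we' move at even levels): leaves are drawn
    uniformly from [0, 1], then backed up by minimax."""
    levels = [None] * (depth + 1)
    levels[depth] = [random.random() for _ in range(branching ** depth)]
    for d in range(depth - 1, -1, -1):
        op = max if d % 2 == 0 else min
        levels[d] = [op(levels[d + 1][i * branching:(i + 1) * branching])
                     for i in range(branching ** d)]
    return levels

def backed_up_heuristic(levels, search_depth, branching, noise=0.1):
    """Root estimate after searching to search_depth: heuristic values
    at the horizon = true values + Gaussian noise (not clipped to
    [0, 1]), backed up by minimax to the root."""
    values = [v + random.gauss(0.0, noise) for v in levels[search_depth]]
    for d in range(search_depth - 1, -1, -1):
        op = max if d % 2 == 0 else min
        values = [op(values[i * branching:(i + 1) * branching])
                  for i in range(branching ** d)]
    return values[0]
```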
What we do with our model • Monte Carlo experiments: • generate 10,000 sets of true values; • generate 10 sets of heuristic values per set of true values per depth of search. • Measure the error at the root: • real-value error = the average difference between the true value and the heuristic value; • two-value error = the frequency of mistaking a loss for a win or vice versa. • Compare the error at the root when searching to different depths.
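One possible way to run the Monte Carlo measurement described on this slide, continuing the sketch above (it reuses build_true_values and backed_up_heuristic); the threshold argument is the loss/win boundary discussed on the next slide, and the trial counts follow the slide.

```python
import statistics

def root_errors(branching, tree_depth, search_depth, threshold,
                n_trees=10_000, n_evals=10, noise=0.1):
    """Monte Carlo estimate of the real-value and two-value error at
    the root for one search depth, reusing build_true_values and
    backed_up_heuristic from the sketch above."""
    real_errors, two_value_mistakes = [], []
    for _ in range(n_trees):
        levels = build_true_values(branching, tree_depth)
        true_root = levels[0][0]
        for _ in range(n_evals):
            est = backed_up_heuristic(levels, search_depth, branching, noise)
            real_errors.append(abs(est - true_root))
            # The root is a max node: a value below the threshold means a loss.
            two_value_mistakes.append((est < threshold) != (true_root < threshold))
    return statistics.mean(real_errors), statistics.mean(two_value_mistakes)
```

Comparing the two returned quantities across several values of search_depth gives the kind of comparison the slide describes.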
Conversion of real values to losses and wins (1) • To measure the two-value error, real values must be converted to losses and wins. • A value above a threshold means a win, below the threshold a loss. • At the leaves: • the proportion of losses for the side to move = c_b (because it must be the same at all levels); • real values are distributed uniformly in [0, 1]; • therefore the threshold = c_b. • At higher levels: • minimaxing on real values is equivalent to minimaxing on two values; • therefore the threshold is also c_b.
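A small sketch of this conversion, assuming (my reading of the slide, not stated on it) that c_b is the proportion of losses for the side to move that is invariant under minimaxing with branching factor b, i.e. the solution of c = (1 - c)^b; the bisection routine and the convention that a value equal to or above the threshold counts as a win are illustrative.

```python
def invariant_loss_proportion(branching, tol=1e-12):
    """Proportion of losses for the side to move that stays constant
    under minimaxing with the given branching factor, taken here to be
    the solution c of c = (1 - c)**branching (my reading of c_b).
    Found by bisection; f(c) = (1 - c)**branching - c is decreasing."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (1 - mid) ** branching - mid > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def to_two_valued(value, threshold):
    """Convert a real value in [0, 1] to a loss or a win by thresholding,
    as described on the slide (value above the threshold = win)."""
    return "win" if value >= threshold else "loss"

c_b = invariant_loss_proportion(branching=2)       # about 0.382 for b = 2
print(c_b, to_two_valued(0.41, c_b), to_two_valued(0.32, c_b))
```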
Conversion of real values to losses and wins (2) • One route: minimax the real values up the tree, then apply the threshold. • The other route: apply the threshold at the leaves, then minimax the resulting two values (losses and wins). • Both routes produce the same losses and wins at every level.
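A tiny check of the equivalence sketched above: because thresholding is a monotone operation, applying it before or after max/min gives the same losses and wins. The threshold value is only illustrative (roughly c_b for branching factor 2 under the reading above).

```python
import random

threshold = 0.382   # illustrative threshold (roughly c_b for branching factor 2)
leaves = [random.random() for _ in range(4)]

# Route 1: minimax the real values, then apply the threshold at the root.
root_real = max(min(leaves[0], leaves[1]), min(leaves[2], leaves[3]))
route1 = root_real >= threshold                    # True means a win

# Route 2: apply the threshold at the leaves, then minimax the two values.
wins = [v >= threshold for v in leaves]
route2 = max(min(wins[0], wins[1]), min(wins[2], wins[3]))

assert route1 == route2   # thresholding is monotone, so it commutes with max/min
```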
Error at the root / constant real-value error • Plotted: real-value and two-value error at the root. • Real-value error at the lowest level of search: normally distributed noise with standard deviation 0.1.
Error at the bottom / constant real-value error • Plotted: two-value error at the lowest level of search. • Real-value error at the lowest level of search: normally distributed noise with standard deviation 0.1.
Error at the bottom / constant two-value error • Plotted: real-value error at the lowest level of search. • Two-value error at the lowest level of search: 0.1.
Error at the root / constant two-value error • Plotted: two-value error at the root in our real-value model and in Beal’s model. • Two-value error at the lowest level of search: 0.1. • After a small tweak of Beal’s model, we get a perfect match.
Conclusions from the graphs • If the real-value error at the lowest level of search is constant: • the two-value error at the lowest level of search decreases with the depth of search; • no pathology. • If the two-value error at the lowest level of search is constant: • the real-value error at the lowest level of search increases with the depth of search; • pathology. • Which assumption is the right one?
Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation
Should real or two values be constant? (1) • Already explained why real values are necessary. • The real-value error most naturally represents the fallibility of the heuristic evaluation function. • Game-playing programs do not use two-valued evaluation functions, but if they did: • they would more often make a mistake in uncertain positions close to the threshold; • they would rarely make a mistake in certain positions far from the threshold.
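A back-of-the-envelope illustration of the last two bullets, assuming the real-value error is Gaussian noise with standard deviation 0.1 as in the model: the probability of a two-valued mistake depends only on how far the true value lies from the threshold.

```python
from math import erf, sqrt

def mistake_probability(true_value, threshold, sigma=0.1):
    """Probability that a heuristic estimate (true value plus Gaussian
    noise with standard deviation sigma) lands on the wrong side of
    the threshold, i.e. that a loss is mistaken for a win or vice versa."""
    z = abs(true_value - threshold) / sigma
    return 0.5 * (1 - erf(z / sqrt(2)))            # = Phi(-z)

print(mistake_probability(0.40, 0.382))   # near the threshold: mistaken ~43% of the time
print(mistake_probability(0.90, 0.382))   # far from the threshold: practically never
```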