Explore a new explanation of why minimax works effectively in game-tree search, challenging traditional assumptions and offering a real-valued minimax model to overcome inherent evaluation errors. Dive into the strategies of game-playing programs and the complexities of heuristic evaluation functions.
Why Minimax Works: An Alternative Explanation. Mitja Luštrek¹, Ivan Bratko² and Matjaž Gams¹. ¹Jožef Stefan Institute, Department of Intelligent Systems; ²University of Ljubljana, Faculty of Computer and Information Science
Plan of talk • Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation
Search in game trees • Levels of the tree alternate: us to move, them to move
Search in game trees • v1 and v2 are evaluated heuristically, with some error • Backed up: v = max(v1, v2) = true value + error
Searching deeper • Evaluate heuristically at the bottom of the search • Back up level by level to the root • Is the root value now more trustworthy?
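To make the scheme on the last two slides concrete, here is a minimal Python sketch (not from the paper): the leaves of a small hand-made tree stand for the search horizon, they are evaluated with normally distributed error, and the noisy values are backed up through alternating max and min levels. The nested-list tree, the noise level and the function name are illustrative assumptions.

```python
import random

def minimax(node, maximizing, noise=0.1):
    """Toy search on a game tree given as nested lists. Leaves stand
    for the search horizon: they hold true values and are 'evaluated'
    with normally distributed error; the noisy values are then backed
    up through alternating max and min levels."""
    if not isinstance(node, list):                # leaf: heuristic evaluation
        return node + random.gauss(0.0, noise)
    values = [minimax(child, not maximizing, noise) for child in node]
    return max(values) if maximizing else min(values)

# Root is a max node ("us to move"); its children are min nodes ("them to move").
tree = [[0.32, 0.41], [0.70, 0.15]]
print(minimax(tree, maximizing=True))
```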
The minimax pathology • Conventional wisdom in game-playing practice: the deeper a program searches, the better it plays. • Early mathematical analyses of minimax [Nau 1979; Beal 1980] produced some surprising results. • According to these theoretical models, minimax SHOULD NOT work! • Minimaxing amplifies the error of the heuristic evaluation function. • The deeper a game-playing program searches, the worse it plays! • Nau: “Performing worse by working harder!” • This is called the minimax pathology.
Problems with these theoretical analyses • General impression: Something must have been wrong with these analyses! • Were the assumptions made by these mathematical models realistic?
Beal’s early assumptions • 1. Uniform branching factor • 2. Position values are binary: loss or win • 3. The proportion of wins for the side to move is constant throughout the game tree • 4. Position values within a level are independent of each other • 5. Static heuristic evaluation error is independent of the depth of the node • Note: none of these looks very unrealistic
Modifying Beal’s assumptions • Subsequent analyses by various authors modified the assumptions so that the pathology disappeared
Successful attempts to explain the pathology • The pathology disappeared when assuming: • positions close to each other have similar values [Bratko and Gams 1982; Beal 1982; Nau 1982; Scheucher and Kaindl 1998; Luštrek 2004] • early terminations [Pearl 1983] • a geometrically distributed branching factor [Michon 1983] • Which explanation is most natural? • Which conditions for the absence of the pathology are really necessary?
This paper: an alternative explanation • Is there a more fundamental explanation, one that makes the fewest assumptions? Is there something fundamental about the minimax relation that makes minimaxing successful in practice? • This paper: yes, there is! • It has to do with Beal’s assumption 5. • This paper: estimate positions with real values (as game-playing programs do); surprisingly, Beal’s assumption 5 then turns out not to be tenable!
Two-value and real-value errors • Two-value error: mistaking a loss for a win, or vice versa • Real-value error: the numerical difference between the heuristic and the true value (e.g. 0.32 instead of 0.41)
Beal’s assumption 5: P(two-value heuristic error) is constant with level • Our assumption: the distribution of the real-value heuristic error (noise) is constant with level • (Figure: P(binary error) and real-value error noise plotted against depth)
Summary of this paper’s findings • Our assumption: real-value error distribution at bottom level of search is constant throughout game tree • Beal’s assumption 5: two-value error distribution at bottom level of search is constant throughout game tree • These assumptions look equivalent, BUT surprisingly they are NOT! • These assumptions are in fact INCOMPATIBLE: minimax relation between true values of positions in game tree does not permit both two-value error and real-value error to be constant
Summary of this paper’s findings • When real-value heuristic error is constant, backed-up heuristic values become more reliable with increased depth of search - i.e. no pathology! • Corresponding backed-up binary values also become more reliable with depth (binary values obtained from real values through thresholding)
Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation
Why multiple/real values? • Necessary in games where the final outcome is multi-valued (Othello, tarok). • Used by humans and game-playing programs. • Seem unnecessary in games where the outcome is a loss, a win or perhaps a draw (chess, checkers). • But: • in a losing position against a fallible and unknown opponent, the outcome is uncertain; • in a winning position, a perfect two-valued evaluation function will not lose, but it may never win, either. • Multiple values are required to model uncertainty and to maintain a direction of play towards an eventual win.
A real-valued minimax model • Aims to be a real-valued version of Beal’s model: • 1. uniform branching factor; • 2. position values are real numbers; • 3. if the real values are converted to losses and wins, the proportion of losses for the side to move is constant throughout the game tree; • 4. position values within a level are independent of each other; • 5. static heuristic evaluation error is independent of the depth of the node (the error is normally distributed noise).
Building of a game tree • True values at the leaves: distributed uniformly in [0, 1] • True values backed up level by level to the root • Search to a chosen depth • Heuristic values at the search depth = true values + normally distributed noise • Heuristic values backed up to the root
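A minimal Python sketch of how such a tree could be generated under the assumptions above: uniform branching, leaf values uniform in [0, 1], true values backed up by minimax, Gaussian noise added at the search horizon. The function names, the level-by-level representation and the decision not to clip noisy values to [0, 1] are my own choices, not the authors' code.

```python
import random

def build_true_values(branching, depth):
    """True values for a uniform game tree, stored level by level
    (level 0 = root, 'we' move at even levels): leaves are drawn
    uniformly from [0, 1], then backed up by minimax."""
    levels = [None] * (depth + 1)
    levels[depth] = [random.random() for _ in range(branching ** depth)]
    for d in range(depth - 1, -1, -1):
        op = max if d % 2 == 0 else min
        levels[d] = [op(levels[d + 1][i * branching:(i + 1) * branching])
                     for i in range(branching ** d)]
    return levels

def backed_up_heuristic(levels, search_depth, branching, noise=0.1):
    """Root estimate after searching to search_depth: heuristic values
    at the horizon = true values + Gaussian noise (not clipped to
    [0, 1]), backed up by minimax to the root."""
    values = [v + random.gauss(0.0, noise) for v in levels[search_depth]]
    for d in range(search_depth - 1, -1, -1):
        op = max if d % 2 == 0 else min
        values = [op(values[i * branching:(i + 1) * branching])
                  for i in range(branching ** d)]
    return values[0]
```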
What we do with our model • Monte Carlo experiments: • generate 10,000 sets of true values; • generate 10 sets of heuristic values per set of true values per depth of search. • Measure the error at the root: • real-value error = the average difference between the true value and the heuristic value; • two-value error = the frequency of mistaking a loss for a win or vice versa. • Compare the error at the root when searching to different depths.
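One possible way to run the Monte Carlo measurement described on this slide, continuing the sketch above (it reuses build_true_values and backed_up_heuristic); the threshold argument is the loss/win boundary discussed on the next slide, and the trial counts follow the slide.

```python
import statistics

def root_errors(branching, tree_depth, search_depth, threshold,
                n_trees=10_000, n_evals=10, noise=0.1):
    """Monte Carlo estimate of the real-value and two-value error at
    the root for one search depth, reusing build_true_values and
    backed_up_heuristic from the sketch above."""
    real_errors, two_value_mistakes = [], []
    for _ in range(n_trees):
        levels = build_true_values(branching, tree_depth)
        true_root = levels[0][0]
        for _ in range(n_evals):
            est = backed_up_heuristic(levels, search_depth, branching, noise)
            real_errors.append(abs(est - true_root))
            # The root is a max node: a value below the threshold means a loss.
            two_value_mistakes.append((est < threshold) != (true_root < threshold))
    return statistics.mean(real_errors), statistics.mean(two_value_mistakes)
```

Comparing the two returned quantities across several values of search_depth gives the kind of comparison the slide describes.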
Conversion of real values to losses and wins (1) • To measure the two-value error, real values must be converted to losses and wins. • A value above a threshold means a win, below the threshold a loss. • At the leaves: • the proportion of losses for the side to move = c_b (because it must be the same at all levels); • real values are distributed uniformly in [0, 1]; • therefore the threshold = c_b. • At higher levels: • minimaxing on real values is equivalent to minimaxing on two values; • therefore the threshold is also c_b.
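A small sketch of this conversion, assuming (my reading of the slide, not stated on it) that c_b is the proportion of losses for the side to move that is invariant under minimaxing with branching factor b, i.e. the solution of c = (1 - c)^b; the bisection routine and the convention that a value equal to or above the threshold counts as a win are illustrative.

```python
def invariant_loss_proportion(branching, tol=1e-12):
    """Proportion of losses for the side to move that stays constant
    under minimaxing with the given branching factor, taken here to be
    the solution c of c = (1 - c)**branching (my reading of c_b).
    Found by bisection; f(c) = (1 - c)**branching - c is decreasing."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (1 - mid) ** branching - mid > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def to_two_valued(value, threshold):
    """Convert a real value in [0, 1] to a loss or a win by thresholding,
    as described on the slide (value above the threshold = win)."""
    return "win" if value >= threshold else "loss"

c_b = invariant_loss_proportion(branching=2)       # about 0.382 for b = 2
print(c_b, to_two_valued(0.41, c_b), to_two_valued(0.32, c_b))
```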
Conversion of real values to losses and wins (2) • One route: minimax the real values up the tree, then apply the threshold. • The other route: apply the threshold at the leaves, then minimax the resulting two values (losses and wins). • Both routes produce the same losses and wins at every level.
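A tiny check of the equivalence sketched above: because thresholding is a monotone operation, applying it before or after max/min gives the same losses and wins. The threshold value is only illustrative (roughly c_b for branching factor 2 under the reading above).

```python
import random

threshold = 0.382   # illustrative threshold (roughly c_b for branching factor 2)
leaves = [random.random() for _ in range(4)]

# Route 1: minimax the real values, then apply the threshold at the root.
root_real = max(min(leaves[0], leaves[1]), min(leaves[2], leaves[3]))
route1 = root_real >= threshold                    # True means a win

# Route 2: apply the threshold at the leaves, then minimax the two values.
wins = [v >= threshold for v in leaves]
route2 = max(min(wins[0], wins[1]), min(wins[2], wins[3]))

assert route1 == route2   # thresholding is monotone, so it commutes with max/min
```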
Error at the root / constant real-value error • Plotted: real-value and two-value error at the root. • Real-value error at the lowest level of search: normally distributed noise with standard deviation 0.1.
Error at the bottom / constant real-value error • Plotted: two-value error at the lowest level of search. • Real-value error at the lowest level of search: normally distributed noise with standard deviation 0.1.
Error at the bottom / constant two-value error • Plotted: real-value error at the lowest level of search. • Two-value error at the lowest level of search: 0.1.
Error at the root / constant two-value error • Plotted: two-value error at the root in our real-value model and in Beal’s model. • Two-value error at the lowest level of search: 0.1. • After a small tweak of Beal’s model, we get a perfect match.
Conclusions from the graphs • If the real-value error at the lowest level of search is constant: • the two-value error at the lowest level of search decreases with the depth of search; • no pathology. • If the two-value error at the lowest level of search is constant: • the real-value error at the lowest level of search increases with the depth of search; • pathology. • Which assumption is the right one?
Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation
Should real or two values be constant? (1) • Already explained why real values are necessary. • The real-value error most naturally represents the fallibility of the heuristic evaluation function. • Game-playing programs do not use two-valued evaluation functions, but if they did: • they would more often make a mistake in uncertain positions close to the threshold; • they would rarely make a mistake in certain positions far from the threshold.
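A back-of-the-envelope illustration of the last two bullets, assuming the real-value error is Gaussian noise with standard deviation 0.1 as in the model: the probability of a two-valued mistake depends only on how far the true value lies from the threshold.

```python
from math import erf, sqrt

def mistake_probability(true_value, threshold, sigma=0.1):
    """Probability that a heuristic estimate (true value plus Gaussian
    noise with standard deviation sigma) lands on the wrong side of
    the threshold, i.e. that a loss is mistaken for a win or vice versa."""
    z = abs(true_value - threshold) / sigma
    return 0.5 * (1 - erf(z / sqrt(2)))            # = Phi(-z)

print(mistake_probability(0.40, 0.382))   # near the threshold: mistaken ~43% of the time
print(mistake_probability(0.90, 0.382))   # far from the threshold: practically never
```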