CGW 2007 Factors Affecting Diminishing Returns for Searching Deeper Matej Guid and Ivan Bratko
Introduction Deep-search behaviour and diminishing returns for additional search in chess have long been burning issues in the game-playing scientific community. Two different approaches have emerged in the rich history of research on this topic: self-play and go-deep. Self-play experiments: two otherwise identical programs are matched against each other, one of them with a handicap; usually the handicap is a reduced search depth. Go-deep experiments: a set of positions is searched to increasing depths, and changes of the best move between successive search depths are observed.
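To make the go-deep procedure concrete, here is a minimal recording sketch, assuming the python-chess library and some UCI engine binary; the engine path, the input FENs, and the 2–12 ply depth range are illustrative and not tied to the original Crafty/Rybka setup.

```python
# Minimal go-deep recording sketch (illustrative; assumes python-chess and a UCI engine).
import chess
import chess.engine

ENGINE_PATH = "/path/to/uci-engine"   # assumption: any UCI engine binary
DEPTHS = range(2, 13)                 # search each position to 2..12 plies

def record_best_moves(fens):
    """For every position, record the best move B(d) reported at each depth d."""
    results = {}
    engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
    try:
        for fen in fens:
            board = chess.Board(fen)
            best_by_depth = {}
            for d in DEPTHS:
                info = engine.analyse(board, chess.engine.Limit(depth=d))
                best_by_depth[d] = info["pv"][0].uci()   # first move of the PV = B(d)
            results[fen] = best_by_depth
    finally:
        engine.quit()
    return results
```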
Go-deep approach Go-deep experiments were introduced to determine the expectation of a new best move being discovered by searching one ply deeper. They are based on Newborn's discovery that the results of self-play experiments are closely correlated with the rate of best-move changes. Newborn's hypothesis (1985): the rating improvement RI(d+1) gained by increasing the search depth by one ply is determined by BC(d+1), the expectation of finding a new best move at the next ply. Although there were some objections to this relation, determining best-move changes has been used consistently in several later experiments.
Go-deep experiments In 1997, Phoenix (Schaeffer) and The Turk (Junghanns et al.) were used to record best-move changes at iteration depths of up to 9 plies. In the same year, Hyatt and Newborn let Crafty search up to 14 plies. Heinz (1998) repeated their go-deep experiment with DarkThought. Diminishing returns for additional search effort: all these experiments were performed on somewhat limited datasets, and they did NOT provide conclusive empirical evidence that the best-move changes decrease continuously with increasing search depth.
Search and Knowledge An interesting go-deep experiment was performed by Sadikov and Bratko in 2006. Very deep searches were made possible by concentrating on chess endgames with a limited number of pieces. The results confirmed the existence of diminishing returns in chess. More importantly, they showed that the amount of knowledge a program has influences when diminishing returns start to manifest themselves.
Going deeper A remarkable follow-up on previous work on deep-search behaviour using chess programs was published in 2005 by Steenhuisen. Crafty was used to repeat the go-deep experiments on positions taken from previous experiments, pushing the search horizon to 20 plies. A set of 4,500 positions was also searched up to 18 plies. • The results showed that the chance of new best moves being discovered: • decreases exponentially when searching to higher depths, • decreases faster for positions closer to the end of the game. • Steenhuisen also reported that the speed with which the best-change rate decreases depends on the test set used.
Different test sets – different results How can one rely on statistical evidence from different go-deep experiments if they so clearly depend on the dataset used? We address this issue and investigate the hypothesis that the rate at which returns diminish depends on the value of the position. Diminishing returns revisited again • A large dataset of more than 40,000 positions taken from real games has been used. • Go-deep experiments with the programs Crafty and Rybka were conducted. • We show that the chance of new best moves being discovered at higher depths depends on: • the values of the positions in the dataset, • the quality of the evaluation function of the program used, and to some extent also on • the phase of the game and the amount of material on the board.
Go-deep design The chess programs Crafty and Rybka were used to analyse more than 40,000 positions from real games played in World Championship matches. Each position occurring in these games after move 12 was searched to depths ranging from 2 to 12 plies. • For the measurements we use the same definitions as provided by Heinz and Steenhuisen, where B(d) denotes the best move found at depth d: • Best Change: B(d) ≠ B(d-1) • Fresh Best: B(d) ≠ B(j) for all j < d • (d-2) Best: B(d) = B(d-2) and B(d) ≠ B(d-1) • (d-3) Best: B(d) = B(d-3) and B(d) ≠ B(d-2) and B(d) ≠ B(d-1) The estimated probabilities (in %) were obtained for each measurement of best change. In each experiment, the original test set was divided into subsets based on the values of the positions.
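A small sketch of how these four measurements can be counted from the recorded best moves B(d); the input format (one dict {depth: best move} per position, depths 2–12) is an assumption carried over from the recording sketch above. Fresh Best, (d-2) Best, and (d-3) Best all imply a Best Change, so they are counted only when the best move has changed; dividing their counts by the Best Change count instead of the number of positions yields the conditional rates reported later.

```python
# Sketch of the Best Change / Fresh Best / (d-2) Best / (d-3) Best rates (in %).
def go_deep_rates(all_best_by_depth, depths=range(3, 13)):
    counts = {d: {"best_change": 0, "fresh_best": 0, "d2_best": 0, "d3_best": 0}
              for d in depths}
    n = len(all_best_by_depth)
    for best in all_best_by_depth:                     # best: dict {depth: B(depth)}
        for d in depths:
            if best[d] != best[d - 1]:                 # Best Change
                counts[d]["best_change"] += 1
                if all(best[d] != best[j] for j in best if j < d):
                    counts[d]["fresh_best"] += 1       # Fresh Best
                if d - 2 in best and best[d] == best[d - 2]:
                    counts[d]["d2_best"] += 1          # (d-2) Best
                if d - 3 in best and best[d] == best[d - 3] and best[d] != best[d - 2]:
                    counts[d]["d3_best"] += 1          # (d-3) Best
    return {d: {k: 100.0 * v / n for k, v in c.items()} for d, c in counts.items()}
```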
Crafty goes deep Several researchers have used Crafty for their go-deep experiments; however, none had such a large set of test positions at their disposal. Steenhuisen observed the deep-search behaviour of Crafty on different test sets and reported different best-change rates and different decreases of the best-change rate for different test sets. Division into subsets: we divided the original test set into six subsets, based on the values of the positions. Evaluations obtained at depth 12 served as the best available approximations of the "real" values of the positions.
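As an illustration of such a division, the sketch below buckets positions into six groups by their depth-12 evaluation. The band boundaries (in pawns) are my own illustrative choice, not the paper's exact group definitions; only the -0.50 to 0.50 "approximately equal" band (Group 4) and the clearly won band (Group 6) follow the description in the text.

```python
# Illustrative division into six value groups by depth-12 evaluation (in pawns).
# The exact boundaries are an assumption; only Group 4 (-0.50..0.50, approximately
# equal) and Group 6 (won) follow the description in the text.
GROUPS = [
    (1, "lost",                float("-inf"), -2.00),
    (2, "clearly worse",       -2.00,         -1.00),
    (3, "slightly worse",      -1.00,         -0.50),
    (4, "approximately equal", -0.50,          0.50),
    (5, "clearly better",       0.50,          2.00),
    (6, "won",                  2.00,  float("inf")),
]

def assign_group(eval_d12):
    """Map a depth-12 evaluation to one of the six illustrative groups."""
    for number, label, lo, hi in GROUPS:
        if lo <= eval_d12 < hi:
            return number, label
    return GROUPS[-1][0], GROUPS[-1][1]
```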
Best Change and Fresh Best behaviour (1) Results of Crafty for the approximately equal positions of Group 4. The rates for Fresh Best are given as conditional on the occurrence of Best Change. Both the Best Change and Fresh Best rates decrease consistently with increasing search depth.
Best Change and Fresh Best behaviour (2) Results of Crafty for the won positions of Group 6. The Best Change and Fresh Best rates again decrease consistently with increasing search depth; however, they decrease faster than in the subset of approximately equal positions.
Rybka goes deep Rybka is currently the strongest chess program according to the SSDF rating list, rated more than 250 points higher than Crafty. The results confirm that best-change rates depend on the values of the positions, and demonstrate that the chance of new best moves being discovered is lower at all depths than for Crafty. The subsets were again formed from positions in different ranges of evaluations, obtained with Rybka at depth 12.
Diminishing returns and phase of the game The experiments in this and the following section were performed with Crafty, on more or less balanced positions with depth-12 evaluations in the range between -0.50 and 0.50. The results: there is no obvious correlation between the move number and the chance of new best moves being discovered at higher depths. Nevertheless, in positions very close to the end of the game this chance decreases faster than in the positions of the other groups. Six subsets of positions from different phases of the game, with evaluations between -0.50 and 0.50 obtained at search depth 12, were used.
Diminishing returns and material The phase of the game is closely correlated with the amount of material on the board, so one might expect the best-change rates to be lower in positions with fewer pieces on the board. Pawns are counted in, and the commonly accepted piece values were used (queen = 9, rook = 5, bishop = 3, knight = 3, pawn = 1). The results: material and best-move changes are NOT clearly correlated. Only the curve for positions with a total piece value of less than 15 points of material (for each of the players) deviates slightly from the others. Six subsets of positions with different amounts of material on the board (each player starts with 39 points) were formed, with evaluations obtained at depth 12.
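For reference, a small sketch of the material count used above, assuming the python-chess library; kings are not counted, and with the stated values each side starts with 39 points.

```python
# Material count with the commonly accepted piece values (kings excluded).
import chess

PIECE_VALUES = {chess.QUEEN: 9, chess.ROOK: 5, chess.BISHOP: 3,
                chess.KNIGHT: 3, chess.PAWN: 1}

def material(board, color):
    """Total piece value for one side; each player starts with 39 points."""
    return sum(PIECE_VALUES.get(piece.piece_type, 0)
               for piece in board.piece_map().values()
               if piece.color == color)

# Sanity check on the initial position: 8 + 10 + 6 + 6 + 9 = 39 for each side.
start = chess.Board()
assert material(start, chess.WHITE) == 39 and material(start, chess.BLACK) == 39
```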
Possible applications The new findings are not only of theoretical importance; in particular, knowing that the quality of the evaluation function influences diminishing returns can be very useful for practical purposes. Comparing evaluation functions of the programs: while there are obvious ways to compare the strength of programs, so far there has been no way of evaluating the strength of their evaluation functions. Observing best-change rates in the evaluations of different programs seems to provide such a possibility. Adjusting weights of attributes in evaluation functions: instead of computationally expensive self-play approaches, evaluating an appropriate set of positions at different depths could lead to the desired results. Possible approach: optimisation of the weights guided by a score function based on diminishing returns (a rough sketch follows below).
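To illustrate the last point, here is a rough, hypothetical sketch of what such a tuning loop could look like; analyse_best_change_rates and the candidate weight sets are placeholders of my own, not anything described in the paper, and the scoring criterion is purely illustrative.

```python
# Hypothetical weight-tuning sketch guided by a diminishing-returns score.
def diminishing_returns_score(best_change_rate_by_depth, deep_depths=(10, 11, 12)):
    """Illustrative criterion: lower best-change rates at the deepest plies are
    taken as a sign of a better-informed evaluation function."""
    return -sum(best_change_rate_by_depth[d] for d in deep_depths)

def tune_weights(candidate_weight_sets, positions, analyse_best_change_rates):
    """Pick the weight set whose go-deep behaviour shows the fastest diminishing returns.

    analyse_best_change_rates(weights, positions) is a hypothetical hook that runs
    the engine with the given weights and returns best-change rates per depth."""
    best_score, best_weights = float("-inf"), None
    for weights in candidate_weight_sets:
        rates = analyse_best_change_rates(weights, positions)
        score = diminishing_returns_score(rates)
        if score > best_score:
            best_score, best_weights = score, weights
    return best_weights
```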
Conclusions Deep-search behaviour and the phenomenon of diminishing returns for additional search effort have been studied by several researchers, and different results were obtained on the different datasets used in go-deep experiments. In this contribution we studied some factors that affect diminishing returns for searching deeper. • The results obtained on a large set of more than 40,000 positions from real chess games, using the programs Crafty and Rybka, show that diminishing returns depend on: • the values of the positions in the dataset, • the quality of the evaluation function of the program used, and to some extent also on • the phase of the game, and the amount of material on the board.