DEEP BLUE
Motivation The quest of a computer scientist is to build a machine that can match a human mind. One of the central conundrums of the last 40 years in AI research is that problems we thought were hard turned out to be fairly easy, and that problems we thought were easy have turned out to be profoundly difficult. One field where a major breakthrough has been achieved is chess playing. Deep Blue is the culmination of a multi-year effort to build a world-class chess machine.
Rich history of cumulative ideas • Minimax search, evaluation function learning (1950). • Alpha-beta search (1966). • Transposition tables (1967). • Iterative-deepening DFS (1975). • Endgame databases, singular extensions (1977, 1980). • Parallel search and evaluation (1983, 1985). • Circuitry (1987).
Chess Chess as a game has long fascinated academia. Alan Turing is known to have developed a chess-playing algorithm (never implemented). A vast collection of defences, attacks, gambits and other lines has been deeply studied and plays a major role in Deep Blue's strategy through its "opening book". Deep Blue's evaluation function recognizes around 8000 different features, each based on heuristics designed by players over years of experience.
The teams The Deep Blue team consisted of Feng-hsiung Hsu and Murray Campbell, veterans of the ChipTest and Deep Thought days at Carnegie Mellon University, and IBM additions C.J. Tan, Joseph Hoane, and Jerry Brody, vs. Garry Kasparov.
Match History The first match (six games) between Deep Blue and Garry Kasparov was played in February 1996 in Philadelphia, Pennsylvania. Result: 4-2 (Kasparov). The rematch was held May 3-11, 1997, at the Equitable Center in downtown Manhattan. Result: 3.5-2.5 (Deep Blue). The rematch witnessed the shortest game between man and machine at this level.
Chinese wall effect This is how humans excel at chess.
System overview Deep Blue is a massively parallel system designed for carrying out chess game-tree searches (smart brute force). The system is composed of a 30-node (30-processor) IBM RS/6000 SP computer and 480 single-chip chess search engines, with 16 chess chips per SP processor. Deep Blue is organized in three layers. One of the SP processors is designated as the master, and the remainder as workers. The master searches the top levels of the chess game tree, and then distributes "leaf" positions to the workers for further examination. The workers carry out a few levels of additional search, and then distribute their leaf positions to the chess chips, which search the last few levels of the tree.
Prelude • All fixed-ply algorithms suffer from the "horizon effect"; techniques such as "singular extensions" help to overcome it. • Deep chess knowledge lies behind the evaluation function, which scores the difference (between player and opponent) of: • Material • Mobility • King position • Bishop pair • Rook pair • Open rook files • Control of central positions.
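The feature list above can be read as a weighted sum of differences. A minimal sketch, assuming made-up weights and feature counts (the names `WEIGHTS` and `evaluate` are illustrative, and Deep Blue's real weights are not given here):

```python
# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"material": 100, "mobility": 2, "bishop_pair": 30}

def evaluate(own, opp):
    """Score a position as the weighted difference of feature counts."""
    return sum(w * (own.get(f, 0) - opp.get(f, 0)) for f, w in WEIGHTS.items())

# Made-up feature counts for the two sides.
white = {"material": 39, "mobility": 30, "bishop_pair": 1}
black = {"material": 38, "mobility": 25, "bishop_pair": 0}

print(evaluate(white, black))   # 140
print(evaluate(black, white))   # -140
```

Swapping the arguments flips the sign, which is the property the NegaMax formulation relies on.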
Prelude "Transpositions" in the game: the same board position may be reached by playing different sequences of moves. This property can be used for optimization, since a position that has already been searched need not be searched again.
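A minimal sketch of the idea, using tic-tac-toe instead of chess so that it stays self-contained. The table is a plain dictionary keyed by position; the names `search`, `table` and `stats` are illustrative, not Deep Blue's:

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def search(board, player, table, stats):
    """Value for `player` (the side to move): +1 win, 0 draw, -1 loss.

    `table` is the transposition table; pass None to search without one.
    """
    key = (board, player)
    if table is not None and key in table:
        return table[key]              # position already searched via another move order
    stats["nodes"] += 1
    other = "O" if player == "X" else "X"
    if winner(board) == other:         # the previous move won the game
        value = -1
    elif "." not in board:             # board full, nobody won
        value = 0
    else:
        value = max(
            -search(board[:i] + player + board[i + 1:], other, table, stats)
            for i in range(9) if board[i] == "."
        )
    if table is not None:
        table[key] = value
    return value

plain, cached = {"nodes": 0}, {"nodes": 0}
value_plain = search("." * 9, "X", None, plain)
value_cached = search("." * 9, "X", {}, cached)
print(value_plain, value_cached)           # both 0: perfect play is a draw
print(plain["nodes"], cached["nodes"])     # the table visits far fewer nodes
```

Real chess programs usually key the table on a hash of the position rather than the position itself; the dictionary here only illustrates the reuse.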
Chess Game Tree • A game of chess can be considered as a large n-ary tree. • The chess tree is very bushy (usually about 35 branches from each position) and very deep. • One way to search the complete tree is to stop at nodes only when some player wins. • Obviously, searching each and every node recursively takes a lot of time and space. • So Shannon proposed limiting how many moves ahead of the current position should be searched. • He used MinMax.
MinMax Suppose that, at the root position, it is White's turn to move. White wants as positive a score as possible, so the move with the largest score is selected as best; this is what the "Max" function does. The "Min" function works in reverse: it is called when it is Black's turn to move, and Black wants a more negative score, so the move with the most negative score is selected. These functions are mutually recursive, meaning that they call each other until the desired search depth is reached. When the functions "bottom out", they return the result of the "Evaluate" function.
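The two mutually recursive functions can be sketched on a tiny hand-built tree. The `Node` class and its scores are assumptions made for the example; scores are from White's point of view:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    score: int = 0                                   # static score, White's view
    children: list = field(default_factory=list)     # positions after each move

def evaluate(node):
    """Stand-in for the 'Evaluate' function: just read the stored score."""
    return node.score

def max_value(node, depth):
    """White to move: pick the child with the largest score."""
    if depth == 0 or not node.children:
        return evaluate(node)
    return max(min_value(child, depth - 1) for child in node.children)

def min_value(node, depth):
    """Black to move: pick the child with the smallest score."""
    if depth == 0 or not node.children:
        return evaluate(node)
    return min(max_value(child, depth - 1) for child in node.children)

root = Node(children=[
    Node(score=4, children=[Node(score=3), Node(score=5)]),
    Node(score=1, children=[Node(score=2), Node(score=9)]),
])

print(max_value(root, 2))   # 3: max(min(3, 5), min(2, 9))
print(max_value(root, 1))   # 4: search bottoms out one level earlier
```

The second call shows how the answer depends on where the search bottoms out.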
NegaMax • Nega-max is just min-max written in a more compact form. • The "Evaluate" function returns scores that are positive if the side to move at the current node is ahead, and everything else is also viewed from the perspective of the side to move. • When the value is returned, it is negated, because it is now being viewed from the perspective of the other side. • This function traverses the same nodes as "min-max" in the same order, and produces the same result. • It's much less code, which means that there is less opportunity to create a bug due to code replication, and the code is easier to maintain.
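A sketch of the single-function form, under the same assumptions as a hand-built tree with scores stored from White's point of view. The `color` argument (+1 for White, -1 for Black) is an illustrative way of turning those scores into side-to-move scores:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    score: int = 0                                   # static score, White's view
    children: list = field(default_factory=list)

def negamax(node, depth, color):
    """Return the value of `node` from the side to move's point of view."""
    if depth == 0 or not node.children:
        return color * node.score                    # Evaluate, seen by the side to move
    # Each child's value is seen by the opponent, so negate it.
    return max(-negamax(child, depth - 1, -color) for child in node.children)

root = Node(children=[
    Node(children=[Node(score=3), Node(score=5)]),
    Node(children=[Node(score=2), Node(score=9)]),
])

print(negamax(root, 2, 1))    # 3, the same answer min-max gives with White to move
print(negamax(root, 2, -1))   # -5, i.e. 5 for White when Black moves first
```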
Alpha-Beta The AlphaBeta search procedure gets two additional arguments: • Alpha, the best score that can be forced by some means. Anything less than or equal to alpha is no improvement, because there is a strategy that is known to result in a score of alpha. • Beta, the worst-case scenario for the opponent. It's the worst thing that the opponent has to endure, because it's known that there is a way for the opponent to force a situation no worse than beta, from the opponent's point of view.
Alpha-Beta • Fail-low: if a move results in a score that is less than or equal to alpha, it was just a bad move and it can be forgotten about, since there is known to be a strategy that gets the moving side a position valued at alpha. • Fail-high: if a move results in a score that is greater than or equal to beta, this whole node is trash, since the opponent is not going to let the side to move achieve this position, because there is some choice the opponent can make that will avoid it. • If a move results in a score that is greater than alpha but less than beta, this is the move that the side to move is going to plan to play, unless something changes later on. So alpha is increased to reflect this new value.
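The three cases can be sketched in nega-max form on a nested-list tree, where integers are leaf scores from White's point of view. The tree and the `visited` list are assumptions added to show which leaves get skipped:

```python
INF = float("inf")

def alphabeta(node, alpha, beta, color, visited):
    """Nega-max alpha-beta; `visited` records the leaves actually evaluated."""
    if isinstance(node, int):            # leaf: apply the evaluator
        visited.append(node)
        return color * node
    for child in node:
        score = -alphabeta(child, -beta, -alpha, -color, visited)
        if score >= beta:
            return beta                  # fail-high: the opponent will avoid this node
        if score > alpha:
            alpha = score                # new best move inside the window
        # otherwise fail-low: the move is forgotten
    return alpha

tree = [[3, 5], [2, 9]]
visited = []
print(alphabeta(tree, -INF, INF, 1, visited))   # 3
print(visited)                                  # [3, 5, 2]: leaf 9 is never looked at
```

After the first branch proves a score of 3, the leaf 2 is enough to refute the second branch, so its sibling 9 is pruned.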
Problem with fixed-depth searches: if we only search n moves ahead, a catastrophe may be pushed beyond the search depth by a sequence of moves that make no progress. The problem also works in the other direction (good moves may not be found).
Horizon Effect The problem with abruptly stopping a search at a fixed depth is called the 'horizon effect'. • The negative horizon effect: MAX may try to avoid a bad situation which is actually inevitable. • The positive horizon effect: MAX may not realise that something good is achievable.
Quiescence Search This involves searching past the terminal search nodes (depth of 0) and testing all the non-quiescent or 'violent' moves until the situation becomes calm, and only then applying the evaluator. It enables programs to detect long capture sequences and calculate whether or not they are worth initiating. Searches are expanded to avoid evaluating a position where tactical disruption is in progress.
Quiescence Search • Which moves are likely to cause a drastic change in the balance of power on the board? • Material balance tends to be the overwhelming consideration in the evaluator, so anything that changes material is fair game: captures (especially those of major pieces) and pawn promotions certainly qualify; checks may also be worth a look. • Quiescence search considers extremely narrow, but dangerous, lines.
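A minimal sketch of a capture-only quiescence search. The `Position` class, its scores (in centipawns, from the side to move's point of view) and the example position are assumptions made for illustration:

```python
from dataclasses import dataclass, field

INF = float("inf")

@dataclass
class Position:
    static_eval: int                                 # score seen by the side to move
    captures: list = field(default_factory=list)     # positions after each capture

def quiesce(pos, alpha, beta):
    """Search captures only, until the position is calm."""
    stand_pat = pos.static_eval          # score if the side to move stops capturing
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for after_capture in pos.captures:
        score = -quiesce(after_capture, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

# White's queen has just taken a pawn, so Black (to move) is statically down 100.
# But Black can recapture the queen; White is then down 800 with nothing left to take.
after_recapture = Position(static_eval=-800)
horizon = Position(static_eval=-100, captures=[after_recapture])

print(horizon.static_eval)             # -100: what a fixed-depth search would report
print(quiesce(horizon, -INF, INF))     # 800: the queen is lost once captures are resolved
```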
Quiescence Search explosion • If any capture is allowed, and captures are searched in any old order, you'll destroy the efficiency of the search and create a quiescence search explosion. • This will result in dramatically reduced depth and may cause a program crash. • A couple of ways of trying to avoid a quiescence explosion are: • MVV/LVA (Most Valuable Victim/Least Valuable Attacker): a move-ordering technique that searches the best capture first. • SEE (Static Exchange Evaluation): improves move ordering and allows "bad" capturing moves to be pruned, without many important captures being pruned erroneously.
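MVV/LVA ordering can be sketched as a sort key. The piece values below are the conventional textbook ones, used here as an assumption, and captures are written as (attacker, victim) pairs for brevity:

```python
# Conventional piece values (an assumption; any monotone scale would do).
PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 100}

def mvv_lva_key(capture):
    """Sort key: most valuable victim first, then least valuable attacker."""
    attacker, victim = capture
    return (-PIECE_VALUE[victim], PIECE_VALUE[attacker])

captures = [("Q", "P"), ("P", "Q"), ("N", "R"), ("R", "Q")]
ordered = sorted(captures, key=mvv_lva_key)
print(ordered)   # [('P', 'Q'), ('R', 'Q'), ('N', 'R'), ('Q', 'P')]
```

Pawn takes queen is tried first and queen takes pawn last, which is the order most likely to produce an early cutoff.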
Forward Pruning: Null Move • Null-move forward pruning is a step performed prior to searching any of the moves. • You ask the question, "If I do nothing here, can the opponent do anything?" • The question is answered by a reduced-depth search (with the opponent to move first). • If that search results in a score >= beta, simply return beta without searching any moves.
Forward Pruning: Null Move • Null-move forward pruning doesn't work in some cases: • in zugzwang, where every available move makes the position worse than passing would; • in endgames, where zugzwang is common. • Null move during search has several advantages related to speed and accuracy: • A null-move search may consume only 3% of the resources required by a full depth-N examination. • If, in a given position during quiescence search, it is revealed that the null move is better than any capture, this is a position where the evaluation function itself should be applied. • Overall, the null-move heuristic saves between 20% and 75% of the effort required by a given search.
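A sketch of the null-move step inside a nega-max alpha-beta search. The `Node` class, the `after_pass` link (the same position with the opponent to move) and the depth reduction `R = 2` are all assumptions made for the example:

```python
from dataclasses import dataclass, field
from typing import Optional

R = 2   # depth reduction for the null-move search (assumed value)

@dataclass
class Node:
    name: str
    static_eval: int = 0                             # score seen by the side to move
    children: list = field(default_factory=list)
    after_pass: Optional["Node"] = None              # same position, opponent to move

def search(node, depth, alpha, beta, visited, allow_null=True):
    visited.append(node.name)
    if depth <= 0 or not node.children:
        return node.static_eval
    # Null move: do nothing, and let the opponent search at reduced depth.
    if allow_null and node.after_pass is not None:
        score = -search(node.after_pass, depth - 1 - R, -beta, -beta + 1,
                        visited, allow_null=False)
        if score >= beta:
            return beta                  # even passing is good enough: prune
    for child in node.children:
        score = -search(child, depth - 1, -beta, -alpha, visited)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

passed = Node("after-pass", static_eval=-5)          # opponent is 5 down even after a free move
strong = Node("strong", children=[Node("a", -4), Node("b", -6)], after_pass=passed)

visited = []
print(search(strong, 3, 0, 3, visited))   # 3 (beta)
print(visited)                            # ['strong', 'after-pass']: no real move was searched
```

The sketch does nothing to detect zugzwang; a real program would switch the null move off in positions where it is unsafe.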
Order Evaluation The order in which nodes are evaluated is crucial: alpha-beta prunes the most when the best move is searched first, so a good move order is essential for good performance.
Heuristics • Capture moves first. • Forward moves first. • Remember the moves that produced the most cutoffs at each level of the search (Killer Heuristic). • Maintain a table of all possible moves with a history score (History Heuristic).
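These heuristics can be combined into one sort key. A minimal sketch, assuming moves are plain strings, two killer slots per ply and a `depth * depth` history increment (all of these are illustrative choices, not Deep Blue's):

```python
from collections import defaultdict

killers = defaultdict(list)    # ply -> recent moves that caused a cutoff at that ply
history = defaultdict(int)     # move -> accumulated history score

def record_cutoff(move, ply, depth):
    """Call when `move` caused a beta cutoff at `ply` with `depth` remaining."""
    if move not in killers[ply]:
        killers[ply] = ([move] + killers[ply])[:2]   # keep the two latest killers
    history[move] += depth * depth

def order_moves(moves, ply, is_capture):
    """Captures first, then killer moves for this ply, then by history score."""
    def key(move):
        return (not is_capture(move), move not in killers[ply], -history[move])
    return sorted(moves, key=key)

record_cutoff("Qh5", ply=2, depth=3)     # killer at ply 2, history 9
record_cutoff("Nf3", ply=4, depth=2)     # killer at another ply, history 4

moves = ["a3", "Nf3", "exd5", "Qh5"]
ordered = order_moves(moves, ply=2, is_capture=lambda m: "x" in m)
print(ordered)   # ['exd5', 'Qh5', 'Nf3', 'a3']
```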
Minimal Window Search • If we have a good guess about the value of the position, we can further increase the efficiency of Alpha-Beta by starting with a narrower interval than [−∞, +∞]. • Extreme case: the minimal window, β = α + 1. • Possible results: • FAIL HIGH: value ≥ β = α + 1, i.e. value > α. • FAIL LOW: value ≤ α.
NegaScout NegaScout assumes that the first node is best and tests the remaining nodes with a minimal window. If the value of a node is lower (FAIL LOW), we can prune the node. If it FAILS HIGH, we need to re-search its subtree with a bigger window.
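A sketch of the scheme on a nested-list tree with integer leaf scores (White's point of view). The `stats` counter is an assumption added to show when a re-search happens:

```python
INF = float("inf")

def negascout(node, alpha, beta, color, stats):
    if isinstance(node, int):
        return color * node
    first = True
    for child in node:
        if first:
            # The first move is assumed best: search it with the full window.
            score = -negascout(child, -beta, -alpha, -color, stats)
            first = False
        else:
            # Minimal window (alpha, alpha + 1): only asks "is it better than alpha?"
            score = -negascout(child, -alpha - 1, -alpha, -color, stats)
            if alpha < score < beta:
                # Fail-high: re-search with the wider window to get the true value.
                stats["researches"] += 1
                score = -negascout(child, -beta, -alpha, -color, stats)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return alpha

stats = {"researches": 0}
tree = [[3, 5], [2, 9], [4, 6]]
print(negascout(tree, -INF, INF, 1, stats))   # 4
print(stats["researches"])                    # 1: only the third branch beat the first
```

The second branch fails low against the minimal window and costs almost nothing; the third fails high and is searched again. With good move ordering, re-searches are rare.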
Opening Book Contains a set of positions, along with associated recommended moves. The computer selects one of the recommended moves by way of some random mechanism, then plays it without further computation.
Extended Book It was derived automatically from a Grandmaster game database. For each position arising in the first 30 moves, the system computes an evaluation for each move that has been played. That evaluation acts as a bonus: the move is searched with the alpha-beta window offset by the value of the bonus.
Opening Game Deep Blue first checks whether a move is available from the opening book. Finding a move, it plays it immediately. Otherwise, it consults the extended book; if it finds the position there, it uses the evaluation information to award bonuses and penalties to a subset of the available moves. Deep Blue then carries out a search, with some preference for following successful Grandmaster moves. Automatic extended-book: In some situations, where the bonus for a move is unusually large, Deep Blue can make a move without computation.
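The decision order described above can be sketched as follows. All names are hypothetical, positions are plain strings, and the extended-book bonus is simply added to each move's search score, which is a simplification of offsetting the alpha-beta window. The automatic extended-book move is not modelled:

```python
import random

def choose_move(position, opening_book, extended_book, search_scores, rng=random):
    """Pick a move: opening book first, otherwise search with extended-book bonuses."""
    if position in opening_book:
        # Book hit: play one of the recommended moves at random, no search.
        return rng.choice(opening_book[position])
    bonuses = extended_book.get(position, {})    # move -> bonus or penalty
    scores = search_scores(position)             # move -> score from the search
    return max(scores, key=lambda move: scores[move] + bonuses.get(move, 0))

opening_book = {"start": ["e4", "d4"]}
extended_book = {"middle": {"Nf3": 30}}          # made-up bonus for a Grandmaster move

def fake_search(position):
    """Stand-in for the real search: fixed, made-up scores."""
    return {"Nf3": 10, "h4": 25}

book_move = choose_move("start", opening_book, extended_book, fake_search)
searched_move = choose_move("middle", opening_book, extended_book, fake_search)
print(book_move)        # 'e4' or 'd4', chosen at random
print(searched_move)    # 'Nf3': 10 + 30 beats 25
```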
System configuration • Deep Blue was based on an IBM RS/6000 SP supercomputer, which can be viewed as a collection of IBM RS/6000 processors or workstations connected through a high-speed switching network. • Each processor in the system controlled up to 16 chess chips, distributed over two MicroChannel buses (a bus architecture comparable to ISA on the PC-AT). • The 1997 Deep Blue ran on a 30-way machine with 30 RS/6000 processors. • The search occurs in parallel on two levels: • one over the IBM RS/6000 SP switching network; • the other over the MicroChannel bus inside a workstation node. • For a 12-ply search: • The master workstation node searches the first four plies in software. • All 30 nodes, including the master, then search these new positions (generated in the step above) for four more plies. • At this point, the chess chips jump in and finish the last four plies of the search, including the quiescence search.
Chess Chips • The chess chip divides into four parts: • The move generator. • The smart-move stack. • The evaluation function. • The search control.
Entering a chess position after making a move, the chess machine processes two parallel paths: move generation and decision evaluation. • The move generation path: • It first checks the legality of the opponent's most recent move by checking whether we can capture the opponent's king. If so, the move was illegal and the search returns immediately. • If the last move was legal, the move generation process is started. • If we cannot find a move (i.e. no legal moves exist or, in the case of quiescence search, no suitable forcing moves exist), we return to the parent position. • If we do have a move and the evaluation function says we cannot exit, we continue the search at the next level. • The evaluation path: • We first check whether it's a leaf position (usually by checking whether we have reached sufficient depth). • If not, we do not need to carry out evaluation, and we merge with the move generation process.
The move generator • An 8x8 array of combinational logic. • Has a hard-wired finite state machine controlling move generation. • Can generate capturing, checking, check-evasion and attacking moves. • The basic move generation algorithm is like that of the Belle move generator. • The combinational logic array is effectively a silicon chess board. • Each cell in the array has four major components: • A Find-Victim Transmitter • A Find-Attacker Transmitter • A Receiver • A Distributed Arbiter • Each cell also holds a four-bit piece register.
The move generation algorithm • Move generation consists of 2 phases : • A Find Victim Phase • A Find Attacker Phase
Find Victim Cycle • The Find-Victim Transmitter radiates appropriate attacking signals for the resident piece. • If a square is vacant, incoming attack signals from a ray piece (bishop, rook or queen) pass through the cell. • The radiated attacking signals then reach the receivers, and a vote is taken to find the highest-value victim. • At the receiver, if some piece of the opposing colour is attacking the resident piece, the receiver asserts a priority signal based on the piece type. • Since we want to find victims, the priority rises for higher-value pieces, with the queen highest, then rook, bishop, knight, pawn and empty square in descending order. • The priority signals from all the squares go to the arbitration network to find the highest-valued victim.
Find Attacker Cycle • With the victim chosen, the find-attacker cycle executes. • The Find-Attacker Transmitter on the victim cell transmits reverse attacking signals as if it were a super-piece. • The receivers on all squares then detect whether an incoming reverse attacking signal matches the resident piece type. • If the resident is an appropriate attacker, the system asserts a priority signal. • Since we want to use the lowest-valued attacker, the priority of pieces is reversed. • The priority signals then go through the arbitration network, and with both the attacker and victim chosen, we have a move.
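In software, the two cycles amount to a two-phase selection. A minimal sketch, assuming the attack relations have already been computed and are passed in as a dictionary (the chip derives them from the signals on the array; the data layout and names here are illustrative, and the king is left out of the priority table):

```python
# Priority order from the find-victim cycle: queen highest, pawn lowest.
PRIORITY = {"P": 1, "N": 2, "B": 3, "R": 4, "Q": 5}

def pick_capture(attacks):
    """attacks: victim square -> (victim piece, [(attacker square, attacker piece), ...])."""
    if not attacks:
        return None
    # Find-victim cycle: arbitrate for the highest-priority attacked piece.
    victim_sq = max(attacks, key=lambda sq: PRIORITY[attacks[sq][0]])
    # Find-attacker cycle: priorities reversed, so the lowest-valued attacker wins.
    attacker_sq, _piece = min(attacks[victim_sq][1], key=lambda a: PRIORITY[a[1]])
    return attacker_sq, victim_sq

attacks = {
    "d5": ("P", [("e4", "P")]),
    "f7": ("R", [("c4", "B"), ("h7", "Q")]),
    "a6": ("N", [("b5", "B")]),
}
print(pick_capture(attacks))   # ('c4', 'f7'): bishop takes rook
```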
Generating Checking Moves • The move generator activates all the find-victim transmitters as well as the opposing king's find-attacker transmitter. • When both sets of signals collide on the same square, we have a square from which we can issue a check. • When ray signals align properly on a square with a piece belonging to the moving side, the piece can give a discovered check.
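The "collision" of the two signal sets can be sketched as a set intersection. The example below handles knights only, to stay short; the helper names are illustrative, and ray pieces and discovered checks are not modelled:

```python
FILES = "abcdefgh"
KNIGHT_STEPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_reach(square):
    """Squares a knight on `square` attacks (also: squares from which one attacks it)."""
    x, y = FILES.index(square[0]), int(square[1]) - 1
    return {
        FILES[x + dx] + str(y + dy + 1)
        for dx, dy in KNIGHT_STEPS
        if 0 <= x + dx < 8 and 0 <= y + dy < 8
    }

def knight_checking_squares(knight_sq, king_sq, own_pieces=frozenset()):
    """Squares the knight can move to that give check to the king on `king_sq`."""
    forward = knight_reach(knight_sq)    # find-victim style signals from the knight
    reverse = knight_reach(king_sq)      # reverse signals radiated from the king
    return (forward & reverse) - set(own_pieces)

print(sorted(knight_checking_squares("e4", "e8")))   # ['d6', 'f6']
```

Knight moves are symmetric, which is why the same function serves for both the forward and the reverse signals.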