Understanding different pattern matching approaches in computer Go, including formalization, DFA approach, tree approach, and implementation issues.
Pattern Matching in Computer Go Ling Zhao University of Alberta August 7, 2002
Outline • Motivations • Formalization • A straightforward approach • DFA approach • Tree approach • Conclusion
Motivations • Patterns are a very important representation of knowledge • Human players also use patterns • But, they are stronger on inexact matching • Computers are good at exact matching • Pattern databases can be very large (>10,000)
Formalization • A pattern is a set of points in an area of the board where each point (x,y) has a state in {Empty, Black, White, OutBound, DontCare} • 2D -> 1D for both the board and patterns GnuGo Explorer (early version?)
2D->1D
Pattern symbols:
X: black stone
O: white stone
.: empty point
?: don't care
*: don't care, can even be out of bounds

Pattern:
?X?
.O?
?OO

Scanning path:
  B
 C4A
D5139
 628
  7

1D form: OO?X.?*O*?*?
What is the problem? • Given a set of patterns, find all matches on the board • Some issues: efficiency, scalability
Isomorphic patterns • Patterns can be rotated and mirrored. The colors of the stones can also be swapped. • A pattern can therefore be represented in 16 forms (8 rotations and reflections, each in 2 color assignments)
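The count of 16 can be checked with a short sketch. The list-of-strings representation and the helper names (`rotate`, `mirror`, `swap_colors`, `all_forms`) are assumptions made for illustration, not taken from any of the programs discussed here.

```python
SWAP = str.maketrans("XO", "OX")   # X: black, O: white


def rotate(rows):
    """Rotate a rectangular pattern 90 degrees clockwise."""
    return ["".join(col) for col in zip(*rows[::-1])]


def mirror(rows):
    """Mirror a pattern left-to-right."""
    return [row[::-1] for row in rows]


def swap_colors(rows):
    """Exchange black and white stones; other symbols are unchanged."""
    return [row.translate(SWAP) for row in rows]


def all_forms(rows):
    """Return the 16 forms: 4 rotations x 2 mirrorings x 2 colorings."""
    forms = []
    current = list(rows)
    for _ in range(4):
        for variant in (current, mirror(current)):
            forms.append(variant)
            forms.append(swap_colors(variant))
        current = rotate(current)
    return forms


pattern = ["?X?", ".O?", "?OO"]
forms = all_forms(pattern)
distinct = {tuple(form) for form in forms}
```

A symmetric pattern yields fewer distinct forms, so a real database would probably deduplicate them before matching.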
Straightforward approach • For every point on the board • For every transformation of every pattern • Try to match it • Outcome: the set of patterns matched on the board • P: # of patterns • C: average cost to match a pattern • Computation: 361 * 16 * P * C (361 points on a 19 X 19 board, 16 forms per pattern)
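A minimal sketch of this loop, under several assumptions: the board is a list of strings, each pattern is anchored by its top-left corner, only on-board anchors are tried, and the `patterns` dictionary would already contain every transformed form. The names `matches_at` and `naive_match` are hypothetical.

```python
def matches_at(board, pattern, top, left):
    """Check one pattern with its top-left corner placed at (top, left)."""
    size = len(board)
    for dy, row in enumerate(pattern):
        for dx, want in enumerate(row):
            y, x = top + dy, left + dx
            if want == "*":                      # matches anything, even off the board
                continue
            if not (0 <= y < size and 0 <= x < size):
                return False                     # every other symbol needs a board point
            if want != "?" and board[y][x] != want:
                return False
    return True


def naive_match(board, patterns):
    """Try every pattern at every point: cost is points * patterns * C."""
    hits = []
    size = len(board)
    for top in range(size):
        for left in range(size):
            for name, pattern in patterns.items():
                if matches_at(board, pattern, top, left):
                    hits.append((name, top, left))
    return hits


board = [
    ".X..",
    ".O..",
    ".OO.",
    "....",
]
patterns = {"example": ["?X?", ".O?", "?OO"]}
hits = naive_match(board, patterns)
```

The three nested loops correspond directly to the factors in the cost estimate above.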
Mark Boon, "Pattern Matcher for Goliath", Computer Go 13, winter 89-90. Pattern Matcher for Goliath • Pattern size: 5 X 5 (25 points), can be extended to 5 X 10. • 4 types for a point {White, Black, Empty, DontCare} • Three 32-bit integers: one bitmap for black stones, one for white stones, and one for empty points. • Each pattern is represented by an array of three 32-bit integers
Matching Pattern • test if ((position & pattern) == pattern) • or equivalently test if ((~position & pattern) == 0) • i.e. every bit set in the pattern must also be set in the position
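The subset test can be sketched as follows. The encoding helper `to_masks`, the 3 X 3 window, and the (empty, black, white) ordering are assumptions for illustration; Goliath's actual bit layout is not given in these slides.

```python
def to_masks(rows):
    """Encode a window as (empty, black, white) bitmaps, one bit per point."""
    empty = black = white = 0
    bit = 1
    for ch in "".join(rows):
        if ch == ".":
            empty |= bit
        elif ch == "X":
            black |= bit
        elif ch == "O":
            white |= bit
        # '?' (DontCare) sets no bit, so it constrains nothing
        bit <<= 1
    return empty, black, white


def bitmap_match(position, pattern):
    """True if every bit required by the pattern is set in the position."""
    # all() stops at the first failing bitmap, so the empty-point test runs first.
    return all((pos & pat) == pat for pos, pat in zip(position, pattern))


pattern = to_masks(["?X?", ".O?", "?OO"])
hit = to_masks([".X.", ".O.", "XOO"])
miss = to_masks([".X.", "XO.", "XOO"])   # left-middle point is black, not empty
```

Each comparison costs a single AND plus a compare per bitmap, regardless of how many points the pattern constrains.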
Implementation Issues • Try the mirrored and rotated patterns (8 of them) • First match the integer for empty points • 99% of attempts already result in a mismatch here • Then match the integers for white and black stones • Swap the white and black colors and match the integers for white and black stones again • Only 8 arrays are needed for every pattern (the color-swapped forms reuse the same arrays)
Implementation Issues (cont'd) • Incremental update: after a stone is added to the board, only the positions it influences need to be tried again. • Require that no pattern has a DontCare point among its 5 interior points • Only 243 (3^5) possibilities remain for those points • The database is organized as 243 lists of patterns • Results in a roughly 100-fold speedup in general
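The 243-list organization can be sketched as a base-3 index. Which five points count as "interior" is not stated here; the sketch assumes the centre of the 5 X 5 window and its four orthogonal neighbours, and the names `CORE`, `core_key`, and `build_index` are hypothetical.

```python
from collections import defaultdict

# Assumed interior points (row, column) of a 5 X 5 window: centre + 4 neighbours.
CORE = [(2, 2), (1, 2), (3, 2), (2, 1), (2, 3)]
DIGIT = {".": 0, "X": 1, "O": 2}


def core_key(rows):
    """Map the five interior points to a number in 0..242 (base 3)."""
    key = 0
    for y, x in CORE:
        key = key * 3 + DIGIT[rows[y][x]]   # raises KeyError on a DontCare point
    return key


def build_index(patterns):
    """Group pattern names into at most 243 lists keyed by their interior points."""
    index = defaultdict(list)
    for name, rows in patterns.items():
        index[core_key(rows)].append(name)
    return index


patterns = {
    "p1": ["?????", "??X??", "?.O.?", "??O??", "?????"],
    "p2": ["?????", "??.??", "?.X.?", "??.??", "?????"],
}
index = build_index(patterns)
window = [".....", "..X..", "X.O.O", "..O..", "....."]
candidates = index[core_key(window)]
```

Only the patterns in the selected list would then go through the full bitmap comparison.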
Problems • Scalability: think about 1,000,000 patterns • Many patterns may share the same prefix • Need to remove redundant comparisons
GnuGo Manual, 122-128, 2002 DFA Approach • DFA – Deterministic Finite Automaton • A finite set of states and a set of transitions from state to state which are caused by input symbols. • For each state, there is a unique transition on each symbol.
[Figure: a DFA to recognize ????..X]
[Figure: a DFA to recognize ????..X and XXO]
Construct DFA • Question: given two DFAs that recognize two patterns, how do we build one DFA that recognizes both patterns? • Synchronized product: B = L X R • The states of B are the pairs (l, r) with l in L and r in R. • The transitions of B are the transitions (l1, r1) -a-> (l2, r2) such that l1 -a-> l2 in L and r1 -a-> r2 in R.
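The synchronized product can be sketched for the two example patterns ????..X and XXO. This is an illustration, not GnuGo's implementation: the `#` symbol for off-board points, the use of `None` as the dead state, and the function names are all assumptions, and only reachable product states are built.

```python
ALPHABET = ".XO#"   # empty, black, white, off-board ('#' is an assumed encoding)


def accepts(want, symbol):
    """Does pattern symbol `want` allow board symbol `symbol`?"""
    if want == "*":
        return True
    if want == "?":
        return symbol != "#"
    return want == symbol


def pattern_dfa(name, pattern):
    """Linear DFA: state i means 'i symbols matched'; None is the dead state."""
    delta = {i: {a: (i + 1 if accepts(want, a) else None) for a in ALPHABET}
             for i, want in enumerate(pattern)}
    return delta, {len(pattern): {name}}, 0      # (transitions, finals, start)


def step(delta, state, symbol):
    """Follow one transition; dead and final states have no outgoing edges."""
    return delta.get(state, {}).get(symbol)


def product(left, right):
    """Synchronized product: states are pairs (l, r) moving in lock step."""
    (dl, fl, sl), (dr, fr, sr) = left, right
    start = (sl, sr)
    delta, final, todo = {}, {}, [start]
    while todo:
        state = todo.pop()
        if state in delta:
            continue
        l, r = state
        delta[state] = {}
        found = fl.get(l, set()) | fr.get(r, set())
        if found:
            final[state] = found
        for a in ALPHABET:
            nxt = (step(dl, l, a), step(dr, r, a))
            if nxt != (None, None):              # drop the pair where both are dead
                delta[state][a] = nxt
                todo.append(nxt)
    return delta, final, start


def run(dfa, text):
    """Scan `text` once and collect every pattern recognized on the way."""
    delta, final, state = dfa
    found = set()
    for a in text:
        state = step(delta, state, a)
        if state is None:
            break
        found |= final.get(state, set())
    return found


both = product(pattern_dfa("A", "????..X"), pattern_dfa("B", "XXO"))
```

For example, `run(both, "XXO...X")` reports both patterns after reading the input only once, which is the point of merging the automata.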
Discussions • In the worst case, the size of the DFA is exponential in the number of patterns • In practical situations, the size tends to be stable. • Find the minimum-state DFA that recognizes all patterns (optimization)
Martin Mueller’s Ph.D. Thesis, 75-78, 1995 Tree Approach • Use sampling to differentiate patterns • Reduce board-pattern matching
Tree Approach • Each pattern is covered by a grid of 4 X 4 tiles. • Each point can have one of the four values Empty, Black, White, DontCare, so 2 bits per point suffice. • A 32-bit integer is used to represent a tile (16 points X 2 bits). • A Patricia tree index for differentiating the patterns is built.
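Packing a tile of 16 points into one 32-bit integer could look like the sketch below. The 2-bit codes and the row-major bit order are assumptions; Explorer's actual layout is not given here.

```python
CODE = {".": 0, "X": 1, "O": 2, "?": 3}   # assumed 2-bit codes per point


def pack_tile(rows):
    """Pack a 4 X 4 tile (4 strings of 4 symbols) into a 32-bit integer."""
    value = 0
    for i, ch in enumerate("".join(rows)):
        value |= CODE[ch] << (2 * i)      # point i occupies bits 2i and 2i+1
    return value


def point(value, i):
    """Read back the 2-bit code of point i (0..15, row-major)."""
    return (value >> (2 * i)) & 0b11


tile = ["?X?.", ".O??", "?OO.", "...."]
packed = pack_tile(tile)
```

With this layout, sampling one point of a tile is a shift and a mask, which is what the tree index needs at each node.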
Patricia Tree
• empty -- initial state
• insert ababb:
    ababb
• insert ababa: 1st difference is at position 5
    [5]   -- i.e. test position #5
    a: ababa
    b: ababb
• insert ba:
    [1]
    a: [5]
         a: ababa
         b: ababb
    b: ba
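The three insertions can be reproduced with a small sketch. It is a simplified Patricia-style index over plain strings with 1-based positions, written for illustration; the class and function names are assumptions, and a key that is shorter than the tested position is treated as having an empty symbol there.

```python
class Node:
    def __init__(self, pos):
        self.pos = pos          # 1-based position tested at this node
        self.children = {}      # symbol -> Node, or a str leaf holding a key


def symbol(key, pos):
    """Symbol of `key` at 1-based `pos`, or '' past the end of the key."""
    return key[pos - 1] if pos <= len(key) else ""


def first_difference(a, b):
    """First position where two different keys disagree."""
    pos = 1
    while symbol(a, pos) == symbol(b, pos):
        pos += 1
    return pos


def insert(tree, key):
    """Insert `key` and return the (possibly new) root."""
    if tree is None:
        return key
    node = tree
    while isinstance(node, Node):           # walk down to some stored key
        kids = node.children
        node = kids.get(symbol(key, node.pos), next(iter(kids.values())))
    if node == key:
        return tree                         # already stored
    return _attach(tree, key, node, first_difference(key, node))


def _attach(tree, key, other, diff):
    """Place `key` so that tested positions increase from root to leaf."""
    if isinstance(tree, Node) and tree.pos < diff:
        s = symbol(key, tree.pos)
        tree.children[s] = _attach(tree.children[s], key, other, diff)
        return tree
    if isinstance(tree, Node) and tree.pos == diff:
        tree.children[symbol(key, diff)] = key
        return tree
    branch = Node(diff)                     # new test above the existing subtree
    branch.children[symbol(other, diff)] = tree
    branch.children[symbol(key, diff)] = key
    return branch


def lookup(tree, key):
    """Follow the sampled positions, then compare the whole key at the leaf."""
    node = tree
    while isinstance(node, Node):
        node = node.children.get(symbol(key, node.pos))
    return node == key


root = None
for word in ("ababb", "ababa", "ba"):
    root = insert(root, word)
```

Note that `lookup` inspects only positions 1 and 5 on the way down, so the final comparison of the whole key is what rules out a string such as `abbba`.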
Patricia Tree in Explorer • Each node samples one board position and branches according to its value. • For each node, traverse both the subtree for the matching color and the DontCare subtree. • Matching ends at either a mismatch leaf or a pattern leaf. • Even if all sample points match, a full match over all points of the pattern is still needed.
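The traversal rule can be illustrated on a tiny hand-built tree. Everything about the representation is an assumption made for the sketch: patterns and boards are 1D strings, an internal node is a `(index, branches)` tuple, a pattern leaf is a string, and a missing branch plays the role of a mismatch leaf.

```python
def full_match(pattern, board):
    """Compare every point; '?' in the pattern accepts any board value."""
    return len(board) >= len(pattern) and all(
        p in ("?", b) for p, b in zip(pattern, board))


def search(node, board, out):
    """Collect every pattern whose sampled points and full match both succeed."""
    if isinstance(node, str):                # pattern leaf: sampling is not enough
        if full_match(node, board):
            out.append(node)
        return
    index, branches = node                   # internal node: sample one board point
    for value in (board[index], "?"):        # matching value, then DontCare subtree
        if value in branches:
            search(branches[value], board, out)


# Hand-built index over the patterns XO., X?., O.X and ?OX (0-based sample indices).
tree = (0, {
    "X": (1, {"O": "XO.", "?": "X?."}),
    "O": "O.X",
    "?": "?OX",
})

found_a = []
search(tree, "XO.", found_a)
found_b = []
search(tree, "XOX", found_b)
```

On the board `XOX` the leaves `XO.` and `X?.` are reached through the sampled points but rejected by the full match, while `?OX` is found through the DontCare branch.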
Implementation Issues • Compare candidate patterns for each starting position and orientation of the board. • Incremental update • Adaptive tree: after a pattern-board mismatch, replace the pattern leaf with a new node branching at the index where the mismatch occurred. • Balance the Patricia tree
Wild Thoughts • The three approaches each have their own advantages: • The straightforward one is simple • The DFA one is powerful • The tree one is very flexible • Which one is better? • How can they be combined?