Running “Zen” on Computer Clusters
HIDEKI KATO† and IKUO TAKEUCHI†
† The University of Tokyo
November 13th, 2009
Contents
• Background
• Related Work
• Parallel Monte-Carlo Tree Search
• Our Architecture
• Experiments
• Conclusion and Future Work
Background
• Challenge
  • Can MC Go programs beat human professional players?
• Interests
  • How strong is “Zen” on an HPC cluster?
  • And on a PC cluster at home?
  • Distributed MCTS on the Internet
• Real-world applications of MCTS
  • Provides smarter planning for intelligent robots, intelligent vehicles, etc.
  • Environment: many small processors on a LAN
Related Work
• S. Gelly et al. introduced SMT PMCTS for shared-memory SMP systems (2006)
• T. Cazenave et al. proposed and evaluated three PMCTS algorithms on an MPI cluster of 16 Intel Pentium 4 processors (2007)
• G. Chaslot et al. evaluated root, leaf and tree parallelization on a 2 x 8 core IBM Power5 system (2008)
• S. Gelly et al. proposed a combination of tree and root parallel MCTS for MPI clusters of shared-memory SMP nodes (2008)
• H. Kato et al. proposed and evaluated a leaf parallel MCTS on an asymmetrical PC cluster (2008)
Parallel Monte Carlo Tree Search
• Tree parallelization
  • Symmetrical multithread parallelism on shared-memory multiprocessor systems
  • Used by almost all MC Go programs
• Leaf parallelization
  • For asymmetrical computer clusters
  • Fudo Go
• Root parallelization
  • Shares search tree in part
  • Less communication
  • Best match with HPC clusters
  • MoGo, Many Faces of Go and Fuego
Parallel Monte Carlo Tree Search (cont’d)
[Figure: leaf parallelization, root parallelization, and tree parallelization (with a global lock and with local locks). G. Chaslot, et al. 2007]
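As a rough illustration of leaf parallelization, the sketch below runs several playouts from a single leaf in parallel and pools the results into one (wins, visits) pair. It is a toy under stated assumptions, not code from Zen or Fudo Go: `toy_playout`, its fixed win probability, and the worker count are all hypothetical stand-ins for a real Go playout.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def toy_playout(seed: int, win_probability: float = 0.6) -> int:
    """Hypothetical stand-in for one random playout: 1 for a win, 0 for a loss."""
    return 1 if random.Random(seed).random() < win_probability else 0


def leaf_parallel_evaluate(num_playouts: int, num_workers: int = 4) -> Tuple[int, int]:
    """Run several playouts from the same leaf in parallel; return (wins, visits)."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each worker gets its own seed so the playouts are independent.
        results = list(pool.map(toy_playout, range(num_playouts)))
    return sum(results), len(results)


wins, visits = leaf_parallel_evaluate(num_playouts=64)
print(f"leaf value estimate: {wins}/{visits} = {wins / visits:.2f}")
```

Only the playouts run in parallel here; the tree itself would still be updated by a single process, which is why this scheme suits asymmetrical clusters.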
Our Architecture
• Requirements
  • Since “Zen” is a commercial product, the fewer modifications the better
  • Root parallelization should be the best fit
  • Must run in non-MPI environments
• The master manages GTP communication and coordinates the exchange of root information
  • It broadcasts incoming messages to the slaves and sends back the answer chosen from their replies by majority rule
• Slaves send their root information in response to the master’s messages
• Master and slave programs are built into “Zen” for convenience and shorter delay
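The majority rule described above can be sketched as a simple vote over the slaves' replies. This is a minimal illustration, not the actual Zen code: the function name `majority_move`, the tie-breaking rule (earliest reply wins), and the example replies are assumptions.

```python
from collections import Counter


def majority_move(replies):
    """Return the move named by the most slaves; ties go to the earliest reply."""
    if not replies:
        raise ValueError("no replies to vote on")
    counts = Counter(replies)
    best = max(counts.values())
    # Scan in arrival order so that ties are broken deterministically.
    for move in replies:
        if counts[move] == best:
            return move


# Hypothetical answers from four slaves to one broadcast GTP genmove command.
replies = ["E5", "D4", "E5", "C3"]
print(majority_move(replies))
```

The master would broadcast each incoming GTP command, collect one reply per slave, and return the winner of this vote to the GTP client.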
Our Architecture (cont’d)
[Figure: two diagrams, each showing a master connected to slaves 1 to n. GTP: broadcast and majority rule. Root information: gather, average, and broadcast.]
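The gather, average, and broadcast step might look roughly like the sketch below. The slides do not say what the root information contains, so the per-move (visits, wins) pairs, the function name `average_root_info`, and the treatment of missing moves as zeros are all assumptions made for illustration.

```python
def average_root_info(per_node):
    """Average per-move root statistics over all reporting nodes.

    per_node: list of dicts mapping move -> (visits, wins), one dict per node.
    A move missing from a node's report counts as zero visits and wins there.
    """
    if not per_node:
        return {}
    n = len(per_node)
    moves = set().union(*per_node)  # every move reported by at least one node
    averaged = {}
    for move in sorted(moves):
        visits = sum(node.get(move, (0, 0))[0] for node in per_node)
        wins = sum(node.get(move, (0, 0))[1] for node in per_node)
        averaged[move] = (visits / n, wins / n)
    return averaged


# Gather: hypothetical root statistics reported by two nodes.
gathered = [
    {"E5": (100, 60), "D4": (50, 20)},
    {"E5": (80, 50), "C3": (20, 10)},
]
# Average: combine them on the master.
averaged = average_root_info(gathered)
# Broadcast: every node receives the same averaged statistics.
broadcast = [dict(averaged) for _ in gathered]
print(averaged)
```

In a real system the gather and broadcast steps would go over the network; here they are plain lists so the averaging logic can be read in isolation.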
Experiments
• No majority rule
• All results are Elo ratings against Zen (self-play) on the same PC
Scalability
[Figure: Winning rate of self-play vs. number of node computers (9 x 9). x-axis: log2 number of node computers (master and slaves), 0 to 6. y-axis: winning rate (Elo), -100 to +400. Series: HA8000, 1 thread, self-play, 0.3 s/move; Handcraft PC, 1 thread, self-play, 0.3 s/move; MoGo.]
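Because the plot reports winning rates as Elo differences, it can help to convert between the two. The sketch below uses the standard logistic Elo formula, under which a difference of 0 corresponds to a 50% winning rate and +400 to about 91%. The function names are hypothetical, and the slides do not state which conversion the authors used.

```python
import math


def elo_to_win_rate(elo_diff):
    """Expected winning rate for a player rated elo_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def win_rate_to_elo(win_rate):
    """Inverse conversion; win_rate must lie strictly between 0 and 1."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / win_rate - 1.0)


# Tick values taken from the y-axis of the scalability plot.
for elo in (-100, 0, 100, 200, 300, 400):
    print(f"{elo:+4d} Elo -> winning rate {elo_to_win_rate(elo):.3f}")
```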
Conclusion and Future Work
• We have implemented a root-parallel version of Zen by adding about 1,500 lines of C++ code using the Boost.Asio library
• Scalability on the HA8000 HPC cluster is not good on the 9 x 9 board
• How about 19 x 19? How about other hardware?
• Benchmark on 19 x 19