Parallel Simulated Annealing for EAM potential fitting. By Tao Xu. CS6230 Final Presentation, 05/05/05.
Outline
• Introduction to EAM potential
• The objective/cost function
• Simulated Annealing Algorithm
• Synchronous Parallel Simulated Annealing
• Asynchronous Parallel Simulated Annealing
• Conclusions and References
The EAM potential
The EAM potential is built from a pair potential φ(r), an atomic density function ρ(r), and an embedding function U(n). Let α denote the entire set of L parameters α1, α2, …, αL used to characterize these functions. The goal is to determine the optimal set α* by matching the forces predicted by the classical potential to those from first-principles calculations.
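For reference, the standard EAM total energy, written with the three functions named above, usually takes the form below. The atom indices i and j, the interatomic distance r_ij, and the symbol E_tot are notation assumed here rather than taken from the slide.

```latex
E_{\text{tot}} = \frac{1}{2} \sum_{i} \sum_{j \neq i} \phi(r_{ij}) + \sum_{i} U(n_i),
\qquad
n_i = \sum_{j \neq i} \rho(r_{ij})
```

Each atom contributes half of every pair interaction it takes part in, plus an embedding energy that depends only on the density n_i its neighbors deposit at its site.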
The objective function
The key to the force-matching method is minimizing the objective function Z(α) = ZF(α) + ZC(α), where:
• M: number of sets of atomic configurations (e.g. structures).
• Nk: number of atoms in configuration k.
• Fki(α): force on the ith atom in set k obtained with parameter set α.
• Fki0: reference force from first-principles calculations.
• ZC: contribution from Nc additional constraints.
• Ar(α): physical quantities as calculated from the potential.
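A minimal sketch of how such an objective could be evaluated. It assumes the form commonly used in force matching: ZF is the squared force error summed over all atoms and configurations and divided by 3 times the total atom count, and ZC is a weighted sum of squared deviations of each Ar(α) from a reference value. The weights, the reference values, and every function name below are assumptions for illustration, not taken from the slides.

```python
from typing import Sequence

Vector = Sequence[float]  # one 3-component force vector


def force_term(forces: Sequence[Sequence[Vector]],
               ref_forces: Sequence[Sequence[Vector]]) -> float:
    """Z_F: mean squared force-component error over all M configurations."""
    total = 0.0
    n_atoms = 0
    for config, ref_config in zip(forces, ref_forces):
        if len(config) != len(ref_config):
            raise ValueError("configuration and reference differ in atom count")
        for f, f0 in zip(config, ref_config):
            total += sum((a - b) ** 2 for a, b in zip(f, f0))
        n_atoms += len(config)
    return total / (3 * n_atoms)


def constraint_term(values: Sequence[float],
                    ref_values: Sequence[float],
                    weights: Sequence[float]) -> float:
    """Z_C: weighted squared deviation of the N_c constrained quantities (assumed form)."""
    return sum(w * (a - a0) ** 2 for a, a0, w in zip(values, ref_values, weights))


def objective(forces, ref_forces, values, ref_values, weights) -> float:
    """Z = Z_F + Z_C for one candidate parameter set."""
    return force_term(forces, ref_forces) + constraint_term(values, ref_values, weights)


# Toy data: one configuration (M = 1) with two atoms and one constraint (N_c = 1).
model_forces = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]
reference_forces = [[(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]
z = objective(model_forces, reference_forces, [4.0], [3.5], [2.0])
```

In a real fit, the model forces and the quantities Ar(α) would be recomputed from the potential for every candidate α, which is what makes each cost evaluation expensive.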
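Simulated Annealing Algorithm
• Start from an initial configuration α.
• Use the random number generator to create a new random configuration α′.
• Evaluate the cost function.
• Apply the acceptance probability. Yes: accept the new configuration. No: keep the current one.
• Terminate search? No: adjust the temperature and create another configuration. Yes: END.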
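The flowchart can be sketched as a short serial loop. This is an illustrative version only: the Metropolis acceptance rule exp(-ΔZ/T), the geometric cooling factor, the uniform random move, and the toy quadratic cost are all assumptions, since the slides do not specify them.

```python
import math
import random


def simulated_annealing(cost, initial, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.9, moves_per_temp=200, seed=0):
    """Serial simulated annealing; returns the best configuration seen and its cost."""
    rng = random.Random(seed)
    current = list(initial)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    t = t_start
    while t > t_end:                                   # "Terminate search?"
        for _ in range(moves_per_temp):
            # Create a new random configuration near the current one.
            candidate = [x + rng.uniform(-step, step) for x in current]
            candidate_cost = cost(candidate)           # evaluate the cost function
            delta = candidate_cost - current_cost
            # Downhill moves are always accepted, uphill moves with
            # probability exp(-delta / t) (assumed Metropolis rule).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        t *= cooling                                   # adjust temperature
    return best, best_cost


def toy_cost(a):
    """Hypothetical stand-in for Z(a): a quadratic bowl with its minimum at (1, 1)."""
    return sum((x - 1.0) ** 2 for x in a)


best, best_cost = simulated_annealing(toy_cost, [5.0, -3.0])
```

For the EAM fit, `toy_cost` would be replaced by the force-matching objective, and the move generator would perturb the L potential parameters.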
Synchronous Parallel Simulated Annealing
[Diagram: ROOT (process 0) sends the initial configuration to workers P1 to P4, collects data from the workers, sends the best configuration back, and collects the final configuration at the end.]
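The communication pattern in the diagram can be emulated serially to show the control flow. In this sketch the "workers" are simply loop iterations with separate random streams; a real implementation would run them on separate processes and use message passing for the scatter and gather steps. The function names, move rule, and cooling schedule are assumptions for illustration.

```python
import math
import random


def anneal_at_temperature(cost, start, t, rng, moves=100, step=0.5):
    """One worker's job: Metropolis moves at fixed temperature t."""
    current, current_cost = list(start), cost(start)
    for _ in range(moves):
        candidate = [x + rng.uniform(-step, step) for x in current]
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
    return current, current_cost


def synchronous_psa(cost, initial, n_workers=4, t_start=1.0, t_end=1e-3,
                    cooling=0.9, seed=0):
    """Serial emulation of synchronous PSA with a root and n_workers workers."""
    rngs = [random.Random(seed + w) for w in range(n_workers)]
    shared = list(initial)                  # root sends the initial configuration
    t = t_start
    while t > t_end:
        # Every worker starts from the same configuration at the same temperature.
        results = [anneal_at_temperature(cost, shared, t, rng) for rng in rngs]
        # Synchronization point: root collects all results and keeps the
        # lowest-cost one, which becomes the next shared starting point.
        shared, _ = min(results, key=lambda r: r[1])
        t *= cooling
    return shared, cost(shared)


def toy_cost(a):
    """Hypothetical stand-in for Z(a)."""
    return sum((x - 1.0) ** 2 for x in a)


final_config, final_cost = synchronous_psa(toy_cost, [5.0, -3.0])
```

The list comprehension that builds `results` is the step that would run in parallel; everything after it in the loop body is the root's synchronization work.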
How to make a selection from workers’ results
The root can choose among the workers’ results in three ways:
• Minimum: the root chooses the configuration with the lowest cost function.
• Random: the root chooses one of the configurations at random.
• Metropolis-like: the root usually chooses the minimum but accepts another configuration with some nonzero probability.
In the current implementation, the minimum is chosen at each temperature.
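The three selection rules can be sketched as small functions over a list of (configuration, cost) pairs. The slide does not state the probability used by the Metropolis-like rule, so the Boltzmann factor exp(-Δ/T) relative to the minimum is an assumption here, as are all function names.

```python
import math
import random


def select_minimum(results):
    """Return the (configuration, cost) pair with the lowest cost."""
    return min(results, key=lambda r: r[1])


def select_random(results, rng):
    """Return one pair uniformly at random."""
    return rng.choice(results)


def select_metropolis_like(results, t, rng):
    """Pick a random pair and keep it with probability exp(-delta / t),
    where delta is its cost excess over the minimum; otherwise fall back
    to the minimum (assumed acceptance rule)."""
    best = select_minimum(results)
    candidate = rng.choice(results)
    delta = candidate[1] - best[1]          # always >= 0
    if rng.random() < math.exp(-delta / t):
        return candidate
    return best


results = [([0.0], 3.0), ([1.0], 1.0), ([2.0], 2.0)]
rng = random.Random(0)
chosen_min = select_minimum(results)
chosen_rand = select_random(results, rng)
chosen_metro = select_metropolis_like(results, 1.0, rng)
```

As the temperature falls, the Metropolis-like rule behaves more and more like the minimum rule, because the probability of keeping a worse configuration shrinks toward zero.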
Advantages and disadvantages of synchronous PSA
Advantages:
• Near-linear speedup. With n processors the program searches a factor of n more candidate configurations, which increases the chance of “stumbling” onto the correct configuration sooner.
• Easy to implement. Message passing occurs only at the synchronization steps.
Disadvantages:
• Idle time. If one processor obtains the required number of successes before another, it must wait for the other processors to finish.
• Synchronization cost. A global gather and rebroadcast of large configurations can be time-consuming, although this is usually not a problem for smaller systems.
Asynchronous Parallel Simulated Annealing
[Diagram: ROOT sends the initial configuration to workers P1 to P4, holds a register (labeled "Register T" in the figure), collects data from the workers, and collects the final configuration at the end.]
Differences between Synchronous and Asynchronous PSA
• Every processor controls its own cooling schedule.
• Each processor works independently of the others, so none waits idle for the rest to finish. When a processor finishes at one temperature, it checks its value against the global register. If its value is worse, it takes the configuration from the register; otherwise, it writes its value to the register.
• The best configuration is always stored in a global register on a master processor.
Advantages of asynchronous PSA:
• Processors do not sit idle between temperatures. When a processor finishes at one temperature, it goes on to the next. There may still be some idle time at the end of the program.
• No expensive synchronization steps. Communications are smaller but more frequent.
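The register protocol described above can be sketched with threads and a lock. This only illustrates the communication pattern: Python threads do not give a real speedup for CPU-bound work, and an actual implementation would place the register on a master process and reach it by message passing. The class, function names, and annealing parameters are hypothetical.

```python
import math
import random
import threading


class GlobalRegister:
    """Holds the best configuration seen so far (hypothetical helper)."""

    def __init__(self, config, cost):
        self._lock = threading.Lock()
        self._config, self._cost = list(config), cost

    def exchange(self, config, cost):
        """If the caller's value is worse, return the stored entry;
        otherwise store the caller's value and return it."""
        with self._lock:
            if cost <= self._cost:
                self._config, self._cost = list(config), cost
            return list(self._config), self._cost

    def best(self):
        with self._lock:
            return list(self._config), self._cost


def worker(cost, register, seed, t_start=1.0, t_end=1e-3, cooling=0.9,
           moves=100, step=0.5):
    """One processor: owns its temperature and never waits for the others."""
    rng = random.Random(seed)
    current, current_cost = register.best()
    t = t_start
    while t > t_end:
        for _ in range(moves):
            candidate = [x + rng.uniform(-step, step) for x in current]
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
        # Finished this temperature: compare against the global register.
        current, current_cost = register.exchange(current, current_cost)
        t *= cooling


def asynchronous_psa(cost, initial, n_workers=4):
    register = GlobalRegister(initial, cost(initial))
    threads = [threading.Thread(target=worker, args=(cost, register, seed))
               for seed in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return register.best()


def toy_cost(a):
    """Hypothetical stand-in for Z(a)."""
    return sum((x - 1.0) ** 2 for x in a)


final_config, final_cost = asynchronous_psa(toy_cost, [5.0, -3.0])
```

The only shared state is the register, and each exchange moves a single configuration, which matches the "smaller but more frequent" communication noted above.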
Conclusion
• Simulated annealing converges to the global minimum only under a sufficiently slow cooling schedule, so it takes a long time to find the minimum.
• Parallelization of simulated annealing is therefore desirable. Because the search space is explored faster, a near-linear speedup in convergence time is obtained.
• Asynchronous annealing converges faster than synchronous annealing thanks to its near-zero idle time, and it achieves a better speedup.
References and Questions
[1] M. S. Daw and M. I. Baskes: Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals. Physical Review B, Vol. 29, No. 12, June 1984, pp. 6443-6453
[2] R. A. Johnson: Analytic nearest-neighbor model for fcc metals. Physical Review B, Vol. 37, No. 8, March 1988, pp. 3924-3931
[3] S. Kirkpatrick, C. D. Gelatt, M. P. Vecchi: Optimization by Simulated Annealing. Science, Vol. 220, No. 4598, May 1983, pp. 671-680