Efficient Eigensolvers for Large-scale Electronic Nanostructure Calculations
Stanimire Tomov (1), Andrew Canning (2), Jack Dongarra (1), Osni Marques (2), Christof Vömel (2), and Lin-Wang Wang (2)
(1) Innovative Computing Laboratory, University of Tennessee
(2) Lawrence Berkeley National Laboratory, Computational Research Division
Supported by: U.S. DOE, Office of Science
SC05, Seattle, 11/16/2005

Outline
• Background
• Problem formulation
• Solution approach
  • Iterative Conjugate Gradients (CG) type eigensolvers
  • Preconditioning
  • The Bulk-band (BB) preconditioner
• Numerical results
• Conclusions

Background
• Quantum dots
  • Tiny crystals, a few hundred to a few thousand atoms in size; made by humans
  • Electronic properties depend critically on shape and size
  • The colors of light absorbed and emitted can be tuned by the quantum dot size
  • Absorbed energy can lift an electron from its valence band to its conduction band (generating electrical current)
  • An electron falling back from the conduction band to the valence band loses energy, which is emitted as light
• The mathematical simulation leads to eigenvalue problems
• Quantum dots have different electronic properties than their bulk material
• Still, bulk material properties can be useful: we found ways to use them in designing preconditioners that significantly accelerate quantum dot electronic structure calculations
[Figure: Total electron charge density of a gallium arsenide quantum dot containing just 465 atoms. Quantum dots of the same material but different sizes have different band gaps and emit different colors.]

Problem formulation
• Solve a single-particle Schrödinger-type equation
  (E) $\left(-\tfrac{1}{2}\nabla^2 + V\right)\psi_i = \epsilon_i \psi_i$
  with periodic boundary conditions
• Many electronic nanostructure calculations lead to it
• Leads to a discrete eigenvalue problem $H\psi_i = E_i\psi_i$, where $H$ is Hermitian
• Many additional requirements
  • Find a few (4-10) interior eigenvalues closest to a given point $E_{ref}$
  • Repeated eigenvalues are allowed (degeneracy up to 4), etc.
• The problem size requires a parallel iterative solution approach

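To make the discrete problem concrete, here is a minimal sketch that assumes a 1D periodic grid and a second-order finite-difference Laplacian. Neither comes from the slides, and the helper names `periodic_hamiltonian` and `eigenpairs_near`, the potential, and the value of `e_ref` are illustrative. Dense diagonalization is used only because the toy problem is tiny; the real problem sizes call for a parallel iterative solver.

```python
import numpy as np

def periodic_hamiltonian(potential, length):
    """Finite-difference H = -0.5 d^2/dx^2 + V on a periodic 1D grid (illustrative)."""
    n = potential.size
    h = length / n
    identity = np.eye(n)
    # Neighbor couplings wrap around the cell, which enforces periodicity.
    laplacian = (np.roll(identity, 1, axis=1) + np.roll(identity, -1, axis=1)
                 - 2.0 * identity) / h**2
    return -0.5 * laplacian + np.diag(potential)

def eigenpairs_near(H, e_ref, count):
    """Return the `count` eigenpairs of Hermitian H closest to e_ref (dense solve)."""
    energies, states = np.linalg.eigh(H)
    nearest = np.sort(np.argsort(np.abs(energies - e_ref))[:count])
    return energies[nearest], states[:, nearest]

# Toy setup: 64 grid points, a smooth periodic potential, 4 interior states.
grid = np.linspace(0.0, 1.0, 64, endpoint=False)
H = periodic_hamiltonian(20.0 * np.cos(2.0 * np.pi * grid), length=1.0)
energies, states = eigenpairs_near(H, e_ref=50.0, count=4)
```

The selection step is the part that carries over to the large problem: only the few eigenpairs nearest `e_ref` are wanted, not the whole spectrum.
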
Solution approach
• Phase 1: Iterative eigensolvers
  • Conjugate Gradients (CG) type with spectral transformation
    • Chosen based on their previous successful use in the field
    • Folded spectrum: solve for $(H - E_{ref})^2$ to get interior eigenstates (L.W. Wang & A. Zunger, 1993)
  • Developed a library of 3 nonlinear CG eigensolvers
    • The library includes A. Knyazev’s LOBPCG method
    • Supports blocking
    • Supports preconditioning
  • Developed and integrated in NanoPSE (S. Tomov and J. Langou)

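The folded spectrum idea can be checked on a small example. The sketch below assumes a dense Hermitian test matrix with a known spectrum, and it uses dense diagonalization as a stand-in for the CG-type solvers of the library. The names `folded_matvec` and `folded_spectrum_state` are illustrative, not part of NanoPSE.

```python
import numpy as np

def folded_matvec(H, e_ref, x):
    """Apply (H - e_ref)^2 to x using two products with H."""
    y = H @ x - e_ref * x
    return H @ y - e_ref * y

def folded_spectrum_state(H, e_ref):
    """Lowest eigenstate of the folded operator, i.e. the state of H nearest e_ref."""
    n = H.shape[0]
    # Assemble the folded operator column by column (fine for a toy problem only).
    folded = np.column_stack([folded_matvec(H, e_ref, e) for e in np.eye(n)])
    folded = 0.5 * (folded + folded.conj().T)  # remove rounding asymmetry
    _, vectors = np.linalg.eigh(folded)
    psi = vectors[:, 0]
    # The energy with respect to H is recovered from the Rayleigh quotient.
    energy = np.real(np.vdot(psi, H @ psi))
    return energy, psi

# Test matrix with known eigenvalues -5.0, -4.5, ..., 5.0 in a random orthonormal basis.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((21, 21)))
H = basis @ np.diag(np.linspace(-5.0, 5.0, 21)) @ basis.T
energy, psi = folded_spectrum_state(H, e_ref=0.6)
```

Folding turns the interior eigenvalue nearest $E_{ref}$ into the lowest eigenvalue of a positive semidefinite operator, which is what a minimization-based CG eigensolver needs. The price is a squared condition number, which is one reason the preconditioner matters.
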
Solution approach …
• We use the Nanoscience Problem Solving Environment (NanoPSE) package
  • Integrates various nano-codes (developed over ~10 years)
  • Its design goal: provide a software context for collaboration
  • Features easy installation; runs on many platforms, etc.
  • Collected and maintained by Wesley Jones (NREL)
• Results:
  • 43% improvement in speed and 49% in the number of matrix-vector products
  • On an InAs nanowire system of ~70,000 atoms, with an eigensystem of size 2,265,827 (A. Canning and G. Bester)
  • The results are good, given that the reference algorithm and implementation were already very efficient
  • But they are limited by the effectiveness of the available preconditioner
• Phase 2: Preconditioning

Preconditioning
• Preconditioning: the term comes from accelerating the convergence of iterative solvers for linear systems $Ax = b$; in particular, find an operator (preconditioner) $T \approx A^{-1}$ such that $(TA)x = Tb$ is “easier” to solve
• Preconditioning for eigenproblems
  • Harder problem, not “as straightforward”
  • It can be shown that efficient preconditioners for linear systems are also efficient preconditioners for CG-type eigensolvers

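For the linear-system case, the effect of a preconditioner $T \approx A^{-1}$ can be seen with a textbook preconditioned CG loop. This is a minimal sketch: the test matrix (a widely spread diagonal plus weak symmetric coupling) and the Jacobi choice of $T$ are assumptions made for illustration, and `pcg` is a hypothetical helper, not code from the talk.

```python
import numpy as np

def pcg(A, b, apply_T, tol=1e-8, max_iter=5000):
    """Preconditioned CG for SPD A; apply_T(r) should approximate A^{-1} r."""
    x = np.zeros_like(b)
    r = b.copy()                    # residual for the zero initial guess
    z = apply_T(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for iteration in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, iteration
        z = apply_T(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p   # new search direction
        rz = rz_new
    return x, max_iter

# SPD test matrix: diagonal spread over four decades plus weak symmetric coupling.
rng = np.random.default_rng(0)
n = 200
coupling = 1e-3 * rng.uniform(-1.0, 1.0, (n, n))
A = np.diag(np.logspace(0.0, 4.0, n)) + 0.5 * (coupling + coupling.T)
b = rng.standard_normal(n)
diagonal = np.diag(A).copy()

x_plain, iters_plain = pcg(A, b, lambda r: r.copy())        # T = identity
x_jacobi, iters_jacobi = pcg(A, b, lambda r: r / diagonal)  # T = diag(A)^{-1}
```

Comparing `iters_plain` with `iters_jacobi` shows the intended effect: the better $T$ approximates $A^{-1}$, the fewer iterations are needed.
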
Bulk Band (BB) Preconditioner
Basic idea:
• Use the electronic properties of the bulk materials that constitute the nanostructure to design a preconditioner
• What does this mean, and how is it done?

BB preconditioner
• Find electronic properties of the bulk materials:
  • Solve (E) on an infinite crystal (bulk material)
  • Because of the periodicity, solve just on the primary cell (a much smaller problem); find the solution in the form (Bloch theorem): $\psi_{nk}(r) = u_{nk}(r)\, e^{ik\cdot r}$, $u_{nk}(r + A) = u_{nk}(r)$
• Denote span$\{\psi_{nk}\}$ as the BB space
• Denote by $H_{BB}$ the Hamiltonian stemming from a bulk problem; if $\psi \in$ BB space, $H_{BB}^{-1}\psi$ is easy to compute
• Note that if $H$ stems from a bulk problem, $H_{BB}^{-1}$ is the exact preconditioner for $H$ ($= H^{-1}$)

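The construction can be sketched in 1D, assuming a finite-difference discretization and a supercell made of `n_cells` copies of the primary cell. Both are illustrative choices, not the discretization used in the talk. For each allowed wave number $k$, only a small primary-cell problem is solved. The resulting Bloch functions span the BB space, and applying $H_{BB}^{-1}$ there reduces to dividing by the band energies, which this toy potential keeps positive.

```python
import numpy as np

def shift_matrix(n):
    """Periodic shift: (S u)[j] = u[(j + 1) % n]."""
    return np.roll(np.eye(n), 1, axis=1)

def supercell_hamiltonian(cell_potential, cell_length, n_cells):
    """Bulk H on n_cells periodic copies of the primary cell (finite differences)."""
    n = cell_potential.size * n_cells
    h = cell_length / cell_potential.size
    S = shift_matrix(n)
    return (-0.5 / h**2) * (S + S.T - 2.0 * np.eye(n)) + np.diag(np.tile(cell_potential, n_cells))

def bloch_basis(cell_potential, cell_length, n_cells, n_bands):
    """Bloch functions psi_nk = u_nk * exp(i k x) from primary-cell solves only."""
    m = cell_potential.size
    h = cell_length / m
    S = shift_matrix(m)
    x = h * np.arange(m * n_cells)
    energies, states = [], []
    for q in range(n_cells):
        k = 2.0 * np.pi * q / (n_cells * cell_length)  # compatible with the supercell
        phase = np.exp(1j * k * h)
        H_k = (-0.5 / h**2) * (phase * S + np.conj(phase) * S.T - 2.0 * np.eye(m)) \
            + np.diag(cell_potential)
        eps, u = np.linalg.eigh(H_k)                   # small m-by-m problem
        for band in range(n_bands):
            psi = np.tile(u[:, band], n_cells) * np.exp(1j * k * x)
            states.append(psi / np.linalg.norm(psi))
            energies.append(eps[band])
    return np.array(energies), np.column_stack(states)

def apply_bulk_inverse(energies, states, vector):
    """H_BB^{-1} applied to the BB-space component of `vector`."""
    return states @ ((states.conj().T @ vector) / energies)

cell_potential = 1.0 + 0.5 * np.cos(2.0 * np.pi * np.arange(8) / 8)
energies, states = bloch_basis(cell_potential, cell_length=1.0, n_cells=6, n_bands=2)
```

Here the BB space has dimension 12 inside a 48-dimensional solution space, and each of the six primary-cell solves is only 8 by 8.
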
BB preconditioner, continued …
• Decompose the current residual $R$ as $R = Q_{BB} R + (R - Q_{BB} R)$, where $Q_{BB}$ is the $L^2$ projection onto the BB space
• Use $H_{BB}^{-1}$ to precondition the $Q_{BB} R$ component of $R$ and a diagonal preconditioner $D^{-1}$ for the $(R - Q_{BB} R)$ component, i.e.
  (1) $T R \equiv H_{BB}^{-1} Q_{BB} R + D^{-1} (R - Q_{BB} R)$
• $T R$ in (1) is just one example …
• Preconditioners of form (1) are referred to in the literature as additive; another variation is
  (2) $T R \equiv H_{BB}^{-1} Q_{BB} R + w D^{-1} R$,
  where $w > 0$ is a damping parameter

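Forms (1) and (2) translate almost directly into code. The sketch below assumes the BB space is given as a matrix `bb_basis` with orthonormal columns, the matching bulk energies as `bb_energies`, and $D$ as a vector `diagonal`. These names and the single-function interface are illustrative.

```python
import numpy as np

def additive_bb_preconditioner(residual, bb_basis, bb_energies, diagonal, weight=None):
    """Additive BB preconditioner: form (1) if weight is None, else form (2)."""
    coefficients = bb_basis.conj().T @ residual        # coordinates of Q_BB R
    projected = bb_basis @ coefficients                # Q_BB R
    bulk_part = bb_basis @ (coefficients / bb_energies)  # H_BB^{-1} Q_BB R
    if weight is None:
        # (1): diagonal preconditioner only on the complement of the BB space
        return bulk_part + (residual - projected) / diagonal
    # (2): damped diagonal preconditioner on the whole residual
    return bulk_part + weight * residual / diagonal
```

With `weight=None` the function implements (1). Passing, say, `weight=0.3` switches to (2), where the damping parameter must be chosen by the user.
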
BB preconditioner, continued …
• (2) can be viewed as a multilevel (two-level) preconditioner: “correct” the low-frequency components of $R$ with $H_{BB}^{-1}$ and “smooth” the high frequencies with $D^{-1}$
• How should $w$ in (2) be chosen? (A similar weighting is also implicit in (1).)
• Avoid the problem of determining it by considering a multiplicative multilevel version of the BB preconditioner:
  $r_1 = D^{-1} R$
  $r_2 = r_1 + H_{BB}^{-1} Q_{BB} (R - H r_1)$
  $T R \equiv r_2 + D^{-1} (R - H r_2)$

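The three multiplicative steps can be written as a short function. As in the additive case, this is a sketch under assumptions: `apply_H` is a user-supplied product with $H$, `bb_basis` has orthonormal columns spanning the BB space, and `bb_energies` and `diagonal` hold the bulk energies and the entries of $D$. All names are illustrative.

```python
import numpy as np

def multiplicative_bb_preconditioner(residual, apply_H, bb_basis, bb_energies, diagonal):
    """Two-level multiplicative BB preconditioner: smooth, bulk-correct, smooth."""
    def bulk_solve(vector):
        # H_BB^{-1} Q_BB applied to a vector
        return bb_basis @ ((bb_basis.conj().T @ vector) / bb_energies)

    r1 = residual / diagonal                          # pre-smoothing with D^{-1}
    r2 = r1 + bulk_solve(residual - apply_H(r1))      # correction in the BB space
    return r2 + (residual - apply_H(r2)) / diagonal   # post-smoothing with D^{-1}
```

Writing $B = H_{BB}^{-1} Q_{BB}$, the steps give $I - TH = (I - D^{-1}H)(I - BH)(I - D^{-1}H)$, so no damping parameter appears. The cost is two extra products with $H$ per application.
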
Numerical results
[Figure panels: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
• Tests on a bulk problem
  • The BB preconditioner should be most efficient in this case (speedup of a factor of 3, increasing with problem size)
  • We start with an arbitrary initial guess
  • Here the BB space dimension is 1.5% of the solution space dimension

Numerical results
[Figure panels: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
• Tests with a “perturbed” potential (simulating a quantum dot)
  • Factor of 2 speedup
  • Increasing with increasing problem size

Numerical results
• Tests with a “perturbed” potential (simulating a quantum dot)
  • Localized wave functions with charge density confinement, simulating a quantum dot

Numerical results
• Various perturbations with the BB multiplicative preconditioner
[Figure panels: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
  • Not very sensitive to an increase in the perturbation

Numerical results
• BB vs. diagonal preconditioning on a bigger system (4096 atoms of Cd48-Se34) for various perturbations
[Figure panels: BB multiplicative preconditioning; Diagonal preconditioning]
  • Speedup exceeding a factor of 3
  • Rises to about a factor of 7 for perturbation 4

Numerical results
• Comparison of diagonal (in red) vs. BB preconditioning (in green) using the folded spectrum, $(H - E_{ref})^2$
[Figure panels: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
  • The speedup from the $H$ case is multiplied by a factor of 2
  • A speedup of a factor of 4 for small problems, increasing with problem size

Conclusions
• A new preconditioning technique was presented
• Numerical results show the efficiency of the BB preconditioning
  • A factor of 4 speedup for small problems with folded spectrum (compared to diagonal preconditioning)
  • Increased efficiency with increasing problem size
• More testing has to be done
  • On bigger problems
  • With real quantum dots