Large-Scale Density-Functional calculations for nano-meter size Si materials

Large-Scale Density-Functional calculations for nano-meter size Si materials. Jun-Ichi Iwata Center for Computational Sciences University of Tsukuba. Feb 23, 2010, Tsukuba-Edinburgh Computational Science Workshop, Edinburgh. Outline. Quantum Mechanical ( First-Principles )


Presentation Transcript


  1. Large-Scale Density-Functional calculations for nano-meter size Si materials Jun-Ichi Iwata Center for Computational Sciences University of Tsukuba Feb 23, 2010, Tsukuba-Edinburgh Computational Science Workshop, Edinburgh

  2. Outline • Quantum Mechanical (First-Principles) Simulation in Solid-State Physics • Density-Functional Theory: W. Kohn (Nobel Prize in 1998) • Density-Functional simulations for large systems • Real-Space DFT program code for Parallel Computation (RSDFT) • Applications of RSDFT for Si nano materials • >10,000-atom system

  3. First-Principles Calculation in Material Physics • We describe material properties from the behavior of electrons and ions. • ions → classical, electrons → quantum • We solve the Schrödinger equation for the electronic ground state. • Density-functional theory is a powerful tool for this purpose.

  4. Density-Functional Theory • Energy functional of the electron density (minimize): we get stable atomic and electronic structures. • Minimizing the functional leads to the Kohn-Sham equation, which contains an effective potential. → We have to solve this equation self-consistently (nonlinear eigenvalue problem). • P. Hohenberg and W. Kohn, Phys. Rev. 136 (1964) B864. • W. Kohn and L. J. Sham, Phys. Rev. 140 (1965) A1133.
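
The formulas on this slide were images and did not survive in the transcript. As a reminder only, the standard textbook form of the Kohn-Sham scheme is sketched below. The notation (Hartree atomic units, spin-unpolarized, occupations f_i, ion-ion energy omitted) is an assumption and may differ from what the slide showed.

```latex
\begin{aligned}
E[n] &= T_s[n] + \int v_{\mathrm{ext}}(\mathbf{r})\, n(\mathbf{r})\, d\mathbf{r}
      + \frac{1}{2}\iint \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}'
      + E_{\mathrm{xc}}[n], \\
\Bigl[-\tfrac{1}{2}\nabla^2 + v_{\mathrm{eff}}[n](\mathbf{r})\Bigr]\phi_i(\mathbf{r})
     &= \varepsilon_i\, \phi_i(\mathbf{r}), \\
v_{\mathrm{eff}}[n](\mathbf{r}) &= v_{\mathrm{ext}}(\mathbf{r})
      + \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}'
      + \frac{\delta E_{\mathrm{xc}}[n]}{\delta n(\mathbf{r})}, \\
n(\mathbf{r}) &= \sum_i f_i\, |\phi_i(\mathbf{r})|^2 .
\end{aligned}
```

Because the effective potential depends on the density, which in turn depends on the orbitals, the eigenvalue problem is nonlinear and is iterated until the density stops changing.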

  5. Performance of DFT with a simple approximation • Exchange functional in the Local-Density Approximation (LDA) • Describes various properties correctly, with quantitatively good results • Si (in diamond structure): M. T. Yin and M. L. Cohen, Phys. Rev. B 26, 5668 (1982).

  6. Everybody wants to apply DFT to large systems. A. Ichimiya et al., Surf. Sci. 493, 555 (2001). • Usually, we treat 10- to 1000-atom systems by DFT. • However, we need to treat larger systems: • to study large objects (nano structures, proteins) • to make the atomic model more realistic • Proteins (cytochrome c oxidase): ~30,000 atoms • Nano structures (Si pyramid): ~100,000 atoms

  7. Real-Space DFT program code (RSDFT) • Solve the Kohn-Sham equation (eigenvalue problem) → computational cost ~ O(N^3) • Developed for parallel computers

  8. Real-Space Method (⇔ Reciprocal-Space (Plane-Wave) Method) • Higher-order finite-difference pseudopotential method: J. R. Chelikowsky et al., Phys. Rev. B, (1994) • Discretize: continuous space → discrete space; function → column vector; Laplacian → higher-order finite difference • Typical number of grid points: 10,000 to 1,000,000
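
To make the "Laplacian → higher-order finite difference" step concrete, here is a minimal one-dimensional sketch. It applies the standard central-difference weights for the second derivative to a function sampled on a periodic grid. The function names, the periodic boundary, and the test function sin(x) are illustrative assumptions and are not taken from RSDFT.

```python
import math

# Standard central-difference weights for d^2/dx^2, stored as (c_0, c_1, ..., c_m).
# Half-width m = 1, 2, 3 gives 2nd-, 4th-, and 6th-order accuracy.
FD2_COEFFS = {
    1: [-2.0, 1.0],
    2: [-5.0 / 2.0, 4.0 / 3.0, -1.0 / 12.0],
    3: [-49.0 / 18.0, 3.0 / 2.0, -3.0 / 20.0, 1.0 / 90.0],
}


def second_derivative_periodic(f, h, m):
    """Approximate f'' on a periodic grid with spacing h and stencil half-width m."""
    c = FD2_COEFFS[m]
    n = len(f)
    out = []
    for i in range(n):
        acc = c[0] * f[i]
        for k in range(1, m + 1):
            # Symmetric neighbors; the modulo wraps around the periodic grid.
            acc += c[k] * (f[(i + k) % n] + f[(i - k) % n])
        out.append(acc / (h * h))
    return out


def max_error(m, n=32):
    """Largest deviation from the exact result (sin x)'' = -sin x on n grid points."""
    h = 2.0 * math.pi / n
    f = [math.sin(i * h) for i in range(n)]
    d2 = second_derivative_periodic(f, h, m)
    return max(abs(d2[i] + f[i]) for i in range(n))
```

Calling `max_error(1)`, `max_error(2)`, and `max_error(3)` shows the error dropping as the stencil widens, which is the reason a wider stencil can be combined with a coarser grid. In three dimensions the same weights are applied along each axis, so each row of the discretized Hamiltonian touches only a few grid points and the matrix stays sparse.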

  9. RSDFT: suitable for parallel first-principles calculation • Real-Space Finite-Difference • Sparse Matrix • FFT free (FFT is inevitable in the conventional plane-wave code) • MPI (Message Passing Interface) library • The 3D grid is divided into several regions for parallel computation (figure: regions assigned to CPU0 through CPU8). • Kohn-Sham eq. (finite-difference): higher-order finite difference → MPI_ISEND, MPI_IRECV • Integration → MPI_ALLREDUCE
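
The communication pattern on this slide can be illustrated without MPI. The sketch below is a serial stand-in, not RSDFT code: a periodic 1D grid is cut into contiguous regions, each region is padded with halo points copied from its neighbors (the role MPI_ISEND and MPI_IRECV play in a real run), the stencil is applied locally, and per-region partial sums are combined at the end (the role of MPI_ALLREDUCE). All names and the 4th-order stencil are assumptions chosen for illustration.

```python
import math

# 4th-order central-difference weights for d^2/dx^2 (illustrative choice).
COEFFS = [-5.0 / 2.0, 4.0 / 3.0, -1.0 / 12.0]
M_HALO = len(COEFFS) - 1  # stencil half-width, which is also the halo width


def apply_stencil(values, start, stop):
    """Apply the stencil to values[start:stop]; all neighbors must be present."""
    out = []
    for i in range(start, stop):
        acc = COEFFS[0] * values[i]
        for k in range(1, M_HALO + 1):
            acc += COEFFS[k] * (values[i + k] + values[i - k])
        out.append(acc)
    return out


def region_bounds(n, nranks):
    """Contiguous [lo, hi) index ranges; the last region takes the remainder."""
    size = n // nranks
    return [(r * size, n if r == nranks - 1 else (r + 1) * size)
            for r in range(nranks)]


def with_halo(f, lo, hi):
    """Copy f[lo:hi] plus M_HALO periodic neighbors on each side (halo exchange)."""
    n = len(f)
    return [f[i % n] for i in range(lo - M_HALO, hi + M_HALO)]


def decomposed(f, nranks):
    """Stencil and grid sum computed region by region, then combined."""
    pieces, partial_sums = [], []
    for lo, hi in region_bounds(len(f), nranks):
        local = with_halo(f, lo, hi)
        pieces.extend(apply_stencil(local, M_HALO, len(local) - M_HALO))
        partial_sums.append(sum(f[lo:hi]))  # local part of an integral
    return pieces, sum(partial_sums)        # final sum stands in for an all-reduce


def serial(f):
    """Reference: the same stencil and sum on the undivided grid."""
    padded = with_halo(f, 0, len(f))
    return apply_stencil(padded, M_HALO, len(padded) - M_HALO), sum(f)
```

The point of the sketch is that each region only needs a thin layer of neighbor data, M_HALO points deep, so the communication volume scales with the surface of a region while the computation scales with its volume.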

  10. Massively parallel computing with our recently developed code “RSDFT” (Iwata et al., J. Comp. Phys. (2010)) • Real-Space Density-Functional Theory code (RSDFT) • Based on the finite-difference pseudopotential method (J. R. Chelikowsky et al., PRB 1994) • Highly tuned for massively parallel computers • Computations are done on the massively parallel cluster PACS-CS at the University of Tsukuba (theoretical peak performance = 5.6 GFLOPS/node). • The largest system in the present study → Si10701H1996: grid points = 3,402,059; bands = 22,432 • Figure: convergence behavior for Si10701H1996 • Computational time (with 1024 nodes of PACS-CS): 6781 sec. × 60 iteration steps = 113 hours

  11. Flow chart • Algorithm → subspace iteration method (Rayleigh-Ritz method) • Input initial configuration of ions → calc. ionic potentials • Electronic structure optimization loop: Conjugate-Gradient Method O(N^2) → Gram-Schmidt orthonormalization O(N^3) → Subspace Diagonalization O(N^3) → density, potentials update O(N) → convergence check • Atomic structure optimization loop: Hellmann-Feynman force → move ions → convergence check • Electronic structure optimization must be performed in each atomic optimization step. • Total computational cost ~ O(N^3)

  12. Algorithm 1 → Subspace Iteration Method (Rayleigh-Ritz Method) • Problem: M-dimensional eigenvalue problem; we need the smallest N (≪ M) eigenpairs • Initial guess • Minimize the Rayleigh quotients by the Conjugate-Gradient Method (wave function update)
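
The equations for this slide were lost in extraction. In standard notation (the symbols H, ψ_n, ε_n, and R are assumptions, not necessarily the slide's), the problem and the quantity being minimized can be written as:

```latex
\begin{aligned}
H\,\psi_n &= \varepsilon_n\,\psi_n, \qquad H \in \mathbb{C}^{M\times M},
  \quad n = 1,\dots,N \ll M, \\
R(\psi) &= \frac{\psi^{\dagger} H\, \psi}{\psi^{\dagger}\psi}, \\
\frac{\partial R}{\partial \psi^{\dagger}} &= \frac{H\psi - R(\psi)\,\psi}{\psi^{\dagger}\psi}.
\end{aligned}
```

The last line is the residual that drives each conjugate-gradient step: it vanishes exactly when ψ is an eigenvector and R(ψ) its eigenvalue.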

  13. Algorithm 2 • Gram-Schmidt Orthogonalization: O(MN^2) • Subspace Diagonalization (→ the orthonormalized vectors serve as a basis set): calc. matrix elements O(MN^2); diagonalization O(N^3); rotation to Ritz vectors O(MN^2) ← initial guess for the next iteration
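
For reference, the subspace diagonalization (Rayleigh-Ritz) step is conventionally written as three operations. The symbols below are assumptions, and the cost attached to each line follows the usual operation count rather than anything confirmed by the transcript:

```latex
\begin{aligned}
\tilde{H}_{mn} &= \psi_m^{\dagger} H\, \psi_n
  && \text{matrix elements, } O(MN^2) \\
\tilde{H}\, C &= C\, \Lambda
  && \text{dense } N \times N \text{ diagonalization, } O(N^3) \\
\psi'_n &= \sum_{m=1}^{N} \psi_m\, C_{mn}
  && \text{rotation to Ritz vectors, } O(MN^2)
\end{aligned}
```

Here the ψ_m are the N orthonormalized vectors of length M, and the columns of C are the eigenvectors of the small matrix. The rotated vectors ψ'_n are what the next iteration starts from.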

  14. Gram-Schmidt orthogonalization ~ Active use of Level 3 BLAS in the O(N^3) computation ~ → Collaboration with computer scientists greatly improved the performance of RSDFT! • Time & performance for Gram-Schmidt (theoretical peak performance = 5.6 GFLOPS/node): the O(N^3) part can be computed at 80% of the theoretical peak performance! • Algorithm of GS: part of the calculations can be performed as a Matrix × Matrix operation!
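
The idea of recasting Gram-Schmidt as matrix × matrix work can be sketched with a simple blocked variant. This is an illustration under stated assumptions, not the scheme used in RSDFT: vectors are processed in blocks, each new block is projected against all finished vectors in two matrix-matrix products (S = QᵀB, then B ← B − QS), and only the small remainder inside the block uses vector-by-vector Gram-Schmidt. Plain Python lists stand in for the arrays that a production code would pass to a Level 3 BLAS routine.

```python
import math


def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def mgs_inplace(block):
    """Modified Gram-Schmidt inside one small block (vector-vector work)."""
    for i in range(len(block)):
        for j in range(i):
            s = dot(block[j], block[i])
            block[i] = [a - s * b for a, b in zip(block[i], block[j])]
        norm = math.sqrt(dot(block[i], block[i]))
        block[i] = [a / norm for a in block[i]]


def block_gram_schmidt(vectors, block_size):
    """Orthonormalize `vectors` (a list of equal-length lists) block by block."""
    done = []  # finished orthonormal vectors, playing the role of Q
    for start in range(0, len(vectors), block_size):
        block = [list(v) for v in vectors[start:start + block_size]]
        if done:
            # S = Q^T B: one matrix-matrix product over the whole block.
            S = [[dot(q, b) for b in block] for q in done]
            # B <- B - Q S: a second matrix-matrix product.
            for j, b in enumerate(block):
                for i, q in enumerate(done):
                    s = S[i][j]
                    b[:] = [x - s * y for x, y in zip(b, q)]
        mgs_inplace(block)
        done.extend(block)
    return done
```

With N vectors and a fixed block size, almost all of the arithmetic sits in the two products involving `S`, which is the part that maps onto a routine such as DGEMM. A single projection pass like this one can lose orthogonality for ill-conditioned inputs, so a production code would need to take more care.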

  15. PACS-CS (5.6 GFLOPS/node), 256 nodes • Figure: elapsed time for 1 step of iteration, with the O(N^2) and O(N^3) parts labeled → the times for the O(N^2) part and the O(N^3) part become comparable

  16. Application 1 Nano-meter size Si quantum dots

  17. The Si quantum dot is a promising material for several device applications • Memory • Single-electron transistor • Optical device • Clarifying the relation between the “dot size” and the “band gap” is important for controlling the device properties. • Are first-principles calculations useful for such studies? → Yes, but … the system size is very large! • A model of the Si quantum dot of 6.6 nm diameter (Si7055H1596)

  18. Band Gaps • Experimental fit curve from STS measurement: B. Zaknoon et al., Nano Letters 8, 1689 (2008). • Figure: band gaps (eV), with annotations at 300 atoms and >10,000 atoms • The ΔSCF gap seems to be closer to the ΔKS gap …

  19. Application 2 Si nanowires

  20. Samsung Si nanowire devices

  21. Several sizes of Si nanowires • 4 nm diameter (425 atoms) • 10 nm diameter (2341 atoms) • 20 nm diameter (8941 atoms) • There may be an optimum diameter in the region of 10 nm to 20 nm.

  22. Band Structure and DOS of SiNW (d = 1 nm) • Si21H20 (41 atoms) • Eg = 2.60 eV (LDA bulk: 0.53 eV)

  23. Band Structure and DOS of SiNW (d = 4 nm) • Si341H84 (425 atoms) • Eg = 0.81 eV (LDA bulk: 0.53 eV)

  24. Band Structure and DOS of SiNW (d = 8 nm) • Si1361H164 (1525 atoms), Eg = 0.61 eV • Bulk Si: Eg = 0.53 eV

  25. Si nanowire with surface roughness • Si12822H1544 (14,366 atoms); figure: side view and top view • 10 nm diameter, 3.3 nm height, (100) • Grid spacing: 0.45 Å (~14 Ry) • Number of grid points: 4,718,592 • Number of bands: 29,024 • Memory: 1,022 GB to 2,044 GB
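
A quick back-of-the-envelope check of the memory figure, assuming the dominant cost is storing every band on every grid point as a real double-precision number (8 bytes per value). That storage model is an assumption; the slide does not say how its numbers were derived.

```python
def wavefunction_memory_gib(grid_points, bands, bytes_per_value=8):
    """Size in GiB of one dense (grid_points x bands) array of wave functions."""
    return grid_points * bands * bytes_per_value / 2**30


# Values quoted on the slide for Si12822H1544.
one_copy = wavefunction_memory_gib(4_718_592, 29_024)
two_copies = 2 * one_copy
print(f"one copy: {one_copy:.1f} GiB, two copies: {two_copies:.1f} GiB")
```

One copy comes to about 1,020 GiB and two copies to about 2,041 GiB, close to the quoted range of 1,022 GB to 2,044 GB. This suggests, but does not confirm, that the range corresponds to holding one to two copies of the wave-function array.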

  26. PACS-CS, 1024 nodes (peak performance: 5.6 GFLOPS/node) • Subspace diagonalization: 4600 sec. • Gram-Schmidt: 2300 sec. • Conjugate-Gradient Method: 3700 sec. • Total energy calc.: 1200 sec. • Total (1 step): 12,000 sec. • Figure: DOS of SiNW with roughness and DOS of bulk Si; d = 10 nm (with roughness), Si12822H1544 (14,366 atoms), Eg = 0.57 eV

  27. Application 3 Si divacancy

  28. There are two possibilities for the structure of the Si divacancy: Resonant-Bond type and Large-pairing type. • Structure of Si divacancy (figure): small yellow balls: vacancies (no atoms); green balls: Si atoms with dangling bonds. • What is the stable structure? • EPR experiment (Watkins & Corbett, 1965): Large-pairing type • LDA calculation (Saito & Oshiyama, 1994): Resonant-Bond type is stable (Large-pairing type was not found); model size ~ 60 atoms • More recent LDA calculation (Oguet et al., 1999): both “Large-pairing” and “Resonant-Bond” structures were found; Large-pairing type is the most stable (RB type is a local minimum); model size ~ 300 atoms → Model-size dependence?

  29. There are two possibilities for the structure of the Si divacancy: Resonant-Bond type and Large-pairing type. • Structure of Si divacancy (figure): small yellow balls: vacancies (no atoms); green balls: Si atoms with dangling bonds. • Structures converge at the 998-atom model. • The LP structure appears at 510-atom or larger models. • The RB structure is most stable, but the energy difference is very small (<10 meV). • Figure: dac, dab (Å) against model size (number of atoms) for the Large-pairing, Resonant-Bond, and Small-pairing structures • J.-I. Iwata et al., Phys. Rev. B 77 (2008) 115208

  30. Summary • We have developed a Real-Space DFT program code for large systems by utilizing massively parallel computers. • Collaboration with computer scientists greatly improved the performance of RSDFT (especially the O(N^3) part of the calculation, with Level 3 BLAS). • By using a few hundred to 1000 CPUs, we have achieved first-principles calculations for: ・ Si 1000-atom systems with atomic structure optimization ・ self-consistent electronic structures of Si 10,000-atom systems • By using large atomic models → eliminate the model-size dependence • We have applied RSDFT to nanometer-scale Si materials (SiNW, SiQD). • I think RSDFT will become a useful tool for future device development.
