Cyber-Infrastructure for Materials Simulation at the Petascale
• Trends/opportunities
• Recommendations for CI support
David Ceperley
National Center for Supercomputing Applications, Department of Physics, and Materials Computation Center, University of Illinois Urbana-Champaign
N.B. All views expressed are my own.
What will happen during the next 5 years? What will the performance be used for?
• CPU time increases by a factor of >30; memory increases are similar: supercomputer performance on the desktop.
• Some methods (e.g. simulations) are not communication/memory bound; they will become more useful.
• The community is large and diverse, spanning many areas.
• There is a large movement to local clusters and away from large centers.
• Impact of new computing trends: consumer devices, grids, data mining, …
Reductionist approach to "nano-scale" systems • Unity of chemistry, condensed matter physics, materials science, biophysics, and nanotechnology at the nano-scale • All are described by quantum mechanics for the electrons and statistical mechanics for the ions. The computational problem is well-posed but very hard. • Rapid progress is possible for calculations of real materials • What are the challenges for the next decade? How will petaflop computers be used? • Four main predictable directions:
1. Quest for accuracy in condensed matter simulations. "Chemical accuracy" is difficult to achieve: • Hard sphere MD/MC ~1953 • Empirical potentials (e.g. Lennard-Jones) ~1960 • Local density functional theory ~1985 • Correlated electronic structure methods ~2000
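For reference, the empirical pair potentials mentioned above are typically of the Lennard-Jones form (a standard textbook expression, recalled here rather than taken from the talk):

    V(r) = 4\varepsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right]

where \varepsilon sets the well depth and \sigma the particle diameter.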
Typical accuracy today (systematic error) is 1000 K for ab initio simulations. • Accuracy needs to be 100 K to predict room-temperature phenomena. • The simulation approach needs only ~100x the current resources if systematic errors are under control and efficiency is maintained. • Example problem achievable within 5 years: direct (ab initio) simulation of liquid water from electrons and nuclei, with accuracy much better than the current ~50 °C. • Problem that could become solvable: simulation of strongly correlated electronic systems such as the copper oxides.
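One way to read the 100x estimate, assuming the residual error behaves statistically (as in Monte Carlo sampling, where the error bar shrinks with the square root of the computer time):

    \sigma \propto 1/\sqrt{T_{\mathrm{CPU}}}
    \;\Rightarrow\;
    T_{\mathrm{new}}/T_{\mathrm{old}} = (\sigma_{\mathrm{old}}/\sigma_{\mathrm{new}})^{2}
    = (1000\,\mathrm{K}/100\,\mathrm{K})^{2} = 100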
2. Larger Systems • The complexity of simulation methods is similar, ranging from O(1) to O(N). Only some methods are ready to scale up. • But simulations are really 4d: both space and time need to be scaled. • A 10^4 increase in CPU means a 10-fold increase in length and time scales: go from 2 nm to 20 nm by the end of the decade at high accuracy. Many important physical applications.
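A back-of-the-envelope version of the 4d argument, assuming the cost grows linearly with the number of particles (proportional to L^3 at fixed density) and with the simulated time t:

    T \propto L^{3}\, t
    \;\Rightarrow\;
    s^{4} = 10^{4}
    \;\Rightarrow\;
    s = 10

where s is the common factor by which length and time are scaled; this is what takes a 2 nm simulation to 20 nm while also stretching the accessible time scale.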
3. More Systems • Parameter studies are a very promising use of petaflop resources. • Typical example: materials design. • Combinatorics leads to a very large number of possible compounds to search [>92k]; a rough count is sketched below. • Needs both accurate QM calculations, statistical mechanics, multiscale methods, easily accessible experimental data, … Interdisciplinary!
What is the most stable binary alloy? What about 4 components? Each square represents a PhD in 1980
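A rough count of the element combinations alone, using Python's math.comb and assuming one picks k distinct components from the ~92 naturally occurring elements (choices of stoichiometry and crystal structure multiply these numbers much further):

    from math import comb

    elements = 92  # roughly the number of naturally occurring elements
    for k in (2, 3, 4):
        # number of distinct k-component element combinations
        print(f"{k}-component systems: {comb(elements, k):,}")

Even this undercount runs into the millions of candidates at four components, which is why accurate automated QM calculations and easily accessible experimental data both matter.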
4. Multiscale • The challenge is to integrate what is happening on the microscopic quantum level with the mesoscopic classical level. • How to do it without losing accuracy? Levels involved: QMC/DFT, DFT-MD, SE-MD, FE. • How to make it parallel? (Load balancing with different methods; see the toy sketch below.) • Example: simulation of chemical reactions in solution. Lots of software/interdisciplinary work is needed. Important recent progress.
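A toy sketch of why the load balancing is hard: per-particle costs of the different methods differ by orders of magnitude, so dividing space evenly across processors fails badly. All method labels, regions, and cost numbers below are invented purely for illustration.

    # Toy cost model: greedy assignment of regions (each handled by a different
    # method) to ranks, weighted by a made-up relative cost per particle/element.
    relative_cost = {"QMC": 1e4, "DFT-MD": 1e2, "SE-MD": 1.0, "FE": 0.01}

    # hypothetical regions: (name, method, number of particles/elements)
    regions = [("reaction site", "QMC", 50),
               ("solvation shell", "DFT-MD", 500),
               ("bulk solvent", "SE-MD", 50_000),
               ("continuum", "FE", 1_000_000)]

    work = sorted(((relative_cost[m] * n, name) for name, m, n in regions),
                  reverse=True)

    n_ranks = 8
    loads = [0.0] * n_ranks
    assignment = {}
    for w, name in work:                 # heaviest task goes to the lightest rank
        r = loads.index(min(loads))
        loads[r] += w
        assignment[name] = r

    print(assignment)
    print([round(l) for l in loads])     # loads stay very uneven unless the
                                         # expensive tasks are themselves split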
How do we realize the potential of computer technology in research and education? • How do we make the best use of existing resources? • What are the problems?
Why are some groups more successful than the US materials community? • Europeans (VASP, ABINIT, …), quantum chemistry, lattice gauge theory, applied math, … • CI does not fit into the professional career path. • Software is expensive. • We need long-term, carefully chosen projects. • Unlike research, the effort is wasted unless the software is documented, maintained, and used. • Big opportunity: my impression is that the state of software in our field is low; we could be doing research more efficiently. • Basic condensed matter software is needed in education.
3 legs of CI • Research: new algorithms have led to efficiency advances greater than those from hardware; many new methods are appearing. • Deployment and maintenance: new, efficient codes don't just happen. • Education: we need to bring more people into the CI game. Funding for the 3 legs should be separate since they have different aims.
Software/infrastructure development • Support development of tested methodology, including user documentation, training, and maintenance. • Market-based approach to deciding what software we need. • Yearly open competitions for small (1 PDRA) grants for developing & maintaining CI indefinitely. • Standing panel to rank proposals based on expected impact within 5 years. • Institutional memory in the panel is important. • Key factors in the review should be the experience of actual users of the software, the judgment of experts in the methodology, and the measurable scientific impact of the code. • This is not research but deployment and maintenance.
Education in Computational Science • Need for ongoing specialized training: workshops, tutorials, courses • Parallel computing, optimization • Numerical libraries and algorithms • Languages, code development tools • Develop a computational culture and community. Need to refill the pipeline for algorithm experts. • Falls in between NSF directorates • Meeting place for scientists of different disciplines having similar problems (like KITP?) • Reach a wider world through the web. Explore new ways of sustaining groups. • Large payoff for relatively low investment
Inter-aspects • Interdisciplinary teams are needed: especially CS & applications, applied math & applications (performance analysis, best-practice software development) • Disciplines sharing the same problem • International teams • …
Databases for materials? • We need vetted benchmarks with various theoretical and experimental data. • Storage of all the outputs? • What is the balance between computation and storage? • Computed data is perishable, in that the cost to regenerate it decreases each year and improvements in accuracy mean newer data is more reliable. • Useful in connection with published reports for testing codes and methods. • Expanding role of journals? The new journal "Computational Science and Discovery" has this as its aim. • Need to handle "drinking from the firehose." This could be handled by an XML-based data structure (standards) for storing inputs and outputs; a minimal sketch follows below. (A Zurich meeting next month will address this.)
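A minimal sketch of the XML idea using Python's standard library; the schema (element and attribute names) is invented here purely for illustration and is not an actual community standard.

    import xml.etree.ElementTree as ET

    # One self-describing record per calculation: code, inputs, key outputs.
    run = ET.Element("calculation")
    ET.SubElement(run, "code", name="example_dft_code", version="1.0")
    inp = ET.SubElement(run, "input")
    ET.SubElement(inp, "system").text = "liquid water, 64 molecules"
    ET.SubElement(inp, "parameter", name="planewave_cutoff_Ry").text = "80"
    out = ET.SubElement(run, "output")
    ET.SubElement(out, "total_energy", units="Ha").text = "0.0"  # placeholder value
    ET.ElementTree(run).write("run_record.xml")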
Computational funding modes • Large collaborations (medium ITRs): needed for multidisciplinary/large projects. • Algorithmic research (small ITRs): fits into the "scientific/academic" culture. • Cycle providers (NSF centers & local clusters): a "time machine" for groups not having their own cluster or having special needs. • Coupled research/development/CPU grants for the petascale machines. • Software/infrastructure development • Education in CI (at risk: unmet opportunities)