Summary of Bill Saphir's reflections on the Greenbook talks at NERSC. The talks asked for clusters, current generation processors, better relative latency/bandwidth, and more interactive resources. Topics include compute architecture, workload analysis, systems management, user services, software, networking, storage, grid technologies, scientific computing, data analysis, and visualization.
NERSC Reflection of Greenbook Talks
Bill Saphir
Summary
$\forall x \in \{\text{HPC}\}:\ \mathrm{need}(x)$
Compute system architecture
• You want
  • Clusters
  • X1
  • Better relative latency/bandwidth w.r.t. Seaborg
  • Current generation processors
  • More interactive resources [note: you’ll need to say that the utilization tradeoff is worthwhile for science]
• Themes
  • Different resources for different types of computing (but interface should be similar!)
  • Fast processors are good
• Note:
  • Easy to confuse requirement for fast processors with requirement for faster relative interconnect performance
  • We are working on quantitative workload analysis/requirements based on code instrumentation
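The note about confusing processor speed with relative interconnect performance can be illustrated with a toy cost model. The sketch below is not from the talk: it assumes each step costs compute time plus communication time, with each message costing latency plus size over bandwidth. The function names and every number are hypothetical, chosen only for illustration.

```python
def message_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """Time for one message under a simple latency + size/bandwidth model."""
    return latency_s + nbytes / bandwidth_bytes_per_s


def step_times(flops, flop_rate, n_messages, nbytes, latency_s, bandwidth_bytes_per_s):
    """Return (compute_seconds, communication_seconds) for one step."""
    compute = flops / flop_rate
    comm = n_messages * message_time(nbytes, latency_s, bandwidth_bytes_per_s)
    return compute, comm


def comm_fraction(flops, flop_rate, n_messages, nbytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the step spent communicating."""
    compute, comm = step_times(
        flops, flop_rate, n_messages, nbytes, latency_s, bandwidth_bytes_per_s
    )
    return comm / (compute + comm)


# Hypothetical workload: 1e9 flops and 1000 messages of 10 kB per step.
WORK = dict(flops=1e9, n_messages=1000, nbytes=1e4)
# Hypothetical interconnect: 20 microseconds latency, 100 MB/s bandwidth.
NET = dict(latency_s=20e-6, bandwidth_bytes_per_s=1e8)

slow = comm_fraction(flop_rate=1e9, **WORK, **NET)   # baseline processor
fast = comm_fraction(flop_rate=4e9, **WORK, **NET)   # 4x faster processor, same network

if __name__ == "__main__":
    print(f"communication fraction, baseline CPU: {slow:.2f}")
    print(f"communication fraction, 4x CPU:       {fast:.2f}")
```

Under these assumed numbers, a 4x faster processor on an unchanged interconnect spends a larger share of each step communicating, so the step speeds up by less than 4x. That is the sense in which a request for "fast processors" can really be a request for a better-balanced interconnect.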
Capacity/Capability/Interactive
• This is the right place to bring this up!
• As you know, there are significant barriers to change
• If capacity is important, develop a compelling and coordinated case in the Greenbook. Same with interactive use. We can amplify a clear message but not a mixed message.
• Beware of being flooded with lots of small-scale users
• Consider developing new terms to describe types of use
• Is 10:1 capacity:capability the right ratio?
• How much should be devoted to interactive use?
Systems Management
• Positive feedback
• Easy to take for granted
• Don’t forget to indicate its importance
• Queue policies – there is some disagreement with these. Best place to discuss may be in discussion of NERSC role.
User Services
• Positive feedback
• Easy to take for granted
• Don’t forget to indicate its importance
Software
• Software build/support infrastructure is important and could be much better for folks developing the core algorithms/libraries (see Phil’s talk)
Networking
• Several presentations mentioned high bandwidth networks.
• Quantitative requirements will be helpful
• Capacity vs. capability is an issue for networks also
• The physical infrastructure over which we have some influence is NERSC connections to backbone networks, not the backbone networks themselves.
• Don’t forget importance of end-to-end tuning – hardware alone isn’t sufficient!
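One standard way to see why hardware alone is not sufficient is the bandwidth-delay product: a single TCP stream can carry at most one window of data per round trip, so an untuned window caps throughput regardless of link speed. The sketch below is an illustration added here, not material from the talk. The function names, link speed, round-trip time, and window size are all assumed values.

```python
def bandwidth_delay_product_bytes(link_bits_per_s, rtt_s):
    """Bytes that must be in flight to keep the link full."""
    return link_bits_per_s * rtt_s / 8


def window_limited_throughput_bits_per_s(window_bytes, rtt_s, link_bits_per_s):
    """Best-case single-stream throughput: one window per round trip, capped by the link."""
    return min(link_bits_per_s, window_bytes * 8 / rtt_s)


# Assumed example path: 1 Gb/s link with a 50 ms round-trip time.
LINK = 1e9
RTT = 0.05

if __name__ == "__main__":
    bdp = bandwidth_delay_product_bytes(LINK, RTT)
    print(f"window needed to fill the link: {bdp / 1e6:.2f} MB")
    # A 64 KiB window, used here only as an example of an untuned setting.
    untuned = window_limited_throughput_bits_per_s(64 * 1024, RTT, LINK)
    print(f"throughput with a 64 KiB window: {untuned / 1e6:.1f} Mb/s")
```

With these assumed numbers the window, not the link, is the limit, which is the kind of quantitative statement that would make a network requirement in the Greenbook concrete.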
Storage
• Several speakers said they need more storage but didn’t say how much (compute requirements were much more specific)
  • archival
  • online: scratch and permanent
  • shared (within NERSC)
  • moving data to/from other sites
• No discussion of quantitative performance requirements
  • Parallel systems storage
  • Archival storage
• Large online storage vs. fast online storage mentioned by a couple of speakers. This tradeoff is important to understand.
• When are you willing to copy data?
  • between sites?
  • within NERSC?
  • from archive system to online disks?
  • between small/fast disk and large/slow disk?
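The question of when copying data is acceptable becomes easier to answer with rough transfer-time estimates. The sketch below is an added illustration, not data from the talk: the path names and sustained rates are placeholders, and real figures would have to come from measurement.

```python
def transfer_hours(dataset_bytes, rate_bytes_per_s):
    """Hours to move a dataset at a sustained rate, ignoring startup overhead."""
    return dataset_bytes / rate_bytes_per_s / 3600


def copy_times(dataset_bytes, paths):
    """Map each path name to its estimated copy time in hours."""
    return {name: transfer_hours(dataset_bytes, rate) for name, rate in paths.items()}


# Placeholder sustained rates in bytes per second (assumptions, not measurements).
HYPOTHETICAL_PATHS = {
    "between sites": 1e7,
    "archive to online disk": 5e7,
    "large/slow disk to small/fast disk": 1e8,
}

if __name__ == "__main__":
    one_terabyte = 1e12
    for name, hours in copy_times(one_terabyte, HYPOTHETICAL_PATHS).items():
        print(f"{name}: {hours:.1f} h")
```

Filling in measured rates and real dataset sizes would turn "we need more storage" into the quantitative capacity and performance requirements the slide asks for.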
Grid Technologies/Workflow
• Several speakers mentioned the growing importance of data management, workflow management, and integrated development environments
• Concerns raised about one-time-password solutions. Important to clearly express the potential pitfalls in the Greenbook.
• What should NERSC do more? better?
Scientific computing
• Several speakers commented on numerical algorithms and the importance of collaborations (mostly w.r.t. SciDAC)
• What kind of scientific computing/algorithm expertise is most useful or would be useful to you in the NERSC program?
Data analysis and Visualization
• Several speakers referred to visualization services.
• We need specific information about the vis challenges of your projects and what you expect/want the center to do for you.
• How can we best help you with your visualization and analysis needs?
• What do you want to be able to do that is hard today?
• One speaker asked for more software support. What type of software?
• A few speakers talked about large datasets. Do you want to do the analysis at NERSC or move the data and do the analysis locally?
• What type of hardware resources should NERSC provide?
• Note: lots of connections between vis, storage, networking, grid infrastructure – paint a complete picture.
Closing Thoughts
• Doug’s list of questions was a good idea
• NERSC may be able to provide data on current usage and capabilities for comparison or illustration – let us know if we can get you any information.
• Excellent feedback: this will get the Greenbook process off to a good start