PDSF Computing model Thomas Davis ASG/NERSC, LBNL LCCWS
Contents • Why PDSF at a supercomputing center? • Short history of PDSF • What does PDSF do? • How do I get access to PDSF? • Hardware • Software
Why PDSF at a Supercomputing Center? • Sharing of resources. • 24x7x365 glass house. • Can leverage some vendors' expertise. • Access to expertise in data management, networking, and other important services. • Large clusters could be the next HPC systems, so PDSF provides production experience with clusters.
Short history of PDSF • Came from the SSC; crated and moved to Livermore. • No new hardware or software for close to 8 years. • Life support applied when NERSC moved from Livermore to Berkeley in 1997. • Has picked up experimental support ever since, with what appears to be a 1000% growth rate.
How do I get PDSF access? • Buy-in service model. Clients pay for fixed slices of CPU time and disk space. • Over time, as the cluster grows, the amount of disk space and CPU time drops. Moore's law is in effect. • The client has no direct ownership of the cluster. PDSF admins have the right to move resources around when needed. • A client can get more CPU power than it buys: if no one else is using CPUs in the cluster, those resources are available to other users. • LSF Fairshare is used to arbitrate between clients. • Equipment is retired based on Moore's law and warranties.
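The arbitration idea on this slide can be illustrated with a minimal sketch. It is not LSF's actual Fairshare algorithm; it only assumes that a client's priority rises with the shares it bought and falls with its recent usage, and that idle CPUs go to whoever has work waiting. The `Client` class, the priority formula, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    shares: int                  # purchased slice of the cluster (hypothetical units)
    used_cpu_hours: float = 0.0  # recent usage charged against the client
    pending_jobs: int = 0        # jobs waiting in the queue


def dynamic_priority(client):
    # Assumed formula: more shares raise priority, recent usage lowers it.
    return client.shares / (1.0 + client.used_cpu_hours)


def dispatch(clients, free_cpus, hours_per_job=1.0):
    """Hand out free CPUs one at a time to the waiting client with top priority."""
    order = []
    for _ in range(free_cpus):
        waiting = [c for c in clients if c.pending_jobs > 0]
        if not waiting:
            break  # nobody wants the remaining CPUs
        winner = max(waiting, key=dynamic_priority)
        winner.pending_jobs -= 1
        winner.used_cpu_hours += hours_per_job
        order.append(winner.name)
    return order


if __name__ == "__main__":
    busy = [Client("expA", shares=3, pending_jobs=10),
            Client("expB", shares=1, pending_jobs=10)]
    print(dispatch(busy, free_cpus=4))

    # When expA is idle, expB can use more than its purchased slice.
    quiet = [Client("expA", shares=3, pending_jobs=0),
             Client("expB", shares=1, pending_jobs=5)]
    print(dispatch(quiet, free_cpus=4))
```

With both clients busy, the CPUs split roughly in proportion to shares; when one client is idle, the other picks up the unused capacity, which matches the "more CPU power than what they buy" bullet.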
PDSF Hardware model • CPU power is always moving. • We expect to retire any Intel-based system after 3 years. • Disk size is climbing even faster; new drives replace old disks, increasing volume size and performance. • Memory constraints can also force retirement of systems.
Hardware, Continued. • New hardware is always bought as late as possible. • CPU speeds are chosen at the knee of the price/performance curve; 2x650 MHz machines are better than a 1x1 GHz machine. • Memory is also bought based on size; always buy the largest DIMMs when possible.
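The "knee of the curve" rule can be sketched as a simple price/performance comparison. The prices below are invented for illustration and do not come from the talk, and aggregate MHz per dollar is only a crude proxy for real throughput.

```python
def mhz_per_dollar(option):
    # option: (label, cpu_count, mhz_per_cpu, price_dollars)
    _, cpus, mhz, price = option
    return cpus * mhz / price


def best_value(options):
    """Return the option with the most aggregate clock rate per dollar."""
    return max(options, key=mhz_per_dollar)


options = [
    ("dual 650 MHz", 2, 650, 2000.0),   # hypothetical price
    ("single 1 GHz", 1, 1000, 2500.0),  # hypothetical price
]

if __name__ == "__main__":
    for opt in options:
        print(opt[0], round(mhz_per_dollar(opt), 2), "MHz per dollar")
    print("best value:", best_value(options)[0])
```

Under these assumed prices the dual 650 MHz box wins, because the top-speed part carries a premium that is larger than its clock advantage.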
Software model. • Platform LSF is used for batch queuing. • Fought for better pricing, but Platform still wants lots of money. • Red Hat 6.1 is the software base. • Only updated when needed; many experiments can't change. • Control of software is vigorous; because PDSF has many experiments, any software changes are controlled. • Open source is preferred, but proprietary software is acceptable.