Results of the Fermilab 64-bit Linux Hardware and Software Evaluation
Spring 2005 HEPiX meeting, Karlsruhe, Germany
Ken Schumacher, Steven Timm
Fermilab 64-bit Linux Evaluation
Goals of the Evaluation
• Gain experience with the x86_64 architecture of the Linux kernel and see if it is a stable OS platform.
• Evaluate the AMD Opteron CPU and its associated hardware platforms to see if they are reliable.
• Obtain relative performance numbers for Intel Xeon EM64T “Nocona” and AMD Opteron processors.
• Obtain relative performance numbers for applications compiled in 32-bit and 64-bit mode.
64-bit hardware
• Intel IA64 as implemented in Itanium 2
  • Not considered in this evaluation
  • Not binary-compatible with the IA32 instruction set
  • Expensive
• Intel: EM64T Xeon “Nocona”
  • Fermilab already has >240 of these in production
• AMD: AMD64 Opteron
• Note: SPEC CINT2000 scores are about the same. Opteron 250 = 1452 and Xeon 3.6 GHz = 1429.
Extending the 32-bit instruction set
• Intel and AMD schemes are very similar
• 48-bit virtual address space
• 64-bit general-purpose registers (GPRs)
• Support for 64-bit addressing and integer math
• Eight extra GPRs added
• Eight extra XMM registers added
• Difference: EM64T supports SSE3 instructions, Opteron has 3DNow!
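To put the address-space bullet in perspective, the short sketch below compares the size of a 32-bit address space with the 48-bit virtual address space and the full 64-bit range. The helper names `address_space_bytes` and `human` are illustrative only and do not come from the slides.

```python
def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 1 << bits


def human(n: int) -> str:
    """Format a byte count using binary units (floor division; exact for powers of two)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    i = 0
    while n >= 1024 and i < len(units) - 1:
        n //= 1024
        i += 1
    return f"{n} {units[i]}"


if __name__ == "__main__":
    for bits in (32, 48, 64):
        print(f"{bits}-bit addressing: {human(address_space_bytes(bits))}")
```

The 48-bit figure is the virtual address width quoted on the slide; the 64-bit line is shown only to illustrate how much of the register width is not yet used for addressing.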
Vendor Selection
• Only used vendors that Fermilab has previous experience with.
• Requested 12 evaluation units, got 9.
• Opteron units from Koi, ASA, Penguin, CSI, Rackable, IBM, HP, Sun.
• Purposely requested a variety of CPU speeds.
• Motherboard manufacturers represented include Tyan, Accelertech, Sun (by Newisys), IBM (by MSI), HP.
• Dell PowerEdge SC1425 Xeon unit (3.6 GHz) from Dell, as a reference. (Dell doesn’t offer Opteron.)
Machine configurations:
Hardware features
• Dual Opteron boards designed with NUMA
  • Each CPU has its own memory bank
  • No contention between CPUs on a front-side bus
• Some remote management is available on all of them; we did not test it.
• Several have SATA drives, which work fine.
• Broadcom tg3 is the network interface on all.
• Rackable has the low-voltage Opteron 246HE chip: only 55 W, but the same compute power as the regular Opteron 246.
Evaluation units
OS Installation
• Successfully installed all systems with 64-bit NPACI-Rocks, Scientific Linux Fermi i386, and Scientific Linux Fermi x86_64.
• Tested operation of the XFS file system: OK.
• The default SL kernel in version 3.0.3 is 2.4.21-20; we ran with that most of the time.
• A 2.6.9 kernel is needed to take full advantage of the NUMA architecture of the Opterons; that works too.
Linux kernels and distros
• One architecture, x86_64; kernels come compiled for ia32e (Xeon) and amd64 (Opteron).
• Similar to the i386 architecture with its separate i686 and athlon kernels.
• All other RPMs are the same for either.
• We were able to run almost all of our 32-bit applications under the 64-bit kernel/distro in compatibility mode with little trouble.
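When 32-bit and 64-bit executables coexist on one machine, it helps to check which mode a given process is actually running in. The sketch below is one way to do that from Python using only the standard library; it is an illustration, not a tool used in the evaluation, and the function names are hypothetical.

```python
import platform
import struct


def pointer_bits() -> int:
    """Pointer width of the running interpreter: 32 for a 32-bit build, 64 for a 64-bit build."""
    return struct.calcsize("P") * 8


def describe() -> str:
    """Machine type reported by the OS next to the process pointer width."""
    return (f"machine reported by OS: {platform.machine()}, "
            f"process pointer size: {pointer_bits()}-bit")


if __name__ == "__main__":
    print(describe())
```

A 32-bit interpreter running in compatibility mode under an x86_64 kernel would typically report a 32-bit pointer size while the OS still reports the machine as x86_64, although the reported machine string can differ if the process personality has been changed.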
Reliability Testing
• Full Fermilab acceptance test for 30 days
• Continual disk activity on both disks
• Both CPUs continuously busy
• 20 days in 64-bit mode, 10 in 32-bit mode
• Excluding one node with two catastrophic disk failures (which was disqualified), the other seven Opterons had 97.6% uptime.
• Downtime was due to kernel hangs in 64-bit mode that we haven’t been able to reproduce since.
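As a rough reading of the uptime figure, the sketch below converts 97.6% uptime into node-hours of downtime. It assumes the percentage is an aggregate over all seven nodes for the full 30-day test; the slide does not say how the figure was computed, so treat the result as an illustration only.

```python
def downtime_node_hours(nodes: int, days: int, uptime_fraction: float) -> float:
    """Total node-hours of downtime, ASSUMING uptime is aggregated over all nodes and days."""
    total_node_hours = nodes * days * 24
    return total_node_hours * (1.0 - uptime_fraction)


if __name__ == "__main__":
    hours = downtime_node_hours(nodes=7, days=30, uptime_fraction=0.976)
    print(f"about {hours:.0f} node-hours of downtime out of {7 * 30 * 24}")
```

Under that assumption the downtime comes to roughly 121 of 5040 node-hours.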
Benchmarks
• All major Fermilab computing users contributed benchmarks and people to run them.
  • CDF: reconstruction
  • D0: reconstruction
  • CMS: OSCAR and ORCA simulation and digitization, Root stress test, Pythia
  • SDSS: Supernova search program
  • LQCD: QCDStreams, MILC lattice code
  • General: seti@home, CERN unit benchmark, tiny
• Many more details in our paper
CMS Root Benchmark
• 64-bit mode gives gains on Opterons of about 40%
Fermi Cycles
• Reconstruction farms use Fermi Cycles to account for differences in clock speed between Intel and AMD hardware.
• A 1 GHz Pentium III is defined to have 1000 Fermi Cycles.
• Every other platform is rated by the average of its CDF reconstruction and D0 reconstruction performance, normalized to PIII 1 GHz performance.
• The D0 and CDF executables are 32-bit, optimized only for the Pentium architecture, and not recompiled.
• We find the D0 legacy executable runs 2.93x faster on the Opteron 250 and 2.38x faster on the Xeon 3.6 GHz than on the PIII 1 GHz.
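The Fermi Cycles definition above can be sketched as a small function. The D0 speedup of 2.93 for the Opteron 250 is taken from the slide; the CDF speedup is not given there, so the value used below is a loudly labeled placeholder and not a measurement. The function and constant names are illustrative.

```python
PIII_1GHZ_FERMI_CYCLES = 1000  # by definition: a 1 GHz Pentium III rates 1000 Fermi Cycles


def fermi_cycles(cdf_speedup: float, d0_speedup: float) -> float:
    """Average the CDF and D0 reconstruction speedups (each relative to a
    1 GHz PIII) and scale so that the PIII itself scores 1000."""
    return PIII_1GHZ_FERMI_CYCLES * (cdf_speedup + d0_speedup) / 2


if __name__ == "__main__":
    d0_opteron_250 = 2.93          # from the slide
    cdf_placeholder = 3.0          # HYPOTHETICAL value, for illustration only
    print(fermi_cycles(1.0, 1.0))  # the reference machine itself
    print(fermi_cycles(cdf_placeholder, d0_opteron_250))
```

Because the rating averages two real applications rather than using clock speed, a platform can score well above its nominal GHz figure if both reconstruction codes run efficiently on it.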
Compilers
• Use “tiny” (a 3000-line mock reconstruction program in Fortran that runs entirely in cache)
• Opteron 250
  • Legacy executable, i386: 1290 VUPS
  • GCC 3.4.2 optimized: 2440 VUPS
  • Pathscale compiler: 2677 VUPS
• Xeon 3.6
  • Legacy executable, i386: 1386 VUPS
  • GCC 3.4.2 optimized: 2309 VUPS
  • Intel 8.1 compiler: 2910 VUPS
  • Intel 8.1 compiler with profile feedback: 4332 VUPS
• Intel Fortran (and C) 8.1 uses SSE3 instructions to optimize, which makes its output incompatible with Opterons.
• For comparison, Pentium III 1.0 GHz = 568 VUPS.
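To make the compiler comparison easier to read, the sketch below turns the VUPS figures quoted above into speedups over each CPU's legacy i386 executable. The numbers are copied from the slide; the dictionary layout and the `speedup` helper are illustrative.

```python
# VUPS results for "tiny", as quoted on the slide.
VUPS = {
    "Opteron 250": {
        "legacy i386": 1290,
        "gcc 3.4.2 optimized": 2440,
        "Pathscale": 2677,
    },
    "Xeon 3.6": {
        "legacy i386": 1386,
        "gcc 3.4.2 optimized": 2309,
        "Intel 8.1": 2910,
        "Intel 8.1 + profile feedback": 4332,
    },
}


def speedup(vups: float, baseline: float) -> float:
    """Ratio of a result to the chosen baseline result."""
    return vups / baseline


if __name__ == "__main__":
    for cpu, results in VUPS.items():
        base = results["legacy i386"]
        for build, vups in results.items():
            print(f"{cpu:12s} {build:30s} {speedup(vups, base):.2f}x over legacy")
```

Read this way, recompiling with GCC alone gives roughly 1.9x on the Opteron and 1.7x on the Xeon, and the profile-guided Intel build is a little over 3x its legacy baseline.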
Run II benchmarks
Power Draw
• In general, Opterons draw 10-27% less current at full load than comparable Xeon chips.
• Four Opteron 248s vary in current draw, explained by increasing numbers of fans and higher-performance disk drives.
• The low-voltage Opteron 246HE saves 10-15% over the high-voltage Opteron 246.
• We need to average 10 kVA per rack in our facility. We have many racks now that are 12 kVA.
• 10 kVA/rack = 2.1 A/node
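The per-node current budget can be reproduced with a short calculation. The slide does not state the supply voltage or the number of nodes per rack; the values below (120 V and 40 nodes) are assumptions chosen because they reproduce the quoted 2.1 A/node, and the function name is illustrative.

```python
def amps_per_node(rack_va: float, volts: float, nodes_per_rack: int) -> float:
    """Current budget per node for a given rack power budget in volt-amperes."""
    return rack_va / (volts * nodes_per_rack)


if __name__ == "__main__":
    ASSUMED_VOLTS = 120  # ASSUMPTION: not stated on the slide
    ASSUMED_NODES = 40   # ASSUMPTION: not stated on the slide
    for rack_va in (10_000, 12_000):
        amps = amps_per_node(rack_va, ASSUMED_VOLTS, ASSUMED_NODES)
        print(f"{rack_va / 1000:.0f} kVA/rack -> {amps:.1f} A/node")
```

With a different voltage or rack density the per-node budget changes proportionally, so the assumed values should be replaced with the facility's actual figures before drawing conclusions.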
Conclusions
• 64-bit Linux is a stable operating platform.
• The Opteron CPU and associated platforms have sufficient reliability for Fermilab production farms.
• The Opteron CPU gives us slightly better performance for significantly less power draw at about the same price as Xeon.
• Using 64-bit compilation and optimization can lead to significant performance gains on both AMD and Intel.
References
• Fermilab Evaluation Results:
  • http://www-oss.fnal.gov/scs/public/qualify2005/opteron_external.ps
• AMD Developer Symposium 2002
  • “Optimizing for the AMD Opteron™ Processor” by Tim Wilkens, Ph.D.
  • http://www.amd.com/us-en/assets/content_type/DownloadableAssets/Optimization_-_Tim_Wilkens.pdf
Power Supply Efficiency
• General information on PS efficiency
  • http://www.efficientpowersupplies.org/
• “Energy Efficiency of Computer Power Supplies” from CEPE website
  • http://www.cepe.ethz.ch/download/staff/bernard/28_formated.pdf
• http://www.xbitlabs.com/articles/other/display/psu-methodology.html