Farm Completion. Beat Jost and Niko Neufeld LHCb Week St. Petersburg June 2010. Filling the farm. Thanks for interesting and useful discussions to Loic Barda, Rolf Lindner, Laurent Roy and Eric Thomas Thanks for measurements and plots to Juan Caicedo and Patrick Robbe.
Farm Completion Beat Jost and Niko Neufeld LHCb Week St. Petersburg June 2010
Filling the farm
• Thanks for interesting and useful discussions to Loic Barda, Rolf Lindner, Laurent Roy and Eric Thomas
• Thanks for measurements and plots to Juan Caicedo and Patrick Robbe
Farm Completion St. Petersburg 06/2010 - Niko Neufeld
The three limits: Power, Cooling, Money
• Power: 550 kW available (105 kW used)
• Cooling: 525 kW nominally available
• Rack space: 1700 U (plenty)
• Money: xx MCHF
We will be limited by money!
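The headroom implied by these figures can be worked out directly. The following is a minimal sketch, assuming that every kW of electrical power drawn must also be removed by the cooling system (a simplification that the slides do not state); the function name is illustrative only.

```python
# Figures taken from the slide, all in kW.
POWER_AVAILABLE_KW = 550
POWER_USED_KW = 105
COOLING_NOMINAL_KW = 525


def headroom_kw(power_available=POWER_AVAILABLE_KW,
                power_used=POWER_USED_KW,
                cooling=COOLING_NOMINAL_KW):
    """Return (power headroom, cooling headroom) in kW.

    Assumption: the cooling load equals the electrical power drawn.
    """
    power_headroom = power_available - power_used
    cooling_headroom = cooling - power_used
    return power_headroom, cooling_headroom


power_left, cooling_left = headroom_kw()
print(f"Power headroom:   {power_left} kW")
print(f"Cooling headroom: {cooling_left} kW")
```

Under that assumption, cooling rather than power is the tighter of the two physical limits, although both leave several hundred kW free.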
Event Filter Farm
• Level 1:
  • 100 SuperMicro Twin servers (2 servers in a single 1U chassis with a shared power supply), Intel Harpertown CPU 5420 (2.5 GHz), 4 cores / socket, 1 GB RAM / core
• Level 2:
  • 350 Dell blade servers (up to 16 blades in a 10U chassis), Intel Harpertown CPU 5420 (2.5 GHz), 4 cores / socket, 2 GB RAM / core
The new farm-node
• Both Intel and AMD have brought out new processors with up to 12 cores / chip and (Intel) hyper-threads (a.k.a. virtual CPUs)
• Memory has (again) become faster and cheaper (DDR-3), and each processor has 3 memory channels (a “good” memory configuration = 3 * n, where n = 2, 4, 8, 16)
• Both processors are now NUMA (non-uniform memory access)
• A study program is ongoing to take advantage of this
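The "3 * n" rule can be enumerated in a few lines. This is a sketch under two assumptions that are not spelled out on the slide: that n is the size in GB of the memory module placed on each channel, and that the channels of every socket are populated identically. The function name and the `sockets` parameter are hypothetical.

```python
MEMORY_CHANNELS_PER_PROCESSOR = 3   # from the slide
MODULE_SIZES_GB = (2, 4, 8, 16)     # the slide's values of n, read here as GB per module (assumption)


def balanced_memory_gb(sockets=1):
    """List the total RAM sizes (GB) that fill every memory channel equally."""
    return [sockets * MEMORY_CHANNELS_PER_PROCESSOR * n for n in MODULE_SIZES_GB]


print("per processor:      ", balanced_memory_gb())
print("dual-socket server: ", balanced_memory_gb(sockets=2))
```

For a dual-socket server this reading gives 12, 24, 48 or 96 GB, which is consistent with the "24 GB (up to 96)" quoted for the candidate server later in the talk.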
How many jobs / server
How fast?
Server specifications
• 1 GB RAM per hardware thread (= virtual core)
• A single power-supply failure should not affect more than 2 units
• 2 Gigabit Ethernet ports
• No constraints on power consumption
• CPU (AMD 61xx / Intel 56xx) chosen so as to optimise Moore performance per CHF
A likely candidate
• 1.2 kW
• Redundant PS
• 4 servers, each with:
  • 12 cores
  • 24 GB (up to 96) RAM
  • 1 HDD
  • 2 x Gigabit Ethernet
• 21 kCHF list price
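To see why money rather than power or cooling is the expected limit, the candidate chassis can be set against the figures quoted earlier in the talk (550 kW available, 105 kW used, 525 kW nominal cooling). The sketch below assumes that each chassis draws its full 1.2 kW, that cooling load equals power drawn, and that the list price applies; none of these is stated in the slides, and the function name is illustrative. The budget itself is not given in the source, so it is left out.

```python
# Candidate chassis, from the slide.
CHASSIS_POWER_W = 1200
SERVERS_PER_CHASSIS = 4
CHASSIS_LIST_PRICE_KCHF = 21

# Headroom from the power and cooling figures quoted earlier in the talk (W).
POWER_HEADROOM_W = (550 - 105) * 1000
COOLING_HEADROOM_W = (525 - 105) * 1000   # assumes cooling load == power drawn


def max_chassis_by_infrastructure():
    """Largest number of chassis that fits both the power and the cooling headroom."""
    by_power = POWER_HEADROOM_W // CHASSIS_POWER_W
    by_cooling = COOLING_HEADROOM_W // CHASSIS_POWER_W
    return min(by_power, by_cooling)


chassis = max_chassis_by_infrastructure()
servers = chassis * SERVERS_PER_CHASSIS
cost_kchf = chassis * CHASSIS_LIST_PRICE_KCHF
price_per_server_kchf = CHASSIS_LIST_PRICE_KCHF / SERVERS_PER_CHASSIS

print(f"{chassis} chassis = {servers} servers, {cost_kchf} kCHF at list price")
print(f"{price_per_server_kchf} kCHF per server")
```

Filling the infrastructure headroom at list price would cost several MCHF under these assumptions, which illustrates why the purchase is expected to be bounded by the budget before it reaches the power or cooling limit.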
Conclusions
• We will run with 16 Moore jobs / server (twice as many as today)
• Each server will be 2 to 2.5 x faster than the current HLT node
• Each Moore instance can use up to 1.5 GB RAM
• If we really need more RAM:
  • Reduce the number of jobs
  • Increase (double) the memory
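The 1.5 GB figure follows from dividing the 24 GB of the candidate server evenly among the 16 Moore jobs. A small sketch of that arithmetic and of the two fallback options is given below; the helper name is hypothetical, the even split ignores memory used by the operating system, and the reduced job count of 12 is chosen purely for illustration.

```python
def ram_per_job_gb(server_ram_gb, jobs):
    """RAM available to each job if the server memory is split evenly.

    Simplification: the operating system's own memory use is ignored.
    """
    return server_ram_gb / jobs


baseline = ram_per_job_gb(24, 16)        # candidate server, 16 Moore jobs
fewer_jobs = ram_per_job_gb(24, 12)      # option 1: reduce the number of jobs (12 is illustrative)
doubled_memory = ram_per_job_gb(48, 16)  # option 2: double the memory

print(f"baseline:       {baseline} GB / job")
print(f"12 jobs:        {fewer_jobs} GB / job")
print(f"48 GB, 16 jobs: {doubled_memory} GB / job")
```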
Procedure / planning
To-do list
Hardware
• Unpacking (at the surface, SX8; needs a lot of space and friendly volunteers)
• Installation in D1
• Power, network
• Burn-in (3 days)
• Exchange faulty servers / parts
Software
• Install OS, verify OS tuning (NIC, memory arrangement, etc.)
• Integrate into software management (Quattor)
• Add to farm control
Details
How fast? (Moore v9r2 HLT1 only) DAQ & electronics upgrade - Niko Neufeld