Infrastructure availability and Hardware changes
Slides prepared by Niko Neufeld
Presented by Rainer Schwemmer for the Online administrators
Hardware changes
• Hardware here essentially means PCs, storage and network (for ECS interface hardware such as SPECS and CAN, see Clara’s talk)
• In principle, every PC older than 6 years will be thrown out
• PCs between 3 and 5 years old will get preemptive exchanges of components (disks, memory)
• Old storage systems will be decommissioned
• Network devices will be kept
Online Infrastructure during LS1
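The age rule for PCs on this slide can be written down as a small decision function. This is only a sketch: the function name `replacement_action` and the returned labels are hypothetical, and the slide does not say what happens to PCs between 5 and 6 years old or younger than 3 years, so those cases are marked as assumptions rather than filled in.

```python
def replacement_action(age_years: float) -> str:
    """Map a PC's age in years to the action named on the slide."""
    if age_years > 6:
        # Slide: every PC older than 6 years will be thrown out.
        return "replace"
    if 3 <= age_years <= 5:
        # Slide: preemptive exchanges (disks, memory) between 3 and 5 years.
        return "preemptive exchange"
    if 5 < age_years <= 6:
        # Not covered by the slide; left open here on purpose.
        return "unspecified"
    # Assumption: PCs younger than 3 years are left as they are.
    return "keep"


if __name__ == "__main__":
    for age in (1, 4, 5.5, 7):
        print(age, replacement_action(age))
```

Keeping the uncovered 5-to-6-year band as its own label makes the gap in the rule visible instead of silently picking one side.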
Operating system changes
• Why?
  • New hardware is not supported by the old OS
  • Newer versions of application code do not support old OS versions (WinCC, etc.)
• What do we run today?
  • Linux:
    • SLC4 (CreditCard PCs, some SPECS)
    • SLC5 (farm, controls-PCs, plus, hist, etc.)
    • SLC6 (“hidden” services: DNS, admin servers, etc.)
  • Windows:
    • XP (consoles, some special cases: VeloIseg, Rasnik, Rich gas monitoring, etc.)
    • 2003 SP2 (everything else)
OS future
• Linux:
  • SLC5 32-bit: only for CCPCs
  • SLC6 for all Control PCs
  • SLC6 for gateways and hidden services
  • SLC5 64-bit for the farm until offline moves to SLC6
• Windows:
  • Ideally: Windows 2008 SP2 only
  • Only under duress: Windows 7
  • For testing: Windows 2012
Hardware changes
• Baseline:
  • All control-PCs will be replaced by virtual machines
  • The farm file-server (NFS) and PVSS/WinCC functions will be split (irrelevant for users); old farm-blades will be refurbished for this (hlt[A-E]07-11)
• Backup:
  • Run control-PCs on refurbished farm-servers
Hardware changes (2)
• We would like to get rid of most “corner” cases and limit the number of different hardware types. If you have a special need (old PCI card, USB device, etc.), please contact us for driver upgrades etc.
• After LS1 there will be no XP and no Linux below SLC6 (except on the CCPCs), so all hardware must be able to run the newer OS versions; this includes semi-private consoles etc.
• New farm-nodes will be bought and installed during 2014, to be ready for LHC startup. (As usual this is a lot of work, but it has only a minor influence on day-to-day operations of the Online system.)
Core Online Infrastructure work
• Change of main storage system (DDN9900 → DDN SFA10K): requires physically touching every disk, recreation of all disk-sets and recreation of all file-systems
• Upgrade of SX controls network (improved redundancy)
• Change of electrical distribution to a homogeneous, battery-backed, dual-feed power distribution
• Recabling of all servers in the SX server room
• Exchange of some old racks
• Redundancy / emergency procedure tests
Planning
• The Online system will be shut down from 25/02/13 to 15/03/13
• Web-services (logbook, wiki, rundb) will be kept up
• No user logins, no access to data or WinCC systems (“no” here really means no)
• Detailed planning for these 3 weeks is in preparation (it will be discussed and agreed internally after the X-mas break)
Planning during LS1
• System will be kept up and running
• The Manager on Duty will be at P8 Monday to Friday as usual and will respond to direct requests and tickets
• Farm will be kept up and running (in particular for OnlineDirac)
• No interventions outside working hours (except for very serious incidents: SAN or network failure, loss of outbound connectivity); no piquet
• Two farms will be reserved for tests (SLC6 migration preparation, etc.)
• The “entire” farm will be needed for tests, upgrades and re-organisations for a few (4 to 5) one-week periods during 2013; the exact planning will be synced with Operations and Offline needs
More Changes
• Password policy will be brought in line with CERN policy
  • Minimum complexity
  • History
  • Maximum validity 1 year
• Accounts will be cleaned up
  • Inactive accounts will be blocked but the data will be kept
  • The data can be made available to the user or the respective project-leader at any time
• Gateway machines (Windows and Linux) will be upgraded to the latest OS versions
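The three password rules listed on this slide (minimum complexity, history, maximum validity of 1 year) could be checked roughly as follows. This is an illustrative sketch, not the CERN or Online implementation: the slide gives no complexity thresholds, so `MIN_LENGTH` and `MIN_CLASSES` are assumed values, and all function names are hypothetical. Only the one-year validity comes from the slide.

```python
import hashlib
from datetime import date, timedelta

MAX_VALIDITY = timedelta(days=365)  # from the slide: maximum validity 1 year
MIN_LENGTH = 8    # assumption, not stated on the slide
MIN_CLASSES = 3   # assumption: lower, upper, digit, symbol; need 3 of 4


def digest(password: str) -> str:
    # Plain SHA-256 keeps the sketch short; a real system would use a
    # salted, deliberately slow password hash instead.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def is_complex(password: str) -> bool:
    classes = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(not c.isalnum() for c in password),
    ]
    return len(password) >= MIN_LENGTH and sum(classes) >= MIN_CLASSES


def check_new_password(password: str, history_digests: set) -> list:
    """Return a list of policy violations; an empty list means accepted."""
    problems = []
    if not is_complex(password):
        problems.append("too simple")
    if digest(password) in history_digests:
        problems.append("reused")
    return problems


def is_expired(last_changed: date, today: date) -> bool:
    return today - last_changed > MAX_VALIDITY
```

For example, `check_new_password("abc", set())` reports the complexity violation, and `is_expired(date(2012, 1, 1), date(2013, 2, 25))` is true because more than a year has passed.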
Transparent changes
• These do not affect users directly; they will be done in the background
• Registering of all machines in LanDB
  • Needed for easier sharing with IT resources (licenses, etc.)
• Pilot-project to use CERN accounts instead of Online accounts (easier user-management); this is really a test, not a commitment
External Changes
• Independently of the Online team, other changes are under way which will affect us to some extent:
  • Maintenance on the Technical Network (TN)
  • Changes to the gas control system
  • Maintenance on the data-base servers
  • …
Summary
• A lot of replacement work to do during LS1
• Transition plan for the Control PCs and PVSS/WinCC systems is in place
• Some infrastructure consolidation has already been done with a view to the upgraded experiment after LS2 (consolidation of the server-room in SX8)
• 3 weeks of complete shutdown from 25/02/13 to 15/03/13
• Detailed planning for 2013 will be put up in January and announced to everybody; frequent updates are to be expected initially, since many (external) teams are involved