BINP Contribution to ATLAS TDAQ SysAdmin Group Activities (2007-2010)
Overview
• BINP has been contributing to all the activities of the ATLAS Trigger/DAQ SysAdmin Group since 2007:
  • D.Popov (2007-2008, 1 visit)
  • A.Zaytsev (2008-2010, 2 visits up to now)
  • A.Korol (2009-2010, 1 visit up to now)
  • A.Bogdanchikov (2009-2010, 2 visits up to now)
• The contribution includes:
  • Support of the existing TDAQ environment (> 1500 servers, > 200 racks of equipment in SDX1 and USA15, ATLAS Main Control Room and Satellite Control Room equipment)
  • Support of ATLAS Point 1 users (> 3000 users, > 300 user roles)
  • Development of various system administration tools for internal use within the group
  • Building and validating hardware solutions for future use in the ATLAS TDAQ environment
  • Taking part in 24-hour TDAQ SysAdmin shifts (since mid-summer 2008)
BINP Contribution to ATLAS TDAQ
[Figure labels: LHC Point1, SysAdmins, IT Centre, Lab4, Lab32, Lab40]
ATLAS Point 1 Computing Facilities
ATLAS-Novosibirsk Group Meeting @ Novosibirsk, 11 June 2009
SysAdmin Group Evolution (2008-2010)
• Nominal amount of resources assigned to the team: 10 FTE (stabilized for 2011-2012)
• Minimum number of people ever observed in the team: 4 (2009Q1)
• Present situation: 10 sysadmins working on site plus 10-12 sysadmins at remote sites (Pakistan + Russia: BINP only)
• 3 shifts per month per person on average
• Two rotation cycles are now established:
  • 3 people in the loop (BINP: the 2nd cycle ongoing, remote operations are allowed)
  • 10 people in the loop (Pakistan: 1st cycle ongoing, remote operations not allowed)
• 80% of the team has been renewed since 2007
• No more than 30% staff renewal is expected in 2010-2011
Previous Achievements (2009)
• Migration of the ATLAS Gateways to the new servers provided with a XEN-based virtualization solution:
  • Initial deployment was performed in 2008Q4
  • Migration was finalized in 2009Q2-3
• Implementation of bulk server firmware upgrade tools for the netbooted nodes deployed in ATLAS Point 1:
  • Successfully applied in 2008Q4 to upgrade more than 1000 nodes installed in SDX1
• Deployment and support of ATLAS Remote Monitoring servers:
  • Evaluation of commercial and free NX servers and SGD (Sun Global Desktop) based solutions for the ATLAS remote monitoring infrastructure
• Implementation of monitoring and accounting data analysis tools based on the ROOT toolbox, successfully applied in 2008Q4-2009Q2 for:
  • ATLAS DCS and Nagios RRD temperature data analysis for SDX1
  • ATLAS Gateway accounting system data visualization
• Contributing to everyday activities of the group, including ATLAS TDAQ SysAdmin shifts since Sep 2008, and taking part in multiple hardware maintenance operations in SDX1 and the ATLAS Control Room
Recent Achievements (2010Q1-2)
• Major upgrade of the ATLAS Remote Monitoring nodes:
  • Reinstalling the nodes under SLC5.4 x86_64
  • The current installation is fully documented
• Supporting the ATLAS P1 Gateways and Remote Monitoring nodes:
  • Keeping the nodes up-to-date
  • Adding more functionality and increasing the reliability of these subsystems
  • Getting smoothly through the highest peaks of user activity, e.g. the recent LHC media day (Mar 30, 2010)
• Continuing to contribute to everyday activities supporting the ATLAS TDAQ computing environment during the period of LHC data taking
• Providing the ATLAS TDAQ SysAdmins Team with virtualized nodes used for testing solutions for new components, e.g.:
  • New ATLAS P1 webservers
  • Tools for deploying the nodes of the ATLAS HLT farm (BWM, Quattor/Puppet), etc.
• Taking part in commissioning of the new ATLAS TDAQ HLT computing hardware to be deployed in Point1 in 2010Q3:
  • 10 racks of equipment (new high density computing nodes)
  • Adding more than 5000 CPU cores to the ATLAS HLT computing farm (SDX1)
New High Density Machines for HLT Farm
• New HLT racks: 95 boxes
  • One 2U box has 4 motherboards
  • 10 HLT racks: 80 boxes
  • 15 extras for ONL/MON, LFSes, replacements
• Overall Dell chassis features:
  • 4 CPU sockets/1U
  • 16 real CPU cores/1U
  • 32 CPU threads/1U
  • 64 GB RAM/1U
  • 1 kW/1U ($300/CPU thread)
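The per-1U figures on this slide can be scaled up to the whole purchase with a few multiplications. The sketch below is only an illustration: the names `ChassisSpec` and `farm_totals` are hypothetical, the per-1U numbers are assumed to apply uniformly to every box, and "$300/CPU thread" is assumed to be a purchase price per hardware thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChassisSpec:
    """Per-1U figures quoted on the slide (assumed uniform across all boxes)."""
    cores_per_u: int = 16
    threads_per_u: int = 32
    ram_gb_per_u: int = 64
    power_kw_per_u: float = 1.0
    usd_per_thread: float = 300.0  # assumption: purchase price per thread
    units_per_box: int = 2         # each box is a 2U chassis


def farm_totals(boxes: int, spec: ChassisSpec = ChassisSpec()) -> dict:
    """Scale the per-1U figures to a given number of 2U boxes."""
    units = boxes * spec.units_per_box
    threads = units * spec.threads_per_u
    return {
        "rack_units": units,
        "cores": units * spec.cores_per_u,
        "threads": threads,
        "ram_gb": units * spec.ram_gb_per_u,
        "power_kw": units * spec.power_kw_per_u,
        "cost_usd": threads * spec.usd_per_thread,
    }


hlt = farm_totals(80)    # the 80 boxes in the 10 HLT racks
total = farm_totals(95)  # including the 15 extras

for label, figures in (("HLT racks", hlt), ("All boxes", total)):
    print(label, figures)
```

Under these assumptions the 80 HLT boxes provide 2560 physical cores and 5120 hardware threads, so the "more than 5000 CPU cores" quoted elsewhere in the talk matches the thread count rather than the physical core count.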
Areas of Our Responsibility (2010-2011)
• Support/Maintenance (since 2009)
  • ATLAS P1 Gateways (‘atlasgw’)
  • Preseries Gateways (‘preseriesgw’)
  • ATLAS RMON Infrastructure (‘pc-atlas-rmon’)
• Development/Validation (added in 2010)
  • ATCN Test VM Box (test webservers, LFC, Puppet, ClamAV)
  • GPN Test VM Box (test public webserver, Puppet, upgraded BWM infrastructure VMs)
• Future prospects (starting from 2010Q3)
  • Put the virtualized BWM infrastructure into production
  • Virtualization of Lab32 (for the sake of compactness)
  • Virtualization of the ATLAS TDAQ MON subsystem (achieving higher stability)
  • Load balancing solutions for Point1 proxy and webservers (achieving better handling of the peak load)
Generic Milestones in 2010-2011
• Past (up to 2010Q2)
  • ATLAS RMONs reinstallation under SLC5 (Feb 2010)
  • LHC Media Day (Mar 30, 2010): continuous data taking period begins, no more intensive development allowed
  • ATLAS P1 Gateways upgrade (new VM image, Apr 2010)
  • ATLAS P1 Gateways proxy authentication schema upgrade (migration to NTLM, May 2010)
  • Recovery from 18 kV power line failure (end of May 2010)
• Near Future (2010)
  • “LHC First Heavy Ion Physics” Public Event (?)
  • Put an extra 5000 CPU cores into production in the HLT farm (SDX1)
  • Put ConfDB UI v2.0 into production
  • Migrating to the new ATLAS Point1 webservers
  • 2010-2011 Christmas Shutdown
• Distant Future (2011)
  • Put an improved access manager into production
  • Replacing the extender solution for the ACR (?)
  • LHC long term shutdown at the end of 2011
Talks and Conference Contributions (2008-2009)
published
ATLAS TDAQ Week, 2008Q4
2008Q2
CHEP2009 Poster Contribution, Mar 2009
Talks and Conference Contributions (2010)
• ICSOFT2010 Poster Contribution, Jul 2010 (accepted)
• CHEP2010, Oct 2010 (not yet accepted)