Oxford University Particle Physics Site Report
Pete Gronbech, Systems Manager and SouthGrid Technical Co-ordinator
HEPiX, SLAC
Physics Department Computing Services
• Physics department restructuring: staff involved in system management reduced by one; still trying to fill another post.
• E-Mail hubs
  • A lot of work done to simplify the system and reduce manpower requirements. Little effort has been available for anti-spam. Increased use of the Exchange servers.
• Windows Terminal Servers
  • Still a popular service. More use of remote access to users' own desktops (XP only).
• Web / Database
  • A lot of work around supporting administration and teaching.
• Exchange Servers
  • Information store disks increased from 73 GB to 300 GB. One major problem with a failed disk, but it was solved by reloading overnight.
• Windows front-end server – WINFE (a hedged access sketch follows this list)
  • Access to the Windows file system via SCP, SFTP or a web browser
  • Access to the Exchange server (web and Outlook)
  • Access to address lists (LDAP) for email and telephone
  • VPN service
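To illustrate the SFTP route into the Windows file system through WINFE, here is a minimal sketch using the Python paramiko library. The host name, account and paths are hypothetical examples; the report does not specify the actual service configuration.

```python
# Minimal sketch: copy a file from the Windows file system via the WINFE
# front end over SFTP. Host, credentials and paths are invented examples.
import paramiko

def fetch_from_winfe(remote_path, local_path,
                     host="winfe.physics.ox.ac.uk",   # hypothetical host name
                     username="someuser", password="secret"):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, password=password)
    try:
        sftp = client.open_sftp()
        sftp.get(remote_path, local_path)   # pull the remote file locally
        sftp.close()
    finally:
        client.close()

if __name__ == "__main__":
    fetch_from_winfe("analysis/results.txt", "results.txt")
```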
Windows Status
• Systems are now almost entirely Windows XP or 2000 on clients and Windows Server 2003 for services.
• More machines brought into the centrally managed domain.
• More automation of updates, vulnerability scans etc.
• More laptops to support.
Software Licensing
• Costs are increasing each year.
• The NAG deal has been continued (libraries only).
• New deal for the Intel compilers, run through the OSC group.
• Also deals for LabVIEW, Mathematica, Maple and IDL.
• System management tools, and software for imaging, backup, anti-spyware etc.
• (Microsoft operating systems and Office are covered by the campus Select agreement.)
Network
• The gigabit connection to campus has been operational since July 2005; several technical problems with the link delayed this by over half a year.
• Gigabit firewall installed. A commercial unit was purchased to minimise the manpower required for development and maintenance: a Juniper ISG 1000 running NetScreen.
• The firewall also provides NAT and VPN services, which is allowing us to consolidate and simplify the network services.
• Moving NAT onto the firewall has solved a number of problems we had previously, including unreliable videoconferencing connections.
• Physics-wide wireless network installed in the DWB public rooms, Martin Wood and Theory; the same will be installed in AOPP. The new firewall provides routing and security for this network.
Network Access
[Diagram: campus network topology. SuperJanet 4 connects at 2.4 Gb/s through the OUCS firewall to the campus backbone router; the backbone runs at 10 Gb/s to the backbone edge routers; most departments connect at 100 Mb/s or 1 Gb/s, while Physics connects at 1 Gb/s through the Physics firewall to the Physics backbone router.]
Network Security
• Constantly under threat from vulnerability scans, worms and viruses. We are attacking the problem in several ways:
  • Boundary firewalls (these don't solve the problem entirely, as people bring infections in on laptops) – new firewall
  • Keeping operating systems patched and properly configured – new Windows update server
  • Antivirus on all systems – more use of Sophos, but some problems
  • Spyware detection – anti-spyware software running on all centrally managed systems
  • Segmentation of the network into trusted and untrusted sections – new firewall
• Strategy:
  • Centrally manage as many machines as possible to ensure they are up to date and secure – most Windows machines moved into the domain
  • Use a Network Address Translation (NAT) service to separate centrally managed and untrusted systems into different networks – new firewall plus new virtual LANs (see the sketch after this list)
  • Continue to lock down systems by invoking network policies. The client firewall in Windows XP SP2 is very useful for excluding network-based attacks – centralised client firewall policies
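To make the trusted/untrusted split concrete, here is a minimal sketch that classifies a host address into the centrally managed or NAT'd network. The subnet ranges are purely illustrative assumptions; the real VLAN layout is not given in this report.

```python
# Illustrative only: classify hosts into a "trusted" (centrally managed)
# or "untrusted" (NAT'd) network. The ranges below are invented examples,
# not the department's real addressing plan.
from ipaddress import ip_address, ip_network

TRUSTED_NETS = [ip_network("172.16.0.0/22")]        # hypothetical managed VLAN
UNTRUSTED_NETS = [ip_network("192.168.100.0/24")]   # hypothetical NAT'd VLAN

def classify(host):
    addr = ip_address(host)
    if any(addr in net for net in TRUSTED_NETS):
        return "trusted"
    if any(addr in net for net in UNTRUSTED_NETS):
        return "untrusted (NAT)"
    return "unknown"

if __name__ == "__main__":
    for h in ("172.16.1.20", "192.168.100.7", "10.0.0.1"):
        print(h, "->", classify(h))
```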
Particle Physics Linux
• Aim to provide a general-purpose Linux system for code development and testing and other Linux-based applications.
• A new Unix administrator (Rosario Esposito) has joined us, so we now have more effort to put into improving this system.
• New main server installed (ppslgen), running Scientific Linux (SL3).
• File server upgraded to SL3 and a 6 TB disk array added.
• Two dual-processor worker nodes reclaimed from ATLAS barrel assembly and connected as SL3 worker nodes.
• RH7.3 worker nodes are being migrated to SL3.
• ppslgen and the worker nodes form a MOSIX cluster, which we hope will provide a more scalable interactive service. They also support conventional batch queues.
• Some performance problems with GNOME on SL3 when used from Exceed. Evaluating an alternative to Exceed (NX) which doesn't exhibit this problem (and also has better integrated SSL).
PP Linux Batch Farm
[Diagram: batch farm layout during the migration to Scientific Linux 3, also showing the addition of the new 6 TB SATA RAID array. Nodes shown: ppslgen (2 × 2.4 GHz P4), pplxgen (2 × 2.2 GHz P4), pplxwn01–pplxwn04 and pplxwn06–pplxwn08 (2 × 2.4 GHz P4 each), pplxwn05 (8 × 700 MHz P3), pplxwn09 (1 × 2.4 GHz P4), pplxwn10 and pplxwn11 (2 × 2.8 GHz P4 each), legacy Red Hat 7.3 machines pplx2 (2 × 450 MHz P3) and pplx3 (2 × 800 MHz P3), and the file server pplxfs1 (2 × 1 GHz P3) with 1.1 TB, 4 TB and 6 TB disk arrays. A rough capacity tally follows.]
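For a rough sense of scale, here is a quick tally of the nodes listed in the diagram above. It is only a sketch: it sums the CPU counts and clock speeds quoted, which is a crude capacity measure rather than a benchmark.

```python
# Crude capacity tally for the PP Linux farm, using only the CPU counts
# and clock speeds quoted in the diagram above.
nodes = {
    "ppslgen":  (2, 2.4), "pplxgen":  (2, 2.2),
    "pplxwn01": (2, 2.4), "pplxwn02": (2, 2.4), "pplxwn03": (2, 2.4),
    "pplxwn04": (2, 2.4), "pplxwn05": (8, 0.7), "pplxwn06": (2, 2.4),
    "pplxwn07": (2, 2.4), "pplxwn08": (2, 2.4), "pplxwn09": (1, 2.4),
    "pplxwn10": (2, 2.8), "pplxwn11": (2, 2.8),
    "pplx2":    (2, 0.45), "pplx3":   (2, 0.8),
}

total_cpus = sum(n for n, _ in nodes.values())
total_ghz = sum(n * ghz for n, ghz in nodes.values())
print(f"{total_cpus} CPUs, ~{total_ghz:.1f} GHz aggregate clock")
```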
SouthGrid Member Institutions
• Oxford
• RAL PPD
• Cambridge
• Birmingham
• Bristol
• HP-Bristol
• Warwick
Stability, Throughput and Involvement
• Pete Gronbech has taken on the role of technical coordinator for the SouthGrid Tier-2 centre.
• The last quarter has been a good, stable period for SouthGrid.
• Addition of Bristol PP.
• All sites upgraded to LCG-2_6_0.
• Large involvement in the Biomed data challenge.
Status at RAL PPD
• SL3 cluster on LCG 2.6.0
• CPUs: 11 × 2.4 GHz, 33 × 2.8 GHz, 100% dedicated to LCG
• 0.7 TB storage, 100% dedicated to LCG
• 6.4 TB of IDE RAID disks configured for use by dCache
• 5 systems to be used for the preproduction testbed
Status at Cambridge
• Currently LCG 2.6.0 on SL3
• CPUs: 42 × 2.8 GHz (of the extra nodes, only 2 out of 10 are any good), 100% dedicated to LCG
• 2 TB storage (3 TB exists but only 2 TB is available), 100% dedicated to LCG
• Condor batch system
• Lack of Condor support from the LCG teams
Status at Birmingham
• Currently SL3 with LCG-2_6_0
• CPUs: 24 × 2.0 GHz Xeon (plus 48 local nodes which could in principle be used, but…), 100% LCG
• 1.8 TB classic SE, 100% LCG
• BaBar farm moving to SL3, and Bristol integrated but not yet on LCG
Status at Oxford
• Currently LCG 2.6.0 on SL 3.0.4
• All 74 CPUs running since ~June 20th
• CPUs: 80 × 2.8 GHz, 100% LCG
• 1.5 TB storage – a second 1.5 TB will be brought online as DPM or dCache, 100% LCG
• Heavy use by Biomed during their data challenge
• Plan to give local users access
Oxford Tier 2 GridPP Cluster, Summer 2005
[Chart: cluster usage by VO (ZEUS, LHCb, ATLAS, Biomed) over summer 2005. The Biomed data challenge, a non-LHC EGEE VO, was supported for about four weeks from the start of August 2005.]
Oxford Computer Room
• Modern processors take a lot of power and generate a lot of heat.
• We have had many problems with air-conditioning units and a power trip.
• A new, properly designed and constructed computer room is needed to deal with increasing requirements.
• Local work on the design; the Design Office has checked the cooling and air flow.
• The plan is to use two of the old target rooms on level 1, one for Physics and one for the new Oxford supercomputer (800 nodes).
• Requirements call for power and cooling of between 0.5 and 1 MW (see the rough estimate below).
• SRIF funding has been secured, but this now means it is all in the hands of the University's Estates. It is now unlikely to be ready before next summer.
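As a rough plausibility check on that figure, here is a back-of-the-envelope sketch. The per-node power draw and the cooling overhead factor are assumptions, not numbers from this report.

```python
# Back-of-the-envelope check on the 0.5-1 MW power and cooling requirement.
# The 300-500 W per node draw is an assumed typical figure for dual-CPU
# servers of this era, and the 1.5x overhead is an assumed cooling factor.
nodes = 800                      # planned Oxford supercomputer
watts_per_node = (300, 500)      # assumed range, W per node
overhead = 1.5                   # assumed factor for cooling and ancillary load

for w in watts_per_node:
    total_mw = nodes * w * overhead / 1e6
    print(f"{w} W/node -> about {total_mw:.2f} MW including cooling overhead")

# Roughly 0.36-0.6 MW for the supercomputer alone, before adding the physics
# racks, which is broadly consistent with the quoted 0.5-1 MW range.
```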
[Figure: flow simulation showing the air temperature at the centre of the racks; rows of racks are arranged to form hot and cold aisles.]
Oxford Physics Level 1 Computer Room
[Photos: the space shown last year now holds 5 racks of computers, including Tier 2 rack 2, the Clarendon rack, Oxford grid development machines, and the ex-RAL CDF IBM 8-way.]
• Bad news: no air conditioning, and the room is also used as a store.
• Conversion of the room is taking time… but SRIF funding has been secured to build a joint computer room for Physics and the OSC.
Future
• Intrusion detection system for increased network security.
• Complete the migration of desktops to XP SP2 and MS Office 2003.
• Improve support for laptops – still more difficult to manage than desktops.
• Once the migration of the central service to SL is complete, we will develop a Linux desktop clone.
• Investigating how best to integrate Mac OS X with the existing infrastructure.
• Scale up ppslgen in line with demand. Money has been set aside for more worker nodes and a further 12+ TB of storage.
• More worker nodes for the Tier-2 service.
• Look to use university services where appropriate, perhaps the new supercomputer?