HEPiX Fall 2001 Report, NERSC, Berkeley
Alan Silverman, 16th Nov 2001, C5 Report
Where
• National Energy Research Scientific Computing Centre (NERSC), a division of LBNL, the Lawrence Berkeley National Lab
• Situated in downtown Oakland, across the bay from SF, down the hill from Berkeley itself
• The building was converted from a bank. The ground floor is now a truly huge computer centre with reserved expansion space; they will eventually just break down the outer wall and take over the car park!
NERSC
• NERSC describes itself as “a world leader in accelerating scientific discovery through computation. NERSC provides high-performance computing tools and expertise that enable computational science of scale, in which large, interdisciplinary teams of scientists attack fundamental problems in science and engineering that require massive calculations and have broad scientific and economic impacts”
• Funded by DoE and LBNL, hence non-restricted
• HQ of ESnet and home of PDSF (Parallel Distributed Systems Facility, from the defunct SSC)
Who
• About 40 participants from the US and Europe
• All major labs represented except BNL
• From CERN, Wolfgang Friebel and me, left from an original 7 candidates; the cutback was caused by a combination of the current climate regarding travel to the US, reservations made on Swissair and travel budget restrictions. “And then there were 2 …”
Main Issues
• On the UNIX side, nothing dramatic to report. The trend to Linux goes on; other platforms are talked about less and less
• The choice of PCs and configurations is very sensitive: many sites reported problems with a particular PC motherboard, for example
• Several sites reported problems using NFS; it really does not seem to scale
FNAL
• FNAL’s Strong Authentication Project is ongoing but still has limited aims for now (basically login); nothing was said about extending it to mail or web
• Long discussion on UNIX/Windows Kerberos domains: they finally decided to make the UNIX domain the master, so all Windows users need a UNIX account. Not all problems solved. Interesting talk, see overheads
• FNAL’s NGOP is now described as in production, but so far it only monitors objects, and only on some 500 nodes
FNAL Run II
• CDF has taken 89M good events so far (cf. 100M in the whole of Run I). Total of 22TB.
• Reconstructed 50M so far, at a rate of 3-3.5M per day
• The reconstruction farm has 150 worker nodes and an SGI Origin 2000 I/O server
• Analysis runs on a 64 CPU Origin 2000 and a 4 CPU Linux PC
• No information from D0
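The CDF figures above allow a little back-of-envelope arithmetic. The sketch below is only illustrative: it assumes that the 22TB total covers the 89M good events, that TB is decimal (10^12 bytes), and that no new data arrives while the backlog is being reconstructed. None of these assumptions is stated in the report, and the function names are hypothetical.

```python
# Back-of-envelope arithmetic on the CDF Run II figures quoted in the report.
# Assumptions (not from the report): 22 TB covers the 89M events, TB is
# decimal, and no new events arrive while the backlog is processed.

EVENTS_TAKEN = 89e6          # good events taken so far
EVENTS_RECONSTRUCTED = 50e6  # events already reconstructed
TOTAL_BYTES = 22e12          # 22 TB, assumed decimal
RATE_LOW = 3.0e6             # reconstruction rate, events per day (low end)
RATE_HIGH = 3.5e6            # reconstruction rate, events per day (high end)


def average_event_size_kb() -> float:
    """Average size of one event in kB, under the assumptions above."""
    return TOTAL_BYTES / EVENTS_TAKEN / 1e3


def backlog_events() -> float:
    """Events taken but not yet reconstructed."""
    return EVENTS_TAKEN - EVENTS_RECONSTRUCTED


def days_to_clear(backlog: float, rate_per_day: float) -> float:
    """Days needed to reconstruct the backlog at a constant daily rate."""
    return backlog / rate_per_day


if __name__ == "__main__":
    backlog = backlog_events()
    print(f"average event size: about {average_event_size_kb():.0f} kB")
    print(f"backlog: {backlog / 1e6:.0f}M events")
    print(f"days to clear: {days_to_clear(backlog, RATE_HIGH):.1f} "
          f"to {days_to_clear(backlog, RATE_LOW):.1f}")
```

Under these assumptions an event averages roughly 250 kB, and the 39M-event backlog would take about 11 to 13 days to clear at the quoted rate.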
SLAC
• Central farm now has 900 SUN CPUs and 512 PC CPUs. Guess which count is rising?
• Plans to replace the SUN “mainframe” (64 CPU E10K) by smaller SUNs (a 10 CPU disc server and 20 and 24 CPU job servers)
• About to alpha-test an HPSS 4.2 port on Solaris
• They are having trouble (spontaneous reboots, system hangs, TLB errors) with VA Linux PCs but don’t blame VA Linux, who have been “very helpful” (but have not yet fixed the problems)
SLAC’s Linux installation
• SLAC’s scheme to install Linux clusters looks interesting
• They use a PXE BIOS network boot, which calls on tftp to get the bootloader image, which in turn uses an NFS server to get the Linux kernel and the node config files and then runs Kickstart
• All controlled by their scripts
• Installed 200 newly-delivered nodes in 30 mins once they had wired them
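The installation chain described on this slide can be written down as an ordered list of stages, purely as a reading aid. This is a minimal sketch, not SLAC's actual scripts: the `Stage` class, its field names and the helper functions are hypothetical, and only the stage order and the 200-nodes-in-30-minutes figure come from the report.

```python
# Illustrative model of the network-install chain described in the report.
# The class, field and function names are hypothetical; only the ordering
# of the stages and the quoted throughput come from the slide.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    step: str       # what happens at this stage
    mechanism: str  # protocol or tool named in the report
    result: str     # what the node has once the stage completes


BOOT_CHAIN: List[Stage] = [
    Stage("network boot", "PXE BIOS", "request for a bootloader"),
    Stage("bootloader download", "tftp", "bootloader image"),
    Stage("kernel and config fetch", "NFS",
          "Linux kernel and node config files"),
    Stage("installation", "Kickstart", "installed node"),
]


def describe(chain: List[Stage]) -> List[str]:
    """Return one numbered, human-readable line per stage."""
    return [f"{i}. {s.step} via {s.mechanism}: {s.result}"
            for i, s in enumerate(chain, start=1)]


def seconds_per_node(nodes: int, minutes: float) -> float:
    """Average wall-clock seconds per node for a batch install."""
    return minutes * 60 / nodes


if __name__ == "__main__":
    for line in describe(BOOT_CHAIN):
        print(line)
    print(f"average: {seconds_per_node(200, 30):.0f} s per node")
```

The average works out to 9 seconds per node, which suggests (though the report does not say so) that many nodes were installing in parallel rather than one after another.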
UNIX Security
• Two scary talks by Bob Cowles from SLAC
• One on Grid security, with an extremely pessimistic view of the planned use of Grid certificates
• The better talk was his description of the Defcon 9 hackers conference in Las Vegas (where else?)
• Open sharing of hacks, attack tools, etc.
• Look out for the Linux kis virus, or did we already get hit with a form of it?
His Conclusions on Security
• Poor system administration is still a major problem
• Firewalls cannot substitute for applying patches
• Multiple levels of virus/worm protection are necessary
• Is Windows XP a threat to national security?
• ATA bill makes intrusion/damage a terrorist act … “if calculated to influence or affect the conduct of government by intimidation or coercion …. or to retaliate against government conduct.”
• Microsoft to add ratings to security bugs (PG, R, X)
• Proposal for a separate “.gov” Internet
Other points
• LAL is using our Printer Package on Windows; too many CERN specials inside, fixed now by Ivan?
• Jefferson is not using it on UNIX but plans to when their expert becomes free of other commitments
• IN2P3 is starting surveys of the cheap disc market and of mass storage systems. Other sites are interested in participating and I am in contact with them
• Several sites are using OpenAFS clients, and some are looking at OpenAFS servers (e.g. INFN)
Windows 2000
• FNAL has a plan; the first domain structure (master and several sub-domains) is to be in place by Nov 1st
• DESY had a plan which was accepted, but then the management changed the responsibilities; the future is unclear
• SLAC has no plans, it seems: “one or two years down the road” said one speaker
• We appear to be about 1-2 years ahead
Overheads
And of course, it’s all on the web: see http://pdsf.nersc.gov/hepix/agenda.html