
Oxford PP Computing Site Report

HEPSYSMAN, 28th April 2003. Pete Gronbech.



  1. Oxford PP Computing Site Report HEPSYSMAN 28th April 2003 Pete Gronbech

  2. General Strategy • Approx. 200 Windows 2000 desktop PCs with Exceed, used to access central Linux systems. • Digital Unix and VMS phased out for general use. • Red Hat Linux 7.3 is becoming the standard.

  3. Network Access [Network diagram. Labelled elements: Super Janet 4 (2.4Gb/s link), OUCS Firewall, Campus Backbone Router, Backbone Edge Routers, departmental connections (depts), Physics Firewall, Physics Backbone Router. Link speeds shown: 100Mb/s and 1Gb/s.]

  4. Physics Backbone Upgrade to Gigabit, Autumn 2002 [Network diagram. Labelled elements: Physics Firewall, Physics Backbone Router, server Gb/s switch, Linux server, Win 2k server, desktops, and the Particle Physics, Clarendon Lab, Astro, Atmos and Theory connections. Link speeds shown: 1Gb/s and 100Mb/s.]

  5. Autumn 2002 [System diagram, all on a 1Gb/s link. Groups shown: PBS Batch Farm (4*Dual 2.4GHz systems, RH7.3), CDF, General Purpose Systems, minos DAQ, cresst DAQ, Atlas DAQ, Grid Development. Hosts shown: pplx1, pplx2, pplx3 (SNO), morpheus, pplxfs1, pplxgen, ppminos1, ppminos2, ppnt117 (HARP), ppcresst1, ppcresst2, ppatlas1, atlassbc, pptb01, pptb02, tblcfg, tbse01, tbce01, grid, pplxbatch. Other labels: sam testing, edg ui. OS labels in use: Fermi7.3.1, RH6.2, RH7.1, RH7.3.]

  6. [System diagram, detail of the Autumn 2002 layout. PBS Batch Farm, Autumn 2002: 4*Dual 2.4GHz systems, all labelled RH7.3. General Purpose Systems: pplx2, pplxfs1, pplxgen, with OS labels RH7.3, RH7.3 and RH6.2. 1Gb/s link.]

  7. Zero-D X-3i SCSI-IDE RAID, 12 * 160GB Maxtor drives. This proved to be a disaster and was rejected in favour of bare SCSI disks, which we mounted internally in our rack-mounted file server. Supplied by Compusys.

  8. The Linux File Server: pplxfs1 8*146GB SCSI disks

  9. General Purpose Linux Server: pplxgen. pplxgen is a dual 2.2GHz Pentium 4 Xeon based system with 2GB RAM. It is running Red Hat 7.3. It was brought online at the end of August 2002 to share the load with pplx2 as users migrated off al1 (the Digital Unix server).

  10. The PP batch farm, running Red Hat 7.3 with OpenPBS, can be seen below pplxgen. This service became fully operational in Feb 2003.

  11. February 2003 [System diagram, all on a 1Gb/s link. Groups shown: CDF, LHCB MC, Grid Development. Hosts shown: pplx1 (new), morpheus, cdfsam, matrix, node1 to node9, grid, pplxbatch, tbgen01, pptb01, pptb02, tblcfg, tbse01, tbce01, tbwn01, tbwn02. Other labels: edg ui, sam testing. OS labels in use: Fermi7.3.1 (most nodes), RH6.1, RH6.2, RH7.1, RH7.3.]

  12. Grid development systems, including the EDG software testbed setup.

  13. New Linux Systems. Morpheus is an IBM x370 8-way SMP 700MHz Xeon with 4GB RAM and 1TB of Fibre Channel disks. Installed August 2001. Purchased as part of a JIF grant for the CDF group. Runs Red Hat 7.1. Will use CDF software developed at Fermilab and here to process data from the CDF experiment.

  14. Tape backup is provided by a Qualstar TLS4480 tape robot with 80 slots and dual Sony AIT3 drives. Each tape can hold 100GB of data. Installed January 2002. NetVault software from BakBone is used, running on morpheus, for backup of both CDF and particle physics systems.
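As a rough check on what the robot can hold, the slot count and per-tape figure from the slide can be multiplied out. This is only a sketch: the `reserved_slots` parameter is a hypothetical allowance (for example for cleaning tapes) and is not something the slides mention.

```python
SLOTS = 80      # from the slide: Qualstar TLS4480 with 80 slots
TAPE_GB = 100   # from the slide: each AIT3 tape holds 100GB


def library_capacity_gb(slots: int, tape_gb: int, reserved_slots: int = 0) -> int:
    """Total capacity of a tape library, excluding any slots held back.

    reserved_slots is a hypothetical parameter, not taken from the slides.
    """
    if not 0 <= reserved_slots <= slots:
        raise ValueError("reserved_slots must be between 0 and slots")
    return (slots - reserved_slots) * tape_gb


full_capacity_gb = library_capacity_gb(SLOTS, TAPE_GB)
print(f"Fully loaded: {full_capacity_gb} GB")
```

With every slot filled, that comes to 8000GB, about 8TB, before any compression.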

  15. Second round of CDF JIF tender: Dell Cluster - MATRIX. 10 dual 2.4GHz P4 Xeon servers running Fermi Linux 7.3.1 and SCALI cluster software. Installed December 2002.

  16. Approx. 7.5TB of SCSI RAID 5 disks are attached to the master node. Each shelf holds 14 146GB disks. These are shared via NFS with the worker nodes. OpenPBS batch queuing software is used.
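The 7.5TB figure can be sanity-checked from the disk counts. The sketch below assumes each shelf is configured as a single RAID 5 array (so one disk's worth of space per shelf goes to parity) and that there are four shelves. Neither assumption is stated in the slides; they are chosen because they reproduce the quoted total.

```python
DISKS_PER_SHELF = 14  # from the slide
DISK_GB = 146         # from the slide
SHELVES = 4           # assumption: the slides do not give the shelf count


def raid5_usable_gb(disks: int, disk_gb: int) -> int:
    """Usable space of one RAID 5 array: capacity of (disks - 1) drives."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks - 1) * disk_gb


per_shelf_gb = raid5_usable_gb(DISKS_PER_SHELF, DISK_GB)
total_gb = SHELVES * per_shelf_gb
print(f"Per shelf: {per_shelf_gb} GB, total: {total_gb} GB")
```

Under these assumptions each shelf gives 1898GB usable and four shelves give 7592GB, which is consistent with the "approx. 7.5TB" on the slide.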

  17. Plenty of space in the second rack for expansion of the cluster.

  18. LHCb Monte Carlo Setup. Compute node: 8-way 700MHz Xeon server (RH6.2, OpenAFS, OpenPBS). Grid gateway: grid (RH6.2, Globus 1.1.3, OpenAFS, OpenPBS). The 8-way SMP has now been reloaded as an MS Windows Terminal Server, and LHCb MC jobs will be run on the new PP farm.

  19. Problems • IDE RAID proved to be unreliable and caused lots of downtime. • Problems with NAT: using iptables caused NFS problems and hangs. Solved by dropping NAT and using real IP addresses for the PP farm. • Trouble with ext3 journal errors. • Hackers…

  20. Problems • Lack of manpower! • Number of operating systems slowly reducing; Digital Unix and VMS very nearly gone. NT4 also practically eliminated. • Getting closer to standardising on RH 7.3, especially as the EDG software is now heading that way. • Still finding it very hard to support laptops, but now have a standard clone and recommend IBM laptops. • Would be good to have more time to concentrate on security… (See later talk)
