
CASPUR Site Report

This site report, presented in Edinburgh in May 2004, covers updates on CASPUR's central computers (IBM SMP, HP SMP, Itanium-2 SMP, NEC SX-6i), storage news, network highlights, and projects for 2004.



  1. CASPUR Site Report
     Andrei Maslennikov, Lead - Systems
     Edinburgh, May 2004

  2. Contents
     • Update on central computers
     • Storage news
     • Network highlights
     • Projects 2004
     A.Maslennikov - Edinburgh 2004

  3. Central computers
     • IBM SMP:
       - 3 frames with 80 POWER-4 CPUs at 1.1 GHz and 144 GB of RAM
       - 1 legacy frame with 64 POWER-3 CPUs at 375 MHz and 64 GB of RAM
       - AIX 5.2 ML2+, AFS and SGEEE on all nodes
       - Very stable, all CPUs are heavily used
       - Under lease until 2006; will probably upgrade to POWER-5 in 2005
     • HP SMP:
       - 8 4-CPU ES45 nodes at 1.25 GHz, 64 GB of RAM and 1.2 TB of local FC disk
       - 6 legacy ES40 nodes at 500-600 MHz used for the BIOGRID project
       - Tru64 5.1a++ on all nodes, AFS + SGEEE on 5 standalone nodes
       - TruCluster on 9 nodes (AFS via Translator; powerful Solaris 9 gateway, memcache, modified SSH)
       - Requires a lot of attention, but very fast and fully used (mainly computational chemistry apps)
       - Arriving: 32-CPU EV7 node
     • Itanium-2 SMP:
       - 1 single-CPU, 5 dual-processor and 1 quad-processor nodes (900 MHz, 1 GHz, 1.5 GHz)
       - RH AS 3 on one node, all others run CERN CEL3/AS3 Build for ia64, AFS, SGEEE
     • NEC SX-6i:
       - Single CPU, 4 GB RAM, 8 GFLOPS
       - Speedup of up to 10x against POWER4 for some apps; currently considering SMP purchase
     • Reference: several dual-processor Intel/32-bit and AMD/64-bit systems

  4. Storage update
     • AFS: 4 cells on site and 6 outside
       - OpenAFS 1.2.11 on Linux
       - Main servers: SuperMicro 2x2.8 GHz, June 2004: 6 TB (Infortrend SATA/FC)
       - Vice partitions on SGI XFS; only one XFS-related problem in 1.5 years
       - Standalone backup server on GigE, 84 GB/hour with 2 LTO2 drives
       - 3 cells have been running Heimdal KDC for 6 months
       - AFS-aware SSH 3.8p1 binary builds (GSSAPI, K5 or AFSpw login+token)
       - Linux / WXP Heimdal single sign-on and AFS homedir in one of the cells.
         Administration: ssh, but have just successfully tested AD (with help of INFN-Lecce)
       - Will soon be migrating INFN's national cell to K5-MIT (cross-realm and Win issues)
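
A rough back-of-the-envelope check of what the quoted backup rate means for the AFS store. This is only a sketch: it assumes the 84 GB/hour figure is a sustained aggregate rate, that a full dump of all 6 TB is taken in one pass, and that 1 TB = 1000 GB. None of these assumptions are stated on the slide, and the function name `backup_hours` is purely illustrative.

```python
def backup_hours(volume_tb: float, rate_gb_per_hour: float, gb_per_tb: int = 1000) -> float:
    """Hours needed to stream volume_tb of data at a sustained rate.

    Assumption (not from the slide): the rate is constant and covers
    the whole volume in a single full dump.
    """
    if rate_gb_per_hour <= 0:
        raise ValueError("rate_gb_per_hour must be positive")
    return volume_tb * gb_per_tb / rate_gb_per_hour


# Figures taken from the slide: 6 TB of AFS space, 84 GB/hour with 2 LTO2 drives.
hours = backup_hours(6, 84)
print(f"{hours:.1f} hours (~{hours / 24:.1f} days)")  # 71.4 hours (~3.0 days)
```

Under these assumptions a full dump takes about three days, which suggests why incremental dumps would be attractive at this scale.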

  5. Storage update - 2
     • NFS (Mountain View Data):
       - In production for 1.5 years, very stable (runs off XFS, no crashes so far)
       - 2 SuperMicro 2x2.8 GHz, June 2004: 8 TB (Infortrend SATA/FC)
       - 0.5 TB under staging (5 TB archived)
     • Digital Library services on GFS:
       - Science Server, Web of Science web services; heavy load
       - 3300 scientific magazines, 2.5 million articles in full-text PDF, searchable DB
       - Needed for load balancing: shared filestore with locking
       - On Sistina GFS for 6 months, 3 SM 2-way servers, 16 TB (Infortrend SATA/FC)
       - EXT3 copy of everything (tape backup is too slow for this number of files)

  6. Network highlights
     • Numerous networks under control of Clavister FW
       - Internal workplaces, training class, visitors' room: outgoing connectivity only
       - Internal and external DMZs, lab networks, internal DNS; quite complex
       - Private NAS GigE network outside FW
       - FW is far from saturation
     • Internet Exchange Point - NAMEX
       - About 20 big customers (Telecom, Tiscali, Albacom, mobile operators, industry)
       - Traffic: around 1 Gbit/s
     • F-Root Name Server
       - Second in Europe after Madrid, first (and still the only one) in Italy
     • IPv6
       - Active member of 6NET project
       - CASPUR's web site can be reached over IPv6

  7. CASPUR: principal resources in 2004
     [Diagram; labels recoverable from the slide:]
     • Networks: Internet, internal infrastructure, internal GigE's, private NAS GigE, FC SAN
     • Compute: IBM - 150 CPUs (375-1100 MHz); HP - 60 CPUs (667-1200 MHz); Itanium2 - 15 CPUs (0.9-1.5 GHz); NEC SX-6i
     • Storage services: Digital Library 16 TB; NFS 8 TB; AFS 6 TB
     • Backup: TSM backup; AFS backup and data movers
     • FC RAID systems: 32 TB
     • FC tape systems: 60 / 120 TB

  8. Some activities in 2004
     • Technology tracking (in collab. with CERN and other centers) - 1 FTE
       - New storage devices
       - New software solutions in the field of storage
       - Excellent relationship with vendors; tested so far: more than 600 KUSD worth of hardware
     • Data replication over WAN (in collab. with ENEA and GARR) - 0.5 FTE
       - Several centers with identical data inside and outside RDBMS
       - Each center has to be fully autonomous but should be able to forward any new data to all other centers
       - Bidirectional DB and plain data exchanges with possible mediation at the head organization
       - Data mirroring with non-disruptive release scheme
     • Staging IIa - 1 FTE (funded by CSP/Turin)
       - New version of Tape Dispatcher coming out (general clean-up, virtual tape library support)
       - Remote FC tapes / libraries will be supported
     • University "La Sapienza" - student accounts
       - Provide an account (space, personal web page, mail, etc.) for each of the 150 000 students
       - In progress: active discussions with Interdepartmental Computing Authority (CITICORD)
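
The replication requirement above (each center autonomous, yet able to forward new data to all others) can be illustrated with a toy model. This is not the project's design, which the slides do not describe; it is a minimal sketch under assumed rules: every record carries a version counter, local writes always succeed and are queued, and a peer accepts an update only if it is newer than what it holds. The class `Center` and all its method names are hypothetical. Conflicting concurrent writes are deliberately left unresolved here, since the slide assigns that to mediation at the head organization.

```python
from dataclasses import dataclass, field


@dataclass
class Center:
    """Hypothetical autonomous site holding versioned records."""
    name: str
    records: dict = field(default_factory=dict)            # key -> (version, value)
    peers: list = field(default_factory=list, repr=False, compare=False)
    outbox: list = field(default_factory=list, repr=False)

    def write(self, key, value):
        """Local write: succeeds without contacting any peer (autonomy)."""
        version = self.records.get(key, (0, None))[0] + 1
        self.records[key] = (version, value)
        self.outbox.append((key, version, value))

    def receive(self, key, version, value):
        """Accept a forwarded update only if it is newer than the local copy."""
        if version > self.records.get(key, (0, None))[0]:
            self.records[key] = (version, value)
            return True
        return False

    def flush(self):
        """Forward all queued updates to every peer; return how many were accepted."""
        accepted = 0
        for update in self.outbox:
            for peer in self.peers:
                accepted += peer.receive(*update)
        self.outbox.clear()
        return accepted


# Usage: two centers that know about each other.
center_a, center_b = Center("a"), Center("b")
center_a.peers.append(center_b)
center_b.peers.append(center_a)

center_a.write("doc-1", "first draft")   # works even if center_b is unreachable
center_a.flush()                         # later, forward the new data
print(center_b.records["doc-1"])         # (1, 'first draft')
```

Queuing writes in an outbox is what keeps each center usable while the WAN link is down; the version check makes repeated forwarding of the same update harmless.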
