Updates from CERN's IT Department (formerly IT Division), covering Fabric Infrastructure and Operations, Architecture and Data Challenges, and other groups, presented at the HEPiX/HEPNT Spring 2004 conference in Edinburgh. Includes details on CERN's structure, management changes, security, procurement, architecture, and database developments.
Site report: CERN
Helge.Meinhard (at) cern.ch
HEPiX/HEPNT spring 2004, Edinburgh
Structure, management
• As of 01 January 2004: DG (Aymar), CSO (Engelen), CFO (Naudi), 7 Departments
• IT Division has become IT Department
  • Head: W. von Rüden, deputy: J. Ferguson
  • All of previous IT Division
  • Major part of AS Division: 3 groups working on AIS
    • 2 groups merged (R. Martens)
    • 1 group merged with DB group
  • Printshop, document support from ETT Division
    • Merged with user support (M. Draper)
HEPiX Edinburgh: CERN site report
Computer Security
• AFS password expiration in force
  • 12 months, like other services (e.g. NICE)
• Insecure protocols (e.g. ftp) banned for off-site traffic
• Biggest challenges are viruses and worms
  • Visitor / unmanaged PCs, dual-boot systems
Fabric Infrastructure and Operations (1)
• Quattor / Lemon deployment
  • Moving ahead, major benefits
  • See German Cancio’s talk
  • Lemon status display, combined metrics: see Miroslaw Siket’s talk
• LEAF
  • State Management System taking shape, being integrated with other tools
• System administration
  • 2nd level of 3-level support model
  • Team now 7 people
  • In full swing, all major clusters under their responsibility
Fabric Infrastructure and Operations (2)
• CC refurbishment ongoing
  • Civil engineering work for bunker for new 18 kV substation completed
  • Right-hand side of large machine room ready
  • Emptying left-hand side has begun, to be completed by July
  • Disk and tape servers: re-installed with standard configuration
  • More details: Tim Smith’s talk
Fabric Infrastructure and Operations (3)
• Procurements
  • Tenders open for white boxes (presumably…) and SATA-based disk servers
  • Preparing tenders for disk arrays for autumn
  • Documents for Market Survey for purchases in 2006…2008 drafted
• Serial consoles
  • All hardware procured (LSZH cables - major amount of work)
  • Decision to use SLAC software, collaboration with Chuck
  • First machines wired up, first users relying on service
• Stress tests for new and repaired machines
  • Following implementation at SLAC
  • Now part of standard procedure
  • Handled by Sysadmin team
Architecture and Data Challenges (1)
• CERN Openlab
  • Workshops on total cost of ownership, and on security held
  • Oracle joined as full partner, Voltaire joined as a contributor
• Detailed studies
  • Fast interconnects
  • Disk server performance: Choice of RAID, file system, ...
    • Jan Iven’s talk
  • Storage performance measurements at Caspur
Architecture and Data Challenges (2)
• Linux certification: CEL3 nearing completion
  • CERN recompile of RHEL 3 sources
  • Aim to provide lxplus/lxbatch-like service by end of May 2004
  • CERN 7.3 will be supported until end of 2004
  • See Jarek Polok’s talk
  • Considering (and testing) RHEL: see panel and discussion
• Data challenges of CMS and Alice completed, Atlas and LHCb started
• New stager, Storage Resource Manager: see Olof Bärring’s talk
Data Bases (1)
• To date: too much depends on good will and good luck
• Vision: Taking advantage of reorganisation
  • Re-evaluate services
  • Streamline architectures, configurations, processes
  • Simplify management, maintenance and trouble-shooting
  • Improve security, test regularly
  • SLAs (realistic, measurable) and quality control
• Required for scalability in the future
Data Bases (2)
• Openlab work: 2 Oracle-funded fellows
• Evaluation of Oracle 10g Database, Application Server and Enterprise Manager
• Features to evaluate:
  • Cross-platform transportable table spaces
  • Stand-by data bases
  • Replications/streams
  • Native numbers
  • EZ install
Data Bases (3)
• LCG services – data challenges
  • CMS DC04 (April/May 2004)
    • Smooth running on Physics Sun Cluster after close collaboration with users
    • First full-scale production usage of LCG file catalog (RLS): middleware performance problem, IT-GD working on it
  • Other data challenges starting, continuing to stress the Physics Sun Cluster and the RLS
• Future: investigate Oracle Streams for LCG file catalog replication, deploy stand-by DB and (later) Oracle application server and DB clusters
Data Bases (4)
• POOL: Persistency framework for LHC
  • Uses Root I/O to stream data to files
  • Successfully deployed during CMS DC04; Atlas and LHCb to test it in their DCs
  • Alice keeps using Root directly
  • Workarounds implemented to cope with RLS middleware performance limitations
• New Oracle contract
  • Based on named users
  • Platform and location independent
  • More Application Server licences
  • Maintenance costs reduced and fixed for 9 years
  • Extended to all CERN staff + users (includes remote usage for CERN-related work)
  • Distribution has been prepared (support concerns)
Grid Deployment
• LCG status: see Oliver Keeble’s talk
• LCG user registration, VO management: see Maria Dimou’s talk
Internet Services
• Mail migration (to Exchange servers) completed – 14000 users migrated
• Windows terminal servers adopted as a service, 227 users during first months
• Access to CERN DFS via WebDAV adopted as a service
• Windows screen saver for production field-proven, deployment imminent
• Fighting effects of Sasser worm
Product Support (1)
• Contract for distributed computing support retendered
  • Won by SERCo
  • Being prepared to be in full operation by 1st July 2004
• CVS services
  • Two flavours, one with repositories on AFS (more fail-safe), one with local repositories – 4 machines each
  • Major projects migrated to this service (e.g. Atlas offline)
  • Manuel Guijarro’s talk
Product Support (2)
• Solaris
  • Status and plans: see Manuel Guijarro’s talk
• Ximian connector
  • Source code released by Novell
  • Direct MAPI access, including advanced features such as shared calendaring
  • PS evaluating deployment on Linux and Solaris using Evolution as mail client
• Busy preparing CHEP…
User and Document Services
• InDiCo: Web application for organising conferences (see Mick Draper’s talk)
• Major changes to Computing Helpdesk as a consequence of retendered Distributed Computing Support contract
  • More people at centralised (building 513) helpdesk, less local support
Communication Systems
• Internet Land Speed record: 6.25 Gb/s between Los Angeles and Geneva
• Smooth migration to new GSM operator (Sunrise)
• Portable registration enforced
• ACB (Automatic call-back) shutdown:
  • 01-Jul-2004: Call-in only
  • 31-Dec-2004: End of service
  • Recommended replacement: ISPs on open market