
WP4-install task report

WP4 workshop, Barcelona project conference, 5/03. German Cancio.


Presentation Transcript


  1. WP4-install task report
     WP4 workshop, Barcelona project conference, 5/03
     German Cancio

  2. Agenda
     • General architectural overview for R3
     • R3 components description
     • EDG testbed integration
     • Issues

  3. Release 3 architecture
     R3 subsystems:
     • Base Installation:
       • AII (Automated Installation Infrastructure)
     • Software Distribution:
       • Software Repository (SWRep)
       • Software Package Management Agent (SPMA)
     • Node Configuration:
       • NCM (Node Configuration Manager)
       • (Ongoing: LCFGng)

  4. WP4-install R3 architecture
     Software Package Mgmt Agent (SPMA)
     • SPMA manages the installed packages
     • Runs on Linux (RPM) or Solaris (PKG)
     • SPMA configuration is done via an NCM component
     • Can use a local cache for pre-fetching packages (simultaneous upgrades of large farms)
     Automated Installation Infrastructure
     • DHCP and Kickstart (or JumpStart) configurations are regenerated according to CDB contents
     • The operator can set PXE to either reboot or reinstall a node
     Software Repository
     • Packages (in RPM or PKG format) can be uploaded into multiple Software Repositories
     • Client access uses HTTP, NFS/AFS or FTP
     • Management access is subject to authentication/authorization
     Node Configuration Manager (NCM)
     • Configuration management on the node is done by NCM components
     • Each component is responsible for configuring a service (network, NFS, sendmail, PBS)
     • Components are notified by the Cdispd whenever their configuration changes
     [Architecture diagram. Labels: Packages (RPM, PKG); SWRep Servers; Mgmt API; ACLs; http; nfs; ftp; cache; SPMA; SPMA.cfg; NCM; NCM Components; Cdispd; CCM; CDB; Installation server; PXE handling; DHCP handling; KS/JS generator; Registration; Notification; Node Install; Node (re)install?; Client Nodes]

  5. SPM (Software Package Mgmt) (I)
     SWRep (Software Repository):
     • Client-server tool suite for the management of software packages
     • Universal repository:
       • Extendable to multiple platforms and package formats (RH Linux/RPM, Solaris/PKG, … others like Debian dpkg)
       • Multiple package versions/releases
     • Management (“product maintainers”) interface:
       • ACL-based mechanism to grant/deny modification rights
       • Current implementation uses SSH
     • Client access: via standard protocols
       • HTTP (scalability), but also AFS/NFS, FTP
     • Replication: using standard tools (e.g. rsync)

  6. SPM (Software Package Mgmt) (II)
     Software Package Management Agent (SPMA):
     • Runs on every target node
     • Configurable locally or via CDB (NCM component)
     • Multiple repositories can be accessed (e.g. division/experiment specific)
     • Plug-in framework allows for portability
       • System packager specific transactional interface (RPMT, PKGT)
     • Can manage either all or a subset of packages on the nodes
       • Useful for add-on installations, and also for desktops
     • Configurable policies (partial or full control, mandatory and unwanted packages, conflict resolution…)
     • Addresses scalability
       • Packages can be stored ahead of time in a local cache, avoiding peak loads on software repository servers (simultaneous upgrades of large farms)
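The difference between full and partial control can be illustrated with a small sketch. This is not SPMA code: the function `plan_operations`, its parameters, and the package names and versions are all hypothetical, and the real agent delegates the actual work to the packager-specific transactional interface (RPMT, PKGT). The sketch only shows how a list of operations might be derived from the installed and desired package sets.

```python
from typing import Dict, FrozenSet, List, Tuple

# (action, package name, version); hypothetical representation, not SPMA's own.
Operation = Tuple[str, str, str]


def plan_operations(
    installed: Dict[str, str],
    desired: Dict[str, str],
    unwanted: FrozenSet[str] = frozenset(),
    full_control: bool = True,
) -> List[Operation]:
    """Derive the operations that bring `installed` in line with `desired`.

    full_control=True : every package not listed in `desired` is removed.
    full_control=False: packages outside `desired` are left alone unless
                        they are explicitly listed in `unwanted`.
    """
    ops: List[Operation] = []
    # Install missing packages and replace those with a different version.
    for name, version in sorted(desired.items()):
        if name not in installed:
            ops.append(("install", name, version))
        elif installed[name] != version:
            ops.append(("replace", name, version))
    # Decide what to do with packages that are not part of the target set.
    for name, version in sorted(installed.items()):
        if name in desired:
            continue
        if full_control or name in unwanted:
            ops.append(("remove", name, version))
    return ops


if __name__ == "__main__":
    installed = {"openssh": "3.1", "emacs": "21.2", "telnet-server": "0.17"}
    desired = {"openssh": "3.4", "pbs": "2.3"}
    for op in plan_operations(installed, desired,
                              unwanted=frozenset({"telnet-server"}),
                              full_control=False):
        print(op)
```

Under partial control the locally installed `emacs` survives, which matches the add-on and desktop use case on the slide; under full control it would be removed.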

  7. NCM (Node Configuration Manager)
     Modules:
     • Components:
       • NCM ‘components’ (like LCFG components) are responsible for updating local config files, and notifying services if needed
       • Components are notified if there are changes in the configuration chunks they are interested in
     • cdispd (Configuration Dispatch Daemon):
       • Monitors the config profile, and invokes components via the ncd when it changes
     • ncd (Node Configuration Deployer):
       • Framework and front-end for executing components (via cron, cdispd, or manually)
     • Component support libraries:
       • For recurring system mgmt tasks (interfaces to system services, sysinfo), log handling, etc.
     • More details in the NCM design document: http://edms.cern.ch/document/372643
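The dispatch step described above can be sketched as follows. This is an illustration only, not the cdispd implementation: the profile is modelled as a nested dictionary, and the functions `lookup` and `components_to_run`, the subscription table, and the configuration paths are all assumptions made for the example.

```python
from typing import Any, Dict, List


def lookup(profile: Dict[str, Any], path: str) -> Any:
    """Return the configuration subtree at `path`, or None if it is absent."""
    node: Any = profile
    for key in path.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def components_to_run(
    old: Dict[str, Any],
    new: Dict[str, Any],
    subscriptions: Dict[str, List[str]],
) -> List[str]:
    """Names of the components whose subscribed subtrees differ between profiles."""
    return sorted(
        name
        for name, paths in subscriptions.items()
        if any(lookup(old, p) != lookup(new, p) for p in paths)
    )


if __name__ == "__main__":
    old = {"system": {"network": {"hostname": "node1"},
                      "nfs": {"mounts": ["/home"]}}}
    new = {"system": {"network": {"hostname": "node1"},
                      "nfs": {"mounts": ["/home", "/data"]}}}
    subscriptions = {
        "network": ["/system/network"],
        "nfs": ["/system/nfs"],
        "sendmail": ["/system/sendmail"],
    }
    # Only the nfs component sees a change in its configuration chunk.
    print(components_to_run(old, new, subscriptions))
```

Comparing whole subtrees keeps the sketch short; a real dispatcher would then hand the resulting list to a front-end such as the ncd rather than run the components itself.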

  8. AII (Automated Installation Infrastructure)
     • Subsystem to automate the node base installation via the network
     • Layer on top of existing technologies (base system installer, DHCP, PXE)
     • Modules:
       • AII-dhcp:
         • Manages the DHCP server for network installation information
       • AII-nbp (network bootstrap program):
         • Manages the PXE configuration for each node (boot from HD / start the installation via network)
       • AII-osinstall:
         • Manages the OS configuration files required by the OS installation procedure (KickStart, JumpStart)
     • More details in the AII design document: http://edms.cern.ch/document/374559
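The per-node choice handled by AII-nbp (boot from hard disk or start a network installation) can be pictured with a short sketch. It is not taken from AII: the function `pxe_config`, the kernel and initrd file names, and the Kickstart URL are hypothetical, and the output is assumed to follow the PXELINUX configuration syntax, which the slides do not specify.

```python
def pxe_config(action: str, kernel: str = "vmlinuz",
               initrd: str = "initrd.img", ks_url: str = "") -> str:
    """Return a PXELINUX-style configuration for one node.

    action == "boot"   : boot from the local hard disk.
    action == "install": load the installer kernel and point it at a
                         Kickstart file (ks_url is required).
    """
    if action == "boot":
        return "DEFAULT local\nLABEL local\n  LOCALBOOT 0\n"
    if action == "install":
        if not ks_url:
            raise ValueError("an install entry needs a Kickstart URL")
        return (
            "DEFAULT install\n"
            "LABEL install\n"
            f"  KERNEL {kernel}\n"
            f"  APPEND initrd={initrd} ks={ks_url}\n"
        )
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    # Hypothetical install server and node name, for illustration only.
    print(pxe_config("install",
                     ks_url="http://installserver.example.org/ks/node1.cfg"))
    print(pxe_config("boot"))
```

Switching a node between the two states then amounts to rewriting one small file per node, which is why the operator can trigger a reinstall without touching the node itself.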

  9. Current status
     • SPM
       • First production version available (SPMA and SWRep)
       • Being deployed in the CERN Computer Centre (see my next talk)
       • Enhanced functionality planned for mid-October
     • NCM
       • Architectural design finished
       • Detailed (class) design progressing
       • First version of new build tools (replacing DICE tools)
       • First version expected mid-July
       • Porting/coding of base configuration components to be completed mid-September
     • AII
       • Architectural design finished
       • Detailed design and implementation progressing
       • First version expected mid-July
     • LCFGng:
       • Functionality freeze, but ongoing maintenance/support until the end of the project

  10. EDG testbed integration?
     • A fully stable, production-quality replacement of the current fabric management tool (LCFGng) is not realistic before the end of the year:
       • LCFGng parts for base installation, software distribution and node configuration are tightly coupled, so there is no drop-in replacement
       • LCFGng components have to be ported or rewritten (>60 components!)
       • Testbed configuration has to be migrated to CDB/Pan
       • Sysadmins have to be taught to use the new tools (installation & configuration)
       • Stability, stability, stability…
     • A fully functional NCM/SPM/AII solution will be provided, although with limited configuration components and pre-defined configurations:
       • AII functional for RH73
       • SPM functional for RH73 and (maybe) Solaris*
       • NCM with ‘base’ components (for managing a basic RH73 system, maybe Solaris*)
     (*Solaris port by CERN/IT and Sun collaboration)

  11. Issues
     • LCFGng (and LCFG!!) support and maintenance
       • Maintenance, support and teaching for production legacy systems consume a lot of manpower!
