Solaris Virtualization Methods (with Practical Exercise in Containers) Dusan Baljevic Sydney, Australia
Solaris - Four Types of Virtualization
• Solaris Resource Management
• Solaris Containers
• Sun Fire Dynamic System Domains
• Logical Domains
Four Types of Virtualization - Diagram
Sun Partitioning
• Software: Solaris Containers (single OS image)
• Hardware: Dynamic Domains (high-end and midrange servers only)
• Firmware: Logical Domains (CoolThreads servers only)
Solaris Resource Management
Solaris Resource Management can control the CPU shares, operating parameters, and other aspects of each process, thereby allowing many applications to coexist in the same operating system environment.
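Solaris Resource Management is typically driven through projects and resource controls. A minimal sketch (the user "webuser", the project name, the share count, and the application path are placeholders, not from the slides):
# projadd -U webuser -K "project.cpu-shares=(priv,20,none)" user.webuser
# newtask -p user.webuser /path/to/app
# prctl -n project.cpu-shares -i project user.webuser
The projadd line creates a project carrying 20 CPU shares, newtask launches a workload under it, and prctl verifies the control on the running project.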
Solaris Containers
• Solaris Containers can create multiple virtualized environments (Solaris Zones) within one Solaris kernel, keeping the memory footprint low
• Solaris Containers can be used with Solaris Resource Management to provide flexible, fine-grained resource controls (good for consolidating dynamically resource-controlled environments within a single kernel or version of Solaris)
• Only one OS instance - fewer instances to administer
• Best efficiency of the four methods
• Built into the O/S, no hypervisor
• Physical storage management is done in the global zone
• Minimal overhead (typically < 1-2 %)
• Unified TCP/IP stack for all zones
• Up to 8191 non-global zones are supported within a single OS image
Sun Fire Dynamic System Domains
Sun Fire Dynamic System Domains can create electrically isolated domains on high-end Sun Fire systems, offering maximum security isolation and availability in a single chassis and combining many redundant hardware features for high availability.
Dynamic System Domains are well suited to consolidating a smaller number of mission-critical services that demand security and availability.
Logical Domains (LDom)
• Logical Domains fit somewhere between Containers and Dynamic System Domains
• Logical Domains provide isolation between the various domains (achieved through a firmware layer), reducing the required hardware infrastructure
• They are available on CoolThreads servers only (Sun Fire T1000 and T2000)
• Each guest domain can be created, destroyed, reconfigured, and rebooted independently (see the sketch after this list)
• Virtual console, Ethernet, disk, and cryptographic acceleration
• Live dynamic reconfiguration of virtual CPUs
• Fault Management Architecture (FMA) diagnosis for each logical domain
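As a sketch of that independence, a guest domain is built up and started with the ldm(1M) CLI from the control domain (the domain name "ldg1", the resource sizes, and the virtual switch/disk service names are placeholders):
# ldm add-domain ldg1
# ldm add-vcpu 4 ldg1
# ldm add-memory 2g ldg1
# ldm add-vnet vnet1 primary-vsw0 ldg1
# ldm add-vdisk vdisk1 vol1@primary-vds0 ldg1
# ldm bind-domain ldg1
# ldm start-domain ldg1
The same domain can later be stopped, unbound, and reconfigured without affecting other guests.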
Solaris Containers – File System Models
• Before creating any zones on a Solaris 10 server, all processes run in the global zone
• After a zone is created, it has processes which are associated with it
• Any zone which is not the global zone is called a non-global zone. Some call non-global zones simply "zones"; others call them "local zones" (this is not recommended)
• The default zone file system model is called sparse-root. It trades some flexibility for efficiency: sparse-root zones optimize physical memory and disk space usage by sharing directories like /usr, /sbin, /platform, and /lib, while keeping private file areas for directories like /etc and /var
• Whole-root zones increase configuration flexibility but also increase resource usage. They do not share file systems such as /usr, /lib, and a few others (see the sketch below)
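For contrast with the sparse-root examples later in this deck, a whole-root zone can be sketched by starting from a blank configuration, which carries no inherit-pkg-dir resources (the zone name and path are placeholders):
host-global# zonecfg -z wrzone1
zonecfg:wrzone1> create -b
zonecfg:wrzone1> set zonepath=/zones/roots/wrzone1
zonecfg:wrzone1> exit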
Zone Types - Diagram
• global zone (on /dev/dsk/c0t0d0s0): ~ 4 GB
• whole-root zone: ~ 4 GB
• sparse-root zone: ~ 100 MB
Solaris Containers – States
Solaris Containers – Storage
• Direct Attached Storage (simple, but limits flexibility when moving a container or its workload to a different server)
• Network Attached Storage (NAS). Currently the root directory of a container cannot be stored on NAS; however, NAS can be used to centralize zone and application storage
• Storage Area Networks (SAN)
Solaris Containers – File Systems
Storage can be assigned to containers through several methods (an LOFS example follows below):
• Loopback File System (LOFS)
• Zettabyte File System (ZFS)
• Unix File System (UFS)
• Direct device
• Network File System (NFS)
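For instance, a global-zone directory can be made visible inside a zone via LOFS (the directory names and zone name are placeholders):
host-global# zonecfg -z zone1
zonecfg:zone1> add fs
zonecfg:zone1:fs> set dir=/data
zonecfg:zone1:fs> set special=/export/data
zonecfg:zone1:fs> set type=lofs
zonecfg:zone1:fs> end
zonecfg:zone1> exit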
Solaris Containers – Firewalls and Networking
• Currently, IP filters cannot be used to filter traffic passing between zones, since such traffic remains inside the system and never reaches external firewalls and filters
• IP Multipathing (IPMP) and Sun Trunking can be used to improve network bandwidth and availability
• VLANs are available through IPMP too
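A minimal sketch of a link-based IPMP group spanning two interfaces (the interface names ce0/ce1 and the group name are placeholders; the settings would normally be made persistent via /etc/hostname.* files):
host-global# ifconfig ce0 group ipmp0
host-global# ifconfig ce1 group ipmp0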
Solaris Containers – Subnet Masks
• A zone's network interfaces are configured by the global zone - hence, netmask information must be stored in the global zone
• If non-default subnet masks are used for non-global zones, ensure that the mask information is stored in the global zone's /etc/netmasks file
• The subnet mask may also be specified via the zonecfg(1M) command using CIDR notation (10.99.64.0/28)
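For example, a /28 mask for the 10.99.64.0 network would appear in the global zone's /etc/netmasks as the entry:
10.99.64.0 255.255.255.240
Alternatively, the CIDR form is given directly in zonecfg(1M), as in the UFS example later in this deck (the zone name is illustrative):
zonecfg:zone1:net> set address=10.99.64.12/28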
Solaris Containers – Product Registry Database
• Very important warning: ensure packages are registered correctly in the global zone:
zoneadm -z zone1 install
Preparing to install zone <zone1>.
ERROR: the product registry database </> does not exist
ERROR: cannot determine the product registry implementation used by the global zone at </>
ERROR: cannot create zone boot environment <zone1>
zoneadm: zone 'zone1': '/usr/lib/lu/lucreatezone' failed with exit code 74
• Solution:
pkgchk -l
Software contents file initialized
Solaris Containers – Dynamic Resource Pools
• Processor sets, pools, and projects
• DRPs are collections of resources reserved for exclusive use by an application or set of applications
• A processor set is a grouping of processors. One or more workloads can be assigned to a processor set via DRP
• Commands: pooladm(1M), poolcfg(1M), poolstat(1M), poolbind(1M), psrset(1M), prctl(1), and others. For example:
# prctl -n zone.max-swap -v 1g -t privileged -r -e deny -i zone zfszone1
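As a sketch, a small processor set and a pool bound to it could be created as follows (the pset/pool names and CPU counts are placeholders); the pool could then be assigned to a zone with zonecfg's "set pool" property:
# pooladm -e
# poolcfg -c 'create pset pset1 (uint pset.min = 1; uint pset.max = 2)'
# poolcfg -c 'create pool pool1'
# poolcfg -c 'associate pool pool1 (pset pset1)'
# pooladm -c
The first command enables the pools facility; the last commits the configuration.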
Solaris Containers – Resource Capping Daemon
• Major improvement in Solaris 10 Update 4
• The cap can be modified while the container is running:
# rcapadm -E
# rcapadm -z loczone3 -m 300m
• Because the cap does not reserve RAM, one can over-subscribe RAM usage. The drawback is the possibility of paging
• The cap can also be defined when a container is set up, via the zonecfg(1M) "add capped-memory" option (see the sketch below)
• Virtual memory (swap) can also be capped
• A third new memory cap is "locked memory" (memory prevented from being paged out)
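A sketch of the capped-memory syntax, combining all three caps in one resource (the values are illustrative):
# zonecfg -z loczone3
zonecfg:loczone3> add capped-memory
zonecfg:loczone3:capped-memory> set physical=300m
zonecfg:loczone3:capped-memory> set swap=512m
zonecfg:loczone3:capped-memory> set locked=64m
zonecfg:loczone3:capped-memory> end
zonecfg:loczone3> exit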
Solaris Containers – Fair Share Scheduler
• A commonly used method to prevent "CPU abuse" is to assign a number of "CPU shares" to each container. The relative number of shares assigned per zone guarantees a relative minimum amount of CPU power. This is less wasteful than dedicating to a container CPUs that it will not fully utilize
• In Solaris 10 Update 4 only two steps are needed:
a) Make FSS the default scheduler (the command tells the system to use FSS as the default scheduler the next time it boots):
# dispadmin -d FSS
b) Assign the container some shares:
# zonecfg -z myzonex
zonecfg:myzonex> set cpu-shares=100
zonecfg:myzonex> exit
Solaris Containers – O/S Patching Methods
• When only sparse-root zones are used (the default) and zones provide one or a few types of services, install all packages and patches in all zones
• When a system with zones provides many different types of services (Web development, database testing, proxy services), install packages and patches directly into the zones which need them
Solaris Containers – patchadd(1M)
patchadd(1M) has the option "-G": add patch(es) to packages in the current zone only.
When used in the global zone, the patch is added to packages in the global zone only and is not propagated to packages in any existing or yet-to-be-created non-global zone.
When used in a non-global zone, the patch is added to packages in the non-global zone only.
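For example, to keep a patch out of the non-global zones (the patch ID is hypothetical):
host-global# patchadd -G 123456-01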
Solaris Containers – O/S Patching Methods (continued)
• Solaris 10 Update 4 adds the ability to use Live Upgrade tools on a system with containers. It is possible to apply an update to a zoned system and drastically reduce the downtime necessary to apply patches
• Live Upgrade can create an Alternate Boot Environment (ABE). The ABE can be patched while the Original Boot Environment (OBE) is still running its containers. After the patches have been applied, the system can be rebooted into the ABE. Downtime is limited to the time it takes to reboot the system
• Additional benefit: if there is a problem with the patch, instead of backing it out, the system can be rebooted into the OBE while the problem is investigated
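A minimal sketch of that workflow (the boot-environment name, target slice, patch directory, and patch ID are placeholders):
host-global# lucreate -n s10u4-abe -m /:/dev/dsk/c0t1d0s0:ufs
host-global# luupgrade -t -n s10u4-abe -s /var/tmp/patches 123456-01
host-global# luactivate s10u4-abe
host-global# init 6
lucreate builds the ABE, luupgrade -t applies the patch to it while the OBE keeps running, and luactivate plus a reboot switches over.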
Solaris Containers – Flash Archives
• All zones must be stopped when the flash archive is made from the global zone
• If the source and target systems use different hardware configurations, device assignments must be changed after the flash archive is installed
• Soft partitions in SVM cannot be flash-archived yet
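A minimal sketch (the archive name and location are placeholders): halt the zones, then create the archive from the global zone, excluding the directory that holds the archive itself:
host-global# zoneadm -z zone1 halt
host-global# flarcreate -n mysystem -x /flash /flash/mysystem.flar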
Solaris Containers and ZFS – Third-Party Backups
• EMC NetWorker 7.3.2 backs up and restores ZFS file systems, including ZFS ACLs
• Veritas NetBackup will provide ZFS support in version 6.5, which is scheduled for release in the second half of 2007. Current versions of NetBackup can back up and restore ZFS file systems, but ZFS ACLs are not preserved
• IBM Tivoli Storage Manager client software (5.4.1.2) backs up and restores ZFS file systems with both the CLI and the GUI. ZFS ACLs are also preserved
• Computer Associates' BrightStor ARCserve product backs up and restores ZFS file systems, but ZFS ACLs are not preserved
Solaris Containers, ZFS and Data Protector
• Backup Agents (Disk Agents) ZFS support:
Data Protector 6.0 - ZFS support on Solaris 10 (including ACL support): planned CY Q4'07
Data Protector 5.5 - ZFS support on Solaris 10 (excluding ACL support): planned CY Q4'07
• Non-global zone/container support is still not in the planning for Data Protector
Solaris Containers – zoneadm(1M)
• zoneadm(1M) administers zones. It has many options
• As part of a security policy, prevent someone from running a DoS attack in a non-global zone. To do so, we typically add the following to a zone's configuration, using zonecfg(1M):
add rctl
set name=zone.max-lwps
add value (priv=privileged,limit=1000,action=deny)
end
Solaris Containers – Root File System
Solaris Containers – Root File System
• Create a sparse-root Container:
host-global# zonecfg -z zone1
zonecfg:zone1> create
zonecfg:zone1> set zonepath=/zones/roots/zone1
zonecfg:zone1> set autoboot=false
zonecfg:zone1> add inherit-pkg-dir
zonecfg:zone1:inherit-pkg-dir> set dir=/opt
zonecfg:zone1:inherit-pkg-dir> end
zonecfg:zone1> add rctl
zonecfg:zone1:rctl> set name=zone.cpu-shares
zonecfg:zone1:rctl> set value=(priv=privileged,limit=40,action=none)
zonecfg:zone1:rctl> end
zonecfg:zone1> add net
zonecfg:zone1:net> set physical=qfe0
zonecfg:zone1:net> set address=192.168.1.200
zonecfg:zone1:net> end
zonecfg:zone1> exit
Solaris Containers – Root File System (continued)
• Install the Container:
host-global# zoneadm -z zone1 install
• Create the sysidcfg file:
host-global# cat /zones/roots/zone1/root/etc/sysidcfg
system_locale=en_AU.ISO8859-1
terminal=xterm
network_interface=primary { hostname=zone1 }
timeserver=192.168.1.73
security_policy=NONE
name_service=DNS {domain_name=mydom.com name_server=10.99.66.44,192.168.1.10}
timezone=Australia/NSW
root_password=Mxpy/32z032
Solaris Containers – Root File System (continued)
• Create this file (if using JumpStart):
host-global# touch /zones/roots/zone1/root/etc/.NFS4inst_state.domain
• Boot the Container:
host-global# zoneadm -z zone1 boot
• To log into the console, use zlogin(1):
host-global# zlogin -C zone1
Traditional File Systems and ZFS
Global vs Non-global Zone (ZFS)
Global vs Non-global Zone (ZFS) (continued)
Solaris Containers – ZFS
• Use the zpool(1M) command to create a pool:
host-global# zpool create zoneroot c0t0d0 c1t0d1
• Create a new ZFS file system:
host-global# zfs create zoneroot/zfszone1
host-global# chmod 700 /zoneroot/zfszone1
Solaris Containers – ZFS (continued)
• Set the quota on the file system:
host-global# zfs set quota=1024m zoneroot/zfszone1
• Create a sparse-root zone:
host-global# zonecfg -z zfszone1
zonecfg:zfszone1> create
zonecfg:zfszone1> set zonepath=/zoneroot/zfszone1
zonecfg:zfszone1> add net
zonecfg:zfszone1:net> set physical=hme2
zonecfg:zfszone1:net> set address=192.168.7.40
zonecfg:zfszone1:net> end
zonecfg:zfszone1> exit
Solaris Containers – ZFS (continued)
host-global# zoneadm -z zfszone1 install
host-global# cat /zoneroot/zfszone1/root/etc/sysidcfg
system_locale=C
terminal=dtterm
network_interface=primary { hostname=zfszone1 }
timeserver=localhost
security_policy=NONE
name_service=NONE
timezone=US/Eastern
root_password=""
host-global# zoneadm -z zfszone1 boot
host-global# zlogin -C zfszone1
Solaris Containers – UFS with SVM
Solaris Containers – UFS with SVM
The example assumes the following disk layout:
c1t2d0s0 - 20 MB - metadata DB
c1t2d0s3 - 5 GB - data partition
c2t4d0s0 - 20 MB - metadata DB
c2t4d0s3 - 5 GB - data partition
Solaris Containers – UFS with SVM (continued)
• Create the SVM database and its replicas:
host-global# metadb -a -c 2 -f c1t2d0s0 c2t4d0s0
• Create two metadisks (virtual devices):
host-global# metainit d11 1 1 c1t2d0s3
host-global# metainit d12 1 1 c2t4d0s3
Solaris Containers – UFS with SVM (continued)
• Create the first part of the mirror:
host-global# metainit d10 -m d11
• Add the second metadisk to the mirror:
host-global# metattach d10 d12
Solaris Containers – UFS with SVM (continued)
• Create a new soft partition. A "soft partition" is an SVM feature which allows the creation of multiple virtual partitions in one metadisk (it requires the "-p" option to metainit(1M)):
host-global# metainit d100 -p d10 524M
• Create the new UFS file system:
host-global# mkdir -p /zones/roots/ufszone1
host-global# newfs /dev/md/rdsk/d100
host-global# mount /dev/md/dsk/d100 /zones/roots/ufszone1
host-global# chmod 700 /zones/roots/ufszone1
Solaris Containers – UFS with SVM (continued)
• Create a sparse-root zone:
host-global# zonecfg -z ufszone1
zonecfg:ufszone1> create
zonecfg:ufszone1> set zonepath=/zones/roots/ufszone1
zonecfg:ufszone1> add net
zonecfg:ufszone1:net> set physical=ipge1
zonecfg:ufszone1:net> set address=10.99.64.12/28
zonecfg:ufszone1:net> end
zonecfg:ufszone1> exit
host-global# zoneadm -z ufszone1 install
Solaris Containers – UFS with SVM (continued)
host-global# cat /zones/roots/ufszone1/root/etc/sysidcfg
system_locale=C
terminal=vt100
network_interface=primary { hostname=ufszone1 }
timeserver=localhost
security_policy=NONE
name_service=NONE
timezone=Europe/Berlin
root_password=""
Solaris Containers – UFS with SVM (continued)
host-global# zoneadm -z ufszone1 boot
host-global# zlogin -C ufszone1
Containers – Hostname Caveat
• There does not seem to be any special requirement for zone naming
• One caveat though: zones should use a naming convention that enables them to be distinguished individually in ps(1) output (for example, "ps -elfyZ")
• To achieve this, the first eight characters of each zone name should be unique
Containers – Hostname Caveat (continued)
# zoneadm list -v
ID NAME        STATUS   PATH
 0 global      running  /
 1 longhost-z1 running  /zones/longhost-z1
 2 longhost-z2 running  /zones/longhost-z2
 3 longhost-z3 running  /zones/longhost-z3
 4 longhost-z4 running  /zones/longhost-z4
Containers – Hostname Caveat (continued)
When a less experienced Unix admin runs a command to check which processes run in each zone, they get the following type of result. The example is for the daemon inetd, which runs in each zone (note that the ZONE column truncates all four long zone names to the same string, "longhost"):
# ps -efZ | grep inetd
global   root  256  1 0 Mar 13 ? 0:39 /usr/lib/inet/inetd start
longhost root  792  1 0 Mar 13 ? 0:39 /usr/lib/inet/inetd start
longhost root 1129  1 0 Mar 13 ? 0:38 /usr/lib/inet/inetd start
longhost root 1144  1 0 Mar 13 ? 0:39 /usr/lib/inet/inetd start
longhost root 1394  1 0 Mar 13 ? 0:39 /usr/lib/inet/inetd start
Containers – Username Caveat
• The username namespace is consistent across all zones. The UID which gets assigned with the "-u" option is visible in the global zone. These users are not visible in peer non-global zones
• Define unique UIDs across all zones (including the global zone):
# useradd -u 501 -g mygid -d /export/home myuser1
# useradd -u 502 -g mygid -d /export/home myuser2
# useradd -u 503 -g mygid -d /export/home myuser3
• These users' processes can then be viewed per zone from the global zone:
# prstat -Z
Using Projects in Containers
• Instead of defining global parameters in /etc/system, use project-based resource management. For example, to allocate 4 GB of shared memory to the user oracle:
# projadd -U oracle -K "project.max-shm-memory=(priv,4096MB,deny)" user.oracle
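The control can then be verified on the live project (the project name matches the one created above):
# prctl -n project.max-shm-memory -i project user.oracle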
Migrating a Container – Root File System Method
hostA-global# zlogin zone1 shutdown -y -i 0
hostA-global# zoneadm -z zone1 detach
hostA-global# cd /zones/roots/zone1
hostA-global# pax -w@f /tmp/zone1.pax -p e *
hostA-global# scp /tmp/zone1.pax root@hostB:/tmp/zone1.pax
hostB-global# mkdir -m 700 -p /zones/roots/zone2
hostB-global# cd /zones/roots/zone2
hostB-global# pax -r@f /tmp/zone1.pax -p e
hostB-global# zonecfg -z zone2
zonecfg:zone2> create -a /zones/roots/zone2
zonecfg:zone2> exit
hostB-global# zoneadm -z zone2 attach
hostB-global# zoneadm -z zone2 boot