Learn about the Pluggable Storage Architecture (PSA) in ESX 4, which allows third-party vendors to add support for new storage arrays and customize load balancing and failover mechanisms. Explore the benefits of PSA and how it enhances storage management in VMware.
Module 3 - vStorage Cormac Hogan Product Support Engineering Rev P Last updated 23rd March 2009 VMware Confidential
Agenda • Module 0 - Product Overview • Module 1 - VI Installation-Upgrade • Module 2 - vCenter • Module 3 - vStorage • Module 4 - Networking VI4 - Mod 3 - Slide
Module 3 Lessons • Lesson 1 - Pluggable Storage Architecture • Lesson 2 - SCSI-3 & MSCS Support • Lesson 3 - iSCSI Enhancements • Lesson 4 - Storage Administration & Reporting • Lesson 5 - Snapshot Volumes & Resignaturing • Lesson 6 - Storage VMotion • Lesson 7 - Thin Provisioning • Lesson 8 - Volume Grow / Hot VMDK Extend • Lesson 9 - Storage CLI Enhancements • Lesson 10 – Paravirtualized SCSI Driver • Lesson 11 – Service Console Storage VI4 - Mod 3 - Slide
Introduction • Before we begin, note the new device naming conventions in ESX 4. • Although the vmhbaN:C:T:L:P naming convention is still visible, it is now known as the run-time name and is no longer guaranteed to persist across reboots. • ESX 4 now uses unique LUN identifiers, typically the NAA (Network Addressing Authority) id. This is true for the CLI as well as the GUI, and it is also the naming convention used during installation. • The IQN (iSCSI Qualified Name) is still used for iSCSI targets. • The WWN (World Wide Name) is still used for Fibre Channel targets. • Devices that do not have a unique id are given an MPX reference (MPX stands for VMware Multipath X device).
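The naming conventions above can be illustrated with a small sketch. This is illustrative Python, not a VMware tool: the helper names `classify_identifier` and `parse_runtime_name` are hypothetical, and the pattern assumes runtime names look like the `vmhbaN:C:T:L` examples shown elsewhere in this module.

```python
import re

# Assumed shape of a runtime name, e.g. vmhba33:C0:T1:L1
# (adapter, channel, target, LUN), taken from the examples in this module.
RUNTIME_NAME = re.compile(r"^(vmhba\d+):C(\d+):T(\d+):L(\d+)$")


def classify_identifier(name):
    """Return a rough label for a storage identifier string (hypothetical helper)."""
    if name.startswith("naa."):
        return "NAA unique LUN identifier"
    if name.startswith("mpx."):
        return "MPX identifier (device without a unique id)"
    if name.startswith("iqn."):
        return "iSCSI Qualified Name"
    if RUNTIME_NAME.match(name):
        return "runtime name (not persistent across reboots)"
    return "unknown"


def parse_runtime_name(name):
    """Split a runtime name into adapter, channel, target and LUN."""
    match = RUNTIME_NAME.match(name)
    if match is None:
        raise ValueError("not a runtime name: %r" % name)
    adapter, channel, target, lun = match.groups()
    return {
        "adapter": adapter,
        "channel": int(channel),
        "target": int(target),
        "lun": int(lun),
    }
```

Because the runtime name can change across reboots, a script would normally key on the `naa.` or `mpx.` identifier and treat the runtime name only as a description of the current path.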
Pluggable Storage Architecture • PSA, the Pluggable Storage Architecture, is a collection of VMkernel APIs that allow third-party hardware vendors to insert code directly into the ESX storage I/O path. • This allows third-party software developers to design their own load balancing techniques and failover mechanisms for particular storage array types. • It also means that third-party vendors can add support for new arrays to ESX without having to give VMware internal information or intellectual property about the array. • By default, VMware provides a generic Multipathing Plugin (MPP) called the NMP (Native Multipathing Plugin). • PSA coordinates the operation of the NMP and any additional third-party MPPs.
Pluggable Storage Architecture (ctd) [Figure: VMkernel storage stack in ESX 3 and ESX 4, side by side. Layers shown include SCSI HBA emulation, SCSI disk emulation, the filesystem switch (VMFS2, VMFS3, NFS), the Logical Device IO Scheduler, the Adapter IO Scheduler, Linux emulation, and the iSCSI, FC and SCSI device drivers. In the ESX 4 stack the PSA forms the SCSI mid-layer (multipathing, LUN discovery, path masking, and path policy code) and hosts the MPP.]
PSA Tasks • Loads and unloads multipathing plugins (MPPs). • Handles physical path discovery and removal (via scanning). • Routes I/O requests for a specific logical device to an appropriate MPP. • Handles I/O queuing to the physical storage HBAs & to the logical devices. • Implements logical device bandwidth sharing between Virtual Machines. • Provides logical device and physical path I/O statistics. VI4 - Mod 3 - Slide
Native Multipathing Plugin - NMP • NMP is VMware's Native Multipathing Plugin in ESX 4.0. • NMP supports all storage arrays listed on the VMware storage Hardware Compatibility List (HCL). • NMP manages sub-plugins for handling multipathing and load balancing. • PSA sits in the SCSI mid-layer of the VMkernel I/O stack. [Figure: VMkernel I/O stack showing the PSA framework below the Logical Device IO Scheduler, containing the Multipathing Plugin and the NMP with its SATP and PSP sub-plugins, above Linux emulation and the device drivers.]
MPP Tasks • The PSA discovers available storage paths and based on a set of predefined rules, the PSA will determine which MPP should be given ownership of the path. • The MPP then associates a set of physical paths with a specific logical device. • The specific details of handling path failover for a given storage array are delegated to a sub-plugin called a Storage Array Type Plugin (SATP). • SATP is associated with paths. • The specific details for determining which physical path is used to issue an I/O request (load balancing) to a storage device are handled by a sub-plugin called Path Selection Plugin (PSP). • PSP is associated with logical devices. VI4 - Mod 3 - Slide
NMP Specific Tasks • Manage physical path claiming and unclaiming. • Register and de-register logical devices. • Associate physical paths with logical devices. • Process I/O requests to logical devices: • Select an optimal physical path for the request (load balance). • Perform actions necessary to handle failures and request retries. • Support management tasks such as abort or reset of logical devices.
ESX 3.5 LegacyMP vs ESX 4.0 Native Multipathing Plug-in (NMP) [Figure: the ESX 3.5 and ESX 4.0 VMkernel storage stacks compared. In ESX 3.5, LegacyMP is a jumble of array-specific code and path policies. In ESX 4.0, the NMP under the PSA separates these into SATPs (shown: FASTT, CX, EVA) and PSPs (shown: FIXED, RR).]
Storage Array Type Plugin - SATP • A Storage Array Type Plugin (SATP) handles path failover operations. • VMware provides a default SATP for each supported array as well as a generic SATP (an active/active version and an active/passive version) for non-specified storage arrays. • If you want to take advantage of certain storage-specific characteristics of your array, you can install a third-party SATP provided by the vendor of the storage array, or by a software company specializing in optimizing the use of your storage array. • Each SATP implements the support for a specific type of storage array, e.g. VMW_SATP_SVC for IBM SVC.
SATP (ctd) • The primary functions of an SATP are to: • Implement the switching of physical paths to the array when a path has failed. • Determine when a hardware component of a physical path has failed. • Monitor the hardware state of the physical paths to the storage array. • There are many storage array type plug-ins. To see the complete list, you can use the following commands:
# esxcli nmp satp list
# esxcli nmp satp listrules
# esxcli nmp satp listrules -s <specific SATP>
SATP (ctd) • List the defined Storage Array Type Plugins (SATP) for the VMware Native Multipath Plugin (NMP).
# esxcli nmp satp list
Name                 Default PSP    Description
VMW_SATP_ALUA_CX     VMW_PSP_FIXED  Supports EMC CX that use the ALUA protocol
VMW_SATP_SVC         VMW_PSP_FIXED  Supports IBM SVC
VMW_SATP_MSA         VMW_PSP_MRU    Supports HP MSA
VMW_SATP_EQL         VMW_PSP_FIXED  Supports EqualLogic arrays
VMW_SATP_INV         VMW_PSP_FIXED  Supports EMC Invista
VMW_SATP_SYMM        VMW_PSP_FIXED  Supports EMC Symmetrix
VMW_SATP_LSI         VMW_PSP_MRU    Supports LSI and other arrays compatible with the SIS 6.10 in non-AVT mode
VMW_SATP_EVA         VMW_PSP_FIXED  Supports HP EVA
VMW_SATP_DEFAULT_AP  VMW_PSP_MRU    Supports non-specific active/passive arrays
VMW_SATP_CX          VMW_PSP_MRU    Supports EMC CX that do not use the ALUA protocol
VMW_SATP_ALUA        VMW_PSP_MRU    Supports non-specific arrays that use the ALUA protocol
VMW_SATP_DEFAULT_AA  VMW_PSP_FIXED  Supports non-specific active/active arrays
VMW_SATP_LOCAL       VMW_PSP_FIXED  Supports direct attached devices
SATP (ctd) • To filter the rules to a specific SATP:
# esxcli nmp satp listrules -s VMW_SATP_EVA
Name          Vendor  Model   Driver  Options  Claim Options  Description
VMW_SATP_EVA          HSV101                   tpgs_off       active/active EVA 3000 GL
VMW_SATP_EVA          HSV111                   tpgs_off       active/active EVA 5000 GL
VMW_SATP_EVA          HSV200                   tpgs_off       active/active EVA 4000/6000 XL
VMW_SATP_EVA          HSV210                   tpgs_off       active/active EVA 8000/8100 XL
• This shows all the controller/array models in the HP EVA series that are associated with the VMW_SATP_EVA Storage Array Type Plug-in.
Path Selection Plugin (PSP) • If you want to take advantage of more complex I/O load balancing algorithms, you could install a third-party Path Selection Plugin (PSP). • A PSP handles load balancing operations and is responsible for choosing a physical path to issue an I/O request to a logical device. • VMware provides three PSPs: Fixed, MRU, and Round Robin.
# esxcli nmp psp list
Name           Description
VMW_PSP_MRU    Most Recently Used Path Selection
VMW_PSP_RR     Round Robin Path Selection
VMW_PSP_FIXED  Fixed Path Selection
NMP Supported PSPs • Most Recently Used (MRU) — Selects the first working path discovered at system boot time. If this path becomes unavailable, the ESX host switches to an alternative path and continues to use the new path while it is available. • Fixed — Uses the designated preferred path, if it has been configured. Otherwise, it uses the first working path discovered at system boot time. If the ESX host cannot use the preferred path, it selects a random alternative available path. The ESX host automatically reverts back to the preferred path as soon as the path becomes available. • Round Robin (RR) – Uses an automatic path selection rotating through all available paths and enabling load balancing across the paths. VI4 - Mod 3 - Slide
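The three policies can be contrasted with a toy model. The sketch below is illustrative Python, not VMware code: the class names and the `select(working)` interface are hypothetical, `working` is assumed to be the list of currently usable paths in discovery order, and Fixed is simplified so that, with no preferred path configured, the first working path acts as the preferred one.

```python
import random


class MRU:
    """Most Recently Used: stay on the current path until it fails, no failback."""

    def __init__(self):
        self.current = None

    def select(self, working):
        if not working:
            raise RuntimeError("no working paths")
        if self.current not in working:
            self.current = working[0]
        return self.current


class Fixed:
    """Fixed: use the preferred path whenever it is available."""

    def __init__(self, preferred=None):
        self.preferred = preferred
        self.current = None

    def select(self, working):
        if not working:
            raise RuntimeError("no working paths")
        if self.preferred is None:
            # Simplifying assumption: first working path becomes the preferred path.
            self.preferred = working[0]
        if self.preferred in working:
            self.current = self.preferred          # revert as soon as it is back
        elif self.current not in working:
            self.current = random.choice(working)  # random alternative path
        return self.current


class RoundRobin:
    """Round Robin: rotate through all available paths."""

    def __init__(self):
        self.count = 0

    def select(self, working):
        if not working:
            raise RuntimeError("no working paths")
        path = working[self.count % len(working)]
        self.count += 1
        return path
```

The behavioural difference shows up when a failed path returns: `MRU` keeps using the path it failed over to, whereas `Fixed` goes back to the preferred path.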
NMP I/O Flow • When a Virtual Machine issues an I/O request to a logical device managed by the NMP, the following steps take place: • The NMP calls the PSP assigned to this logical device. • The PSP selects an appropriate physical path to send the I/O. • Load balancing the I/O if necessary. • If the I/O operation is successful, the NMP reports its completion. • If the I/O operation reports an error, the NMP calls an appropriate SATP. • The SATP interprets the error codes and, when appropriate, activates inactive paths and fails over to the new active path. • The PSP is then called to select a new active path from the available paths to send the I/O. VI4 - Mod 3 - Slide
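The steps above can be sketched as a short simulation. This is a minimal Python model under stated assumptions, not the NMP implementation: `select_active`, `fail_over` and `issue_io` are hypothetical names, path state is a plain dictionary, and `send` is a caller-supplied function that returns True when the I/O succeeds on a path.

```python
def select_active(paths):
    """Toy PSP: return the first path whose state is 'active'."""
    for name, state in paths.items():
        if state == "active":
            return name
    raise RuntimeError("no active path")


def fail_over(paths, failed):
    """Toy SATP: mark the failed path dead and activate the first standby path."""
    paths[failed] = "dead"
    for name, state in paths.items():
        if state == "standby":
            paths[name] = "active"
            return


def issue_io(paths, send):
    """Toy NMP: ask the PSP for a path, and on error involve the SATP and retry once."""
    path = select_active(paths)      # NMP calls the PSP, which selects a path
    if send(path):
        return path                  # success: NMP reports completion
    fail_over(paths, path)           # error: NMP calls the SATP
    path = select_active(paths)      # PSP selects a new active path
    if send(path):
        return path
    raise RuntimeError("I/O failed after failover")
```

For example, with `paths = {"vmhba1:C0:T0:L1": "active", "vmhba0:C0:T1:L1": "standby"}` and a `send` that fails only on `vmhba1:C0:T0:L1`, `issue_io` returns `vmhba0:C0:T1:L1` and leaves the first path marked dead.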
NMP I/O Flow (ctd) [Figure: I/O path from Emulation & FS Switch through the PSA (NMP with its SATP and PSP) and Emulation & Drivers to HBA 1 and HBA 2. The paths shown are labelled Active, Dead, Standby, and Active.]
ESX 4.0 Failover Logs • By default, logging is minimal in ESX 4.0 RC. • The following test disables the 2 active paths to a LUN. • # esxcfg-mpath -s off -d naa.600601601d311f00e93e751b93b4dd11 --path vmhba3:C0:T1:L4 • # esxcfg-mpath -s off -d naa.600601601d311f00e93e751b93b4dd11 --path vmhba2:C0:T1:L4 • The message in the logs to indicate that a ‘failover’ has occurred is: Nov 19 17:25:17 cs-tse-h97 vmkernel: 2:05:02:23.559 cpu5:4111)NMP: nmp_HasMoreWorkingPaths: STANDBY path(s) only to device "naa.600601601d311f00e93e751b93b4dd11". VI4 - Mod 3 - Slide
Enabling Additional Logging on ESX 4.0 • For additional SCSI Log Messages, set: • Scsi.LogCmdErrors = "1" • Scsi.LogMPCmdErrors = "1" • These can be found in the Advanced Settings.
ESX 4.0 Failover Logs With Additional Logging • 14:13:47:59.878 cpu7:4109)NMP: nmp_HasMoreWorkingPaths: STANDBY path(s) only to device "naa.60060160432017005c97aea1b32fdc11". • 14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_PspSelectPathForIO: Plugin VMW_PSP_MRU selectPath() returned path "vmhba0:C0:T1:L1" for device "naa.60060160432017005c97aea1b32fdc11" which is in state standby instead of ON. Status is Bad parameter • 14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_SelectPathAndIssueCommand: PSP select path "vmhba0:C0:T1:L1" in a bad state on device "naa.60060160432017005c97aea1b32fdc11". • 14:13:47:59.887 cpu7:4374)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410004237c00) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba0:C0:T1:L1" H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. • 14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.60060160432017005c97aea1b32fdc11": awaiting fast path state update for failover with I/O blocked... • 14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "naa.60060160432017005c97aea1b32fdc11" is blocked. Not starting I/O from device. • 14:13:48:00.069 cpu7:4109)NMP: nmp_DeviceUpdatePathStates: Activated path "vmhba0:C0:T1:L1" for NMP device "naa.60060160432017005c97aea1b32fdc11". • 14:13:48:00.888 cpu1:4206)WARNING: NMP: nmp_DeviceAttemptFailover: Retry world failover device "naa.60060160432017005c97aea1b32fdc11" - issuing command 0x410004237c00 • 14:13:48:00.888 cpu2:4373)WARNING: NMP: nmp_CompleteRetryForPath: Retry command 0x2a (0x410004237c00) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba0:C0:T1:L1" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0. 
• 14:13:48:00.888 cpu2:4373)WARNING: NMP: nmp_CompleteRetryForPath: Retry world restored device "naa.60060160432017005c97aea1b32fdc11" - no more commands to retry • 14:13:48:00.888 cpu2:4373)ScsiDeviceIO: 746: Command 0x2a to device "naa.60060160432017005c97aea1b32fdc11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0. VI4 - Mod 3 - Slide
ESX 4.0 Failover Logs – FC cable unplugged 14:13:32:16.716 cpu3:4099)<6>qla2xxx 003:00.1: LOOP DOWN detected mbx1=2h mbx2=5h mbx3=0h. 14:13:32:24.425 cpu6:4195)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410004286980) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba1:C0:T0:L1" H:0x5 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0. 14:13:32:24.425 cpu6:4195)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160432017005c97aea1b32fdc11" state in doubt; requesting fast path state update... 14:13:32:24.425 cpu6:4195)ScsiDeviceIO: 746: Command 0x2a to device "naa.60060160432017005c97aea1b32fdc11" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0. 14:13:32:26.718 cpu4:4198)<3> rport-4:0-0: blocked FC remote port time out: saving binding 14:13:32:26.718 cpu4:4198)<3> rport-4:0-1: blocked FC remote port time out: saving binding 14:13:32:26.718 cpu5:4101)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410004286980) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba1:C0:T0:L1" H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. 14:13:32:26.718 cpu5:4101)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.60060160432017005c97aea1b32fdc11": awaiting fast path state update for failover with I/O blocked... 14:13:32:26.718 cpu5:4101)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100042423c0) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba1:C0:T0:L1" H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. 14:13:32:26.718 cpu3:4281)WARNING: VMW_SATP_CX: satp_cx_otherSPIsHung: Path "vmhba1:C0:T1:L1" MODE SENSE PEER SP command failed 0/1 0x0 0x0 0x0. 
14:13:32:26.719 cpu1:4206)WARNING: NMP: nmp_DeviceAttemptFailover: Retry world failover device "naa.60060160432017005c97aea1b32fdc11" - issuing command 0x410004286980 14:13:32:26.752 cpu2:4237)NMP: nmp_CompleteRetryForPath: Retry world recovered device "naa.60060160432017005c97aea1b32fdc11" VI4 - Mod 3 - Slide
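When triaging logs like the ones above, it can help to pull the device, path and status codes out of each failed-command message. The following is a minimal Python sketch, not a VMware utility: `parse_failed_command` is a hypothetical helper, the regular expression is modelled only on the message format shown in these slides, and the reading of H, D and P as host, device and plugin status is the usual interpretation rather than something these slides state.

```python
import re

# Modelled on lines such as:
#   ... to NMP device "naa...." failed on physical path "vmhba1:C0:T0:L1" H:0x5 D:0x0 P:0x0 ...
FAILED_CMD = re.compile(
    r'NMP device "(?P<device>[^"]+)" failed on physical path "(?P<path>[^"]+)" '
    r"H:(?P<host>0x[0-9a-fA-F]+) D:(?P<dev>0x[0-9a-fA-F]+) P:(?P<plugin>0x[0-9a-fA-F]+)"
)


def parse_failed_command(line):
    """Return device, path and H/D/P status codes, or None if the line does not match."""
    match = FAILED_CMD.search(line)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "path": match.group("path"),
        "host_status": int(match.group("host"), 16),
        "device_status": int(match.group("dev"), 16),
        "plugin_status": int(match.group("plugin"), 16),
    }
```

Grouping the parsed records by `path` gives a quick view of which physical path the errors cluster on, which is often the first question in a failover investigation.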
VMkernel Modules
# vmkload_mod -l | grep satp
vmw_satp_local       0x418017811000 0x1000 0x417fd8676270 0x1000 10 Yes
vmw_satp_default_aa  0x418017812000 0x1000 0x417fd8680e80 0x1000 11 Yes
vmw_satp_alua        0x41801783c000 0x4000 0x417fd8684460 0x1000 17 Yes
vmw_satp_cx          0x418017840000 0x6000 0x417fd868d9b0 0x1000 18 Yes
vmw_satp_default_ap  0x418017846000 0x2000 0x417fd868e9c0 0x1000 19 Yes
vmw_satp_eva         0x418017848000 0x2000 0x417fd868f9d0 0x1000 20 Yes
vmw_satp_lsi         0x41801784a000 0x4000 0x417fd86909e0 0x1000 21 Yes
vmw_satp_symm        0x41801784e000 0x1000 0x417fd86919f0 0x1000 22 Yes
vmw_satp_inv         0x41801784f000 0x3000 0x417fd8692a00 0x1000 23 Yes
vmw_satp_eql         0x418017852000 0x1000 0x417fd8693a10 0x1000 24 Yes
vmw_satp_msa         0x418017853000 0x1000 0x417fd8694a20 0x1000 25 Yes
vmw_satp_svc         0x418017854000 0x1000 0x417fd8695a30 0x1000 26 Yes
vmw_satp_alua_cx     0x418017855000 0x3000 0x417fd8696a40 0x1000 27 Yes
# vmkload_mod -l | grep psp
vmw_psp_fixed        0x418017813000 0x2000 0x417fd8681e90 0x1000 12 Yes
vmw_psp_rr           0x418017858000 0x3000 0x417fd8697a80 0x1000 28 Yes
vmw_psp_mru          0x41801785b000 0x2000 0x417fd8698aa0 0x1000 29 Yes
• There is no equivalent to the vmkload_mod command in the VI CLI 4.0. To list this information on ESXi, use the vicfg-module -l (list) RCLI command.
PSA and NMP Terminology & Concepts • An MPP “claims” a physical path and “manages” or “exports” a logical device. • Only the MPP can associate a physical path with a logical device. • Which MPP claims the path is decided by a set of PSA rules. • All rules for the plugins and sub-plugins are stored in the /etc/vmware/esx.conf file on the ESX/ESXi server. • If the MPP is the NMP from VMware, then: • NMP “associates” an SATP with a path from a given type of array. • NMP “associates” a PSP with a logical device. • NMP specifies a default PSP for every logical device based on the SATP associated with the physical paths for that device. • NMP allows the default PSP for a device to be overridden. VI4 - Mod 3 - Slide
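The last two points, a default PSP derived from the SATP and an optional per-device override, amount to a simple lookup. The sketch below is illustrative Python: the function `psp_for_device` is hypothetical, and the mapping is a subset of the SATP defaults shown in the `esxcli nmp satp list` output earlier in this module.

```python
# Subset of the SATP -> default PSP pairs listed by "esxcli nmp satp list".
DEFAULT_PSP = {
    "VMW_SATP_CX": "VMW_PSP_MRU",
    "VMW_SATP_EVA": "VMW_PSP_FIXED",
    "VMW_SATP_DEFAULT_AA": "VMW_PSP_FIXED",
    "VMW_SATP_DEFAULT_AP": "VMW_PSP_MRU",
    "VMW_SATP_LOCAL": "VMW_PSP_FIXED",
}


def psp_for_device(satp, override=None):
    """Return the PSP for a device: the override if given, else the SATP's default."""
    if override is not None:
        return override
    return DEFAULT_PSP[satp]
```

For instance, a device whose paths are claimed by `VMW_SATP_CX` gets `VMW_PSP_MRU` unless an administrator overrides it, say with `VMW_PSP_RR`.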
Viewing Plugin Information • The following command lists all multipathing modules loaded on the system. At a minimum, this command returns the default VMware Native Multipath (NMP) plugin and the MASK_PATH plugin. Third-party MPPs are also listed if installed:
# esxcfg-mpath -G
MASK_PATH
NMP
• For ESXi, the following VI CLI 4.0 command can be used:
# vicfg-mpath -G --server <IP> --username <X> --password <Y>
MASK_PATH
NMP
• LUN path masking is done via the MASK_PATH plug-in.
Viewing Plugin Information (ctd) • Rules appear in the order that they are evaluated [0 – 65535]. • Rules are stored in the /etc/vmware/esx.conf file. To list them, run the following command:
# esxcli corestorage claimrule list
Rule   Class    Type       Plugin     Matches
0      runtime  transport  NMP        transport=usb
1      runtime  transport  NMP        transport=sata
2      runtime  transport  NMP        transport=ide
3      runtime  transport  NMP        transport=block
4      runtime  transport  NMP        transport=unknown
101    runtime  vendor     MASK_PATH  vendor=DELL model=Universal Xport
101    file     vendor     MASK_PATH  vendor=DELL model=Universal Xport
65535  runtime  vendor     NMP        vendor=* model=*
• Rule 0: Any USB storage will be claimed by the NMP plugin. • Rule 101: Dell requested that we hide these array pseudo devices by default. • Rule 65535: Any storage not claimed by a previous rule will be claimed by NMP. • The Class column tells us whether the rule is in esx.conf (file) or in the VMkernel (runtime).
Viewing Plugin Information (ctd) • Storage paths are defined based on the following parameters: • Vendor/model strings • Transport, such as SATA, IDE, Fibre Channel, and so on • Location of a specific adapter, target, or LUN • Device driver, for example, Mega-RAID • The NMP claims all paths connected to storage devices that use the USB, SATA, IDE, and Block SCSI transports. • The MASK_PATH module claims all paths connected to Universal Xport by Dell. • The MASK_PATH module is used to mask paths from your host. • The last rule, vendor=* model=*, is a catch-all for any arrays that do not match any of the previous rules.
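Claim rule evaluation as described here, lowest rule number first with the first match winning, can be sketched in a few lines. This is an illustrative Python model rather than the PSA's actual matching code: the `claim` function and the dictionary representation of a path are hypothetical, and the rule table simply mirrors the runtime rules from the `esxcli corestorage claimrule list` output in this module.

```python
# (rule number, plugin, match criteria), mirroring the runtime claim rules listed above.
CLAIM_RULES = [
    (0, "NMP", {"transport": "usb"}),
    (1, "NMP", {"transport": "sata"}),
    (2, "NMP", {"transport": "ide"}),
    (3, "NMP", {"transport": "block"}),
    (4, "NMP", {"transport": "unknown"}),
    (101, "MASK_PATH", {"vendor": "DELL", "model": "Universal Xport"}),
    (65535, "NMP", {"vendor": "*", "model": "*"}),
]


def claim(path_attrs, rules=CLAIM_RULES):
    """Return (rule number, plugin) for the first rule matching the path, or None."""
    for number, plugin, criteria in sorted(rules, key=lambda rule: rule[0]):
        if all(
            wanted == "*" or path_attrs.get(key) == wanted
            for key, wanted in criteria.items()
        ):
            return number, plugin
    return None
```

A Dell Universal Xport pseudo device is caught by rule 101 and masked, while any other vendor and model falls through to the catch-all rule 65535 and is claimed by the NMP.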
Viewing Device Information • The command esxcli nmp device list lists all devices managed by the NMP plug-in and the configuration of each device, e.g.:
# esxcli nmp device list
naa.600601601d311f001ee294d9e7e2dd11
    Device Display Name: DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11)
    Storage Array Type: VMW_SATP_CX
    Storage Array Type Device Config: {navireg ipfilter}
    Path Selection Policy: VMW_PSP_MRU
    Path Selection Policy Device Config: Current Path=vmhba33:C0:T0:L1
    Working Paths: vmhba33:C0:T0:L1
mpx.vmhba1:C0:T0:L0
    Device Display Name: Local VMware Disk (mpx.vmhba1:C0:T0:L0)
    Storage Array Type: VMW_SATP_LOCAL
    Storage Array Type Device Config:
    Path Selection Policy: VMW_PSP_FIXED
    Path Selection Policy Device Config: {preferred=vmhba1:C0:T0:L0;current=vmhba1:C0:T0:L0}
    Working Paths: vmhba1:C0:T0:L0
• naa is the Network Addressing Authority (NAA) identifier, which is guaranteed to be unique. • The Storage Array Type Device Config shown for VMW_SATP_CX is configuration specific to EMC Clariion & Invista products. • mpx is used as an identifier for devices that do not have their own unique ids.
Viewing Device Information (ctd) • Get current path information for a specified storage device managed by the NMP.
# esxcli nmp device list -d naa.600601604320170080d407794f10dd11
naa.600601604320170080d407794f10dd11
    Device Display Name: DGC Fibre Channel Disk (naa.600601604320170080d407794f10dd11)
    Storage Array Type: VMW_SATP_CX
    Storage Array Type Device Config: {navireg ipfilter}
    Path Selection Policy: VMW_PSP_MRU
    Path Selection Policy Device Config: Current Path=vmhba2:C0:T0:L0
    Working Paths: vmhba2:C0:T0:L0
Viewing Device Information (ctd) • Lists all paths available for a specified storage device on ESX:
# esxcfg-mpath -b -d naa.600601601d311f001ee294d9e7e2dd11
naa.600601601d311f001ee294d9e7e2dd11 : DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11)
    vmhba33:C0:T0:L1 LUN:1 state:active iscsi Adapter: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b Target: IQN=iqn.1992-04.com.emc:cx.ck200083700716.b0 Alias= Session=00023d000001 PortalTag=1
    vmhba33:C0:T1:L1 LUN:1 state:standby iscsi Adapter: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b Target: IQN=iqn.1992-04.com.emc:cx.ck200083700716.a0 Alias= Session=00023d000001 PortalTag=2
• ESXi has an equivalent vicfg-mpath command.
Viewing Device Information (ctd) • # esxcfg-mpath -l -d naa.6006016043201700d67a179ab32fdc11 • iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b-00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.a0,t,2-naa.600601601d311f001ee294d9e7e2dd11 • Runtime Name: vmhba33:C0:T1:L1 • Device: naa.600601601d311f001ee294d9e7e2dd11 • Device Display Name: DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11) • Adapter: vmhba33 Channel: 0 Target: 1 LUN: 1 • Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b • Target Identifier: 00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.a0,t,2 • Plugin: NMP • State: standby • Transport: iscsi • Adapter Transport Details: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b • Target Transport Details: IQN=iqn.1992-04.com.emc:cx.ck200083700716.a0 Alias= Session=00023d000001 PortalTag=2 • iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b-00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.b0,t,1-naa.600601601d311f001ee294d9e7e2dd11 • Runtime Name: vmhba33:C0:T0:L1 • Device: naa.600601601d311f001ee294d9e7e2dd11 • Device Display Name: DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11) • Adapter: vmhba33 Channel: 0 Target: 0 LUN: 1 • Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b • Target Identifier: 00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.b0,t,1 • Plugin: NMP • State: active • Transport: iscsi • Adapter Transport Details: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b • Target Transport Details: IQN=iqn.1992-04.com.emc:cx.ck200083700716.b0 Alias= Session=00023d000001 PortalTag=1 Storage array (target) iSCSI Qualified Names (IQNs) VI4 - Mod 3 - Slide
Viewing Device Information (ctd) • Any of the following commands will display the active path: • esxcli nmp path list -d <naa.id> • esxcli nmp device list -d <naa.id> • esxcli nmp psp getconfig -d <naa.id> • This information can also be found in the multipathing information of the storage section in the vSphere client. VI4 - Mod 3 - Slide
Third-Party Multipathing Plug-ins (MPPs) • You can install the third-party multipathing plug-ins (MPPs) when you need to change specific load balancing and failover characteristics of ESX/ESXi. • The third-party MPPs replace the behaviour of the NMP and entirely take control over the path failover and the load balancing operations for certain specified storage devices. VI4 - Mod 3 - Slide
Third-Party SATP & PSP • Third-party SATP • Generally developed by third-party hardware manufacturers who have 'expert' knowledge of the behaviour of their storage devices. • Accommodates specific characteristics of storage arrays and facilitates support for new arrays. • Third-party PSP • Generally developed by third-party software companies. • Provides more complex I/O load balancing algorithms. • NMP coordination • Third-party SATPs and PSPs are coordinated by the NMP, and can be used simultaneously with the VMware SATPs and PSPs.
Troubleshooting the PSA • Scenario 1 • Customer has an issue with a third-party PSP or SATP. • Recommendation might be to unplug the third-party PSP or SATP and plug in VMware's array-specific plug-in or the generic A/A or A/P plug-in. • This recommendation will work for SATPs only if ESX ships an SATP that supports the given array. This may not always be the case. • Scenario 2 • Customer has a problem with a third-party MPP, for example, EMC PowerPath. • Recommendation might be to unplug the third-party product and use the VMware NMP (Native Multipath Plug-in) to verify whether the third-party product is the cause of the problem. • Having the customer switch out the module as first-level triage of what is causing the problem is a reasonable course of action.
PSA Case Studies • Before starting into these case studies, you need to be aware of some changes to SCSI support in version 4.0. • The ESX 4 VMkernel is now SCSI-3 compliant and is capable of using SCSI-3 specific commands & features. • Two new storage features are introduced: Target Port Group Support (TPGS) and Asymmetric Logical Unit Access (ALUA). • TPGS allows new storage devices to be discovered automatically and typically presents a single port to a server while handling load-balancing & failover at the back-end. • Since target ports could be on different physical units, ALUA allows different levels of access for target ports to each LUN. ALUA will route I/O to a particular port to achieve best performance. VI4 - Mod 3 - Slide
PSA Case Study #1 • Trying to present an iSCSI LUN from a NETAPP FAS 250 filer to ESX 4 beta 2 resulted in no LUN discovery and the following /var/log/vmkernel errors: ScsiScan: SCSILogPathInquiry:641: Path 'vmhba34:C0:T0:L0': Vendor: 'NETAPP ' Model: ‘ LUN ' Rev: '0.2 ' ScsiScan: SCSILogPathInquiry:642: Type: 0x0, ANSI rev: 4 ScsiClaimrule: SCSIAddClaimrulesSessionStart:291: Starting claimrules session. Key = 0x40404040. NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1. NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1. SATP_ALUA: satp_alua_initDevicePrivateData: Done with creation of dpd 0x41000e7c11a0. WARNING: SATP_ALUA: satp_alua_getTargetPortInfo: Could not find relative target port ID for path "vmhba34:C0:T0:L0" - Not found (195887107) NMP: nmp_SatpClaimPath: SAT "SATP_ALUA" could not add path "vmhba34:C0:T0:L0" for device "Unregistered device". Error Not found WARNING: NMP: nmp_AddPathToDevice: The physical path "vmhba34:C0:T0:L0" for NMP device "Unregistered device" could not be claimed by SATP "SATP_ALUA". Not found WARNING: NMP: nmp_DeviceAlloc: nmp_AddPathToDevice failed Not found (195887107). WARNING: NMP: nmp_DeviceAlloc: Could not allocate NMP device. WARNING: ScsiPath: SCSIClaimPath:3487: Plugin 'NMP' had an error (Not found) while claiming path 'vmhba34:C0:T0:L0'.Skipping the path. ScsiClaimrule: SCSIClaimruleRunOnPath:734: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba34:C0:T0:L0. Busy ScsiClaimrule: SCSIClaimruleRun:809: Error claiming path vmhba34:C0:T0:L0. Busy. Problem associating the SATP called SATP_ALUA with paths for this target VI4 - Mod 3 - Slide
PSA Case Study #1 (ctd)
# esxcfg-mpath -l
.
iqn.1998-01.com.vmware:cs-tse-f117-525f2d10--
    Runtime Name: vmhba34:C0:T0:L0
    Device:
    Device Display Name:
    Adapter: vmhba34 Channel: 0 Target: 0 LUN: 0
    Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-f117-525f2d10
    Target Identifier:
    Plugin: (unclaimed)
    State: active
    Transport: iscsi
    Adapter Transport Details: Unavailable or path is unclaimed.
    Target Transport Details: Unavailable or path is unclaimed.
• Root Cause: The NetApp array in this case is an old model that did not support ALUA, but the only SATP rule for NetApp arrays in this beta was for arrays that support ALUA. This is why NMP could not claim the LUNs from this array: it expected them to be ALUA compatible.
PSA Case Study #1 (ctd) • # esxcli nmp satp listrules -s SATP_ALUA • Name Vendor Model Driver Options Claim Options Description • SATP_ALUA HSV101 tpgs_on EVA 3000 GL with ALUA • SATP_ALUA HSV111 tpgs_on EVA 5000 GL with ALUA • SATP_ALUA HSV200 tpgs_on EVA 4000/6000 XL with ALUA • SATP_ALUA HSV210 tpgs_on EVA 8000/8100 XL with ALUA • SATP_ALUA NETAPP tpgs_on NetApp with ALUA • SATP_ALUA HP MSA2012sa tpgs_on HP MSA A/A with ALUA • SATP_ALUA Intel Multi-Flex tpgs_on Intel Promise • # grep -i NETAPP /etc/vmware/esx.conf • /storage/plugin/NMP/config[SATP_ALUA]/rules[0004]/description = "NetApp with ALUA" • /storage/plugin/NMP/config[SATP_ALUA]/rules[0004]/vendor = "NETAPP" • # esxcli nmp satp deleteRule -V NETAPP -s SATP_ALUA • # esxcli nmp satp listrules -s SATP_ALUA • Name Vendor Model Driver Options Claim Options Description • SATP_ALUA HSV101 tpgs_on EVA 3000 GL with ALUA • SATP_ALUA HSV111 tpgs_on EVA 5000 GL with ALUA • SATP_ALUA HSV200 tpgs_on EVA 4000/6000 XL with ALUA • SATP_ALUA HSV210 tpgs_on EVA 8000/8100 XL with ALUA • SATP_ALUA HP MSA2012sa tpgs_on HP MSA A/A with ALUA • SATP_ALUA Intel Multi-Flex tpgs_on Intel Promise • # grep -i NETAPP /etc/vmware/esx.conf • # VI4 - Mod 3 - Slide
PSA Case Study #1 (ctd) • Rescan the SAN for the Software iSCSI initiator • ScsiScan: SCSILogPathInquiry:641: Path 'vmhba34:C0:T0:L0': Vendor: 'NETAPP ' Model: 'LUN ' Rev: '0.2 ' • ScsiScan: SCSILogPathInquiry:642: Type: 0x0, ANSI rev: 4 • ScsiClaimrule: SCSIAddClaimrulesSessionStart:291: Starting claimrules session. Key = 0x40404040. • NMP: vmk_NmpPathGroupMovePath: Path "vmhba34:C0:T0:L0" state changed from "dead" to "active" • ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba34:C0:T0:L0' • NMP: nmp_DeviceUpdatePathStates: The PSP selected path "vmhba34:C0:T0:L0" to activate for NMP device "Unregistered device". • ScsiDevice: vmk_ScsiAllocateDevice:1571: Alloc'd device 0x41000d83e600 • NMP: nmp_RegisterDevice: Register NMP device with primary uid 'naa.60a9800068704c6f54344b645a6d5876' and 1 total uids. • . • . • NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'naa.60a9800068704c6f54344b645a6d5876' and name of 'naa.60a9800068704c6f54344b645a6d5876' is completed successfully. VI4 - Mod 3 - Slide
PSA Case Study #1 (ctd)
# esxcfg-mpath -l -d naa.60a9800068704c6f54344b645a6d5876
iqn.1998-01.com.vmware:cs-tse-f117-525f2d10-00023d000001,iqn.1992-08.com.netapp:sn.84228148,t,1-naa.60a9800068704c6f54344b645a6d5876
   Runtime Name: vmhba34:C0:T0:L0
   Device: naa.60a9800068704c6f54344b645a6d5876
   Device Display Name: NETAPP iSCSI Disk (naa.60a9800068704c6f54344b645a6d5876)
   Adapter: vmhba34 Channel: 0 Target: 0 LUN: 0
   Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-f117-525f2d10
   Target Identifier: 00023d000001,iqn.1992-08.com.netapp:sn.84228148,t,1
   Plugin: NMP
   State: active
   Transport: iscsi
   Adapter Transport Details: iqn.1998-01.com.vmware:cs-tse-f117-525f2d10
   Target Transport Details: IQN=iqn.1992-08.com.netapp:sn.84228148 Alias= Session=00023d000001 PortalTag=1
# esxcli nmp device list -d naa.60a9800068704c6f54344b645a6d5876
naa.60a9800068704c6f54344b645a6d5876
   Device Display Name: NETAPP iSCSI Disk (naa.60a9800068704c6f54344b645a6d5876)
   Storage Array Type: SATP_DEFAULT_AA
   Storage Array Type Device Config:
   Path Selection Policy: PSP_FIXED
   Path Selection Policy Device Config: {preferred=vmhba34:C0:T0:L0;current=vmhba34:C0:T0:L0}
   Working Paths: vmhba34:C0:T0:L0
• Because the SATP_ALUA rule for the NetApp has been deleted, no SATP rule matches the array any longer, so NMP associates the device with the default A/A SATP (SATP_DEFAULT_AA) instead.
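The fallback seen in this case study can be pictured as a first-match rule lookup with a default. The Python below is only an illustrative sketch: `SatpRule` and `select_satp` are hypothetical names, not part of ESX or esxcli, and the real NMP matching logic is likely to consider more than is modelled here. It also assumes the LUN reports TPGS as set, which the slides do not state for this array.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

DEFAULT_SATP = "SATP_DEFAULT_AA"  # the fallback reported by 'esxcli nmp device list'


@dataclass(frozen=True)
class SatpRule:
    """Hypothetical, simplified stand-in for one row of 'esxcli nmp satp listrules'."""
    satp: str
    vendor: Optional[str] = None
    model: Optional[str] = None
    claim_option: Optional[str] = None


def select_satp(rules: Iterable[SatpRule], vendor: str, model: str, tpgs_on: bool) -> str:
    """Return the SATP of the first matching rule, else the default A/A SATP."""
    vendor, model = vendor.strip(), model.strip()  # inquiry strings are space padded
    for rule in rules:
        if rule.vendor is not None and rule.vendor != vendor:
            continue
        if rule.model is not None and rule.model != model:
            continue
        if rule.claim_option == "tpgs_on" and not tpgs_on:
            continue
        return rule.satp
    return DEFAULT_SATP


rules = [
    SatpRule("SATP_ALUA", vendor="NETAPP", claim_option="tpgs_on"),
    SatpRule("SATP_ALUA", vendor="HP", model="MSA2012sa", claim_option="tpgs_on"),
]

# Assumption: the LUN reports TPGS as set.
before = select_satp(rules, "NETAPP ", "LUN ", tpgs_on=True)

# Equivalent of deleting the NETAPP rule, as 'deleteRule -V NETAPP' did above.
remaining = [r for r in rules if r.vendor != "NETAPP"]
after = select_satp(remaining, "NETAPP ", "LUN ", tpgs_on=True)
```

With the NETAPP rule present the lookup yields SATP_ALUA; once that rule is removed nothing matches and the default is returned, which mirrors the SATP_DEFAULT_AA result in the device listing.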
PSA Case Study #2
• LUNs from the same NetApp filer are using different SATPs
naa.60a98000486e5351476f4b3670724b58
   Selected Paths : vmhba5:C0:T1:L5
   Storage Array Type: SATP_ALUA
   Storage Array Type Device Config: {implicit_support=on;explicit_support=off;explicit_allow=on;alua_followover=on;TPG_id=1;TPG_state=ANO;RTP_id=4;RTP_health
   Path Policy: PSP_MRU
   PSP Config String: Current Path=vmhba5:C0:T1:L5
naa.60a98000486e544c64344d3345425331
   Selected Paths : vmhba2:C0:T0:L50
   Storage Array Type: SATP_DEFAULT_AA
   Storage Array Type Device Config:
   Path Policy: PSP_FIXED
   PSP Config String: {preferred=vmhba2:C0:T0:L50;current=vmhba2:C0:T0:L50}
PSA Case Study #2 (ctd) • In this beta version of ESX, NetApp LUNs can be claimed by either SATP_ALUA or SATP_DEFAULT_AA, depending on whether the NetApp Initiator Group (igroup) is configured for ALUA mode on the array. • The vmkernel logs show that the NetApp is misconfigured: one storage array controller (target) has ALUA mode turned ON, and the other storage array controller (target) has it turned OFF. • Therefore, the device gets claimed by either SATP_ALUA or SATP_DEFAULT_AA depending on which path is discovered first (the controller with ALUA or the controller without ALUA). • For ALUA, we check that the TPGS (Target Port Group Support) bit in the standard SCSI Inquiry data is set; this check is requested by the "tpgs_on" claim option.
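The two ideas on this slide can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the vmkernel implementation: the function names and the `lun-a`/`lun-b` identifiers are hypothetical, and the sketch assumes that "TPGS set" means a non-zero TPGS field. In SPC-3, TPGS is a two-bit field in bits 5:4 of byte 5 of the standard INQUIRY data.

```python
def tpgs_field(inquiry: bytes) -> int:
    """Return the 2-bit TPGS field: bits 5:4 of byte 5 of standard INQUIRY data (SPC-3)."""
    return (inquiry[5] >> 4) & 0x3


def claim_by_first_path(paths):
    """Pick a SATP per device from the FIRST path discovered for that device.

    'paths' is an iterable of (device_id, runtime_name, tpgs_on) tuples in
    discovery order. Later paths to an already claimed device do not change
    the SATP, which is the behaviour described on the slide.
    """
    satp_by_device = {}
    for device, _runtime_name, tpgs_on in paths:
        if device not in satp_by_device:
            satp_by_device[device] = "SATP_ALUA" if tpgs_on else "SATP_DEFAULT_AA"
    return satp_by_device


# Hypothetical 36-byte inquiry buffer with byte 5 = 0x10 (TPGS field = 1).
inquiry = bytes([0x00, 0x00, 0x05, 0x02, 0x1F, 0x10]) + bytes(30)

# Misconfigured array: target T1 reports TPGS, target T0 does not.
discovered = [
    ("lun-a", "vmhba2:C0:T1:L5", True),    # ALUA controller seen first
    ("lun-a", "vmhba2:C0:T0:L5", False),
    ("lun-b", "vmhba2:C0:T0:L50", False),  # non-ALUA controller seen first
    ("lun-b", "vmhba2:C0:T1:L50", True),
]
claimed = claim_by_first_path(discovered)
```

Here `lun-a` ends up on SATP_ALUA and `lun-b` on SATP_DEFAULT_AA even though both come from the same array, purely because of discovery order.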
PSA Case Study #2 (ctd)
• In the vmkernel logs, the Compared claim opt 'tpgs_on' message tells us whether TPGS is on or off.
• The next Plugin 'NMP' claimed path message tells us which path it was testing. You can see that "'tpgs_on'; result 0" (not set) is followed by vmhbaX:C0:T0:LY and "'tpgs_on'; result 1" (set) is followed by vmhbaX:C0:T1:LY.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.022 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.031 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.051 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L7'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.060 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.080 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L7'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.089 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.097 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.118 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L50'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.126 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.147 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L50'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.156 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.164 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.185 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L56'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.194 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.214 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L56'
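Reading this pairing by eye gets tedious on a host with many LUNs, so the same correlation can be scripted. The following Python is a rough sketch, not a supported tool: the function names are made up for illustration, the regular expressions are derived only from the message text shown in this excerpt, and it assumes the last 'tpgs_on' result logged before a "claimed path" message belongs to that path.

```python
import re

TPGS_RE = re.compile(r"Compared claim opt 'tpgs_on'; result (\d)")
CLAIM_RE = re.compile(r"Plugin 'NMP' claimed path '([^']+)'")


def tpgs_by_path(log_lines):
    """Map each claimed path to the last 'tpgs_on' result logged before it."""
    results = {}
    last = None
    for line in log_lines:
        match = TPGS_RE.search(line)
        if match:
            last = int(match.group(1))
            continue
        match = CLAIM_RE.search(line)
        if match and last is not None:
            results[match.group(1)] = last
            last = None
    return results


def luns_with_mixed_tpgs(results):
    """Return the LUN suffixes (e.g. 'L7') whose paths disagree on TPGS."""
    by_lun = {}
    for path, bit in results.items():
        lun = path.rsplit(":", 1)[1]
        by_lun.setdefault(lun, set()).add(bit)
    return {lun for lun, bits in by_lun.items() if len(bits) > 1}


# The first five messages of the excerpt, with the syslog prefix trimmed.
sample = [
    "NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.",
    "NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.",
    "ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L7'",
    "NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.",
    "ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L7'",
]
paths = tpgs_by_path(sample)
mixed = luns_with_mixed_tpgs(paths)
```

Any LUN reported by `luns_with_mixed_tpgs` has at least one path with TPGS set and one without, which is the symptom of the inconsistent ALUA configuration described in this case study.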
Lesson 1 Summary • ESX 4.0 has a new Pluggable Storage Architecture (PSA). • VMware provides a default Native Multipathing Module (NMP) for the PSA. • The NMP has sub-plugins for failover and load balancing. • The Storage Array Type Plug-in (SATP) handles failover, and the Path Selection Plug-in (PSP) handles load balancing. • The PSA lets storage array vendors and software vendors develop their own modules specific to individual arrays.
Lesson 1 - Lab 1 • Lab 1 involves examining the Pluggable Storage Architecture • View PSA, NMP, SATP & PSP information on the student host • Replace the Path Selection Plug-in (PSP) • Choose Fixed or MRU • Use the MASK_PATH plugin to mask LUNs from the host.
Module 3 Lessons • Lesson 1 - Pluggable Storage Architecture • Lesson 2 - SCSI-3 & MSCS Support • Lesson 3 - iSCSI Enhancements • Lesson 4 - Storage Administration & Reporting • Lesson 5 - Snapshot Volumes & Resignaturing • Lesson 6 - Storage VMotion • Lesson 7 - Thin Provisioning • Lesson 8 - Volume Grow / Hot VMDK extend • Lesson 9 - Storage CLI enhancements • Lesson 10 – Paravirtualized SCSI Driver • Lesson 11 – Service Console Storage VI4 - Mod 3 - Slide
Windows Clustering Changes • In Windows Server 2003 Server Clustering, SCSI bus resets are used to break SCSI reservations so that another controller can take control of the disks. • The problem with SCSI bus resets is that they also impact other disks sharing the SCSI bus which do not hold a SCSI reservation. • In Windows Server 2008 Failover Clustering, SCSI bus resets are no longer used; persistent reservations are used instead. • This means that directly attached SCSI storage will no longer be supported in Windows 2008 for failover clustering. • Serial Attached SCSI (SAS), Fibre Channel, and iSCSI will be the only technologies supported by Microsoft for Windows 2008, but this does not necessarily mean that VMware will support these technologies for clustering Virtual Machines. • With persistent reservations, other SCSI devices on the same bus as the reserved LUN are not impacted by SCSI reservation clearing & pre-empting operations.