Leveraging file-based shared storage for enterprise applications Mathew George Principal SDE 3-053
Agenda • Why use networked file storage? • What’s new in Windows Server 2012? • SMB 3.0 highlights • How can apps make use of this? • Programming considerations
Why use networked file storage? • Why not? • Historical reasons for not using file shares for I/O-intensive server applications • Poor performance (due to network fabric and protocol limitations) • Unreliable connections • Unreliable storage on the file server • Lack of integrity guarantees • Limited tools for diagnostics and management
Why use networked file storage? • The world has changed • More reliable network fabrics, multiple integrated NICs • Ethernet speed is competitive with Fibre Channel • Robust storage behind the file server • Low cost • No need for specialized storage networking infrastructure or knowledge • Easier management and troubleshooting • File shares instead of LUNs and SAN zoning • Familiar access control and authentication mechanisms • Dynamic server/service relocation
What is new in Windows Server 2012? SMB 3.0 • Continuously available file server • Scale-out file server • Bandwidth aggregation (SMB multichannel) • Support for new network fabrics (SMB Direct) • RDMA on iWARP, RoCE, and InfiniBand • SMB encryption • Storage Spaces & ReFS • Easier manageability and diagnosability • Application-consistent backups • PowerShell-based configuration • ETW events, performance counters
What is a continuously available file server? • Insulates applications from network and server failures • I/O issued by an application will be resilient to • Transient network failures • Failure of one or more network paths • Planned or unplanned server/storage failures • Most existing Windows applications which work on local storage will work without modifications against a continuously available file share
Continuously available SMB file server • Failover transparent to application • SMB client and server handle failover gracefully • Zero downtime, with a small I/O delay during failover • Most* file and directory operations continue to work • Supports planned and unplanned failovers • Hardware or software maintenance • Hardware or software failures • Load rebalancing • Requires • SMB server in a failover cluster • SMB server and client must implement SMB 3.0 • Shares enabled for ‘Continuous Availability’ (CA)
[Diagram: a server application accessing \\fs\share on a file server cluster (File Server Node A, File Server Node B). (1) Normal operation. (2) Failover of the share: connections and handles lost, temporary stall of I/O. (3) Connections and handles auto-recovered; the application continues with no errors.]
Scale-out SMB file server • Active-active configuration • Unified view of the shares and the file system across all nodes • Clients can connect to any node • Clients can be moved from one node to another transparently • Targeted for server application storage • Virtualization and databases • Increase available bandwidth by adding cluster nodes • SMB 3.0 clients get scale-out and transparent failover • Limitations • Not suitable for a general-purpose file server • SMB 2.x clients can connect, but no transparent failover • SMB 1.x clients cannot connect
[Diagram: an application cluster connects over the datacenter network to a file server cluster that presents a single logical file server (\\FS\Share) with a single file system namespace on a cluster file system.]
SMB multichannel • Use multiple connections to move data between SMB client and server • Tolerate failure of one or more NICs • Enhanced throughput • More bandwidth with multiple NICs • Spread interrupt load across CPU cores • Complements NIC teaming • ZERO configuration • Easy monitoring and troubleshooting
[Diagram: sample configurations. (a) Multiple 10GbE/IB RSS-capable NICs on the SMB client and SMB server, connected through 10GbE/IB switches. (b) Multiple 1GbE NICs in an LBFO team on the SMB client and SMB server, connected through a 1GbE switch. Vertical blue lines are logical channels, not cables.]
SMB multichannel: A CPU comparison • SMB session, without multichannel • Only one TCP/IP connection • Only one NIC is used • Only one CPU core engaged • Can’t use full 20 Gbps • SMB session, with multichannel • Multiple TCP/IP connections • Receive Side Scaling (RSS) helps distribute load across CPU cores • Full 20 Gbps available
[Diagram: two panels, one per case. Each shows an SMB client and an SMB server with two 10GbE/IB NICs apiece, connected through 10GbE/IB switches, above a chart of CPU utilization per core (Core 1 to Core 4).]
SMB Direct (SMB over RDMA) • New class of network hardware • Speeds up to 56 Gbps • Full hardware offload, low CPU overhead, low latency • Remote DMA moves large chunks of memory between servers without any CPU intervention • Requires RDMA-capable interface • Supports iWARP, RoCE, and InfiniBand
[Diagram: on the file client, the application (user mode) calls into the SMB client (kernel), which uses an R-NIC on a network with RDMA support; on the file server, an R-NIC feeds the SMB server, then NTFS, SCSI, and the disk.]
How can I use SMB Direct? • Applications continue to use existing Win32/.NET file I/O APIs • SMB client makes the decision to use SMB Direct at run time • NDKPI provides a much thinner layer than TCP/IP • Remote direct memory access performed by the network interfaces
[Diagram: client and file server stacks over Ethernet and/or InfiniBand. (1) The application reaches the SMB client through an unchanged API. (2) SMB client and SMB server can use either TCP/IP over a regular NIC or SMB Direct. (3) SMB Direct sits on NDKPI over an RDMA NIC. (4) RDMA moves data between client memory and server memory.]
How do apps access shared file storage? • Virtualized application instance running in a VM whose storage is on a remote file share • Application directly running against a remotely mounted VHD • Building apps on top of SQL Server 2012/SharePoint and hosting the databases on SMB 3.0 file shares • Directly accessing remote storage (via UNC paths or mapped drives) instead of a directly attached volume
Virtualized applications: Hyper-V over SMB • Hyper-V in Windows Server 2012 fully supports hosting virtual disks on an SMB 3.0 file share • Continuously available shares for fault tolerance • Bandwidth aggregation using SMB multichannel • Live storage migration over TCP/IP networks • Live migration of running VMs in a Hyper-V cluster • No changes are required in the application • Application continues to run on a local drive within the VM
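Virtualized applications: Hyper-V over SMB
[Diagram: I/O path. On the Hyper-V server, the application in the child partition (user mode) sits above NTFS, SCSI/IDE, and the storage VSC (kernel), which connects over the VM bus to the storage VSP, VHD stack, and SMB client in the parent partition. The SMB client reaches the file server through NICs on a network (RDMA option); on the file server the stack is SMB server, NTFS, SCSI, and storage.]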
Configuring Hyper-V over SMB
• Full permissions on folder and share for administrator and computer account of Hyper-V hosts
    REM Folder permissions
    MD F:\VMS
    ICACLS F:\VMS /Inheritance:R
    ICACLS F:\VMS /Grant Dom\HAdmin:(CI)(OI)F
    ICACLS F:\VMS /Grant Dom\HV1$:(CI)(OI)F
    ICACLS F:\VMS /Grant Dom\HV2$:(CI)(OI)F
    # Share permissions (PowerShell)
    New-SmbShare -Name VMS -Path F:\VMS -FullAccess Dom\HAdmin, Dom\HV1$, Dom\HV2$
• Simply point the VHD to the UNC path
    New-VHD -Path \\FS\VMS\VM1.VHDX -Dynamic -SizeBytes 127GB
    New-VM -Name VM1 -Path \\FS\VMS -VHDPath \\FS\VMS\VM1.VHDX -MemoryStartupBytes 1GB
Hyper-V over SMB: Standalone setup • Highlights • Simplicity (file shares, permissions) • Flexibility (migration, shared storage) • Low cost • Storage is fault tolerant (mirroring, parity) • Limitations • File server is not continuously available • VMs are not continuously available
[Diagram: two Hyper-V hosts, each running a VM (VHD/VHDX), using shares on a single file server backed by Storage Spaces.]
Hyper-V over SMB: Clustered setup • Ability to cluster both the file server and Hyper-V hosts • Highlights • Hyper-V VMs are highly available • File server is continuously available • Storage is fault tolerant
[Diagram: two Hyper-V hosts in a failover cluster, each running a VM (VHD/VHDX), using shares on two file servers in a second failover cluster, backed by clustered spaces on shared SAS JBOD.]
SQL Server support for SMB file shares • SQL Server 2008 R2 • Formalized support for storing user databases on SMB file shares • Removed the trace flag requirement • SMB 2.1 and 3.0 officially supported • Integrated SMB scenarios into automated test infrastructure and labs • SQL Server 2012 • Added support for SQL Server clusters using SMB file shares • Adds flexibility to cluster configurations • Removes the drive-letter restriction for cluster groups • Added support for System DB on SMB file shares • Root of the installation can now be on the share
SQL Server 2012 and Windows Server 2012 SMB transparent failover • Failover transparent to server application • Zero downtime, with a small I/O delay • OS guarantees timeliness and consistency of data • Planned maintenance and unplanned failures • HW/SW maintenance • HW/SW failures • Load rebalancing • Requires • Windows failover clusters • Both the server running the application and the file server must be Windows Server 2012
[Diagram: SQL Server accessing \\fs\salesdb and \\fs\saleslog on a file server cluster (File Server Node A, File Server Node B). (1) Normal operation. (2) Failover of the share: connections and handles lost, temporary stall of I/O. (3) Connections and handles auto-recovered; the application continues with no errors.]
SQL Server 2012 over SMB configuration
• Full permissions on the folder and share for SQL DBA and the SQL Server service account
    REM Set up folder permissions
    MD F:\Data
    ICACLS F:\Data /Inheritance:R
    ICACLS F:\Data /Grant Dom\SQLDBA:(CI)(OI)F
    ICACLS F:\Data /Grant Dom\SQLService:(CI)(OI)F
    # Create share (PowerShell)
    New-SmbShare -Name SQLData -Path F:\Data -FullAccess Dom\SQLDBA, Dom\SQLService
• Simply point to the UNC path of the share when creating the database
Direct access to shared file storage? • Any Win32/WinRT/.NET application can access SMB remote file storage via UNC paths (\\server\share) • Set up explicit credentials (optional) • Credential Manager [cmdkey OR control keymgr.dll] • net use or the equivalent PowerShell command New-SmbMapping • Automatically get the benefits of SMB multichannel and SMB Direct if appropriate hardware is available • Explicitly provision shares for continuous availability, scale-out access and encryption
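As a small illustration of the point that ordinary file APIs work against UNC paths, the sketch below builds a \\server\share path and reads from it with the standard open() call. The server and share names reuse \\fs\salesdb from the failover slide; the folder and file names, and both helper functions, are hypothetical. The read only succeeds on a Windows machine that can reach the share, so it is defined here but not called.

```python
from pathlib import PureWindowsPath


def unc_path(server, share, *parts):
    # Build \\server\share\part1\part2... as a Windows path object.
    # PureWindowsPath only manipulates the string; it does not touch the network.
    return PureWindowsPath(f"\\\\{server}\\{share}", *parts)


def read_prefix(path, length=4096):
    # Plain file I/O: nothing here is SMB-specific. On Windows the SMB client
    # handles the UNC path underneath the ordinary open()/read() calls.
    with open(str(path), "rb") as f:
        return f.read(length)


# Hypothetical file on the \\fs\salesdb share.
db_file = unc_path("fs", "salesdb", "data", "sales.mdf")
```

Here db_file.drive is the \\fs\salesdb share root, which is a convenient way for an application to tell which share a given file lives on.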
Continuously available file handles • Guaranteed I/O fault tolerance on any file handle opened on a “continuously available” share • Data consistency is guaranteed in the event of server/network failures • The OS (SMB client/server) will re-establish connections, restore any lost state and retry I/O beneath the application • If the server/network is unreachable after a configurable timeout (60 seconds default), the I/O will fail and the file handle is lost • Consistency guarantees for metadata-changing operations: create, delete, rename, file extension/truncation • Best-effort guarantees for most directory operations
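Once the timeout expires the I/O fails and the handle is gone, so an application that wants to keep going has to open a new handle. The sketch below is one possible way to do that, not something the slides prescribe: read_with_reopen, its parameters, and the retry policy are all assumptions made for illustration. The opener argument exists only so the behaviour can be exercised without a real file server.

```python
import time


def read_with_reopen(path, offset, length, attempts=3, delay_s=1.0, opener=open):
    # Try the read up to `attempts` times, opening a fresh handle each time,
    # because a handle that failed after the SMB timeout cannot be reused.
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(path, "rb") as f:
                f.seek(offset)
                return f.read(length)
        except OSError as exc:
            last_error = exc
            if attempt + 1 < attempts:
                time.sleep(delay_s)  # give the server or network time to return
    raise last_error
```

On a continuously available share this path should rarely run, since the SMB client and server retry beneath the application first.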
Continuously available file handles • What is the cost? • File server operates in write-through mode, resulting in added disk I/O • Need disk subsystems which correctly honor write-through • Additional disk I/O to track file handle state and metadata changes • What kind of apps will work well? • Apps that use long-lived file handles • Apps that do a lot of I/O-intensive processing • Apps that are NOT metadata intensive
Continuously available file handles
• Should applications explicitly care?
• Most do not, but if needed you can explicitly query for the “persistent handle” flag
    FILE_REMOTE_PROTOCOL_INFO protocolInfo;   // Receives the remote protocol information
    …
    if (!GetFileInformationByHandleEx( hFile,
                                       FileRemoteProtocolInfo,
                                       &protocolInfo,
                                       sizeof(protocolInfo) )) {
        return TRUE;   // Local filesystems do not support the query.
    }
    if (protocolInfo.Protocol == WNNC_NET_SMB &&
        (protocolInfo.Flags & REMOTE_PROTOCOL_INFO_FLAG_PERSISTENT_HANDLE) != 0) {
        return TRUE;   // File handle is continuously available
    }
    return FALSE;
Writing clustered client applications
• Have a clustered app which stores data on an SMB share?
• You need a way to tell the SMB server to abandon your file handles when your application instance is moved between servers
• How?
• Register an “AppInstance ID” for your process. All file handles opened by the process will be tagged with the ID
    RegisterAppInstance(
        __in HANDLE ProcessHandle,
        __in GUID* AppInstanceId,
        __in BOOL ChildrenInheritAppInstance
    );
• Drivers can attach an ECP (extra create parameter) with the AppInstance ID to a create request
Achieving high I/O throughput • Large I/O • Interested in network “throughput” (bytes / sec) • Fewer passes through the filesystem stack -> low CPU cost • Often sequential in nature -> fewer disk seeks • File copy, database logs • Small I/O • Interested in IO/sec (IOPS) • CPU intensive due to larger number of passes through the stack • Often tends to be random I/O
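The split between large and small I/O comes down to simple arithmetic: the same link needs far more operations per second to stay full when each operation is small. The sketch below works that out, assuming an idealized link with no protocol overhead; the function names and the 10 Gbps, 8 KiB and 512 KiB figures are illustrative choices, not numbers from the slides.

```python
def bytes_per_second(link_gbps):
    # Network speeds are quoted in bits per second (decimal giga).
    return link_gbps * 1e9 / 8


def iops_to_fill_link(link_gbps, io_size_bytes):
    # Operations per second needed to keep the link full at this I/O size,
    # ignoring SMB, TCP/IP and framing overhead (an idealization).
    return bytes_per_second(link_gbps) / io_size_bytes


for size in (8 * 1024, 512 * 1024):
    rate = iops_to_fill_link(10, size)
    print(f"{size // 1024:>4} KiB I/Os: {rate:,.0f} IOPS to fill a 10 Gbps link")
```

Under these assumptions, filling a 10 Gbps link takes roughly 150,000 IOPS at 8 KiB but only about 2,400 at 512 KiB, which is why small I/O is dominated by per-operation CPU cost.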
I/O throughput: To cache or not to cache • Caching helps with small, bursty I/O, but limits sustained throughput • Effectively makes ReadFile() and WriteFile() calls synchronous • Cannot achieve zero-copy • What are the available options? • Open the file with FILE_NO_INTERMEDIATE_BUFFERING to bypass filesystem caching on both the client and the server • Open the file cached, and then disable client-side caching • Issue IOCTL_LMR_DISABLE_LOCAL_BUFFERING on the file handle using the DeviceIoControl() API • Able to do fully pipelined asynchronous I/O on the client • Server may be able to do zero-copy • Larger memory resources typically available on the server can help absorb some disk I/O • Pipeline enough I/O to fill the network pipe, sustain deep disk queues
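How much I/O is "enough to fill the network pipe" can be estimated with the bandwidth-delay product: the bytes that must be in flight equal link bandwidth times round-trip latency. The slides do not give this formula; it is a standard rule of thumb applied here as an assumption, and the function name and example figures are hypothetical.

```python
import math


def outstanding_ios_needed(link_gbps, round_trip_ms, io_size_bytes):
    # Bandwidth-delay product: bytes that must be in flight to keep the link busy.
    bytes_in_flight = (link_gbps * 1e9 / 8) * (round_trip_ms / 1000)
    # Round up to whole requests; at least one request is always outstanding.
    return max(1, math.ceil(bytes_in_flight / io_size_bytes))


# Example: a 10 Gbps link with a 1 ms round trip and 64 KiB requests.
depth = outstanding_ios_needed(10, 1.0, 64 * 1024)
print(f"Keep about {depth} requests in flight")
```

The estimate covers only the network; the disk queues behind the server may need a deeper pipeline than this.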
How many IOPS can you push? • Is having a “fast enough” network and disk sufficient? • No. It is mostly about managing CPU utilization • NUMA awareness (non-uniform memory access) • Multiprocessor systems where the cost of accessing memory / network varies based on which physical CPU your code is running on • Application I/Os need to be managed “per CPU” • Dedicated threads for each CPU to do I/O • Dedicated buffers for each CPU • The OS and the SMB redirector manage the rest by distributing the I/O across all available network interfaces
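The "per CPU" advice can be sketched as one worker thread per logical CPU, each with its own preallocated buffer and its own file handle, reading interleaved chunks of a file. This is only an outline of the structure: read_striped and its parameters are hypothetical, and the sketch does not pin threads to CPUs or NUMA nodes, which a real implementation would do through the platform's affinity APIs.

```python
import os
import threading


def read_striped(path, io_size=64 * 1024, workers=None):
    # Read a whole file with one thread per CPU; return the total bytes read.
    workers = workers or os.cpu_count() or 1
    file_size = os.path.getsize(path)
    chunk_count = (file_size + io_size - 1) // io_size
    totals = [0] * workers  # one slot per worker, so no lock is needed

    def worker(index):
        buffer = bytearray(io_size)  # dedicated buffer, reused for every read
        view = memoryview(buffer)
        # buffering=0 returns a raw file object, so readinto() fills our
        # buffer directly instead of going through Python's own buffer.
        with open(path, "rb", buffering=0) as f:
            # Worker i handles chunks i, i + workers, i + 2 * workers, ...
            for chunk in range(index, chunk_count, workers):
                offset = chunk * io_size
                want = min(io_size, file_size - offset)
                f.seek(offset)
                got = 0
                while got < want:
                    n = f.readinto(view[got:want])
                    if not n:
                        break  # unexpected end of file
                    got += n
                totals[index] += got

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(totals)
```

Giving each worker its own handle and buffer avoids sharing a file position or memory between threads, which is the property the per-CPU guidance is after.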
Half a million IOPS over SMB Direct! • Results on Windows Server 2012 RTM using an EchoStreams server with 2 Intel E5-2620 CPUs at 2.00 GHz • Both client and server using three Mellanox ConnectX-3 network interfaces on PCIe Gen3 x8 slots • Data goes to 6 LSI SAS adapters and 48 Intel SSDs, attached directly to the EchoStreams server • Data on the table is based on a 60-second average. Performance Monitor data is an instant snapshot.
[Diagram: a file client (SMB 3.0) running SQLIO, connected through RDMA NICs to a file server (SMB 3.0) running Storage Spaces, with SAS HBAs and a RAID controller attached over SAS to JBODs of SSDs.]
Takeaways • File-based network storage is now faster and more reliable • All these technologies are now available in Windows Server 2012 • Hyper-V and SQL Server have been validated against SMB 3.0 file shares • Applications don’t need any significant changes! • Go build!