SESSION CODE: WSV316 Hyper-V and Storage: Maximizing Performance and Deployment Best Practices in Windows Server 2008 R2 Robert Larson Delivery Architect Microsoft Corporation David Lef Principal Systems Architect Microsoft Corporation
Session Objectives and Takeaways • Session Objectives: • Quick review of Windows Server 2008 R2 storage features for scaling performance • Learn current benchmark numbers for Windows Server 2008 R2 to help dispel common myths around iSCSI and Hyper-V performance • Understand Hyper-V storage configuration options and the benefits of different infrastructure models • Explore a real-life deployment of Hyper-V on an enterprise infrastructure in Microsoft IT’s server environment • Session Takeaways: • Understand the key factors in maximizing virtual machine density and hardware utilization • Apply Microsoft’s lab and IT production learnings to your own and your customers’ virtualization deployments
Agenda • Windows Server 2008 R2 Scalable Platform Enhancements • iSCSI Breakthrough Performance Results • Storage: Hyper-V Options and Best Practices • A Real World Deployment: Microsoft IT’s Server Environment • MSIT’s Hyper-V Deployment • MSIT’s “Scale Unit” Virtualization Infrastructure • Questions and Answers
Hyper-V™ and the Windows Server 2008 R2 Scalable Platform: Efficient scaling across multi-core CPUs • Compute: 256 processor core support, Core Parking, Advanced Power Management • Storage: 256 processor core IO scaling, Dynamic Memory allocation, iSCSI multi-core scaling (NUMA IO) • Networking: 256 processor core support, NUMA awareness, VMQ and virtualization performance • Virtualization: Hot Add Storage, Intel EPT memory management support, Live Migration (platform hardware: Intel® Xeon® Processors and Intel® Ethernet Adapters)
Extending the iSCSI Platform: iSCSI and Storage Enhancements in R2 (reliability, scalability, performance, and management, for both physical and virtual deployments) • iSCSI Multi-Core and NUMA IO • DPC redirection • Dynamic Load Balancing • Storage IO Monitoring • CRC Digest Offload • Support for 32 paths at boot time • iSCSI Quick Connect • Configuration Reporting • Automated deployment • iSCSI Server Core UI
Agenda • Windows Server 2008 R2 Scalable Platform Enhancements • iSCSI Breakthrough Performance Results • Storage: Hyper-V Options and Best Practices • A Real World Deployment: Microsoft IT’s Server Environment • MSIT’s Hyper-V Deployment • MSIT’s “Scale Unit” Virtualization Infrastructure • Questions and Answers
iSCSI Performance Architectures (diagram): three initiator models are compared side by side: (1) Native – the Microsoft software initiator (MSISCSI.SYS) over TCPIP.SYS and an NDIS miniport, with LSO/LRO/RSS offloads in the NIC hardware; (2) Stateful Offload – the software initiator over TCP Chimney offload; (3) Native + Transport Offload – an iSCSI HBA with an IHV miniport under the Storport port driver. The common stack above each model is the filesystem, volume manager, and class/disk drivers. The Native model was used in the performance tests. Note: TCP Chimney should be disabled for iSCSI traffic for best performance and interoperability
iSCSI Test Configuration – 2008 R2 (diagram): an Iometer management system on a 1 Gbps management network drives a server under test with a single 10GbE port running the Microsoft software initiator, connected through a Cisco* Nexus* 5020 10GbE switch to ten iSCSI soft targets (LUN 1 through LUN 10, 10 Gbps per target) • Performance factors • iSCSI initiator perf optimizations • Network stack optimizations • Receive Side Scaling (RSS) • Intel Xeon 5500 QPI and integrated memory controller • Intel® 82599: HW acceleration, multi-core scaling with RSS, MSI-X • Server • Windows Server 2008 R2 • Microsoft iSCSI Initiator • Intel® Xeon® Processor 5580, quad core, dual socket, 3.2 GHz, 24 GB DDR3, MTU 1500, Outstanding I/Os = 20 • Adapter • Intel® Ethernet Server Adapter X520 based on Intel® 82599 10GbE Controller. * Other names and brands may be claimed as the property of others.
Breakthrough Performance at 10GbE: Intel® Xeon® Processor 5580 Platform, Windows Server 2008 R2 and Intel® 82599 10GbE Adapter (Read/Write IOPs and Throughput Test; Read/Write IOPs and CPU Test) • 1,030,000 IOPs • Single port • 10GbE line rate • ~10k IOPs per point of CPU utilization • Performance for real-world apps • Future ready: performance scales • 552k IOPs at 4k represents: • 3,100 hard disk drives • 400x a demanding database workload • 1.7m Exchange mailboxes • 9x the transactions of large eTailers • Jumbo frames: >30% CPU decrease is common for larger IO sizes (jumbo frames not used here) Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Microsoft and Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing.
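To put these figures in context, the arithmetic below converts between IOPs and throughput. It is a minimal Python sketch; the 512-byte IO size assumed for the peak-IOPs run and the 8 KB line-rate example are illustrative assumptions, not published test parameters.

```python
# Rough IOPs <-> throughput arithmetic for the numbers on this slide (illustrative only).

def throughput_mb_s(iops: float, block_bytes: int) -> float:
    """MB/s generated by a given IOPs rate at a given IO size (decimal MB)."""
    return iops * block_bytes / 1_000_000

def iops_at_line_rate(line_rate_gbps: float, block_bytes: int) -> float:
    """IOPs needed to saturate a link of the given speed at a given IO size."""
    return (line_rate_gbps * 1_000_000_000 / 8) / block_bytes

# 1.03M IOPs at a 512-byte IO size (assumed size for the peak-IOPs run)
print(f"1.03M x 512B = {throughput_mb_s(1_030_000, 512):.0f} MB/s")
# IOPs implied by 10GbE line rate at an 8 KB IO size, where throughput becomes the limit
print(f"10GbE @ 8KB  = {iops_at_line_rate(10, 8_192):,.0f} IOPs")
```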
2008 R2 Hyper-V iSCSI Test Configuration (diagram): iSCSI Direct connection with the iSCSI initiator running in the VM, using Microsoft VMQ and Intel VMDq. An Iometer management system on a 1 Gbps management network drives a Hyper-V host with a single 10 Gbps port, connected through a Cisco* Nexus* 5020 10GbE switch to ten iSCSI soft lab targets (LUN 1 through LUN 10, 10 Gbps per target); the diagram distinguishes the physical connection from the virtual connection into the guest • Performance factors • iSCSI initiator perf optimizations • Microsoft network stack optimizations • Hyper-V scaling • Receive Side Scaling on host • Microsoft VMQ • Intel VMDq * Other names and brands may be claimed as the property of others.
Breakthrough Performance – Hyper-V iSCSI Performance with the Intel® 82599 10G NIC with VMDq, Intel® Xeon 5580 Platform, Windows Server 2008 R2 and R2 Hyper-V (Read/Write IOPs and Throughput Test) • 715k IOPs: 10GbE line rate • Intel VMDq and Microsoft VMQ accelerate iSCSI to the guest • Hyper-V achieves native throughput at 8k and above • Future ready: scales with new platforms, OS and Ethernet adapters • Near-native iSCSI performance Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Microsoft and Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing.
iSCSI Performance Test Conclusions • iSCSI protocol performance is only limited by the speed of the underlying bus and the vendor implementation • iSCSI is ready for mission-critical performance workloads • Use Receive Side Scaling to optimize iSCSI performance in the host • Use VMQ/VMDq to optimize iSCSI performance within the VM for best IO scaling • VLANs offer logical separation of LAN/SAN traffic and additional performance isolation • Most applications use moderate IO and throughput
Agenda • Windows Server 2008 R2 Scalable Platform Enhancements • iSCSI Breakthrough Performance Results • Storage: Hyper-V Options and Best Practices • A Real World Deployment: Microsoft IT’s Server Environment • MSIT’s Hyper-V Deployment • MSIT’s “Scale Unit” Virtualization Infrastructure • Questions and Answers
Hyper-V Storage Connect Options • Boot volumes = VHDs or CSV • Can be located on Fibre Channel or iSCSI LUNs connected from the parent, or on DAS disks local to the parent • Data volumes • VHD or CSV • Pass-through • Most applicable to Fibre Channel, but technically works for DAS and iSCSI • iSCSI Direct – only applicable when running the iSCSI initiator from the guest
iSCSI Direct Usage • The Microsoft iSCSI Software Initiator runs transparently within the VM • The VM operates with full control of the LUN • The LUN is not visible to the parent • The iSCSI initiator communicates with the storage array over the TCP stack • Supports advanced application requirements: • Application-specific replication and array-side replication utilities run transparently • LUNs can be hot added and hot removed without requiring a reboot of the VM (2003, 2008 and 2008 R2) • VSS hardware providers run transparently within the VM • Backup/recovery runs in the context of the VM • Enables the guest clustering scenario • Inherits Hyper-V networking performance enhancements • Works transparently with VMQ • Performance boost with jumbo frames (a login example is sketched below)
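As a concrete illustration of iSCSI Direct, the sketch below drives the Microsoft iSCSI initiator command-line tool (iscsicli.exe) from inside the guest to discover a portal and log in to a target. The portal address and target IQN are hypothetical placeholders, and a production setup would also configure CHAP security and MPIO/MCS; treat this as a sketch rather than a complete procedure.

```python
# Minimal sketch: bring up an iSCSI Direct connection from inside a guest VM by
# invoking the Microsoft iSCSI initiator CLI (iscsicli.exe).
import subprocess

PORTAL_IP = "10.0.0.50"                         # hypothetical array portal address
TARGET_IQN = "iqn.1991-05.com.example:target1"  # hypothetical target name

def run(cmd):
    print(">", " ".join(cmd))
    subprocess.run(cmd, check=True)

run(["iscsicli", "QAddTargetPortal", PORTAL_IP])   # discover targets on the portal
run(["iscsicli", "ListTargets"])                   # confirm the IQN is visible
run(["iscsicli", "QLoginTarget", TARGET_IQN])      # log in; the LUN appears as a local disk
```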
iSCSI Perf Best Practices with Hyper-V • Standard networking and iSCSI best practices apply • Use jumbo frames (supported with the Hyper-V switch and virtual NIC in Windows Server 2008 R2) with higher IO request sizes • Benefits are seen at 8K and above; the larger the IO size, the greater the benefit from jumbo frames, with 512K request sizes seeing the best results (a quick MTU check is sketched below) • Use dedicated NIC ports or VLANs for: • iSCSI traffic (server to SAN), multiple to scale • Client/server (LAN) traffic, multiple to scale • Cluster heartbeat (if using a cluster) • Hyper-V management • Unbind unneeded services from NICs carrying iSCSI traffic • File Sharing, DNS
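One way to confirm that jumbo frames actually pass end to end from the host (or guest) to the iSCSI target is a don't-fragment ping at the largest payload the MTU allows. A minimal sketch, assuming a 9000-byte MTU and a hypothetical target address:

```python
# Quick end-to-end jumbo frame check (a sketch; adjust the target address and MTU).
# With a 9000-byte MTU, the largest non-fragmented ICMP payload is
# 9000 - 20 (IP header) - 8 (ICMP header) = 8972 bytes.
import subprocess

ISCSI_TARGET_IP = "10.0.0.50"    # hypothetical SAN portal address
PAYLOAD = 9000 - 20 - 8          # 8972 bytes

result = subprocess.run(
    ["ping", "-f", "-l", str(PAYLOAD), "-n", "2", ISCSI_TARGET_IP],  # -f: don't fragment
    capture_output=True, text=True)
print(result.stdout)
print("Jumbo path OK" if result.returncode == 0
      else "Jumbo frames are not passing end to end on this path")
```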
Cluster Shared Volume Deployment Guidance CSV Requirements • All cluster nodes must use the same drive letter for the %SystemDrive% (example – C:\) • NT LAN Manager (NTLM) must be enabled on cluster nodes • SMB must be enabled for each network on each node that may be involved in CSV cluster communications • Client for Microsoft Networks • File and Printer Sharing for Microsoft Networks • The Hyper-V role must be installed on every cluster node
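The sketch below is a convenience spot check for two of these requirements on a single node: the %SystemDrive% letter and the services behind "Client for Microsoft Networks" (LanmanWorkstation) and "File and Printer Sharing" (LanmanServer). It does not verify NTLM or per-network SMB bindings, and it is not an official validation tool; run it on each node and compare the output.

```python
# Per-node spot check for two CSV prerequisites: matching %SystemDrive% letter and
# running SMB client/server services. Compare the output across all cluster nodes.
import os
import subprocess

print("SystemDrive:", os.environ.get("SystemDrive"))   # must match on every node, e.g. C:

for svc in ("LanmanWorkstation", "LanmanServer"):      # SMB client / SMB server services
    state = subprocess.run(["sc", "query", svc], capture_output=True, text=True).stdout
    running = "RUNNING" in state
    print(f"{svc}: {'running' if running else 'NOT running'}")
```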
Cluster Shared Volume Deployment Guidance Storage Deployment Factors • For Server workloads, the application/role will dictate CSV configuration • The virtual configuration should closely resemble the physical configuration • If separate LUNs are required for OS, Data, Logs in a physical server then this data should reside on separate CSVs in a VM • The CSVs used for OS, Data or Logs can be used to store this data for multiple VMs that have this requirement • Generally, avoid disk contention by using separate CSVs for OS and application data • This can be mitigated by having a large number of spindles backing the CSV LUN, something typically seen in high-end SAN storage
Cluster Shared Volume Deployment Guidance: General Guidance (diagram) • Three VMs (VM 1, VM 2, VM 3), each with separate Boot/OS, Data, and Log VHDs • Without CSV, each VM would have required 3 separate LUNs; with CSV, the nine VHDs (3 VMs x 3 volumes) are placed on just 3 shared CSVs, one each for Boot/OS, Data, and Log
Cluster Shared Volume Deployment Guidance With 16 cluster nodes sending I/O to a single LUN… • How many IOPS can your storage array handle?
Cluster Shared Volume Deployment Guidance Work with your storage provider • How many CSVs per cluster node? • As many as you need - there is no limit • How many VMs should I deploy on a CSV? • Dependent on rate/volume of storage I/O • What backup applications support CSV? • Microsoft System Center Data Protection Manager 2010 • Symantec Backup Exec 2010 • NetApp SnapManager for Hyper-V • Recently released or releasing soon: • HP Data Protector, EMC NetWorker
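The sizing questions on the last two slides ("how many IOPS can the array handle?" and "how many VMs per CSV?") come down to simple arithmetic once you have measured numbers. The sketch below uses generic rule-of-thumb figures (per-spindle IOPS, RAID write penalty, per-VM IO) that are assumptions for illustration; substitute values measured in your own environment and confirmed with your storage vendor.

```python
# Back-of-the-envelope CSV sizing. All constants below are illustrative assumptions.
SPINDLES_BEHIND_LUN = 48       # disks backing the CSV LUN
IOPS_PER_SPINDLE = 180         # rough rule of thumb for a 15k RPM FC/SAS disk
RAID_WRITE_PENALTY = 2         # RAID-10; RAID-5 is typically 4
WRITE_RATIO = 0.3              # 30% writes in the VM workload
PER_VM_IOPS = 50               # measured average IO per VM
HEADROOM = 0.7                 # keep ~30% in reserve for bursts and rebuilds

raw_iops = SPINDLES_BEHIND_LUN * IOPS_PER_SPINDLE
# Host-visible IOPS once the write penalty is applied to the write fraction
effective_iops = raw_iops / ((1 - WRITE_RATIO) + WRITE_RATIO * RAID_WRITE_PENALTY)
vms_per_csv = int(effective_iops * HEADROOM / PER_VM_IOPS)

print(f"Raw backend IOPS:        {raw_iops}")
print(f"Effective frontend IOPS: {effective_iops:.0f}")
print(f"VMs per CSV (approx):    {vms_per_csv}")
```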
CSV and Hyper-V Backup: Protect from the parent or within the guest VM? • Answer – Both! • Backup from Hyper-V Parent Partition • Protect the virtual machine and associated VHDs • Includes non-Windows servers • Backup from Guest VM • Protect application data • SQL database • Exchange • SharePoint • Files • Same as protecting a physical server
DEMO: iSCSI and CSV
Redundant Fibre Channel Infrastructure with Microsoft MPIO • Increases uptime of Windows Server by providing multiple paths to storage • Increased server bandwidth over multiple ports • Automatic failure detection and failover • Dynamic load balancing • Works with the inbox Microsoft DSM or 3rd-party DSMs (PowerPath, OnTAP DSM, etc.) (diagram: clients on the network reaching Windows Server hosts that connect through redundant Fibre Channel switches to storage)
MPIO and MCS with Hyper-V • Microsoft MPIO and MCS (Multiple Connections per Session for iSCSI) are natively included in Windows Server and work transparently with Hyper-V • Especially important in virtualized environments to reduce single points of failure • Load balancing and failover using redundant HBAs, NICs, switches and fabric infrastructure • Aggregates bandwidth for maximum performance • MPIO supported with Fibre Channel, iSCSI, Shared SAS • Two options for multi-pathing with iSCSI • Multiple Connections per Session (MCS) • Microsoft MPIO (Multipath I/O) • Protects against loss of a data path during firmware upgrades on the storage controller • When using iSCSI Direct, MPIO and MCS work transparently with VMQ (a path check is sketched below)
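To confirm that MPIO has claimed the LUNs and to see the active load-balance policy and path count, the inbox mpclaim.exe utility can be queried. A minimal sketch; disk numbers are environment specific and output formats vary by Windows version.

```python
# Sketch: query MPIO status via mpclaim.exe (available once the Multipath I/O
# feature is installed on the host).
import subprocess

def mpclaim(*args):
    out = subprocess.run(["mpclaim", *args], capture_output=True, text=True).stdout
    print(out)

mpclaim("-s", "-d")        # summary of all MPIO-claimed disks and their LB policies
mpclaim("-s", "-d", "0")   # path-level detail (path count, state) for MPIO disk 0
```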
SAN Performance Considerations • Always use active/active multipathing load balance policies (round robin, least queue depth) rather than failover-only mode • Follow vendor guidelines for timer settings • Queue depth, PDORemovePeriod, etc., as these settings have a direct impact on the ability to deliver maximum IO to VMs and on controlling failover times • Many array vendors include host utilities that automatically adjust settings to optimize for their array • Example: the Dell "HIT Kit" host integration toolkit • Pay attention to spindle count for the workload • SSDs (Solid State Drives) change the game on this • For NICs used for iSCSI in HBA mode with offload (iSOE), turn off TCP Chimney (see the sketch below) • http://support.microsoft.com/kb/951037
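The TCP Chimney and RSS settings referenced above are global netsh parameters on Windows Server 2008 R2. The sketch below shows the current state and applies the KB 951037 guidance for iSOE NICs; it assumes an elevated prompt and is illustrative rather than a universal tuning recommendation.

```python
# Sketch: inspect and set global TCP offload settings on the Hyper-V host.
import subprocess

def netsh(*args):
    print(subprocess.run(["netsh", "int", "tcp", *args],
                         capture_output=True, text=True).stdout)

netsh("show", "global")                      # current Chimney Offload and RSS state
netsh("set", "global", "chimney=disabled")   # per KB 951037 guidance for iSOE NICs
netsh("set", "global", "rss=enabled")        # keep RSS on for host-side iSCSI scaling
```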
Agenda • Windows Server 2008 R2 Scalable Platform Enhancements • iSCSI Breakthrough Performance Results • Storage: Hyper-V Options and Best Practices • A Real World Deployment: Microsoft IT’s Server Environment • MSIT’s Hyper-V Deployment • MSIT’s “Scale Unit” Virtualization Infrastructure • Questions and Answers
MSIT Enterprise Virtualization - History • Determined that 30% of MSIT servers could be hosted as VMs on Virtual Server 2005 • Proof of concept started in 2004, with success leading to the “Virtual Server Utility” service in 2005 • VMs offered as an alternative to smaller physical servers • Hyper-V R1 adopted as commodity at RC, targeting up to 80% of new MSIT workloads • Achieved 60% VM adoption since RTM • Hyper-V R2 deployment began at M3, with nearly 50% of new capacity deployed on virtual machines within the six months after RTM • Host Failover Clustering with CSV used throughout the R2 beta • At least 80% VM adoption is possible with R2 and new virtualization hardware platform • “Physical by exception” and “one in, one out” policies • Native consolidation is encouraged as first step where possible, but we have virtualized a very wide cross section of workloads successfully • This includes SQL, Exchange, SharePoint, and core Windows infrastructure
MSIT Enterprise Virtualization - Current • VMs are approximately 40% of our total server population • About 1500 physical servers host more than 8000 VMs • Highest VM concentrations are in enterprise datacenters • Previous consolidation efforts have limited regional sprawl • Where field services and applications are needed, we deploy a “Virtual Branch Office Server” (VBOS) or a scaled-down virtualization infrastructure • Goal of the VM hosting service is to equal or better the service level and value of traditional physical servers • Availability averaged 99.95%, even prior to HA VM configurations • VM hosting is 50% of the cost of a comparable physical server • Targeting 50% virtualized in 2010 and greater than 80% in the 2011-2012 timeframe • On-boarding process steers appropriate candidates into VMs for all new growth and hardware refresh/migration requirements
MSIT’s Big Problem: “Discrete Unit” Proliferation “Discrete Unit” Definition • Capacity that is purchased, provisioned, and lifecycle-managed independently in the data centers • Compute – Isolated rack-mount servers, usually deployed for a single application or customer • Storage – DAS and small-to-midrange SAN, per-server or per-cluster and determined by short-term forecasting • Network – Resources averaged to the number of expected systems in a given location MSIT’s Challenges • Over-provisioning, under-utilization, and stranded capacity were the norm • Data center space and power constraints were increasing • Time and effort required to deploy was becoming unacceptable • Cost of basic server support services grew over time • Hardware lifecycle management was burdensome
MSIT’s Solution - The “Scale Unit” Virtualization Platform Definition • Compute, Storage, and Network resources deployed in bundles to allow both extensibility and reuse/reallocation - Dynamic Infrastructure • Compute Scale Unit – One rack of blade servers, enclosure switching elements, and cabling • Storage Scale Unit – Enterprise-class storage array with thousands of disks • Network Scale Unit – Redundant Ethernet and FC fabrics at the aggregation and distribution layers • Additional infrastructure resources – VLAN framework, IP address ranges, etc. Key Tenets • Utilize the same compute, storage, and networking elements across MSIT data center environments and customer sets • Centralized procurement and provisioning of new capacity supports customer deployment requirements, with a minimum of stranded and unused resources • Suitable for both net-new deployment and platform refresh
Scale Unit – Basic Design Elements • Comprehensive infrastructure, procured and deployed in fairly large capacity chunks • Compute – 64 blade servers per rack • Network – Highly aggregated and resilient • Per-port costs reduced by a factor of 10 • Storage – Enterprise class SAN arrays • Thin-provisioned storage for efficiency • High availability and fault tolerance part of the design • A level of redundancy at the blades and throughout the network and storage fabrics, coupled with logical resiliency for inherent high availability at a nominal additional cost • Large network and storage domains cover multiple compute racks - Highly-aggregated, but allowing mobility and enhanced flexibility • “Green IT” – Scale Unit V1 compared to an equivalent number of 2U discrete servers • 33% of the space, 55% of the power/cooling, 90% fewer cables • VM consolidation potential greater than 50x
MSIT Scale Unit – Technology Enablers • Blade Servers and Enclosures • Replaceable/interchangeable elements • Multiple system and cross-enclosure power management • OS Efficiency and Virtualization • Boot from SAN – Near stateless hosts • Windows Server 2008 R2 Server Core • Hyper-V host clustering and highly-available virtual machines • Enterprise Storage • Thin provisioning • Virtual pooling of storage resources • Converged Networking • 10Gb Ethernet – High-speed iSCSI and FCoE (Fibre Channel over Ethernet) • Virtualized networks – Trunks with multiple VLANs (802.1q), link aggregation with current (802.3ad, vPC) and future (TRILL, DCB, L2 overlay) technologies
Compute Scale Unit – Deployment Metrics Example (metrics table not reproduced here) * Equivalent number of traditional 2U rack-mounted servers ** Based upon an average of 4 GB per VM
Compute Scale Unit – Logical Layout Overlay (diagram): 16-node Hyper-V CSV cluster
MSIT R2 Production Clusters • Server virtualization 16 node clusters • 100 to 400 VMs per cluster (~8 to 32 per host) • Client virtualization (VDI) 8 node clusters • 250 to 300 VMs per cluster (~30-40 per host) • Standard LUNs are configured as CSVs, hosting multiple VMs per volume • 500 GB in V1 and up to 13 TB in V2 designs • Limited single VM CSVs or pass-through disks • Dedicated cluster VLAN on all hosts • Live Migration, CSV, and Cluster network traffic • IPsec exempt, jumbo frame capable
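A quick way to sanity-check these densities is to look at per-host VM counts with all nodes up and with one node failed, since the remaining hosts must absorb the displaced VMs. A small sketch using the upper end of the ranges quoted above (the constants are the slide's figures, rounded; adjust for your cluster).

```python
# Per-host density with and without one failed node, for a 16-node server cluster.
NODES = 16
VMS = 400                      # upper end of the server-virtualization cluster range

normal = VMS / NODES
one_node_down = VMS / (NODES - 1)
print(f"VMs per host (all nodes up):  {normal:.0f}")
print(f"VMs per host (one node down): {one_node_down:.1f}")  # headroom each host must absorb
```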
Storage Scale Unit - History and New Design Principles History • Shift from mid-range to high-end enterprise storage • 250 mid-range arrays reduced to 20 enterprise-class arrays • Lower administrative overhead and more flexibility • Operational Issues • “Islands of capacity” led to migration challenges • Volume growth issues • Performance Issues • Shared write cache constraints • Small disk pools New Design Principles • High-end, enterprise storage • Larger pools of array resources (disk, front-end ports, etc.) • Enabling of more dynamic storage • Enabling of data replication in the platform (local and remote) • Strong reporting infrastructure to facilitate high resource usage
Storage Scale Unit - Deployment Details Flexible Capacity Allotments • 400 disks per pool, with approximately four pools per array • Port groups of 16 x 8 Gb FC ports • Groups of ports are mapped to pools of disks as needed Thin Provisioning • R2 Hyper-V data LUNs presented as CSVs (Cluster Shared Volumes) as needed • Common volume sizes for dedicated storage (Hyper-V pass-through disks or dedicated blades) • 100 GB, 500 GB, 1000 GB, 2000 GB • Reduces “LUN grow” operations by 90% • Host-side partitioning dictates actual storage consumption • Partition defined at size requested • Volumes take up zero space until the user writes data • Application requirement: 150 GB • Storage delivered: 500 GB (potential) • OS boot partition is always limited to 50 GB (an oversubscription example is sketched below)
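The thin-provisioning behavior described above (a 500 GB volume delivered against a 150 GB request, consuming space only as data is written) can be summarized as an oversubscription ratio. The sketch below uses made-up volume sizes purely for illustration.

```python
# Thin-provisioning illustration: presented capacity vs. space actually consumed.
volumes = [
    # (presented GB, written GB) -- hypothetical values
    (500, 150),    # the "application asked for 150 GB, delivered 500 GB" case
    (1000, 220),
    (100, 60),
    (2000, 400),
]

presented = sum(p for p, _ in volumes)
consumed = sum(w for _, w in volumes)
print(f"Presented capacity: {presented} GB")
print(f"Actually consumed:  {consumed} GB")
print(f"Oversubscription:   {presented / consumed:.1f}x")
print(f"Space deferred:     {presented - consumed} GB until data is written")
```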
Storage Scale Unit - Key Metrics Total Storage Managed • 10 PB • TB per Admin: 950TB • Storage Utilization: 71% Total Servers Managed • Number of apps: ~ 2500 • Total server count: ~ 16,000 All numbers reported monthly to management
Tiers of Capacity: Determining resource assignment and service levels • Endless options for server/storage • Where do servers land? • Define tiers and stick with them!
Capacity Management Multiple dimensions of capacity management • Pools of shared and highly-aggregated resources • Shared front-end ports and disk groups • Shared arrays and fabric switching infrastructures • When do we stop adding servers, and what are the safe thresholds? (see the sketch below)
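The "when do we stop adding servers?" question can be framed as a utilization budget against a shared resource, for example a front-end port group. The sketch below uses assumed figures (safe ceiling, per-host peak) that are illustrative only; real thresholds come from measured peaks and vendor guidance.

```python
# Capacity threshold sketch for a shared front-end port group. All figures are assumptions.
PORTS = 16
PORT_GBPS = 8                   # 8 Gb FC ports, as in the port groups described above
SAFE_UTILIZATION = 0.7          # stop adding load beyond ~70% of fabric capacity
PEAK_GBPS_PER_HOST = 1.5        # measured/estimated peak bandwidth per attached host

capacity_gbps = PORTS * PORT_GBPS
budget_gbps = capacity_gbps * SAFE_UTILIZATION
max_hosts = int(budget_gbps / PEAK_GBPS_PER_HOST)
print(f"Port group capacity: {capacity_gbps} Gb/s, safe budget: {budget_gbps:.0f} Gb/s")
print(f"Hosts supportable at this profile: {max_hosts}")
```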
Scale Unit - Future Storage Opportunities Storage Impacts from Windows and Application Changes • SQL Compression • 30-80% in initial test cases (YMMV) • OS and SQL Encryption • Data security controlled at the OS and application layers Scale Unit v3+ • Network • Fully converged networking (FCoE and iSCSI) • 40Gb/100Gb Ethernet • Compute • Dynamic server identity provisioning – Boot from SAN or network • Further uncoupling of the OS instance from the hardware • Storage • Small Form Factor (SFF) drives and enterprise-class SSDs • End-to-end encryption • De-duplication
Agenda • Windows Server 2008 R2 Scalable Platform Enhancements • iSCSI Breakthrough Performance Results • Storage: Hyper-V Options and Best Practices • A Real World Deployment: Microsoft IT’s Server Environment • MSIT’s Hyper-V Deployment • MSIT’s “Scale Unit” Virtualization Infrastructure • Questions and Answers
Related Content • Breakout Sessions • VIR207 – Advanced Storage Infrastructure Best Practices to Enable Ultimate Hyper-V Scalability • VIR303 – Disaster Recovery by Stretching Hyper-V Clusters Across Sites • VIR310 – Networking and Windows Server 2008 R2: Deployment Considerations • VIR312 – Realizing a Dynamic Datacenter Environment with Windows Server 2008 R2 Hyper-V and Partner Solutions • VIR319 – Dynamic Infrastructure Toolkit for Microsoft System Center Deployment: Architecture and Scenario Walkthrough • WSV313 – Failover Clustering Deployment Success • WSV315 – Guest vs. Host Clustering: What, When, and Why • Hands-on Labs • VIR06-HOL – Implementing High Availability and Live Migration with Windows Server 2008 R2 Hyper-V • Product Demo Stations • TLC-56 – Windows Server 2008 R2 Failover Clustering (TLC Red)
Resources • Sessions On-Demand & Community: www.microsoft.com/teched • Microsoft Certification & Training Resources: www.microsoft.com/learning • Resources for IT Professionals: http://microsoft.com/technet • Resources for Developers: http://microsoft.com/msdn
Complete an evaluation on CommNet and enter to win!