Microsoft in the Enterprise • Windows Scalability: Technology, Challenges and Limitations • Andreas Kampert
Agenda • Scale-up and Scale-out • Scale-Up • CPU, Memory, Disks • What does this mean for Windows applications • Scale-Out • Clones • Partitioning • Scale-Up and Scale-Out together • Application example Siebel Enterprise Application
Scalable Systems • Scale Up: grow by adding components to a single system • Scale Out: grow by adding more systems
Everything starts with understanding your computer • [Diagram: CPU 0, CPU 1, CPU 2, CPU 3, main memory, main memory controller, system bus, PCI bus 1, PCI bus 2, controllers]
The Memory Hierarchy • Locality REALLY matters • CPU at 2 GHz, RAM at 5 MHz: RAM is no longer random access • Organizing the code gives 3x (or more) • Organizing the data gives 3x (or more) • Latency by level (clocks): Registers 1, L1 2, L2 10, L3 30, Near RAM 100, Far RAM 300
32-bit Windows Virtual Address Space • 00000000 to 7FFFFFFF: application code, global variables, .DLL code (unique per process, accessible in user or kernel mode) • The 3 GB extension of this range requires a Boot.ini setting plus large_address_aware • 80000000: Exec, Kernel, HAL, drivers, per-thread kernel mode stacks, Win32K.Sys, file system cache, paged pool, system PTEs, non-paged pool… (system wide, accessible only in kernel mode) • C0000000: process page tables, hyperspace (per process, accessible only in kernel mode) • FFFFFFFF: top of the address space
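The user/system split on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `user_space_limit` and `is_user_address` are hypothetical, and the model ignores details such as guard regions at the boundary. It shows that the 3 GB extension only applies when both the boot setting and the large-address-aware flag are present.

```python
# Minimal model of the 32-bit user/system address split (illustrative only).
USER_LIMIT_DEFAULT = 0x7FFFFFFF  # top of user space with the default 2 GB split
USER_LIMIT_3GB = 0xBFFFFFFF      # top of user space with the 3 GB extension


def user_space_limit(boot_3gb: bool, large_address_aware: bool) -> int:
    """Highest user-mode address for a process (hypothetical helper).

    The extension needs BOTH the Boot.ini setting and an executable
    marked large_address_aware; otherwise the 2 GB split applies.
    """
    if boot_3gb and large_address_aware:
        return USER_LIMIT_3GB
    return USER_LIMIT_DEFAULT


def is_user_address(addr: int, boot_3gb: bool = False,
                    large_address_aware: bool = False) -> bool:
    """True if addr falls into the per-process user range."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return addr <= user_space_limit(boot_3gb, large_address_aware)


if __name__ == "__main__":
    for addr in (0x00400000, 0x80000000, 0xC0000000):
        print(hex(addr), is_user_address(addr),
              is_user_address(addr, boot_3gb=True, large_address_aware=True))
```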
Memory Mapping • [Diagram: virtual memory of Process 1 and Process 2, each with a user address space and a system address space, mapped to physical memory and pagefile(s)]
Physical Address Extension for IA32 • PAE is required when using more than 4 GB of physical memory • Makes additional memory available to the OS • Has no impact on applications • Applications require AWE (see later) • Enabling PAE in Boot.ini: [boot loader] timeout=30 default=multi(0)disk(0)rdisk(0)partition(1)\WINNT [operating systems] multi(0)disk(0)rdisk(0)partition(1)\WINNT="Windows PAE" /PAE
Address Windowing Extensions (AWE) APIs • Allow applications to bypass the 4 GB limit • Advantages of the AWE APIs • Small API set utilizing a windowing technique • VirtualAlloc() with the MEM_PHYSICAL flag • AllocateUserPhysicalPages() • MapUserPhysicalPages() • FreeUserPhysicalPages()
AWE Mechanism • [Diagram: the application's 2 GB (or 3 GB) virtual address space contains an AWE region allocated using VirtualAlloc(); physical memory is allocated with AllocateUserPhysicalPages() and mapped into that region with MapUserPhysicalPages()]
Hot-Add Memory • Requires hardware and BIOS support • SRAT • ACPI 2.0 • Reporting memory at POST
Thread Scheduling • Priority driven, preemptive • No attempt to share processors “fairly” among processes, only among threads • Event-driven; no guaranteed execution period before preemption • Time-sliced, round-robin within a priority level • Simultaneous thread execution on MP systems • Any processor can interrupt another processor to schedule a thread • Tries to keep threads on same CPU (“ideal processor”) • [Diagram: priority levels 31 to 16, 15 to 1, 0, i]
Affinity • Threads can run on any CPU, unless affinity specifies otherwise • Affinity is specified by a bit mask • Each bit corresponds to a CPU number • Thread affinity mask must be a subset of the process affinity mask, which in turn must be a subset of the active processor mask • “Hard Affinity” can lead to threads getting less CPU time than they normally would • More applicable to large MP systems running dedicated server apps
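The subset rule for affinity masks is plain bit arithmetic, which a short Python sketch can make concrete. The function names here are hypothetical and do not correspond to any Windows API; the code only checks masks, it does not set affinity on a real thread.

```python
# Illustrative bit-mask checks for the affinity rules on this slide.

def cpus_in_mask(mask: int) -> list:
    """Return the CPU numbers whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def is_subset(inner: int, outer: int) -> bool:
    """True if every CPU in `inner` is also present in `outer`."""
    return inner & ~outer == 0


def valid_affinity(thread_mask: int, process_mask: int,
                   active_mask: int) -> bool:
    """Thread mask within process mask, process mask within active CPUs.

    An empty thread mask is rejected because the thread could not run.
    """
    return (thread_mask != 0
            and is_subset(thread_mask, process_mask)
            and is_subset(process_mask, active_mask))


if __name__ == "__main__":
    active = 0b1111        # assumed 4-CPU machine
    process = 0b0110       # process limited to CPU 1 and CPU 2
    print(cpus_in_mask(process))
    print(valid_affinity(0b0010, process, active))  # CPU 1 only
    print(valid_affinity(0b1000, process, active))  # CPU 3 is outside
```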
Disks Are Becoming Tapes • Capacity: 150 GB, 300 GB, 2 TB • Bandwidth: 40 MBps -> 150 MBps • Read time: 2 hours sequential, 2 days random -> 4 hours sequential, 12 days random • [Diagram: 150 GB disk at 150 IO/s and 40 MBps; 1 TB disk at 200 IO/s and 150 MBps]
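The read times on this slide follow from dividing capacity by throughput. The Python sketch below shows that arithmetic under idealized assumptions: peak bandwidth is sustained for the whole scan, and each random I/O transfers 8 KB, a size the slide does not state. With those inputs the results come out lower than the slide's figures, which presumably allow for sustained rates below peak.

```python
# Back-of-the-envelope full-disk read times (idealized, illustrative).

def sequential_read_hours(capacity_bytes: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Hours needed to stream the whole disk at the given bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s / 3600


def random_read_days(capacity_bytes: float, ios_per_s: float,
                     io_size_bytes: int = 8 * 1024) -> float:
    """Days needed to read the whole disk with small random I/Os.

    io_size_bytes is an assumption, not a value from the slide.
    """
    total_ios = capacity_bytes / io_size_bytes
    return total_ios / ios_per_s / 86400


if __name__ == "__main__":
    GB, MB = 1e9, 1e6
    print(f"sequential: {sequential_read_hours(150 * GB, 40 * MB):.2f} h")
    print(f"random:     {random_read_days(150 * GB, 150):.2f} days")
```

The gap between the two results is the point of the slide: the random scan is slower by more than an order of magnitude, so large disks are best treated as sequential devices.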
Amdahl’s Balanced System Laws • 1 mips needs 1 MB ram and needs 20 IO/s • At 1 billion instructions per second: need 4 GB/cpu, need 50 disks/cpu! • 64 cpus … 3,000 disks • [Diagram: 1 bips cpu, 4 GB RAM, 50 disks, 10,000 IOps, 75 TB]
Exchange Server Memory Management • Exchange Server does not use memory beyond 4 GB efficiently • Exchange Server 2003 requires /3GB with more than 1 GB RAM • Exchange Server 2003 gains no advantage from PAE • AWE is not used by Exchange Server • Performance counters: MSExchangeIS\VM Largest Block Size, MSExchangeIS\VM Total 16MB Free Blocks, MSExchangeIS\VM Total Free Blocks, MSExchangeIS\VM Total Large Free Block Bytes
Exchange Server Processors • Exchange Server Mailbox Server scales well up to 8 processors • With more than 8 processors, hardware partitioning is usually recommended • With more than 8 processors, use the affinity mask to restrict Exchange Server 2003 to 8 processors • Possibly additional processors for virus scanners, etc.
SQL Server Memory Management • SQL Server 32-bit supports up to 64 GB • Usage of more than 4 GB requires fixed memory • Dynamic memory management is no longer possible • Access time not linear! • Use 64-bit SQL Server • Same issues with other DBMS • Settings by memory size: 4 GB: PAE N, 3GB o, AWE o • 16 GB: PAE Y, 3GB o, AWE Y • 64 GB: PAE Y, 3GB N, AWE Y
Understand what the CPU does for SQL Server • [Diagram: CPU 0 to CPU n, each running one Win NT thread (Thread 0 to Thread n) with its own fibers and UMS work queue; UMS schedules fibers; reads are issued by fibers to the NT I/O completion port; the network handler thread is notified when I/O completes; NT queues; fibers write directly to clients over the network]
Terminal Server: Historic Issues with Scalability • 32-bit systems • Servers often run out of kernel virtual memory rather than CPU • All applications must share the same 2 GB kernel address space • Adding RAM does not help • Most customers run 1-processor and 2-processor servers • Administrators must deploy and manage many servers • Reduces effectiveness of server consolidation • IA64 systems • Cannot run 32-bit applications without the high overhead of WOW emulation • Incremental users/server outweighed by cost
x64 Editions “First mover” Workloads: Preliminary Testing • Key value • Core OS functionality & performance benefits (64-bit) • Runs most existing 32-bit apps with increased performance • Provides evolutionary path to 64-bit applications • Single code-base based on WS03 SP1 • AMD Opteron/Athlon 64 & Intel Xeon EM64T supported with one product • Compatibility • WS03 SP1 level compatibility • Application kernel mode code and drivers must be 64-bit
Clones: Availability + Scalability • Some applications are • Read-mostly • Low consistency requirements • Modest storage requirement (less than 1 TB) • Examples: • HTML web servers • LDAP servers • Replicate app at all nodes (clones) • Load Balance: • Spray & Sieve: requests across nodes • Route: requests across nodes • Grow: adding clones • Fault tolerance: stop sending to that clone
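The grow and fault-tolerance bullets above can be illustrated with a small round-robin dispatcher in Python. This is a sketch under stated assumptions, not how any particular Windows load balancer works: the class `CloneBalancer` and the clone names are hypothetical, and health checking is reduced to an explicit `mark_failed` call.

```python
# Minimal round-robin router over identical clones (illustrative only).

class CloneBalancer:
    """Spreads requests over clones; clones can be added or dropped."""

    def __init__(self, clones):
        self._clones = list(clones)
        self._next = 0  # running request counter

    def add_clone(self, name: str) -> None:
        """Grow: a new clone starts receiving requests immediately."""
        self._clones.append(name)

    def mark_failed(self, name: str) -> None:
        """Fault tolerance: stop sending requests to that clone."""
        self._clones.remove(name)

    def route(self) -> str:
        """Pick the clone for the next request in round-robin order."""
        if not self._clones:
            raise RuntimeError("no clones available")
        clone = self._clones[self._next % len(self._clones)]
        self._next += 1
        return clone


if __name__ == "__main__":
    balancer = CloneBalancer(["web1", "web2"])
    print([balancer.route() for _ in range(4)])
    balancer.mark_failed("web2")
    print([balancer.route() for _ in range(2)])
```

Because every clone holds the same read-mostly data, any clone can answer any request, which is why dropping or adding a node needs no data movement.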
Partitions For Scalability • Clones are not appropriate for some apps. • Stateful apps do not replicate well • High update rates do not replicate well • Examples • Email • Databases • Read/write file server… • Cache managers • Chat • Partition state among servers • Partitioning: • must be transparent to client. • split & merge partitions online
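Partitioning state among servers can be sketched as a lookup that maps each key, for example a mailbox name, to exactly one server. The Python below is a minimal illustration with hypothetical names (`PartitionMap`, `mbx1`, and so on). It uses simple hash-modulo placement, which keeps the mapping transparent to the client but does not support the online split and merge mentioned on the slide; that would need a directory or a consistent-hashing scheme.

```python
# Hash-based partition lookup (illustrative; no online split/merge).
import hashlib


class PartitionMap:
    """Maps a key to the single server that owns its state."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)

    def server_for(self, key: str) -> str:
        """Return the owning server; stable across calls and processes."""
        # hashlib is used instead of hash() because Python randomizes
        # string hashes per process.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._servers)
        return self._servers[index]


if __name__ == "__main__":
    partitions = PartitionMap(["mbx1", "mbx2", "mbx3"])
    for user in ("alice", "bob", "carol"):
        print(user, "->", partitions.server_for(user))
```

The client only ever calls `server_for`, so it never needs to know how many partitions exist.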
Siebel 7 Environment • [Diagram: clients (Web Client, Wireless Client, Mobile Web Client, Handheld Client, Dedicated Web Client), Server Manager GUI, Server Manager Cmd Line Interface, Wireless Gateway Server, Mobile DB, SQL CE, Web Server with Siebel Web Server Extension, Siebel Enterprise Server, Siebel Gateway Server (Connection Broker, Name Server), three Siebel Servers, EAI & Data Loading, Siebel Database, Siebel File System]
© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
Memory Latency And CPU Caches • CPUs are much faster than memory, gap continues to grow (100 MHz -> 2+ GHz vs. 80 ns -> 50 ns) • Caches needed to hide memory latency • Cache effectiveness depends on locality of memory references (e.g. cached data & code must be reused >9x before being pushed out) • “cacheline” = 32, 64, ... bytes (unit of replacement & collision)
Effect Of Cache Hit Ratio On Performance • Performance = 1 / ( (FastTime * HitRatio) + (SlowTime * (1 - HitRatio)) ) • Fast: 7 cycles for L2 hit • Slow: 150 cycles for RAM access • Actual effect depends on memory accesses per instruction
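The formula on this slide is easy to evaluate for a few hit ratios. The Python sketch below uses the slide's 7-cycle and 150-cycle figures as defaults; the function names are hypothetical, and the result covers memory access time only, since the slide notes that the actual effect depends on memory accesses per instruction.

```python
# Evaluate the slide's cache hit ratio formula (illustrative).

def relative_performance(hit_ratio: float, fast_cycles: float = 7.0,
                         slow_cycles: float = 150.0) -> float:
    """1 / average access time in cycles for the given hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    average = fast_cycles * hit_ratio + slow_cycles * (1.0 - hit_ratio)
    return 1.0 / average


def slowdown_vs_perfect(hit_ratio: float, fast_cycles: float = 7.0,
                        slow_cycles: float = 150.0) -> float:
    """How many times slower than a cache that always hits."""
    perfect = relative_performance(1.0, fast_cycles, slow_cycles)
    actual = relative_performance(hit_ratio, fast_cycles, slow_cycles)
    return perfect / actual


if __name__ == "__main__":
    for ratio in (1.0, 0.99, 0.95, 0.90, 0.50):
        print(f"hit ratio {ratio:.2f}: "
              f"{slowdown_vs_perfect(ratio):.2f}x slower than all hits")
```

Even a small drop in hit ratio matters because the miss penalty is more than twenty times the hit cost.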
Disks Are Becoming Tapes: Consequences • Use most disk capacity for archiving • Copy on Write (COW) file system in Windows Server 2003 • RAID10 saves arms, costs space (OK!). • Backup to disk • Pretend it is a 100 GB disk + 1 TB disk • Keep hot 10% of data on fastest part of disk • Keep cold 90% on colder part of disk • Organize computations to read/write disks sequentially in large blocks
12,000 User Benchmark on HP/Windows/SQL64 • Concurrent Users • Server Component Throughput • SQL64 on a 4x 1.5 GHz Itanium 2 HP Integrity used 47% CPU and 13.3 GB memory, proving unprecedented price/performance for Siebel
12,000 User Benchmark on HP/Windows/SQL64 – resource utilization
Siebel Scalability On Available Platforms • Note: the 30,000 user tests are based on Siebel 7.0.3 and the 32,000 user test is based on 7.5.2; the transaction mix differs between the Siebel 7.0.3 and 7.5.2 test suites.
Resource Utilization by 30,000 and 32,000 Concurrent Users Test
X64 Performance and Benefits • [Chart: Terminal Server Performance, Knowledge Worker scenario, comparing Windows 2000, Windows Server 2003 (32-bit) and Windows Server 2003 x64 on a scale of 0 to 600, with the labels 80% and 50%; hardware: 4P AMD 64, HP DL 585] • Lab testing indicates increased performance • Up to 50% improvement in users/server on comparable hardware • Knowledge worker simulation • Largest benefit will be with 4P servers in limited virtual kernel memory scenarios • Opportunity for server consolidation • Registry Setting to Reduce Microsoft® Outlook® 2003 Periodic Polling • HKEY_CURRENT_USER\Software\Microsoft\Office\11.0\Outlook\RPC ConnManagerPoll [dword] 0x600