A scientific approach to XenApp farm sizing Helge Klein
Who is Helge Klein? CTP, MVP Author of SetACL and Delprof2 Independent consultant and developer Architect of what later became Citrix Profile Management
What is he talking about? Scientifically sound farm sizing methodology How to calculate farm capacity
Methodology 1. Determine capacity of existing farm 2. Measure load and identify bottlenecks 3. Calculate capacity of new farm
Where to get the numbers? 1. Data collection 2. Observation 3. Measurements 4. Calculation
for /f %i in (AllFarmServers.txt) do wmic /node:%i cpu get name, maxclockspeed, systemname, description, manufacturer, revision /format:csv >> CPUs.txt Collect CPU data Create AllFarmServers.txt with qfarm Use resulting list to determine server model
Srv001,x86 Family 15 Model 4 Stepping 10,GenuineIntel,3400,Intel(R) Xeon(TM) CPU 3.40GHz,1034,Srv001
Srv001,x86 Family 15 Model 4 Stepping 10,GenuineIntel,3400,Intel(R) Xeon(TM) CPU 3.40GHz,1034,Srv001
Srv001,x86 Family 15 Model 4 Stepping 10,GenuineIntel,3400,Intel(R) Xeon(TM) CPU 3.40GHz,1034,Srv001
Srv001,x86 Family 15 Model 4 Stepping 10,GenuineIntel,3400,Intel(R) Xeon(TM) CPU 3.40GHz,1034,Srv001
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv002,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv002
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv003,x86 Family 6 Model 44 Stepping 2,GenuineIntel,2666,Intel(R) Xeon(R) CPU E5640 @ 2.67GHz,11266,Srv003
Srv004,x86 Family 6 Model 26 Stepping 5,GenuineIntel,2666,Intel(R) Pentium(R) III Xeon-Prozessor,6661,Srv004
Srv004,x86 Family 6 Model 26 Stepping 5,GenuineIntel,2666,Intel(R) Pentium(R) III Xeon-Prozessor,6661,Srv004
Srv004,x86 Family 6 Model 26 Stepping 5,GenuineIntel,2666,Intel(R) Pentium(R) III Xeon-Prozessor,6661,Srv004
...
The result
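A list like this is easier to use once it is condensed to one line per server and CPU model. The following is a minimal Python sketch of that step, not part of the original slides. The column order (Node, Description, Manufacturer, MaxClockSpeed, Name, Revision, SystemName) is an assumption read off the sample output; the function names are hypothetical.

```python
import csv
from collections import Counter

# Assumed column order, taken from the sample rows in the slide.
FIELDS = ["node", "description", "manufacturer", "max_clock_mhz",
          "name", "revision", "system_name"]


def parse_cpu_rows(lines):
    """Turn wmic CSV lines into dicts, skipping blank lines and header rows."""
    rows = []
    for record in csv.reader(lines):
        record = [field.strip() for field in record]
        # Each remote query may emit its own header line and empty lines.
        if len(record) != len(FIELDS) or record[0] == "Node":
            continue
        rows.append(dict(zip(FIELDS, record)))
    return rows


def cpus_per_server(rows):
    """Count how many CPU entries each server reports, keyed by model and clock."""
    counts = Counter()
    for row in rows:
        key = (row["node"], row["name"], int(row["max_clock_mhz"]))
        counts[key] += 1
    return counts
```

To use it, pass the lines of CPUs.txt to `parse_cpu_rows`. The file's text encoding depends on how the wmic output was redirected, so it may need to be specified when opening the file.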
RAM, NICs, hard disks Could be determined via WMI, too Often knowing the server model is sufficient Components per model often identical
How many users are logged on? Load of CPU, RAM, NICs Individual processes with a lot of RAM or CPU?
PhysicalDisk\% Disk Time PhysicalDisk\Avg. Disk Queue Length "% Disk Time" roughly corresponds to the flickering of the hard disk LED Disk queue length: number of waiting IOs
Hypothesis: farm is memory limited Limiting factor will differ between farms
Measurements Tool: Perfmon Next slides: relevant counters
Terminal Services\Active Sessions Terminal Services\Inactive Sessions Terminal Services\Total Sessions System\Processes General system information
PhysicalDisk(_Total)\% Disk Time PhysicalDisk(_Total)\Avg. Disk Queue Length PhysicalDisk(_Total)\Disk Reads/sec PhysicalDisk(_Total)\Disk Writes/sec PhysicalDisk(_Total)\Avg. Disk sec/Transfer Hard disk activity Load, queue length, operations per second, latency
Processor(_Total)\% Processor Time Memory\Available MBytes Network Interface(*)\Bytes Total/sec CPU, RAM and network RAM: total amount must be known!
logman create counter TSPerf -f csv -cf C:\PerfLogs\Counters.txt -o C:\PerfLogs\Server13.csv -si 60 -rf 24:00:00 Automation Create the data collector set (it is started separately with logman start TSPerf). Format CSV, performance counters are read from C:\PerfLogs\Counters.txt, output file is C:\PerfLogs\Server13.csv, 60 second sampling interval, duration 24 hours.
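The slides do not show the contents of Counters.txt. A small Python sketch that generates a plausible version from the counters named on the preceding slides follows. The leading backslash (full counter path syntax, one path per line) is an assumption, and `counter_file_text` is a hypothetical helper.

```python
# Counters named on the "relevant counters" slides, written as full counter
# paths. The leading backslash is an assumption; the slides omit it.
COUNTERS = [
    r"\Terminal Services\Active Sessions",
    r"\Terminal Services\Inactive Sessions",
    r"\Terminal Services\Total Sessions",
    r"\System\Processes",
    r"\PhysicalDisk(_Total)\% Disk Time",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\PhysicalDisk(_Total)\Disk Reads/sec",
    r"\PhysicalDisk(_Total)\Disk Writes/sec",
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Transfer",
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available MBytes",
    r"\Network Interface(*)\Bytes Total/sec",
]


def counter_file_text(counters=COUNTERS):
    """Return the counter list as text, one counter path per line."""
    return "\n".join(counters) + "\n"


if __name__ == "__main__":
    # Print the list; redirect the output to C:\PerfLogs\Counters.txt.
    print(counter_file_text(), end="")
```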
logman create counter TSPerf -f csv -cf C:\PerfLogs\Counters.txt -o C:\PerfLogs\Server13.csv -si 60 -rf 24:00:00 -s Server13 Execution on remote computer Server13
for /f %i in (Servers.txt) do logman create counter TSPerf -f csv -cf C:\PerfLogs\Counters.txt -o C:\PerfLogs\%i.csv -si 60 -rf 24:00:00 -s %i Many servers One computer name per line in Servers.txt
Analyzing the measured data
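Before charting, a quick numeric summary of each log can help spot the interesting servers. The sketch below is an illustration, not the author's tooling. It assumes the usual Perfmon CSV layout (first column is the timestamp, every other column is one counter, and the header row holds the counter paths), and the function name `summarize_counters` is hypothetical.

```python
import csv
import io


def summarize_counters(csv_text):
    """Return min, average and max per counter column of a Perfmon CSV log."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = {name: [] for name in header[1:]}  # column 0 is the timestamp
    for row in reader:
        for name, value in zip(header[1:], row[1:]):
            try:
                samples[name].append(float(value))
            except ValueError:
                # Empty or blank cells (e.g. the first sample of a rate
                # counter) are skipped.
                continue
    return {
        name: {
            "min": min(values),
            "avg": sum(values) / len(values),
            "max": max(values),
        }
        for name, values in samples.items()
        if values
    }
```

For Memory\Available MBytes the minimum is the interesting value; for CPU, disk and network it is the maximum and the average.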
CPU and sessions Server 37 Moderate load during logon phase, afterwards even less A lot of overcapacity
CPU and sessions Server 89
Overlaying the CPU load of many servers Easily verify the analysis
HDD and sessions Server 37 Moderate load, peaks during logon phase Full load at approx. 200
HDD and sessions Server 89
Overlaying the HDD load of many servers Easily verify the analysis
RAM and sessions Server 37 Continually increasing load, maximum in the afternoon Available RAM must not go near zero (because of disk cache) High load
RAM and sessions Server 89
Overlaying the memory load of many servers Easily verify the analysis
Network and sessions Server 37 200 = 2 MB/s Average rate < 200 KB/s Very low load, a lot of overcapacity
Network and sessions Server 89
Overlaying the network load of many servers Easily verify the analysis
Hypothesis confirmed: farm is limited by available memory CPU load: low, network: negligible, hard disk: moderate
Calculating farm capacity
Normalizing CPU performance How to compare performance of different CPUs? Benchmarking is difficult Better: Moore’s law (doubling of performance every 18-24 months) Surprisingly accurate (amongst other things because it is a self-fulfilling prophecy)
Performance after time (in months) Assumed performance doubling every 21 months Oldest CPU in farm = 1.0
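The normalization on this slide can be written as a one-line formula: a CPU released m months after the oldest one gets the factor 2^(m/21). A minimal Python sketch under that assumption follows. The function names are hypothetical, and the slides do not state what counts as one CPU (socket, core or logical processor), so the inventory format is an assumption as well.

```python
def relative_performance(months_after_oldest, doubling_months=21.0):
    """Performance relative to the oldest CPU in the farm (which is 1.0)."""
    return 2.0 ** (months_after_oldest / doubling_months)


def farm_capacity(cpus, doubling_months=21.0):
    """Sum normalized CPU performance over the farm.

    cpus: iterable of (cpu_count, months_after_oldest) pairs (assumed format).
    """
    return sum(
        count * relative_performance(months, doubling_months)
        for count, months in cpus
    )
```

For example, 4 CPUs of the oldest type plus 8 CPUs that are 21 months newer count as 4 × 1 + 8 × 2 = 20 normalized CPUs.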
Farm capacity: 1250 normalized CPUs 0.63 CPUs / user
Hard disk performance = IOPS With many concurrent accesses transfer rate is mostly irrelevant More important: IOPS (operations per second) Exact number depends on measurement method; do not believe vendors
PhysicalDisk(_Total)\Disk Reads/sec PhysicalDisk(_Total)\Disk Writes/sec IOPS measurement with Perfmon Read and write IOPS may be very different
Read and write IOPS Average: ~15 Including spikes: 30
Farm capacity: 7300 IOPS 3.7 IOPS / user
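The per-user figures can be reproduced with simple division. The sketch below assumes that a per-user value is the farm capacity divided by the number of users, which the slides do not state explicitly. The function names and the planned user count in the usage note are hypothetical.

```python
def implied_users(farm_capacity, per_user_value):
    """Number of users implied by a capacity and its per-user share."""
    if per_user_value <= 0:
        raise ValueError("per_user_value must be positive")
    return farm_capacity / per_user_value


def required_capacity(per_user_value, planned_users):
    """Capacity a new farm needs if each user gets the same share as today."""
    return per_user_value * planned_users
```

With the numbers from the slides, 1250 / 0.63 gives about 1,984 users and 7300 / 3.7 gives about 1,973 users, so both figures are consistent with a farm of roughly 2,000 users. For a hypothetical new farm with 1,000 users, `required_capacity(3.7, 1000)` gives about 3,700 IOPS.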
RAM – how much do we have? We need total RAM that is available for user sessions