Debugging XenApp & XenDesktop. Lalit Kaushal Escalation Engineer EMEA. Agenda. Overview of Common Components Troubleshooting Utilities Common Issues Troubleshooting Tips. Overview of Common Components. Putting It All Together. Find “best” virtual desktop.
Debugging XenApp & XenDesktop Lalit Kaushal Escalation Engineer EMEA
Agenda • Overview of Common Components • Troubleshooting Utilities • Common Issues • Troubleshooting Tips
Putting It All Together • Diagram labels: Find “best” virtual desktop; Acquire license and determine settings; Authenticate; Start VM; Register; PXE-boot VM and stream OS; Connect using ICA; Log in; Apply profile; Deliver apps • Components: Desktop Delivery Controller; SAN; PVS; XenServer; Virtual Machines; Active Directory with roaming profiles; XenApp • Full range of authentication methods supported through Web Interface technology • Full support for SmartAccess and ICA session policies
Common Components • ICA Client • Web Interface • Active Directory • XML • IMA • DDC/ZDC (although their roles are a bit different)
Before you begin • Understand the problem • Where is the problem? • Network • Server (all servers / one server) • Client (one client machine / one client version / client type) • Data Store problem (corruption / inconsistency / configuration)
Where to start? • Collect Information • Frequency? • Can I reproduce? • Determine Possible Causes/Effects • Get dumps, logs • Tools • Determine necessary tools • Create a Setup • Debug • Tools and Information to solve problem
Solving the problem • Determine accurate reproduction steps • Find appropriate starting point to debug • Crashes – Determine state (using global, stack, etc.) • Debug against working model • Use appropriate tools
What tools are available? • WINDBG – Windows Debugger • CDFControl – CDF Tracing • FILEMON – File Monitoring • REGMON – Registry Monitoring • PROCEXP – Process Explorer • SYSTEMDUMP
Process Monitor • Combines Filemon and Regmon
Process Explorer • Shows the handles and DLLs that processes have opened or loaded • Helpful to troubleshoot: • Memory Optimization issues • Application Streaming • Access issues • Process Explorer is available from Microsoft
Network Trace - Packets • Synchronize Packet (SYN) • Start of TCP session. Three-way handshake (SYN, SYN-ACK, ACK) • ICA session initialisation packets are transmitted next • Reset Packet (RST) • Something has gone wrong: TCP session failed, unhandled closure of session • Finish Packet (FIN) • Session is being closed in a handled manner • Push Packet (PSH) • Data is being sent to the receiving process directly • Ack Packet (ACK) • Packet was received successfully by the remote device
Session (Network trace) • Start of a session • End of session
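When reading a trace, it can help to decode the TCP flags byte by hand. The sketch below is illustrative only: the bit values are the standard TCP header flag bits, while the function names and the wording of the descriptions are assumptions made for this example and are not part of any capture tool.

```python
# Standard TCP header flag bits (FIN, SYN, RST, PSH, ACK).
TCP_FLAGS = {
    0x01: "FIN",
    0x02: "SYN",
    0x04: "RST",
    0x08: "PSH",
    0x10: "ACK",
}


def decode_flags(flag_byte):
    """Return the names of the flags set in a TCP flags byte, lowest bit first."""
    return [name for bit, name in sorted(TCP_FLAGS.items()) if flag_byte & bit]


def describe(flag_byte):
    """Map a flags byte to the session event it most likely represents."""
    names = set(decode_flags(flag_byte))
    if "RST" in names:
        return "session reset (unhandled closure)"
    if "FIN" in names:
        return "session closing in a handled manner"
    if "SYN" in names and "ACK" in names:
        return "handshake step 2 (SYN-ACK)"
    if "SYN" in names:
        return "handshake step 1 (SYN), start of session"
    if "PSH" in names:
        return "data pushed to the receiving process"
    if "ACK" in names:
        return "acknowledgement"
    return "no flags of interest"


if __name__ == "__main__":
    # Hypothetical flag bytes as they might appear in a capture.
    for value in (0x02, 0x12, 0x10, 0x18, 0x11, 0x04):
        print(hex(value), decode_flags(value), describe(value))
```

A SYN at the start and a FIN (or RST) at the end bracket the session in the trace.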
Debugging • User Mode versus Kernel Mode • The Windows operating system can be conceptually divided into 2 parts: • User Space (User Mode) • Kernel Space (Kernel Mode) • Applications run in User Mode • System drivers run in Kernel Mode (Privileged Mode)
Diagram: USER SPACE (USER MODE) • user applications • KERNEL SPACE • rusb2w2k.sys • keyboard.sys • win32k.sys • tcpip.sys • […]
Dump and Logs - BSOD • Microsoft definition: BSOD is a Fatal Exception Error or System failure • Fatal exception errors: • Access to an illegal instruction has been encountered • Invalid data or code has been accessed • The privilege level of an operation is invalid • In most cases the exception is non-recoverable • Dumps system memory to a file for debugging • Memory.dmp is placed on the System Drive • Requires free space equivalent to physical RAM + approx 12MB
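As a quick check of the free-space rule of thumb quoted on this slide (physical RAM plus roughly 12 MB), a minimal sketch follows. The function names are made up for this example, and the 12 MB overhead is simply the approximate figure from the slide, not a measured value.

```python
OVERHEAD_MB = 12  # approximate overhead quoted on the slide


def required_dump_space_mb(physical_ram_mb):
    """Free space needed for Memory.dmp under the slide's rule of thumb."""
    return physical_ram_mb + OVERHEAD_MB


def has_room_for_dump(physical_ram_mb, free_space_mb):
    """True if the system drive has enough free space for a dump of all RAM."""
    return free_space_mb >= required_dump_space_mb(physical_ram_mb)


if __name__ == "__main__":
    # Hypothetical server: 4 GB of RAM, 4000 MB free on the system drive.
    print(required_dump_space_mb(4096))
    print(has_room_for_dump(4096, 4000))
```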
Dumps & Logs - Types of Dumps • User dump – process memory: live dump (snapshot) or post-mortem dump (after crash) • Kernel dump – OS kernel memory: manual dump or post-mortem dump (after BSOD) • Complete dump – physical memory (kernel memory + processes): manual dump or post-mortem dump (after BSOD)
User Dump • Dr Watson Debugger generates a log file (Drwtsn32.log) and a user dump (user.dmp) when an application exception or program error occurs • The log file is cumulative; user.dmp is overwritten • Set as the default debugger: drwtsn32.exe -i • User Dump generates a memory dump of a specific process • Microsoft Knowledge Base Article 241215 • WINDOWS TASK MANAGER CAN CAPTURE USER DUMPS IN VISTA & 2008!!!
SystemDump - CTX111072 • Can generate a dump from a session • No keyboard required • Command line option available • 32 / 64 bit • Description saved in dump
DumpCheck - CTX108825 • Citrix DumpCheck (Explorer Extension)
Common Problems • Server\Application Crash • Server\Application Hang • CPU Spikes • Web Interface Debugging
Capturing Application Crash Dumps • Some method of capturing the fault is needed • Ntsd - http://support.citrix.com/article/ctx108173 • Windbg - http://support.citrix.com/article/ctx107528 • Userdump - http://support.microsoft.com/kb/241215 • Dr Watson – http://support.citrix.com/article/ctx103209 • WER - http://www.microsoft.com/whdc/maintain/StartWER.mspx • Verify your chosen method works • TestDefaultDebugger – http://support.citrix.com/article/ctx111901 • Have one of these methods enabled
Debugging tools for Windows • Use these tools to analyze crash dumps • http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx • Latest version is part of the WDK (620 MB download) • Earlier versions are available as a standalone download
WinDbg Symbols – Huh?
Symbols • .PDB – Program Database • Generated during compilation of the application by the vendor • Necessary to translate memory into something human-readable • 11010101001010101 = helloworld() • Microsoft symbol server - Essential • http://msdl.microsoft.com/symbols • Citrix symbols • ftp://ftp.citrix.com/ • http://ctxsym.citrix.com/symbols • Symbol path: SRV*c:\symcache*http://msdl.microsoft.com/download/symbols;SRV*c:\symcache*http://ctxsym.citrix.com/symbols
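The symbol path is just a semicolon-separated list of SRV*cache*server entries, so it can be assembled programmatically. The sketch below is one possible way to do that; the helper name is an assumption for this example, and the cache folder and server URLs are the ones shown on the slide.

```python
def build_symbol_path(cache_dir, servers):
    """Join one SRV*<cache>*<server> entry per symbol server with semicolons."""
    return ";".join(f"SRV*{cache_dir}*{url}" for url in servers)


if __name__ == "__main__":
    path = build_symbol_path(
        r"c:\symcache",
        [
            "http://msdl.microsoft.com/download/symbols",
            "http://ctxsym.citrix.com/symbols",
        ],
    )
    # The resulting string can be pasted into the debugger's symbol path
    # setting or stored in the _NT_SYMBOL_PATH environment variable.
    print(path)
```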
Analyzing crashes • Can use a similar method for Kernel or User Dump analysis • !analyze -v • lmv m suspiciousmodule • Update suspiciousmodule to the latest version • Search for the stack trace to see if it is a known issue • Look at stack functions • Understand what the code was trying to do when it crashed
Stack trace (screenshot) • Read upwards • Component names: DLL, EXE, SYSTEM Driver • Systemdump_400000 makes a call into USER32 • Systemdump_400000 makes a call into ntdll
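To illustrate picking component names out of a stack listing, here is a rough sketch that scans frames from the top and reports the first module that is not on a list of core OS modules. Everything in it is an assumption for illustration: the frame text is a simplified, made-up module!function+offset form (not real debugger output), the function names are hypothetical, and the core-module list is a small example set.

```python
# Example set of modules treated as core OS components (an assumption).
CORE_OS_MODULES = {"nt", "hal", "ntdll", "kernel32", "user32", "win32k"}


def module_of(frame):
    """Return the lower-cased module name from a 'module!function+0xNN' frame."""
    symbol = frame.split()[-1]              # last token holds the symbol
    module = symbol.split("!", 1)[0]        # drop the function name, if any
    return module.split("+", 1)[0].lower()  # drop a bare '+offset', if any


def first_suspect(frames):
    """Scan from the top of the stack; return the first non-core module or None."""
    for frame in frames:
        module = module_of(frame)
        if module not in CORE_OS_MODULES:
            return module
    return None


# Hypothetical stack, top frame first.
SAMPLE_STACK = [
    "ntdll!ExampleFunctionA+0x10",
    "USER32!ExampleFunctionB+0x2c",
    "thirdparty!ExampleFunctionC+0x44",
    "Systemdump_400000+0x1a2b",
]

if __name__ == "__main__":
    print(first_suspect(SAMPLE_STACK))
```

The module name it returns is the one to pass to lmv m for version, owner, and timestamp details.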
Review of the stack • The top of the stack is the last function executed • What caused the crash? • Look for non-core OS components • Core OS modules are usually not at fault • Closest to the top of the stack • Treat them as suspicious • Find out via the lmv command • Version • Owner • Timestamp
Case Study: Using WinDbg to analyze IMA Crash
Case Study • Issue Reported • IMA frequently stopped unexpectedly on several servers in the farm • Data Collected • Collected User Dump • Analysis Done • !analyze -v • lmv m <modulename> • Resolution • Uninstall Oracle Client 9.2 and update to 10.2
Server Hangs • Dumps are not created automatically • Full memory dumps are most useful • Need to force a dump • Systemdump, if the server is not fully hung - http://support.citrix.com/article/CTX111072 • Keyboard - http://support.microsoft.com/kb/244139/en-us • Hardware NMI Switch • Configure for full memory dump instead of kernel dump
Analyzing Server Hangs • Automatic analysis • !analyze -v -hang • Not 100% reliable for full memory dumps • lmv m suspectmodulename • Check for locks • The 3-step programme with two new commands • !locks • Look for exclusive waiters • Notice contention count • Look at the owner thread code • !thread <threadID>
Analyzing application hangs • Force a crash of the process • userdump.exe - http://support.citrix.com/article/ctx466627 • Vista/2008 – Available from Task Manager • Same WinDbg commands again • Automatic analysis is usually good • !analyze -v -hang • Try to understand what the code is doing from function names • Might have to chase the hang from one process to another
Case Study: Using WinDbg to analyze Server hang
Case Study • Issue Reported • XenApp server is hanging during logon • Data Collected • Collected Kernel Dump • Analysis Done • !analyze -v -hang • !locks • Resolution • Involved Microsoft and recommended the relevant Microsoft Hotfix
CPU Spikes • Try to define a pattern (leverage perfmon) • Determine the offending thread ID causing the spike (Process Explorer, QSlice) • Obtain a user dump of the offending process immediately after (Userdump.exe, WinDbg.exe) • Use the !runaway WinDbg command to view thread times • The topmost thread is the one to investigate • Use an application spy to look at what the application is doing (TracePlus, Logger) • ProcDump - New Microsoft Tool!!!
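The idea of ranking threads by CPU time and investigating the topmost one can be sketched as below. This is not the output or implementation of any tool named on the slide: the function name, thread IDs, and timings are hypothetical, and the comparison of two snapshots taken around the spike is an assumption made for this example.

```python
def rank_threads(before_ms, after_ms):
    """Return (thread_id, cpu_ms_used) pairs between two snapshots, busiest first."""
    deltas = {tid: after_ms[tid] - before_ms.get(tid, 0) for tid in after_ms}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical per-thread CPU times in milliseconds, before and after a spike.
    before = {1234: 10000, 2345: 3500, 3456: 200}
    after = {1234: 10400, 2345: 9500, 3456: 200}
    ranking = rank_threads(before, after)
    print("Thread to investigate first:", ranking[0][0])
```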
ProcDump • Microsoft command-line utility • Monitors an application for CPU spikes • Generates a dump during spikes • usage: procdump [-64] [[-c CPU usage] [-u] [-s seconds]] [-n exceeds] [-e] [-h] [-m commit usage] [-ma] [-o] [-r] [-t] < <process name or PID> [dump file]] | [-x <image file> <dump file> [arguments]> • Example: C:\>procdump -c 20 -n 3 -o pnamain c:\dump\pnamain
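If ProcDump is launched from a script, the example command line can be assembled from parameters. The sketch below uses only the -c, -n, and -o switches that appear in the slide's example. The helper name and parameter names are assumptions for illustration, and -o is understood here to overwrite an existing dump file.

```python
def build_procdump_args(target, dump_file, cpu_usage=None, exceeds=None, overwrite=False):
    """Build an argument list for procdump from the switches used on the slide."""
    args = ["procdump"]
    if cpu_usage is not None:
        args += ["-c", str(cpu_usage)]   # CPU usage threshold
    if exceeds is not None:
        args += ["-n", str(exceeds)]     # the "exceeds" value from the usage text
    if overwrite:
        args.append("-o")                # assumed: overwrite an existing dump file
    args += [target, dump_file]
    return args


if __name__ == "__main__":
    args = build_procdump_args(
        "pnamain", r"c:\dump\pnamain", cpu_usage=20, exceeds=3, overwrite=True
    )
    print(" ".join(args))
```

The resulting list could be handed to something like subprocess.run on a machine where procdump is on the PATH.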