Kapil Ramlal (KappA) Escalation Engineer

Troubleshooting Tools and Methodology in a Citrix XenApp 5.0 Environment. Agenda: XenApp troubleshooting; the right tool, right place at the right time; troubleshooting scenarios; top utilities; case studies; additional resources/Q&A.

Presentation Transcript


  1. Kapil Ramlal (KappA), Escalation Engineer: Troubleshooting Tools and Methodology in a Citrix XenApp 5.0 Environment

  2. Agenda • XenApp troubleshooting • The right tool, right place at the right time • Troubleshooting scenarios • Top utilities • Case studies • Additional resources/Q&A

  8. XenApp troubleshooting: Understanding the infrastructure • The anatomy of a XenApp farm (information: static and dynamic; components: where to focus troubleshooting) • Understanding what happens from logon to launch (types of issues: denial of service, bottlenecks; troubleshooting: MedEvac, performance monitoring, CDF…)

  9. Types of Information • Static (Data Store, LHC): does not change frequently; farm configuration; changes made in the Management Console • Dynamic (Dynamic Store): constantly changing information; load management; information required for application launch

  10. Logon to launch [Diagram: Client, Web Interface, XML Broker, Active Directory, Zone Data Collector, Data Store, Least Loaded Server]

  11. MedEvac (CTX107935) • XML Broker tests: verifies that the XML Service is able to respond to an XML/client request; the XML Service is able to contact the Zone Data Collector • Zone Data Collector tests: verifies that the ZDC can provide the address of the least loaded server for the requested app; the IMA Service is able to respond; the IMA Service can read the Local Host Cache; the IMA Service can read its Dynamic Store • Least Loaded Server tests: verifies that Terminal Services is able to respond; verifies that the RPC Service is able to respond

  12. How to Monitor Farm Health using MedEvac? • See knowledge center article CTX119899

  13. Monitoring [Diagram labels: RSOP; CDF; Active Directory; XML Broker: XML Threads; Client; Web Interface: ASP Requests; Zone Data Collector: IMA Work Item Queues, IMA %CPU time, Zone Elections Won]

  14. XenApp 5.0 Health Monitoring and Recovery • Enterprise & Platinum Editions of XenApp • Performs tests to monitor state and identify health risks • Terminal Services tests • XML Service test • Citrix IMA Service test • Logon Monitor test • Check DNS test • Local Host Cache test • XML threads test • Citrix Print Manager Service test • Microsoft Print Spooler test • ICA Listener test • See page 307 of the XenApp 5.0 Administrator’s Guide (CTX115519) for information

  15. Free the ZDC! Large Farm Tips • Limit additional roles on Zone Data Collectors • Limit the number of zones in the environment • Do not run management consoles on or pointed to the ZDCs • Read the Key Infrastructure Tuning article: CTX116492

  16. The evolution continues! • Citrix XenApp 5.0 opens the door for delivering resources on Windows Server 2008 • More client users are also adopting Windows Vista • Say hello to the next-generation troubleshooting artillery for the XenApp 5 environment • Existing tools have been updated, and new tools have been introduced

  17. The right tool, right place at the right time • DON'T: use troubleshooting tools just because you can; recommend tools that are not relevant to the problem; use troubleshooting tools without understanding their impact on the environment • DO: use tools to help automate time-consuming tasks; use tools at the right time, such as while the problem is occurring and not afterwards; understand what the tool is trying to accomplish, so that the right data is obtained; use tools with a clear purpose; maintain a local toolkit, so that the right tools are always available in times of crisis

  18. CDF Tracing & CDFControl 2.5

  19. Common Diagnostic Facility (CDF) • Provides the ability to collect traces for problem diagnosis on Citrix binaries without disrupting the services or users • Citrix’s standard debug tracing facility • Efficient and non-intrusive data collection process • Enabled without stopping and starting services • Faster & easier tracing for retail modules • Flexible & customizable troubleshooting facility • Consistency across most Citrix products

  20. CDF Basics • To better understand what a CDF trace message is, let’s look at the following pseudo code example • In the example, the function belongs to a service, which can be considered to be a Trace Provider (more on this later)
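The pseudo code shown on the slide did not survive in this transcript. As a stand-in only, here is a minimal Python sketch of what trace statements inside a service function might look like. The helper `trace`, the function `initialize_feature`, and the message texts are hypothetical and are not Citrix code; only the DLL name is taken from the next slide.

```python
from typing import Callable, List

TRACE_LOG: List[str] = []  # stands in for the buffers a real trace provider writes to


def trace(message: str) -> None:
    """Record one trace message (hypothetical helper, not a Citrix API)."""
    TRACE_LOG.append(message)


def initialize_feature(load_library: Callable[[str], bool]) -> bool:
    """Service function acting as a trace provider: it traces each step it takes."""
    trace("initialize_feature: entry")
    if not load_library("CitrixFeatureDLL.dll"):
        trace("initialize_feature: failed to load CitrixFeatureDLL.dll")
        return False
    trace("initialize_feature: CitrixFeatureDLL.dll loaded")
    return True


if __name__ == "__main__":
    # Simulate a failed load and show what the captured trace would reveal.
    initialize_feature(lambda name: False)
    print("\n".join(TRACE_LOG))
```

The loader is passed in as a callable so the sketch runs anywhere; a real service would call the operating system's library loader instead.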

  21. The moral of the story • We could capture a CDF trace to determine whether CitrixFeatureDLL.dll loaded successfully • How difficult would it be to debug without this tracing? • You need special symbol files (TMF files) to be able to read the trace messages • This allows certain information to remain private as needed (similar to .pdb files) • You get more by default!

  22. CDF Internals • To better understand CDF, let’s take a quick look at how the operating system supports Event Tracing for Windows (ETW)

  23. [Diagram: Event Tracing for Windows. CDFControl acts as controller (enable/disable) and consumer; providers such as CDM.sys, RadeSvc.exe and WFShell.exe send events to the Windows ETW buffers; events go to a trace file or to the consumer]

  24. ETW Components • Providers: modules containing trace statements that can be enabled or disabled; example: MF_Driver_Cdm (Cdm.sys) • Controllers: enable/disable a provider; configure trace capture settings; start/stop a trace • Consumers: read trace events from a log file; read trace events in real time from a trace session
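To make the three roles concrete, here is a toy Python model of a provider, a controller, and a consumer. It is a sketch of the relationships only and does not call the real ETW API; every class and function name is hypothetical, and only the provider name comes from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Provider:
    """A module containing trace statements; it is silent until enabled."""
    name: str
    enabled: bool = False


@dataclass
class TraceSession:
    """Stands in for the session buffers and the resulting log."""
    events: List[str] = field(default_factory=list)
    running: bool = False


class Controller:
    """Starts/stops a session and enables/disables providers."""

    def __init__(self, session: TraceSession) -> None:
        self.session = session

    def start(self) -> None:
        self.session.running = True

    def stop(self) -> None:
        self.session.running = False

    def enable(self, provider: Provider) -> None:
        provider.enabled = True

    def disable(self, provider: Provider) -> None:
        provider.enabled = False


def emit(provider: Provider, session: TraceSession, message: str) -> None:
    """A provider's event is recorded only if the provider is enabled and the session runs."""
    if provider.enabled and session.running:
        session.events.append(f"{provider.name}: {message}")


def consume(session: TraceSession) -> List[str]:
    """Consumer: read the collected events back."""
    return list(session.events)


if __name__ == "__main__":
    session = TraceSession()
    controller = Controller(session)
    cdm = Provider("MF_Driver_Cdm")
    controller.start()
    controller.enable(cdm)
    emit(cdm, session, "drive mapping requested")
    print(consume(session))
```

The point of the model is that enabling or disabling a provider never touches the provider's own process, which is why tracing can be switched on without restarting services.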

  25. CDFControl v2.5 • CDFControl is a hybrid controller and consumer • It can start, stop, enable and configure an ETW/CDF trace session • It can consume (read) trace events from a log file, or from a live real-time trace session • The original version operated only as an ETW controller, and was published under CTX111961

  26. CDFControl 2.5 Demo

  27. Troubleshooting Scenarios

  28. Troubleshooting scenarios • Application Streaming • Seamless/Multi-Monitor • 3rd Party Applications • CPU Spikes • Deadlocks/Hangs • Database • Network • Black Hole Effect • XenApp Plugin (PNA) • Debugging

  29. Application Streaming: What happens on the client side? [Diagram: end user with streaming client and AIE; RAD file; network; file servers holding the manifest file, executable, AIE rules, .dll's, data files and other .exe's] • End user launches app from WI or PN Agent • RAD file is downloaded • RAD file launches client Application Isolation Environment (AIE) • RAD file instructs streaming client to download: manifest file, AIE rules, application executable, pre- and post-execution scripts • Streaming client launches the executable according to the instructions in the manifest file and AIE rules, including pre- and post-execution scripts, and registers with ctxsbx.sys (redirector) • Application is available to user • Streaming client requests additional files as required, checking first in the client cache and then, if necessary, downloading them from the file server
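The last step, checking the client cache before going to the file server, is a plain cache-first lookup. A minimal Python sketch of that idea follows; `fetch_file`, the dictionary cache, and the `download` callable are hypothetical stand-ins and do not reflect how the streaming client is actually implemented.

```python
from typing import Callable, Dict


def fetch_file(
    name: str,
    cache: Dict[str, bytes],
    download: Callable[[str], bytes],
) -> bytes:
    """Return a file's content, checking the client cache before the file server."""
    if name in cache:
        return cache[name]
    content = download(name)  # only reached on a cache miss
    cache[name] = content     # keep it so the next request is served locally
    return content


if __name__ == "__main__":
    requests = []

    def fake_file_server(name: str) -> bytes:
        requests.append(name)
        return b"contents of " + name.encode()

    client_cache: Dict[str, bytes] = {}
    fetch_file("app.dll", client_cache, fake_file_server)
    fetch_file("app.dll", client_cache, fake_file_server)
    print(f"file server was contacted {len(requests)} time(s)")
```

When troubleshooting, this is why a launch can succeed from cache while the file server is unreachable, and fail only when a file that was never cached is first requested.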

  30. Application Streaming • Isolate the issue • When? Profiling, publishing, streaming • How? Streaming to server, streaming to client • Versions? WI 4.5, 5.0; license server 4.5, 5.0; client

  31. Application Streaming: Streaming client troubleshooting • Client installation is required on workstations • Verify that the Citrix Streaming Service is started, or restart it • Reference CTX116483 for required permissions • Enable the debug console: key HKEY_LOCAL_MACHINE\Software\Citrix\Rade, REG_DWORD “EnableDebugConsole”, value 1 to switch on, 0 to switch off
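One way to apply the debug console setting consistently is to generate a .reg file and import it on the client. The Python sketch below only builds the file text; the function name `debug_console_reg` is hypothetical, and whether the streaming client must be restarted afterwards is not stated on the slide.

```python
KEY = r"HKEY_LOCAL_MACHINE\Software\Citrix\Rade"


def debug_console_reg(enabled: bool) -> str:
    """Build .reg file text that sets EnableDebugConsole to 1 (on) or 0 (off)."""
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        f'"EnableDebugConsole"=dword:{value:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(debug_console_reg(True))
```

Generating a matching "off" file with `debug_console_reg(False)` makes it easy to switch the console off again once the data has been collected.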

  32. Application Streaming: Leverage real-time CDF tracing! • Run CDFControl on the machine where the streaming client is installed • Choose the Application Streaming category • Enable real-time tracing • Provide a TMF path (CTX106233) • Start tracing and reproduce the launch failure

  33. Seamless/Multi-Monitor: Seamless host components [Diagram labels: Winlogon, Default; winlogon.exe, sehook20.dll, wfshell.exe, seamls20.dll, icactls.dll, icast.exe; TWIWorker, TWIReader, TWISysTrayAgent; ICA Client]

  34. Seamless/Multi-Monitor: Seamless client components [Diagram labels: wfica32.exe, vdtwin30.dll, vdtwn.dll, ctxsrcc.lib, GAI, LVB]

  35. Seamless/Multi-Monitor: Multi-Monitor • An optional component • The client provides a monitor layout via the thinwire channel, which is shared through shared memory by all processes that load mmhook.dll • A work area change is always posted to the host. This could be due to a change in the work area of the existing monitors or a change in virtual screen size due to the addition or removal of monitors • API hooks are controlled by flags and can be customized per process. Refer to CTX115637 for the various configuration options

  36. Seamless/Multi-Monitor • Press Shift+F2 to change to full-screen mode • Reconnect as a fixed-size window session • Set the global flags to 0x26DEA7 to see if that fixes the issue • This value is a combination of flags that includes the following (see CTX101644 for details of each bit): 0x1 (Disable session sharing), 0x2 (Disable modality check), 0x4 (Disable AA hook) • Analyze the CDF trace for MF_DLL_CTXNOTIF and MF_SESSION_TWI • Analyze window information using SPY++/Window History/Message History • Try per-window exception flags • Analyze application logic (API flow) using the TracePlus utility
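Because 0x26DEA7 is a bitmask, it helps to split it into its individual bits before looking each one up in CTX101644. The Python sketch below does only that arithmetic. The three names in `KNOWN_FLAGS` are the ones given on the slide; the helper names are hypothetical and every other bit is deliberately left unnamed.

```python
from typing import Dict, List

# Only the three bits spelled out on the slide are named here.
KNOWN_FLAGS: Dict[int, str] = {
    0x1: "Disable session sharing",
    0x2: "Disable modality check",
    0x4: "Disable AA hook",
}


def set_bits(mask: int) -> List[int]:
    """Return the individual flag values (powers of two) that are set in mask."""
    bits = []
    bit = 1
    while bit <= mask:
        if mask & bit:
            bits.append(bit)
        bit <<= 1
    return bits


def describe(mask: int) -> List[str]:
    """Label each set bit, deferring to CTX101644 for bits not named on the slide."""
    return [f"{bit:#x} ({KNOWN_FLAGS.get(bit, 'see CTX101644')})" for bit in set_bits(mask)]


if __name__ == "__main__":
    for line in describe(0x26DEA7):
        print(line)
```

Splitting the mask this way also supports a bisecting approach: if the full value fixes the issue, the bits can be re-enabled in groups to find the single flag that matters.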

  37. Seamless/Multi-Monitor • Get the window class name of the window that exhibits the problem • Collect CDF traces for the concerned modules ONLY: CTXNOTIF, MMHOOK, TWCDS, TWI, TWI_HOOK • Analyze which aspects of the behavior could be affected by hooks • Does it happen on a single monitor too? If yes, the chances that the multi-monitor hook is involved are small. Disable mmhook and see what happens • Compare the window styles at host and client • For seamless-specific issues, verify whether it also happens in an ICA desktop or RDP session

  38. 3rd Party Applications • How does the application work? • Is it native, or does it run on a framework, such as .NET or Java? • Do you have the right versions of the framework installed? • Are the correct dependencies present, and does it work at the console? • Does it require certain file and registry access? (Does it need Write permissions, etc.?) • Does it require component registration? • Inspect core functionality • View the application/process under an analysis tool such as ProcessExplorer or WinDbg • Inspect all modules (DLLs) loaded by the application • Validate any dependencies (missing DLLs?) • Inspect named events and handle usage (synchronization/resource problems?) • Validate file and registry access using ProcessMonitor • Run the application under the AppVerifier utility to check for a multitude of issues

  39. 3rd Party Applications • Leverage the global flags for user-mode applications using the Gflags utility • Set the 3rd party application to run under Image File Execution Options • Configure a debugger (such as WinDbg) to invoke the application • When the application launches, the debugger will automatically attach to the process and halt its execution! • This gives the opportunity to explore all application threads from process initialization (~*kb) • From here the internals of the application can be understood at the native Windows API level (i.e. which Windows APIs are being used)
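The Image File Execution Options setting is a `Debugger` string value under a registry key named after the executable. As a sketch of what that entry looks like, the Python below builds equivalent .reg file text; the function name, the executable name, and the debugger path are hypothetical placeholders.

```python
IFEO = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
    r"\CurrentVersion\Image File Execution Options"
)


def ifeo_debugger_reg(image_name: str, debugger_path: str) -> str:
    """Build .reg text that launches image_name under the given debugger."""
    # String values in .reg files need each backslash doubled.
    escaped = debugger_path.replace("\\", "\\\\")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{IFEO}\\{image_name}]",
        f'"Debugger"="{escaped}"',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical application name and debugger location.
    print(ifeo_debugger_reg("ThirdPartyApp.exe", r"C:\Debuggers\windbg.exe"))
```

Remember to remove the `Debugger` value once the investigation is done, otherwise every launch of that executable will keep starting under the debugger.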

  40. 3rd Party Applications • Use ProcessExplorer to view the loaded modules for a process, and check for the presence of any hook modules (hooking DLLs) • Hook modules can alter the natural behavior of applications, which can sometimes cause problems • Try excluding the problem application from all Citrix hooks (CTX107825)

  41. CPU Spikes • Try to define a pattern (leverage perfmon) • Determine offending Thread ID causing the spike (Process Explorer, QSlice) • Obtain userdump of offending process immediately after (Userdump.exe, WinDbg.exe) • Check CDF trace for repeated (looping) messages (if Citrix component) • Use application spy to look at what the application is doing (TracePlus, Logger)
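Checking a trace for looping messages amounts to finding long runs of the same message. The Python sketch below assumes the message texts have already been extracted from the parsed trace into a list, with timestamps removed; it does not parse any real CDF output format, and `repeated_runs` and the sample messages are hypothetical.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def repeated_runs(messages: Iterable[str], min_repeats: int = 3) -> List[Tuple[str, int]]:
    """Return (message, count) for each run of identical consecutive messages."""
    runs = []
    for message, group in groupby(messages):
        count = sum(1 for _ in group)
        if count >= min_repeats:
            runs.append((message, count))
    return runs


if __name__ == "__main__":
    sample = [
        "session start",
        "retry connect",
        "retry connect",
        "retry connect",
        "retry connect",
        "session end",
    ]
    print(repeated_runs(sample))
```

A run that keeps growing while the CPU spike lasts is a good candidate to correlate with the offending thread ID found in Process Explorer.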

  42. Deadlocks • Windows Vista and Server 2008 offer the new Wait Chain Traversal (WCT) API! • This gives applications a mechanism to check internally for wait conditions, and also allows custom tools to be created that can check for application hangs live • No cool WCT tools available? The debugger is your friend! • Attach to the hung process/service and generate a dump for post-mortem analysis: .dump /ma c:\PathToDump\DeadlockedApp.dmp • Manually inspect thread states, and get the debugger's opinion with: !analyze -hang -v • Note: the Windows Task Manager can capture user dumps in Vista and Server 2008
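The idea behind a wait chain can be shown without the native API: follow "thread A waits for thread B" links, and if the chain returns to a thread already seen, the threads are deadlocked. The Python sketch below illustrates that concept only; it does not use the WCT API, and the thread names and the `waits_for` mapping are made up.

```python
from typing import Dict, List


def wait_chain(waits_for: Dict[str, str], start: str) -> List[str]:
    """Follow wait links from start until a thread is not waiting or a thread repeats."""
    chain = [start]
    current = start
    while current in waits_for:
        current = waits_for[current]
        chain.append(current)
        if chain.count(current) > 1:  # came back to a thread already in the chain
            break
    return chain


def is_deadlocked(waits_for: Dict[str, str], start: str) -> bool:
    """A chain that ends on a thread it already contains is a cycle, i.e. a deadlock."""
    chain = wait_chain(waits_for, start)
    return len(chain) > 1 and chain[-1] in chain[:-1]


if __name__ == "__main__":
    waits = {"T1": "T2", "T2": "T3", "T3": "T1", "T4": "T5"}
    print(wait_chain(waits, "T1"), is_deadlocked(waits, "T1"))
    print(wait_chain(waits, "T4"), is_deadlocked(waits, "T4"))
```

The same reading applies when inspecting thread states by hand in a dump: note what each blocked thread is waiting on and who owns it, then look for a loop.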

  43. Slow logons • Understand the logon process and identify the slowdown! • Validate via a network trace that the connection between client and server is good • If the connection makes it to the server, check which processes exist • Use Task Manager and sort by session ID • Gather userdumps for each process in the slow session to try to identify any synchronization problems, such as LPC and ALPC wait chain conditions • Ensure Terminal Services is running (svchost.exe) and that the thread count appears normal • Ensure critical Citrix processes are okay, such as IMA, CpSvc and XML

  44. The XenApp client • PNAgent.exe starts up and communicates with PNAMain.exe to share application launch and shortcut details • PNAMain.exe initiates communication with the Web Server for application requests and config.xml settings • WFCRun32.exe works with WFICA32.exe to launch an application • It is best to use a live-debug approach, as there is no inherent tracing readily available on the client

  45. The XenApp client: For single sign-on problems ensure • PNSSON is at the top of the network provider list • SSONSVR is running • Nothing is causing any logon delays (such as 3rd party monitoring applications), as this would cause the SSON ticket to expire and SSONSVR to exit • Enable a default debugger to look out for any unexpected termination of the client processes

  46. Debugging • User Mode versus Kernel Mode • The Windows operating system can be conceptually divided into 2 parts: • User Space (User Mode) • Kernel Space (Kernel Mode) • Applications run in User Mode • System drivers run in Kernel Mode (Privileged Mode)

  47. [Diagram: user space (user mode) hosting user applications; kernel space hosting rusb2w2k.sys, keyboard.sys, win32k.sys, tcpip.sys] […]
