ITCAM for Application Diagnostics Update
Overview of ITCAM for Application Diagnostics Benefits
Figure labels: Throughput, Health, Resources, Alerts & Take Actions, Problem Determination, Memory Analysis, Application Trace
• Identify performance and availability issues before they impact users
– Enables you to analyze application performance using trend or historical analysis
– Provides key performance metrics to the Tivoli Enterprise Portal to help operations and support teams spot trends and potential delays
• Improve MTTR for business critical applications running on WebSphere or J2EE
– Enables you to view all JEE transactions that are "in-flight" (have not finished execution) to uncover the root cause of bottlenecks and perform detailed memory analysis
– Helps you correlate and profile transactions that span multiple subsystems
– Can be used to set resource or application traps to detect and remedy potentially troublesome situations
– Software consistency checker compares key system and JVM metrics on working and non-working systems to help isolate differences that may be causing problems
• Improved application lifecycle management
– Can exchange information with IBM Rational Performance Tester to help developers understand the performance of applications in test or production
– Provides developers with a diagnostic tool to identify potential issues prior to rollout
ITCAM for Application Diagnostics – What’s New
• Key product enhancements
– Dramatically reduce TTV (time to value) through significant install and config enhancements
– Improved ease of use with summary workspaces and contextual drill down to deep diagnostics
– Improved problem identification in WebSphere VE environments with new monitoring capabilities
– Portfolio simplification for ease of purchase
ITCAM for AD 7.1 Install and Configuration Enhancements
• Combine TEMA/DC install and configuration
• Reduce the number of install panels
• Provide clearer descriptions in the install panels
• Silent install
• Remote deploy
• CAM Configuration Manager (CCM)
Common Diagnostic Scenarios
• The following slides show some common diagnostic scenarios using the enhancements introduced in ITCAM for Application Diagnostics
– Diagnose a slow or hung application using summary workspaces and launch in context
– Diagnose a memory leak using summary workspaces and launch in context
– Ensure the desired service level for jobs processed by Compute Grid using the new WebSphere VE monitoring
Diagnose a slow or hung transaction
• The itcamdemo application is red.
• Flyover shows that the WASHighResponseTime situation was triggered.
• Click on the itcamdemo icon.
• The user can launch in-context into the deep-dive features to examine individual transactions.
• The “Diagnostic In-Flight Request Search” link is used in this flow.
Flow continues on next slide
Diagnose a slow or hung transaction (cont.)
• Managing Server Visualization Engine (MSVE) is displayed in the TEP workspace.
• Server and request information are carried over from the TEP workspace to MSVE via the link.
• Click on the Thread ID to see Stack Trace data of this transaction.
Diagnose a memory leak
• Suspected Memory Leak provides line number information for the application code that may be causing the memory leak.
Key Benefits of WebSphere VE Monitoring in AD 7.1
• ITCAM for AD 7.1 helps to ensure business requirements and SLAs for applications are met
– Monitors ODR health and major KPIs to detect and prevent problems in achieving performance and prioritization goals
– Example: out-of-the-box alert if ODR queue length is high, with contextual drill down to application, deployment targets, and servers to determine the cause of the queue backup
– Deep dive capabilities reduce MTTR (Mean Time To Repair)
– Example: out-of-the-box alert that an application failed to meet its service policy goal, with contextual drill down to the server causing the issue and a transaction trace to identify the method that is the root cause
• ITCAM for AD 7.1 reduces problem isolation and diagnosis time in WebSphere VE environments
– Visualizes WebSphere Virtual Enterprise and Compute Grid topologies
– Contextual drill down from components such as Dynamic Cluster and Individual Server quickly narrows down the problem source
– Example: out-of-the-box alert that a job is running too long, with contextual drill down to job details; deep dive information on the call stack shows the offending method
Summary
• Identify performance and availability issues before they impact users
– Displays overall health and availability of web resources in summary views
– Identifies performance and availability issues proactively using historical data collection and predictive analytics (predictive trending, baselining, dynamic thresholding)
– Helps correlate and profile transactions that span multiple subsystems to isolate the bottleneck
• Improve MTTR for business critical applications running on WebSphere or J2EE
– Reduces troubleshooting effort with launch in context from operational views to deep dive problem determination tools
– Traps on errors such as a slow-running transaction and changes levels automatically to get detailed problem information
– Traces transactions across JVMs for more precise problem determination
– Shows stack trace information for precise problem resolution
– Performs memory leak diagnosis
– Provides a software consistency checker to compare key system and JVM metrics on working and non-working systems to help isolate differences that may be causing problems
• Improved application lifecycle management
– Feeds trapped problem data directly into Rational test tools
• Quicker time to value
– Comes ready to monitor and expand into constantly changing environments
– Automatically monitors any application changes for complete coverage
– Monitors hundreds of JVMs and applications with a single management server without having to trim critical agent data
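The dynamic thresholding named in the summary can be illustrated with a minimal sketch. This is not ITCAM's algorithm: it assumes a simple mean-plus-k-standard-deviations rule, and the function names and sample values are hypothetical.

```python
from statistics import mean, stdev


def dynamic_threshold(history, k=3.0):
    """Derive an alert threshold from historical samples.

    Assumption (not from the slides): threshold = mean + k * sample stdev.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history) + k * stdev(history)


def breaches(history, current, k=3.0):
    """Return True when the current sample exceeds the derived threshold."""
    return current > dynamic_threshold(history, k)


# Made-up response times in seconds, used only for illustration.
baseline = [0.30, 0.32, 0.31, 0.29, 0.33]
slow_sample_breaches = breaches(baseline, 3.71)
normal_sample_breaches = breaches(baseline, 0.32)
```

A fixed threshold would need retuning whenever normal load shifts; deriving it from recent history lets the alert level follow the baseline.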
ITCAM for Transactions November, 2010 IBM Confidential
Workflow for Resolving Composite Application Problems
• Sense: Detect that a threshold has been breached and that a problem occurred, or is about to happen
• Isolate: Pinpoint the problem to a specific part of the environment and hand off to the appropriate specialist
• Diagnose: Drill down into the details and get to the root cause of the problem
• Repair: Fix the faulty component, validate the fix and roll back into production
Figure labels: ITCAM for Transactions; Deep-dive tools (ITM, ITCAM for AD, ITCAM for SOA, OMEGAMONs)
Customer Pain – Sensing and Isolating a Problem Today
Response time is terrible; more than one minute.
• Check all resources: System Alerts, Health Monitors, OS Statistics, Network traffic, Application log files, Database metrics
• Everything looks normal … but performance is still bad
• Bridge Call with Tiger Team
• Locate Source of Problem … maybe …
• Finger-pointing: "It's the network guy's fault"
• Recreating the problem is difficult
• Problem frequently only discovered "by accident"
• Lack of problem isolation capability wastes time, increases MTTR, and costs money
Customer Value – Demonstrating ROI
Every customer case will be different … what do you lose each year due to poor performance?
Composite Application Management and Resource Monitoring
• Monitor application response to ensure business expectations are met
• Understand transaction flows over complex topologies
• Monitor infrastructure performance and availability
• Diagnose application performance issues
• Increase application availability and customer satisfaction
• Improve MTTR and MTBF
Figure labels: IT Staff, Transactions, Applications, Servers, IT Customer
End-to-End Monitoring, Tracking and Diagnosis
1. Response Time Measurement: Start by monitoring transaction performance and end-user problems
2. Transaction Tracking: Correlate data from app server, MQ, CICS, IMS, custom instrumentation, etc. to show topology and isolate problems
3. Deep Dive Diagnostics: Launch in context to SME tools where appropriate. In this scenario, the problem is a WebSphere JEE memory leak.
Transaction Root Cause Analysis
• Sense End User Experience and alert on threshold violation
• Isolate by measuring performance data against baseline through entire infrastructure
• Diagnose and repair through launch-in-context into deep-dive diagnostics
Figure: response times shown in the diagram are 0.01sec, 3.71sec, 0.21sec, 0.97sec, 1.31sec, 1.31sec and 0.32sec
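The "isolate by measuring performance data against baseline" step can be sketched as follows. This is an illustrative sketch only, not how ITCAM computes it: the hop names, the baseline values, and the assignment of timings to hops are all assumptions made for the example.

```python
def isolate_bottleneck(measured, baseline):
    """Return (hop, excess) for the hop that most exceeds its baseline.

    Assumption: each hop has one measured time and one baseline time,
    both in seconds; a hop with no baseline is compared against 0.0.
    """
    excess = {hop: measured[hop] - baseline.get(hop, 0.0) for hop in measured}
    worst = max(excess, key=excess.get)
    return worst, excess[worst]


# Hypothetical per-hop timings in seconds (hop names are invented).
measured = {"web": 0.21, "app_server": 3.71, "mq": 0.32, "cics": 0.97}
baseline = {"web": 0.20, "app_server": 0.50, "mq": 0.30, "cics": 0.90}

hop, excess = isolate_bottleneck(measured, baseline)
```

Comparing against a per-hop baseline rather than raw time avoids flagging a hop that is always slow but behaving normally.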
Problem Isolation Through Transaction Tracking
• Unified, end-to-end transaction tracking
• Heterogeneous environments
– Fully integrated across distributed and System z
• Support for asynchronous transactions
• Extensible, modular framework
• Integrated response time and transaction tracking
Enterprise-Wide Tracking
• Track inside domains with correlated techniques
• Track between domains through stitching
• “Stitching” links correlated sections through dynamic correlation
• Builds topology mappings using token-based and dynamic correlation
Figure labels: WAS Domain, MQ Domain, CICS Domain; Client, Servlet Request, EJB Request, JMS Request, MQ, CICS; DC; Link
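The two tracking modes on this slide can be sketched in a few lines. This is a simplified illustration, not the product's implementation: it assumes events inside a domain share a correlation token, and that sections in different domains are stitched when one section's outbound key matches another's inbound key (for example a shared message id). All field names and values are hypothetical.

```python
from collections import defaultdict


def correlate(events):
    """Group events inside each domain by their correlation token."""
    sections = defaultdict(list)
    for ev in events:
        sections[(ev["domain"], ev["token"])].append(ev)
    return sections


def stitch(sections):
    """Link sections across domains by matching outbound to inbound keys."""
    inbound = {}
    for key, evs in sections.items():
        for ev in evs:
            if ev.get("in_key") is not None:
                inbound[ev["in_key"]] = key
    edges = set()
    for key, evs in sections.items():
        for ev in evs:
            target = inbound.get(ev.get("out_key"))
            if target is not None and target != key:
                edges.add((key[0], target[0]))
    return sorted(edges)


# Hypothetical events: a WAS request puts a message that MQ passes to CICS.
events = [
    {"domain": "WAS", "token": "t1", "out_key": "msg-42"},
    {"domain": "MQ", "token": "q7", "in_key": "msg-42", "out_key": "msg-43"},
    {"domain": "CICS", "token": "c3", "in_key": "msg-43"},
]
topology = stitch(correlate(events))
```

The resulting edge list is the kind of input a topology view needs: each pair is a link between two domains.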
Transaction Tracking Topology
• Green arrow indicates start node
• Red “hot spot” indicates bottleneck
• Synchronous transactions
Drill Down In Context
• Launch-In-Context allows the SME to quickly and easily drill down to the problem
• Reduces MTTR
Figure labels: ITCAM for Transactions, WAS Deep-dive, OMEGAMON XE for Messaging, ITCAM for Application Diagnostics
Why Monitor End-User Response?
A majority of IT problems are still being identified by customer complaints
• See what your users are experiencing
• Validate production system performance
• Identify problems before they affect SLAs
• If you have a problem, find out about it before the customers start complaining
Two Techniques for Response Time Monitoring
• Real End User Transactions
– Web Response Time Monitoring: monitors actual customer experience; agentless solution
– Client Response Time Monitoring: monitors real-user client desktop applications; detailed response measurement for VIP customers
• Robotic Transactions
– Robotic Response Time Monitoring: repeatable testing of high-priority transactions; early warning of failures or performance problems
– Internet Service Monitoring: periodic testing of services that make systems run; simple and lightweight
Real User Monitoring
• Web Applications - Agentless
– Captures performance and availability data of actual users for SLA reporting
– Completely non-invasive, agentless monitoring
– Monitors network traffic for HTTP(S) requests to the web server
• Windows Applications - Agent
– Monitors selected Windows applications
– Agent on client workstation provides detailed response time analysis
Figure labels: Client, Network, Server, Web Server, AppServer, “Click”, Total Transaction Time, Measure
Internet Service Monitors - Protocols Monitored
• DHCP - Dynamic Host Configuration Protocol (RFC 2131)
• DIAL - Dial up Service
• DNS - Domain Name System (RFC 1035)
• FTP - File Transfer Protocol (RFC 959)
• HTTP - Hypertext Transfer Protocol (RFC 1945)
• HTTPS - HTTP over Secure Sockets Layer (RFC 1945)
• ICMP - Internet Control Message Protocol (RFC 792)
• IMAP4 - Internet Message Access Protocol (RFC 2060 & 822)
• LDAP - Lightweight Directory Access Protocol (RFC 2251)
• NNTP - Network News Transfer Protocol (RFC 977 & 850)
• NTP - Network Time Protocol (RFC 2030)
• POP3 - Post Office Protocol (E-mail) (RFC 1081 and 822)
• RADIUS - Remote Authentication Dial-In User Service (RFC 2138 and 2139)
• RPING - Remote Ping for Cisco and Juniper Routers
• RTSP - Real Time Streaming Protocol (RFC 2326)
• SAA - Cisco Service Assurance Agent
• SNMP - Simple Network Management Protocol (RFC 1441-1452, 1901-1908 & 275)
• SMTP - Simple Mail Transfer Protocol (RFC 821 & 822)
• TCP PORT - Transmission Control Protocol
• TFTP - Trivial File Transfer Protocol (RFC 1350)
• TRANSX - Transaction Monitor
• WMS - Microsoft Windows Media Server
Recent additions:
• SIP - Session Initiation Protocol (RFC 3261)
• SOAP
• SNMP v3
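The idea behind a periodic service test such as the TCP PORT monitor can be shown with a short sketch. It is not the Internet Service Monitoring implementation: it only assumes that a probe opens a TCP connection, records whether it succeeded, and measures how long the attempt took. The function name and default timeout are hypothetical.

```python
import socket
import time


def probe_tcp_port(host, port, timeout=2.0):
    """Attempt one TCP connection; return (reachable, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False, time.monotonic() - start
```

A scheduler would call `probe_tcp_port` at a fixed interval and raise an alert when the service is unreachable or the elapsed time exceeds a chosen limit.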
ITCAM for Transactions - Current Domain Coverage
• WebSphere 5/6/7 tracking supported through BCI technology embedded in ITCAM for AD – distributed and z/OS
• Non-WAS JEE support (WebLogic, JBoss, Tomcat, SAP NetWeaver)
• MQ 5.3 and up tracked by ITCAM for Transactions natively – distributed and z/OS
• CICS 2.3+ transactions and services, including any CICS hosted applications (C++, COBOL, Natural, etc.)
• CICS IMSDB
• CICS 4.1 SOAP support
• CICS Transaction Gateway (CTG) 7.1+
• IMS, including IMS Connect and IMS Batch
• WebSphere Message Broker v6.0 (distributed)
• JDBC tracking through WAS (supports all databases)
• DB2 tracking from CAMfCICS and CAMfIMS
• Tuxedo Server (FML32 over ATMI) v9/10
• MQI Client (used to enable Tuxedo to MQ)
Figure labels: IBM WAS, Other JEE, MQ 5/6/7, CICS, CTG, IMS, WMB, Database, Tuxedo, MQI
Current Domain Coverage (cont.)
• Integrated Service Tracking support through ITCAM for SOA
– WebSphere ESB
– WebSphere Process Server
– WebSphere CE
– WebSphere DataPower
– .NET Web Services
– WebLogic
– AXIS
– CICS Web Services
– SAP NetWeaver
• ARM 2.0/4.0 instrumentation supported via native library linkages (libarm)
• Siebel SARM
• Non-BCI WAS tracking (ARM based)
• Customer instrumentation possible through our published Transaction Tracking API (TTAPI), available for a range of languages on both distributed and z/OS systems. Current language bindings include:
– C, C++, Java (distributed)
– C, C++, Java, COBOL, PL/I, Assembler (z/OS, including CICS)
– .NET
Figure labels: TTAPI, SOA, ARM, Siebel, IBM WAS
THANKS