Ensemble Performance Troubleshooting
Three Problem Areas • Message Searching • Management Portal • Message Processing and Delivery
Message Searching • Some searches will always take a long time • Routine searches should always be quick • Why are queries slow? • Ensemble generates a bad query plan • The constraints are too broad • You don’t have an appropriate search table
Finding the SQL query • Set ^Ens.Debug("UtilEnsMessages","sql")=1 • New button ‘Show Query’ • Use the SQL management page to determine the execution plan • Warning: if the page times out, this will be the wrong query! • If the page times out, you must look for the cached query • In 2014.2 it will show multiple recent queries even if the page times out
Looking at the cached query • ‘Keep source for cached queries’ must be turned on • Purge cached queries for Ens.MessageHeader in your namespace • Execute your search • Look at cached query and execution plan
When you have the query plan • If the plan is not very good • Report it to InterSystems!!!! • Tune table probably won’t help • Try varying the query – e.g. don’t only search session starts • If the plan is good and you are searching a lot of data • Narrow criteria • Consider adding additional search table fields • If the query takes longer than expected • See Vik’s presentation
Management Portal • If everything is slow • Look for system problems (PC, server or network) • If certain things are slow • Large Rules, Large Productions, Large DTL • Observe pattern of CPU activity on client and server • Not easy on a busy system • If CPU is used on server, use ^PERFMON and ^%SYS.MONLBL • Firebug shows JS callback to server
Understand The Workload • Run ^pButtons routinely • Capture message statistics routinely • Low message rates (~10 k per day) – SQL • High message rates (1 M per day) – custom report • Record disk space routinely
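As a minimal sketch of routine message statistics, the Python below counts messages per calendar day from a list of creation timestamps. The function name `messages_per_day` and the sample values are hypothetical; in practice the timestamps would come from whatever SQL query or custom report is used to export them.

```python
from collections import Counter

def messages_per_day(time_created_values):
    """Count messages per calendar day from 'YYYY-MM-DD HH:MM:SS' strings."""
    # The first ten characters of each timestamp are the date part.
    return Counter(value[:10] for value in time_created_values)

# Hypothetical sample of exported creation times.
sample = [
    "2014-03-01 09:15:00",
    "2014-03-01 17:40:12",
    "2014-03-02 08:00:03",
]
for day, count in sorted(messages_per_day(sample).items()):
    print(day, count)
```

Keeping these daily counts over time gives a baseline to compare against when a slowdown is reported.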
Message Processing and Delivery • High Resource Usage? • Long Internal Latency? • Long External Latency?
Three approaches • Analyze the message flow • Analyze the system • Guess
[Diagram, built up over several slides: an Ensemble production. Inbound Adapters feed Business Services, which pass messages to Business Processes and Business Operations, which use Outbound Adapters. Each message consists of a Message Header and a Message Body.]
Visual Trace • Gives time of message at each stage • Identifies time spent in each BP • Identifies external latency for synchronous requests to BO • Time differences do include queuing
Message Header Properties and SQL • Source • Target • Message type • Time Created • Time Processed • Direction • Message ID • Session ID • …. • SELECT ID, TimeCreated, TimeProcessed FROM Ens.MessageHeader WHERE SessionID=45972
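The query above returns one row per message header in the session. As a minimal sketch, the Python below takes such rows and reports the gap between TimeCreated and TimeProcessed for each header, slowest first. The helper name `header_durations`, the sample rows, and the assumption that timestamps arrive as `YYYY-MM-DD HH:MM:SS.fff` strings are all illustrative. As noted for the Visual Trace, these differences include queuing time.

```python
from datetime import datetime

# Assumed timestamp layout of the exported rows.
FMT = "%Y-%m-%d %H:%M:%S.%f"

def header_durations(rows):
    """Return (id, seconds from TimeCreated to TimeProcessed), slowest first."""
    result = []
    for msg_id, created, processed in rows:
        delta = datetime.strptime(processed, FMT) - datetime.strptime(created, FMT)
        result.append((msg_id, delta.total_seconds()))
    return sorted(result, key=lambda r: r[1], reverse=True)

# Hypothetical rows in the shape (ID, TimeCreated, TimeProcessed).
rows = [
    (45972, "2014-03-01 10:00:00.000", "2014-03-01 10:00:00.050"),
    (45973, "2014-03-01 10:00:00.050", "2014-03-01 10:00:02.550"),
    (45974, "2014-03-01 10:00:02.550", "2014-03-01 10:00:02.600"),
]
for msg_id, seconds in header_durations(rows):
    print(msg_id, seconds)
```

The header at the top of the list is the first place to look in the Visual Trace.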
Business Services Business Processes Business Operations
Time not included • Inbound adapter • Business Service Execution • Asynchronous Business Operation Execution
Other Sources of Web Timing • ^%ISCLOG • CSP Gateway Log
External Latency • Most common cause of poor throughput • Characterized by • Extended time in session trace • low CPU usage in BO • Can be helped by increasing pool size • Only if external resource can be run in parallel
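A rough back-of-the-envelope sketch of why pool size can help with external latency: if each job spends nearly all of its time waiting on the external system, the ceiling on throughput is approximately the pool size divided by the latency per message. The function name and figures are illustrative assumptions, and the model holds only if the external resource really can serve requests in parallel.

```python
def max_throughput_per_second(pool_size, external_latency_seconds):
    """Upper bound on messages/second when every job blocks on the external call."""
    if pool_size < 1 or external_latency_seconds <= 0:
        raise ValueError("pool_size must be >= 1 and latency must be positive")
    return pool_size / external_latency_seconds

# Hypothetical external system that takes half a second per request.
for pool in (1, 2, 4):
    print(pool, max_throughput_per_second(pool, 0.5))
```

If the measured throughput does not rise as the pool grows, the external system is probably serializing the requests and a larger pool will not help.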
Internal Latency • Using or waiting on a resource • What is the CPU usage? • Look for lock contention • Examine system wide performance statistics
Queues • Queuing is a symptom not a cause • Queuing across the board means a general resource is constrained • Localized Queuing highlights which component can’t handle the throughput
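The distinction between localized and across-the-board queuing can be sketched as a simple classification over a snapshot of queue depths. This is only an illustration: the function name, the component names, and the threshold of 100 are arbitrary assumptions, and the snapshot would have to be gathered from the production's queue monitoring.

```python
def classify_queuing(queue_depths, threshold=100):
    """Classify a snapshot mapping configuration item name -> queue depth."""
    backed_up = sorted(
        name for name, depth in queue_depths.items() if depth >= threshold
    )
    if not backed_up:
        return "none", backed_up
    if len(backed_up) == len(queue_depths):
        # Every item is queuing: suspect a shared resource such as CPU or disk.
        return "across the board", backed_up
    # Only some items are queuing: they are the components to investigate.
    return "localized", backed_up

# Hypothetical snapshot with one slow operation.
snapshot = {"FileService": 0, "RoutingProcess": 3, "HL7Operation": 2500}
print(classify_queuing(snapshot))
```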
Pool Size greater than 1 • Multiple jobs obscure the picture • FIFO will not be guaranteed • Only helps if resource can be shared • Don’t increase unless you have a good reason
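Why FIFO is lost with a pool size above 1 can be shown with a small simulation. This is a toy model, not Ensemble's scheduler: jobs take messages from a single queue in arrival order, and each message has a hypothetical processing time. With one job the completion order matches the arrival order; with two jobs a slow message is overtaken by the ones behind it.

```python
import heapq

def completion_order(processing_times, pool_size):
    """Return message indexes in the order they finish processing.

    processing_times[i] is how long message i takes; messages are handed
    out in arrival order to whichever job becomes free first.
    """
    # Each heap entry is (time the job becomes free, job number).
    jobs = [(0.0, j) for j in range(pool_size)]
    heapq.heapify(jobs)
    finished = []
    for index, duration in enumerate(processing_times):
        free_at, job = heapq.heappop(jobs)
        done_at = free_at + duration
        finished.append((done_at, index))
        heapq.heappush(jobs, (done_at, job))
    return [index for _, index in sorted(finished)]

times = [5.0, 1.0, 1.0]  # hypothetical: message 0 is slow
print(completion_order(times, 1))
print(completion_order(times, 2))
```

If downstream systems depend on message order, this reordering is a reason to leave the pool size at 1.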
Identifying Busy Configuration Items • Use OS tools to identify busy jobs • Use Ensemble Management portal to link to configuration items
Drilling into the system usage • ^PROFILE • ^PERFMON • Where is the application spending time? • ^%SYS.MONLBL • Look at time ‘line by line’
Key Points • Capture metrics all the time • Use Caché tools to verify the system health • Characterize the problem • Drill down into specific issues
Questions? • You can reach me at loveluck@intersystems.com