
Service Manager Performance Tuning Deep Dive

Service Manager Performance Tuning Deep Dive. Eric Krueger, Principal Consultant, StrataCom, Inc. Who Am I?. ServiceCenter/Service Manager consultant since 2001 Prior Software Artistry partner Former Oracle and Sybase DBA Former AIX administrator. Agenda. Hardware overview Memory issues


Presentation Transcript


  1. Service Manager Performance Tuning Deep Dive • Eric Krueger • Principal Consultant • StrataCom, Inc.

  2. Who Am I? • ServiceCenter/Service Manager consultant since 2001 • Prior Software Artistry partner • Former Oracle and Sybase DBA • Former AIX administrator

  3. Agenda • Hardware overview • Memory issues • Database architecture and tuning • SAN and RAID • Common Development and Configuration mistakes that can cause problems • Trace logs and how to read them

  4. Hardware • Hardware requirements • Memory: 2 GB per servlet • Each servlet supports 50 users • 100 users per servlet on 64-bit Windows (per HP) • 1-2 GB for the OS • 100 concurrent users on 32-bit Windows would require 2 servlets (4 GB) + 2 GB for the OS = 6 GB • Processors: HP SM is not very processor intensive, but you’ll need multiple cores. I would recommend a minimum of 4 cores for 100 concurrent users.
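The sizing rule on this slide is simple arithmetic, sketched below in Python. The function name and its defaults are illustrative only; the numbers (50 users and 2 GB per servlet, up to 2 GB for the OS) are the rules of thumb quoted on the slide, not measured values.

```python
import math


def size_sm_server(concurrent_users, users_per_servlet=50, gb_per_servlet=2, os_gb=2):
    """Return (servlets, total_gb) using the slide's rules of thumb.

    The defaults mirror the slide's 32-bit Windows figures and are
    assumptions for illustration, not official sizing guidance.
    """
    servlets = math.ceil(concurrent_users / users_per_servlet)
    total_gb = servlets * gb_per_servlet + os_gb
    return servlets, total_gb


# 100 concurrent users on 32-bit Windows: 2 servlets, 6 GB
print(size_sm_server(100))
# Same load with the 100-users-per-servlet figure quoted for 64-bit Windows
print(size_sm_server(100, users_per_servlet=100))
```

Rounding up with `math.ceil` matters: 101 users at 50 per servlet needs a third servlet, and therefore another 2 GB.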

  5. Memory requirements • Vertical Scaling • Multiple servlets on a single machine • Mandatory for more than 50 concurrent users if not horizontally scaled. • Limited only by RAM on the machine and possibly the number of processes (but you’ll probably run out of memory first)

  6. Memory requirements • Horizontal Scaling • Vertical scaling, but across multiple servers • Good for larger implementations • Good for fail-over • Same memory requirements as vertical scaling • Additional component needed: the load balancer

  7. Other Memory requirements in SM • Shared memory setting • Shared memory is used to cache commonly used items • Also used to cache IR files • Having more shared memory will not improve performance BUT • Having too little shared memory WILL negatively affect performance

  8. So how do I tell if I have memory issues? • General system slowness under load is normally a symptom of memory starvation (or db issues) • On Windows machines, you can tell if you are low on memory by simply looking at the Task Manager (Windows 2008 has much better tools for looking at memory than Windows 2003 Server) • Follow the 2 GB per servlet rule for best performance

  9. So how do I tell if I have memory issues, cont. • Servlets will start to crash if there is not enough physical memory. • If you start losing servlets randomly, you are probably running into memory issues • Servlets are Java containers so there is a hard 2 GB limit. • Older versions of SM don’t manage memory very well inside the servlet. This is really cleared up in late patches of SM 7.11 and 9.20

  10. Adding Shared Memory • Run sm -reportshm under heavy load • A large discrepancy between free and unused space (>20%) indicates a problem. • If the unused space falls below 25%, HP Software recommends that you increase the amount of shared memory defined in the sm.ini file while the system is down • Use http://www.t1shopper.com/tools/calculate/ to calculate MB from that big number in sm.ini • e.g. shared_memory:32000000 = 30.5 megabytes
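The conversion in the last bullet does not need an online calculator. The sketch below assumes the shared_memory value in sm.ini is a byte count, which is consistent with the slide's 32000000 = 30.5 MB figure; the helper names are made up for illustration, and the 25% threshold is the one quoted on the slide.

```python
BYTES_PER_MB = 1024 * 1024


def shared_memory_mb(setting_bytes):
    """Convert an sm.ini shared_memory value (assumed to be bytes) to MB."""
    return setting_bytes / BYTES_PER_MB


def shared_memory_bytes(megabytes):
    """Convert a target size in MB to the number to put in sm.ini."""
    return int(megabytes * BYTES_PER_MB)


def should_increase(unused_bytes, total_bytes, threshold=0.25):
    """True when unused space falls below the 25% guideline from the slide."""
    return unused_bytes / total_bytes < threshold


print(round(shared_memory_mb(32000000), 1))  # the slide's example, in MB
print(shared_memory_bytes(64))               # byte value for a 64 MB setting
print(should_increase(6000000, 32000000))    # under 25% unused
```

The figures passed to `should_increase` have to be read off the sm -reportshm output by hand; this sketch does not attempt to parse that report.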

  11. Enough Shared Memory

  12. Houston, We have a problem

  13. Final thoughts on memory • 32-bit versions of Windows have a hard 4 GB memory limitation (a 32-bit address space issue) and must do contortionist memory tricks to use more than 4 GB of RAM, so 64-bit Windows is recommended • Memory is cheap, you can’t have too much

  14. Database Tuning • P4 vs. SQL Architecture • Various ways to push Array fields • Indexing • How SQL indexes work • When Service Manager forces full table scan

  15. P4 vs. SQL Architecture • The Service Manager data layer was originally built on the P4 database • P4 was file based • ISAM database • Extremely fast record retrieves because there were no joins. One db call retrieved the entire record. • Instead of relational “look up tables”, P4 stored those tables INSIDE the record structure

  16. Example of structures inside of record: • Operator record contains • Approval groups (array of characters) • Review groups (array of characters) • So when ServiceCenter or Service Manager 7.0 retrieves records out of the P4 filesystem (not pushed to SQL) there is no “left outer join” to an approval groups or review groups table. Extremely efficient and fast. A single retrieve is performed instead of multiple (dozens, hundreds?) of retrieves.

  17. What happens to those arrays in SQL Push? • The structures are pushed out one of 5 ways: • Blob in main table: stored as a binary object, which mirrors how it is stored in P4; very fast retrieves • Blob in alias table: stored as a binary object, but in an alias table, e.g. operatora1 • Field in main table: stored as a text object in the main table. Still pretty fast for retrieves, but some parsing must be done at the data layer • Field in alias table: stored as a text object in an alias table, similar to blob in alias table • Multi-row array table: each array entry gets its own row in an alias table. Most like the SQL model for our example.

  18. Thanks for the useless background data on database retrievals. Why is my system so slow!!!!!?????

  19. Most major performance issues are database mapping issues. • Common complaints from SM users on slowness that is db mapping related: • Login times • Changes I need to approve • Searches for workgroups I belong to • Operators with ‘x’ capability word • Eventin records processed on this date

  20. All issues are related to fields stored in Main table • If you run a query from Service Manager, i.e. “look for a record” and the value you are looking for is not available to the db engine…what happens? • A full table scan of the records by the database engine? • An index scan instead of an index seek? • A retrieval of every record in the table by Service Manager, even if there are 1,000,000 records in the table?

  21. A retrieval of every record in the table by Service Manager, even if there are 1,000,000 records • Why? • The database engine cannot search on fields that are stored as binary data in the table. So, for example, if you store the Approval Groups in the operator table as a blob field, if you search for Approval Group Xyz, the Service Manager data layer is smart enough to know the db will not be able to find that data, so it issues a query that will return every row so Service Manager can take that field apart and make a list of each Operator where Xyz group exists.

  22. How do I fix that? • Remap the Approval Group array field in the dbdict to a ‘multi-row array table’ • Simply open the dbdict and redo the mappings • DO NOT use the SQL to SQL mapping! • Must have everyone logged out to do it this way • If you map the entire table to multi-row array table, it will map every array to a multi-row array table, including fields that are text fields! If these are mapped this way, every line in the text field will get its own db entry (I have seen this happen!). So a cut/paste of an email with 500 lines would result in 500 rows being added to a lookup table.
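The difference between a blob mapping and a multi-row array table can be sketched with Python's built-in sqlite3 module. Everything here is a stand-in: the table names operator_m1 and operator_a1 are hypothetical, JSON bytes play the role of Service Manager's binary array format, and SQLite plays the role of the production RDBMS. The point is only the access pattern: with the blob, the application has to fetch every row and unpack it; with the array table, the database can filter on an indexed column.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical main table: the array is stored as an opaque blob.
conn.execute("CREATE TABLE operator_m1 (name TEXT PRIMARY KEY, approval_groups BLOB)")
# Hypothetical multi-row array table: one row per array entry.
conn.execute("CREATE TABLE operator_a1 (name TEXT, approval_group TEXT)")
conn.execute("CREATE INDEX operator_a1_group ON operator_a1 (approval_group)")

operators = {"alice": ["Xyz", "CAB"], "bob": ["CAB"], "carol": ["Xyz"]}
for name, groups in operators.items():
    blob = json.dumps(groups).encode("utf-8")  # stand-in for the binary format
    conn.execute("INSERT INTO operator_m1 VALUES (?, ?)", (name, blob))
    conn.executemany("INSERT INTO operator_a1 VALUES (?, ?)",
                     [(name, g) for g in groups])


def find_in_blob(group):
    """Fetch every row, then unpack each blob in the application."""
    rows = conn.execute("SELECT name, approval_groups FROM operator_m1").fetchall()
    matches = sorted(n for n, b in rows if group in json.loads(b.decode("utf-8")))
    return len(rows), matches


def find_in_array_table(group):
    """Let the database filter on the indexed array column."""
    rows = conn.execute(
        "SELECT name FROM operator_a1 WHERE approval_group = ? ORDER BY name",
        (group,),
    ).fetchall()
    return len(rows), [r[0] for r in rows]


# Each call returns (rows fetched from the database, matching operators).
print(find_in_blob("Xyz"))
print(find_in_array_table("Xyz"))
```

With three operators the blob search fetches all three rows to find two matches; with 1,000,000 operators it would fetch all 1,000,000, which is the behavior described on slide 21.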

  23. How do I figure out if I’m having that issue? • The alert log is a good place to look. The message will look something like: • “Out of 567 records checked only 5 matched the query” • debugdbquery:x, where x is the amount of time in seconds a query is allowed to run before it is reported in the log. 1 second is a pretty long time for a query. • Check code for queries, JavaScript code, views, etc. for items that are included in the ‘where’ clause and are mapped to binary fields in SM.

  24. Other Database performance related issues • Indexes: in MS SQL Server, indexes are not created as clustered. MS SQL performs much better if primary keys are created as clustered indexes. • In a normal (nonclustered) index, two operations are performed: an index seek or scan, then a row ID (RID) lookup where the db engine locates the record in the db via a pointer in the index. The RID lookup can take longer than the index seek or scan • Clustered indexes sort the data in the table according to the index. The index is searched and no RID lookup is performed.

  25. More Database related issues • Oracle indexes: • Oracle indexes are created as Unique, but not Unique and ‘not null’ • This causes some issues with primary keys and full table scans. For example: • DBFIND^F^contacts(oracle10)^1^0.000000^ ^0^2.668000^"contact.name=NULL"^ ^0.000000^0.000000 ( [ 1] format.cque calc.queries.2.select ) • Wait a minute…2.668 seconds to retrieve no records from the contacts table? • contact.name=NULL = full table scan • contact.name is NULL = index seek
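Lines like the DBFIND entry above can be picked apart with a few lines of Python, which makes it easier to scan a large trace for slow queries. This is a rough sketch: the field positions (table in the third caret-separated field, elapsed seconds in the eighth, query in the ninth) are inferred from this one sample line and are an assumption, not a documented log format. The 1-second threshold is the "pretty long time for a query" figure from slide 23.

```python
SAMPLE = ('DBFIND^F^contacts(oracle10)^1^0.000000^ ^0^2.668000^'
          '"contact.name=NULL"^ ^0.000000^0.000000 '
          '( [ 1] format.cque calc.queries.2.select )')


def parse_db_trace(line):
    """Split a caret-delimited DB trace line into a few named fields.

    Field positions are guessed from the single sample on the slide.
    """
    fields = line.split("^")
    query = fields[8].strip().strip('"')
    return {
        "operation": fields[0],
        "table": fields[2].split("(")[0],
        "seconds": float(fields[7]),
        "query": query,
        # "=NULL" comparisons are the pattern the slide warns about
        "equals_null": "=NULL" in query.replace(" ", "").upper(),
    }


def is_slow(entry, threshold_seconds=1.0):
    """Flag entries that ran longer than the chosen threshold."""
    return entry["seconds"] > threshold_seconds


entry = parse_db_trace(SAMPLE)
print(entry)
print(is_slow(entry))
```

Filtering a trace file for entries where both `is_slow` and `equals_null` are true is a quick way to find candidates for rewriting as "is NULL" queries.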

  26. A note on general DB performance • A large majority of your queries should be pulling either from shared memory on the SM Server or out of cache on the db server. You should be seeing a lot of this in your trace logs: • DBACCESS - Cache Find against file scmessage found 1 record • This means you pulled a record out of shared memory instead of going to the db

  27. Help! My app is slowing down during busy times of the day! • Well, we’ve already talked about the memory issue, so I’m going to assume that is good to go. • The 2nd most common cause is disk latency on the db server • Typically hosted on SAN • Probably bullet-proof, but how did they configure the drives? • Check ‘average disk queue length’. If you are running over ‘2’ per disk, you are in trouble

  28. SAN Disk Configuration Options • Which disk configuration is ideal for database applications? • RAID 0 (striping-data spread across multiple drives) • RAID 1 (Mirroring-data copied to a redundant drive) • RAID 5 (Striping with Parity-data spread across multiple drives with a parity bit stored) • RAID 6 (Striping with redundant parity-2 copies of parity stored) • RAID 1+0, also known as RAID 10 (striped and mirrored)

  29. To Parity or not to Parity • RAID 5 and RAID 6 disk arrays offer a high level of redundancy and protection for disk arrays. • In a 10 drive RAID 5 array, the data is spread across 9 drives and the 10th drive holds the parity bit. • RAID 6 uses 2 drives to hold parity • Parity: 101000101=4 • Parity: 1x1000101=4 • X=?
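In the slide's puzzle the surviving bits already contain four 1s, so X must be 0. Real RAID 5 arrays compute parity with XOR across the blocks of a stripe rather than by counting bits, but the recovery idea is the same. The sketch below uses three tiny made-up "blocks" (plain integers) to show how a lost block is rebuilt from the survivors plus parity; it illustrates the principle only and is not a model of any particular controller.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR all blocks together; this is the parity block for the stripe."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


# Three illustrative data blocks in one stripe (values are arbitrary).
data = [0b1010, 0b0110, 0b0001]
parity = xor_parity(data)

# Simulate losing the second drive: rebuild its block from the rest.
survivors = [data[0], data[2]]
rebuilt = xor_parity(survivors + [parity])

print(format(parity, "04b"))   # parity block
print(format(rebuilt, "04b"))  # matches the lost block, 0110
```

Every write to a data block also forces the parity block to be recomputed and rewritten, which is the extra work behind the write-performance complaint on the next slide.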

  30. Math is hard! • RAID 5 and RAID 6 should never, ever, ever be used for database applications. • RAID 1+0 is ideal. No math. • Yes, I have had clients with RAID 5 arrays on, for example, an EMC CLARiiON SAN. • “It’s the cheapest option”…You get what you pay for. • RAID 5 or 6 (even worse) will suffer under heavy write loads as the parity is calculated. The application will slow to a crawl as database queries await their turn at the disk I/O plate.

  31. Common Mistakes that slow down Service Manager Clients • Queries (or other functions) on Display in Format Control • Running IR Expert in ‘instant’ mode instead of background mode • Drop-down boxes with large value and display lists • Virtual Joins to large record lists

  32. Queries on display • ‘dis’ in Format Control is ‘display’. This is called every time the form is refreshed. • If you have code in ‘dis’ it will be called: • When the format is first displayed • Every time you use a ‘fill’ button • Every time you click the ‘save’ button and the form is reloaded. • Don’t use it unless you absolutely have to!

  33. IR Expert • Should be run in the background • For every record that has an IR Key, this will add a lot of time if run in foreground • I have seen this add 20 seconds to save times

  34. Drop-down boxes with large value and display lists • The web client AND the windows client both receive XML documents from the Service Manager Server. • Because it’s XML, a value/display list with 1000 entries is 1000 lines in the XML document. • This can add significant time to screen redraws as the time to move the data from the server to the client can take a long time. • I have seen this take 30 seconds when someone added a $G.operator with 8000 operators to a drop down field on search screen (“why does it take my change search screen 30 seconds to display?”)

  35. Virtual Joins to large record lists • If you have a virtual join on a form and it joins to another table, be careful. • For example, joining the device table to the assignment.g format to display all the devices a group manages can pull back thousands of records • This basically crashes Service Manager. • Fixed in SM 7.11 patch 15: new parameter to limit VJ results. Available in SM 9.20 patch 1.

  36. How to use logging • Set up a custom listener: sm.exe -httpPort:12345 -debugnode:1 -RTM:3 -debugdbquery:999 -msglog:1 -log:trace_file.log • Let’s look at some log examples • Use TextPad for easy searching • Search for dbfind, dbquery, dbinsert, dbupdate

  37. Questions? WWW.VIVIT-WORLDWIDE.ORG
