Numbers, We don’t need no stinkin’ numbers Adam Backman Vice President DBAppraise, Llc
About the Presenter • Progress user from when dinosaurs roamed the earth (nearly) • President - White Star Software • Consulting: performance, coding, problem solving • Training: Programming, System and Database Administration • Vice President – DBAppraise • Managed database services
Agenda • Why is performance important? • Components of performance • Perception vs. reality • Who is most important?
Why is performance important? • Time is literally money • Many idle hands cost real money • A delayed customer is a lost customer • Delayed support equals lost confidence
Put a value on performance • Users wait 10 seconds • Does not sound bad • Users do the operation many times per day • 10 seconds per transaction, 10 per hour, 8 hours a day: 800 seconds wasted per user per day • About 13 minutes wasted per user per day, times the number of users (500 users) • That is over 100 hours of wasted time per day
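The arithmetic on this slide can be checked with a short sketch. The figures are the ones from the slide; the variable names are only illustrative.

```python
# Figures taken from the slide; variable names are illustrative.
seconds_per_wait = 10
waits_per_hour = 10
hours_per_day = 8
users = 500

wasted_seconds_per_user = seconds_per_wait * waits_per_hour * hours_per_day
wasted_minutes_per_user = wasted_seconds_per_user / 60
wasted_hours_all_users = wasted_seconds_per_user * users / 3600

print(wasted_seconds_per_user)            # 800
print(round(wasted_minutes_per_user, 1))  # 13.3
print(round(wasted_hours_all_users, 1))   # 111.1
```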
Components of performance • Network • Disk • Memory • CPU • Goal: Push the bottleneck to the fastest resource
Network • Slowest resource • Temp files going to network drive • Need to minimize traffic • -Mm (Remember to increase frame size everywhere when increasing -Mm) • -Mn, -Mpb, -Mi, -Ma
Disk • Most frequent offender • People focus on wrong metrics • Queue depth and service time are generally good indicators of congestion
Memory • Move things off disk into memory • -B (DB to shared memory) • -Bt (temp disk files to temp buffers) • OS and Disk array caches
CPU • The “right” type of CPU activity • User – what you paid for • System – System overhead • Wait – Waiting on I/O (What type of I/O) • Idle – You need idle but having zero does not mean there is an issue
Numbers are good but … • Performance stinks • Performance is perception • User experience is king
First, look for record locks or other application issues • ProTop has a screen for blocked sessions • Record locks can completely stop activity • The user sees record in use by …. • The administrator does not • Additionally, I look for very high db requests from a single connection
ProTop: Blocked Sessions
Blocked Sessions
  Usr Name         Note
----- ------------ ---------------------------------------------
   24 tom          REC XQH 102 [Order] Adam
Promon: Block Access
Block Access:
Type  Usr  Name      DB Requests  DB Reads  DB Writes  BI Reads  BI Writes
Acc   999  TOTAL...   6415644367  54341274       6284    423828     521056
Acc     0  adam           165004      1317       5657      1245      93125
Acc     5  adam                1         0          0    191480          6
Acc     6  adam                1         0          0    184629          7
Acc     7  dbapprai      3549613         1          0         0          0
Buffer Hit Percentage • Generally a good metric • But … • A single scan of a small table can vastly skew results • Buffer hit percentage at low volume is nearly meaningless
How to Make Buffer Hit Rate Useful • Know which tables are being read • Large tables • Small tables • Know what is “normal”
How to Make Buffer Hit Rate Less Useful • Bring up promon activity screen and only use the first sample • Use really small sample sizes (seconds vs. minutes) • Use really large sample sizes (hours vs. minutes)
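A minimal sketch of why sample size matters, assuming the common definition of buffer hit percentage as the share of database requests that did not need a disk read. The function name and both sets of sample numbers are hypothetical, invented purely for illustration.

```python
def buffer_hit_pct(db_requests: int, db_reads: int) -> float:
    """Percentage of database requests satisfied without a disk read.

    Assumes hit % = 100 * (requests - reads) / requests.
    """
    if db_requests == 0:
        return 0.0  # no activity in the sample: the ratio means nothing
    return 100.0 * (db_requests - db_reads) / db_requests

# Hypothetical samples, not measurements from any real system.
# A few quiet seconds in which one small, uncached table was scanned:
quiet_sample = buffer_hit_pct(db_requests=200, db_reads=50)
# Several busy minutes of normal activity:
busy_sample = buffer_hit_pct(db_requests=2_000_000, db_reads=20_000)

print(f"{quiet_sample:.1f}%")  # 75.0%
print(f"{busy_sample:.1f}%")   # 99.0%
```

Fifty disk reads are trivial, yet in the quiet sample they drag the percentage down to a number that looks alarming. The same figure only carries weight when the request volume behind it is known.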
Benchmarks Lie • They do not test real-world workloads • All read (Readprobe) • All write (ATM) • Wrong mix of read and write • Time slicing can make results more attractive
CPU – Wait • The CPU always blames everyone else • If you have wait and idle it is generally no issue • If you have wait and no idle you likely have an issue. Look at disk first
CPU - Idle • If you have a single core then a single program can use 100% of the CPU • This is a good thing. The process will use its CPU and complete
The network is never more than 10% busy • Every network admin in the world uses this line • They get this from the manufacturers • They sample and provide a single sample for a large time frame. • How about 100% busy 10% of the time
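A small sketch of how a single averaged figure hides saturation. The per-minute utilization samples are hypothetical, chosen so that the link is fully busy for 10% of the hour.

```python
# Hypothetical per-minute utilization samples (percent) over one hour:
# the link is saturated for 6 minutes and idle for the other 54.
samples = [100.0] * 6 + [0.0] * 54

average = sum(samples) / len(samples)
peak = max(samples)
# Share of the hour during which the link was fully saturated.
busy_share = 100.0 * sum(1 for s in samples if s >= 100.0) / len(samples)

print(average)     # 10.0
print(peak)        # 100.0
print(busy_share)  # 10.0
```

The hourly average reports 10% busy, while users working during those six minutes saw a saturated network.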
Setting -spin based on a calculation • Gus said that it should be … • Unless Gus is at your site, any calculation is wrong • Gus said this some time ago and was misquoted at that time • Generally stated as # * CPUs • This is nearly always wrong (you could get lucky by accident)
Percentage full on extents • Is it 99% full or 327% full? • Important to look at allocated (actual growth) versus percentage fill of the last extent • Hint: it never shows 100%, as it preallocates space for future extends of the area
Now we know how people lie but how do I determine if our performance is acceptable?
Method: Measuring Performance • Determine the 5-10 most time-critical portions of your application • Time them in isolation • Time them during the day when everyone says performance is OK. They will never say it’s good. • These timings should be close if not exactly the same
Method: Determine importance • Customer visible • Done many (thousands+) times a day • Users “wait” for screen/output
Timings • Need not be exact • Wrist watch or cell phone timer is fine • Keep track of these timings • When people complain about performance redo the timings
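A wrist watch is enough, as the slide says, but if the timings are kept in a script, the record keeping might look like this minimal sketch. The function names, the 1.5x tolerance, and the sample timings are all assumptions made for illustration.

```python
import time

def time_operation(operation) -> float:
    """Return the wall-clock seconds taken by one call to `operation`."""
    start = time.perf_counter()
    operation()
    return time.perf_counter() - start

def slower_than_baseline(baseline: dict, current: dict,
                         tolerance: float = 1.5) -> list:
    """List operations whose current timing exceeds baseline * tolerance.

    The tolerance is an arbitrary illustrative threshold, not a rule.
    """
    return [name for name, secs in current.items()
            if name in baseline and secs > baseline[name] * tolerance]

# Hypothetical timings in seconds: baseline taken when performance was
# "OK", current taken after a complaint.
baseline = {"order entry": 2.0, "customer lookup": 1.0}
current = {"order entry": 2.2, "customer lookup": 4.0}

print(slower_than_baseline(baseline, current))  # ['customer lookup']
```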
If the timings are bad • Look for bottlenecks • Network • Disk • Memory • CPU • It will likely be one of the first two, solved by using more of the second two
If the timings are good • Smack the users around for wasting your time or • Reevaluate timings, no really just smack the users
Conclusion • Performance is perception • Reason for “working …” • Focus on user experience • Know what is normal • In stored statistics • In response times
Still more Conclusion • Know what is important • Customer facing • Benchmarks lie • Buffer Hit Rate • You can make it whatever you want • Need to understand how to make it useful
Questions? Adam Backman adam@wss.com