1 / 53

Database Monitoring With

Database Monitoring With. Tom Bascom President, Greenfield Technologies. Agenda. Why do you need a monitor? Monitoring Alternatives What Are VSTs? A Monitoring Architecture Customizing And Extending The Code Basic Capabilities Advanced Features. Why Do You Need A Monitor?. Baselining

fadey
Download Presentation

Database Monitoring With

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Monitoring With Tom Bascom President, Greenfield Technologies

  2. Agenda • Why do you need a monitor? • Monitoring Alternatives • What Are VSTs? • A Monitoring Architecture • Customizing And Extending The Code • Basic Capabilities • Advanced Features

  3. Why Do You Need A Monitor? • Baselining • Benchmarking • Interactive troubleshooting • Capacity management • Resource Optimization

  4. Agenda • Why do you need a monitor? • Monitoring Alternatives • What Are VSTs? • A Monitoring Architecture • Customizing And Extending The Code • Basic Capabilities • Advanced Features

  5. Monitoring Alternatives • SAR, vmstat, iostat • Glance, TOPAS, Navisphere, Measureware, PerfMon … • TOP, NMON • PROMON • Fathom • ProMonitor • ProTop!

  6. Progress Focused • Interactive, Real-Time • Sample Oriented • Multi-platform • VST Based • 4GL Code • Open Source • Free!

  7. Agenda • Why do you need a monitor? • Monitoring Alternatives • What Are VSTs? • A Monitoring Architecture • Customizing And Extending The Code • Basic Capabilities • Advanced Features

  8. What Are VSTs? • Virtual System Tables • A 4GL View of Progress Data Structures (the same as those shown in PROMON.) • No Performance Impact (unless you do some really dumb things!) • Primarily Read-Only • Not Terribly “User Friendly” • Quirky at times…

  9. Some VST Quirks • Updateable: • _startup._spin • Private buffers • APW settings • Table & Index Ranges: • -tablebase, -tablerangesize • -indexbase, -indexrangesize • Table & Index Window can be reset! • Quirky Keys: • _myconnection… • _tablestat & _indexstat

  10. User Number/Id VST Confusion… find _myconnection no-lock. find _connect no-lock where _connect-usr = _myconn-userid. display _connect-usr _connect-id _myconn-userid. find _userio no-lock where _userio-usr = _connect-usr. display _userio-id _userio-usr. User-Id _Connect-Id MyConn-UserId _UserIO-Id Usr ======= =========== ============= =========== =========== 253 254 253 254 253

  11. Table Stats /** This does NOT work if –tablebase <> 1!!! find _File no-lock where _File._File-num = p_tbl. find _TableStat no-lock where _TableStat-id = p_tbl. display p_tbl _file-num _TableStat-id. **/ /*** instead, use the following: ***/ find _TableStat no-lock where _TableStat-id = p_tbl. find _File no-lock where _File._File-num = _TableStat-id. display p_tbl _file-num _TableStat-id.

  12. Index Name find _IndexStat no-lock where _IndexStat-id = p_idx. find _Index no-lock where _Index._Idx-num = _IndexStat-id. find _File where recid( _File ) = _Index._File-recid. tt_index.idxnote = _File._File-name + “.” + _Index._Index-name + ( if _file._prime-index = recid(_index) then “ P" else “ " ) + ( if _index._unique then "U" else "" )

  13. Agenda • Why do you need a monitor? • Monitoring Alternatives • What Are VSTs? • A Monitoring Architecture • Customizing And Extending The Code • Basic Capabilities • Advanced Features

  14. A Monitoring Architecture • VST Based • Multi-Platform • UNIX Character • HTML • Windows GUI • Using Publish & Subscribe • More than just a VST Browser! • Customizable!

  15. A Monitoring Architecture

  16. Agenda • Why do you need a monitor? • Monitoring Alternatives • What Are VSTs? • A Monitoring Architecture • Customizing And Extending The Code • Basic Capabilities • Advanced Features

  17. Customizing And Extending The Code • Events That A Module Handles • Structure Of A Module • Defining the Display • Maintaining State • Adding Help • Making A Module Available

  18. Events That A Module Handles • Mon-Restart • Empty Temp-Table • Remove self from memory • Mon-Init • Empty Temp-Table • Define Display Data Elements • Mon-Update • Refresh Data • Calculate intervals, rates and so forth • Update UI Temp-Table with results

  19. Structure Of A Module {lib/protop.i}def var support as character no-undo initial “Resources”.{lib/tt_xstat.i}procedure mon-restart: empty temp-table tt_xstat. delete procedure this-procedure.end.procedure mon-init:empty temp-table tt_xstat. /* define labels */end.procedure mon-update: /* the real work */end.subscribe to “mon-restart” anywhere run-procedure “mon-restart”.subscribe to “mon-init” anywhere run-procedure “mon-init”.subscribe to “mon-update” anywhere run-procedure “mon-update”.publish “register-disp-type” ( input support ).

  20. Defining the Display ui-define-label( support, 1, 1, "xid", " Id" ). ui-define-label( support, 1, 2, "xname", "Resource " ). ui-define-label( support, 1, 5, "stat1", " Locks" ). ui-define-label( support, 1, 6, "stat2", " Waits" ). ui-define-label( support, 1, 8, "stat-ratio", " Lock%" ). ui-define-label( support, /* display type */ 1, /* variant */ 8, /* order */ "stat-ratio", /* data element name */ " Lock%“ /* label value */ ).

  21. Maintaining State define temp-table tt_xstat no-undo field xid as integer field xvalid as logical field xname as character field misc1 as character field misc2 as character field stat1 as integer extent 5 field stat2 as integer extent 5 field stat3 as integer extent 5 field stat-ratio as decimal index xid-idx is unique primary xid.

  22. Sample, Summary, Rate & Raw Data • BaseValue • LastValue • ThisValue • SampleTime • SummaryTime • SampleRate = (ThisValue – LastValue) / SampleTime. • SummaryRate = (ThisValue –BaseValue) /SummaryTime. • SampleRaw = (ThisValue – LastValue) / 1. • SummaryRaw = (ThisValue – BaseValue) / 1.

  23. Updating Data for each dictdb._Resrc no-lock: run update_xstat ( input _Resrc-Id, input _Resrc-name, input "", input "", input _Resrc-lock, input _Resrc-wait, input 0 ). end. ui-det(support, 1, i, 1, "xid", string(tt_xstat.xid, ">>9")). ui-det(support, 1, i, 2, "xname", string(tt_xstat.xname, "x(20)")). ui-det(support, 1, i, 5, "stat1", string((tt_xstat.stat1[x]/z), ">>>>>>>>>9")). ui-det(support, 1, i, 6, "stat2", string((tt_xstat.stat2[x]/z), ">>>>>>>>>9")). ui-det(support, 1, i, 8, "stat-ratio", string(tt_xstat.stat-ratio, ">>9.99%")).

  24. Adding Help • Help files are in the “hlp” directory. • File name is value(“hlp/” + support + “.hlp”) • Title the screen. • Provide an overview of the screen. Try to explain why the metrics are important and how they are related to other metrics. • Define each label and give some insight into its meaning. • Provide explanations of any codes that might appear under a label.

  25. FileIO.hlp IO Operations to Database Extents Id:The extent id number. Extent Name:The file name of the extent. Mode: The "mode" in which the file is opened. Possible values are: BUFIOThe extent is opened for buffered IO. UNBUFIOThe extent is opened for un-buffered IO. BOTHIOThe extent is opened for both buffered and un-buffered IO. Variable extents are opened with BOTHIO (there are two file descriptors unless you're using -directio.) BlkSz: The Block size for the extent. This potentially varies between data, before-image and after-image extents. Values are expressed in bytes.

  26. Making A Module Available • Drop it into the “mon/” directory. • “mon/mymetric.p” • If it is OS specific use the “os/” directory • “os/AIX/df.p” • “os/Linux/netstat.p” • Send me a copy so that I can include it in the base distribution!

  27. Agenda • Why do you need a monitor? • Monitoring Alternatives • What Are VSTs? • A Monitoring Architecture • Customizing And Extending The Code • Basic Capabilities • Advanced Features

  28. Basic Capabilities • Summary Data • Blocked Clients & Open Transactions • Table & Index Activity • User Activity • Estimating Big B • Latches & Resources • Storage Area Capacity • Balancing IO • Clients & Servers

  29. Summary Data 11:32:52 ProTop xvi -- Progress Database Monitor 07/05/05 Sample sports [/db/sports] Rate Hit Ratio: 182:1 195:1 Commits: 149 195 Sessions: 2057 Miss% : 0.549% 0.512% Latch Waits: 13 16 Local: 953 Hit% : 99.45% 99.48% Tot/Mod Bufs: 60002 3167 Remote: 956 Log Reads: 76342 80927 Evict Bufs: 2 1 Batch: 1045 OS Reads: 419 414 Lock Table: 1516 3 Server: 97 Rec Reads: 23789 23619 LkHWM|OldTrx: 1392 00:00 Other: 51 Log/Rec: 3.2091 3.4264 Old/Curr BI: 54 54 TRX: 26 Area Full: 1 98.60% After Image: Disabled Blocked: 0

  30. BI Clusters for each _Trans no-lock where _Trans-usrnum <> ?: if _Trans-counter <> ? and _Trans-counter > 0 then do: if oldbi = 0 or _Trans-counter < oldbi then oldbi = _Trans-counter. currbi = max( currbi, _Trans-counter ). end. end. find _BuffStatus no-lock. currbi = _BfStatus-LastCkpNum. if oldbi = 0 then oldbi = currbi. /* if no TRX is active… */

  31. Blocked Sessions Blocked Sessions Usr Name Waiting Note --- -------- -------- ---------------------------------- 24 tom 00:00:32 REC XQH 102 [Order] julia, peter 22 tucker 00:00:02 REC XQH 201 [Cust] astro, tiger 321 julia 00:00:00 BKSH:83524928:

  32. Locked Records for each _Lock no-lock while _Lock-usr <> ?: if _Lock-recid = _Connect-wait1 then do: find _file where _file._file-num = _Lock-table. bxtbl = _file._file-name. end. if _Lock-usr = _Connect-usr then bxwait = bxwait + “ “ + _Lock-flags. else bxque = bxque + " " + _Lock-name. end. bxnote = bxtbl + bxwait + bxque.

  33. Open Transactions Open Transactions Usr Name TRX Num BI Clstr Start Trx Stat Duration Wait ---- ----- -------- -------- -------- -------- -------- ---------- 9 tom 2432897 1024 15:39:05 ACTIVE 00:00:01 -- 29440 20 jami 2432896 - ALLOCATE 00:00:00 -- 20115 5 emily 2432898 1024 15:39:06 ACTIVE 00:00:00 -- 21952 7 peter 2432899 1024 15:39:06 ACTIVE 00:00:00 -- 19040 23 julia 2418661 - ALLOCATE 00:00:00 -- 0 22 astro 2417938 - ALLOCATE 00:00:00 -- 0

  34. Table Activity Table Statistics Tbl# Table Name Create Read Update Delete ---- ---------------- --------- --------- --------- --------- 4 OrderLine 1 28715 11 1 18 Order 0 2384 1 0 24 POLine 0 848 1 0 23 PurchaseOrder 0 627 40 0 21 Bin 0 216 0 0 2 Customer 18 175 20 20 1 Invoice 1 148 3 0

  35. Index Activity Index Statistics Idx# Index Name Create Read Split Delete BlkDel ---- --------------- -- ------ ------ ------ ------ ------ 904 usage 14 31597 0 13 0 78 journal P 0 21011 0 0 0 435 keyindex 0 7376 0 0 0 388 icest PU 0 1995 0 0 0 1251 keyindex 0 1991 0 0 0 1247 warehs U 0 945 0 0 0 900 stuff PU 1 783 0 1 0

  36. User IO Activity UIO Usr Name Flags PID DB Access OS Rd OS Wr Hit% ---- ------- ----- ------ --------- ----- ----- ------- 13 tom SB 13590 2266 200 1 91.13% 10 jami SB 13584 190 6 1 97.10% 16 julia SB 13596 185 6 1 97.03% 17 peter SB 13598 181 5 1 97.07% 15 emily SB 13594 177 5 1 97.12% 11 tiger SB* 13586 166 4 0 97.58% 14 tucker SB 13592 159 5 1 97.10% 19 granite SB 13602 146 1 0 99.25% 7 astro SB 13578 145 4 1 97.16%

  37. Estimating Big B Big B GuessTimator Pct Big B % db Size Hit:1 Miss% Hit% OS Rd ----- --------- --------- ----- ------ ------- ----- 10% 6000 0.124% 30 3.306% 96.694% 1343 25% 15001 0.311% 48 2.091% 97.909% 849 50% 30001 0.622% 68 1.479% 98.521% 601 100% 60002 1.243% 96 1.046% 98.954% 425 <= 150% 90003 1.865% 117 0.854% 99.146% 347 200% 120004 2.486% 135 0.739% 99.261% 300 400% 240008 4.973% 191 0.523% 99.477% 213

  38. Big B http://www.peg.com/lists/dba/history/200301/msg00509.html MissPct = 100 * ( 1 – ( LogRd – OSRd ) / LogRd )). HitPct = 100 – MissPct. OSRd = LogRd * ( MissPct / 100 ). m2 = m1 * exp(( b1 / b2 ), 0.5 ).

  39. Resource Waits Resource Waits Id Resource Locks Waits Lock% --- -------------------- ---------- ---------- ------- 10 DB Buf S Lock 2661 0 100.00% 6 Record Get 658 0 100.00% 7 DB Buf Read 40 0 100.00% 2 Record Lock 21 0 100.00% 11 DB Buf X Lock 11 0 100.00% 19 TXE Share Lock 11 0 100.00% 8 DB Buf Write 3 0 100.00% 21 TXE Commit Lock 2 0 100.00% 1 Shared Memory 0 0 0.00% 3 Schema Lock 0 0 0.00%

  40. Latch Waits Latch Waits Id Latch Requests Waits Lock% --- -------------------- ---------- ---------- ------- 28 MTL_BF4 5540 33 99.40% 17 MTL_BHT 4205 106 97.49% 21 MTL_LRU 4154 55 98.68% 10 MTL_LHT 1800 24 98.65% 15 MTL_LKF 1798 0 100.00% 26 MTL_BF2 1218 6 99.48% 27 MTL_BF3 1184 10 99.13% 25 MTL_BF1 1150 10 99.16% 4 MTL_OM 913 4 99.60%

  41. Storage Area Capacity Area Statistics A# Area Name Alloc Var Hi Water Free %Used Note -- ------------ ------- ----- -------- ------ ------- ------- 68 order_idx 16 1998 1927 87 12044% i(3) 67 order 256 14670 14860 66 5805% t(1) 6 Schema Area 256 1454 1391 319 543% i(25) * 3 BI Area 32000 13070 45056 14 141% 13 customer 512000 55565 567515 50 111% t(15) 92 After Image 0 5199 5191 8 100% Busy 49 order-line 32000 2 25164 6838 79% t(1) 61 inventory 128000 2 94897 33105 74% t(1) 55 discount 1024000 0 755885 268114 74% t(1) 57 employee 2048000 0 1442919 605076 70% t(1)

  42. Storage Area Capacity for each _AreaStatus no-lock, _Area no-lock where _Area._Area-num = _AreaStatus._AreaStatus-Areanum: bfree = _AreaStatus-Totblocks - _AreaStatus-Hiwater. if ( _AreaStatus-Freenum <> ? ) then bfree = bfree + _AreaStatus-Freenum. if bfree = ? then bfree = _AreaStatus-totblocks. used = (( _AreaStatus-totblocks - bfree) / _AreaStatus-totblocks ) * 100. end.

  43. Storage Area Contents for each _storageobject no-lock where _storageobject._area-number = xid and _storageobject._object-num > 0 and _storageobject._object-associate > 0: if _storageobject._object-type = 1 then so_tbl = so_tbl + 1. else if _storageobject._object-type = 2 then so_idx = so_idx + 1. end. /* ianum = initial area number… */

  44. Balancing IO Database File IO Id Ext Name Mode Blksz Size Read Wrt Ext ---- ---------- ------- ----- ------- ----- --- --- 63 s2k_29.d1 F UNBUF 8192 2048000 11828 0 0 64 s2k_29.d2 F UNBUF 8192 2048000 7790 0 0 124 s2k_55.d2 F UNBUF 8192 2048000 432 0 0 125 s2k_55.d3 F UNBUF 8192 2048000 367 8 0 123 s2k_55.d1 F UNBUF 8192 2048000 220 0 0 67 s2k_30.d1 F UNBUF 8192 2048000 106 0 0 57 s2k_26.d1 F UNBUF 8192 1024000 26 2 0 128 s2k_56.d1 F UNBUF 8192 2048000 19 1 0 135 s2k_57.d6 F UNBUF 8192 2048000 12 0 0 140 s2k_58.d2 F UNBUF 8192 1024000 11 1 0 121 s2k_54.d1 F UNBUF 8192 256000 7 0 0 139 s2k_58.d1 F UNBUF 8192 1024000 6 0 0 134 s2k_57.d5 F UNBUF 8192 2048000 5 0 0 69 s2k_31.d1 F UNBUF 8192 128000 4 0 0 73 s2k_33.d1 F UNBUF 8192 128000 3 0 0 3 s2k.b2 V UNBUF 16384 0 0 0 0

  45. Servers and Clients Servers Srv Type Port Con Max MRecv MSent RRecv RSent QSent Slice --- ----- ----- --- --- ----- ----- ----- ----- ----- ----- 1 Login 7150 0 1 0 0 0 0 0 0 2 Auto 1026 10 55 0 0 0 0 0 0 3 Auto 1027 10 55 23 13 0 6 10 86 Server IO Srv Type Port Con Max DB Access OS Rd OS Wr Hit% --- ----- ---- --- --- --------- ----- ------ ------- 19 Auto 1043 10 55 5041 2 0 99.96% 20 Auto 1044 10 55 1348 1 0 99.96% 18 Auto 1042 10 55 157 1 0 99.51% 16 Auto 1040 10 55 42 1 0 98.70%

  46. Agenda • Why do you need a monitor? • Monitoring Alternatives • What Are VSTs? • A Monitoring Architecture • Customizing And Extending The Code • Basic Capabilities • Advanced Features

  47. Drill Down User Details Usr#: 23 Name: tom PID: 18570 Device: /dev/pts/3 Transaction: Jul 7 15:20:36 2005 ACTIVE 00:00:45 REC 5892 Blocked On: REC XQH 5892 [Customer] peter User 23's Other Sessions Usr Name Flags PID DB Access OS Rd OS Wr Hit% ---- ------- ----- ------ ---------- ------- ------ ------- 23 tom S * 18570 9 2 0 81.61% 0 tom O 18017 0 0 0 0.00% 22 tom S 18542 8534 134 15 98.43% 24 tom S 18576 3964 64 31 98.38%

  48. ProTop Alerts

  49. Alerts & Alarms # $PROTOP/etc/alert.cfg # # Metric Type ? Target Message Action # ========= ==== == ====== =========== ==================== LogRd num > 100000 "&1 &2 &3" alert-log OSRd num > 500 "&1 &2 &3" alert-log BufFlsh num > 0 "&1 &2 &3" alert-log,alert-mail Trx num > 200 "&1 &2 &3" alert-log,alert-mail LatchTMO num > 200 "&1 &2 &3" alert-log,alert-mail ResrcWt num > 200 "&1 &2 &3" alert-log,alert-mail

More Related