880 likes | 1.24k Views
BID208 Tools for Monitoring the IQ Server. Dan Kernaghan Principal Systems Consultant danielk@sybase.com August 7, 2003. Tools for Monitoring IQ. Administrative Monitoring System Events Basic Auditing SQL statements Space Utilization Versions Performance Monitoring IQ Monitor OS Tools.
E N D
BID208 Tools for Monitoring the IQ Server Dan KernaghanPrincipal Systems Consultantdanielk@sybase.comAugust 7, 2003
Tools for Monitoring IQ • Administrative Monitoring • System Events • Basic Auditing • SQL statements • Space Utilization • Versions • Performance Monitoring • IQ Monitor • OS Tools
Administrative Monitoring • Understanding the status of the server • What users are coming and going? • What statements are being run? • What is the Main or Temp space utilization? • Why are the versions out of control?
IQ System Events • Events can be added to automate system functions CREATE EVENT event-name ... [ TYPE event-type ... [ WHERE trigger-condition [ AND trigger-condition ], ... ] | SCHEDULE schedule-spec, ... ] ... [ ENABLE | DISABLE ] ... [ AT { CONSOLIDATED | REMOTE | ALL } ] ... [ HANDLER BEGIN ... END ]
Example Event • Example CREATE EVENT IncrementalBackup SCHEDULE START TIME ’1:00AM’ EVERY 24 HOURS HANDLER BEGIN BACKUP DATABASE INCREMENTAL TO ’/backups/daily.incr’ END
Basic Auditing • A rudimentary auditing function can be added through events
Starting the IQ Monitor • Started using IQ UTILITIES command • iq utilities [main|private] into <dummy_tblname> • start monitor “options” • The table name is never used, output is to an ASCII file. • The table name is only present for syntactical reasons
Counter Output • The counters are output to an ASCII file, in the directory in which IQ is running • <dbname>.<conn#>-[main|temp]-iqmon • In ASIQ 12.4.2 the suffix “-iqmon” can be changed to reflect user requirements Example : asiqdemo.2-main-iqmon
Stopping the IQ Monitor • Stopped using the same basic command • iq utilities [main|private] in <dummy_tblname> • stop monitor • Again as with the start command the “dummy_tblname“ is only present for the syntactical analyser in ASA
Issues in Monitor Operation • You can have 2 monitors running • One monitoring IQ_MAIN • One monitoring IQ_TEMP • They will write to different files • asiqdemo.2-main-iqmon • asiqdemo.2-temp-iqmon • Also they must be explicitly started and stopped with the correct syntax
File options • The –file_suffix option allows you to change the –iqmon suffix to the output file name • This is so you can keep multiple copies of the output showing what, when and why you ran it • The –truncate and –append options allow you to specify whether the output file is truncated (default) or appended to when the monitor starts up
Graphic of Operation Monitor Temp Monitor Main Language Processor Catalog Catalog Store IQ Store IQ Temp Store Catalog Store IQ Store Temp Counters O/P Main Counters O/P IQ Temp Store
Primary Options • There are two “sets” of options • interval • Collected counters to display • Interval is the time interval that IQ Monitor collects the information • The first iteration of Monitor displays counters from the start of the server • The subsequent displays are deltas of the information from the last display
Interval • Interval can be from 2 seconds to 4,294,967,295 seconds (2^32-1) – as this is around 136 years I do not suggest we use this • The default interval is 60 seconds - 1 minute • Beware of setting the interval too small the data may not be too meaningful
Basic Options • -summaryReports the key statisticsfor both the main andtemporary cache • -cacheMore detailed reports onspecified cache • -cacheByTypeReports the cachestatistics by • -cache_by_typetype of buffer • -ioReports IO subsystemcounters
12.4.2 Options • In AS IQ-M 12.4.2 there are a series of new monitor options • -contention • -bufalloc • -threads • -debug • These are described after the basic options
-summary Reports - 1 • Find RateNo. of times a buffer was requested • Hit RatePercentage of cache hits • Read RateNo. of read operations • Write RateNo. of write operations • Pin Percentage No. of pinned buffers – generally buffers that need to be modified • Note – these are for both caches – for the specified interval
-summary Reports - 2 Sybase Adaptive Server IQ Performance Monitor --------------------------------------------- Version 3.1 Options string for Main cache: "-summary -file_suffix summary-iqmon -append -interval 10" Summary 2000-01-24 10:51:17 Active| Main Cache | Temp Cache Users| Finds HR% Reads/Writes GDirty Pin% Dirty% InUse%| Finds HR% Reads/Writes GDirty Pin% Dirty% InUse% 0 15 46.7 8/1 0 0.1 0.1 0.5 20 100.0 0/0 0 0.0 0.3 0.3 0 627 64.1 237/0 0 5.8 0.1 15.4 230 100.0 0/0 0 0.1 0.6 0.6 1 957 60.7 376/0 0 29.5 0.1 39.0 200 100.0 0/0 0 1.4 2.1 2.1 1 3 66.7 1/0 0 17.8 0.1 39.1 0 0.0 0/0 0 1.4 2.1 2.1 1 0 0.0 0/0 0 6.8 0.1 39.1 0 0.0 0/0 0 1.4 2.1 2.1
Time Out! – LRU/MRU Chain • The layout of the caches is similar to ASE in that they are one long MRU/LRU chain • There is a wash marker • One major difference is there are multiple sweeper threads to write dirty pages to disk Buffer Movement Wash Marker LRU MRU Sweeper Write Activity
-cache Reports - 1 • Find Rateas summary • CreatesThis is the number of times the create buffer operation was called • DestroyThis is the number of times the destroy operation was called • Dirty This is the number of times the dirty operation was called • Hit Rateas summary
-cache Reports BWAIT • This is the count of the number of times a thread tries to get a buffer • but the thread had to hold before the buffer could be made available • This may be because the buffer was locked or whatever • Note-Bwait should be small
-cache Reports – REREADS • This is a count of number of times a buffer was requested for read, after the same thread had been written • If this is the case it means that • 1) the cache activity is so high that the write transaction cannot hold a written buffer in memory • 2) Or, the transaction is very long and hits the same page multiple times, but with a large time interval between hits • This should be very low – except, possibly in case 2 above.
-cache Reports – F.MISS • This is an internal counter that counts the number of times that a hash operation had to probe a hash table twice • Generally this indicates that there has been a rollback shortly before • If there is no rollback activity – watch this counter – it should be zero
-cache Reports - CLONED • This is a count of the number of times a buffer was cloned • This is an example of versioning in operation • For each page that must be modified (not new pages) a cloned operation takes place • A page will only CLONE if there are other users looking at that page – if not then an Opportunistic clone takes place – we update the page in place
Time Out! - Cloning Page 001 If page 004 need to be “modified” Then the system will check to see If other users are using the page Page 002 Page 003 If no user is using the page – it is modified, marked as dirty – and written to a new location (maybe as page 005) Page 004 If a user is using the page – then the Page has to be cloned – and the CLONE Count is incremented If a user now needs the (old) page 004 thenthis is recorded as a cache miss the the pageis read from disk
-cache Report - Prefetch • prefetch Count of how manyprefetchoperations are requested • Prefetch Read count of how may prefetch operations result indisk reads • Generally we should hope that a good part of the prefetch operations can be satisfied in cache – too many reads indicate that we might be able make use of more memory
-cache Report - GDIRTY • We really don’t want to see this count above zero • This counts the number of times that a buffer get operation has had to stall whilst a dirty page is flushed to disk, effectively this means cache is full of dirty pages • If this is the case then we need to move the wash marker - or increase the number of sweeper threads • Note this is GrabbedDirty not Got Dirty
-cache Report - 8 • Pin Percent Percentage of buffers that are pinned (lockedinto memory) • Dirty Percent Percentage of pages that are dirty • Try not to let the Dirty Percent get above 85-90% otherwise the GDirty will start to go above zero – not good.
-cache Report - 9 Sybase Adaptive Server IQ Performance Monitor --------------------------------------------- Version 3.1 Options string for Main cache: "-cache -file_suffix cache-iqmon -append -interval 10" Main Shared Buffer Cache 2000-01-24 10:52:26 Finds Creats Dests Dirty HR% BWaits ReReads FMiss Cloned Reads/Writes PF/PFRead GDirty Pin% Dirty% Mn: 15 0 0 2 46.7 0 8 0 0 512/64 8/1 0/0 0 0.1 0.1 Mn: 1000 0 0 0 71.0 1 284 0 0 7600/0 289/0 0/0 0 1.4 0.1 Mn: 227 0 0 0 74.0 0 74 0 0 3420/ 72/0 15/13 0 1.9 0.1
Cache Reports By Type • This reports the same basic counters as the –cache option, but in this case the counts are broken out by block type in memory • These may be of slightly less interest than the –cache reports – but over a longer interval they can provide some interesting statistics on the performance of the buffer manager
4 Variable Length B- Tree 5 Fixed Length B- Tree 7 VDO – used bythe catalogue to find indices 8 Dbspace header 9 Database header 10 Sort buffer (temp) 12 G-Array (HG) 13 B-Array (FP) 14 Block Map 15 Hash Bucket 16 Checkpoint Block 17 Bitmap (of anysort!) 18 Reserved Object Types in Cache
-cache_by_type Sybase Adaptive Server IQ Performance Monitor --------------------------------------------- Version 3.1 Options string for Main cache: "-cachebytype -file_suffix cachebytype-iqmon -append -interval 10" Main Shared Buffer Cache 2000-01-24 10:53:26 Btype Finds Creats Dests Dirty HR% BWaits ReReads FMiss Cloned ReadKB/WriteKB Reads/Writes PF/PFRead GDirty Pin% Dirtyy% Mn: 0 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 1 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 2 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 3 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 4 34 0 0 0 91.2 ------- 0 0 0 36/0 3/0 0/0 0 -- -- Mn: 5 7 0 0 0 71.4 ------- 2 0 0 24/0 2/0 0/0 0 -- -- Mn: 6 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 7 153 0 0 0 98.0 ------- 3 0 0 40/0 3/0 0/0 0 -- -- Mn: 8 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 9 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 10 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 11 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 12 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 13 17 0 0 0 0.0 ------- 17 0 0 972/0 17/0 0/0 0 -- -- Mn: 14 187 0 0 0 93.6 ------- 12 0 0 768/0 12/0 0/0 0 -- -- Mn: 15 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 16 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 17 131 0 0 0 15.3 ------- 124 12 0 5196/0 124/0 39/13 0 -- -- Mn: 18 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: 19 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: other 0 0 0 0 0.0 ------- 0 0 0 0/0 0/0 0/0 0 -- -- Mn: total 529 0 0 0 72.0 13 158 12 0 7036/0 161/0 39/13 0 0.3 0.1
-IO Reports • IO simply reports the input and output operations the server conducted during the specified interval • This is not broken out by device, the counters relate to entire server activity
-IO Report - Input • Reads The number of read requests • LRd(KB)The page size multiplied by the number of requests • PRd(KB)The actual number of bytes read • Rratio The ratio of logical to physical • The last figure relates the efficiency of the compression to disk - for reads
-IO Report - Output • Writes The number of write requests • LWrt(KB)The page size multiplied by the number of requests • PWrt(KB)The actual number of bytes written • Wratio The ratio of logical tophysical • The last figure relates the efficiency of the compression to disk - for writes
-IO Report Sybase Adaptive Server IQ Performance Monitor --------------------------------------------- Version 3.1 Options string for Main cache: "-io -file_suffix io-iqmon -append -interval 10" Main Shared Buffer Cache 2000-01-24 10:54:21 Input Output Reads LRd(KB) PRd(KB) Rratio Writes LWrt(KB) PWrt(KB) Wratio Mn: 8 512 512 1.00 1 64 64 1.00 Mn: 102 6528 4804 1.36 0 0 0 0.00 Mn: 102 6528 4976 1.31 0 0 0 0.00 Mn: 114 7296 5392 1.35 0 0 0 0.00 Mn: 270 17280 8220 2.10 0 0 0 0.00 Mn: 161 10304 3660 2.82 0 0 0 0.00 Mn: 0 0 0 0.00 0 0 0 0.00 Mn: 0 0 0 0.00 0 0 0 0.00 Mn: 0 0 0 0.00 0 0 0 0.00 Mn: 0 0 0 0.00 0 0 0 0.00
12.4.2 (and beyond) Options • The following slides detail the output from the following new (in 12.4.2) options • -contention • -threads • -bufalloc • -debug
-contention • The –contention option basically displays all of the lock and mutex counters • These are counters that show the activity within the buffer cache and how quickly these locks were resolved • Contention shows statistics for both the main and temp buffer caches and the “heap” memory • Heap is the load and user memory
-contention described – 1 • AU Active Users • LRULks# of times a lock was requested • woTO# of times a lock did not have to time out (withoutTimeOut) • LoopsHow many lock request had to spin waiting for the lock to be granted • TO# of Timeouts (opposite of woTO)
-contention described – 2 • BWaits (as described above) • IOLock # of locks taken while looking for place to write the data • IOWait how many had to wait • HTLock # of locks on the internal Hash Table • HTWait how many had to wait • FLLock # of locks on the free list • FLWait how many had to wait
-contention – Heap Memory • The heap is any memory that IQ-M has to use that is not within either of the 2 caches • This memory is composed of Load Memory, User Memory, thread stacks etc. • MemLks # of locks taken on the heap • MemWts how many had to wait
-contention Sybase Adaptive Server IQ Performance Monitor --------------------------------------------- Version 3.1 Options string for Main cache: "-contention -file_suffix contention-iqmon -append -interval 10" Contention 2000-01-24 10:57:03 AU| Main Cache | LRULks woTO Loops TOs BWaits IOLock IOWait HTLock HTWait FLLock FLWait 0 66 0 0 0 0 1 0 5 0 4 1 2958 0 0 0 0 160 0 1117 0 6 1 1513 0 0 0 1 378 0 2 0 8 1 370 0 0 0 0 94 0 2 0 10 1 156 0 0 0 0 46 0 2 0 12 1 885 0 0 0 0 248 0 2 0 14 1 1223 0 0 0 0 332 1 2 0 16 1 346 0 0 0 0 66 0 2 0 18 | Temp Cache | Memory Mgr |LRULks woTO Loops TOs BWaits IOLock IOWait HTLock HTWait FLLock FLWait |MemLks MemWts 0 70 0 0 0 0 1 0 4 0 5 0 55483 13 0 466 0 0 0 0 2 0 15 0 12 0 5705 0 0 963 0 0 0 0 2 0 8 0 20 1 2048 0 0 1186 0 0 0 0 2 0 2 0 23 1 186 4 0 357 0 0 0 0 2 0 2 0 25 1 2 0 0 444 0 0 0 0 2 0 3 0 29 1 137 0 0 884 0 0 0 0 2 0 2 0 31 1 22 0 0 1573 0 0 0 0 2 0 5 0 37 1 203 3
-threads Option • The –threads option details the counters held by the thread manager • This tells you actually what is going executing (sort of) inside the server • This option can be selected for main or private, however it is irrelevant what you select, the thread manager is server wide!
-threads described • CPU# of CPUs IQ-M is using (maybe not the total number in the box) • Limitmax # of threads IQ-M can run • NTeams # of teams currently running • MaxTeamsHigh Water Mark for teams • NThreadsCurrent # of threads in existence • Resrvd # of threads reserved for IQ-M • Free# of free threads inc. Resrvd • Locks# of locks on the thread manager • Waitshow many had to wait
-threads Sybase Adaptive Server IQ Performance Monitor --------------------------------------------- Version 3.1 Options string for Main cache: "-threads -file_suffix threads-iqmon -append -interval 10" Threads 2000-01-24 10:59:24 CPU Limit Nteams MaxTeams Nthreads Resrvd Free Locks Wait 10 100 4 12 100 13 68 106 590 10 100 6 12 100 12 63 4 6 10 100 6 12 100 12 63 0 0 10 100 7 12 100 12 62 1 1 10 100 7 12 100 12 62 0 0 10 100 7 12 100 12 58 1 5 10 100 7 12 100 12 58 0 0
-bufalloc • -bufalloc prints the statistics from the buffer allocation routine • This is the control of how many new pages / buffers were requested and by what • This can be executed for both main and temp cache – and is useful for both
-bufalloc described - 1 • OU User Resource Allocation number • AU # of Active Users • MaxBuf # of buffers on the cache • Avail# of buffers available in the cache • AvPF# of buffers available to the pre-fetchmechanism • Slots# of buffers active for pre-fetch • PinUsr # of users using the buffer manager
-bufalloc described - 2 • PFUsr# of pre-fetch users • Posted # of posted users. Users who think they know how many buffers they willuse • UnPost # of unposted users. See Above • Quota# buffers reserved for use • Locks# of locks in the buffer manager • WaitsHow many had to wait
-bufalloc Sybase Adaptive Server IQ Performance Monitor --------------------------------------------- Version 3.1 Options string for Main cache: "-bufalloc -file_suffix bufalloc-iqmon -append -interval 10" Buffer Allocation 2000-01-24 10:58:39 OU/AU MaxBuf Avail AvPF Slots PinUsr PFUsr Posted UnPost Quota Locks Waits 1/0 1592 1592 20 0 0 0 0 0 0 1 0 1/1 1592 1592 20 0 0 0 0 0 0 1 0 1/1 1592 1592 20 0 0 0 0 0 0 1 0
-debug - 1 • This is an undocumented (sort of) option for the IQ Monitor. • Basically it displays everything we could think of • Beware – this option generates a serious amount of paper, but can be very useful • It can be run for either the main or temp caches – although some of the information is the same for both • Some of the same counters are displayed as described above, but they are called by their internal counter names – do not be confused