1 / 49

DB2 War Stories and Scary Tales (Part 1)

Session: A11. DB2 War Stories and Scary Tales (Part 1). Robert Goodman Certified DB2 Professional Certified Business Continuity Planner. May 9, 2007 3:00 p.m. – 4:00 p.m. Platform: DB2 for z/OS. DB2 War Stories & Scary Tales. Major Areas Database Foundation Stones

don
Download Presentation

DB2 War Stories and Scary Tales (Part 1)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Session: A11 DB2 War Stories and Scary Tales (Part 1) Robert Goodman Certified DB2 Professional Certified Business Continuity Planner May 9, 2007 3:00 p.m. – 4:00 p.m. Platform: DB2 for z/OS

  2. DB2 War Stories & Scary Tales Major Areas • Database Foundation Stones • SQL Exposures • Questionable Strategies • DDL Exposures • Operational Exposures

  3. DB2 on z/OS – A World Class Act! The Fastest Most Dependable Most Secure Highest Performance Most Scalable Database on the Planet! DB2 for z/OS We're Bullish on DB2!

  4. Data Recoverability Data Security Performance & Scalability Data Concurrency Data Integrity Database Foundation Stones New Features & Combinations Can Undermine These Foundations! Understanding the Risks Is Critical!

  5. DB2 War Stories And Scary Tales By DATABASE BOB Think Tank Publications Where Shall We Begin? Chapters • Fear of Commitment • Trigger Happy • Rotate Roulette • Extents Wide Open

  6. Fear of Commitment Chapter 1

  7. Cycling DB2 Start DB2 Apps Up Apps Up DB2 Stop DB2 Crash When DB2 Does Lots of Recovery Start: IRLM DM DDF Recover: Unresolved URs Doesn't deferred restart fix all issues? Deferred Restart - Gives Consistent Startup Times

  8. Once Upon A Time In the kingdom of DB2, the time came for a systems downtime. Many tasks had been busy during the week. It came time to cycle DB2. It was cancelled, but was naughty and wouldn’t come down. After two hours, the operator forcefully killed DB2. Maintenance was applied and the system was brought up. DB2 lingered and wouldn’t wake up. After three hours, the operator killed DB2 for a second time. We tried once again to wake up DB2. This time we asked the wizards at IBM how to slay the problem. They declared that we should just let it run. Twelve hours later DB2 came up. Why did it take so long? And can this happen even today?

  9. What Happened and Why? Force DB2 DB2 Up Sparse Updating Long URs X X X X X X X DAY1 DAY2 DAY3 DB2 Restart: Recover Incomplete URs Log1 Log2 Log3 Log1 Log2 A1 A2 A Long Running UR Had to Roll Back Using Archive Tapes & Active Logs - DB2 Log Records - 1) Do Records 2) Undo Records

  10. Without Deferred Restart Applications Access DB Log Log A1 Forward Recovery Backward Recovery Start DB2 Restart Complete App App App App App App App App All Recoveries Must Complete Deferred Restart Applications Access DB (Quicker) Log Log A1 Forward Recovery Backward Recovery Start DB2 Restart Complete X X App App App App App App App RECP DB2 Restart Deferred Restart Gets the Subsystem Up Quicker

  11. Turning On Deferred Restart • LIMIT BACKOUT Parameter • AUTO – Automatically recovers once DB2 is up • YES – RECOVER POSTPONED command resumes recovery • NO – Process all inflight and inabort URs • BACKOUT DURATION Parameter • log records during backward recovery before deferring Are These Parameters Set Correctly?

  12. Does deferred restart always work? In rare cases it fails Won’t deferred restart fix all my problems? Normally Deferred pagesets still need recovery What is status of pagesets after restart? Most pagesets are available Deferred pagesets are unavailable What is the exposure? Applications which use deferred pagesets will fail DB2 Restart Questions • How can we detect long running URs? • DB2 log message - DSNR035I UNCOMMITTED UR AFTER ### CHECKPOINTS • Can I automatically cancel long running URs? • Netview can be used to do this

  13. Trigger Happy Chapter 2

  14. The Trigger Concept SQL SQL With Trigger Program1 Program1 SQL1 SQL1 • Trigger Facts: • Programs are unaware • Synchronous • Part of UR • Firing cost of a FETCH • Add in trigger SQL • Plus cost of trigger work • Adds to SQL elapsed Master SQL SQL Database Invokes Trigger Create Trigger After Trigger Firing A Trigger - About the Cost of A FETCH

  15. Once Upon A Time The king declared that dashboards would help rule the kingdom. This was a daunting task for many programs had to be changed. Triggers came to the rescue, they were quick and easy. Since 1 trigger was good, many were even better. They multiplied like rabbits and soon the whole kingdom was full of triggers. The word came down from on high that things were slow. Many programs were dragging but no changes had been made. They noticed that when triggers were added, darkness descended upon the kingdom. What had gone wrong? And how could it be fixed?

  16. Multiple Triggers Multiple Triggers • Multiple Triggers • All are synchronous • All in UR • Multiple trigger’s SQL • Fire one after another • Fired in timestamp order • Serially add to elapsed time Program SQL SQL Master Database Invokes Trigger Before Trigger After Trigger After Trigger SQL Elapsed Time Multiple Triggers Run in Sequence

  17. Triggers & Stored Procedure Program SQL SQL DB2MSTR Database Invokes Trigger After Trigger • Triggers & Stored Procedures • Synchronous • SP program load time • SP execution time • Can make calls outside DB2 • Greatly extends total times DDF Stored Procedure Invoke SQL Elapsed Time Stored Procedures Add Significant Overhead to Triggers!

  18. Stored Procedure w/Transition Tables • Using Transition Tables • Synchronous • Transition tables • Create table time • Use table time • Delete time • SP program load time • SP execution time • Calls outside DB2 time • Adds significantly to times Program SQL SQL MSTR Database Invokes Trigger After Trigger Stored Procedure DDF Invoke DM SQL Elapsed Time Transition Tables Add Significant Overhead

  19. How Expensive Are Triggers? Statement Triggers – Cheapest Row Triggers – Cheap SP Triggers – Expensive SP Triggers w/Trans Vars - Priceless Fire Trigger – Cost of a FETCH +Trigger SQL – Cost of SQL + WHEN – Invoked every time trigger event happens + Transition Variables – Cost of transition table + Invoke Stored Procedure – DDF, Start thread +Resident Stored Procedure – work in SP +Non-Resident Stored Procedure –start SP + work in SP The Costs Add Up! We Used All of the Expensive Options

  20. Generally Poor Reasons to Use Triggers • Just because they’re quick • Lazy man’s solution • Easier than changing programs • Temp fixes that become permanent • For data replication • To populate summary tables • To enforce simple value constraints • To enforce RI constraints • 9) To maintain dashboards (oops!) Misusing Triggers Can Impact Performance!

  21. Modifying Triggers The Reality One Would Think Update TRIGGER Refresh TRIGGER DROP TRIGGER CREATE TRIGGER DROP & CREATE TRIGGER How Triggers Are Maintained Triggers - Easy to Create, Can Be Tough to Drop!

  22. Trigger – Firing Sequence Firing Order Firing Order Trigger 1 2007-01-01 Trigger 1 2007-01-01 1 1 DROP & CREATE On 2007-02-15 Is Sequence Important? Trigger 2 2007-01-15 Trigger 3 2007-01-30 2 2 Trigger 3 2007-01-30 Trigger 2 2007-02-15 3 3 Order of Creation is Firing Order

  23. Who Is Aware of Triggers? Source Awareness • Yes - DB2 • Should Be - DBAs • Maybe – Programmers • NO – DB2 Utilities Triggering Awareness • NO – Applications • NO – SQL • NO – DB2 Optimizer • NO – Explain • NO – Resource Limit Facility • NO – Constraints Triggers - A Run-Time Event Triggers Work Outside Programs & Triggering SQL

  24. Invisible Program Dependencies Programs A B C D . . . • Invisible Causes of Breakage • Trigger • Trigger SQL • DDF • Transition tables • Stored procedure • Application tables • Calls outside DB2 • Unavailable resources • RI & check constraints • Utilities • Deadlocks & timeouts Any Break Causes All Triggering SQL to Fail! SQL MSTR SQL Trigger DDF Stored Procedure Calls Outside DB2 DBM1 AT TT AT AT AT Invisible to Programs Dependencies Make Administration Challenging!

  25. Trigger Traps When DDF reloaded, the resident SP-X the transition variables no longer matched the trigger and SP-X. The trigger had to be dropped & recreated to correct this. SP-X had to be changed to include new column in TableA The Scenario 1) Update trigger on TableA Starts resident stored procedure(SP-X) Inserts before image into log – TableB 2) DBA adds column to TableA 3) Days Later - SQL updating TableA starts failing Why? DBA Corrective Actions 1) Drop the trigger (may require down-time) 2) Drop the stored procedure 3) Add column to parameter lists & SQL 4) Recreate the stored procedure 5) Recreate the trigger Alters Triggers DBA's Must Know Interdependencies to Avoid Trigger Traps & Outages!

  26. RotateRoulette Chapter 3

  27. The ROTATE Concept ROTATE DDL Command 1) Delete Oldest Partition Rows 2) Reuse Oldest Partition 1 Oldest Limitkey A 1 Newest Limitkey E 2 Old Limitkey B 3 New Limitkey C 4 Newer Limitkey D New Last Part Limitkey Delete Old Data Does This Match Your Needs?

  28. Rotating Partitions ALTER TABLE table ROTATE PARTITION FIRST TO LAST ENDING AT (limitkeys) RESET; Simple, Beautiful & Non-Disruptive(?)

  29. ROTATE In Action ROTATE DDL catg.DSNDB.db.ts.I0001.A001 LP 1 LP 4 First Last LP 2 LP 1 catg.DSNDB.db.ts.I0001.A002 First LP 3 LP 2 catg.DSNDB.db.ts.I0001.A003 LP 3 LP 4 catg.DSNDB.db.ts.I0001.A004 Last Logical Partition SYSTABLEPART(V8) Physical Partition Dataset Logical Partition After The ROTATE After Rotate - Logical & Physical Parts Don't Match!

  30. P4 P1 P3 P2 A Series of ROTATEs ALTER TABLE ROTATE ... ENDING AT (‘E’) RESET; ALTER TABLE ROTATE ... ENDING AT (‘F’) RESET; ALTER TABLE ROTATE ... ENDING AT (‘G’) RESET; ALTER TABLE ROTATE ... ENDING AT (‘H’) RESET; 1st Rotate 2nd Rotate 3rd Rotate 4th Rotate Limitkey catg.DSNDB.db.ts.I0001.A001 P1 P1 P1 P1 ‘A’ ‘E’ L1 L4 L3 L2 L1 catg.DSNDB.db.ts.I0001.A002 P2 P2 P2 P2 ‘F’ ‘B’ L2 L1 L2 L4 L3 catg.DSNDB.db.ts.I0001.A003 P3 P3 P3 P3 ‘C’ ‘G’ L3 L2 L1 L4 L3 catg.DSNDB.db.ts.I0001.A004 P4 P4 P4 P4 ‘H’ ‘D’ L4 L2 L1 L3 L4 SYSTABLEPART Maps Logical Partitions

  31. Once Upon A Time Version 8 was up and running well. The call came to convert to table based partitioning and reuse the oldest partition. The ROTATE command was chosen to do this non-disruptive deed. Suddenly the phone began to ring and thick darkness covered the database cubicles. A quick check revealed that two parts were mired in REORP status. User processing ground to a halt their workloads were in peril. The database guardians countered with concurrent REORG to fix the problem. This crashed and burned. Share level NONE REORG was called upon. When it finished, the sun came out and life was good again. What happened to disrupt the peace of this database kingdom?

  32. Why Did ROTATE Set REORP? Last Part Limitkey (‘2007’) The 1st time only, ROTATE converts Indexed Based partitioning to Table Based partitioning. Because limitkey is not enforced in Index Based, the 1st and last parts have to be put in REORP status to eliminate this potential issue. Table_IBP Maximum Limitkey in last partition is not enforced by “index based partitioning” SQLCODE = 0 Index Based Partitioning INSERT INTO tableX (D_YEAR) VALUES (‘2008’) Last Part Limitkey (‘2007’) Table_TBP SQLCODE = -327 Table Based Partitioning Don't Convert to Table Based Partitioning Using ROTATE!

  33. Converting to Table-Controlled Partitioning ALTER INDEX clustering_indexNOT CLUSTER; (conversion to table-controlled partitioning) COMMIT WORK; ALTER INDEX clustering_index CLUSTER; (clustering index reestablished) It’s Simple to Do This Before Rotate Limitkey of Last Partition Will Be Converted to Extreme Value!

  34. Lowering Limitkeys in Last Part Conversion to Table Based Partitioning Causes the last part limitkey to be MAXVALUE ALTER TABLE table ALTER PARTITION # ENDING AT (limitkeys); (This & next partition put in REORP, data outage!) REORG TABLESPACE tablespace SCOPE PENDING SHRLEVEL NONE STATISTICS COPYDDN (Data keys beyond limitkey, discarded during REORG!) Data Outage You Will Have An Outage to Alter Limitkey

  35. Which Partition Number is Used? -DIS DB(db) SPACENAM(ts) NAMETYPEPARTSTATUS SRG9700 TS 0002 RW -THRU 0004 SRG9700 TS 0001 RW You Must Know Which Part To Use!

  36. Advisable Data Outage Rotating Logical Partitions ?PP SELECT PARTITION, LOGICAL_PART FROM SYSIBM.SYSTABLEPART WHERE DBNAME = ‘db' AND TSNAME = ‘ts' AND LOGICAL_PART = 1 1st LP Logical Part = ?PPPhysical Part 1 2 UNLOAD TABLESPACE tablespace PART ?PP LOAD TABLESPACE tablespace PART ?PP REPLACE using dummy SYSREC -START DB(db) SPACE (ts) PART(?PP) ACCESS FORCE COPY TABLESPACE tablespace DSNUM ?PPSHRLEVEL CHANGE (rotated partition now recoverable) 3 4 ALTER TABLE table ROTATE PARTITION FIRST TO LAST ENDING AT (limitkeys) RESET; (1stLP data deleted, becomes last LP) 5 Advisable 6

  37. Avoiding Data Outages New Tables – Use “table based partitioning” Last Partition – Don’t set max/min limitkey (may cause -327 SQLCODEs) Converting from “Indexed Based Partitioning” Don’t convert to table based partitioning with ROTATE Use …ALTER INDEX index NOT CLUSTER Then …ALTER INDEX index CLUSTER Plan for Outage on 1st ROTATE Query for values beyond limitkey before reorg ALTER ASC/DESC limitkey from max/min value Downtime REORG to remove REORP status Know Logical Partitions Prior to Rotate Query SYSIBM.SYSTABLEPART Outage Mitigating Rotate Issues

  38. Rotate Dangers • Knowing which physical part is 1st logical part • Long running DELETEs to empty 1st logical part (42 secs to ROTATE / delete 1,000,000 row partition) • ROTATE can cause an outage (REORP status) (convert to table based partitioning or ALTER limitkeys) • Which part # to use for Commands / DDL / Utilities • Mistakenly rotating the wrong table (DDL reuse or finger fault) • Adding partitions to ROTATEd tables (Confusion factor on first & last parts) • With ascending keys, trying to insert null key inserts • Recoverability after a ROTATE • Attempting to REBALANCE ROTATEd tables

  39. Which partition will rotate next? 1st Logical partition - query SYSIBM.SYSTABLEPART Does rotate interrupt availability? Yes – if indexed based partitioning (convert to table based, last 2 partitions in REORP) Yes – If last part limitkey is altered No – table based partitioning & limitkey doesn’t need to be altered Can rotate be blocked? Set MAXVALUE in limitkey of last partition Show we use rotate to Convert to Table Based Partitioning? Not advisable (last 2 partitions in REORP) Does rotate behave differently with Index Based Partitioning? Yes - the first time Does rotate rename datasets? No – does SQL deletes instead Does rotate delete SYSCOPY entries? No – puts a rotate row in SYSCOPY If accidentally rotate can I recover? No – rotate row in SYSCOPY prevents recovery prior to ROTATE The Rotate Questions

  40. Extents Wide Open Chapter 4

  41. Request 1 Track Request 1 Track Request 1 Track The Extent Concept VOL001 1st Extent 2nd Extent catg.DSNDB.db.ts.I0001.A001 VOL002 Extent Consolidation DB2 requests space from z/OS which finds blocks of space and updates the VTOC and Catalog. VOL003 3rd Extent Getting Allocations Is Relatively Slow

  42. Once Upon A Time Life in the database kingdom was good. Autonomic features had eliminated many servant duties. One day a troubled user called. A table was broken and needed to be recovered. Luckily it had only had five million rows. We proclaimed that it would be back in merely moments. Tapes were mounted, disks were spinning and the clock was ticking. Five minutes turned to ten and then to fifteen. After having many discussions with management, the recovery finally finished. Why did a small table take so long to recover?

  43. Extent Evolution Cylinder Track Block 20th Century DASD z/OS Rules z/OS Rules DASD Reality 21st Century DASD DASD Subsystem Constrained By z/OS Rules Not DASD Reality!

  44. Logical Extent Limits Max Extents / Dataset 255 extents z/OS 1.6 7,257 extents z/OS 1.7 Max Extents / Volume 123 extents / volume Large DASD increase this issue (Mod 27s & 54s) Extent Rule 5 pieces primary extent Whatever can get on secondary Extent Limits We Must Follow z/OS Rules!

  45. IBM Strategies Tolerate More Extents z/OS 1.7 – 7,257 extents Make It Harder to Hit Limits SMS Extent Consolidation Automate Extent Management V8 Sliding Secondaries Solving Systemic Extent Issues Some Problems Still Exist!

  46. 305 276 256 118 115 63 60 45 25 16 How Extents Affect Utilities (V8) Table 1.2 M Rows and 1 Index Elapsed Seconds Extents Affect Writing Utilities!

  47. DASD Fragmentation STOGROUPs by size Standardized allocations z/OS Slow Allocations Fewer Extents Automated Method Use sliding secondaries Can cause fragmentation Managed Method STOGROUPs by size Standardized allocations From z/OS We Need Faster Allocation Search Faster Cataloging z/OS Avoiding Extent Issues

  48. Beware Be Knowledgeable Be Careful Do Excellent Work DB2 War Stories And Scary Tales By DATABASE BOB Think Tank Publications In Closing Want to Hear More? Part 2 Is Up To You! Please Fill Out Your Evaluations

  49. Robert Goodman Session: A11 DB2 War Stories and Scary Tales (Part 1) robert.goodman@flhosp.org Thanks for Coming...

More Related