720 likes | 840 Views
Physical Ordering of Data: Is it ever Useful? Session ID#479. Mark W. Farnham. 36 West Street Lebanon, NH 03766 Office: 603.448.1803 www.rsiz.com. Mark W. Farnham President, Rightsizing, Inc. mwf@rsiz.com Co-founding Director, OAUG Charter member, current director IOUG
E N D
Physical Ordering of Data: Is it ever Useful?Session ID#479 Mark W. Farnham 36 West Street Lebanon, NH 03766 Office: 603.448.1803 www.rsiz.com
Mark W. Farnham President, Rightsizing, Inc. mwf@rsiz.com Co-founding Director, OAUG Charter member, current director IOUG Secretary, OAUG DBSIG Member, MOSES, Oracle VLDB Co-founder, APPSPERF Member, Oaktable Network Oracle Ace
Agenda • Told you a bit about who I am • Told you to trust but verify what I tell you • I’m about paraphrase the abstract to you • Explain example situations where physically sorting data makes sense so you can recognize similar patterns. • Time allowing, discuss technical details • Time allowing, speculate on value in Exadata • Recap, Q&A
Abstract (part 1) Some operations, both small ones such as lookup tables and large ones such as batch jobs and reports, can perform better if the data’s physical order matches the predominant order of use. When is it worthwhile to try this ordering? How does automatic segment space management (ASSM) affect ability to order?
Abstract (part 2) An expensive mission to frequently physically order all of your data in the best possible order for the queries you expect to run is likely very silly! But a one time or infrequent periodic selective reordering may dramatically exceed its cost in savings on later queries. Sometimes sorting a small object is leveraged.
Agenda • Told you a bit about who I am • Told you to trust but verify what I tell you • I’m about paraphrase the abstract to you • Explain example situations where physically sorting data makes sense so you can recognize similar patterns. • Time allowing, discuss technical details • Time allowing, speculate on value in Exadata • Recap, Q&A
Agenda • Told you a bit about who I am • Told you to trust but verify what I tell you • I’m about paraphrase the abstract to you • Explain example situations where physically sorting data makes sense so you can recognize similar patterns. • Time allowing, discuss technical details • Time allowing, speculate on value in Exadata • Recap, Q&A
The heart of the matter • Oracle stores rows in blocks. • Ten rows of one hundred rows that fit ten to a block • Oracle stores rows in blocks. • Two rows in a row from one block in quick succession • Oracle stores rows in blocks. • The optimizer knows this
Okay so I lied a little… Before the pattern examples I’m going to talk just a little bit about the potential for physically sorting to be of much higher value than you might expect. Also remember it is usually not free and that “reorganizing” for performance can become an addictive disease some folks call “Compulsive Tuning Disorder” (CTD).* *Gaja Krishna Vaidyanatha, circa 2000
Agenda • Told you a bit about who I am • Told you to trust but verify what I tell you • I’m about paraphrase the abstract to you • Explain the heart of the potential advantages • Explain example situations where physically sorting data makes sense so you can recognize similar patterns. • Time allowing, discuss technical details • Time allowing, speculate on value in Exadata • Recap, Q&A
The heart of the matter • Oracle stores rows in blocks. • Ten rows of one hundred rows that fit ten to a block • Oracle stores rows in blocks. • Two rows in a row from one block in quick succession • Oracle stores rows in blocks. • The optimizer knows this
The heart of the matter • Oracle stores rows in blocks. • Ten rows of one hundred rows that fit ten to a block • If I need values 1 through 10, they might be in any number of blocks from 1 to 10. • It is possible to physically order a table in an order that matches an index. • If values of that index is the way you most often range access this table – big win.
The heart of the matter • Oracle stores rows in blocks. • Two rows consecutively from one block • Suppose you’re querying rows or blocks that are hotly updated. • Getting more rows from a block right away sure increases the chances the block is still cached at the right “vintage”
The heart of the matter • Oracle stores rows in blocks. • The optimizer knows this • More likely a plan cheaper than scanning exists when the rows are in order • More likely the optimizer figures out that the corresponding index plan is cheaper (cluster factor).
My favorite old chestnut • Fall 1993 – 32 million GL accounts • Oracle 7.0.13, I think. • They actually paid me money to play with hardware at a DEC facility in Maryland! • 48.5% reduction in time required for month end processes
My favorite old chestnut • Just from reordering GL_CODE_COMBINATIONS • Intelligently – account segments, not PK • Intelligently – maybe sorted by different columns for different companies or Charts • Before the open period of the tested GL period
My favorite old chestnut • Only had to do it once a year or so because the allowed accounts were very static. (Good shelf life) • Sorting a “small” object generated sorted results on the “large” object (Leveraged effects)
My favorite old chestnut • Matched moving windows programs like the flexfield validation portion of journal import at the time • So look for a chance where it is valid to sort input batches before processing • NEVER allow this to be mandatory for the program to work correctly. • But it may be a natural side effect that is efficient!
Consider Alternatives to Manual Physical Ordering • Hash clusters, IOTs • Adding columns to an index to avoid accessing table blocks • Find a better execution plan that does not need ordering (often quite hard) • Ask the audience (if anyone is still here)
Consider Alternatives to Manual Physical Ordering • Hash clusters, IOT – Hmm, effect with Exadata? • Adding columns to an index to avoid accessing table blocks – Indexes? Exadata? • Pure speculation on my part on this, • I have NOT had the pleasure, but… • Storage Indexes! • Not saying things won’t work. I predict physically sorting will out-perform hash and IOT in Exadata for many cases.
Summary of decision process • Pro physical ordering • Low cost to reorder • Good shelf life of reordering • Time constrained monoliths that already have very good plans and you can’t figure how else to make them faster • Lookup tables where multiple rows are needed frequently or which naturally drive new data loads • Available maintenance windows “often enough” • Probably good with Exadata
Summary of decision process • Con physical ordering • High cost to reorder • Short shelf life of reordering • No demonstrated need • Alternative plans make sense • Extra columns on an index (within reason) • Hard to schedule
Summary of decision process • Watch for: • Conflicts in access path (Cary raised concern) • Possible that making one path better could make another path significantly worse. • Check change in cluster factor to guage degradation • Remember that range, full scans, and fast full scans are what matters – not single row lookups (except some small effect that index may be bigger) • Monitor baselines to see which access paths are used. (John Kanagaraj suggested)
Summary of decision process • Pro physical ordering • Con physical ordering • Items to watch for that may be of concern • Suggestions from audience?
Q&A Does time allow, or should I start in on the technical details?
The Experiments • Four combinations of extent allocation and segment space allocation • Freelists, uniform • Freelists, system • ASSM, uniform • ASSM, system • I consider dictionary managed obsolete
Four tablespaces tablespace block initial_ext pctincr MIN_EXTLEN STATUS ext_mgmt alloc_type segspace L_S_A01 8192 65,536 ~ 65536 ONLINE LOCAL SYSTEM AUTO L_S_M01 8192 65,536 ~ 65536 ONLINE LOCAL SYSTEM MANUAL L_U_A01 8192 1,048,576 0 1048576 ONLINE LOCAL UNIFORM AUTO L_U_M01 8192 1,048,576 0 1048576 ONLINE LOCAL UNIFORM MANUAL Local [System|Uniform] [Automatic|Manual]
The Table(s) create table mwf01_m ( access_order number, ctag varchar2(100), cpad varchar2(100), constraint mwf01_m_pk primary key (access_order) )
The Table(s) organization heap pctfree 0 tablespace l_u_m01 for table 01_m storage (freelists 20)
The Table(s) organization heap pctfree 0 tablespace l_u_a01 for 01_a /* no freelists clause since it is ASSM */
Eight tables • “Odd” numbered tables are loaded in order • “Even” numbered tables are loaded anti-ordered • A is for ASSM, M is for manual (freelist) • 1 and 2 are uniform extents • 3 and 4 are “system” allocated extents
Eight, er, Ten tables • mwf01_m, mwf02_m • mwf03_m, mwf04_m • mwf01_a, mwf02_a • mwf03_a, mwf04_a • mwf01_as who can guess? • mwf03_as
Inserting the rows I meant to have the insert statement be really trivial, but the anti-sorted result sets grouped even 100’s and 10000’s so the rows were enough shorter to get in extras. So I had to fix that up and the insert gets a little funky.
Inserting the ordered rows insert into mwf01_a select rownum+9999, /* 4 bytes (except the even 100's and 10000's */ to_char(rownum,'000000000000000'), /* 15 zeros, 16 bytes */ 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'|| decode(mod(rownum+9999,100) ,0,'+',NULL)|| decode(mod(rownum+9999,10000),0,'=',NULL) /* 52 bytes with an extra byte tacked on for 100 and another for 10000 */ from dba_objects /* 72 bytes total about 100 rows per 8K block */ where rownum <= 40000
Inserting the ordered rows insert into mwf01_a select rownum+9999, /* 4 bytes (except the even 100's and 10000's */ to_char(rownum,'000000000000000'), /* 15 zeros, 16 bytes */ 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'|| decode(mod(rownum+9999,100) ,0,'+',NULL)|| decode(mod(rownum+9999,10000),0,'=',NULL) /* 52 bytes with an extra byte tacked on for 100 and another for 10000 */ from dba_objects /* 72 bytes total about 100 rows per 8K block */ where rownum <= 40000
Inserting the anti-ordered rows insert into mwf02_m select trunc((rownum-1)/100)+((mod(rownum-1,100))*400)+10000, /* 4 bytes (except the even 100's and 10000’s */ to_char(rownum,'000000000000000'), /* 15 zeroes vsize 16 bytes */ 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'|| decode(mod(trunc((rownum-1)/100)+((mod(rownum-1,100))*400)+10000,100) ,0,'+',NULL)|| decode(mod(trunc((rownum-1)/100)+((mod(rownum-1,100))*400)+10000,10000),0,'=',NULL) /* 52 bytes with an extra byte tacked on for 100 and another for 10000 */ from dba_objects /* 72 bytes total about 100 rows per 8K block */ where rownum <= 40000
Inserting the anti-ordered rows insert into mwf02_m select trunc((rownum-1)/100) +((mod(rownum-1,100))*400) +10000, /* 4 bytes (except the even 100's and 10000’s */ to_char(rownum,'000000000000000'), /* 15 zeroes vsize 16 bytes */ 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'|| decode(mod(trunc((rownum-1)/100)+((mod(rownum-1,100))*400)+10000,100) ,0,'+',NULL)|| decode(mod(trunc((rownum-1)/100)+((mod(rownum-1,100))*400)+10000,10000),0,'=',NULL) /* 52 bytes with an extra byte tacked on for 100 and another for 10000 */ from dba_objects /* 72 bytes total about 100 rows per 8K block */ where rownum <= 40000 Counting by 400s: 10000, 10400, 10800, …, 49600 Every 100, plus 1: 10001, 10401, 10801, …, 49601 ….. : 10399, 10799, 11199, ….,49999
The Diagnostic Query select dbms_rowid.rowid_block_number(rowid) block_no, min(access_order) min_acc_order, max(access_order) max_acc_order, (max(access_order) - min(access_order) +1) acc_order_span, interesting value max(dbms_rowid.rowid_row_number(rowid))+1 rows_in_block interesting value from mwf01_a group by dbms_rowid.rowid_block_number(rowid) order by dbms_rowid.rowid_block_number(rowid)
The Extents Query break on segment skip 1 on report compute sum of kbytes on break select substr(segment_name,1,16) segment, segment_type, extent_id, block_id, bytes/1024 kbytes from dba_extents where segment_name like 'MWF01_M%' order by segment_name,extent_id
The Index Info Query select index_name, index_type, uniqueness, blevel, leaf_blocks, distinct_keys, distinct_keys/leaf_blocks keys_per_leaf, {!skew} num_rows, avg_leaf_blocks_per_key, avg_data_blocks_per_key, clustering_factor, buffer_pool, tablespace_name from dba_indexes where table_name = 'MWF01_M'
The Table Info Query select table_name, pct_free, freelists, num_rows, blocks, num_rows/blocks rows_per_block, {!skew} avg_row_len, empty_blocks, degree, cache, compression from dba_tables where table_name = 'MWF01_M'
Stats Gathering - gs_onetab begin dbms_stats.gather_table_stats( ownname => null ,tabname => '&tabname' ,partname => null ,estimate_percent => null ,block_sample => false ,method_opt => 'for columns access_order size 254' ,degree => NULL ,cascade => true ); end;
Displaying The Plan SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY(null,null,'ALL')) / That’s what I use. If anyone has a better idea, let me know.
Displaying The Plan SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY(null,null,'ALL')) / --+ gather_plan_statistics select * from table( dbms_xplan.display_cursor(format=>'ALLSTATS LAST')); (sometimes very useful to see e-rows, etc.)
Okay, here we go! Freelists ordered • SQL> truncate table mwf01_m; • SQL> @x_ins_mwf01_m • 40000 rows created, using freelists in order • SQL> @gs_onetab • For table mwf01_m • Table and index extents on next slides
Freelists ordered extents SEGMENT SEGMENT_TYPE EXTENT_ID BLOCK_ID KBYTES ------- ------------ ---------- ---------- ---------- MWF01_M TABLE 0 9 1024 TABLE 1 265 1024 TABLE 2 393 1024 TABLE 3 521 1024 MWF01_M_PK INDEX 0 137 1024 • Now I’m not really worried for these tests that the only extent so far of the index is spliced into the tablespace with the additional extents of the table • If I had sufficient spindles and needed to get the last nth out, I’d test whether moving it helped. • At this size I sure doubt it would help.
Freelists ordered index stats INDEX_NAME BLEVEL LEAF_BLOCKS MWF01_M_PK 1 76 DISTINCT_KEYS KEYS_PER_LEAF NUM_ROWS 40000 526.315789 40000 AVG_LEAF_BLOCKS_PER_KEY 1 AVG_DATA_BLOCKS_PER_KEY 1 CLUSTERING_FACTOR 400 interesting value
Freelists ordered table stats TABLE_NAME PCT_FREE FREELISTS MWF01_M 0 20 NUM_ROWS BLOCKS 40000 403 what? ROWS_PER_BLOCK AVG_ROW_LEN 99.2555831 what? 75 EMPTY_BLOCKS DEGREE CACHE COMPRESSION 0 1 N DISABLED
Freelists ordered table stat bug? BLOCK_NO MIN_ACC_ORDER MAX_ACC_ORDER ACC_ORDER_SPAN ROWS_IN_BLOCK 10 10000 10099 100 100 11 10100 10199 100 100 … 537 49900 49999 100 100 400 rows selected.
Freelists table stat bug? No. Demoninator includes 3 more. BLOCK_NO MIN_ACC_ORDER MAX_ACC_ORDER ACC_ORDER_SPAN ROWS_IN_BLOCK 10 10000 10099 100 100 11 10100 10199 100 100 … 537 49900 49999 100 100 400 rows selected.