1.05k likes | 1.18k Views
Benchmarking Index Impact on OLTP Load Rates and Online Database Block Size Rebuilds paper #529. Kevin Loney, TUSC http://www.tusc.com 800-755-TUSC. What’s going on?. Real-Time Data Warehousing VLOLTP Constraints Very little downtime Immediate impact on business processes
E N D
Benchmarking Index Impacton OLTP Load Ratesand Online Database Block Size Rebuildspaper #529 Kevin Loney, TUSC http://www.tusc.com 800-755-TUSC
What’s going on? • Real-Time Data Warehousing • VLOLTP • Constraints • Very little downtime • Immediate impact on business processes • Growth is required by the business plan • Performance cannot harm business
Goals • Measure the impact of different types of indexes on load rates • Identify the costs that go along with index benefits • Improve testing procedures • Make better cost/benefit decisions on index creations
3 x 3 = 10 As a general rule, use this simple estimation guide for the cost of index maintenance: Each index maintained by an INSERT, DELETE, or UPDATE of the indexed keys requires about three times as much resource as the actual DML operation on the table. What this means is that if you INSERT into a table with three indexes, then it will be approximately 10 times slower. For DML, and particularly for INSERT-heavy applications, the index design should be seriously reviewed, which might require a compromise between the query and insert performance. - Oracle9i Performance Methods But sometimes, 1 x 3 = 10
Notes on the tests • Your Mileage Will Vary • Issues include: • Environment constraints • Data sort order • Database block sizes • Versions • Buy vs Build
Test configurations • Focused on isolated tables, isolated servers, isolated disks • 8i & 9i • Custom tables • Table-to-table inserts, repeated executions of multiple small and large transactions
Some indexes are more equal than others • Sort order impacts load rate dramatically • Results from internal block management • Space requirements impacted too • Test table structure: Table EMP, one single-column index on an EmpSeq numeric column
Tracking Index Impact on Load Rate Initial rate(rows/s) Sustainedrate(% of initial) 50% 0 0 Number of records inserted
Loading Sorted Order Initial rate(rows/s) Sustainedrate42% of initial 50% 0 0 Number of records inserted
Loading Unsorted Order Initial rate(rows/s) 50% 42% sustained 20% sustained 0 0 Number of records inserted
Conclusions • Unsorted inserts into indexes force more internal work to be done • Unsorted sustained rate is about one half of the sorted sustained rate • Initial rate is not an accurate indicator of performance • 1 x 3 = 5 for high enough values of 3
Managing Index Blocks Full index blocks Lastblock New record
99-1 split of the last block Full index blocks 99% of oldLastblock 1% of oldLastblock New record
Insert into any other block Full index blocks Middleblock New record
50-50 split of internal block Full index blocks 50% of oldMiddle block 50% of oldMiddle block Lastblock New record
Result • Inserts into non-terminal index blocks cause more inefficient internal block splits • Costs load performance • Benefits query performance • Costs space usage • Cost will vary index by index • Sort by most expensive index
How to see this after the fact • Analyze indexes • Compare space usage • Compare clustering factor to leaf blocks
Seeing the space impact • Compare sorted insert vs unsorted insert • As of Oracle8i, you can use ORDER BY in views: create view EMP_SORTEDas select * from EMP order by Ename; insert into EMP_WITH_ENAME_INDEXselect * from EMP_SORTED;
Space and Performance Impact • Test: 100,000 rows, one index • Sorted: 58% faster • Sorted index blocks: 920 • Unsorted index blocks: 1370 • Savings: 58% time cost, 49% space cost
Clustering factor • In load-sorted indexes, the clustering factor will be close to the number of leaf blocks • Select from USER_INDEXES • Sorted: 2,731 for 920 blocks; ratio = 3 • Unsorted: 91,654 for 1370 blocks; ratio = 90 • Serves as an indicator for index rebuilds
Mitigating the unsorted insert cost • Reduce the number of block splits • Use a larger database block size • Use a smaller PCTFREE for index blocks • Rebuild to reclaim space • Weigh these changes against their potential costs to query performance
No index vs one index No indexes Initial rate(rows/s) 50% 30% 9% sustained 0 0 Number of records inserted
Index cost during inserts • Initial cost is around 70% of performance • Performance then drops by another 70% after the number of entries in the index exceeds the limit for a 1-level B*tree This is a step function, not a slow curve!
No index vs one index No indexes Initial rate(rows/s) Blevel=1 30% Blevel=2 9% sustained 0 0 Number of records inserted
B*tree levels • Index B*tree expands to hold new values • When it runs out of space for entries, it splits into a second level • 2-level B*trees may add another 3x to the index cost factor • Analyze, then check: select BLevel from USER_INDEXES where index_name = … ;
How to lower BLevel • Use a larger database block size • 9i: block size varies at the tablespace level • Decrease PCTFREE • Partition the table, use local indexes with fewer records in each
What about performance testing? • Load rate testing should not begin until after the index has split • Initial load times are partly relevant • Predict 2-Blevel load rates based on 1-Blevel rates
How many entries per Blevel? • Varies widely, depending on database block size and index entry length • Blevel 1 to 2 split ranges observed • 2K: 8,000 to 12,000 rows • 4K: 17,000 to 123,000 rows • 8K: 24,000 to 307,000 rows • 16K: 563,000 to 6,243,000 rows
The Good News • Using a higher block size lowers the performance cost of index loads • Only one 3-level B*tree was found: • 16K: 64,300,000 rows
Next steps • Evaluate relative costs: • One one-column index (1-1) • One two-column index (1-2) • Two one-column indexes (2-1) • Three one-column indexes (3-1) • relative to the initial load rate of 1-1.
One-column vs Two-column indexes • Two columns • Comparable sort levels • Compare two one-column indexes to one-two column index on the same columns
One-one vs one-two 1-1 Initial rate(rows/s) 1-2 65% 50% 27% sustained 25% sustained 0 0 Number of records inserted
1-1 vs 1-2 vs 2-1 1-1 Initial rate(rows/s) 1-2 65% 50% 2-1 27% sustained 25% sustained 16% sustained 0 0 Number of records inserted
Conclusions for load rates • For like columns: • Fewer longer indexes are preferable to more shorter indexes • Leading column drives the index load cost
Predictive load performance • If initial load into 1 index = 1000 records/s • Then: • When Blevel=2, load rate = 270 records/s • If a second index is added: • When both Blevels=2, load rate = 160 rec/sec • What about a third index?
1-1, 1-2, 2-1, 3-1 1-1 Initial rate(rows/s) 1-2 65% 50% 2-1 27% sustained 25% 25% sustained 16% sustained 3-1 13% sustained 0 0 Number of records inserted
So? • Going from 1 one-column index to 3 one-column indexes cuts the initial load rate by 75% and the sustained rate by 50%
As a percentage of unindexed rate 1-1 Initial rate(rows/s) 1-2 2-1 9% sustained 8% sustained 6% sustained 3-1 5% sustained 0 0 Number of records inserted
Other indexing options • Bitmap indexes • Reverse key indexes • Function-based indexes • Unique indexes • Index-organized tables
Bitmap index loading • Bitmap index maintenance is deferred until the end of each DML operation • One bitmap segment update per commit • Not a benefit to OLTP transactions
What about reverse keys for sequential inserts? • During load tests, reverse key indexes trail B*tree indexes by a factor of 2.5:1 to 3:1 for sequentially loaded rows.
Function-based indexes? • Indexing UPPER(Name) costs approximately the same as indexing Name.
Unique indexes? • Uniqueness checks add little to the index load cost, generally < 10%. • For sequential unique data, initial load rates approach unindexed rates. • Still have a performance drop at Blevel=2
Index-organized tables? • TableA: 6 columns, 1 one-column primary key • TableB: 6 column index-organized table • Performance: TableB load time was 34% slower than TableA.
Conclusions • One one-column index, over time, can lower load performance by over 90%. • You must maintain the Blevel for the indexes on your major transaction tables. • The order of loaded data is critical to load performance. • The loading process is as slow as its weakest link.
Possible solutions • Partition the table - divide and conquer the index • Split partitions to maintain Blevels • Analyze the partitioned index following loads alter table EMP split partition P2 at ‘JANE’;
More solutions • Use materialized views • Separate data loading and query processes • Heavily index materialized view tables, drop indexes on data source tables • Drop unused indexes • 9i: alter index EMP_DEPT_NDX monitoring usage; then query V$OBJECT_USAGE.Used
Still more solutions • Favor fewer, longer indexes and use 9i’s skip-scan features for query tuning. • More efficient than index fast full scans • Increase the block size • 9i: change online - covered in next section • Use external tables (9i)
Larger block size implications • For larger, denser blocks: • Increase INITRANS • Increase MAXTRANS • Increase FREELISTS
Cost/Benefit Predictions If unindexed: 3,000 rec/sec One index, Blevel 1: 900 rec/sec One index, Blevel 2: 300 rec/sec Two indexes, Blevel 2: 180 rec/sec Three indexes, Blevel 2: 150 rec/sec • Weigh against the query tuning benefits and administrative partitioning choices