130 likes | 355 Views
Physical Database Design & Performance. Optimizing for Query Performance. For DBs with high retrieval traffic as compared to maintenance traffic, optimizing the DB for query performance is the primary goal. The amount of work required to do so greatly depends on the type of DBMS.
E N D
Optimizing for Query Performance • For DBs with high retrieval traffic as compared to maintenance traffic, optimizing the DB for query performance is the primary goal. • The amount of work required to do so greatly depends on the type of DBMS. • Some DBMS give little control to DB designer / query writer on how a query is processed. • Other give significant control to tune DB design and structure of queries.
Parallel Query Processing • With the advent of multiple processors, many DB servers are now frequently using SMP technology. • To exploit this parallel processing capability, some DBMS include strategies for breaking apart a query into modules that can be processed in parallel by each of the processors. • A common approach is to replicate the query so that each copy works against a portion of the database (usually horizonally partitioned row sets). • The same query runs in parallel on separate processors, intermediate results from each processor are combined to create final query result
Parallel Query Processing • Parallel query processing speeds can be impressive (e.g., for a table scan, it can take half processing time as compared to normal processing) • Other table operations that can make use of parallel processing: • Table joins • Grouping table results into categories • Union operation • Sorting rows • Computing aggregate values • Row update, delete and inserts • Creating and rebuilding index
Overriding Automatic Query Optimization • With most relational DBMSs, you can learn the optimizer’s plan for processing the query before it runs (EXPLAIN command). • The query optimizer uses he best plan based on statistics about each table (no. of rows, avg row length, etc.) • User must analyze the query costs before processing them. • In Oracle, we may force a full scan as well as parallel scan in a query that counts total no. of orders (see example next slide).
Overriding Automatic Query Optimization • Select /* + full(order) parallel(order,3)*/ count(*) • from orders • where salesperson=“Smith” • The clause inside /* */ is a hint to Oracle to override the actual query plan. • A hint is specific to each query.
Picking Block Size • Data is transferred between RAM and disk memory in blocks or pages. • Too small block size will increase I/O while too large one may result in extra data transfer. • Usually min. block size is 2K bytes, typically max . size is 32 bytes or more (depends on OS) • In general small block sizes are used for OLTP applications and larger sizes for DSS and data warehousing solutions
Picking Block Size (Contd.) Trade-offs among 5 performance factors while switching from small to large block sizes: • Block contention • In concurrent access of same block by several I/O commands, smaller blocks create less contention • Random row access • When one row from a table has to be accessed, smaller blocks are best, e.g., in case of OLTP applications • Sequential row access • When many rows have to be accessed sequentially, larger block size is better.
Picking Block Size (Contd.) • Large blocks allow many rows to be cached in RAM in one I/O operation. • Sequential scans occur in DSS and Datawarehousing application. • Row size (length of all fields in a table row) • It is usually best to match block size with physical table row size or a multiple of row size. • Overhead • This is the cost in terms of ‘time’ to manage I/O operations for a database operation. • Smalle block sizes have more overhead than larger ones.
How to Design Better Queries? • Various guidelines have been suggested by various DB experts for improving query processing. • Understand how indexes are used in query processing • Many DBMS use only 1 index per table in a query. • Learn how DBMS selects which index. • Drop infrequently used indexes. • Queries using equality criteria process faster as they can be evaluated via indexes.
How to Design Better Queries? • Use compatible data types for fields & literals in queries • Compatible means DBMS will avoid to convert data during query processing. • Write simple queries • Simple queries are easy to process. • Break complex queries into multiple, simple parts • Result of smaller queries can be combined (using UNION) • Don’t nest one query inside another • Such queries are less efficient.
How to Design Better Queries? • Don’t combine a table with itself • Instead of self-join, make temp. copy of the table and then relate original with temp. table • Create temporary tables for groups of queries • Sometimes a series of queries all refer to same subset of data from database. • it will be more efficient to store this subset in one or more temp. tables to avoid scanning DB again and again • Combine update operations • When possible, combine multiple update commands into one.
How to Design Better Queries? • Receive only the data you need • Reduce processing time by avoiding extra data columns which are not required (e.g. avoid use of select *) • Don’t have the DBMS sort without an index • Suppose data has to be displayed in sorted order and index does not exist on sort key field. • Sort the data after it has been retrieved from DB • Learn • Review query plans using EXPLAIN command • Understand ways in which DBMS determines query processing