210 likes | 367 Views
Database Performance. Database Performance. Perfomance can be improved through : Denormalisation Optimisation of SQL Indexes Clusters. Database Performance. DENORMALISATION Normalisation improves the logical database design and prevents anomalies BUT More tables more joins
E N D
Database Performance Perfomance can be improved through: • Denormalisation • Optimisation of SQL • Indexes • Clusters
Database Performance DENORMALISATION • Normalisation improves the logical database design and prevents anomalies BUT • More tables more joins • Joining > 3 tables is likely to be slow • Denormalisation reverses normalisation for efficiency purposes
Database Performance SQL OPTIMISATION Select * from ...; DBMS DATA FILES
Database Performance INDEXES An index is a table or some other data structure that is used to determine the location of a row within a table that satisfies some condition. • Indexes may be defined on both primary and non key attributes.
Database Performance INDEXES • oracle allows faster access on any named table by using an index. • each row within a table is given a unique value or rowid. • each rowid can be held in an index. • an index can be created at any time. • any column within a table can be indexed.
Database Performance WHEN TO CREATE AN INDEX Beforeany input of data for Unique index Afterdata input for Non-unique index
Database Performance HOW DO YOU CREATE AN INDEX ? EXAMPLE :- (a) CREATE INDEX TENIDX ON TENANT(SURNAME); (b) CREATE UNIQUE INDEX TENIDX ON TENANT(SURNAME);
Database Performance GUIDELINES FOR USE OF INDEXES • > 200 rows in a table • a column is frequently used in a where clause • specific columns are frequently used as join columns
Database Performance POINTS TO WATCH • avoid if possible > 3 indexes on any one table • avoid indexing a column with too few distinct values For example:- male/female • avoid indexing a column with too many distinct values • avoid if > 15% of rows will be retrieved
Database Performance CLUSTERS • A disk is arranged in blocks • Blocks are retrieved as a whole and buffered • Disk Access time is slow compared with Memory access • Gains can be made if the number of block transfers can be reduced
Database Performance CLUSTERING • clusters physically arrange the data on disk so that frequently retrieved info is stored together • allows 2 or more tables to be stored in the same physical block • can greatly reduce access time for join operations . • can also reduce storage space requirements.
Database Performance CLUSTER DEFINITION • clustering is transparent to the user no queries have to be modified no applications need to be changed • tables are queried in the same way whether clustered or not
Database Performance DECIDING WHERE TO USE CLUSTERS • Each table can only reside in 1 cluster • At least one attribute in the cluster must be NOT NULL • Consider the query transactions in the system • How often is the query submitted • How time critical is the query • What’s the amount of data retrieved
Database Performance HOW TO CREATE A CLUSTER Step 1 create cluster tenpay(tenid char(8)); Step 2 alter cluster tenpay add table tenant where tenpay.tenid = tenant.property_id; Step 3 alter cluster tenpay add table payment where tenpay.tenid = payment.payment_id
Database Performance • CLUSTERING EXERCISE WAREHOUSE STOCK 3 PRODUCT 1000
Database Performance • To speed up access time to data in these three tables (WAREHOUSE, PRODUCT, STOCK) it is necessary to cluster either STOCK around WAREHOUSE, or STOCK around PRODUCT. • How do we decide which will be the most efficient? • For the purpose of this exercise we will assume that each block can hold 100 records.
Database Performance If STOCK is clustered around PRODUCT No of products = 1000. There will be 1 record for each PRODUCT in each WAREHOUSE. Therefore each product would have 3 records • Each block would contain 100/3 products, i.e. 33 products. There would therefore be a 1 in 3 chance of accessing a particular stock item by reading one block of data.
Database Performance If STOCK is clustered around WAREHOUSE • No of warehouses = _____. There will be ____ record for each item of STOCK in each warehouse. Therefore each warehouse would have ______ records. The records for each warehouse would have to be stored across ______ blocks. • Access would therefore be faster if STOCK is clustered around the product.
Database Performance OPTIMIZING PERFORMANCE Performance can be regarded as a balancing act between:- • access performance • update performance • ease of use/modification
Further Reading • Query Optimisation • Connolly & Begg, 3rd edition, chapter 20 • Connolly & Begg, 4th edition, chapter 21 • Performance Tuning (including denormalisation) • Connolly & Begg, 3rd edition, chapter 17 • Connolly & Begg, 4th edition, chapter 18 • Next Week • HCI