1 / 14

The aim of all applications optimisation is to minimise some cost function - usually

The aim of all applications optimisation is to minimise some cost function - usually I/O or transaction time. Database Server. PERFORMANCE TUNING - INTRODUCTION. Database Administrator. contribute to. PERFORMANCE. issues concerning query decomposition. database front-end.

palila
Download Presentation

The aim of all applications optimisation is to minimise some cost function - usually

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The aim of all applications optimisation is to minimise some cost function - usually I/O or transaction time Database Server PERFORMANCE TUNING - INTRODUCTION Database Administrator contribute to PERFORMANCE issues concerning query decomposition

  2. database front-end SERVER PROCEDURE FOR SATISFYING QUERY SQL Query Scan, Validate & Parse Query Precompiled Queries Query Modifier (Permits, Views, Integrities) Query Optimiser Query Execution Data Manipulation Facility Low-Level Data Requests

  3. Relation Descriptions Incoming Query Column Descriptions Secondary Indices OPTIMISER Statistics Query Execution Plan Histograms THE QUERY OPTIMISER

  4. Full Sort Merge Join Strategy Join Sort Sort Possible QEP for Joining Two Tables Join Sort Sort selection of tuples Proj-Rest Proj-Rest Base Table Base Table EXAMPLE JOINS 2 Tables - one of 100 rows - one of 10.000 rows (of which only 50 rows are to be selected)

  5. Query Execution Plan Created to access the required data given the SQL statement at minimum cost Cost function is a combination of I/O’s required, CPU cycles consumed, together with memory and sort-merge requirements QEP is defined and control is passed to the runtime supervisor - the data manager then invokes the buffer manager - which calls on the services of the lowest level operating system - deciding how to share resources amongst multiple users

  6. Implementation of the QEP - Indexing storage structure 5 different types of data organisation to be examined:- HEAPS INVERTED LIST ISAM (Indexed Sequential Access Method) B-tree (Binary Tree) Hashed

  7. HEAPS Unordered set of rows New data added to end of file No direct access to individual row Adding data very efficient - no need to re-order pages or indices Seems crude - why use it? -good for applications requiring large volumes of data entry on-line followed by over-night batch reporting - no ad hoc queries

  8. Inverted-Lists For every instance of the key value - store the RID of the records matching that value Example Table STREET_LIGHTS light_number,road_number,sector,road_name, control_type, wattage,etc. Inverted Index on ‘road-name’ Allestree Close RID1,RID4,RID10,RID12,RID15 ….. Bentley Brook RID201,IRD202,RID250,RID251,RID254 …... Kedleston Road RID151,RID152,RID154,….. NB List is maintained in sorted order The index page is read into main storage, probably only one I/O needed. RIDs then used to locate data pages. Therefore, efficient data retrieval in query like:- SELECT*FROM STREET_LIGHTS WHERE road_name=‘Bently Brook’

  9. <=M > M Brubaker Cary Eagleton Allen Baker Bottorff Arly ISAM (or VSAM for ABM DB2) Indexed Sequential Access Method is based on primary key only. Therefore, need to select a key which will serve most of the likely requests of the user. In practice a compound key is often used.Disadvantage - Static - overflow can rapidly increase . INDEX PAGES <N < J < E < >BP =BP DATA PAGES OVERFLOW PAGES

  10. Brubaker Cary Eagleton Allen Baker Bottorff B-Tree Leaf pages contain RIDs to the individual records. Data is returned in sorted order to the data buffer manager. Dynamic index - more balanced index than ISAM. More widely used. > M <=M INDEX PAGES <N < <J E LEAF PAGES Allen Baker Bottorff Brubaker Cary Eagleton DATA PAGES

  11. Hash In hashed files the key value is directly related one-to-one with a physical address by means of a mathematical formula. Address=f (key-value) Advantage no index need be used - no index I/O required -therefore fast access Disadvantage large amounts of storage space may remain unused Hash index may be a possible choice for an evenly distributed key value table.

  12. Data Storage Structures Data compressed is stored on disk in pages. A page is a unit of physical I/O typically containing many logical records each of which maps to a row in a database table. Each record is identified uniquely by a RID. RID for a record ‘x’ 20 4 page no offset to the location of a record pointer ‘P’ 20 record ‘x’ P free space

  13. BEFORE AFTER 1 2 2 1 4 3 4 5 6 5 6 7 7 8 9 8 10 9 10 12 11 12 13 14 13 15 14 16 15 17 16 17 18 18 19 19 20 21 20 21 2% free space 5% free space Sort by primary key Unload Reload Database Re-organisation Utility Before and after of a single data page

  14. Parallel Processing Parallel processing has been described as : “The key to high performance transaction and database processing” Use the paper on ‘Open Issues in Parallel Query Optimisation’ as a discussion document to test whether: “Parallel Processing offers the possibility of vastly improved performance where the database is very large or can be conveniently partitioned across different disks to reduce the overheads of communication.” Look at the topics: Share-Nothing and Share-memory architectures. Load-balancing, disk access problems, communications overheads. Application applicability - especially in respect of Object-oriented databases.

More Related