B. Information Technology (Hons.) CMPB245: Database Design

B. Information Technology (Hons.)CMPB245:Database Design Physical Design

Objectives • Explain the purpose of physical database design • Design base relations and integrity rules • Select an appropriate file organization based on analysis of transactions

Physical Database Design: The Purpose Objective 1

Database design Database planning Conceptual design DBMS selection Systems definition Logical design Requirements collection & analysis Physical design Implementation Prototyping Data loading & conversion Testing Operational maintenance

Introduction • What did we do in Logical Database design? • We produced the Conceptual Data Model followed by the Local Logical Data Model and then the Global Logical Data Model • What does this global logical data model represents?

Introduction • Physical database design depends on the functions and power of selected DBMS • More than one way of implementing it • At this stage, the task is to • convert the global logical data model into a specific database implementation

Physical Database Design • What is Physical Database Design? • The process of producing a description of the implementation of the database on secondary storage • It describes the storage structures and access methods used to gain access effectively

Physical Database Design • It involves the following tasks: • designing the base relations and integrity constraints • selecting specific storage structures and access methods for the data to achieve good performance • designing security measures required on the data

Physical Database Design • Connolly says that a physical database designer should • know how the computer system hosting the DBMS functions • be fully aware of the functionality of the selected DBMS • Physical database design must be tailored to a specific DBMS system

Physical Database DesignThe Tasks Involved 4.0 Translate global logical data model for target DBMS 4.1 Design base relations for the target DBMS 4.2 Design enterprise constraints for the target DBMS

Physical Database DesignThe Tasks Involved 5.0 Design and implement physical representation 5.1 Analyse transactions 5.2 Choose file organizations 5.3 Choose secondary indexes 5.4 Consider the use of controlled redundancy 5.5 Estimate disk space

Physical Database DesignThe Tasks Involved 6.0 Design and implement security mechanisms 6.1 Design and implement user views 6.2 Design and implement access rules 7.0 Monitor and tune the operational system

Design Base Relations& Integrity Rules Objective 2

4.0 Translate Global Logical Data Model • Translate the relations from the global logical data model into a form that can be implemented in the DBMS • collate the information gathered during logical data modeling and documented in the data dictionary • use this information to produce the design of the base relations

4.0 Translate Global Logical Data Model • Knowledge about the target DBMS the designer should know: • does the system supports the definition of primary, foreign and alternate keys? • does it supports the definition of the required data, e.g. NOT NULL for primary key? • does it supports the definition of domains? • how to create the base relations?

4.1 Design base relations • Required information from the global data model: • the name of the relation • a list of attributes • the primary key, alternate keys and foreign keys • integrity constraints for foreign keys

4.1 Design base relations • For each attribute, you should have: • its domain - data types, length and any constraints • optional default value for the attribute • whether it can hold nulls • whether it is derived and how it is computed

4.1 Design base relations • To represent the design of base relations, use the DDL to define: • domains • default values • null indicators • See Fig. 9.1, pg. 273

4.1 Design base relations • Implementation of the base relations depends on the target DBMS • Use SQL statements • See Fig. 9.2, pg. 274 • These operations creates the domains, table and constraints • Document the design of base relations

4.2 Design enterprise constraints • If an organisation has its own rules governing its real world transactions • Enforce those rules in the target DBMS • This depends on the DBMS • Some systems do not support some or all of the enterprise constraints

4.2 Design enterprise constraints • For example, DreamHome would not allow a staff handles more than 10 properties, could be enforced in SQL as follows: CONSTRAINTS staff_not_handling_too_much CHECK (NOT EXISTS (SELECT sno FROM property_for_rent GROUP BY sno HAVING COUNT(*)>10)) • Document the design of enterprise constraints

Transactions Analysis and File Organization Objective 3

5.0 Design and implement physical representation • At this stage, determine the file organizations and access methods that will be used to store the base relations • Efficient file storage can be measured • Transaction throughput - no. of transactions processed in a given time • Response time - elasped time for a single transaction • Disk storage - amount of disk space used

5.0 Design and implement physical representation • The types of file organization depends on the support of the DBMS • The designer should fully understand • the storage structures that are available • how the system uses these structures

5.0 Design and implement physical representation • The activities in this stage are: 5.1 Analyse transactions 5.2 Choose file organizations 5.3 Add secondary indexes 5.4 Consider the introduction of controlled redundancy 5.5 Estimate disk space

5.1 Analyse transactions • Identify the transactions or queries that will run on the database • Why? • To determine • the best access structures for the attributes • which type of file is best used

5.1 Analyse transactions • For each transaction, determine: • expected running frequency of transactions, e.g. no. of updates per day • the relations and attributes accessed by the transaction • the type of access – whether query, insert, update, or delete

5.1 Analyse transactions • For update transaction, note the attributes that are updated • These may be candidates for avoiding an access structure, e.g. secondary index

5.1 Analyse transactions • For each transaction, determine the attributes use in any predicates • check whether the predicate involves • pattern matching • range searches • exact-match key retrieval • These attributes may be candidates for access structures

5.1 Analyse transactions • For each transaction, determine • the attributes of a query that are involved in the join of two or more relations • These attributes may be candidates for access structures

5.1 Analyse transactions • For each transaction, determine • time constraints imposed on the transactions, e.g., the transaction must complete within 1 sec • Attributes used in any predicates for critical transaction should have a higher priority for access structures

5.1 Analyse transactionsTransaction Usage Map • Impossible to analyse all transactions • Investigate the most important ones • Use Transaction Usage Map to • identify which relations each transaction accesses • indicate which of these relations are potentially heavily used

5.1 Analyse transactionsTransaction Usage Map • Map all transaction paths to relations • Determine which relations are most frequently accessed by transactions • Analyse selected transactions that involve these relations • How to do it?

5.1 Analyse transactionsTransaction Usage Map • Identify transactions, e.g., (A) Insert details for a new member of staff, given branch address (B) List rental properties handled by each staff at a given branch address (C) Assign a rental property to a staff, checking that a staff does not manage more than 10 properties (D) List rental properties handled by each branch office

5.1 Analyse transactionsTransaction Usage Map • Focus on the ER model for the transactions • Estimate the average and maximum number of occurrences in each relationships • Identify the relations that are most often required

avg = 20 max = 40 1500 Staff 1 M Oversees Has 1 avg = 6 max = 10 M M 1 12000 50 ISAllocated Property for Rent Branch avg = 50 max = 300 5.1 Analyse transactionsTransaction Usage Map Staff and Property for Rent are accessed more often

5.1 Analyse transactionsTransaction Usage Map • Consider the day and time that a transaction is run - peak, average, or min. load • Identify the pattern of operation of tran-sactions • mutually exclusive • conflict • Examine transactions closely if pattern of operation conflicts

5.1 Analyse transactionsTransaction Usage Map • Look how each relation is accessed • Insert (I) • Read (R) • Update (U) • Delete (D) • Identify which attributes are used to gain access • Asterisk all transactions that scan through a set of records

B (I) (R) B A A B (R) (R) (R) 5.1 Analyse transactionsTransaction Usage Map 1500 (A) Insert details for a new member of staff, given branch address Staff Oversees Has (B) List rental properties handled by each staff at a given branch address 12000 50 ISAllocated Property for Rent Branch

C (R) (U) C (R) D (R) (R) D C 5.1 Analyse transactionsTransaction Usage Map 1500 (D) List rental properties handled by each branch office (C) Assign a rental property to a staff, checking that a staff does not manage more than 10 properties Staff Oversees Has 12000 50 ISAllocated Property for Rent Branch

5.1 Analyse transactionsTransaction Table Transaction B Day Time Number of runs per hour Peak Mon 9-10 pm 2 Wed 2-4 pm 2 Ave - - - From relation To relation Attributes Access No. of times accessed - Branch 1 Address R(E) Bno R Branch Staff 8-20 Bno R(E)* Sno R FName R LName R Staff Property 48-200 for Rent Sno R(E)* Pno R Address R Entry point A set of records is scanned (B) List rental properties handled by each staff at a given branch address

5.0 Design and implement physical representation • The activities in this stage are: 5.1 Analyse transactions 5.2 Choose file organizations 5.3 Add secondary indexes 5.4 Consider the introduction of control-led redundancy 5.5 Estimate disk space

5.2 Choose file organizations • Consider using the following file organizations • Heap • Hash • Index Sequential Access Method (ISAM) • B+- Tree

5.2 Choose file organizations • Document choice of file organization • Give reasons for your choice based on the advantages of a particular file organization

THE END College of Information Technology

B. Information Technology (Hons.) CMPB245: Database Design