1 / 23

Relational Data Analysis

Relational Data Analysis. Learning outcomes understand the process of normalisation; perform Relational Data Analysis; recognise the importance of normalised databases; recognise first, second and third normal forms; augment and formalise your understanding of Logical Data Modelling;

davin
Download Presentation

Relational Data Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Relational Data Analysis Learning outcomes • understand the process of normalisation; • perform Relational Data Analysis; • recognise the importance of normalised databases; • recognise first, second and third normal forms; • augment and formalise your understanding of Logical Data Modelling; • decide when in the development cycle to perform normalisation; • evaluate the integrity of a data structure.

  2. Relational Data AnalysisConcepts • Relations, Tables and Entity Types • Repeating Groups and Levels • Functional Dependencies • Advantages of Normalisation • Unnormalised Data • First, Second and Third Normal Forms • Rationalising 3NF Tables • Converting 3NF Tables into an LDS

  3. Relational Data AnalysisRelations and Tables Supplier Number Supplier Name Supplier Address Supplier Tel. No. Sup Con 1463 SRW 115 Lancelot St 020-75630254 John 3621 Off Beat Recordings 12 High St 020-8… ... 2327 Bella Sonic Lake Industrial Estate ... ... 6762 … 3 Lot’s Corner ... ... 1970s Edgar Codd IBM Attribute Names Primary Key Columns Rows

  4. Relational Data AnalysisTables and Entity Types Supplier Supplier Number Supplier Name Supplier Address Supplier Tel. Number Supplier Contact Name Supplier Supplier Number Supplier Name Supplier Address Supplier Tel. Number Supplier Contact Name SUPPLIER (Supplier Number, Supplier Name, Supplier Address,Supplier Tel. Number, Supplier Contact Name)

  5. Relational Data AnalysisRepeating Groups and Levels Product’s Suppliers List Product Number: 993201 Product Name:Unbranded Blank3hrVideoTapes Product Type Code:BV Product Type Name:BlancVideo Supplier Supplier Supplier’s Cost Main Supplier Number Name Product Ref No Price Y/N 1463 SRW 3HVHS 54p Y 3628 Videos Are Us 438893 57p N 2327 Bella Sonic 3485VHS/3 53p N

  6. Relational Data AnalysisRepeating Groups and Levels PRODUCT’S SUPPLIERS Product Number Product Name Product Type Code Product Type Name Supplier Number Supplier Name S/P Ref. Number Cost Price Main Supplier Y/N level 1 1 1 1 2 2 2 2 2

  7. Relational Data AnalysisFunctional Dependencies • An attribute X is said to be functionally dependent on an attribute Y if each value of Y is associated with only one value of X • For example, each product is of one product type. This means that each product number is associated with only one product type code.Product type code is therefore functionally dependent on product number. The opposite is not true as each product type code may be associated with many product numbers

  8. Relational Data AnalysisFunctional Dependencies • Another way of phrasing functional dependency is to say that the value of X can be determined from the value of Y, or that Y functionally determines X • So product number functionally determines the value of product type code, or the value of product type code can be determined from the value of the product number, i.e. given the value of a product number we can always establish the value of the associated product type code.

  9. Relational Data AnalysisNormalisation • The process of normalisation involves applying a series of refinements to groups of data items in order to produce tables that conform to specified standards, known as normal forms • Unnormalised tables are converted to First Normal Form by removing repeating groups into separate tables. Second and Third Normal Forms are achieved by reducing and splitting tables so that the only functional dependencies which exist are between the primary keys and the remaining non-key attributes.

  10. Relational Data AnalysisAdvantages of Normalisation • Before describing normalisation in detail it is worth mentioning some of its advantages briefly. Data in Third Normal Form (3NF) consists of tables of closely associated attributes which are entirely dependent on ‘the key, the whole key, and nothing but the key’ • This has the effect of minimising data duplication across different tables, thereby resolving many of the problems associated with data redundancy. In particular it should reduce the incidence of ‘update anomalies’

  11. Relational Data AnalysisAdvantages of Normalisation • Update anomalies is the collective term for problems with modifying, inserting and deleting data from a database • These can be illustrated by considering the ‘unnormalised’ contents of the list of a product’s suppliers again • If we were to implement this data structure as it stands, and to use it as the only place in which product details were stored we would encounter the following problems:

  12. Relational Data AnalysisAdvantages of Normalisation • Insertion Anomalies. No new suppliers could be added to the system without adding a product • Deletion Anomalies. If the last remaining product for a given supplier were deleted, then all information on that supplier would be lost • Amendment Anomalies. Any change to a supplier’s details (e.g. to the telephone number) would mean that every product for that supplier would need amending to keep it in line.

  13. Relational Data AnalysisUNF to 1NF Choose data items Identify keys Split groups Product Number Product Name Product Type Code Product Type Name Supplier Number Supplier Name S/P Ref. Number Cost Price Main Supplier Y/N 1NF UNF level 1 1 1 1 2 2 2 2 2 Product Number Product Name Product Type Code Product Type Name Product Number Supplier Number Supplier Name S/P Ref. Number Cost Price Main Supplier Y/N Note key

  14. Relational Data Analysis1NF to 2NF Does this attribute depend on the whole of the primary key? 2NF 1NF Product Number Product Name Product Type Code Product Type Name Product Number Supplier Number Supplier Name S/P Ref. Number Cost Price Main Supplier Y/N Product Number Product Name Product Type Code Product Type Name Product Number Supplier Number S/P Ref. Number Cost Price Main Supplier Y/N Supplier Number Supplier Name

  15. Relational Data Analysis2NF to 3NF Is this attribute dependent on any other non-key attribute(s)? Note foreign key 3NF 2NF Product Number Product Name Product Type Code Product Type Name Product Number Supplier Number S/P Ref. Number Cost Price Main Supplier Y/N Supplier Number Supplier Name Product Number Product Name *Product Type Code Product Type Code Product Type Name Product Number Supplier Number S/P Ref. Number Cost Price Main Supplier Y/N Supplier Number Supplier Name

  16. Relational Data AnalysisNaming the 3NF tables Table Names 3NF Product Number Product Name *Product Type Code Product Type Code Product Type Name Product Number Supplier Number S/P Ref. Number Cost Price Main Supplier Y/N Supplier Number Supplier Name PRODUCT PRODUCT TYPE SUPPLIER PRODUCT SUPPLIER

  17. Relational Data AnalysisRationalising 3NF tables • Once we have carried out normalisation on a number of Functions we will have several sets of tables in 3NF, which we now rationalise into a single, larger set • Any tables that share a primary key should be merged, as should tables with matching candidate keys • We will also look for attributes which now act as foreign keys when compared with primary keys in other 3NF sets. A little care is needed to ensure that any synonyms or homonyms are identified, as failure to do so could lead to missing or spurious merges.

  18. Relational Data AnalysisConverting 3NF tables into LDSs PRODUCT TYPE PRODUCT SUPPLIER SUPPLIER PRODUCT 3NF Product Number Product Name *Product Type Code Product Type Code Product Type Name Product Number Supplier Number S/P Ref. Number Cost Price Main Supplier Y/N Supplier Number Supplier Name PRODUCT Product Type Code PRODUCT TYPE SUPPLIER PRODUCT Product Number *Product Type Code Supplier Number SUPPLIER Represent each table as an entity type box List primary and foreign key attributes Small Keys Grab Large Keys *Supplier Number *Product Number Crows Feet Grab Asterisks

  19. Relational Data AnalysisComparing LDSs • We now compare our two data structures (the extract we just produced using RDA with the Required System LDM we have produced earlier in the development and decide whether any discrepancies are due to errors in Logical Data Modelling or whether they represent redundant information resulting from RDA • In practice there may be large numbers of entities involved in the comparison, so a fair amount of time is likely to be spent in identifying corresponding entities in the two models. Probably the best starting point is to look for common attributes, in particular common primary keys or candidate keys.

  20. Relational Data AnalysisSummary • Relational Data Analysis (RDA) is based on material published in the 1970s by Edgar Codd of IBM, proposing the application of mathematical set theory and algebra to the organisation of data • RDA is used to create data model extracts from collections of individual data items, which can then be used to enhance or confirm the Required System LDM and to provide the basis for database design

  21. Relational Data AnalysisSummary • The process of normalisation involves applying a series of refinements to groups of data items in order to produce tables that conform to specified standards, known as normal forms • Data in Third Normal Form (3NF) consists of tables of closely associated attributes which are entirely dependent on ‘the key, the whole key, and nothing but the key’. This has the effect of minimising data duplication across different tables, thereby resolving many of the problems associated with data redundancy

  22. Relational Data AnalysisSummary • Finally, the only way to learn normalisation is to practise. • This is specially true in order to understand the process of rationalisation and how to compare LDSs

  23. The Place of Relational Data Analysis Investigation BAM RD DFM BSO Specification WPM Conceptual Model External Design LDM User Organisation Policies and Procedures Decision Structure RDA FD Internal design Construction

More Related