
Database Management Systems



  1. Database Management Systems

  2. Learning Objectives • Recognize complexities/limitations of traditional data management approaches • Appreciate advantages of the database approach to data management • Understand creation of normalized tables in a relational database • Understand use of entity-relationship diagrams in database design and implementation • Explain the importance of advanced database applications in decision support and knowledge management

  3. Business Event Processing • Raw data captured as events occur • Minimum data to be collected/stored • Who, What, When, Where • Data can be aggregated to meet user requirements

  4. Applications Versus Database Approaches to Business Event Processing • Computer processing involves two components: data and instructions (programs). • Conceptually, there are two methods for designing the interface between program instructions and data: • file-oriented processing: a specific data file is created for each application • data-oriented processing: a single data repository is created to support numerous applications. • Disadvantages of file-oriented processing include redundant data and programs, and varying formats for storing the redundant data. • The format of similar fields may vary from file to file because programmers used inconsistent field formats.

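  5. Applications Approach [Diagram: User 1's sales transactions go to the sales program and User 2's inventory transactions go to the inventory program. Each program maintains its own data files (sales order file, inventory file, customer file), so the sales order file and the inventory file each exist in more than one copy.]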

  6. Data Redundancy in Application Approach • Data Storage - creates excessive storage costs, whether data is kept on paper or in magnetic form • Data Updating - any changes or additions must be performed multiple times • Currency of Information - potential problem of failing to update all affected files • Task-Data Dependency - user’s inability to obtain additional information as his or her needs change • Data Inconsistencies - data format inconsistencies

  7. Application approach - Record Layout • Sales order record layout: Sales order #, Sales order date, Customer PO #, Sales region, Unit price, Customer code, Quantity ordered, Item # • Inventory record layout: Inventory item #, Quantity balance, Sales order #, Customer code, Item name, Unit price, Quantity ordered, Unit cost

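  8. Database Approach [Diagram: User 1, User 2, and User 3 submit transactions through the sales program, the inventory program, and the accounting GL program respectively. All three programs reach a single shared database (data elements A, B, C, X, Y, L, M) through the DBMS.]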

  9. Advantages of the Database Approach Data sharing through a centralized database resolves problems of the applications approach: • No data redundancy - Data is stored only once, eliminating data redundancy and reducing storage costs. • Single update - Because data is in only one place, it requires only a single update procedure, reducing the time and cost of keeping the database current. • Current values - A change to the database made by any user yields current data values for all other users. • Task-data independence - As users’ information needs expand beyond their immediate domain, the new needs can be more easily satisfied than under the flat-file approach.

  10. Disadvantages of the Database Approach • Can be costly to implement • additional hardware, software, storage, and network resources are required • Can only run in certain operating environments • may make it unsuitable for some system configurations • Because it is so different from the Applications (file-oriented) approach, the database approach requires training users and may give rise to resistance

  11. Elements of the Database Approach [Diagram labels: users; transactions; user programs (applications); user queries; system requests; system development process; database administrator; DBMS (data definition language, data manipulation language, query language); host operating system; physical database]

  12. Database Management Systems (DBMS) • A DBMS is a set of integrated programs designed to simplify the tasks of creating, accessing, and managing databases • Its functions are: • defining data • defining relations among data • interfacing with the operating system • mapping each user’s view of data (subschema) to the organisational view of data (schema)

  13. DBMS Features • User Programs - makes the presence of the DBMS transparent to the user • Direct Query - allows authorized users to access data without programming • Application Development - user created applications • Backup and Recovery - copies database • Database Usage Reporting - captures statistics on database usage (who, when, etc.) • Database Access - authorizes access to sections of the database

  14. Internal Controls and DBMS • The purpose of the DBMS is to provide controlled access to the database. • The DBMS is a special software system programmed to know which data elements each user is authorized to access and to deny unauthorized requests for data.

  15. Data Definition Language (DDL) • DDL is a programming language used to define the database to the DBMS. • The DDL identifies the names and the relationship of all data elements, records, and files that constitute the database.
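
As a rough illustration of what defining the database to the DBMS can look like, the sketch below uses SQL's CREATE TABLE statement through Python's built-in sqlite3 module. The slides do not name a particular DBMS or DDL; SQLite, the table name `customers`, and its columns are assumptions chosen to echo the customer attributes on slide 17.

```python
import sqlite3

# Hypothetical DDL: table and column names are illustrative, not from the slides.
DDL = """
CREATE TABLE customers (
    cust_code    TEXT PRIMARY KEY,   -- unique identifier for each customer
    cust_name    TEXT NOT NULL,
    credit_limit REAL,
    sales_region TEXT
);
"""

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript(DDL)              # the DDL tells the DBMS what data elements exist

# Ask the DBMS which columns it now knows about (row[1] is the column name).
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
print(columns)
```

The point of the sketch is only that the structure is declared once to the DBMS, which then keeps its own record of names and types.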

  16. Viewing Levels and Schema • Viewing Levels: • internal (physical) view - physical arrangement of records • conceptual (logical) view – schema of database • user view – subschema of the database that each user views

  17. Schema & Subschema • Schema: Customer code, Customer name, Address, Credit limit, Sales rep, Sales region, YTD sales • Sales manager’s subschema: Customer code, Customer name, Sales rep, Sales region, YTD sales • Credit controller’s subschema: Customer code, Customer name, Credit limit
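
One common way to realise subschemas in a relational DBMS is with views. The sketch below is a minimal example under that assumption, using SQLite via Python's sqlite3 module; the table and view names, and the single sample row, are hypothetical, while the column lists follow the schema and the two subschemas on slide 17.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (              -- the schema (conceptual view)
    customer_code TEXT PRIMARY KEY,
    customer_name TEXT,
    address       TEXT,
    credit_limit  REAL,
    sales_rep     TEXT,
    sales_region  TEXT,
    ytd_sales     REAL
);
CREATE VIEW sales_manager_view AS    -- sales manager's subschema
    SELECT customer_code, customer_name, sales_rep, sales_region, ytd_sales
    FROM customer;
CREATE VIEW credit_controller_view AS  -- credit controller's subschema
    SELECT customer_code, customer_name, credit_limit
    FROM customer;
-- Illustrative sample row; the values are made up.
INSERT INTO customer VALUES ('A035', 'Ali', 'n/a', 5000, 'Rep 1', 'North', 1200);
""")

def view_columns(view_name):
    """Return the column names a given user view exposes."""
    cursor = conn.execute(f"SELECT * FROM {view_name}")
    return [description[0] for description in cursor.description]

print(view_columns("sales_manager_view"))
print(view_columns("credit_controller_view"))
```

Each view exposes only part of the same stored table, so the credit controller never sees YTD sales and the sales manager never sees the credit limit.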

  18. Data Manipulation Language (DML) • DML is the proprietary programming language that a particular DBMS uses to retrieve, process, and store data. • Entire user programs may be written in the DML, or selected DML commands can be inserted into universal programs, such as COBOL and FORTRAN.

  19. Query Language • The query capability permits end users and professional programmers to access data in the database without the need for conventional programs. • Structured Query Language (SQL) has emerged as the standard query language and is a useful tool for accounting and business professionals.
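
To make the DML and query slides concrete, here is a small sketch that stores rows with INSERT and then answers an ad hoc question with a SELECT. It assumes SQLite through Python's sqlite3 module; the table name `order_lines` is hypothetical, and the data is the sales order line item data from the normalization slides (36 and 37).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines (so_number INTEGER, item_number INTEGER, qty_ordered INTEGER)"
)

# DML: store the line items shown on the normalization slides.
rows = [
    (101, 211, 5), (101, 217, 10),
    (102, 202, 4), (102, 217, 3),
    (103, 202, 2), (103, 211, 12), (103, 217, 7),
]
conn.executemany("INSERT INTO order_lines VALUES (?, ?, ?)", rows)

# Query: total quantity ordered per item, with no purpose-built report program.
totals = conn.execute(
    "SELECT item_number, SUM(qty_ordered) "
    "FROM order_lines GROUP BY item_number ORDER BY item_number"
).fetchall()
print(totals)
```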

  20. Functions of the Database Administrator

  21. Coding Systems • Sequential (serial numbering) • Student ID numbers • “Wait” ticket at Post Office • Example based on employee ID codes: • 001 - 1st employee hired • 002 - 2nd employee hired • etc

  22. Coding Systems..contd. • Block- block of digits identified for certain purpose • Chart of Accounts: • 001-099 Assets • 100 -199 Liabilities • 200 – 299 Owners’ equity • 300 -399 Revenue • 400 – 499 Expenses • Employee ID codes: • 001-100 - Fab dept. • 101-200 - Assembly dept.

  23. Coding Systems…contd. • Group (significant digit) code – a series of blocks of digits; each block is assigned a specific meaning • Inventory item: XX – X – XX – XXXX = product group – product model – warehouse code – item (unique identifier) • Bank customer account code: XXXX – XXXXXXX – XX – XX = branch code – customer (unique) – account type – check digit

  24. Coding Systems…contd. • Hierarchical code – orders items in descending order, where each successive rank is a subset of the rank above. • Employee ID: Division – Branch – Department – Employee code (unique) • Mnemonic code – usually made of letters for easy remembering or recognition. For example: M = male and F = female
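
The block and group codes described on slides 22 and 23 can be decoded mechanically. The sketch below is one possible way to do so: `classify_account` uses the chart-of-accounts blocks from slide 22, and `parse_inventory_code` splits a group code laid out as XX-X-XX-XXXX. The function names, the hyphen separator, and the sample code "12-3-04-0567" are assumptions made for illustration.

```python
# Block code: ranges taken from the chart of accounts on slide 22.
ACCOUNT_BLOCKS = [
    (1, 99, "Assets"),
    (100, 199, "Liabilities"),
    (200, 299, "Owners' equity"),
    (300, 399, "Revenue"),
    (400, 499, "Expenses"),
]

def classify_account(code: int) -> str:
    """Return the account class whose block contains the given code."""
    for low, high, name in ACCOUNT_BLOCKS:
        if low <= code <= high:
            return name
    raise ValueError(f"account code {code} is outside every block")

def parse_inventory_code(code: str) -> dict:
    """Split a group code of the form XX-X-XX-XXXX into its meaningful parts."""
    parts = code.split("-")
    if [len(part) for part in parts] != [2, 1, 2, 4]:
        raise ValueError(f"{code!r} does not match the layout XX-X-XX-XXXX")
    group, model, warehouse, item = parts
    return {
        "product_group": group,
        "product_model": model,
        "warehouse": warehouse,
        "item": item,
    }

print(classify_account(150))
print(parse_inventory_code("12-3-04-0567"))
```

Because every block of digits carries meaning, a program can validate or summarise records by any part of the code without a lookup table for individual items.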

  25. Logical Data Structures or Database Models • A particular method used to organize records in a database is called the database’s structure. • The objective is to develop this structure efficiently so that data can be accessed quickly and easily. • Four types of structures or database models are: • hierarchical (the tree structure): • parent (1) - child (m) • Network : parent (m) - child (m) • Relational: two-dimensional tables • Object-oriented: stores more complex data types

  26. The Relational Model • The relational model portrays data in the form of two dimensional tables: • Each column stores a specific attribute (data element) of an entity (e.g., customer). • Each row (record) stores all the (required) information of a particular instance (unique) of the entity stored in the table, e.g., a particular customer (Sykt XYZ) in the customer table. • Each row has a unique identifier called the primary key (first column of the table). • Some tables use two or more columns in combination to become the primary key for each row. This type of key is called the composite primary key.

  27. Properly Designed Relational Tables • No repeating values - All occurrences at the intersection of a row and column are a single value. • The attribute values in any column must all be of the same class. • Each column in a given table must be uniquely named. • Each row in the table must be unique in at least one attribute, which is the primary key.
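
The uniqueness rule for rows can be delegated to the DBMS by declaring a primary key. The following sketch assumes SQLite via Python's sqlite3 module and a hypothetical table `sales_order_lines` whose composite primary key is (so_number, item_number), the same pair the 1NF slide uses; `try_insert` is an illustrative helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_order_lines (
        so_number   INTEGER,
        item_number INTEGER,
        qty_ordered INTEGER,
        PRIMARY KEY (so_number, item_number)   -- composite primary key
    )
""")
conn.execute("INSERT INTO sales_order_lines VALUES (101, 211, 5)")
conn.execute("INSERT INTO sales_order_lines VALUES (101, 217, 10)")  # same order, other item: allowed

def try_insert(row):
    """Return True if the DBMS accepts the row, False if it violates the key."""
    try:
        conn.execute("INSERT INTO sales_order_lines VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_accepted = try_insert((101, 211, 99))  # repeats an existing key pair
print(duplicate_accepted)
```

Neither column is unique on its own here; only the combination identifies a row, which is what makes the key composite.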

  28. Entity Association: Cardinality • (1:0,1) • (1:1) • (1:0,M) • (1:M) • (M:M)

  29. Relational Model Data Linkages (>1 table) • No explicit pointers are present. The data are viewed as a collection of independent tables. • Relations are formed by an attribute that is common to both tables in the relation. • Assignment of foreign keys: • in a 1-to-1 association, either table’s primary key may be the foreign key. • in a 1-to-many association, the primary key on the “one” side is embedded as the foreign key on the “many” side. • in a many-to-many association, may embed foreign keys or create a separate linking table.
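
A minimal sketch of a one-to-many linkage follows, assuming SQLite through Python's sqlite3 module. The customer's primary key (`cust_code`) is embedded in `sales_orders` as a foreign key, and a join on that shared attribute relates the two tables. Table and function names are illustrative, the data comes from the normalization slides, and the customer code 'Z999' is made up to show a rejected row. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE customers (
    cust_code TEXT PRIMARY KEY,
    cust_name TEXT
);
CREATE TABLE sales_orders (
    so_number INTEGER PRIMARY KEY,
    cust_code TEXT NOT NULL REFERENCES customers(cust_code)  -- foreign key
);
INSERT INTO customers VALUES ('A035', 'Ali'), ('B023', 'Banny'), ('C118', 'Chong');
INSERT INTO sales_orders VALUES (101, 'A035'), (102, 'B023'), (103, 'C118');
""")

# The tables are linked only by the common attribute, not by pointers.
joined = conn.execute(
    "SELECT so.so_number, c.cust_name "
    "FROM sales_orders AS so JOIN customers AS c ON c.cust_code = so.cust_code "
    "ORDER BY so.so_number"
).fetchall()
print(joined)

def orphan_order_rejected():
    """Try to record an order for a customer code that does not exist."""
    try:
        conn.execute("INSERT INTO sales_orders VALUES (104, 'Z999')")
    except sqlite3.IntegrityError:
        return True
    return False

print(orphan_order_rejected())
```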

  30. Three Types of Anomalies (Errors) • Insertion Anomaly: A new item cannot be added to the table until at least one entity uses a particular attribute item. • Deletion Anomaly: If an attribute item used by only one entity is deleted, all information about that attribute item is lost. • Update Anomaly: A modification on an attribute must be made in each of the rows in which the attribute appears. • Anomalies can be corrected by creating relational tables.
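
The update anomaly is easy to reproduce with a flat table. In the sketch below, plain Python dictionaries stand in for rows of the unnormalized sales order data (order, item, and customer values are taken from slides 34 and 35); the new name "Ali Trading" and the helper `names_for` are hypothetical. Changing the customer's name in only one of the rows leaves the table contradicting itself.

```python
# Flat rows: the customer name is repeated on every line of every order.
flat_rows = [
    {"so_number": 101, "item_number": 211, "cust_code": "A035", "cust_name": "Ali"},
    {"so_number": 101, "item_number": 217, "cust_code": "A035", "cust_name": "Ali"},
    {"so_number": 102, "item_number": 202, "cust_code": "B023", "cust_name": "Banny"},
]

def names_for(cust_code, rows):
    """Collect every distinct name recorded against one customer code."""
    return {row["cust_name"] for row in rows if row["cust_code"] == cust_code}

# An incomplete update: only the first matching row is changed.
flat_rows[0]["cust_name"] = "Ali Trading"

conflicting = names_for("A035", flat_rows)
print(conflicting)  # two different names for the same customer code
```

If the name were held once in a separate customers table, a single update would change it for every order.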

  31. Advantages of Relational Tables • Removes all three anomalies • Various items of interest (customers, inventory, sales) are stored in separate tables. • Space is used efficiently. • Very flexible. Users can form ad hoc relationships.

  32. The Normalization Process • A process which systematically splits unnormalized complex tables into smaller tables that meet two conditions: • all non-key (secondary) attributes in the table are dependent on the primary key • all non-key attributes are independent of the other non-key attributes • When unnormalized tables are split and reduced to third normal form, they must then be linked together by foreign keys. A foreign key is the primary key of another related table.

  33. Steps in Normalization • Table with repeating groups → remove repeating groups → First normal form (1NF) • 1NF → remove partial dependencies → Second normal form (2NF) • 2NF → remove transitive dependencies → Third normal form (3NF) • 3NF → remove remaining anomalies → Higher normal forms

  34. Unnormalized Table • Sales_Orders table: one sales order record contains repeating item numbers, item names, quantities ordered, customer codes, and customer names. • An unnormalized table is a table that contains repeated attributes (or fields) within each row (or record).

      SO_Number  Item_Number  Item_Name  Qty_Ordered  Cust_Code  Cust_Name
      101        211          Wheel      5            A035       Ali
                 217          Tube       10           A035       Ali
      102        202          Handlebar  4            B023       Banny
                 217          Tube       3            B023       Banny
      103        211          Wheel      12           C118       Chong
                 217          Tube       7            C118       Chong
                 202          Handlebar  2            C118       Chong

  35. First Normal Form (1NF) 1. Remove the repeating groups: • SO_Number and Item_Number together are made the composite primary key.

      SO_Number  Item_Number  Item_Name  Qty_Ordered  Cust_Code  Cust_Name
      101        211          Wheel      5            A035       Ali
      101        217          Tube       10           A035       Ali
      102        202          Handlebar  4            B023       Banny
      102        217          Tube       3            B023       Banny
      103        211          Wheel      12           C118       Chong
      103        217          Tube       7            C118       Chong
      103        202          Handlebar  2            C118       Chong

  36. Second Normal Form (2NF) 2. Remove the partial dependencies (attributes that depend on only part of the composite key):

      Sales_Orders
      SO_Number  Cust_Code  Cust_Name
      101        A035       Ali
      102        B023       Banny
      103        C118       Chong

      Inventory_Items
      Item_Number  Item_Name
      202          Handlebar
      211          Wheel
      217          Tube

      Sales_Order line items
      SO_Number  Item_Number  Qty_Ordered
      101        211          5
      101        217          10
      102        202          4
      102        217          3
      103        202          2
      103        211          12
      103        217          7

  37. Third Normal Form (3NF) 3. Remove the transitive dependencies (Cust_Name depends on Cust_Code, not directly on SO_Number):

      Sales_Orders
      SO_Number  Cust_Code
      101        A035
      102        B023
      103        C118

      Customers
      Cust_Code  Cust_Name
      A035       Ali
      B023       Banny
      C118       Chong

      Inventory_Items
      Item_Number  Item_Name
      202          Handlebar
      211          Wheel
      217          Tube

      Sales_Order line items
      SO_Number  Item_Number  Qty_Ordered
      101        211          5
      101        217          10
      102        202          4
      102        217          3
      103        202          2
      103        211          12
      103        217          7
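
The decomposition walked through on slides 35 to 37 can be sketched in a few lines of Python. Starting from the 1NF rows, each attribute is moved to the table whose key it actually depends on. The function name `to_third_normal_form` and the use of dictionaries keyed by primary key are assumptions made for illustration; the data is the slides' own.

```python
# 1NF rows: (SO_Number, Item_Number, Item_Name, Qty_Ordered, Cust_Code, Cust_Name)
FIRST_NF = [
    (101, 211, "Wheel", 5, "A035", "Ali"),
    (101, 217, "Tube", 10, "A035", "Ali"),
    (102, 202, "Handlebar", 4, "B023", "Banny"),
    (102, 217, "Tube", 3, "B023", "Banny"),
    (103, 211, "Wheel", 12, "C118", "Chong"),
    (103, 217, "Tube", 7, "C118", "Chong"),
    (103, 202, "Handlebar", 2, "C118", "Chong"),
]

def to_third_normal_form(rows):
    """Split 1NF rows into four tables, each a dict keyed by its primary key."""
    sales_orders = {}     # SO_Number -> Cust_Code (foreign key to customers)
    customers = {}        # Cust_Code -> Cust_Name (transitive dependency removed)
    inventory_items = {}  # Item_Number -> Item_Name (partial dependency removed)
    order_lines = {}      # (SO_Number, Item_Number) -> Qty_Ordered
    for so_number, item_number, item_name, qty, cust_code, cust_name in rows:
        sales_orders[so_number] = cust_code
        customers[cust_code] = cust_name
        inventory_items[item_number] = item_name
        order_lines[(so_number, item_number)] = qty
    return sales_orders, customers, inventory_items, order_lines

sales_orders, customers, inventory_items, order_lines = to_third_normal_form(FIRST_NF)
print(sales_orders)
print(customers)
print(inventory_items)
print(len(order_lines))
```

Each fact now appears exactly once: three orders, three customers, three items, and seven order lines.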

  38. Accountants and Data Normalization • The update anomaly can generate conflicting and obsolete database values. • The insertion anomaly can result in unrecorded transactions and incomplete audit trails. • The deletion anomaly can cause the loss of accounting records and the destruction of audit trails. • Accountants should have an understanding of the data normalization process and be able to determine whether a database is properly normalized.

  39. Six Phases in Designing Relational Databases • Identify entities • identify the primary entities of the organization • construct a data model of their relationships • Construct a data model showing entity associations • determine the associations between entities • model associations into an ER diagram

  40. Six Phases in Designing Relational Databases • Add primary keys and attributes to the model • assign primary keys to all entities in the model to uniquely identify records • every attribute should appear in one or more user views • Normalize the data model and add foreign keys • remove repeating groups, partial and transitive dependencies • assign foreign keys to be able to link tables

  41. Six Phases in Designing Relational Databases • Construct the physical database • create physical tables • populate tables with data • Prepare the user views • normalized tables should support all required views of system users • user views restrict users from having access to unauthorized data

  42. Distributed Data Processing [Diagram: a centralized database at the central site, connected to Site A, Site B, and Site C.]

  43. Distributed Data Processing (DDP) • Data processing (DP) is organized around several information processing units (IPUs) distributed throughout the organization and placed under the control of the end users. • DDP does not mean decentralization: • IPUs are connected to one another and coordinated

  44. Potential Advantages of DDP • Cost reductions in hardware and data entry tasks • Improved cost control responsibility • Improved user satisfaction since control is closer to the user level • Backup of data can be improved through the use of multiple data storage sites

  45. Potential Disadvantages of DDP • Loss of control • Mismanagement of organization-wide resources • Hardware and software incompatibility • Redundant tasks and data • Consolidating incompatible tasks • Difficulty attracting qualified personnel • Lack of standards

  46. Centralized Databases in DDP Environment • The data is retained in a central location. • Remote IPUs send requests for data. • Central site services the needs of the remote IPUs. • The actual processing of the data is performed at the remote IPU.

  47. Data Currency • Occurs in DDP with a centralized database • During transaction processing, the data will temporarily be inconsistent as a record is being read and updated. • Database lockout procedures are necessary to keep IPUs from reading inconsistent data and from writing over a transaction being written by another IPU.
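
The lockout idea can be sketched with threads standing in for IPUs and a mutex standing in for the database lockout. This is only an analogy under stated assumptions: the class `CentralRecord`, the function `run_ipus`, and the use of Python's `threading.Lock` are illustrative, not a description of how any particular DBMS implements locking.

```python
import threading

class CentralRecord:
    """One record in the central database, protected by a lockout."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def post(self, amount):
        # Lockout: no other IPU may read or write between our read and our write.
        with self._lock:
            current = self.balance           # read
            self.balance = current + amount  # update and write back

def run_ipus(record, ipu_count, postings_per_ipu):
    """Simulate several IPUs posting transactions to the same record."""
    def worker():
        for _ in range(postings_per_ipu):
            record.post(1)

    threads = [threading.Thread(target=worker) for _ in range(ipu_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return record.balance

final_balance = run_ipus(CentralRecord(0), ipu_count=4, postings_per_ipu=1000)
print(final_balance)
```

Without the lock, two IPUs could read the same balance and each write back its own result, losing one of the postings.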

  48. Distributed Databases: Partitioning • Splits the central database into segments that are distributed to their primary users • Advantages: • users’ control is increased by having data stored at local sites • transaction processing response time is improved • the volume of transmitted data between IPUs is reduced • reduces the potential data loss from a disaster

  49. The Deadlock Phenomenon • Especially a problem with partitioned databases • Occurs when multiple sites lock each other out of data that they are currently using • One site needs data locked by another site. • Special software is needed to analyze and resolve conflicts. • Transactions may be terminated and have to be restarted.

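  50. The Deadlock Phenomenon [Diagram: three sites hold the data segments A,B; C,D; and E,F. One transaction has locked A and is waiting for C, a second has locked C and is waiting for E, and a third has locked E and is waiting for A, so none of them can proceed.]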
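
One common way for deadlock-resolution software to analyse such conflicts is to follow the chain of "who is waiting for whom" and look for a cycle. The sketch below does this for the situation on slide 50. The site names ("Site 1" to "Site 3"), the function `find_deadlock`, and the simplification that each site waits for at most one item are all assumptions for illustration.

```python
def find_deadlock(holds, waits_for):
    """Return the sites forming a wait cycle, or None if there is no deadlock.

    holds:     site -> set of data items the site has locked
    waits_for: site -> the single data item the site is waiting for
    """
    owner = {item: site for site, items in holds.items() for item in items}
    # blocked_by[s] is the site holding the item that s is waiting for.
    blocked_by = {
        site: owner[item] for site, item in waits_for.items() if item in owner
    }
    for start in blocked_by:
        path = [start]
        current = start
        while current in blocked_by:
            current = blocked_by[current]
            if current == start:
                return path      # followed the chain back to where we began
            if current in path:
                break            # a cycle exists, but it does not include start
            path.append(current)
    return None

# The situation on slide 50 (site names are hypothetical).
holds = {"Site 1": {"A"}, "Site 2": {"C"}, "Site 3": {"E"}}
waits_for = {"Site 1": "C", "Site 2": "E", "Site 3": "A"}
cycle = find_deadlock(holds, waits_for)
print(cycle)
```

Once a cycle is found, terminating and restarting any one transaction in it releases a lock and lets the others continue, which matches the resolution described on slide 49.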
