Learn fundamental concepts of data organization, database management, and storage systems. Understand the benefits and drawbacks of databases, different database models, and recent developments in data management. Discover the common functions of database systems and popular end-user systems. Explore data hierarchy, logical units, and database structures. Enhance your understanding of data management principles and key database concepts.
Organizing Data and Information Chapter 5
Chapter 5, organizing data and information, describes basic concepts that are essential to understanding the key functions in, and impacts of, data storage and database management systems. Most organizations have a significant amount of data; but for that data to be useful, it must be properly organized and easily accessible. Many firms develop databases in order to manage data more efficiently. Designed correctly, databases can contribute to organizational success by providing information to help reduce costs, increase profits, track past activities, and forecast the future. Principles of Information Systems, Fifth Edition
After studying Chapter 5, you should be able to address the learning objectives described in the following two slides.
Learning Objectives • Define general data management concepts and terms • Describe the advantages & disadvantages of the database approach to data management • Name 3 database models & list their features, advantages & disadvantages.
This chapter describes two basic approaches to data management: the traditional file-based approach and the newer database approach. The database approach provides significant advantages that you should be able to describe. There are different database models, and you should understand the differences among them.
Learning Objectives • Identify the common functions performed by all database management systems • Identify 3 popular end-user database management systems • Briefly discuss recent database developments.
Data and information are critical assets of an enterprise. Therefore, it is important that a database is well designed and well managed so that data can be easily accessed when needed. Because data is so important to any organization, you should understand database trends and developments and the benefits they will produce.
It is helpful to consider data as organized in a hierarchy, from the smallest unit (a bit) to the largest (a database). There is a difference between physical units of data, such as bits, which are how data exists in the computer or in storage, and logical units of data, which are how data is represented so that it is meaningful to users. Data physically exists in bits, bytes, and words. The smallest logical unit of data is a character, such as an “A” or a “$”. A character usually consists of one byte.
A character may be an uppercase or lowercase letter, a number, or a symbol. Characters are combined to make a field. A field is a combination of characters describing a characteristic of a business object or activity. For example, “employee name” and “social security number” are fields. They are characteristics, or attributes, describing “employee”. Fields are combined to form a record.
A record is a collection of related fields. For example, a student record would contain fields about one student, such as student name, address, SSN, and date of birth. Records are combined to form a file. A file is a collection of related records. For example, a STUDENT file would contain student records. Continuing the previous example, each student record would contain the student’s name, address, SSN, and date of birth. In order to distinguish one record from another, each record contains a record key.
A record key is a field that uniquely identifies a record. In our student file, the record key would be the social security number. Files are combined to create a database. At the top of the data hierarchy is the largest logical unit of data, a database. A database is a collection of integrated files that are organized in such a way as to be accessible to multiple applications and users. A database includes data and records and the relationships among them.
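The hierarchy described above (characters, fields, records, files, database) can be sketched in a few lines of Python. This is only an illustration: the class name, field names, and sample values are assumptions made for the example, not taken from the textbook.

```python
from dataclasses import dataclass


# Hypothetical record layout; the field names are illustrative assumptions.
@dataclass
class StudentRecord:
    ssn: str            # field that serves as the record key
    name: str           # a field is a combination of characters
    address: str
    date_of_birth: str


# A file is a collection of related records.
student_file = [
    StudentRecord("111-11-1111", "Ann Lee", "12 Oak St", "2001-04-09"),
    StudentRecord("222-22-2222", "Raj Patel", "9 Elm Ave", "2000-11-30"),
]

# A database groups integrated files so that several applications can share them.
database = {"STUDENT": student_file}

# The record key distinguishes one record from another.
by_key = {record.ssn: record for record in database["STUDENT"]}
print(by_key["222-22-2222"].name)
```

Indexing the file by its record key is what makes it possible to pick out exactly one student.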
Figure 5.2 illustrates some important database concepts. An entity is a general class of people, places, events, or objects (virtually anything) for which data is collected and stored. For instance, at a university, students, library books, and courses would be some of the entities. The entity shown in Figure 5.2 is employee. An attribute is a characteristic of an entity. In Figure 5.2, employee number, last name, first name, hire date, and department number are attributes describing employee. The specific value of an attribute, such as “Johns”, is called a data item.
A “key” is a field or set of fields that identifies a record. A primary key is a field or set of fields that uniquely identifies a record. No two records may contain the same value in a primary key field. For example, if two employees had identical social security numbers, it would be difficult to select a particular employee’s record. A secondary key is an alternative key that can be used to access records if the primary key is not known.
A secondary key is not necessarily unique. For example, if last name were a secondary key, there might be more than one employee named “Jones”. In that case, more than one employee record would be retrieved and displayed to the user, and the employee’s identity would have to be verified by using other information, such as department number or social security number.
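The difference between the two kinds of key can be sketched as two indexes over the same records. The employee numbers, names, and department numbers below are made-up sample data, and the tuple layout is an assumption chosen for brevity.

```python
from collections import defaultdict

# Hypothetical employee records: (employee_number, last_name, department_number)
employees = [
    (1001, "Jones", 10),
    (1002, "Smith", 20),
    (1003, "Jones", 30),
]

# Primary key index: each key value maps to exactly one record.
primary_index = {}
for emp in employees:
    if emp[0] in primary_index:
        raise ValueError(f"duplicate primary key: {emp[0]}")
    primary_index[emp[0]] = emp

# Secondary key index: one last name may map to several records.
secondary_index = defaultdict(list)
for emp in employees:
    secondary_index[emp[1]].append(emp)

# A lookup by last name returns every "Jones" ...
matches = secondary_index["Jones"]

# ... so other information, here the department number, narrows the result.
verified = [emp for emp in matches if emp[2] == 30]
print(matches, verified)
```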
The Traditional versus the Database Approach to Data Management
As stated earlier, there are two main approaches to data management: the traditional file management approach and the newer database approach. We’ll now consider the characteristics, benefits, and disadvantages of both approaches.
In the traditional, or file-based, approach to data management, separate data files are created by or for different application programs. In a file-based environment, for instance, the library would keep a STUDENT file associated with its program to check out books, the registrar’s office would have a different STUDENT file for use by the registration program, and the college of business would keep a separate file of business students for advising purposes.
These three files would contain redundant information, such as a student’s name, phone number, and address, as well as data unique to a particular application, such as the call number of a book checked out.
Problems with the “Traditional” Approach • Data redundancy • Program-data dependence • Inflexibility
Data redundancy is a problem in the file management environment, since it can lead to inconsistencies. For example, if the college of business maintains student grades in its student file, it would have to be sure that they were identical to those entered in the registrar’s file. If a student’s grade is changed, someone would need to be sure that it was changed in both places. Data redundancy thus makes it difficult to maintain data integrity. Data integrity is the degree to which the data in any one file is accurate.
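The grade example can be made concrete with a small sketch. The two dictionaries stand in for the registrar’s and the college of business’s separate files; the student data and the helper name `find_inconsistencies` are hypothetical.

```python
# Two separate files hold the same student's grade (hypothetical sample data).
registrar_file = {"222-22-2222": {"name": "Raj Patel", "grade": "B"}}
business_file = {"222-22-2222": {"name": "Raj Patel", "grade": "B"}}

# A grade change is applied in the registrar's file only.
registrar_file["222-22-2222"]["grade"] = "A"


def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose value for `field` differs between the two files."""
    shared_keys = file_a.keys() & file_b.keys()
    return sorted(k for k in shared_keys if file_a[k][field] != file_b[k][field])


conflicts = find_inconsistencies(registrar_file, business_file, "grade")
print(conflicts)
```

Nothing in either file flags the mismatch; it is found only if someone writes and runs a comparison like this one.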
Data stored in files is related by the programs using the data, not by the data themselves. Program-data dependence means that programs and data developed for one application are not compatible with programs and data organized differently for another application. For example, the registrar’s program might store grade point averages to three decimal places, such as 3.413, whereas the college of business’ program might store GPA to only two decimal places, such as 3.41. In a file-based environment, programs would need to be changed in order to access one file or the other. It can be costly to write programs bridging different files.
Since data stored in files is related by programs, changes cannot be made quickly when different data or combinations of data are needed. For example, if one day the dean of the college of business suddenly discovers that she needs information about IS students that she has never required before, such as the number of overdue library books they have had while active at the university, programmers would need to write the programs needed to extract the required information from the different files in the library and the college of business.
This could take a long time, and the dean may no longer need the information by the time the program is completed. Despite these difficulties, some organizations continue to use the file-based approach, since the cost of converting existing, or legacy, files to a different approach can be high.
A database is organized to minimize redundancy and to allow data to be shared among different application programs. There is not a collection of separate files; rather, all applications share a collection of data stored in the database. For a database to be used most effectively, software called a database management system is used. The database management system acts like a buffer between the users or application programs and the database.
The database management system stores the relationships among the data and provides the interface for end users or application programs to access the data. Thus, in our student example, programs used by the registrar’s office, the library, and the college of business would all use a common pool of student data.
As we have seen, an important advantage of the database approach to data management is reduced data redundancy. Ideally, each attribute is stored only one time, although for performance reasons this is often not done. However, the database management system software ensures the consistency of data stored more than one time. Thus data integrity is ensured. Table 5.1 explains other advantages of the database approach. The database management system also serves as a buffer between users or programs and the stored data. Therefore, if files change, programs do not have to be rewritten.
This is called data independence. The information systems personnel charged with maintaining the database need only change descriptions in the database management system software. The database approach also offers a more flexible approach to data access. New programs do not need to be written as data needs change. Most database management systems include a fourth generation language that information systems specialists or end users can use to produce new reports quickly.
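Data independence and flexible access can be sketched with Python’s built-in `sqlite3` module, using SQL in the role of the query language. SQLite is a stand-in chosen for convenience, not a system named in the chapter, and the table, column, and function names are assumptions.

```python
import sqlite3

# One shared pool of student data (in-memory database for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (ssn TEXT PRIMARY KEY, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("111-11-1111", "Ann Lee", 3.413), ("222-22-2222", "Raj Patel", 2.9)],
)


def registrar_report(connection):
    """Hypothetical existing program: it names only the columns it needs."""
    return connection.execute(
        "SELECT name, gpa FROM student ORDER BY name"
    ).fetchall()


before = registrar_report(conn)

# The stored structure changes: a column is added for the library's use.
conn.execute("ALTER TABLE student ADD COLUMN overdue_books INTEGER DEFAULT 0")

# The existing program runs unchanged and returns the same result.
after = registrar_report(conn)

# A new information need is met with a query rather than a new program.
conn.execute("UPDATE student SET overdue_books = 2 WHERE ssn = '222-22-2222'")
overdue = conn.execute(
    "SELECT name FROM student WHERE overdue_books > 0"
).fetchall()
print(before == after, overdue)
```

Because the report asks for data by name rather than by position in a file, adding a column does not break it.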
However, the database approach does have its drawbacks. There is a high cost involved with acquiring and implementing a database and database management system. There are additional personnel costs involved with ongoing database management. Furthermore, when all data is stored in the same place, it is more vulnerable to accidental or intentional error.
Because enterprises maintain so much data, it is important that it is organized in a way that allows it to be used effectively. So how a database is designed is critical.
Data Design Issues • Content: What data should be collected? • Access: What data should be given to what users? • Logical structure: How will the data be organized to make sense to a particular user? • Physical organization: Where will the data actually be located?
Because a database design must reflect the enterprise’s business processes, several questions must be carefully considered when designing a database. Clearly, the content of the database must be determined. However, this may not be an easy question. Sometimes it is very costly to collect and maintain particular data, so the importance of the data must be weighed against its cost. It is important to consider which users will be given access to which data, as well as the actions they will be allowed to perform on that data.
For example, although an employee may be authorized to view his personnel record, he would not be authorized to change his salary. The logical structure of the database is determined by identifying and grouping different data items and identifying relationships among the groups. This must be done in a way that makes sense to end users and makes it easy for them to get the data they need.
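The access question in the employee example can be sketched as a lookup table of who may do what. The roles, data names, and the `is_allowed` helper are hypothetical; only the rule that an employee may view but not change his own data comes from the example above.

```python
# Hypothetical permission table: (role, data item) -> set of allowed actions.
PERMISSIONS = {
    ("employee", "personnel_record"): {"view"},
    ("employee", "salary"): {"view"},
    ("hr_manager", "personnel_record"): {"view", "change"},  # assumed role
    ("hr_manager", "salary"): {"view", "change"},            # assumed role
}


def is_allowed(role, data_item, action):
    """Return True only if the action is explicitly granted to the role."""
    return action in PERMISSIONS.get((role, data_item), set())


print(is_allowed("employee", "personnel_record", "view"))  # granted
print(is_allowed("employee", "salary", "change"))          # not granted
```

Anything not listed in the table is denied, so forgetting an entry errs on the side of less access.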
Finally, the actual data must be physically stored somewhere and in some physical storage organization. For example, decisions must be made about whether all data will be located in a single physical location or distributed among several computers or storage devices.
Data Modeling • Logical design • Physical design • Planned data redundancy • Data model
Building a database requires two different types of design: a logical design and a physical design. The logical design shows how data are grouped together and how they are related to one another, in a way that makes sense to users. After the logical database design has been decided, the physical design is done. In the physical design, it may be necessary to change the logical design for performance and cost considerations. The physical design may involve combining or splitting some of the groups identified in the logical design, deciding to maintain summary data in the database, and intentionally storing a data item more than one time.
The latter is called planned data redundancy, and is done to improve system performance so that users can get information more quickly. A data model is a diagram used by a database designer to show the logical relationships among the entities in the database. When this is done at the level of the entire organization, the diagram is called an enterprise data model.
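Maintaining summary data and storing a value more than once can be sketched as follows. The order data and the `add_line` helper are invented for illustration; the point is only that the redundant total must be updated together with the detail it summarizes.

```python
# Detail data: the individual line amounts of each order (hypothetical values).
order_lines = {"A100": [25.00, 10.50, 4.50]}

# Planned redundancy: the total could be recomputed from the lines, but it is
# stored as well so that reports can read it without re-adding every line.
order_totals = {"A100": sum(order_lines["A100"])}


def add_line(order_id, amount):
    """Add a line and update the stored total in the same step."""
    order_lines.setdefault(order_id, []).append(amount)
    order_totals[order_id] = order_totals.get(order_id, 0.0) + amount


add_line("A100", 10.00)
print(order_totals["A100"])
```

The faster read comes at the price of extra work on every update, which is the tradeoff the physical design has to weigh.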
Entity-Relationship (ER) Diagrams • Fig 5.5
An entity-relationship diagram, often called an E-R diagram, is commonly used in data modeling. E-R diagrams use simple graphical symbols to represent entities and their relationships. Figure 5.5 shows two entities, customer and product. The relationship between customer and product is that one customer may order many products. This is called a one-to-many relationship. Attributes of customer include last name, first name, and identification number. Product attributes include color, name, and identification number.
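One way to sketch the customer and product relationship is as two tables linked by a foreign key, again using Python’s built-in `sqlite3` module as a stand-in. The table layout and sample rows are assumptions; Figure 5.5 itself only names the entities, their attributes, and the one-to-many relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# The "one" side of the relationship.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)"
)

# The "many" side: each product row points back to the customer who ordered it.
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER PRIMARY KEY, name TEXT, color TEXT,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Jones', 'Pat')")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [(10, "Lamp", "white", 1), (11, "Chair", "blue", 1)],
)

# One customer, many products.
ordered = conn.execute(
    "SELECT p.name FROM customer c"
    " JOIN product p ON p.customer_id = c.customer_id"
    " WHERE c.customer_id = 1 ORDER BY p.product_id"
).fetchall()
print(ordered)
```

With foreign keys switched on, a product row that points to a customer who does not exist is rejected.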
Data Models • Hierarchical models • Network models • Relational models
A data model defines how records are related, which affects how users can access the data. Database management systems are classified by the type of data model they support. The three main data models are the hierarchical, network, and relational models. Relational models have become the most popular.
The hierarchical model was the original data model. A hierarchical model, or tree, resembles an organization chart. It is best suited to data that can be properly described using one-to-many relationships. For example, in Figure 5.6, a project described in the database can have multiple employees from multiple departments associated with it. This organization would be very useful, for example, for printing listings of employees assigned to particular projects. However, it would be clumsy to use this database to view the record of a single employee.
If, for instance, I wanted to see John Smith’s phone number, the program or database management system would need to search through each department connected to each project to find John Smith. This would not be efficient. Hierarchical databases are very efficient when the data is organized to match the way it will be accessed. However, once a relationship is established between data elements in a hierarchical model, it is very difficult to change it.
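The contrast between the two kinds of access can be sketched with a nested structure of projects, departments, and employees. The tree contents, phone numbers, and function names are made up for the example and do not reproduce Figure 5.6.

```python
# Hypothetical hierarchy: project -> department -> employees.
projects = {
    "Project 1": {
        "Dept A": [
            {"name": "John Smith", "phone": "555-0101"},
            {"name": "Mary Chen", "phone": "555-0102"},
        ],
        "Dept B": [{"name": "Luis Ortiz", "phone": "555-0103"}],
    },
    "Project 2": {
        "Dept B": [{"name": "Dana Wood", "phone": "555-0104"}],
    },
}


def employees_on(project):
    """Follow the tree downward: the access path the structure was built for."""
    return [emp["name"] for staff in projects[project].values() for emp in staff]


def find_phone(name):
    """Search every project and department until the employee turns up."""
    visited = 0
    for departments in projects.values():
        for staff in departments.values():
            for emp in staff:
                visited += 1
                if emp["name"] == name:
                    return emp["phone"], visited
    return None, visited


print(employees_on("Project 1"))
print(find_phone("Dana Wood"))
```

Listing a project’s staff touches one branch, while finding a single employee may mean visiting every record in the tree.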