ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT File Organization Terms and Concepts • Bit: Smallest unit of data; binary digit (0,1) • Byte: Group of bits that represents a single character • Field: Group of words or a complete number
ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT File Organization Terms and Concepts • Record: Group of related fields • File: Group of records of same type • Database: Group of related files
Figure 7-1 ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT Data Hierarchy in a Computer System
ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT File Organization Terms and Concepts • Entity: Person, place, thing, event about which information is maintained • Attribute: Description of a particular entity • Key field: Identifier field used to retrieve, update, sort a record
Figure 7-2 ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT Entities and Attributes
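The terms above (bit, byte, field, record, file, entity, attribute, key field) can be sketched in a few lines of Python. This is only an illustration: the `Student` entity, its attributes, and the sample values are hypothetical and do not come from the slides or figures.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """One record describing one entity (a hypothetical student)."""
    student_id: int  # key field: identifies the record
    name: str        # attribute (a field)
    course: str      # attribute (a field)


# A file: a group of records of the same type.
student_file = [
    Student(101, "Ann Lee", "IS 101"),
    Student(102, "Raj Patel", "IS 101"),
]

# The key field is used to retrieve a record.
by_key = {record.student_id: record for record in student_file}

# A byte represents a single character; each byte is a group of 8 bits.
name_bytes = "Ann".encode("ascii")
bits = [format(b, "08b") for b in name_bytes]
```

Looking up `by_key[102]` returns the whole record for that student, which is the role the key field plays when retrieving, updating, or sorting records.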
Figure 7-3 ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT Traditional File Processing
ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT Problems with the Traditional File Environment • Data redundancy • Program-Data dependence • Lack of flexibility • Poor security • Lack of data-sharing and availability
DATA REDUNDANCY • The presence of duplicate data in multiple data files • Different functions collect the same information independently • May have different meanings in different parts of the organisation
Data Redundancy • The Staff_Branch relation has redundant data; the details of a branch are repeated for every member of staff. • In contrast, the branch information appears only once for each branch in the Branch relation, and only the branch number (Branch_No) is repeated in the Staff relation, to represent where each member of staff is located.
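The difference between the two designs can be sketched with Python's built-in `sqlite3` module. Only the relation names and `Branch_No` come from the slide; the other column names (`Staff_No`, `Name`, `Branch_Address`) and all row values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Redundant design: branch details repeated on every staff row.
CREATE TABLE Staff_Branch (
    Staff_No TEXT PRIMARY KEY,
    Name TEXT,
    Branch_No TEXT,
    Branch_Address TEXT
);
INSERT INTO Staff_Branch VALUES
    ('S1', 'Ann', 'B1', '22 Deer Rd'),
    ('S2', 'Raj', 'B1', '22 Deer Rd'),
    ('S3', 'Mia', 'B2', '16 Argyll St');

-- Split design: branch details stored once, staff rows keep only Branch_No.
CREATE TABLE Branch (
    Branch_No TEXT PRIMARY KEY,
    Branch_Address TEXT
);
CREATE TABLE Staff (
    Staff_No TEXT PRIMARY KEY,
    Name TEXT,
    Branch_No TEXT REFERENCES Branch(Branch_No)
);
INSERT INTO Branch SELECT DISTINCT Branch_No, Branch_Address FROM Staff_Branch;
INSERT INTO Staff SELECT Staff_No, Name, Branch_No FROM Staff_Branch;
""")

# How many times is the address of branch B1 stored in each design?
copies_flat = conn.execute(
    "SELECT COUNT(*) FROM Staff_Branch WHERE Branch_Address = '22 Deer Rd'"
).fetchone()[0]
copies_split = conn.execute(
    "SELECT COUNT(*) FROM Branch WHERE Branch_Address = '22 Deer Rd'"
).fetchone()[0]
```

In the redundant design a change of branch address has to be applied to every matching staff row; in the split design it is a single-row update in `Branch`.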
Program-Data Dependence • The tight relationship between data stored in files and the specific programs required to update and maintain those files • Every program must describe the location and nature of the data it works with • In a traditional file environment, any change to the data requires a change in all programs that access the data • Example: a change in tax rates
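A minimal sketch of program-data dependence, assuming a hypothetical fixed-width employee file (5-character ID, 10-character name, 7-character salary). The layout, the field widths, and the `read_salary` function are all invented for illustration.

```python
def read_salary(line: str) -> int:
    # The file layout is hard-coded in the program:
    # salary is assumed to occupy character positions 15 to 21.
    return int(line[15:22])


# Original layout: ID (5 chars) + name (10 chars) + salary (7 chars).
old_layout = "00101" + "Ann Lee".ljust(10) + "0045000"

# The name field is later widened to 15 characters; the program is not updated.
new_layout = "00101" + "Ann Lee".ljust(15) + "0045000"

old_result = read_salary(old_layout)  # reads the salary correctly
new_result = read_salary(new_layout)  # silently reads the wrong positions
```

Every program that reads this file carries its own copy of the layout, so one change to the data forces a change in each of them.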
Lack of Flexibility • A traditional file system can deliver routine scheduled reports only after significant programming effort • An ad hoc (unanticipated) request for information would require a lot of time • The information is somewhere in the system but too expensive to locate and retrieve • Compiling the data could take weeks
Poor Security • There is little or no control and management of data • Data could be disseminated all over the organisation without control • Who is accessing the data and making changes?
Lack of Data-sharing • Lack of control over access • Hard to get hands on information • Different pieces of information are held in different files and different physical locations • Since files in different locations can't be related to one another, information is hard to share or access in a timely manner • Impossible for information to flow freely
Database Technology • DATABASE: • A collection of data organised to serve many applications efficiently by centralising the data and minimising redundant data.
Historical context • Why develop DBMS at all? • Manage flood of data from Transaction Processing Systems • Integrate data across organisation • “Data glare”
DBMS A Database Management System (DBMS) is a general-purpose software and hardware facility to: • Create, delete, reorganize, and manipulate data in a database • Store, retrieve, share, and maintain data in a database • Maintain relationships between the database components
THE DATABASE APPROACH TO DATA MANAGEMENT Database Management System (DBMS) • Creates and maintains databases • Eliminates requirement for data definition statements • Acts as interface between application programs and physical data files
DBMS Cont’d • Provides security and procedures relating to privilege and access. • Verifies the integrity of all the updates and transactions that are carried out. • Provides an interface for the access, deletion and addition of data and for redefining the relationships within the database. A DBMS is a collection of programs that manages the database structure and controls access to the data stored in the database.
DBMS • Relieves the programmer or end user from the task of understanding where and how data are actually stored • Separates the logical view from the physical view • Logical View: How data is perceived by end users or business specialists • Physical View: How data is actually organised and structured on physical storage media
Figure 7-4 THE DATABASE APPROACH TO DATA MANAGEMENT The Contemporary Database Environment
THE DATABASE APPROACH TO DATA MANAGEMENT Types of Databases • Relational DBMS • Hierarchical and Network DBMS • Object-Oriented Databases
THE DATABASE APPROACH TO DATA MANAGEMENT Relational DBMS • The most popular type of DBMS today for PCs as well as for larger companies and mainframes • Represents all data in DB as two-dimensional tables called relations • Similar to flat files but information in more than one file can easily be extracted and combined • Relates data across tables based on common data element • Examples: DB2, Oracle, MS SQL Server
Figure 7-6 THE DATABASE APPROACH TO DATA MANAGEMENT Relational Data Model
THE DATABASE APPROACH TO DATA MANAGEMENT Three Basic Operations in a Relational Database • Select: Creates subset of rows that meet specific criteria • Join: Combines relational tables to provide users with information • Project: Enables users to create new tables containing only relevant information
Figure 7-7 THE DATABASE APPROACH TO DATA MANAGEMENT Three Basic Operations in a Relational Database
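The three operations can be sketched in SQL, run here through Python's built-in `sqlite3` module. The `part` and `supplier` tables, their columns, and the sample rows are hypothetical and are not taken from Figure 7-7.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_no INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE part (part_no INTEGER PRIMARY KEY, part_name TEXT, supplier_no INTEGER);
INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Bolt Co');
INSERT INTO part VALUES
    (137, 'Door latch', 1),
    (150, 'Door seal', 2),
    (152, 'Compressor', 1);
""")

# Select: a subset of rows that meet specific criteria (the WHERE clause).
selected = conn.execute(
    "SELECT * FROM part WHERE part_no IN (137, 150) ORDER BY part_no"
).fetchall()

# Join: combine two tables on their common data element (supplier_no).
joined = conn.execute(
    "SELECT p.part_no, p.part_name, s.supplier_no, s.supplier_name "
    "FROM part AS p JOIN supplier AS s ON s.supplier_no = p.supplier_no "
    "ORDER BY p.part_no"
).fetchall()

# Project: keep only the relevant columns of the joined, selected rows.
projected = conn.execute(
    "SELECT p.part_no, s.supplier_name "
    "FROM part AS p JOIN supplier AS s ON s.supplier_no = p.supplier_no "
    "WHERE p.part_no IN (137, 150) ORDER BY p.part_no"
).fetchall()
```

In SQL the three operations usually appear together in a single statement: the column list projects, the `WHERE` clause selects, and the `JOIN` clause joins.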
THE DATABASE APPROACH TO DATA MANAGEMENT Hierarchical and Network DBMS • Hierarchical DBMS • Organizes data in a tree-like structure • Supports one-to-many parent-child relationships • Prevalent in large legacy systems
Figure 7-8 THE DATABASE APPROACH TO DATA MANAGEMENT Hierarchical DBMS
Hierarchical • Disadvantages • Knowledge of physical level required • Does not support logical data independence and does not support all physical data independence operations • Not all problems are one-to-many types • Problems with multiple parent implementation • Problems with anomalies for parent deletion • Application development in 3GL time-consuming • Support programs are not part of the DBMS • “System created by programmers for programmers!”
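A minimal sketch of the tree-like, one-to-many structure and of the parent-deletion problem listed above. The `Node` class and the segment names (`Employee`, `Benefits`, and so on) are hypothetical; a real hierarchical DBMS stores and navigates segments quite differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One segment in the hierarchy; each child has exactly one parent."""
    name: str
    children: List["Node"] = field(default_factory=list)


def count(node: Node) -> int:
    """Count this node and all of its descendants."""
    return 1 + sum(count(child) for child in node.children)


root = Node("Employee", [
    Node("Compensation", [Node("Salary history")]),
    Node("Benefits", [Node("Pension"), Node("Health")]),
])

before = count(root)

# Deleting a parent also removes everything stored beneath it.
root.children = [c for c in root.children if c.name != "Benefits"]

after = count(root)
```

Removing the `Benefits` parent takes `Pension` and `Health` with it, even though nothing was wrong with those child records.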
THE DATABASE APPROACH TO DATA MANAGEMENT • Network DBMS • Depicts data logically as many-to-many relationships
THE DATABASE APPROACH TO DATA MANAGEMENT Network DBMS
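A many-to-many relationship can be sketched in Python as two sets of links, one in each direction. The student and course names are hypothetical, and this illustrates only the logical relationship, not how a network DBMS stores its pointers.

```python
from collections import defaultdict

# Each pair links one student to one course (hypothetical data).
enrolments = [
    ("Ann", "IS 101"),
    ("Ann", "IS 205"),
    ("Raj", "IS 101"),
]

courses_of = defaultdict(set)   # student -> courses
students_in = defaultdict(set)  # course -> students

for student, course in enrolments:
    courses_of[student].add(course)
    students_in[course].add(student)
```

A student can have many courses and a course can have many students, which a strict one-parent tree cannot represent directly.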
THE DATABASE APPROACH TO DATA MANAGEMENT Disadvantages • Outdated • Less flexible compared to RDBMS • Lacks support for ad hoc and English language-like queries
Object-Oriented Databases • Object-oriented DBMS: Stores data and procedures as objects that can be retrieved and shared automatically • Object-relational DBMS: Provides capabilities of both object-oriented and relational DBMS
DBMS Disadvantages • DBMSs are complex; • Need for explicit backup and control; • Costs associated with development and operation can be substantial; • Consolidation of an entire business’ information resources can create a high level of vulnerability.
Designing Databases • Conceptual design: Abstract model of database from a business perspective • Physical design: How data is actually stored on direct access storage devices
CREATING A DATABASE ENVIRONMENT Designing Databases • Entity-relationship diagram: Methodology for documenting databases illustrating relationships between database entities • Normalization: Process of creating small stable data structures from complex groups of data
Figure 7-10 An Entity-Relationship Diagram
Figure 7-11 An Unnormalized Relation of ORDER
Figure 7-12 A Normalized Relation of ORDER
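Normalization can be sketched as splitting one record with repeating groups into several small structures. The field names and values below are hypothetical and do not reproduce the ORDER relation in the figures.

```python
# Unnormalized: each order carries a repeating group of parts,
# and part details are repeated wherever the part is ordered.
unnormalized = [
    {"order_no": 1, "order_date": "2024-01-05", "parts": [
        {"part_no": 137, "part_name": "Door latch", "unit_price": 22.0, "qty": 2},
        {"part_no": 150, "part_name": "Door seal", "unit_price": 6.0, "qty": 1},
    ]},
    {"order_no": 2, "order_date": "2024-01-06", "parts": [
        {"part_no": 137, "part_name": "Door latch", "unit_price": 22.0, "qty": 5},
    ]},
]

orders = {}       # order_no -> order details
parts = {}        # part_no -> part details, stored once per part
order_lines = []  # (order_no, part_no, qty): links orders to parts

for order in unnormalized:
    orders[order["order_no"]] = {"order_date": order["order_date"]}
    for part in order["parts"]:
        parts[part["part_no"]] = {
            "part_name": part["part_name"],
            "unit_price": part["unit_price"],
        }
        order_lines.append((order["order_no"], part["part_no"], part["qty"]))
```

After the split, the details of part 137 are held once in `parts`, and each order line refers to it by `part_no` only.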
Distributing Databases Centralized database • Used by single central processor or multiple processors in client/server network
Distributing Databases Distributed database • Stored in more than one physical location • Partitioned database • Duplicated database
Figure 7-13 Distributed Databases
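The two ways of distributing a database can be sketched with plain Python collections. The customer records, the `region` field, and the site names are hypothetical; a real distributed DBMS also has to keep the copies synchronised, which is not shown here.

```python
customers = [
    {"id": 1, "region": "east"},
    {"id": 2, "region": "west"},
    {"id": 3, "region": "east"},
]
sites = ["east", "west"]

# Partitioned database: each site holds only the rows it needs locally.
partitioned = {
    site: [c for c in customers if c["region"] == site]
    for site in sites
}

# Duplicated database: each site holds a full copy of the data.
duplicated = {site: list(customers) for site in sites}
```

Partitioning keeps each site's data small and local, while duplication lets any site answer any query at the cost of storing and updating every copy.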
DATABASE TRENDS Data Warehousing and Datamining Data warehouse • Supports reporting and query tools • Stores current and historical data • Consolidates data for management analysis and decision making
Figure 7-16 DATABASE TRENDS Components of a Data Warehouse
DATABASE TRENDS Data Warehousing and Datamining Datamining • Tools for analyzing large pools of data • Find hidden patterns and infer rules to predict trends
DATABASE TRENDS Benefits of Data Warehouses • Improved and easy accessibility to information • Ability to model and remodel the data
DATABASE TRENDS Databases and the Web Database server • Computer in a client/server environment runs a DBMS to process SQL statements and perform database management tasks Application server • Software handling all application operations
Figure 7-18 DATABASE TRENDS Linking Internal Databases to the Web