220 likes | 238 Views
FILES AND DATABASES. A FILE is a collection of records with similar characteristics, e.g: A Sales Ledger Stock Records A Price List Customer Records Files may be manual, or computer. A RECORD is a collection of descriptive fields, e.g. A Customer record = Customer Name
E N D
A FILE is a collection of records with similar characteristics, e.g: A Sales Ledger Stock Records A Price List Customer Records Files may be manual, or computer.
A RECORD is a collection of descriptive fields, e.g. A Customer record = Customer Name Customer Number Customer Address Customer Credit Limit
A FIELD is a collection of characters, e.g. Customer Name = Gordon Brittas A KEY field is a field by which a record can be uniquely located, e.g. Customer Number
A CHARACTER (or BYTE) may be alphabetic or numeric and is made up of two nibbles A nibble is made up of four bits (binary digits) A bit is either 0, or 1
A TRANSACTION file, not surprisingly, holds transactions e.g. An addition - a new customer A deletion - removing a customer An amendment - increasing a customer's credit limit The contents of these files are constantly changing These files are mainly used to update master files
A MASTER file contains important data that is relatively static e.g. The Customer File A REFERENCE file contains reference information that also seldom changes e.g. Price List Department Codes A SPOOL file is an intermediate store for data before it is written out to a printer or a disc. A WORKFILE or TEMPORARY file is used as working space during operations such as sorting.
WHY DATABASES? • More applications = redundant data • e.g. Master File for Sales • Master File for Accounts • Master File for Support • all require Customer details • 2. Add/Change applications = change files • = need to change other applications accessing those files • (fields may have moved) • 3. More Management Information required • = data needed from more than one file
A DATABASE is a collection of data supporting the operations of an organisation. It is a means of creating and maintaining data on a computer such that it is useable for many purposes. In order to be efficient, it must have the following characteristics: 1. Non-Redundancy 2. Data Independence 3. Usable by all Programs 4. Support all necessary linkages 5. Adopt a common approach across data Details to follow …..
Non-Redundancy • E.g. Customer name and address should appear once only in the database • Possible problems if it appears more than once are: • a. The values of that data item may differ • b. The waste of storage space • c. The complexities of updating several occurrences • However, for reasons of security or speed of access, • some redundancy may be acceptable
2. Data Independence If you need to change the program, you don't have to change the data AND If you need to change the data, you don't have to change the program 3. Program Usage The database must be usable by all existing programs AND where possible, all possible future programs
4. Support all necessary linkages One application may require employee + pension contribution (one file) Another may require employee + tax payment (different file)
5. Common approach to data e.g. common length for names common abbreviations Data Dictionary
Logical vs. Physical Data LOGICAL data is the way datais seen by the program PHYSICAL data is the way the datais stored on computer disc The physical storage of the data has implications on the speed of access and on the number of linkages possible The piece of software that interfaces between the logical requirements of the program and the physical layout of the data is called the DBMS (Database Management System)
Types of Databases Hierarchical Network Relational
Hierarchical Top-down or tree arrangement Rapid access of data PROVIDING it involves descending the hierarchy If higher level details are not known, it involves searching every lower level record e.g. IMS (Information Management System) from IBM
Network Similar to tree structure but with additional links, using pointers Gives faster access, but occupies more space and requires a more complex DBMS e.g. IDMS and TOTAL
Relational Consists of a number of tables or two-dimensional arrays with links between entries e.g. dBaseIV, Oracle, Ingres, Access The flexibility in use requires a large number of 'virtual' tables and hence a large amount of storage space
Advantages 1. Avoids duplication of data, i.e. waste 2. Easier to keep it up-to-date and consistent 3. Permits multiple usage of data 4. Centralisation of data causes it to be regarded, and treated, as a major resource 5. Independent of programs 6. New applications easier, programmer not responsible for file organisations
Disadvantages 1. All data held together, therefore possible problems regarding access and security 2. Only one set of data, therefore accuracy and integrity essential 3. Must be available continuously 4. Communications considerations for distributed systems
Choosing a database package - Questions to ask before: 1. How much data will be stored 2. How will the data be accessed - processing requirements 3. How complex need the database be 4. Will users be computer literate 5. Will people who set up the database be computer literate 6. Does it have to be multi-user 7. Are future modifications allowed 8. Security adequate 9. Speed adequate 10. Report writing facilities adequate 11. Supplier support adequate