370 likes | 513 Views
Databases. Flat Files & Relational Databases. Learning Objectives. Describe flat files and databases. Explain the advantages that using a relational database gives over flat files. Flat Files. A Flat File. All the data is held in a single file which is two dimensional.
E N D
Databases Flat Files & Relational Databases
Learning Objectives • Describe flat files and databases. • Explain the advantages that using a relational database gives over flat files.
A Flat File • All the data is held in a single file which is two dimensional. • Rows for records and columns for fields.
A Set / Series of Flat Files • Flat files which have at least one common field which is duplicated but not linked.
E.g. A Set / Series of Flat Files File containing Stock Code, Description, Re-order level, Cost Price, Sale Price, Supplier name and address, etc Programs to place orders when stocks are low Purchasing Department File containing Stock Code, Description, Number sold, Sale Price, Customer name and address, etc. Programs to record orders from customers Sales Department Programs to record accounts of customers File containing Customer name and address, amount owing, dates of orders, etc. Accounts Department
Separation and isolation of data • Which customers have bought parts produced by a particular supplier? • Find the parts supplied by a particular supplier from one file and then use a second file to find which customers have bought those parts. • Problem compounded if data is needed from more than two files.
Duplication of data • Supplier details duplicated if a supplier supplies more than one part. • Customer details held in two different files. • Data entered more than once, therefore time, space and money are wasted. • Loss of data integrity: • What happens if a customer changes his address? • The Sales Department may update their files but the Accounts Department may not do this at the same time. • If the Order Department order some parts and there is an increase in price. • The Order Department increases the Cost and Sale prices but the Accounts Department do not, there is now a discrepancy.
Data dependence • If there is a need to change data formats, whole programs have to be changed. • Different applications may hold the data in different forms, again causing a problem. • If an extra field is needed in a file, all applications using that file have to be changed, even if they do not use that new item of data.
Incompatibility of files • If one department writes its applications in COBOL and another in C. • Departments cannot read each other’s files.
Fixed queries and the proliferation of application programs • Each time a new query is needed, a new program has to be written. • Difficult if the data needed to answer the query is in more than one file. • Compounded if some files are incompatible.
Relational Databases Note: The term database on its own means a relational database.
Database Management System (DBMS) • Relational Databases use DBMS to provide a structure at three different levels: • Internal Level • Conceptual Level • External Level
Conceptual Level A manager uses a data description/definition language (DDL) to: • Create two or more logical/virtual tables or views and define one or more common linked fields called relationships. • Set the data types and write descriptions of each field. • Set validation. http://databases.about.com/od/sql/a/sqlfundamentals_2.htm
External Level • Users use a Data Manipulation Language (DML) to store, access, change, delete and search for data.
Data Manipulation Language (DML) • INSERT • To add a new employee to a “personal_info” table: • INSERT INTO personal_info values('bart','simpson',12345,$45000) • SELECT • Select everything from a “personal_info” table”: • SELECT * FROM personal_info • Extract a list of the last names of all employees in the company: • SELECT last_name FROM personal_info • Who has a salary value greater than $50,000? • SELECT * FROM personal_info WHERE salary > $50000 • UPDATE • Gives all employees a 3% cost-of-living increase in their salary: • UPDATE personal_info SET salary = salary * 1.03 • An employee (ID = 12345) is due a $5,000 raise. • UPDATE personal_info SET salary = salary + $5000 WHERE employee_id = 12345 • DELETE • An employee (ID = 12345) has been laid off. • DELETE FROM personal_info WHERE employee_id = 12345 • http://databases.about.com/od/sql/a/sqlfundamentals_3.htm • http://en.wikipedia.org/wiki/Data_manipulation_language
Internal Level • The relational database physically stores on some form of storage medium all data in one file structure as a physical table witha data dictionary. • Physical table: • Contains all fields for all types of users (e.g. departments). • Data dictionary: • Holds information about the database necessary to enable it to work in different views for different types of users e.g. departments. • Structures of logical / virtual tables or views, Relationships, Fields (names, data types, validation, descriptions etc…).
Advantages of Relational Databases • Through the use of DBMS: • Different logical/virtual tables or user views of data can be created. • Views: • A virtual or logical table composed of the result set of a query. • Unlike ordinary tables a view is not part of the physical schema: it is a dynamic, virtual table computed or collated from data in the database. • Easier to access the data because data is now accessible through queries using a Data Manipulation Language (DML).
Advantages of Relational Databases • All data can be accessed through this one relational database but different tables can be suited to different requirements (shows only relevant information). • Less duplication of data held because data is not duplicated across different files, consequently there is less danger of data integrity being compromised. • Less chance of one copy of data altered when another is not.
Advantages of Relational Databases • Data manipulation / input can be achieved more quickly as there is only one copy of each piece of data. • Flat files need to be compatible, this problem does not arise with a database.
Vocabulary for relational databases • Fields / columns = attributes • Records / rows = tuples • Tables = relations
Table description / shorthand notation / design: • Example table: • Employee (EMP) table for a company. • Note that an NINumber is a unique number given by the UK government to every person who works in the UK, for tax purposes. • Table: EMP (EmpID, NINumber, Name, Address) • The words in brackets separated by a , = Fields / Columns / Attributes
Primary Keys • Primary Key: • Unique attribute (or set of attributes) used to identify the record or tuple. • E.g. EMP (EmpID, NINumber, Name, Address) • EmpID & NINumber could be primary keys as they are unique for each employee. However, it would be standard to choose EmpID to be the primary key. • Note that it is conventional to underline the primary key and I will do so on the next few slides.
Foreign Key • An non-primary key attribute/field in one table which is also the primary key in another table and so links the two tables together. • E.g. • CINEMA (CID, Cname, Loc, MID) • CID = Cinema ID, Cname = Cinema name, Loc = Location, MID = Manager ID & • MANAGER (MID, Mname) • Mname = Manager name • MID occurs in CINEMA and is the primary key in MANAGER. When discussing CINEMA we say MID is the foreign key.
Secondary Keys • A different attribute (other than the primary key) that allows the data (the same record as the primary key or other groups of related records – so not necessarily unique) to be accessed in a different way. • E.g. EMP (EmpID, NINumber, FName, LName, Address, Dept) • NINumber could be a secondary key to allow access to the Employee’s record in a different way e.g. if EmpID is not known. • Dept could also be a secondary key as it allows access to all employees in a particular department. • The setup of one or more secondary keys allows these fields to be “indexed” so that searching by these fields is faster than ones that are not set to be possible secondary keys. • For example, it is very unlikely that we would search the EMP table above by FName only, so we would not set FName to be a secondary key (even though it would still be possible to search by FName, it just would be slower than searching by a secondary key such as NINumber or Dept). • However, Secondary keys can be a combination of fields e.g. FName+LName to allow a record to be accessed if neither the EmpID or the NINumber is known (but note that it is possible that two or more records share the same FName+LName).
Plenary • What is meant by a flat file?
Plenary • All the data is held in a single file which is two dimensional. • Rows for records and columns for fields.
Plenary • Describe the advantages of using a relational database over flat files.
Plenary • All data can be accessed through this one relational database but different tables can be suited to different requirements (shows only relevant information). • Less duplication of data held because data is not duplicated across different files, consequently there is less danger of data integrity being compromised. • Less chance of one copy of data altered when another is not.
Plenary • Data manipulation / input can be achieved more quickly as there is only one copy of each piece of data. • Flat files need to be compatible, this problem does not arise with a database. • Through the use of DBMS: • Different user views of data can be created • Easier to access the data because data is now accessible through relations / queries.
Plenary • What is meant by terms primary key, secondary key and foreign key?
Primary Keys • Primary Key: • Unique attribute (or set of attributes) used to identify the record or tuple. • E.g. EMP(EmpID, NINumber, Name, Address) • EmpID & NINumber could be primary keys as they are unique for each employee.
Secondary Keys • Secondary • A different attribute that allows the data to be accessed in a different order. • E.g. EMP(EmpID, NINumber, Name, Address) • If EmpID is the primary key then NINumber is the secondary key.
Foreign Keys • Foreign • The primary key of another file / table / relation that is used to link files / tables / relations together. • E.g. • CINEMA(CID, Cname, Loc, MID)& • MANAGER(MID, MName) • MID occurs in CINEMA and is the primary key in MANAGER. When discussing CINEMA we say MID is the foreign key.