480 likes | 592 Views
IT – som værktøj. Bent Thomsen Institut for Datalogi Aalborg Universitet. Introduction to databases. Bent Thomsen. Who uses databases?. Everybody! Every time you use your credit card, you (indirectly) use a database When you book a flight you use a database
E N D
IT – som værktøj Bent Thomsen Institut for Datalogi Aalborg Universitet
Introduction to databases Bent Thomsen
Who uses databases? • Everybody! • Every time you use your credit card, you (indirectly) use a database • When you book a flight you use a database • When you take out insurance or a mortgage, you use a database • When you contact public services • When you buy something online
Classes of database users • Workers on the scene • End user: The person who makes queries, updates and extracts reports • Database administrator: The person responsible for managing the database system • Database designer • Application programmer: Design and implement transactions for end-users
Database users (cont.) • Workers behind the scene • Database designers and implementers • Tool developers • Operators and maintenance personnel • Run and maintain the hardware and software
Types of databases • Large database systems • Oracle, DB2, Informix, Ingres, MS SQLServer • Huge amounts of data – high throughput • Small(ish) database systems • Excell DataLists • MySQL • MS Access • Small companies, clubs, scientific data
Concepts of Database • A database is a collection of records that are organized for a particular purpose • A student record book with addresses and grades • An inventory with software, books, and hardware • A field is a basic fact (or data element). • Name, address,… • A record is a set of fields. • A student, a book • A table is a set of records. • A database consists of one or more tables. • A primary key is a field that identify a single record. • In the primary key field, every record has a unique number. (Student ID Number)
Parts of a database Record Tables Attribute/Field • Records become “rows” • Attributes/fields become “columns” • Rules determine the relationship between the tables and tie the data together to form a database
Creating a database • What information are we trying to store? • How do we describe the information? • Phone Book/Contact entries • Name • Address • Company • Phone Number • URL/Web Page • Age • Height (in meters) • Birthday • When we added the entry
Data Types • Binary • Database specific binary objects • Pictures, digital signatures, etc. • Boolean • True/False values • Character • Fixed width or variable size • Numeric • Integer, Real (floating decimal point), Money • Temporal • Time, Date, Timestamp
Phone Book/Contact Record Name Character Address Character Company Character Phone Number Character URL/Web Page Character Age Integer Height Real (float) Birthday Date When we added the entry Timestamp
Example (cont.) Could be a DataList table in Excell!
What is a relational database? • Originally developed by E.F. Codd in 1970 • Organizes data into tables where each item is a row and the attributes of the item are in columns. • Different from “flat file” databases because you can define “relationships” between items in different tables.
Normal forms • E. F. Codd in 1972 wrote a paper on “Further Normalization of the Data Base Relational Model” • Normal forms reduce the amount of redundancy and inconsistent dependency within databases. • Codd proposed three normal forms and through the years two more have been added. Summarized from Barry Wise’s article on Database Normalization http://www.phpbuilder.com/columns/barry20000731.php3?page=1
The Zero Form • No rules have been applied • Where most people start (and stop) • No room for growth • Usually wastes space
First Normal Form • Eliminate repeating columns in each table • Create a separate table for each set of related data • Identify each set of related data with a primary key Benefits: Now we can have infinite phone numbers or company addresses for each contact. Drawback: Now we have to type in everything over and over again. This leads to inconsistency, redundancy and wasting space. Thus, the second normal form…
Second Normal Form • Create separate tables for sets of values that apply to multiple records • Relate these tables with a “foreign key”.
Third Normal Form • Eliminate fields that do not depend on the primary key. Is this enough? Codd thought so… What about “many to many”?
Kinds of Relationships • “One to One” • One row of a table matches exactly to another • One person, one id number, one address • “One to Many” • One row of a table matches many of another • One person, many phone numbers • “Many to Many” • One row may match many of another or many rows match one row of another
Fourth Normal Form • In a “many to many” relationship, independent entities cannot be stored in the same table.
Fifth Normal Form • The “very esoteric” one that is probably not required to get the most out of your database. • “The original table must be reconstructed from the tables into which it has been broken down.” • The rule ensures that you have not created any extraneous columns and all the tables are only as large as they need to be.
The Normal Forms • First Form • Eliminate replicated data in tables • Create separate tables for each set of related data • Identify each set of related data with a primary key • Second Form • Create separate tables for sets of values that apply to multiple records • Relate the tables with a foreign key • Third Form • Eliminate fields that do not depend on the primary key • Fourth Form • In many-to-many relationships, independent entities cannot be stored in the same table
Why normalize? • Increases the integrity of the data • Reduces redundancy • Improves efficiency • Although normalization can be hard, it is worth it in the long run.
What do I need to remember? • Keep normalization in mind. • Don’t replicate data in a table. • If you break the rules, know why you are breaking the rules and do it for a good reason.
All you need to know about SQL in 30 minutes (or less) SQL = Structured Query Language
Basic SQL Commands • Creating tables with CREATE • Adding data with INSERT • Viewing data with SELECT • Removing data with DELETE • Modifying data with UPDATE • Destroying tables with DROP
Creating tables with CREATE • Generic form CREATE TABLE tablename ( column_name data_type attributes…, column_name data_type attributes…, … ) • Table and column names can’t have spaces or be “reserved words” like TABLE, CREATE, etc.
Phone Book/Contact Record Name Character Address Character Company Character Phone Number Character URL/Web Page Character Age Integer Height Real (float) Birthday Date When we added the entry Timestamp
Phone Book/Contact Table CREATE TABLE contacts ( Name VARCHAR(40), Address VARCHAR(60), Company VARCHAR(60), Phone VARCHAR(11), URL VARCHAR(80), Age INT, Height FLOAT, Birthday DATE, WhenEntered TIMESTAMP ); Plan your tables very carefully! Once created, they are difficult to change!
Phone Book/Contact Table CREATE TABLE contacts ( ContactID INT PRIMARY KEY, Name VARCHAR(40), Address VARCHAR(60), Company VARCHAR(60), Phone VARCHAR(11), URL VARCHAR(80), Age INT, Height FLOAT, Birthday DATE, WhenEntered TIMESTAMP ); If you are going to use the relational nature of a database,don’t forget you need to have a unique way to access records! There is a way to make the key automatically increment,so you don’t have to worry about which one is next.
Adding data with INSERT • Generic Form INSERT INTO tablename (column_name,…) VALUES (value,…)
Inserting a record into ‘contacts’ INSERT INTO contacts (contactid,name,address,company,phone,url,age,height,birthday,whenentered) VALUES (1,‘Joe’,’123 Any St.’,’ABC’, ’800-555-1212’,‘http://abc.com’,30,1.9,’6/14/1972’,now());
Inserting a partial record INSERT INTO contacts (contactid,name,phone) VALUES (2,’Jane’,’212-555-1212’);
Viewing data with SELECT • Generic Form SELECT column,… FROM table,… WHERE condition GROUP BY group_by_expressionHAVING condition ORDER BY order_expression • The most used command • Probably the most complicated also • If used improperly, can cause very long waits because complex computations
A few simple SELECTs • SELECT * FROM contacts; • Display all records in the ‘contacts’ table • SELECT contactid,name FROM contacts; • Display only the record number and names • SELECT DISTINCT url FROM contacts; • Display only one entry for every value of URL.
Refining selections with WHERE • The WHERE “subclause” allows you to select records based on a condition. • SELECT * FROM contacts WHERE age<10; • Display records from contacts where age<10 • SELECT * FROM contacts WHERE age BETWEEN 18 AND 35; • Display records where age is 18-35
Additional selections • The “LIKE” condition • Allows you to look at strings that are alike • SELECT * FROM contacts WHERE name LIKE ‘J%’; • Display records where the name starts with ‘J’ • SELECT * FROM contacts WHERE url LIKE ‘%.com’; • Display records where url ends in “.com”
Removing data with DELETE • Generic Form DELETE FROM table WHERE condition; DELETE FROM contacts WHERE age<13;
Modifying data with UPDATE • Generic Form UPDATE table SET column=expression WHERE condition; UPDATE contacts SET company=‘AOL’ WHERE company=‘Time Warner’;
Destroying tables with DROP • Generic Form DROP TABLE tablename; DROP TABLE contacts;
“Normal Forms” and SELECT • Good database design using the normal forms requires data to be separated into different tables • SELECT allows us to join the data back together • We can use “views” to create virtual tables
Joining together tables • SELECT name,phone,zip FROM people, phonenumbers, address WHERE people.addressid=address.addressid AND people.id=phonenumbers.id;
General form of SELECT/JOIN SELECT columns,… FROM left_table join_type JOIN right_table ON condition; SELECT name,phone FROM people JOIN phonenumbers ON people.id=phonenumbers.id;
ORDER BY • The “ORDER BY” clause allows you to sort the results returned by SELECT. SELECT * FROM contacts ORDER BY company; SELECT * FROM contacts ORDER BY company, name;
Views • You can use “CREATE VIEW” to create a virtual table from a SELECT statement. CREATE VIEW contactview AS (SELECT name,phone,zip FROM people,phonenumbers,address WHERE people.id=phonenumbers.id AND people.addressid=address.addressid);
Finishing off • SQL is a large language • You have seen some basic commands • Takes time and effort to learn • Mainly needed for powerusers • Lots of stuff we haven’t talked about • E.g. security, multiple users, program accessing DB, … • Many database systems now use graphical user interfaces instead of SQL directly