130 likes | 261 Views
Relational Databases. Joins continued. In week 1 we looked at the concept of a key, primary key, which contains 1 or more columns/attributes, uniquely identifies each row/ tuple within a table. No 2 rows can have the same PK value
E N D
Relational Databases Joins continued
In week 1 we looked at the concept of a key, primary key, which contains 1 or more columns/attributes, uniquely identifies each row/tuple within a table. • No 2 rows can have the same PK value • Even if 2 students have the same name and date of birth, the unique studentid ensures that the database user can track the correct student. • A foreign key (contains one or more columns) in a table relates specifically to the PK of another table. It is ‘imported’ from another table or possibly the same table to provide a link. • In relational databases, this is known as referential integrity (a student cannot be given a mark on a subject that does not exist in the database) • FK enables all the related data to be accessed without having to hold or duplicate the same data held in another table. • A table does not have to have a FK and there is no limit on the number of FKs it does have but each FK must correspond to a PK of another table. • The purpose of the keys are so that data can be related across tables without having to repeat it, this is the power of relational databases. Re-cap
Select name, classtime, day From class, staffmember Where teacher = staffid; The link between class and staffmember is on the teacher = staffid in the WHERE clause, the teacher is the FK referring to staffid, the PK of staffmember table. Select distinct name, classtime, day From class, staffmember Where teacher = staffid; The key word distinct will remove any duplicates from the list. Re-cap continued (join in where clause)
select sname, period, count(*) from subject natural join marks where mark >= 40 group by sname, period order by sname; The above code will list the name of a subject when it ran and the number of people who got more than 40% for that run. The data is pulled from 2 different tables (marks and subject) and there is a common column (subjectid) which will enable a natural join to take place Recap continued (natural join)
Not always possible to have a natural join • Staffmember and class do not have a natural join but have to be joined at the where clause level currently. • Joins in the FROM clause are more effective to run but we cannot assume that there will always be a natural join possible • Move clause from WHERE clause to FROM clause via the explicit INNER join Inner join continued
The inner join places a combination of table A and table B into a temp table, then removes the unmatched rowsusing the join predicate. (does a cartisian product join then filters the results) • Actual SQL implementations normally use other approaches where possible, since computing the Cartesian product is very inefficient. • SQL specifies two different ways to express joins • "explicit join notation", uses the keyword JOIN: • INNER JOIN • NATURAL JOIN • "implicit join notation". The implicit join notation lists the tables for joining in the FROM clause of a SELECT statement, using commas to separate them, and the condition for the join in the WHERE clause. Inner join (explicit)
select sname, period, count(*) from subject INNER JOIN marks on (subject.subjectid = marks.subjectid) where mark >= 40 group by sname, period order by sname; Is the same as: select sname, period, count(*) from subject NATURAL JOIN marks where mark >= 40 group by sname, period order by sname; Which is the same as: select sname, period, count(*) From subject, marks Where mark > 40 and subject.subjectid = marks.subjectid group by sname, period order by sname; Explicit join example All 3 bits of SQL code perform the same query and will produce the same output, the nature of the join however is different in each statement
The join we have looked at so far is an inner join, this means that only matching records will be displayed, records on either side that have a null value will NOT be returned. The join SQL code we have used has not stipulated the type of join we are using but some syntax required you to explicitly state the type of join (inner) you are using select sname, period, count(*) from subject inner join marks using (subjectid) where mark >= 40 group by sname, period order by sname; • This code indicate the nature of the join and the shared attribute to join on Explicit inner join
There are 3 members of staff who are in the staffmember table but do not have any classes associated with them. If staffmember and class are joined with an inner join the 3 staff will not be included in the result because it is required that there is at least one matching row on each side of the join. select name, day, classtime from staffmember inner join class on staffid = teacher; Inner joins Following rows not included as the staff have no classes NAME STAFFID -------------------------------------------- Bertie Wooster 851 Glen Maxwell 811 EsmeLettitia 831
Efficiently joining tables in the SQL statement is something that challenges database-systems • Optimising effective execution • Inner joins operate commutatively and associative which in practice means that the user merely supplies a list of tables for joining and the join conditions and the DBMS has to determine the most effective way to perform the operation. • A query optimiser determines how to execute a query containing joins. A query optimizer has two basic freedoms: • Join order: The order in which the system joins tables does not change the final result-set of the query. However, join-order does have an enormous impact on the ‘cost’ of the join operation, cost being the time it takes. • Join method: Given two tables and a join condition, multiple algorithms can be used, which algorithm runs most efficiently depends on the sizes of the input tables, the number of rows from each table that match the join condition, and the operations required by the rest of the query. These issues will be looked at again later in the module when we consider the database administration side of databases Join issues
The following code retrieves the details for Tim Rose, re-write the code joining in the FROM clause rather than the WHERE clause select name, classtime, room from class, staffmember where name = 'Tim Rose' and teacher = staffid; Example to work through select name, day, classtime from staffmember inner join class on staffid = teacher Where name = ‘Tim Rose’
It is possible to join multiple tables using the join syntax, the tables need to be ‘chained’ to each other by the join clauses select classtime, room, sname from staffmember join class on staffid = teacher join subject on class.subjectid = subject.subjectid where name = 'Tim Rose'; CLASSTIME ROOM SNAME ---------- --------------- ------------------------------ 11am Elec Eng G25 Digital Systems Structures 11am Elec Eng G25 Concurrent Computing 11am Elec Eng G25 Digital Systems Structures 11am Elec Eng G25 Artificial Intelligence Join 3+ tables Staffmember joins to class which in turns joins to subject.
select stuname, sname, mark from student, marks, subject Where student.studentid = marks.studentid and marks.subjectid = subject.subjectid and mark < 40; Re-write the SQL query above using explicit join Worked example select stuname, sname, mark from student join marks on student.studentid = marks.studentid join subject on subject.subjectid = marks.subjectid where mark < 40; OR select stuname, sname, mark from student join marks using(studentid) join subjectusing (subjectid) where mark < 40;