240 likes | 249 Views
Learn the basics of SQL queries for data manipulation with examples and explanations. Understand the SELECT-FROM-WHERE statements and relational operations in SQL for optimized query executions. Explore single-relation queries, extended projections, selections, patterns, and more.
E N D
SQL • SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in languages like C++. • What makes SQL viable is that its queries are “optimized” quite well, yielding efficient query executions.
Select-From-Where Statements • The principal form of a query is: SELECT desired attributes FROM one or more tables WHERE condition about tuples of the tables
Our Running Example • Our SQL queries will be based on the following database schema. Movie(title, year, length, inColor, studioName, producerC) StarsIn(movieTitle, movieYear, starName) MovieStar(name, address, gender, birthdate) MovieExec(name, address, cert#, netWorth) Studio(name, address, cert#, netWorth) Find all movies produced by Disney Studios in 1990. SELECT * FROM Movie WHERE studioName = 'Disney' AND year = 1990;
Meaning of Single-Relation Query • Begin with the relation in the FROM clause. • Apply the selection indicated by the WHERE clause. • Apply the extended projection indicated by the SELECT clause.
(Extended) Projection in SQL SELECT title, length FROM Movie WHERE studioName = 'Disney' AND year = 1990; SELECT title AS name, length AS duration FROM Movie WHERE studioName = 'Disney' AND year = 1990; SELECT title AS name, length*0.016667 AS lenghtInHours FROM Movie WHERE studioName = 'Disney' AND year = 1990; SELECT title AS name, length/60 AS length, 'hrs.' AS inHours FROM Movie WHERE studioName = 'Disney' AND year = 1990;
Selection in SQL • The selection of the relational algebra is available through the WHERE clause of SQL. • We may build expressions by using the operators: = <> < > <= >= • The string constants are surrounded by single quotes. • studioName = 'Disney' • Numeric constants are for e.g.: -12.34, 1.23E45 • Boolean operators are: AND, OR, NOT. SELECT title FROM Movie WHERE (year > 1970) AND NOT (inColor='C');
Selection in SQL (Cont.) • Which Disney movies are after 1970 or have length greater than 90 mins? SELECT title FROM Movie WHERE (year > 1970 OR length < 90) AND studioName='Disney'; • The parenthesis are needed because the precedence of OR is less than that of AND.
Comparision of strings • Strings can as well be compared (lexicographically) with the same operators: = <> < > <= >= • For instance ‘fodder’<‘foo’ ‘bar’ < ‘bargain’
Patterns • WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches. • General form: <Attribute> LIKE <pattern> <Attribute> NOT LIKE <pattern> • Pattern is a quoted string with % = “any string” _ = “any character.” • Examples. Suppose we remember a movie “Star something”. SELECT title FROM Movie WHERE title LIKE 'Star %'; SELECT title FROM Movie WHERE title LIKE '%''s%'; Two consecutive apostrophes in a string represent itself and not the end of the string.
Comparison of Strings (Continued) • What if the pattern we wish to use in a LIKE expression involves the characters % or _? • We should “escape” their special meaning proceeding them by some escape character. • In UNIX and C we use backslash \ as the escape character. • SQL allows us to use any character we like. • s LIKE 'x%%x%%' ESCAPE 'x' • x will be the escape character. • A string that is matched by this pattern is for example: %aaaa% SELECT title FROM Movie WHERE title LIKE 'x%%x%%' ESCAPE 'x';
Ordering the Input • We may ask the tuples produced by a query to be presented in sorted order. • ORDER BY <list of attributes> • Example. Find the Disney movies of 1990. • Movie(title, year, length, inColor, studioName, producerC#) • To get the movies listed by length, shortest first, and among movies of equal length, sort alphabetically: SELECT * FROM Movie WHERE studioName = 'Disney' ORDER BY length, title; • Ordering is ascending, unless you specify the DESC keyword to an attribute. • Ties are broken by the second attribute on the ORDER BY list, etc.
NULL Values • Tuples in SQL relations can have NULL as a value for one or more components. • Meaning depends on context. Two common cases: • Missing value : e.g., we know the length has some value, but we don’t know what it is. • Inapplicable : e.g., the value of attribute spouse for an unmarried person.
Comparing NULL’s to Values • The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN. • When any value is compared with NULL, the truth value is UNKNOWN. • But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN).
Three-Valued Logic • To understand how AND, OR, and NOT work in 3-valued logic, think of • TRUE = 1, FALSE = 0, and UNKNOWN = ½. • AND = MIN • OR = MAX • NOT(x) = 1-x • Example: TRUE AND (FALSE ORNOT(UNKNOWN)) = MIN(1, MAX(0, (1 - ½ ))) = MIN(1, MAX(0, ½ )) = MIN(1, ½ ) = ½.
Surprising Example SELECT * FROM Movie WHERE length <=120 OR length > 120; • Suppose that we have some NULL values in the length. • What’s the result?
Products and Joins in SQL • SQL has a simple way to couple relations in one query: list each relation in the FROM clause. • All the relations in the FROM clause are coupled through Cartesian product • Then we can put conditions in the WHERE clause in order to get the various kinds of join. • Example. We want to know the name of the producer of Star Wars. • To answer we need the information from both of the relations: • Movie(title, year, length, inColor, studioName, producerC) • MovieExec(name, address, cert, netWorth) SELECT name FROM Movie, MovieExec WHERE title = 'Star Wars' AND producerC = cert;
Disambiguating Attributes • When we involve two or more relations in a query, we can have attributes with the same name among these relations. • We solve the problem of disambiguating between them by putting the name of the relation followed by a dot and then the name of the attribute. • Example. Suppose we wish to find pairs (star, movie executive) living in the same address. • MovieStar(name, address, gender, birthdate) • MovieExec(name, address, cert, netWorth) SELECT MovieStar.name, MovieExec.name FROM MovieStar, MovieExec WHERE MovieStar.address = MovieExec.address;
Tuple Variables • Sometimes we need to ask a query that involves two or more tuples from the same relation. We may list a relation R as many times we want in the from clause but we need a way to refer to each occurrence of R. • SQL allows us to define, for each occurrence in the FROM clause, an alias which we call “tuple variable.” • Example. We like to know about two stars who share an address. SELECT Star1.name, Star2.name FROM MovieStar AS Star1, MovieStar AS Star2 WHERE Star1.address = Star2.address AND Star1.name < Star2.name; AS is not supported in Oracle.
In Oracle SELECT Star1.name, Star2.name FROM MovieStar Star1, MovieStar Star2 WHERE Star1.address = Star2.address AND Star1.name <> Star2.name;
Tuple Variables (Continued) • Why we have the condition • Star1.name < Star2.name ? • Without this condition we would produce also pairs of identical star names. • This conditions forces us to produce each pair of stars with a common address only. • Why we used the operator < and not <>? • If we had used <> the we would have produced pairs of married stars twice, like: Star1.name Star2.name Alec Baldwin Kim Basinger Kim Basinger Alec Baldwin
Conversion to Relational Algebra • Another approach to interpret SQL queries is to relate them to relational algebra. • Start with the relations in the FROM clause and take their Cartesian Product. • Having created the product, we apply a selection operator to it by converting the WHERE clause to a selection condition. • Finally from the list of attributes in the SELECT clause we do a projection.
An Unintuitive Consequence of SQL semantics • Suppose R, S, T are unary relations each having attribute A alone. • We wish to compute R(ST). • We might expect the following SQL query to do the job. SELECT R.A FROM R, S, T WHERE R.A = S.A OR R.A = T.A • However, consider the situation in which T is empty. Since R.A = T.A can never be satisfied, we might expect the query to produce exactly RT. • But using the interpretation the result is empty. • If we use the conversion to RA, the Cartesian product R x S x T is .
Union, Intersection, and Difference of Queries • If two SQL queries produce relations with the same set of attributes then we can combine the queries using the set operations: UNION, INTERSECT and EXCEPT. • Example. Suppose we want the names and addresses of all female movie stars who are also movie executives with a net worth over $1,000,000. • MovieStar(name, address, gender, birthdate) • MovieExec(name, address, cert, netWorth) (SELECT name, address FROM MovieStar WHERE gender = 'F') INTERSECT (SELECT name, address FROM MovieExec WHERE netWorth > 1000000);
Union, Intersection, and Difference of Queries (Continued) • Example. Give the names and addresses of movie stars who are not also movie executives. (SELECT name, address FROM MovieStar) EXCEPT (SELECT name, address FROM MovieExec); • In ORACLE the EXCEPT is MINUS. • Example. We want all the titles and years of movies that appeared in either the Movie or StarsIn relation. (SELECT title, year FROM Movie) UNION (SELECT title, year FROM StarsIn);