Logics for Data and Knowledge Representation
Query languages
Originally by Alessandro Agostini and Fausto Giunchiglia
Modified by Fausto Giunchiglia, Rui Zhang and Vincenzo Maltese
Outline
• Relational and Algebraic Structures
• Answer set
• Relational schemas and Databases
• Query languages
- Domain Relational Calculus
- Tuple Relational Calculus
- SQL
Relational Structure
STRUCTURES :: ANSWER SET :: RELATIONAL SCHEMAS AND DATABASES :: QUERY LANGUAGES
• A relational structure is a mathematical structure of the form S = <D, R1, …, Rn> where:
- D is a non-empty domain (a set of objects)
- Ri with 1 ≤ i ≤ n is a relation over D of any arity
Example: <N, ≤> is a relational structure.
Algebraic Structure
• A mathematical structure S is an algebraic structure if S = <D, f1, …, fn> where each fi with 1 ≤ i ≤ n is a function over D.
• The term structure is therefore used to denote either a relational structure or an algebraic structure.
• If D is finite, we say that the structure S is finite.
Example: <N, +> is an algebraic structure.
Answer Set
• FOL can be used as a query language on structures.
• Let γ = γ(x1, …, xn) be a FOL formula with free variables x1, …, xn, M a structure with domain D, and a an assignment. Then γ is a query over M, and
Qγ = {(a1, …, an) ∈ Dⁿ | M ⊨ γ [a1, …, an]}
• Qγ is called the answer set (or retrieval set) of γ.
• For a fixed formula γ, model checking over a finite structure takes time polynomial in the size of the structure, so finding the answer set also takes polynomial time in that size.
NOTE: A database is a relational structure [Codd, 1970]. Tables in databases are finite relations.
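The definition of Qγ can be read as a brute-force procedure: enumerate every tuple in Dⁿ and keep those that satisfy γ. The following is a minimal sketch of that reading, not an efficient evaluator. The function name `answer_set`, the four-element fragment of <N, ≤>, and the sample query are all assumptions made for illustration.

```python
from itertools import product

def answer_set(domain, arity, gamma):
    """Return every tuple in domain**arity for which gamma holds.

    gamma is a Python predicate standing in for a FOL formula
    with `arity` free variables (an illustrative encoding).
    """
    return {tup for tup in product(domain, repeat=arity) if gamma(*tup)}

# A finite fragment of <N, <=>: D = {0, 1, 2, 3} with the usual order.
D = {0, 1, 2, 3}
leq = {(a, b) for a in D for b in D if a <= b}

# gamma(x, y) = (x <= y) and not (y <= x), i.e. x is strictly below y.
strictly_below = answer_set(
    D, 2, lambda x, y: (x, y) in leq and (y, x) not in leq
)
print(sorted(strictly_below))
```

The loop inspects |D|ⁿ candidate tuples, which is polynomial in |D| once the formula (and hence n) is fixed.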
Example (I)
• DB = <D, Attend>
D = {Enzo, Max, Mary, LDKR, ML, Fausto, Alex}
Attend = {(Enzo, LDKR, Fausto), (Max, LDKR, Fausto), (Mary, ML, Alex), (Mary, LDKR, Fausto)}
γ = Attend(Enzo, LDKR, x)    Qγ = {(Fausto)}
γ = Attend(Mary, x, Alex)    Qγ = {(ML)}
γ = Attend(x, LDKR, Fausto)    Qγ = {(Enzo), (Max), (Mary)}
[Table: ATTEND]
Example (II)
• DB = <D, Attend>
D = {Enzo, Max, Mary, LDKR, ML, Fausto, Alex}
Attend = {(Enzo, LDKR, Fausto), (Max, LDKR, Fausto), (Mary, ML, Alex), (Mary, LDKR, Fausto)}
γ = ∃z Attend(Mary, x, z)    Qγ = {(ML), (LDKR)}
Qγ is such that DB ⊨ ∃z Attend(Mary, x, z) [a(x) = ML] or DB ⊨ ∃z Attend(Mary, x, z) [a(x) = LDKR]
[Table: ATTEND]
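The four queries of Examples (I) and (II) can be checked mechanically. The sketch below encodes D and Attend exactly as given on the slides and evaluates each formula by ranging the free variable x over D; the existential quantifier in Example (II) becomes `any(...)` over D. The variable names `q1` to `q4` are only labels chosen here.

```python
D = {"Enzo", "Max", "Mary", "LDKR", "ML", "Fausto", "Alex"}
Attend = {
    ("Enzo", "LDKR", "Fausto"),
    ("Max", "LDKR", "Fausto"),
    ("Mary", "ML", "Alex"),
    ("Mary", "LDKR", "Fausto"),
}

# Example (I): one free variable x, no quantifiers.
q1 = {(x,) for x in D if ("Enzo", "LDKR", x) in Attend}
q2 = {(x,) for x in D if ("Mary", x, "Alex") in Attend}
q3 = {(x,) for x in D if (x, "LDKR", "Fausto") in Attend}

# Example (II): the bound variable z is searched over the whole domain.
q4 = {(x,) for x in D if any(("Mary", x, z) in Attend for z in D)}

for name, q in [("q1", q1), ("q2", q2), ("q3", q3), ("q4", q4)]:
    print(name, sorted(q))
```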
Relational schema and database schema
• A relational schema consists of a name and an arity.
• A relational schema is a FOL formula.
Example: Attend(x, y, z) has name “Attend” and arity 3.
• A database schema is a collection of relational schemas plus a domain.
Example: D + Attend(x, y, z) + Student(x, y) + Course(x, y)
Relational instance and database instance
• An n-ary relational instance is an n-ary relation R ⊆ Dⁿ.
Example: Attend(Enzo, LDKR, Fausto)
• A database instance is a collection of relational instances over a domain.
Example: DB + the collection of rows in the tables
NOTE: a database instance is a relational structure.
Relational database
• A relational database is a finite relational structure DB = <D, R1, …, Rn>
• In particular a DB is said to be in first normal form if:
- the objects in D are atomic (i.e. are not sets)
- the relations R1, …, Rn are defined over D
- the order of the tuples in each Ri does not matter (each Ri is a set)
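The three first-normal-form conditions above can be turned into a simple check. This is only a sketch under an assumed encoding: the domain is a Python set, each relation is a set of tuples, and "atomic" is approximated as "not a Python collection". The function name `is_first_normal_form` is hypothetical.

```python
def is_first_normal_form(domain, relations):
    """Check the three slide conditions under the encoding described above."""
    # 1. Objects in D are atomic (approximated: not a collection type).
    atomic = all(
        not isinstance(obj, (set, frozenset, list, tuple, dict))
        for obj in domain
    )
    # 2. Every component of every tuple is drawn from D.
    over_domain = all(
        component in domain
        for rel in relations
        for tup in rel
        for component in tup
    )
    # 3. Each relation is a set, so tuple order carries no meaning.
    are_sets = all(isinstance(rel, (set, frozenset)) for rel in relations)
    return atomic and over_domain and are_sets

D = {"Enzo", "Max", "Mary", "LDKR", "ML", "Fausto", "Alex"}
Attend = {("Enzo", "LDKR", "Fausto"), ("Mary", "ML", "Alex")}
print(is_first_normal_form(D, [Attend]))

# A domain containing a set-valued object violates atomicity.
courses = frozenset({"LDKR", "ML"})
print(is_first_normal_form({"Mary", courses}, [{("Mary", courses)}]))
```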
Domain Relational Calculus (DRC)
“List name, position and manager of the employees”
γ = Employee(x, y, z)    Qγ = all the tuples in the relation
“List managers of the employees named Enzo”
γ = ∃y Employee(Enzo, y, z)    Qγ = {(Fausto)}
[Table: EMPLOYEE]
SELECT
“List name and position of the employees having a manager”
γ = ∃z Employee(x, y, z)    Qγ = {(Enzo, PhD), (Fausto, Professor), (Feroz, PhD)}
NOTE: The ∀ quantifier is used to state a condition that must be true for all the tuples in the database, i.e. no missing values.
[Table: EMPLOYEE]
FILTER
“List name and position of the employees with Fausto as manager”
γ = ∃z Employee(x, y, Fausto, z)    Qγ = {(Enzo, PhD)}
“List name and position of the employees with age > 28”
γ = ∃w ∃z (Employee(x, y, z, w) ∧ (w > 28))    Qγ = {(Enzo, PhD), (Fausto, Professor)}
[Table: EMPLOYEE]
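Both filter queries can be evaluated by ranging the free variables x, y over the domain and searching for witnesses of the bound variables. The EMPLOYEE table itself is not reproduced in this transcript, so the rows below are hypothetical: the managers `M1` and `M2` and all ages are invented placeholders, chosen only so that the answer sets agree with the ones stated on the slide.

```python
# Hypothetical rows (name, position, manager, age); only the answer sets
# on the slide are taken from the source, the rest is assumed.
Employee = {
    ("Enzo", "PhD", "Fausto", 35),
    ("Fausto", "Professor", "M1", 50),
    ("Feroz", "PhD", "M2", 26),
}

# Active domain: every object that occurs somewhere in the relation.
D = {component for row in Employee for component in row}

# gamma = Exists z. Employee(x, y, Fausto, z)
by_manager = {
    (x, y)
    for x in D
    for y in D
    if any((x, y, "Fausto", z) in Employee for z in D)
}

# gamma = Exists w. Exists z. (Employee(x, y, z, w) and w > 28)
# The membership test runs first, so w > 28 is only evaluated on real ages.
older_than_28 = {
    (x, y)
    for x in D
    for y in D
    if any((x, y, z, w) in Employee and w > 28 for z in D for w in D)
}

print(sorted(by_manager))
print(sorted(older_than_28))
```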
JOIN
“List name of faculty members and their department”
γ = ∃y ∃z (Faculty(x, y, z) ∧ Dept(x, w))    Qγ = {(Enzo, DISI), (Fausto, DISI), (Feroz, MATH)}
“List name of faculty members who belong to all departments”
γ = ∀w (∃v Dept(v, w) → ∃y ∃z (Faculty(x, y, z) ∧ Dept(x, w)))
Here the antecedent ∃v Dept(v, w) says that w is a department; its bound variable is named v to keep it distinct from the free variable x of the query.
[Tables: FACULTY, DEPT]
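The join and the "all departments" query can be sketched with set comprehensions. The FACULTY and DEPT tables are not reproduced in this transcript, so the rows below are hypothetical and chosen to match the join answer set on the slide; the second and third columns of Faculty are assumed to be position and age. The function names are illustrative, and `in_all_departments` restricts x to names occurring in Faculty, which agrees with the formula whenever at least one department exists.

```python
# Hypothetical rows consistent with the join result on the slide.
Faculty = {
    ("Enzo", "PhD", 35),
    ("Fausto", "Professor", 50),
    ("Feroz", "PhD", 26),
}
Dept = {("Enzo", "DISI"), ("Fausto", "DISI"), ("Feroz", "MATH")}

def join(faculty, dept):
    """Pairs (x, w) with Faculty(x, y, z) and Dept(x, w) for some y, z."""
    return {(x, w) for (x, _y, _z) in faculty for (x2, w) in dept if x == x2}

def in_all_departments(faculty, dept):
    """Names x such that Dept(x, w) holds for every department w."""
    departments = {w for (_v, w) in dept}
    names = {x for (x, _y, _z) in faculty}
    return {(x,) for x in names if all((x, w) in dept for w in departments)}

print(sorted(join(Faculty, Dept)))
print(in_all_departments(Faculty, Dept))

# With one extra (assumed) membership, Fausto belongs to every department.
print(in_all_departments(Faculty, Dept | {("Fausto", "MATH")}))
```

With the sample rows nobody belongs to both DISI and MATH, so the universal query returns the empty set until the extra tuple is added.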
Problems with DRC
• Atomic objects in the domain cannot be distinguished by column.
Example: Faculty(PhD, 28, Enzo, Fausto) is a valid (but unwanted) query.
• The order of the columns is important: the DB must be queried by remembering the order of the columns.
To overcome these problems we use the Tuple Relational Calculus (see next slide).
Tuple Relational Calculus (TRC)
• We define a set of types {t1, …, tn}. The set of objects D is then partitioned as follows: D = Dt1 ∪ … ∪ Dtn, where Dti is the set of objects of type ti.
• We introduce a set of attributes, i.e. pairs of the form (name, type). Each pair is a typed attribute and the name is the attribute (which corresponds to the name of a column).
• A relational schema is a relation name with a tuple of typed attributes.
Example: Employee(Name String, Position String, Manager String, Age Int)
Queries in Tuple Relational Calculus
“List all employees who have an age of exactly 35”
γ = Employee(t) ∧ (t.age = 35)
“List name and position of the employees who have age 35”
γ = ∃ t : {name, position} (Employee(t) ∧ t.age = 35)
• We write t.age to extract from the tuple t the content of the attribute called “age”.
• t is called a tuple variable.
• In TRC variables range over tuples and not over the domain D.
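Tuple variables with named attributes map naturally onto Python's `NamedTuple`, where `t.age` reads the attribute by name rather than by column position. The sketch below follows the Employee schema given earlier (name, position, manager, age); the rows are hypothetical sample data, with `M1` and `M2` as placeholder managers.

```python
from typing import NamedTuple

class Employee(NamedTuple):
    """Typed attributes, mirroring Employee(Name, Position, Manager, Age)."""
    name: str
    position: str
    manager: str
    age: int

# Hypothetical rows; the source table is not reproduced in the slides.
employees = {
    Employee("Enzo", "PhD", "Fausto", 35),
    Employee("Fausto", "Professor", "M1", 50),
    Employee("Feroz", "PhD", "M2", 26),
}

# gamma = Employee(t) and t.age = 35: t ranges over whole tuples.
exactly_35 = {t for t in employees if t.age == 35}

# Same condition, keeping only the attributes name and position.
name_position = {(t.name, t.position) for t in employees if t.age == 35}

print(exactly_35)
print(name_position)
```

Because attributes are accessed by name, the query no longer depends on remembering the column order.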
Structured Query Language (SQL)
• A friendly face over the Tuple Relational Calculus
SELECT {T1.attrib, …, T2.attrib}
FROM {relation} T1, {relation} T2, …
WHERE {predicates}
Queries in Structured Query Language
“List name and position of the employees with Fausto as manager”
SELECT name, position
FROM employee
WHERE manager = 'Fausto'
[Table: EMPLOYEE]
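The query above can be run against an in-memory SQLite database using Python's standard `sqlite3` module. This is a sketch under assumptions: the column types and the three rows are hypothetical sample data (managers `M1` and `M2` are placeholders), and the manager name is passed as a bound parameter rather than written inline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (name TEXT, position TEXT, manager TEXT, age INTEGER)"
)
# Hypothetical rows; the source table is not reproduced in the slides.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("Enzo", "PhD", "Fausto", 35),
        ("Fausto", "Professor", "M1", 50),
        ("Feroz", "PhD", "M2", 26),
    ],
)

# Same query as on the slide, with the manager supplied as a parameter.
rows = conn.execute(
    "SELECT name, position FROM employee WHERE manager = ?", ("Fausto",)
).fetchall()
conn.close()

print(rows)
```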