750 likes | 926 Views
V iew Maintenance. Based on several papers in view maintenance, most notably: A.Gupta and I.S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Application . In Bulletin of Technical Committee on Data Engineering 1995. Outline. Introduction to views
E N D
View Maintenance Based on several papers in view maintenance, most notably: A.Gupta and I.S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Application. In Bulletin of Technical Committee on Data Engineering 1995
Outline • Introduction to views • The problem of materialized view maintenance • The idea behind incremental view maintenance • Dimensions – the problem space • View maintenance using full information • The counting algorithm • View maintenance using partial information • Self-maintenance • Applications of materialized views • Open problems
Outline • Introduction to views • The problem of materialized view maintenance • The idea behind incremental view maintenance • Dimensions – the problem space • View maintenance using full information • The counting algorithm • View maintenance using partial information • Self-maintenance • Applications of incremental view maintenance • Open problems
1. Introduction to views What is a view? • A view is a derived relation defined in terms of base (stored) relations. • Example: • Flight is table of available direct flights. • We need a view of flights with one intermediate stop. CREATE TABLE Flight ( from CHAR(20), to CHAR(20), ); CREATE VIEW Conn(src, dest) AS SELECT F1.from, F2.to FROM Flight F1, Flight F2 WHERE F1.to = F2.from;
1. Introduction to views Views are treated as base tables in regard of querying The view Conn The base relation Flight
1. Introduction to views Motivation: Why views? • Logical data independence • If the conceptual schema changes, the changes can be “masked” through the views in the external schema • Security • Not everybody may see everything • “Relations” tailored to user’s needs • And many more reasons and applications … discussed later
1. Introduction to views View materialization vs. computing on demand • How will the following query be answered? • Two options: • Computing on demand • query modification, composing of the user query and the view query • View materialization • The view Conn would be materialized, its content would be stored in the database SELECT * FROM Conn C WHERE C.src = “Worcester” SELECT * FROM (SELECT F1.from, F2.to FROM Flight F1, Flight F2 WHERE F1.to = F2.from ) AS C WHERE C.src = “Worcester”
1. Introduction to views View materialization vs. computing on demand • How will the following query be answered? • Two options: • Computing on demand • query modification, composing of the user query and the view query • View materialization • The view Conn would be materialized, its content would be stored in the database SELECT * FROM Conn C WHERE C.src = “Worcester” SELECT * FROM (SELECT F1.from, F2.to FROM Flight F1, Flight F2 WHERE F1.to = F2.from ) AS C WHERE C.src = “Worcester”
1. Introduction to views View materialization vs. computing on demand • Trade-offs? • Queries can be answered faster • Indexes can be build over a materialized view to even more speed up the processing of the queries defined over the view • The view requires additional storage space • The consistency of the view has to be maintained
Outline • Introduction to views • The problem of materialized view maintenance • The idea behind incremental view maintenance • Dimensions – the problem space • View maintenance using full information • The counting algorithm • View maintenance using partial information • Self-maintenance • Applications • Open problems
2. The problem of view maintenance Definition of the problem of materialized view maintenance • What is materialized view maintenance ? • When the base relations are modified, • the view (often) becomes inconsistent. • Updating the view to make it consistent • is called view maintenance (refreshing).
2. The problem of view maintenance View maintenance policies ( When ? ) • Immediate view maintenance • The view is refreshed within the same transaction that updates the underlying tables + the view is always up to date - slows down the transaction • Deferred view maintenance • Lazy – the view is refreshed when query over it has to be evaluated - slows down the queries • Periodic - the view is refreshed periodically, e.g., once a day • Such views are called snapshots • Forced – the view is refreshed after a certain number of changes have been made to the underlying tables
2. The problem of view maintenance Methods of view maintenance (How ?) • Recomputation • recompute to view from scratch • Incremental view maintenance • compute the changes to the view in response to the changes to the base relation • add/delete some tuples in the existing materialized view • Heuristics: Incremental view maintenance is usually cheaper then recomputation
Outline • Introduction to views • The problem of materialized view maintenance • The idea behind incremental view maintenance • Dimensions – the problem space • View maintenance using full information • The counting algorithm • View maintenance using partial information • Self-maintenance • Applications • Open problems
The idea behind incremental view maintenance Example: Flight is table of available direct flights. We need a view of flights with one intermediate stop
3. The idea behind view maintenance F1 The view Conn F2 The base relation
3. The idea behind view maintenance F1 Inserted tuple F2 The view Conn New tuples in the view The base relation
3. The idea behind view maintenance F1 Inserted tuple F2 The view Conn New tuples in the view The base relation
3. The idea behind view maintenance F1 F1 F1 F1=F1+ F1 F2 The view Conn The base relation
3. The idea behind view maintenance The differentiation equation
3. The idea behind view maintenance F1 F2 The view Conn The base relation
3. The idea behind view maintenance F1 Join F2 The view Conn The base relation
3. The idea behind view maintenance F1 Join F2 The view Conn The base relation
3. The idea behind view maintenance F1 Join= F2 The view Conn The base relation
3. The idea behind view maintenance F1 Savings! F2 The view Conn The base relation
3. The idea behind view maintenance • Insertions • The observed example was for insertions into the base relation • Deletions • If tuples are deleted from a base relation, the tuples that need to be deleted from the view are computed similarly using delta’s • Updates • May be treated separately or may be modeled as deletions followed by inserts • Multiple relations involved • The deltas are similarly computed
Outline • Introduction to views • The problem of materialized view maintenance • The idea behind incremental view maintenance • Dimensions – the problem space • View maintenance using full information • The counting algorithm • View maintenance using partial information • Self-maintenance • Applications • Open problems
4. Dimensions Expressiveness of View Definition Language Amount of Information . . . …. Recursion Integrity Constraints Difference Outer-Joins Chronicle Algebra Other Views Union Subqueries Base Relations Aggregation Arithmetic Duplicates Materialized View Conjunctive Queries Insertions Deletions Updates Sets of each Type of Modification Group Updates Change view definition Dimensions + Instance dimension
Information Dimension • Information dimension • Information available for view maintenance (other than view definition and the modification, which are always assumed available) • E.g., • base relations, • the materialized view, • other views, • integrity constraints… • Don’t we have all these information? • Sometimes we don’t, • Sometimes it is expensive to access them (base relations when views are physically not close to data)
4. Dimensions Dimensions – information & modification • Example: CREATE TABLE Flight ( from CHAR(20), to CHAR(20), price REAL, company CHAR(20) ); CREATE VIEW CheapF(src, dest) AS SELECT DISTINCT F.from, F.to FROM Flight F WHERE F.price < 400;
4. Dimensions Can we maintain the view when tuples are inserted into base relation if only information available is the materialized view (its content) Reminder: The view definition and the modification are always available Dimensions – information & modification • Example: CREATE TABLE Flight ( from CHAR(20), to CHAR(20), price REAL, company CHAR(20) ); CREATE VIEW CheapF(src, dest) AS SELECT DISTINCT F.from, F.to FROM Flight F WHERE F.price < 400;
4. Dimensions Dimensions – information & modification • Example: CREATE TABLE Flight ( from CHAR(20), to CHAR(20), price REAL, company CHAR(20) ); CREATE VIEW CheapF(src, dest) AS SELECT DISTINCT F.from, F.to FROM Flight F WHERE F.price < 400;
Outline • Introduction to views • The problem of materialized view maintenance • The idea behind incremental view maintenance • Dimensions – the problem space • View maintenance using full information • The counting algorithm • View maintenance using partial information • Self-maintenance • Applications • Open problems
View maintenance using full information • ‘Classical’ view maintenance algorithms assume : • Full Information: the base relations, the materialized view, keys,… • All database and modification instances • Modification: inserts, deletes, updates ( as insert-delete ) • Language: Focus on efficient techniques for maintaining PSJ views expressed in subset of view definition language (SQL) • Classification along language dimension: • Nonrecursive views • The counting algorithm • Outer-join views • Recursive views • …
5. View Main. using full information The Counting Algorithm - Motivation F1 F2 The view Conn The base relation
5. View Main. using full information The Counting Algorithm - Motivation Shall we delete the (Worcester, Seattle) tuple from the view? F1 F2 The view Conn The base relation
5. View Main. using full information The Counting Algorithm - Motivation No, because it can still be derived from the remaining tuples F1 F2 The view Conn The base relation
5. View Main. using full information The Counting Algorithm • We need to know if there are more derivations of one tuple in the view • Main idea: • Keep count of number of derivations for each tuple in view • A tuple is removed from view only if its count is zero
5. View Main. using full information The Counting Algorithm F1 F2 The view Conn The base relation
5. View Main. using full information The Counting Algorithm F1 F2 The view Conn The base relation
5. View Main. using full information The Counting Algorithm F1 F2 The view Conn The base relation
5. View Main. using full information The Counting Algorithm • What if we have aggregation? CREATE TABLE Flight ( from CHAR(20), to CHAR(20), price REAL ); CREATE VIEW CheapFlight(src, dest, minPrice) AS SELECT F.from, F.to, MIN (F.price) FROM Flight F GROUP BY F.from, F.to;
5. View Main. using full information The Counting Algorithm - Aggregation The inserted tuple does not affect the view Inserted tuple The base relation Potentially affected tuple The view CheapFlight
5. View Main. using full information The Counting Algorithm - Aggregation One tuple in the view has to be updated Inserted tuple The base relation Potentially affected tuple The view CheapFlight
5. View Main. using full information The Counting Algorithm - Aggregation One tuple in the view has to be recomputed Deleted tuple The base relation Potentially affected tuple The view CheapFlight
5. View Main. using full information The Counting Algorithm - Aggregation • When a change to the base relation occurs: • Identifies the tuples that may be affected • Whenever possible incrementally computes new values of affected tuples by only looking at materialized view and modification. • Other aggregation functions that may be computed this way: COUNT, SUM, MIN, MAX • Some other aggregation functions like AVERAGE and VARIANCE can be decomposed into incrementally computable functions
5. View Main. using full information The Counting Algorithm – Multiple relations and views over views • Handles views over multiple relations, handles views over views (by first updating the views lower in the hierarchy) Materialized view 2 I I I Materialized view 1 Base relation 1 Base relation 2 Base relation 3
5. View Main. using full information The Counting Algorithm - Summary • Keeps track of the number of derivation of each tuple – tuples with count zero are deleted from the view • Handles updates as difference of positive and negative counts • Handles views over multiple relations, handles views over views • Language limitations: SPJ views, UNION, negation, aggregation • Works for both set and duplicate semantics
Outline • Introduction to views • The problem of materialized view maintenance • The idea behind incremental view maintenance • Dimensions – the problem space • View maintenance using full information • The counting algorithm • View maintenance using partial information • Self-maintenance • Applications • Open problems
View maintenance using partial information • Views may be maintainable using partial information • May depend on the modification: insert, delete or update • Goals: • Check whether the view can be maintained • How to maintain the view