130 likes | 252 Views
Data Warehousing and Decision Support. Part Two: - The use of views. Topics. What is a View? Why Views are useful in Data Warehousing? Understand Materialised Views Understand View Maintenance policies. Views and Decision Support. OLAP queries are typically aggregate queries.
E N D
Data Warehousing and Decision Support Part Two: - The use of views.
Topics • What is a View? • Why Views are useful in Data Warehousing? • Understand Materialised Views • Understand View Maintenance policies
Views and Decision Support • OLAP queries are typically aggregate queries. • Precomputation is essential for interactive response times. • The CUBE is in fact a collection of aggregate queries, and precomputation is especially important: lots of work on what is best to precompute given a limited amount of space to store precomputed results. • Warehouses can be thought of as a collection of asynchronously replicated tables and periodically maintained views.
What is a VIEW • A “virtual table” • Does not exist as a base table in the database • Automatically activated when referenced in an SQL statement. • Views are stored SQL statements. • Created with the Create View statement • Can be dropped with Drop View
To create a simple view on the employee table Create view empdept30 As Select ename, sal, comm From Emp Where deptno = 30; The View can then be used and referenced as if it were a normal table:- Select ename, sal from empdept30;
View Modification (Evaluate On Demand) CREATE VIEWRegionalSales(category,sales,state) ASSELECT P.category, S.sales, L.state FROM Products P, Sales S, Locations L WHERE P.pid=S.pid AND S.locid=L.locid View SELECT R.category, R.state, SUM(R.sales) FROMRegionalSales AS R GROUP BY R.category, R.state Query SELECT R.category, R.state, SUM(R.sales) FROM (SELECT P.category, S.sales, L.state FROM Products P, Sales S, Locations L WHERE P.pid=S.pid AND S.locid=L.locid) AS R GROUP BY R.category, R.state Modified Query
Materialized Views • A view whose tuples are stored in the database is said to be materialized. • Provides fast access, like a (very high-level) cache. • Need to maintain the view as the underlying tables change. • Ideally, we want incremental view maintenance algorithms. • Views are closely related to data warehousing, and OLAP.
Extract, Transform and Load (ETL) • Transforming data is typically accomplished by defining a relational View over the tables in the data sources. • Loading data then consists of materializing such views and storing them in the warehouse.
Issues in View Materialization • What views should we materialize, and what indexes should we build on the precomputed results? • Given a query and a set of materialized views, can we use the materialized views to answer the query? • How frequently should we refresh materialized views to make them consistent with the underlying tables? (And how can we do this incrementally?)
View Maintenance • A View Maintenance policy is a decision about when a view is refreshed. • Immediate Refresh: Can be refreshed as part of the transaction that modifies the underlying data tables. (+ Materialized view is always consistent; - updates are slowed) • Deferred Refresh: Some time later, in a separate transaction. (- View becomes inconsistent; + can scale to maintain many views without slowing updates. Updates are captured in a log and applied subsequently to the materialised views.)
Deferred Maintenance • Three flavors: • Lazy: Delay refresh until next query on view; then refresh before answering the query. • Periodic (Snapshot): Refresh periodically (say once per day). Queries will possibly be answered using outdated version of view tuples. Widely used for warehouse applications. • Forced: E.g., Refresh after a fixed number of updates to underlying data tables.
Note: • In periodic and forced view maintenance queries may see data that is not consistent with the underlying tables. A price to be paid for fast updates and queries. • A snapshot (Oracle) is a local materialization of a view on data stored at a master site.