110 likes | 274 Views
Materialized View Selection in a Multidimensional Database. Elena Baralis Stefano Paraboschi Ernest Teniente. A Practical Example.
E N D
Materialized View Selection in a Multidimensional Database Elena Baralis Stefano Paraboschi Ernest Teniente
A Practical Example Consider the MDDB for a large grocery store chain, characterized by a large number of stores, each of which is a supermarket selling a wide variety of different products. We can identify the following dimensions: • Product, which can be characterized by 15 different attributes. • Store, which characterizes each point of sale. The store dimension contain 15 attributes. • Time, which provides the appropriate detail to allow accurate analysis of the MDDB data. The time dimension have 9 attributes. • Promotion, which describes the characteristics of product promotions. The promotion dimension is characterized by 11 attributes.
Attribute hierarchy • An attribute hierarchy on a dimension table D is a set of functional dependencies FDD={fd0,fd1,…,fdn}, where each fdi is characterized by two sets of attributes Ail
Identification of Candidate Views The idea of the reduction technique is to consider only those views of an MD-lattice that , when materialized, can provide some contribution to reduce the total cost. We call them candidate views. • View vi is associated to some query qi. • There exist two candidate views vj and vk, and vi is the least upper bound (l.u.b) of vj and vk.
has an associated query • The cost of using a set of views already materialized: • The cost of using the materialization of view : When the materialization of will be beneficial.
There exist at least two candidate views, vj and vk, such that vi is the l.u.b of vj and vk • When vi and vk are materialized and vi is not, the cost C(Q,M,F) is : • If vi is materialized while vj and vk are not, with vi being the least expensive materialization for both qj and qk:
Data-cube lattice with associated queries psdr q4 q3 psd psr pdr sdr q2 ps pd pr sd sr dr q1 p s d r none q1 = total sales per product q2 = total sales per product and store q3 = total sales per product and day q4 = total sales per product, store and day
Operator ancestor The result of the ancestor operator to queries qx and qy is the “smallest” query that contains all the information necessary for answering qx as well as qy.
Operator descendent The descendent operator computes the “greatest” among the set of attributes characterizing the queries that can be computed by both qx and qy.
A heuristic reduction • Example: A dimension A with 1,000 tuples. A view contain an aggregation fro the pair of attributes {A1, A2}, where each attribute has 100 distinct values. There will be 10,000 possible pairs of values of the attributes! Instead of materializing this view, it could be convenient to use the view which has the key of dimension A as aggregating attribute.