Research topics in data warehouses
Directed by: Dr. Rahgozar
Mostafa H. Chehreghani
List of research topics
• Lineage tracing
• Incremental view maintenance
• Indexing in data warehouses
• Data quality
Lineage tracing
• List of papers:
  • Using AutoMed Metadata in Data Warehousing Environments
  • A Tutorial on the IQL Query Language
  • Practical Lineage Tracing in Data Warehouses
  • Incremental view maintenance and data lineage tracing in heterogeneous database environments
  • A Framework for supporting data integration using the materialized and virtual approaches
Lineage tracing
• AutoMed: a model for metadata in data warehouses
  • Uses tags for relations
  • Uses a query language such as IQL
  • Node, Edge, Constraint
• IQL:
  • Functional, typed language
  • Prefix and infix functions
  • New functions are defined with lambda abstractions
  • lambda {x,y,z} ((*) ((+) x y) z)
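The lambda abstraction on this slide can be mimicked in Python to show how it evaluates. This is only an analogy, not IQL itself; the names `add`, `mul`, and `f` are illustrative assumptions.

```python
import operator

# Prefix forms of (+) and (*), mirroring IQL's bracketed infix operators.
add = operator.add
mul = operator.mul

# IQL: lambda {x,y,z} ((*) ((+) x y) z)
# The single argument is a 3-tuple that is pattern-matched into x, y, z.
def f(t):
    x, y, z = t
    return mul(add(x, y), z)

print(f((1, 2, 3)))  # (1 + 2) * 3 = 9
```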
IQL
• let v = q1 in q2
  • let v = ((+) 200 500) in ((*) v v)
• Bag union: R ++ S
• Duplicate elimination: distinct (R)
• setUnion R S ≡ distinct (R ++ S)
• Difference: R -- S
• Projection: [{x,z} | {x,y,z} <- R]
• Cartesian product and joins
• Grouping and aggregation operations
  • gc agFun xs
  • map f xs
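A rough Python analogue of the IQL operations listed on this slide, with bags modelled as lists. The relations `R` and `S` are made-up sample data, and two readings are assumptions rather than statements about IQL: that difference removes one occurrence per matching tuple (bag semantics), and that `gc` groups pairs on their first component before aggregating the second.

```python
from collections import Counter

# Hypothetical sample bags (lists keep duplicates).
R = [(1, "a", 10), (1, "a", 10), (2, "b", 20)]
S = [(1, "a", 10), (3, "c", 30)]

# let v = ((+) 200 500) in ((*) v v)
v = 200 + 500
let_result = v * v

def bag_union(r, s):
    """R ++ S: concatenation, duplicates kept."""
    return list(r) + list(s)

def distinct(r):
    """distinct (R): drop duplicates, keep first-seen order."""
    return list(dict.fromkeys(r))

def set_union(r, s):
    """setUnion R S, defined as distinct (R ++ S)."""
    return distinct(bag_union(r, s))

def bag_difference(r, s):
    """R -- S under assumed bag semantics: each tuple in S cancels one copy in R."""
    remaining = Counter(s)
    out = []
    for t in r:
        if remaining[t] > 0:
            remaining[t] -= 1
        else:
            out.append(t)
    return out

# Projection: [{x,z} | {x,y,z} <- R]
projection = [(x, z) for (x, y, z) in R]

def gc(ag_fun, pairs):
    """Assumed reading of gc: group on the first component, aggregate the second."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return [(key, ag_fun(values)) for key, values in groups.items()]

print(let_result)
print(set_union(R, S))
print(bag_difference(R, S))
print(gc(sum, projection))
```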
Using IQL in AutoMed
• Example: enforce a unique key constraint:
  (=) (count (distinct [n | {s,n} <- <<Student,name>>])) (count <<Student>>)
• name: field
• Student: table
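The constraint compares the number of distinct names with the number of students. A minimal Python sketch of the same check follows, assuming `<<Student>>` yields one key per student and `<<Student,name>>` yields (key, name) pairs; the sample data and the function name `name_is_unique` are hypothetical.

```python
def name_is_unique(student, student_name):
    """(=) (count (distinct [n | {s,n} <- <<Student,name>>])) (count <<Student>>)"""
    distinct_names = {n for (_s, n) in student_name}
    return len(distinct_names) == len(student)

# Hypothetical extents.
student = ["s1", "s2", "s3"]
unique_names = [("s1", "Ann"), ("s2", "Bob"), ("s3", "Carol")]
clashing_names = [("s1", "Ann"), ("s2", "Bob"), ("s3", "Ann")]

print(name_is_unique(student, unique_names))    # constraint holds
print(name_is_unique(student, clashing_names))  # "Ann" appears twice, constraint fails
```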
Example of lineage tracing
• T_{S1,S2} =
• addNode («dept», {“Maths”, “CompSci”});
• addNode («person», [x | x <- «mathematician»] ++ [x | x <- «compScientist»]);
• addNode («avgDeptSalary», {avg [s | (m,s) <- «_, mathematician, salary»]} ++ {avg [s | (c,s) <- «_, compScientist, salary»]});
• addEdge («_, dept, person», [(“Maths”, x) | x <- «mathematician»] ++ [(“CompSci”, x) | x <- «compScientist»]);
• addEdge («_, person, salary», «_, mathematician, salary» ++ «_, compScientist, salary»);
• addEdge («_, dept, avgDeptSalary», {(“Maths”, avg [s | (m,s) <- «_, mathematician, salary»]), (“CompSci”, avg [s | (c,s) <- «_, compScientist, salary»])});
Example of lineage tracing (continued)
• delEdge («_, mathematician, salary», [(p, s) | (d, p) <- «_, dept, person»; (p’, s) <- «_, person, salary»; d = “Maths”; p = p’]);
• delEdge («_, compScientist, salary», [(p, s) | (d, p) <- «_, dept, person»; (p’, s) <- «_, person, salary»; d = “CompSci”; p = p’]);
• delNode («mathematician», [p | (d, p) <- «_, dept, person»; d = “Maths”]);
• delNode («compScientist», [p | (d, p) <- «_, dept, person»; d = “CompSci”]);
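To make the pathway concrete, the sketch below evaluates the add and delete queries over small, made-up source extents in Python. It is not AutoMed code: the data, the variable names, and the naive `lineage_of_person` lookup are all assumptions for illustration. It shows that the queries attached to the delete steps recover the source constructs from the integrated ones, which is what makes tracing a tuple back to its sources possible.

```python
from statistics import mean

# Hypothetical source extents (not from the slides).
mathematician = ["ann", "bob"]
comp_scientist = ["carol", "dave"]
math_salary = [("ann", 50), ("bob", 70)]
cs_salary = [("carol", 60), ("dave", 80)]

# addNode / addEdge steps: build the integrated constructs.
dept = ["Maths", "CompSci"]
person = mathematician + comp_scientist
dept_person = ([("Maths", x) for x in mathematician]
               + [("CompSci", x) for x in comp_scientist])
person_salary = math_salary + cs_salary
dept_avg_salary = [("Maths", mean(s for _m, s in math_salary)),
                   ("CompSci", mean(s for _c, s in cs_salary))]

# delNode / delEdge steps: the queries that express each source construct
# in terms of the integrated constructs.
recovered_mathematician = [p for d, p in dept_person if d == "Maths"]
recovered_math_salary = [(p, s)
                         for d, p in dept_person
                         for p2, s in person_salary
                         if d == "Maths" and p == p2]

def lineage_of_person(x):
    """Naive lineage: the source tuples that contribute x to person."""
    return {
        "mathematician": [m for m in mathematician if m == x],
        "compScientist": [c for c in comp_scientist if c == x],
    }

print(dept_avg_salary)
print(recovered_mathematician == mathematician)
print(lineage_of_person("carol"))
```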
Incremental view maintenance
• List of papers:
  • Incremental view maintenance and data lineage tracing in heterogeneous database environments
  • View maintenance in a warehousing environment
  • A System Prototype for Warehouse View Maintenance
Incremental view maintenance
• Di: set of base relations
• ΔDi: bag of tuples inserted into Di
• ∇Di: bag of tuples deleted from Di
• V: materialized view
• ΔV: bag of tuples inserted into V
• ∇V: bag of tuples deleted from V
• Vnew = (V ++ ΔV) -- ∇V
• Minimality condition:
  • ∇V ⊆ V
  • ΔV ∩ ∇V = Ø
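The update rule for the materialized view can be sketched with Python's `collections.Counter` standing in for bags. The function names and sample data are illustrative assumptions. The minimality check encodes one reading of the condition: every deleted tuple must already be in V, and the inserted and deleted bags share no tuples.

```python
from collections import Counter

def apply_delta(v, inserted, deleted):
    """Vnew = (V ++ inserted) -- deleted, with bags as Counters.

    Counter subtraction drops non-positive counts, which matches bag difference.
    """
    return (v + inserted) - deleted

def is_minimal(v, inserted, deleted):
    """Assumed minimality reading: deleted is contained in V, and
    inserted and deleted are disjoint."""
    contained = all(v[t] >= count for t, count in deleted.items())
    disjoint = not (inserted & deleted)
    return contained and disjoint

# Hypothetical view contents and deltas.
V = Counter({"a": 2, "b": 1})
inserted = Counter({"c": 1})
deleted = Counter({"a": 1})

print(apply_delta(V, inserted, deleted))
print(is_minimal(V, inserted, deleted))
print(is_minimal(V, Counter({"a": 1}), Counter({"a": 1})))  # delete-then-reinsert is not minimal
```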
Indexing in data warehouse
• Paper:
  • Bitmap Index Design and Evaluation
• Advantages:
  • Compact size
  • Efficient hardware support for bitmap operations (AND, OR, XOR, NOT)
  • Fast search
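A minimal sketch of a bitmap index in Python, using plain integers as bit vectors (bit i is set when row i holds the value). The columns `region` and `status` and the helper names are hypothetical; real systems add compression and encoding schemes that are omitted here. Predicates combine through single bitwise operations, which is where the hardware support pays off.

```python
def build_bitmap_index(values):
    """Map each distinct value to an integer whose bit i is set if row i has it."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index

def rows(bitmap):
    """Decode a bitmap back into the list of matching row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical fact-table columns, one entry per row.
region = ["north", "south", "north", "east"]
status = ["open", "open", "closed", "open"]

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)
all_rows = (1 << len(region)) - 1  # mask so NOT stays within the table

north_and_open = region_idx["north"] & status_idx["open"]   # AND
north_or_east = region_idx["north"] | region_idx["east"]    # OR
not_closed = ~status_idx["closed"] & all_rows                # NOT

print(rows(north_and_open))
print(rows(north_or_east))
print(rows(not_closed))
```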
Data quality in data warehouse
• List of papers:
  • Towards Quality-Oriented Data Warehouse Usage and Evolution
  • Data Quality Problems and Proactive Data Quality Management in Data-Warehouse-Systems
  • Data Warehouse Data Policy
• Fitness for use
• Subjective:
  • Related to end users
• Objective:
  • Definition of system
• Models:
  • GQM: Goal Question Metric
  • English
GQM
• Goal factor
• The importance of each factor is determined with respect to the goal
• Quality dimensions:
  • Data coherence
  • Data completeness
  • Data freshness