130 likes | 271 Views
Building a Data Warehouse for Business Reporting. Presented by – Arpit Desai Faculty Advisor – Dr. Meiliu Lu CSC Department – Spring 2006 California State University, Sacramento. Agenda. Fundamentals of Data Warehousing Data warehousing and Business Analysis
E N D
Building a Data Warehouse for Business Reporting Presented by – Arpit Desai Faculty Advisor – Dr. Meiliu Lu CSC Department – Spring 2006 California State University, Sacramento
Agenda • Fundamentals of Data Warehousing • Data warehousing and Business Analysis • Total Loss Valuation Reporting (TLV)– An example project • Designing the solution data mart • Implementing the design • Business Reports for TLV • Performance enhancement techniques • Questions & Suggestions
Fundamentals of Data Warehousing (DW) • What is OLTP? • OLTP – A technology that uses highly normalized tables to quickly record large number of transactions while making sure that these updates of data occur in as few places as possible • What is OLAP? • OLAP – A technology that uses database tables to enable multi-dimensional viewing, analysis and query large amounts of data • What is a Data warehouse? • DW is a collection of historic data from different functional operations of the company • What is a Data Mart? • Data Mart (DM) is a segment of DW that provides data for reporting
Data Warehousing and Business Analysis • Benefits of DW • Useful information derived from operational data • Specially designed to quickly execute aggregate queries on large amounts of data • Queries, complex in highly normalized databases, could be easier to build and maintain in data warehouses, decreasing the workload on transaction systems • Business Requirements • Organizations need to keep track of market trends – generally through numbers • Business analysts rely on Reports (facts, figures) to make critical business decisions • New features need to be added to existing products – like reporting
Total Loss Valuation (TLV) Reporting • Sub-part of Mitchell WorkCentertm, a product of Mitchell International, San Diego, CA • Vehicle damaged beyond repair is termed as “Totaled Out” • Insurance Company needs to analyze amount of money paid out • Summarized and Detailed reports about claims filed are required • Search criteria consists of company hierarchy, types of coverage, date ranges, etc.
Players of Auto Insurance Industry Insurance company pays Settlement Value Body Shop Adjuster creates a Valuation Request (estimate) Claimant brings damaged car to the body shop If cost of repair is more than market price of the vehicle Then The vehicle gets Totaled Out
Designing the Solution Data Mart (TLV) • Functionally restricted to TLV Reports • Operational data • Data Sources/Formats standardization • Data Load • Initial load • Periodic Updates • Identifying Dimensions • Define metrics values and aggregate amounts to be included in the Fact table • Design supporting structures like indexes, synonyms, database links, materialized views, global temp tables, foreign references • Design the complete Data Model with schema of execution
Implementing the Design (TLV) • Extract, Transform & Load • Initial Load involves • Staff Dimension • All user who ever created a Valuation request – Adjuster • Office Dimension • All Offices and company hierarchy, along with its geo-location details • Valuation Info Dimension • All the Valuation requests created thus far and its details • Time Dimension • All possible dates and summarizations • Valuation History Dimension • Currently not loaded, reserved for future extension • Valuation Fact • Contains aggregate values, counts, reference keys • Periodic Update • Data Mart refreshes nightly – flushes all the tables and reloads (poor design)
Business Reporting (TLV) • Claim Detail Report Procedure • Receives selection criteria as input parameters and returns a reference cursor • Shows details of all the claims submitted, which meet the selection criteria • Claim Summary Report Procedure • Input report selection criteria output is reference cursor • Shows summarized counts, averages, percentages for claims which meet the selection criteria
Performance enhancement techniques • Use Star schema for query execution and Bitmap indexes for all tables • Nightly refresh of data mart • Use ‘truncate’ instead of ‘delete’ • ‘Rebuild’ indexes instead of ‘drop-recreate’ • Run report procedures against views, offers flexibility to data mart structure • Build and unit test each chunk of extraction queries separately
Sample Queries that the data mart processes Select a.item1, a.item2, …, b.item1, c.item1, c.item2 from xxx_fact a, xxx_dim b, xxx_dim c where a.id_1=b.id_1 and b.id_2=c.id_2 and b.item5 = <param1> and a.item6 = <param2> and c.item9 = <param3> Order by c.item1
Questions & Suggestions Thank you