600 likes | 612 Views
This article explores the challenges in designing financial warehouses and the lessons learned by Steve Simon, a SQL Server MVP and Senior BI Development Engineer at Atrion Networking Corporation. The article discusses practical examples, architecture changes, and techniques for optimizing warehouse design.
E N D
Challenges to designing financial warehouses, lessons learnt Why Atrion PASS Data Architecture Virtual Chapter Steve Simon MVP SQL Server BI http://www.infogoldusa.com
Steve Simon Steve Simon is a SQL Server MVP and a Senior Business Intelligence Development Engineer with Atrion Networking Corporation, Providence RI USA. He has been involved with database design and analysis for over 25 years. Steve has presented numerous papers at PASS summits over the years including PASS Europe, in addition to numerous presentations at SQL Saturday events. He is the chairperson of the Oracle/ SQL Server virtual chapter and is a PASS regional mentor.
Cold facts • Warehouse design will change with time • Two practical examples are used in this presentation.
The FDR • (Financial Data Warehouse) • or • How things can run amuck!!
Relational Spaghetti SELECT DISTINCT t.ip_address, f.fund FROM mf_job_dataset_fund f inner join mf_job j on f.job_name = j.job_name inner join mf_oncall_list o on j.job_name = o.job_name LEFT JOIN mf_client_job c ON c.job_name = j.job_name LEFT JOIN refclient RC ON f.fund = RC.Fund LEFT JOIN tmp_all_successor_jobs s ON s.extract_job = j.job_name LEFT JOIN mf_transmission t ON s.successor_job = t.job_name AND t.from_dataset NOT LIKE 'KKKK‘ LEFT JOIN mf_job j2 ON t.job_name = j2.job_name WHERE f.fund IN (‘AAAA’) AND j.job_name = f.job_name
The challenges were • Users expect reports to be rendered in under 30 seconds • Re submitting reports when no results come back. Middleware failure and tie-ups. • So what is the solution?
Back to Michael Jackson • Tables based upon subject areas AND REPORT TYPE (de-normalized) • Well indexed. • Easy to populate with what is required.
A ‘DAL’ is like a bowling ball Parameters in Process the ‘goodies’ Results out
GL • POS Joins that you would never expect • GL • DAL • Obscure • GL • View2 • Obscure • GL • View1 • Obscure • REF • View2 • POS • DAL • Obscure • REF • View1 • Obscure • GL • View3 • Obscure • REF • View3
In short • From 10 sites we found a lot of commonality. • Looked for ways to pull data with most optimal execution plans (across the board). • Millions of records in most tables. • M.J. to the rescue.
Example of view hell Column Name ID Data Type Null? FUND_ID 1 VARCHAR (8 ) N AGNT_BANK_FINS_NUM 2 VARCHAR (5 ) Y ASOF_CLIENT_SW 3 VARCHAR (1 ) Y BASE_CNTRY_CD 4 VARCHAR (2 ) Y BNFCY_TAX_ID 5 VARCHAR (9 ) Y BOND_SRC_CD 6 VARCHAR (2 ) Y CASH_COST_MTHD_IND 7 VARCHAR (1 ) Y CASH_SELL_TRANS_CD 8 VARCHAR (2 ) Y CLIENT_ACCT_NUM 9 VARCHAR (12 ) Y ……………………………. CLIENT_FUND_NUM 100 VARCHAR (4 ) Y CLIENT_ID 101 VARCHAR (8 ) Y
Architecture changes
Architecture prior to DAL DATABASE TABLE TABLE VIEW VIEW
Architecture with DAL DATABASE TABLE TABLE DAL
Meanwhile back in the grocery business
Never less than 900 million rows partitioned & event data 2.3 billion
End Client Data Session Data Warehouse Data
;with customerKeys as ( select customerKey, customerID from [DataWarehouse].[dbo].[CustomerHelper_MWG] ch join AcmePath.dbo.tempAcmeCoupontCustomers_1plusSessions t on t.fkCustomerID = ch.CustomerID ) SELECT sum(sales) as sales, basketID, k.CustomerID FROM [DataWarehouse].[dbo].[FactDailySales] fds join customerKeys k on fds.customerKey = k.customerKey where DateKey between 20120619 and 20120715 Group by basketID, k.CustomerID 3hr 21 minutes
;with customerKeys as ( select customerKey, customerID from [DataWarehouse].[dbo].[CustomerHelper_MWG] ch join AcmePath.dbo.tempAcmeCoupontCustomers_1plusSessions t on t.fkCustomerID = ch.CustomerID ) SELECT sum(sales) as sales, basketID, k.CustomerID FROM [SuperWarehouse].[dbo].[FactDailySales] fds join customerKeys k on fds.customerKey = k.customerKey where DateKey between 20120619 and 20120715 Group by basketID, k.CustomerID
Monitoring performance issues using Reporting Services
SELECT fc.CustomerID, fc.OrderID, fc.DateKey, pr.Level1CategoryID, pr.Level2CategoryID, pr.Level3CategoryID, pr.Level4CategoryID , SUM(fc.Dollars) as Dollars, SUM(fc.Units) as Units, SUM(fc.TotalWeight) as [Weight], SUM(fc.Units + (case fc.TotalWeight when 0 then 0 else 1 end)) as TotalUnits INTO rpt.Acme_Order FROM dwh.SalesOrderDetail fc INNER JOIN dwh.DimProduct pr on fc.ProductKey = pr.ProductKey GROUP BY fc.CustomerID, fc.OrderID, fc.DateKey, pr.Level1CategoryID, pr.Level2CategoryID, pr.Level3CategoryID , pr.Level4CategoryID