300 likes | 416 Views
Welcome. Building And Deploying A Data Warehouse With SQL Server ™ 7.0 Speaker name Title Microsoft Corporation. Background On Data Warehouse Goals. Right Information Integrated from multiple sources Summarized from details Right Format Flexible analysis Flexible delivery Right Time
E N D
Building And Deploying A Data Warehouse With SQL Server™ 7.0Speaker nameTitleMicrosoft Corporation
Background On Data Warehouse Goals • Right Information • Integrated from multiple sources • Summarized from details • Right Format • Flexible analysis • Flexible delivery • Right Time • Fast turnaround from business processes • Rapid query response
Data Warehouse Process Data characteristics • Targeted • Specialized (OLAP) • Raw detail • No/minimal history • Integrated • Scrubbed • History • Summaries Source OLTP systems Data marts Data warehouse • Load • Index • Aggregation • Replication • Data set distribution • Access and analysis • Resource schedulingand distribution • Extract • Scrub • Transform • Design • Mapping Meta data System monitoring Source: The Enterprise Group, Ltd.
Lower costs Acquisition, deployment, administration Scalability Individual decision maker to enterprise Integration Microsoft BackOffice and Office Access, metadata, movement, management Microsoft® SQL Server Data Warehousing Strategy
Building Using Data Warehouse/Data Mart Design End-User Tools Operational Data Data Transform/ Cleansing Data Marts Information Dir Managing Repository (Persistent Shared Meta-Data) Meta-Data Flow Data Flow Microsoft Data Warehousing Framework Data Warehouse Management
Products And Components • Microsoft SQL Server 7.0 • Data Transformation Services • Replication Services • Microsoft Repository • English Query • OLAP Services • Office 2000 • Excel 2000 PivotTables • Web Components Coveredtoday
Data Transformation Services - What Is It? • A generalized facility for importing, exporting and transforming data between different and same data sources
MainframeDB2 MVS, VSAM, CICS/IMS, NCR Teradata OLE DB/ODBC Oracle DTS SQL/400 SQL Server™ DB2/NT Data Compatibility • 100% OLE DB architecture (including ODBC) • Relational and non-relational sources • Text files and desktop sources
DTS Package • A self-contained definition of all the tasks to be performed as part of a transformation • Execute from GUI, scheduler, command line, or script • Contain multiple steps • Execute serially or in parallel • Provides simple “workflow” • Supports transactions across steps • Supports Visual Basic® Scripting, JScript™, Perl script
Typical Transformations • Data quality and validation • Missing values, scrubbing, exception handling • Data integration • Heterogeneous query, eliminating duplicates • Data transformation • Value mapping, merging/splitting columns • Data aggregation • All managed by central metadata with support for business rules and data lineage
Repository MTS DTS Form VBA SQL 7.0 Repository - What Is It? • Infrastructure for sharable and reusable metadata about applications, components, data… • Components • Information models • COM interfaces and SQL schema • Repository engine • Modelling and admin tools
Data Warehousing Extensions • Provide new domains for data warehousing • DTS and OLAP model definitions • Implement common data warehousing infrastructure • Reducing the cost of data warehousing tools • Allowing ISVs to focus on features • Developed with 60 vendors in an open design process, specifications available on our Web site • http://www.microsoft.com/repository
English Language Query - What Is It? • Translates English questions into SQL • Who wrote the most books? select dbo.authors.au_fname, dbo.authors.au_lname, count(*) as q_count into #t000 from dbo.titleauthor, dbo.authors where dbo.titleauthor.au_id=dbo.authors.au_id group by dbo.titleauthor.au_id, dbo.authors.au_fname, dbo.authors.au_lname select #t000.au_fname as "First Name", #t000.au_lname as "Last Name", #t000.q_count as "count" from #t000 where #t000.q_count= (select max(t1.q_count) from #t000 t1)
Microsoft EQ Components • Authoring tool (a.k.a., domain editor) • Creates English query domain • COM Automation Server • Accepts English, returns SQL commands, requests for clarification or answer • Runs on the client- or middle-tier • Can be used from any automation controller: Visual Basic, Visual C++®, Visual J++™, VBScript, JScript, ASP
Author Entity Relationship Entity Entities/Relationships Who wrote the most books?
1. User submitsEnglish question 6. Answer displayed to user as HTML table 5.Answer retrieved from server as ADO recordset 2.Question submitted to English Query engine 3. SQL statement(s) returned in EQ response object 4. SQL submitted to server via ADO Conn.Execute English Query Application Internet Information Server ASP application Browser SQL Serverdatabase ADO English Query
OLAP Services - What Is It? • New OLAP server for SQL Server • Powerful, easy-to-use analysis tools • Integration with • Windows NT®, SQL Server, Office, third-party products • Any OLE DB data source • Scalable • From desktop to enterprise
Intelligentaggregations • Significantly smaller databases for same performance • Faster initial and incremental load times Flexible storagearchitecture • Supports MOLAP, ROLAP, and HOLAP equally well • Application requirements determine storage Ease-of-mgmt • Lower TCO • Broader accessibility of data warehousing PivotTable Service • Client-side cache - improves performance • Client/server architecture for Excel PivotTables • Mobile/disconnected analysis support OLAP Services Provides
Flexible Architecture • SQL 7 OLAP Services supports MOLAP, ROLAP, and HOLAP • Users and applications see only cubes
Aggregations • Precalculated aggregations increase query speed • But lead to data explosion • Example: OLAP Council Benchmark (APB-1) • 20 MB source database • Vendor 1: 2.1GB* • Vendor 2: 7.5GB
Sales Hardware Software Computerproducts Boston 100 150 250 + New York 250 100 350 Northeast 350 250 Hierarchy Data cells: 4 Ratio: 2.25 Aggregation Agg cells: 5 Data Explosion 600 Data explosion ratio depends on the number of dimensions, the levels of the hierarchies, and the parent-child ratios
How SQL 7 OLAP ServicesHandles Data Explosion • Aggregation wizard finds the “80-20” rule in the data • The 20 percent of all possible pre-aggregations that provide 80 percent of the performance gain • Analyses level counts for each dimensions and parent-child ratios for each level
PivotTable Service • An in-process desktop multidimensional component • Shares OLAP server code • Leverages OLAP server (where present) • Ships with • SQL 7, Excel 2000, and Visual Studio • Provides • In memory data- and query-caching • Multidimensional formula engine • Local cube persistence
Data Warehouse Process Data characteristics • Targeted • Specialized (OLAP) • Raw detail • No/minimal history • Integrated • Scrubbed • History • Summaries Source OLTP systems Data marts Data warehouse • Load • Index • Aggregation • Replication • Data set distribution • Access and analysis • Resource schedulingand distribution • Extract • Scrub • Transform • Design • Mapping Meta data System monitoring Source: The Enterprise Group, Ltd.