1 / 37

Chapter 6: Designing and Building the Data Warehouse

Chapter 6: Designing and Building the Data Warehouse. Modern Data Warehousing, Mining, and Visualization: Core Concepts by George M. Marakas. 6-1: The Enterprise Model Approach To DW Design. The first step in planning is to define the information requirements of the enterprise.

zia
Download Presentation

Chapter 6: Designing and Building the Data Warehouse

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 6: Designing and Building the Data Warehouse Modern Data Warehousing, Mining, and Visualization: Core Concepts by George M. Marakas © 2003, Prentice-Hall

  2. 6-1: The Enterprise Model Approach To DW Design • The first step in planning is to define the information requirements of the enterprise. • Enterprise model – a number of separate models that, when properly integrated, provide a coherent picture of the enterprise. • One important element of the EM is the corporate process model, a highly-structured description of the elementary processes in the enterprise. • Another is the corporate data model, which describes the entities generated by or required by the processes. © 2003, Prentice-Hall

  3. Design Strategies • Top-down DW design – the data warehouse design is based on the enterprise model itself. It implies a strategic, rather than operational, perspective of the data. • Bottom-up DW design – focuses more on making use of the data available in the current system. This is less effort than the top-down approach, but may end up with a DW that does not satisfy all of the organization’s information needs. © 2003, Prentice-Hall

  4. The Concept of Enterprise Integration Integration must occur on three distinct levels: • Horizontal integration: each application is fully integrated within its own boundaries • Vertical integration: the application is in harmony with its stated business requirements. • Enterprise integration: the data and processes throughout the organization, and all interactions between them, will be standardized. © 2003, Prentice-Hall

  5. Entity identification – what are the building blocks? ER diagramming – how do entities relate to others? Process identification – what is the workflow? Entity life cycle analysis – does work modify entities? Event analysis – how do events cluster to processes? Association diagramming – how do objects interact? Critical factor analysis – what is needed for success? SWOT analysis – what are strengths, weaknesses? Information needs analysis – what data does each business area need most? Systems evaluation – are current systems sufficient? Activities Associated With EM Construction © 2003, Prentice-Hall

  6. 6-2: The Data Warehouse Project Plan First on the agenda is a readiness assessment. Readiness has five key factors: • Strong management sponsorship – perhaps the most important factor • Compelling business motivation – the value of a DW is what it does for the business • Quality of the IS-business partnership – the IS group needs to be a central part of the business © 2003, Prentice-Hall

  7. 6-2: The Data Warehouse Project Plan (cont.) First on the agenda is a readiness assessment. Readiness has five key factors: • Culture of the organization – the organization must place a high value on information and analysis • DW feasibility – are the current data accurate and standardized enough to make integration feasible? © 2003, Prentice-Hall

  8. A Litmus Test for DW Readiness Kimball, et al developed some tests that make it relatively easy to determine readiness. Table 12-2 contains full details, but some tests are: For management sponsorship: low-level readiness is signaled by not being able to access top execs or their lack of understanding for what DW development entails. For the IS-business partnership: low-level readiness is signaled by a firm using too many outside IS consultants, or if the DW team was formed without any IS participation. © 2003, Prentice-Hall

  9. Addressing DW Readiness Shortfall • If the litmus test identifies only a few shortfalls, there are several techniques that can be applied to address lack of readiness. • A business requirements analysis performed at a high level can identify the areas to target. • Another approach is to do a business requirements prioritization which will help us determine in what order to proceed. • We can do a pilot study or isolated rollout of technology as a proof of concept. © 2003, Prentice-Hall

  10. Example of a Project Prioritization Matrix Project 6 would be the first choice for implementation, then Project 7. The two lowest in rank are 1 and 8. © 2003, Prentice-Hall

  11. Defining the DW Project Scope • The scope establishes a boundary for the project and defines both what it is and what it is not. • One of the first things to do is decide “what we are scoping”. Near term? Three years? • An unwritten rule is that the project manager wants to make as few promises as possible. • You want to avoid “scope creep”, where the project continues to expand. © 2003, Prentice-Hall

  12. Guidelines for Developing Initial Scope • It should be a joint project between IS and other business functions. • The initial scope should be meaningful and manageable. • The initial focus should be on a single business requirement from a single process. • The number of initial users should be limited. • Success criteria should be developed at the start, along with the scope. © 2003, Prentice-Hall

  13. Defining the Business Justification • Once the scope is established, the next step is to develop the business case. • This is really no mystery, just an identification of project (1) anticipated benefits and (2) likely costs. • If a project cannot be justified before it begins, no such justification will be coming after money and time are invested in it. © 2003, Prentice-Hall

  14. Issues in Making the Business Case • The best way to view the DW project is as a major capital investment by the corporation. • There are three perspectives from which to argue for the project: facts, faith and fear. • During the 1980s, when companies made heavy IS investments, there really was no increase in productivity. On the other hand, studies have shown that IS staffers were the company’s most productive. • This productivity paradox is perhaps due to the fact that many IS investments were not justifiable in the first place. © 2003, Prentice-Hall

  15. Three bases of project Justification • “The data warehouse project will have a net present value of $753,000.” • “The data warehouse will yield a minimum reduction in operating cost of $193,000 annually.” • “The estimated increase in market share is 14.7 percent within the first 24 months of operation.” • “A data warehouse is part of the infrastructure, we can’t cost justify it like a new fleet of trucks.” • “It seems reasonable to assume that this data warehouse will reduce our costs of servicing this market segment.” • “Trust me. This is why you hired me as IT director.” • “Our competitors are doing this even as we speak.” • “Our shareholders will view us as technologically behind if we don’t do this now.” • “We have a small window of opportunity here and we are wasting precious time trying to decide.” Facts Faith Fear © 2003, Prentice-Hall

  16. Economic Feasibility Analysis • This activity is a type of cost-benefit analysis, complicated by the fact that many benefits are intangible and hard to measure. • Tangible benefits are either cost reductions or revenue increases. • Intangible benefits include improved morale, increased product quality, decrease in time to market, reduction in turnover, increased competitive advantage and more timely information. © 2003, Prentice-Hall

  17. Tangible and Intangible Costs • Tangible development costs include salaries, consultant fees, hardware and software purchases and data conversion. • Tangible operating costs include annual licenses, upgrades, repairs, user training and depreciation. • Intangible costs include disruption to environment, loss of goodwill, reduction in morale and diversion of attention to daily responsibilities. © 2003, Prentice-Hall

  18. An Example of Measuring Intangibles Suppose we believe that there is about a 60 percent chance that one out of every 10 workers will be distressed during the initial implementation. Our normal profits are $2,000,000, and we believe they would be zero if the entire workforce was distressed. Profit reduction is: (.1)(2000000) = $200,000 and it has a 60% chance of occurring, so we estimate the “distress” cost at (.6)(200000) = $120,000 Is this right? Probably not! But, isn’t it better to have an estimate based on reasonable assumption than making no estimate at all? © 2003, Prentice-Hall

  19. Economic Feasibility Measures • Assessment of feasibility has to recognize the time value of money since many costs occur immediately while benefits accrue over several years. • On the next two slides we show a project that returns a net present value of $890,806 over five years. The internal rate of return is 38% and breakeven is estimated at 2 years and 7 months. © 2003, Prentice-Hall

  20. Spreadsheet Model of NPV Analysis © 2003, Prentice-Hall

  21. Example of Graphical Breakeven Analysis © 2003, Prentice-Hall

  22. Developing the DW Project Plan • Staffing issues: Table 12-8 lists 14 different staffing roles in a typical DW project. Some key roles are project manager, business systems analyst, database manager, application developer and quality assurance analyst. © 2003, Prentice-Hall

  23. Developing the DW Project Plan (cont.) • The Formal Plan:two key characteristics of a well-designed DW project plan are high integration and high detail. Although each one is different, the plan should contain a list of required resources, an estimate of project schedule, estimated start and end dates, a clear list of task dependencies and a set of flags that will be signals that the project is late. © 2003, Prentice-Hall

  24. 6-3: Specifying the Analysis and Design Tools • While an entity-relationship modeling approach from relational database design could be used, the dimensional modeling approach to logical design is more often used for a data warehouse. • One reason is that an enterprise-level ERM would be too complex to understand. • There are three basic elements in the dimensional model (1) a central table called the fact table, (2) two or more smaller tables called the dimensional tables, and (3) and a set of relationships called the star join. © 2003, Prentice-Hall

  25. Dimensional Model for a Retail Sales Forecast © 2003, Prentice-Hall

  26. ERM versus DM • The relationship of the two is that an ER diagram can be decomposed into several DM diagrams. • The ERM represents multiple processes while a DM just one. • The star join is flexible and scalable. New data rows can be added to existing tables, and entirely new tables can be accommodated. © 2003, Prentice-Hall

  27. Designing a DW Fact Table In general, there are four basic steps: • Choosing the data mart for the small group of end users we deal with. • Fact table granularity (the smallest defined level of data in the table) is determined. • Fact table dimensions are selected. • Determine the facts for the table. In most cases, the granularity is at the transaction level, so the fact is the amount. © 2003, Prentice-Hall

  28. 6-4: Warehouse Architecture Specification and Development The next phase is the design of the technical architecture. Certain elements need to be included: • Common sources: a data mart needs to be stored in a central location so others can use it. • Common dimensions: these can be shared as well. • Common business rules: the DW team needs to identify as many of these as possible. © 2003, Prentice-Hall

  29. 6-4: Warehouse Architecture Specification and Development (cont.) The next phase is the design of the technical architecture. Certain elements need to be included: • Common semantics: there should be enterprise-wide definition of terms. • Common metrics: things should be measured the same way across the enterprise. © 2003, Prentice-Hall

  30. DW Architecture Key Component Areas • Data architecture – each area in a business draws on different dimensions. Where they intersect they need to be defined the same (the customer that buys is the same we bill). • Infrastructure architecture – issues of size, scalability, and capacity need to be addresses. • Technical architecture – this is driven by the metadata catalog. Services should draw parameters from tables, not hard-code them. © 2003, Prentice-Hall

  31. The Data Staging Process An important part of the technical architecture is the data staging process, which covers five major areas: • Extraction of data from multiple sources • Transformation and cleansing • Loading the data, often to multiple targets • Security access and encryption policies • Job control, which includes scheduling, monitoring and notification. © 2003, Prentice-Hall

  32. A Word About Metadata • In the sequence of steps from specifying the warehouse model, to implementing it, to using it for queries, few steps actually involve “real” data. All the rest involve metadata. • For architectural definition, the major elements of the metadata catalog should include business rules, transformation information, operations information and tool-specific information. © 2003, Prentice-Hall

  33. Developing a DW Technical Architecture • An architecture requirements document is drafted first. Next to each business requirement are listed the architectural implications. • Developing a DW architecture is a difficult task and needs to be faced with the proper approach. © 2003, Prentice-Hall

  34. Developing a DW Technical Architecture (cont.) • There are numerous prescriptions for this, but a simple four-layer process is often sufficient: • business requirements • technical architecture • standards definition • application product development. © 2003, Prentice-Hall

  35. Benefits of a Sound Architecture • Provides an organizing framework – shows where the components are and how they fit • Improved flexibility and maintenance – allows plug and play, permits quick addition of new data sources • Faster development and reuse – developers are better able to understand the DW process and content © 2003, Prentice-Hall

  36. Benefits of a Sound Architecture (cont.) • Management and communications tool – sets our expectations and defines our responsibilities • Coordinate parallel efforts – multiple independent efforts have a better chance to converge © 2003, Prentice-Hall

  37. 6-5: Data Warehouse Project Success Factors Building a DW is much like building any other complex information system in terms of what makes it a success. Some guidelines for a successful project are: • Have a strong project sponsor • Generate user buy-in • Identify business needs • Start with a narrow scope • Understand the organizational culture • Create clear requirements documents • Define all data and communication requirements • Prototype often © 2003, Prentice-Hall

More Related