The Data Warehouse and Technology • The issue of volumes of data is so important that it pervades all other aspects of data warehousing.
Technological requirements for the data warehouse • Managing large amounts of data • Managing multiple media • Indexing and monitoring data • Interfaces to many technologies • Programmer/designer control of data placement • Parallel storage/management of data • Language interface • Efficient loading of data • Efficient index utilization • Compaction of data • Compound keys • Variable-length data • Lock management • Index-only processing • Fast restore
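Two of the requirements above, compound keys and index-only processing, can be illustrated together. The sketch below is a minimal in-memory illustration, not how any particular DBMS implements its indexes: the table `ROWS`, the column names, and the function `count_orders` are all hypothetical. It shows a query being answered from a sorted index on a compound key without reading the underlying rows.

```python
import bisect

# Hypothetical detail rows; in a real warehouse these would live on disk.
ROWS = [
    {"customer_id": 7, "order_date": "2024-01-05", "amount": 40.0},
    {"customer_id": 3, "order_date": "2024-01-09", "amount": 15.5},
    {"customer_id": 7, "order_date": "2024-02-11", "amount": 22.0},
    {"customer_id": 3, "order_date": "2024-03-02", "amount": 60.0},
    {"customer_id": 7, "order_date": "2024-03-20", "amount": 10.0},
]

# Sorted index on the compound key (customer_id, order_date) -> row position.
INDEX = sorted(
    ((row["customer_id"], row["order_date"]), pos) for pos, row in enumerate(ROWS)
)
KEYS = [key for key, _ in INDEX]


def count_orders(customer_id, start_date, end_date):
    """Count a customer's orders in [start_date, end_date] from index keys alone.

    ROWS is never touched: the compound key already holds everything
    the query needs, which is the idea behind index-only processing.
    """
    lo = bisect.bisect_left(KEYS, (customer_id, start_date))
    hi = bisect.bisect_right(KEYS, (customer_id, end_date))
    return hi - lo


jan_feb_orders = count_orders(7, "2024-01-01", "2024-02-28")
```

A query that also needed `amount` would have to follow the stored row positions back into `ROWS`, so it would no longer be index-only.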
DBMS Types and the Data Warehouse • Data warehouses manage massive amounts of data because they contain the following: • Granular, atomic detail • Historical information • Summary as well as detail
Multidimensional DBMS and the Data Warehouse • Consider the differences between the multidimensional DBMS (m-DBMS) and the data warehouse (D/W) • The D/W holds massive amounts of data; the m-DBMS holds at least an order of magnitude less data • The D/W is geared for a limited amount of flexible access; the m-DBMS is geared for very heavy and unpredictable access and analysis of data • The D/W contains data with a very lengthy time horizon (from 5 to 10 years); the m-DBMS holds a much shorter time horizon of data • The D/W allows analysts to access its data in a constrained fashion; the m-DBMS allows unfettered access • Instead of the D/W being housed in an m-DBMS, the m-DBMS and the D/W enjoy a complementary relationship
Multidimensional DBMSs come in several flavors • The relational foundation for multidimensional DBMS data marts: • Strengths: • Can support a lot of data • Can support dynamic joining of data • Has proven technology • Is capable of supporting general-purpose update processing • If there is no known pattern of usage of data, then the relational structure is as good as any other • Weaknesses: • Has performance that is less than optimal • Cannot be purely optimized for access processing
Multidimensional DBMSs come in several flavors • The cube foundation for multidimensional DBMS data marts: • Strengths: • Performance that is optimal for DSS processing • Can be optimized for very fast access of data • If the pattern of access of data is known, then the structure of data can be optimized • Can easily be sliced and diced • Can be examined in many ways • Weaknesses: • Cannot handle nearly as much data as the standard relational format • Does not support general-purpose update processing • May take a long time to load • If access is desired on a path not supported by the design of the data, the structure is not flexible • Questionable support for dynamic joins of data
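The cube trade-offs above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the fact rows, the three dimensions (`product`, `region`, `month`), and the helper names `build_cube`, `slice_cube`, and `roll_up` are all hypothetical, and a real m-DBMS stores and optimizes cubes very differently. The point is that once cells are pre-aggregated along known dimensions, slicing is a cheap lookup, while any access path not built into the cells is unavailable.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, month, sales)
FACTS = [
    ("widget", "east", "2024-01", 100),
    ("widget", "west", "2024-01", 150),
    ("gadget", "east", "2024-01", 200),
    ("widget", "east", "2024-02", 120),
    ("gadget", "west", "2024-02", 80),
]

DIMENSIONS = ("product", "region", "month")


def build_cube(facts):
    """Load step: pre-aggregate sales into one cell per dimension combination."""
    cube = defaultdict(int)
    for product, region, month, sales in facts:
        cube[(product, region, month)] += sales
    return dict(cube)


def slice_cube(cube, **fixed):
    """Slice: keep only the cells whose dimensions match the fixed values."""
    positions = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        cell: total
        for cell, total in cube.items()
        if all(cell[i] == value for i, value in positions.items())
    }


def roll_up(cube, dimension):
    """Roll up: total sales for each value of a single dimension."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for cell, total in cube.items():
        totals[cell[i]] += total
    return dict(totals)


cube = build_cube(FACTS)
east = slice_cube(cube, region="east")
by_product = roll_up(cube, "product")
```

Asking this cube for sales by a dimension it was not built with, such as salesperson, would require going back to the detail data and reloading, which is the inflexibility listed among the weaknesses.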
Context and Content • Three types of contextual information must be managed: • Simple contextual information • Complex contextual information • External contextual information • Simple contextual information relates to the basic structure of the data itself, and includes such things as these: • The structure of data • The encoding of data • The naming conventions used for data • The metrics describing the data, such as: • How much data there is • How fast the data is growing • What sectors of the data are growing • How the data is being used • Simple contextual information has been managed in the past by dictionaries, directories, system monitors, and so forth
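The metrics listed above (how much data there is, how fast it is growing, and which sectors are growing) could be derived from periodic row counts. The following is a minimal sketch under that assumption; the table names, the snapshot figures, and the function `growth_report` are hypothetical and do not describe any particular dictionary or system monitor.

```python
# Hypothetical row counts per table, one entry per monitoring period.
SNAPSHOTS = {
    "orders": [1000, 1500, 2000],
    "customers": [200, 210, 220],
}


def growth_report(snapshots):
    """Return the current size and average growth per period for each table."""
    report = {}
    for table, counts in snapshots.items():
        periods = len(counts) - 1
        report[table] = {
            "rows": counts[-1],  # how much data there is
            "growth_per_period": (counts[-1] - counts[0]) / periods,  # how fast it grows
        }
    return report


report = growth_report(SNAPSHOTS)
# Which sector of the data is growing fastest.
fastest = max(report, key=lambda table: report[table]["growth_per_period"])
```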
Context and Content • Complex contextual information describes the same data as simple contextual information, but from a different perspective. This type of information addresses such aspects of data as these: • Product definition • Marketing territories • Pricing • Packaging • Organization structure • Distribution • Complex contextual information is some of the most useful and, at the same time, some of the most elusive information there is to capture.
Context and Content • External contextual information is information outside the corporation that nevertheless plays an important role in understanding information over time. Some examples of external contextual information include the following: • Economic forecasts: • Inflation • Financial trends • Taxation • Economic growth • Political information • Competitive information • Technological advancements • Consumer demographic movements • External contextual information says nothing directly about a company but says everything about the universe in which the company operates.