Data Warehouse. Group 5: Kacie Johnson, Summer Bird, Washington Farver, Jonathan Wright, Mike Muchane
Outline I. Data warehouse definition and integrated technologies II. OLAP and OLTP III. The concept of data warehousing IV. How data warehouses are used by companies V. History of data warehousing VI. Advantages and Disadvantages VII. Future applications
Definition • A data warehouse is a logical collection of information gathered from many different operational databases used to create business intelligence that supports business analysis activities and decision-making tasks.
Business Intelligence • Business intelligence usually refers to the information available to the enterprise for making decisions. A data warehousing (or data mart) system is the back-end, or infrastructural, component for achieving business intelligence.
Data Mart • A database that has the same characteristics as a data warehouse, but is usually smaller and is focused on the data for one division or one workgroup within an enterprise.
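One way to picture a data mart is as a slice of the warehouse filtered down to a single division. The sketch below is only an illustration: the row layout, the `division` field, and the `build_data_mart` helper are all hypothetical names, not part of the original slides.

```python
# Hypothetical warehouse rows; field names and values are illustrative only.
warehouse = [
    {"division": "sales", "region": "east", "revenue": 120.0},
    {"division": "sales", "region": "west", "revenue": 80.0},
    {"division": "hr", "region": "east", "headcount": 14},
]


def build_data_mart(rows, division):
    """Return only the rows that belong to one division (the 'mart')."""
    return [row for row in rows if row["division"] == division]


sales_mart = build_data_mart(warehouse, "sales")
print(len(sales_mart))  # prints 2: only the sales rows are kept
```

A real data mart would usually be its own database with its own schema; the filter here only conveys the "smaller and focused" idea.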
Data Mining Tools • Software tools used to query information in a data warehouse. They consist of: • Query-and-reporting tools • Intelligent agents • Multidimensional analysis (MDA) tools • Statistical tools
OLAP • A data warehouse uses OLAP (On-Line Analytical Processing) to collect, organize, and make data available for the purpose of analysis - to give management the ability to access and analyze information about its business. This type of data can be called “informational data”.
OLTP • Most data is collected to handle a company's on-going business. This type of data can be called "operational data". The systems used to collect operational data are referred to as OLTP (On-Line Transaction Processing).
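The contrast between operational (OLTP) and informational (OLAP) work can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions: the `orders` table, its columns, and the helper names `record_order` and `revenue_by_region` are hypothetical, and a single in-memory database stands in for what would normally be two separate systems.

```python
import sqlite3

# One in-memory database stands in for both systems (an assumption for brevity).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
)


def record_order(customer, region, amount):
    """OLTP-style work: write one small transaction at a time."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)",
            (customer, region, amount),
        )


def revenue_by_region():
    """OLAP-style work: read and aggregate many rows for analysis."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


record_order("acme", "east", 100.0)
record_order("acme", "east", 50.0)
record_order("globex", "west", 75.0)

print(revenue_by_region())  # prints {'east': 150.0, 'west': 75.0}
```

The OLTP function touches a single row per call, while the OLAP function scans the whole table, which is one reason analytical queries are usually run against a warehouse rather than the operational system.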
Data Warehouse Is… A collection of data for management’s decisions that is: • Subject-oriented • Integrated • Time-variant • Nonvolatile
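Two of these properties, time-variant and nonvolatile, can be illustrated with an append-only history: new facts are added as dated rows, and existing rows are never overwritten. The sketch below is an assumption-laden illustration; `customer_history`, `load_snapshot`, and `city_as_of` are hypothetical names chosen for this example.

```python
from datetime import date

# Hypothetical warehouse table: each row is a dated snapshot that is never edited.
customer_history = []


def load_snapshot(customer_id, city, as_of):
    """Nonvolatile: append a new dated row instead of overwriting the old one."""
    customer_history.append(
        {"customer_id": customer_id, "city": city, "as_of": as_of}
    )


def city_as_of(customer_id, when):
    """Time-variant: return the most recent snapshot on or before `when`."""
    rows = [
        r for r in customer_history
        if r["customer_id"] == customer_id and r["as_of"] <= when
    ]
    if not rows:
        return None
    return max(rows, key=lambda r: r["as_of"])["city"]


load_snapshot(1, "Boston", date(2020, 1, 1))
load_snapshot(1, "Denver", date(2021, 6, 1))

print(city_as_of(1, date(2020, 12, 31)))  # prints Boston
print(city_as_of(1, date(2022, 1, 1)))    # prints Denver
```

Because the earlier row survives the later load, a question about the past can still be answered, which an operational system that updates records in place could not do.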
Building Blocks • Source Data • Data Staging • Data Storage • Information Delivery • Metadata • Management and Control
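The first five building blocks can be sketched as a tiny pipeline. This is a toy illustration, not a description of any particular product: the two source systems, their field names, and the functions `stage`, `store`, and `deliver` are hypothetical, and the management-and-control block is not modeled.

```python
# Source data: two hypothetical operational systems with inconsistent formats.
crm_rows = [{"cust": " Alice ", "spend": "100.50"}]
billing_rows = [{"customer_name": "BOB", "total": 20}]


def stage(crm, billing):
    """Data staging: clean both sources and convert them to one common layout."""
    staged = [
        {"customer": r["cust"].strip().lower(), "amount": float(r["spend"])}
        for r in crm
    ]
    staged += [
        {"customer": r["customer_name"].strip().lower(), "amount": float(r["total"])}
        for r in billing
    ]
    return staged


def store(warehouse, staged):
    """Data storage: append the cleaned rows to the warehouse."""
    warehouse.extend(staged)
    return warehouse


def deliver(warehouse):
    """Information delivery: a simple report of total amount per customer."""
    report = {}
    for row in warehouse:
        report[row["customer"]] = report.get(row["customer"], 0.0) + row["amount"]
    return report


# Metadata: a description of what the warehouse holds and where it came from.
metadata = {"columns": ["customer", "amount"], "sources": ["crm", "billing"]}

warehouse = store([], stage(crm_rows, billing_rows))
print(deliver(warehouse))  # prints {'alice': 100.5, 'bob': 20.0}
```

Most of the effort sits in the staging step, where names, types, and formats from different sources are reconciled before anything is stored.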
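Design of DW • Integration: data from distributed, differently structured databases is combined, which facilitates an overview and analysis in the data warehouse • Separation: data used in daily operations is kept apart from data used in the warehouse for reporting, decision support, analysis and controlling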
Dimensions and Measures • Dimensions: categorize each item in a data set into non-overlapping regions. • Measures: properties that can be summed or averaged using pre-computed aggregates.
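A small sketch may make the distinction concrete. Here `region` and `product` play the role of dimensions and `sales` the role of a measure; the data and the helper names `precompute` and `average` are hypothetical. The aggregate stores a sum and a count per dimension value, so totals and averages can be answered without rescanning the rows.

```python
from collections import defaultdict

# Hypothetical fact rows: `region` and `product` are dimensions, `sales` is a measure.
facts = [
    {"region": "east", "product": "widget", "sales": 10.0},
    {"region": "east", "product": "gadget", "sales": 5.0},
    {"region": "west", "product": "widget", "sales": 7.0},
]


def precompute(rows, dimension, measure):
    """Build a pre-computed aggregate: sum and count per dimension value."""
    totals = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for row in rows:
        bucket = totals[row[dimension]]
        bucket["sum"] += row[measure]
        bucket["count"] += 1
    return dict(totals)


def average(aggregate, key):
    """Answer an average from the stored sum and count."""
    return aggregate[key]["sum"] / aggregate[key]["count"]


by_region = precompute(facts, "region", "sales")

print(by_region["east"]["sum"])    # prints 15.0
print(average(by_region, "east"))  # prints 7.5
```

Each fact row falls into exactly one region, which is the "non-overlapping" property of a dimension; the same rows could be aggregated by `product` instead by changing one argument.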
Types of Data Warehouse • Financial • Insurance • Human Resources • Global • Data Mining/Data Mining and Exploration • Telecommunications
Before DW • Executives and decision makers could not get critical information that already existed in the organization • The available data was exceedingly difficult to get (“data in jail”) • Only a fraction of the data captured, processed and stored was actually available (“data poor”)
DW In Companies • Validation: where users validate what they already believe to be true (45%) • Tactical reporting: where the user uses the data for tactical reasons (40%) • Exploration: where the user searches for knowledge not already known (15%)
Why the volume of data is exploding: • DWs carry historical data • DWs carry detailed data • DWs carry data for which there is no known need • DWs carry eCommerce data
Advantages • Cuts costs • Boosts revenues • Saves time • Improves customer service • Avoids outdated data • Runs queries or reports without impacting the performance of the operational systems • Combines related data from separate sources • Increases data consistency • Improves access to a wide variety of data
Disadvantages • Can complicate business processes. • Data warehousing can have a learning curve that may be too long for impatient firms. • Can require a great deal of "maintenance." • The cost to capture data, clean it up, and deliver it. • Inability to adapt quickly to changing business conditions or requirements.
Future Developments • Development of parallel database servers with improved query engines will make it possible to access huge databases in much less time • Another new technology is data warehouses that allow for the mixing of traditional numbers, text and multimedia • The availability of improved tools for data visualization (business intelligence) will allow users to see things that could never be seen before.