140 likes | 168 Views
Data Warehouse. Data Warehouse. Data warehouse and Data warehousing
E N D
Data Warehouse www.assignmentpoint.com
Data Warehouse • Data warehouse and Data warehousing • According to W.H. Inmon, a leading architect in the construction of data warehouse systems, a data warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data in support of management’s decision making process. • Data warehousing is a collection of methods, techniques, and tools used to support knowledge workers—senior managers, directors, managers, and analysts—to conduct data analyses that help with performing decision-making processes and improving information resources. www.assignmentpoint.com
Data Warehouse • OLTP VS OLAP • OLTP (On-line Transaction Processing) • Major task of traditional relational DBMS • Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc. • OLAP (On-line Analytical Processing) • Major task of data warehouse system • Data analysis and decision making www.assignmentpoint.com
Data Warehouse • OLTP VS OLAP • Distinct Features (OLTP Vs OLAP) • User and system orientation: OLTP is customer oriented, but OLAP is market oriented. • Data Contents: OLTP manages current data in detailed, but OLAP system manages large amount of historical and consolidated data. • Database Design: OLTP usually adopts ER data model and application oriented database design, but OLAP system adopts star and subject oriented database design. • View: OLTP system focuses on current data within an enterprise or department, but OLAP systems often spans multiple versions of a database schema and integrate information from many data stores. • Access pattern: OLTP access patterns consist of short, atomic transactions, require update operation, but OLAP systems require only read-only operations, also may involve complex queries. www.assignmentpoint.com
Data Warehouse • List some applications of data warehouse. • Three kinds of data warehouse applications • Information processing • - Supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs • Analytical processing • - Mmultidimensional analysis of data warehouse data • - Supports basic OLAP operations, slice-dice, drilling, pivoting • Data mining • - Knowledge discovery from hidden patterns • - Supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools www.assignmentpoint.com
Data Warehouse • List some applications of data warehouse. • Trade Sales and claims analyses, shipment and inventory control, customer care and public relations • Craftsmanship Production cost control, supplier and order support • Financial services Risk analysis and credit cards, fraud detection • Transport industry Vehicle management • Telecommunication services Call flow analysis and customer profile analysis • Health care service Patient admission and discharge analysis and bookkeeping in accounts departments www.assignmentpoint.com
Data Warehouse • STAR Schema • A data warehouse requires a concise, subject-oriented schema that facilitates on-line data analysis. It can be in the form of a Star schema, Snowflake schema or a Fact constellation schema. • Star schema: • A fact table in the middle connected to a set of dimension tables. It is the most common modeling paradigm. In this model, the data warehouse contains: • A large central table (fact table) containing the bulk of data with no redundancy and • A set of smaller attendant tables (dimension table), one for each dimension www.assignmentpoint.com
Data Warehouse • STAR Schema • In the following figure, Sales are considered along with four dimensions, namely time, item, branch and location. The schema contains a central fact table sales that contains keys to each of four dimensions, along with three (3) measures dollars_sold, units_sold and avg_sales. Each dimensional table contains a set of attributes. www.assignmentpoint.com
Data Warehouse • STAR Schema www.assignmentpoint.com
Data Warehouse • OLAP operations • Typical OLAP Operations: • Roll up (drill-up): summarize data • - by climbing up hierarchy or by dimension reduction • Drill down (roll down): reverse of roll-up • - from higher level summary to lower level summary or detailed data, or introducing new dimensions. • Slice and dice: Slice performs selection on one dimension of the data cube. The dice operation defines a subcube by performing a selection on two or more dimensions. • Pivot (rotate): re-orient the cube, visualization, 3D to series of 2D planes. • Drill across: Execute queries involving more than one fact table • Drill through: Makes use of relational SQL facilities through the bottom level of the cube to its back-end relational tables. www.assignmentpoint.com
Data Warehouse • OLAP operations • Other OLAP operations may include: • Ranking the top N or bottom N items in lists. • Computing moving averages, growth rates, interest, depreciation, statistic function. • OLAP offers analytical modeling capabilities, computing measure across multiple dimensions. It can generate summarizations, aggregations and hierarchies. • OLAP provides powerful data analysis tools. www.assignmentpoint.com
Data Warehouse • Three-tier data warehouse architecture • Three-tier data warehouse architecture is shown in the following figure: • The bottom tier is a warehouse database server that is almost always a relational database system. Data from operational databases and external sources are extracted using application program interface known as gateways. A gateway (ODBC, OLE-DB, JDBC) is supported by the underlying DBMS and allows client programs to generate SQL code to be executed at a server. www.assignmentpoint.com
Data Warehouse • Three-tier data warehouse architecture www.assignmentpoint.com
Data Warehouse • Three-tier data warehouse architecture • The middle tier is an OLAP server that is typically implemented using a relational OLAP (ROLAP) model (maps operations on multidimensional data to standard relational operations) a multidimensional OLAP model (directly implements multidimensional data and operations) • The top tier is a client, which contains query and reporting tools, analysis tools, and /or data mining tools. www.assignmentpoint.com