1 / 20

Data Wharehousing OLAP Data Mining

Data Wharehousing OLAP Data Mining. S. Costantini Università degli Studi di L’Aquila stefcost@di.univaq.it. Ringraziamenti (Acknowledgment).

Mia_John
Download Presentation

Data Wharehousing OLAP Data Mining

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Wharehousing OLAP Data Mining S. Costantini Università degli Studi di L’Aquila stefcost@di.univaq.it

  2. Ringraziamenti (Acknowledgment) • Part of this material is taken from: Database Systems: The Complete Book, by Hector Garcia-Molina, Jeff Ullman, and Jennifer Widom, edited by Prentice-Hall. • URL: http://www-db.stanford.edu/~ullman/dscb.html S. Costantini / Data Wharehousing

  3. Cos’è in sostanza un Data Wharehouse? • E’ una vista materializzata • Aggiornata a intervalli stabiliti (a seconda dell’applicazione) • E’ un cosiddetto “sistema di integrazione di dati” perché può contenere dati provenienti da vari database (detti “sorgenti”) S. Costantini / Data Wharehousing

  4. Perché i Data Warehouse? • Perché le query di analisi statistica ed esame dei dati per estrarne varie informazioni (dette query “OLAP”, vedi seguito) sono pesanti e diminuiscono troppo la performance del sistema. Però non necessitano della versione più aggiornata dei dati. S. Costantini / Data Wharehousing

  5. Perché i Data Wharehouse • Allora conviene separare le query usuali dalle query OLAP, creando per queste ultime un Data Wharehouse • Per le query OLAP il modello relazionale non è ottimale, quindi nel creare un Data Wharehouse il modello dei dati viene modificato. S. Costantini / Data Wharehousing

  6. Observation • Traditional database systems are tuned to many, small, simple queries. • Some new applications use fewer, more time-consuming, complex queries. • New architectures have been developed to handle complex “analytic” queries efficiently. S. Costantini / Data Wharehousing

  7. The Data Warehouse • The most common form of data integration. • Copy sources into a single DB (warehouse) and try to keep it up-to-date. • Usual method: periodic reconstruction of the warehouse, perhaps overnight. • Frequently essential for analytic queries. S. Costantini / Data Wharehousing

  8. OLTP • Most database operations involve On-Line Transaction Processing (OTLP). • Short, simple, frequent queries and/or modifications, each involving a small number of tuples. • Examples: Answering queries from a Web interface, sales at cash registers, selling airline tickets. S. Costantini / Data Wharehousing

  9. OLAP • Of increasing importance are On-Line Application Processing (OLAP) queries. • Few, but complex queries --- may run for hours. • Queries do not depend on having an absolutely up-to-date database. S. Costantini / Data Wharehousing

  10. OLAP Examples • Amazon analyzes purchases by its customers to come up with an individual screen with products of likely interest to the customer. • Analysts at Wal-Mart look for items with increasing sales in some region. S. Costantini / Data Wharehousing

  11. Data Warehouses • Doing OLTP and OLAP in the same database system is often impractical • Different performance requirements • Analysis queries require data from many sources • Solution: Build a “data warehouse” • Copy data from various OLTP systems • Optimize data organization, system tuning for OLAP • Transactions aren’t slowed by big analysis queries • Periodically refresh the data in the warehouse S. Costantini / Data Wharehousing

  12. Common Architecture • Relational Databases handle OLTP. • Local databases copied to a central warehouse overnight. • Analysts use the warehouse for OLAP. S. Costantini / Data Wharehousing

  13. Definition of data warehousing A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process. S. Costantini / Data Wharehousing

  14. Loading the Data Warehouse Data is periodically extracted Data is cleansed and transformed Data Staging Area Users query the data warehouse Source Systems (OLTP) Data Warehouse S. Costantini / Data Wharehousing

  15. Data Mining • Data mining is a popular term for queries that summarize big data sets in useful ways. • Examples: • Clustering all Web pages by topic. • Finding characteristics of fraudulent credit-card use. S. Costantini / Data Wharehousing

  16. Data Warehouse Enterprise “Database” Customers Orders Transactions Vendors Etc… Etc… • Data Miners: • “Farmers” – they know • “Explorers” - unpredictable Copied, organized summarized Data Warehouse Data Mining S. Costantini / Data Wharehousing

  17. Market-Basket Data • An important form of mining from relational data involves market baskets = sets of “items” that are purchased together as a customer leaves a store. • Summary of basket data is frequent itemsets = sets of items that often appear together in baskets. S. Costantini / Data Wharehousing

  18. Data Mining Flavors • Directed – Attempts to explain or categorize some particular target field such as income or response. • Undirected – Attempts to find patterns or similarities among groups of records without the use of a particular target field or collection of predefined classes. S. Costantini / Data Wharehousing

  19. Data Mining Examples in Enterprises • Government • Track down criminals (Police also) • Treasury Dept – suspicious int’l funds transfer • Phone companies • Supermarkets & Superstores • Mail-Order, On-Line Order S. Costantini / Data Wharehousing

  20. Data Mining Examples in Enterprises • Financial Institutions • Insurance Companies • Web sites • Many others… S. Costantini / Data Wharehousing

More Related