Intelligent Query Processing Techniques for Efficient Data Mining in Data Warehouses

ISDA'2003
Data Mining Techniques in Index Techniques
Ying Wah Teh and Abu Bakar Zaitun
tehyw@um.edu.my, zab@um.edu.my
University of Malaya, Faculty of Computer Science and Information Technology
January 5, 2020

Contents
- Introduction
- Query Processing Techniques
- Evaluation of Data Mining Prototypes
- Conclusion

Introduction
- What data to gather, how to model the data conceptually, and how to manage its storage
- Logical database design
- Physical database design
- Data stores are now very large
- Redundant data structures: the intelligent way of managing storage
- Fast access to data
- Selecting the right elements to build redundant data structures
- Only a few data warehouse administrators can do justice to the task of picking the right redundant data structures.

Query Processing Techniques
- Historical Perspectives
  - File Processing / Full Scan / Sequential Scan
  - Simple index
  - B-Tree index
- Present Scenarios of Query Processing Techniques
  - BitMap Index
  - Single-column indexes

File Processing
- A programmer needs to know at least one third-generation language (3GL) to write a data retrieval program that accesses the relevant information in a file system.
- Query processing technique: sequential scan (full scan).
- It is better suited to environments with small data volumes.
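The sequential scan described on this slide can be illustrated with a minimal Python sketch. The records, field names, and the `full_scan` helper are hypothetical and are not taken from the talk; the point is only that every record must be examined, so the cost grows with the size of the file.

```python
# Hypothetical records standing in for a flat file; field names are illustrative.
records = [
    {"id": 101, "name": "Aminah", "gender": "Female"},
    {"id": 102, "name": "Bala", "gender": "Male"},
    {"id": 103, "name": "Chong", "gender": "Male"},
    {"id": 104, "name": "Devi", "gender": "Female"},
]


def full_scan(rows, predicate):
    """Check every record in storage order; cost grows linearly with the file."""
    matches = []
    for row in rows:
        if predicate(row):
            matches.append(row)
    return matches


# Every one of the four records is read, even though only two qualify.
female_rows = full_scan(records, lambda row: row["gender"] == "Female")
```

With only a handful of records this is perfectly adequate, which is consistent with the slide's remark about small data volumes.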

Simple Indexes / Hashed Key
- DBMSs were developed that included simple indexes.
- A simple index lets users access information very quickly by a unique value.
- It creates a list of record identifiers (RIDs) that act as pointers to records.
- An exact key value is required to access the data.
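As a rough illustration of a simple (hashed-key) index, the sketch below maps each unique key value to a RID, here assumed to be the record's position in a list. The names `build_hash_index` and `lookup` and the sample data are hypothetical; a real DBMS stores RIDs as page and slot addresses.

```python
def build_hash_index(rows, key):
    """Map each unique key value to a RID (here, the row's list position)."""
    index = {}
    for rid, row in enumerate(rows):
        value = row[key]
        if value in index:
            raise ValueError(f"duplicate key value: {value!r}")
        index[value] = rid
    return index


def lookup(rows, index, value):
    """Exact-match lookup: follow the RID pointer, or return None if absent."""
    rid = index.get(value)
    return None if rid is None else rows[rid]


# Hypothetical records; field names are illustrative.
records = [
    {"id": 101, "name": "Aminah"},
    {"id": 102, "name": "Bala"},
    {"id": 103, "name": "Chong"},
]
id_index = build_hash_index(records, "id")
found = lookup(records, id_index, 103)
```

The lookup succeeds only for a complete key value such as `103`; a partial key such as "ids starting with 10" cannot be answered from the hash table without scanning it.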

B-tree indexes
- Support both partial-key and exact-key lookups.
- A B-tree index is very costly to create for every query.
- The intelligent way of handling the B-Tree index is the vital issue.
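The difference between exact-key and partial-key lookups can be sketched with an ordered index. Note the substitution: this is not a B-tree implementation but a sorted list searched with Python's `bisect` module, used here as a stand-in because it shares the property that matters for this slide, namely that keys are kept in order. The function names and sample data are hypothetical.

```python
import bisect


def build_sorted_index(rows, key):
    """Ordered (key, RID) pairs; a stand-in for the leaf level of a B-tree."""
    return sorted((row[key], rid) for rid, row in enumerate(rows))


def prefix_lookup(index, prefix):
    """Partial-key lookup: binary-search to the first key >= prefix, then walk."""
    pos = bisect.bisect_left(index, (prefix,))
    rids = []
    while pos < len(index) and index[pos][0].startswith(prefix):
        rids.append(index[pos][1])
        pos += 1
    return rids


def exact_lookup(index, value):
    """Exact-key lookup: the same search, keeping only full matches."""
    pos = bisect.bisect_left(index, (value,))
    rids = []
    while pos < len(index) and index[pos][0] == value:
        rids.append(index[pos][1])
        pos += 1
    return rids


# Hypothetical records; field names are illustrative.
records = [
    {"name": "Tan"},
    {"name": "Teh"},
    {"name": "Teo"},
    {"name": "Zaitun"},
    {"name": "Abu"},
]
name_index = build_sorted_index(records, "name")
te_rids = prefix_lookup(name_index, "Te")
teh_rids = exact_lookup(name_index, "Teh")
```

Building the ordered structure requires sorting every key, which hints at why creating such an index for each individual query is expensive.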

Present Scenario
- A user issues a query that requires only a small portion of a relation, and the predicate is on a non-primary-key column.
- Only one RID index can be used at a time.

BitMap Index
- Bit-vector approach
- A RID occupies at least 8 bits, while a BitMap index uses only a 1-bit pointer per tuple of the relation.
- Works well only with low-cardinality data (e.g. Female, Male).
- The intelligent way of handling the BitMap index is the vital issue.
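The bit-vector approach can be sketched as follows, assuming one bitmap per distinct column value with bit i set when tuple i holds that value. Python integers serve as the bit vectors; the helper names and sample data are hypothetical.

```python
def build_bitmap_index(rows, column):
    """One bit vector per distinct value; bit i is 1 if row i has that value."""
    bitmaps = {}
    for rid, row in enumerate(rows):
        value = row[column]
        bitmaps[value] = bitmaps.get(value, 0) | (1 << rid)
    return bitmaps


def rids_from_bitmap(bitmap):
    """Decode a bit vector back into the list of matching row positions."""
    rids = []
    rid = 0
    while bitmap:
        if bitmap & 1:
            rids.append(rid)
        bitmap >>= 1
        rid += 1
    return rids


# Hypothetical low-cardinality column with two distinct values.
records = [
    {"gender": "Female"},
    {"gender": "Male"},
    {"gender": "Female"},
    {"gender": "Male"},
    {"gender": "Female"},
]
gender_bitmaps = build_bitmap_index(records, "gender")
female_rids = rids_from_bitmap(gender_bitmaps["Female"])
```

Each distinct value needs its own bit vector with one bit per tuple, so a column with only two values stays compact, while a column with many distinct values would need many vectors. This is one way to read the slide's low-cardinality caveat.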

Single-column indexes
- Index intersection offers greater flexibility.
- A good strategy is to define single-column indexes on all columns that will be frequently queried and let index intersection handle the situation.
- The intelligent way of handling the single-column indexes is the vital issue.
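Index intersection can be illustrated with a short sketch: each single-column index maps a value to the set of matching RIDs, and a conjunctive predicate is answered by intersecting those sets. The function names, column names, and data are hypothetical, and a real optimizer would also weigh the cost of each index before using it.

```python
from collections import defaultdict


def build_column_index(rows, column):
    """Single-column index: value -> set of RIDs (row positions)."""
    index = defaultdict(set)
    for rid, row in enumerate(rows):
        index[row[column]].add(rid)
    return index


def intersect_lookup(indexes, conditions):
    """Answer col1 = v1 AND col2 = v2 ... by intersecting per-column RID sets."""
    rid_sets = [indexes[col].get(val, set()) for col, val in conditions.items()]
    return set.intersection(*rid_sets) if rid_sets else set()


# Hypothetical records; column names are illustrative.
records = [
    {"gender": "Female", "state": "Selangor"},
    {"gender": "Male", "state": "Selangor"},
    {"gender": "Female", "state": "Johor"},
    {"gender": "Female", "state": "Selangor"},
]
indexes = {col: build_column_index(records, col) for col in ("gender", "state")}
result = intersect_lookup(indexes, {"gender": "Female", "state": "Selangor"})
```

Two independent single-column indexes answer the combined predicate here without a dedicated two-column index, which is the flexibility the slide refers to.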

Our Research Perspective
- Most researchers apply data mining at the application level of the data warehouse.
- We applied data mining in the physical design of data warehouses to optimise the base relation.

Architecture of One-column Index Selection

Evaluation of Data Mining Prototypes

Conclusion
- It is necessary to have an intelligent way of handling the various query processing techniques (such as indexes).
- Data mining techniques can be used in the physical design of a data warehouse to generate single-column indexes.
- The positive results from the study should motivate further efforts to develop the prototype into a fully functional SQL engine.

Thank You
Questions?
tehyw@um.edu.my
