Real-time Knowledge Discovery and Dissemination Panel Position Dr. Bhavani Thuraisingham The University of Texas at Dallas bhavani.thuraisingham@utdallas.edu February 2008
What is Real-time? • Hard Real-time • The transactions/queries/updates must meet strict timing constraints • Soft Real-time • The goal is to have as many transactions/queries/updates as possible meet their timing constraints • Other concepts • Firm real-time, near real-time, fast
Real-time Transaction/Query Processing • Transactions may be assigned priorities based on their deadlines; e.g., transactions that have short deadlines may have higher priorities • If two transactions compete for a resource, then the higher priority transaction is given the resource • If a transaction T1 is of higher priority than transaction T2, and T1 wants a resource that T2 holds, then T2 may be aborted and the resource given to T1 • Approximate query processing is a technique developed for handling queries that have to return results in real-time • Main memory data management is an approach for real-time data management
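The priority rule on this slide (shorter deadline means higher priority, and a higher priority requester may cause the current holder to be aborted) can be illustrated with a small sketch. This is not taken from the talk: the `Transaction` and `ResourceManager` names, the use of an absolute deadline as the only priority signal, and the choice to release every resource held by an aborted transaction are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Transaction:
    # Hypothetical transaction record; an earlier deadline means higher priority.
    name: str
    deadline: float
    aborted: bool = False


class ResourceManager:
    """Illustrative lock table: a resource goes to the earliest-deadline requester."""

    def __init__(self) -> None:
        self.holder: Dict[str, Transaction] = {}

    def request(self, txn: Transaction, resource: str) -> bool:
        """Return True if txn now holds the resource, False if it must wait."""
        current = self.holder.get(resource)
        if current is None or current is txn:
            self.holder[resource] = txn
            return True
        if txn.deadline < current.deadline:
            # Requester has higher priority: abort the holder and hand over the resource.
            current.aborted = True
            self._release_all(current)
            self.holder[resource] = txn
            return True
        # Requester has lower or equal priority: it does not get the resource.
        return False

    def _release_all(self, txn: Transaction) -> None:
        # An aborted transaction gives up everything it holds (an assumption of this sketch).
        for res in [r for r, t in self.holder.items() if t is txn]:
            del self.holder[res]


t1 = Transaction("T1", deadline=5.0)   # short deadline, higher priority
t2 = Transaction("T2", deadline=20.0)  # long deadline, lower priority
manager = ResourceManager()
t2_granted = manager.request(t2, "row-42")  # T2 takes the free resource
t1_granted = manager.request(t1, "row-42")  # T1 preempts T2
```

In this sketch a lower priority requester is simply refused; a fuller design would queue it and restart aborted transactions, which the slide does not specify.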
Real-time Knowledge Discovery (RT-KDD) • How does a data mining technique meet the timing constraint? • E.g., if an association rule mining algorithm has a 5-minute constraint, should it output as many rules as possible within 5 minutes? • How does this affect the accuracy of the results? • Will there be an increase in false positives and negatives? • Approximate data mining • Are there techniques analogous to those in approximate query processing? • Are incomplete results better than no results? • What are the applications for RT-KDD? • Give the results to the war fighter in 5 minutes so that he can take appropriate actions
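One way to make the "as many results as possible within the time limit" question concrete is a mining loop that checks a time budget and returns whatever it has finished. The sketch below is an assumption-laden illustration, not a method from the talk: it covers only the frequent itemset stage that precedes rule generation, counts candidates by brute force without Apriori-style pruning, and the function name `mine_frequent_itemsets` and its parameters are hypothetical.

```python
import time
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def mine_frequent_itemsets(
    transactions: Sequence[Sequence[str]],
    min_support: int,
    budget_seconds: float,
    max_size: int = 3,
) -> Tuple[Dict[Tuple[str, ...], int], bool]:
    """Count itemsets level by level and stop when the time budget is spent.

    Returns (itemsets, complete). Only fully counted levels are reported, so a
    partial answer can miss itemsets but never reports a wrong support count.
    """
    deadline = time.monotonic() + budget_seconds
    results: Dict[Tuple[str, ...], int] = {}
    for size in range(1, max_size + 1):
        counts: Counter = Counter()
        for basket in transactions:
            if time.monotonic() >= deadline:
                # Out of time: return the levels completed so far.
                return results, False
            for combo in combinations(sorted(set(basket)), size):
                counts[combo] += 1
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        if not frequent:
            break  # no larger itemset can be frequent
        results.update(frequent)
    return results, True


baskets: List[List[str]] = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
full, full_complete = mine_frequent_itemsets(baskets, min_support=3, budget_seconds=10.0)
partial, partial_complete = mine_frequent_itemsets(baskets, min_support=3, budget_seconds=0.0)
```

Because a level is committed only after it has been counted in full, running out of time in this sketch produces missing itemsets (false negatives) rather than false positives. A sampling-based variant would trade that guarantee for broader coverage, which is the kind of accuracy question the slide raises.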
Real-time Knowledge Discovery (Concluded) • Should we take each data mining technique and develop a real-time version of it? • This is how research is proceeding in privacy preserving data mining • Privacy preserving association rule mining, privacy preserving decision trees, etc. • Is a real-time data warehouse needed for real-time data mining? • If so, how do we build a real-time data warehouse? • Can we do without a human in the loop? • If the war fighter needs an answer within 5 minutes, is there time for a human to analyze the results? • Does real-time knowledge discovery mean knowledge discovery carried out during operational time instead of during analytical processing?
Directions • There is research now on stream knowledge discovery and active data management; this will contribute toward RT-KDD • We also need to start a research program on integrating real-time processing research with KDD research • When we started with privacy preserving data mining there were questions as to whether we can have privacy as well as useful data mining research; the community is now more accepting of this research. There are similar opportunities for RT-KDD • We need to focus on the types of applications for which RT-KDD will be needed • Focus more on a soft real-time approach than a hard real-time approach • Building models in real-time as well as techniques meeting real-time constraints