Promising “Newer” Technologies to Cope with the Information Flood
• Knowledge Discovery and Data Mining (KDD)
• Agent-based Technologies
• Ontologies and Knowledge Brokering
• Non-traditional data analysis techniques
Model Generation as an Example to Explain / Discuss Technologies
Why Do We Need So Many Data Mining / Analysis Techniques?
• No single technique works well in general.
• Different methods make different assumptions about the data set to be analyzed.
• Cross-fertilization between different methods is desirable and frequently helps in obtaining a deeper understanding of the analyzed data set.
Data Mining and Business Intelligence
Layers from top to bottom; the potential to support business decisions increases toward the top:
• Making Decisions (End User)
• Data Presentation: Visualization Techniques (Business Analyst)
• Data Mining: Information Discovery (Data Analyst)
• Data Exploration: Statistical Analysis, Querying and Reporting
• Data Warehouses / Data Marts: OLAP, MDA (DBA)
• Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
Decision Trees
• Example:
  • Conducted a survey to see which customers were interested in a new car model
  • Want to select customers for an advertising campaign
• Training set (table not preserved)
One Possibility
age<30
  Y: city=sf
    Y: likely
    N: unlikely
  N: car=van
    Y: likely
    N: unlikely
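A decision tree of this shape is just a chain of nested tests. The sketch below writes the tree above as a small Python function. The function name, the argument names, and the value encodings (`"sf"`, `"van"`) are assumptions made for illustration, and the leaf order (Y leads to "likely", N to "unlikely") is read off the slide layout.

```python
def classify_tree_one(age: int, city: str, car: str) -> str:
    """Walk the first candidate tree, whose root test is age < 30."""
    if age < 30:
        # Y branch of the root: the next test is city = sf
        return "likely" if city == "sf" else "unlikely"
    # N branch of the root: the next test is car = van
    return "likely" if car == "van" else "unlikely"


# Hypothetical customers, not taken from the survey
print(classify_tree_one(age=25, city="sf", car="sedan"))
print(classify_tree_one(age=40, city="la", car="van"))
print(classify_tree_one(age=40, city="sf", car="sedan"))
```

Each call follows exactly one root-to-leaf path, so a customer is classified with at most two comparisons.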
Another Possibility
car=taurus
  Y: city=sf
    Y: likely
    N: unlikely
  N: age<45
    Y: likely
    N: unlikely
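When several trees are possible, one common way to choose between them is to check how well each reproduces the labels in the training set. The sketch below encodes both candidate trees as nested tuples and scores them. The survey table is not available here, so `HYPOTHETICAL_ROWS` is invented purely to make the code runnable; its scores say nothing about which tree fits the real survey. All names are illustrative, and the leaf order (Y leads to "likely") is assumed from the slide layout.

```python
# A tree is either a leaf label (str) or a tuple:
# (test description, test function, subtree if Y, subtree if N)
TREE_ONE = (
    "age<30", lambda r: r["age"] < 30,
    ("city=sf", lambda r: r["city"] == "sf", "likely", "unlikely"),
    ("car=van", lambda r: r["car"] == "van", "likely", "unlikely"),
)

TREE_TWO = (
    "car=taurus", lambda r: r["car"] == "taurus",
    ("city=sf", lambda r: r["city"] == "sf", "likely", "unlikely"),
    ("age<45", lambda r: r["age"] < 45, "likely", "unlikely"),
)


def classify(tree, record):
    """Follow Y/N branches until a leaf label is reached."""
    while not isinstance(tree, str):
        _description, test, yes_branch, no_branch = tree
        tree = yes_branch if test(record) else no_branch
    return tree


def accuracy(tree, rows):
    """Fraction of rows whose predicted label matches the recorded label."""
    hits = sum(classify(tree, row) == row["label"] for row in rows)
    return hits / len(rows)


# Made-up records standing in for the missing training set
HYPOTHETICAL_ROWS = [
    {"age": 25, "city": "sf", "car": "taurus", "label": "likely"},
    {"age": 27, "city": "la", "car": "van", "label": "unlikely"},
    {"age": 50, "city": "sf", "car": "van", "label": "likely"},
    {"age": 60, "city": "la", "car": "taurus", "label": "unlikely"},
]

for name, tree in [("tree one", TREE_ONE), ("tree two", TREE_TWO)]:
    print(name, accuracy(tree, HYPOTHETICAL_ROWS))
```

Keeping the trees as data rather than as hard-coded `if` statements lets the same `classify` and `accuracy` functions score any number of candidate trees.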
Summary: KDD
• KDD: discovering interesting patterns from large amounts of data
• A natural evolution of database technology, in great demand, with wide applications
• A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation
• Mining can be performed in a variety of information repositories
• Data mining functionalities: characterization, discrimination, association, classification, clustering, outlier and trend analysis, etc.
• Multi-disciplinary activity
• Important issues: KDD methodologies and user interactions, scalability, tool use and tool integration, preprocessing, interpretation of results, finding good parameter settings when running data mining tools, …
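The KDD process steps listed above can be read as a pipeline in which each stage feeds the next. The following is a toy sketch under heavy assumptions: the records, field names, and the "mining" step (a plain co-occurrence count with a minimum-support filter) are all invented for illustration and stand in for whatever real sources and mining algorithms a project would use.

```python
from collections import Counter

# Hypothetical raw sources; every field and value is made up.
SURVEY = [
    {"id": 1, "age": "25", "city": "SF"},
    {"id": 2, "age": None, "city": "la"},  # missing age
    {"id": 3, "age": "52", "city": "sf"},
    {"id": 4, "age": "33", "city": "sf"},
]
INTEREST = {1: "likely", 2: "likely", 3: "unlikely", 4: "likely"}


def clean(rows):
    """Data cleaning: drop records with missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def integrate(rows, interest):
    """Data integration: join survey rows with the interest source by id."""
    return [{**r, "interest": interest[r["id"]]} for r in rows if r["id"] in interest]


def select(rows):
    """Data selection: keep only the attributes relevant to the analysis."""
    return [{k: r[k] for k in ("age", "city", "interest")} for r in rows]


def transform(rows):
    """Transformation: normalize city names and discretize age."""
    return [
        {
            "age_group": "<30" if int(r["age"]) < 30 else "30+",
            "city": r["city"].lower(),
            "interest": r["interest"],
        }
        for r in rows
    ]


def mine(rows):
    """Data mining (toy stand-in): count (city, interest) co-occurrences."""
    return Counter((r["city"], r["interest"]) for r in rows)


def evaluate(patterns, min_support=2):
    """Pattern evaluation: keep patterns seen at least min_support times."""
    return {p: n for p, n in patterns.items() if n >= min_support}


def present(patterns):
    """Knowledge presentation: render the surviving patterns as text."""
    return [f"city={c} & interest={i}: {n} records" for (c, i), n in sorted(patterns.items())]


def run_pipeline():
    rows = transform(select(integrate(clean(SURVEY), INTEREST)))
    return present(evaluate(mine(rows)))


print(run_pipeline())
```

In practice the stages are rarely run once in a straight line; results from pattern evaluation often send the analyst back to selection, transformation, or parameter tuning.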
Where to Find References?
• Data mining and KDD (SIGKDD member CD-ROM):
  • Conference proceedings: KDD, and others, such as PKDD, PAKDD, etc.
  • Journal: Data Mining and Knowledge Discovery
• Database field (SIGMOD member CD-ROM):
  • Conference proceedings: ACM-SIGMOD, ACM-PODS, VLDB, ICDE, EDBT, DASFAA
  • Journals: ACM-TODS, J. ACM, IEEE-TKDE, JIIS, etc.
• AI and Machine Learning:
  • Conference proceedings: Machine learning, AAAI, IJCAI, etc.
  • Journals: Machine Learning, Artificial Intelligence, etc.
• Statistics:
  • Conference proceedings: Joint Stat. Meeting, etc.
  • Journals: Annals of Statistics, etc.
• Visualization:
  • Conference proceedings: CHI, etc.
  • Journals: IEEE Trans. Visualization and Computer Graphics, etc.