Top10 DM Algorithms
AMCS/CS 340: Data Mining
Xiangliang Zhang
King Abdullah University of Science and Technology
Story behind the Top10
• What are the most influential data mining algorithms in the research community?
• The list covers the most important research and development topics in DM, such as classification, clustering, statistical learning, association analysis, and graph mining
International Conference on Data Mining (ICDM) in 2006
• Call for nominations
• Who? Winners of the ACM KDD Innovation Award and the IEEE ICDM Research Contributions Award
• How? Each nominates up to 10 best-known algorithms in data mining, giving (a) the algorithm name, (b) a brief justification, and (c) a representative publication reference
• Verify each nomination's citations on Google Scholar. Remove nominations with fewer than 50 citations
• Vote by PC members of KDD06, ICDM06 and SDM06
• The same Top10 algorithms were chosen in a vote by the 145 attendees of ICDM06
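The filtering and voting steps above can be sketched in a few lines of Python. This is only an illustration of the procedure as the slide describes it, not the organizers' actual tooling: the function names, the data layout (a dict of citation counts and a list of ballots), and the sample values are all hypothetical.

```python
from collections import Counter

MIN_CITATIONS = 50  # threshold stated on the slide


def filter_nominations(nominations, min_citations=MIN_CITATIONS):
    """Keep nominations with at least `min_citations` citations.

    `nominations` maps an algorithm name to its citation count
    (hypothetical layout, assumed for this sketch).
    """
    return {name: c for name, c in nominations.items() if c >= min_citations}


def top_k_by_votes(ballots, candidates, k=10):
    """Tally votes for the surviving candidates and return the k most voted.

    `ballots` is a list of lists of algorithm names, one list per voter
    (again an assumed layout).
    """
    tally = Counter(
        name for ballot in ballots for name in ballot if name in candidates
    )
    return [name for name, _ in tally.most_common(k)]


# Made-up example data, for illustration only.
nominations = {"algo_a": 120, "algo_b": 49, "algo_c": 50}
ballots = [["algo_a", "algo_c"], ["algo_a", "algo_b"], ["algo_c", "algo_a"]]

candidates = filter_nominations(nominations)
winners = top_k_by_votes(ballots, candidates, k=10)
```

Here `algo_b` is dropped before the vote because it has fewer than 50 citations, so votes cast for it are ignored in the tally.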
Who are they?
• C4.5
• k-Means
• SVM
• Apriori
• EM
• PageRank
• AdaBoost
• k-NN
• Naive Bayes
• CART
We will discuss
• C4.5
• k-Means
• SVM
• Apriori
• DBSCAN
• PageRank
• AdaBoost
• k-NN
• Naive Bayes
• CART
• Hierarchical clustering
Class Discussion
• Class on Monday, Oct 31, 1:00 to 2:30 (moved from Wednesday, Nov 2)
• Discussion of Top10 DM algorithms
• Win a chance to drop the homework with the lowest score
Class Discussion
• Self-clustering into 10 groups (no more than 2 per team)
• Marking your team members in my roster
• Randomly choosing one of the Top10 algorithms
• Discussing inside your group
• Presenting your algorithm to the rest of the class, plus questions (8 mins)
What to present?
• A brief description of your algorithm
• Discuss the impact of your algorithm, its merits and issues
• Review current and further research on your algorithm (use Google…)
• Vote for the 3 best algorithms, those most clearly presented and thoroughly discussed
Who are the best?