Data Mining on ICDM Submission Data
Shusaku Tsumoto, Ning Zhong and Xindong Wu
ICDM 2004 Business Meeting, 11/4/2004
Data Mining on ICDM Submission Data
• 38 countries, 445 Submissions
• Regular Papers: 39 (9%)
• Short Papers: 66 (14.8%)
• High Acceptance Ratio (Regular)
  • Germany: 4/15 (26.7%)
  • Finland: 2/9 (22.2%)
  • USA: 20/109 (18.3%)
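The percentages on this slide follow directly from the stated counts. A minimal Python sketch, using only the numbers given above, reproduces them. The names `percent`, `overall` and `by_country` are illustrative and do not come from the talk. Note that 39/445 works out to 8.8%, which the slide rounds to 9%.

```python
# Counts exactly as stated on the slide.
TOTAL_SUBMISSIONS = 445
ACCEPTED = {"regular": 39, "short": 66}
# (accepted regular papers, submissions) per country, from the slide.
REGULAR_BY_COUNTRY = {"Germany": (4, 15), "Finland": (2, 9), "USA": (20, 109)}


def percent(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits`."""
    return round(100 * numerator / denominator, digits)


# Overall acceptance rate per paper type, relative to all submissions.
overall = {kind: percent(n, TOTAL_SUBMISSIONS) for kind, n in ACCEPTED.items()}
# Regular-paper acceptance rate per country.
by_country = {c: percent(a, s) for c, (a, s) in REGULAR_BY_COUNTRY.items()}

if __name__ == "__main__":
    for kind, pct in overall.items():
        print(f"{kind}: {pct}%")
    for country, pct in by_country.items():
        print(f"{country}: {pct}%")
```

The per-country rates are computed against each country's own submission count, so a small denominator (Finland, 9 submissions) can yield a high ratio from very few accepted papers.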
Country [figure not reproduced]
Data Mining on ICDM Submission Data
• Top 5 Areas of Submissions:
  • Data mining applications
  • Data mining and machine learning algorithms and methods
  • Mining text and semi-structured data, and mining temporal, spatial and multimedia data
  • Data pre-processing, data reduction, feature selection and feature transformation
  • Soft computing and uncertainty management for data mining
• High Acceptance Ratio Areas (Regular+Short)
  • Quality assessment and interestingness metrics of data mining results: 5/10 (50.0%)
  • Data pre-processing, data reduction, feature selection and feature transformation: 14/35 (40.0%)
  • Complexity, efficiency, and scalability issues in data mining: 4/11 (36.4%)
Correspondence Analysis (Country vs Final Decision)
[Figure: two-dimensional plot with axis labels r1=0.378 and r2=0.177. Plotted decisions: Regular, Short, Reject. Plotted countries: Slovenia, Finland, Italy, Australia, India, Hong Kong, Canada, Germany, USA, UK, France, Japan.]
Correspondence Analysis (Topics vs Final Decision)
[Figure: two-dimensional plot with axis labels r1=0.280 and r2=0.184. Plotted decisions: Regular, Short, Reject. Plotted topics: Collaborative Filtering; Applications; DM Methods; Quality-assessment; Soft-computing; Preprocessing, Feature Selection; Security, privacy; Statistics and probability; High-performance; Human-machine interaction and visualization; Post-processing.]
Correspondence Analysis
• Country vs Final Decision
  • Regular: Germany, USA
  • Short: ?
  • Reject: Most of the countries are located near this region.
• Topics vs Final Decision
  • Regular: Quality Assessment, Preprocessing/Feature Selection
  • Short: DM/ML Methods, Collaborative Filtering
  • Reject: DM Applications
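Correspondence analysis starts from a contingency table (here, country or topic by final decision) and decomposes its departure from independence. The plain-Python sketch below computes the standardized residuals and the total inertia (chi-square divided by the sample size) for a small table. The counts are made up for illustration and are NOT the ICDM 2004 data, and the function name is an assumption. The plotted axes would come from a singular value decomposition of the residual matrix, which is left out here.

```python
import math

# Hypothetical counts, NOT the ICDM data: rows are countries,
# columns are the decisions (Regular, Short, Reject).
TABLE = {
    "Country A": [10, 10, 30],
    "Country B": [5, 15, 30],
    "Country C": [5, 5, 40],
}


def standardized_residuals(table):
    """Return (residuals, total_inertia) for a dict of row label -> counts."""
    rows = list(table.values())
    n = sum(sum(row) for row in rows)
    # Row and column masses: marginal proportions of the table.
    row_mass = [sum(row) / n for row in rows]
    col_mass = [sum(col) / n for col in zip(*rows)]
    # (observed proportion - expected under independence) / sqrt(expected).
    residuals = [
        [(cell / n - r * c) / math.sqrt(r * c) for cell, c in zip(row, col_mass)]
        for row, r in zip(rows, row_mass)
    ]
    # Sum of squared residuals equals chi-square divided by n.
    total_inertia = sum(s * s for row in residuals for s in row)
    return residuals, total_inertia


RESIDUALS, TOTAL_INERTIA = standardized_residuals(TABLE)
```

A positive residual means the cell is over-represented relative to independence. In a map, a row and a column with a large positive residual tend to fall in the same direction from the origin, which is how groupings such as "Regular: Germany, USA" are read off the plot.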
Rule Mining on ICDM Submission Data
• Datasets
  • Sample Size: 445
  • Attributes: 5
    • Paper No.: ordered by submission date
    • # of Authors
    • # of Characters in Title
    • Country
    • Category
• Analyzed by Clementine 7.1 (and SPSS 12.0J)
Rule Mining (C5.0) on ICDM Submission Data
• C5.0
  • [Topic=Mining semi-structured data,…] & [129 < Paper No. <= 369] => Reject (Confidence 0.87, Support 10)
  • [Country=USA] & [Topic=Mining semi-structured data,…] & [Paper No. > 369] & [# of Authors <= 3] => Accept (Confidence 0.667, Support 3)
  • [Topic=Preprocessing/Feature Selection] & [# of Authors > 4] => Accept (Confidence 1.0, Support 3)
• Topic, Paper No., # of Authors: Important Features
Rule Mining (GRI) on ICDM Submission Data
• Generalized Rule Induction
  • [# of Authors < 2] & [Paper No. < 120.5] => Rejected (Confidence 96.0%, Support 24)
  • [# of Chars in Title < 27] & [Paper No. > 212] => Accepted (Confidence 100%, Support 5)
• Paper No., # of Chars in Title, # of Authors: Important Features
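Each rule above is reported with a confidence and a support. The slides do not say how the tool defines them, so the Python sketch below uses one common convention as an assumption: support is the number of records matching the rule's conditions, and confidence is the fraction of those records with the predicted decision. The records are a hypothetical toy set, NOT the ICDM submission data, and `rule_stats` is an illustrative helper.

```python
# Hypothetical toy records; NOT the ICDM submission data.
RECORDS = [
    {"paper_no": 12, "n_authors": 1, "decision": "Reject"},
    {"paper_no": 45, "n_authors": 1, "decision": "Reject"},
    {"paper_no": 80, "n_authors": 1, "decision": "Accept"},
    {"paper_no": 100, "n_authors": 3, "decision": "Reject"},
    {"paper_no": 200, "n_authors": 1, "decision": "Accept"},
    {"paper_no": 110, "n_authors": 1, "decision": "Reject"},
]


def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) under the convention assumed here.

    support    = number of records satisfying the antecedent
    confidence = share of those records that also satisfy the consequent
                 (None when no record is covered)
    """
    covered = [r for r in records if antecedent(r)]
    support = len(covered)
    if support == 0:
        return 0, None
    hits = sum(1 for r in covered if consequent(r))
    return support, hits / support


# Same shape as the first GRI rule: few authors and an early paper number.
SUPPORT, CONFIDENCE = rule_stats(
    RECORDS,
    antecedent=lambda r: r["n_authors"] < 2 and r["paper_no"] < 120.5,
    consequent=lambda r: r["decision"] == "Reject",
)
```

With supports as small as 3 or 5, as in several rules on these slides, a confidence of 100% rests on very few papers and should be read with caution.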
Multidimensional Scaling (2004)
[Figure: plotted attributes are Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.]
Summary (2004) of Mining on ICDM Submission Data
• Do not submit a paper too fast!
• Reflection is needed not only on the contents, but also on the titles.
• Mining Text/Web/Semi-structured Data is very popular.
• The # of application papers is growing now (but many are rejected).
• Strong Topics
  • Preprocessing/Feature-Selection
  • Postprocessing
  • Security and Privacy
• Several topics are emerging in ICDM2004:
  • Mining Data Streams
  • Collaborative Filtering
  • Quality Assessment
Comparison between 02-04: Review Scores (box plot) [figure not reproduced]
Comparison between 02-04: Countries [figure not reproduced]
Multidimensional Scaling (2003 and 2004)
• The topological structure w.r.t. similarities does not seem to have changed between 2003 and 2004.
[Figure: two panels, 2003 and 2004, each plotting Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.]
Data Mining on ICDM Submission Data
• Acknowledgements
  • Many thanks to
    • PC chairs, Vice Chairs and PC members
    • All the authors
    • All the contributors to ICDM2004
• See you again in ICDM2005!
Multidimensional Scaling (2003)
[Figure: plotted attributes are Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.]