A Combinatorial Fusion Method for Feature Mining


Presentation Transcript


  1. A Combinatorial Fusion Method for Feature Mining Ye Tian, Gary Weiss, D. Frank Hsu, Qiang Ma Fordham University Presented by Gary Weiss

  2. Introduction • Feature construction/engineering is often a critical step in the data mining process • It can be very time-consuming and may require substantial manual effort • Our approach uses a combinatorial method to automatically construct new features • We refer to this as “feature fusion” • It is geared toward helping to predict rare classes • For now it is restricted to numerical features, but it can be extended to other feature types

  3. How does this relate to MMIS? • One MMIS category is local pattern analysis • How to efficiently identify quality knowledge from a single data source • Data preparation and selection are listed as subtopics, and fusion is also mentioned • We acknowledge that this work probably is not what most people think of as MMIS

  4. How can we view this work as MMIS? • Think of each feature as a piece of information • Our fusion approach integrates these pieces • Fusion itself is a proper topic for MMIS since it can also be used with multiple information sources • The fusion method we employ does not really care whether the information (i.e., the features) comes from a single source • As the complexity of constructed features increases, each can be viewed as a classifier • Each fused feature is then an information source • This view is bolstered by other work on data fusion that uses ensembles to combine the fused features

  5. Description of the Method • A data set is a collection of records where each feature has a score • We assume numerical features • We then replace scores by ranks • The ordering of the ranks is determined by whether larger or smaller scores better predict the class • Compute the performance of each feature • Compute the performance of feature combinations • Decide which combinations to evaluate/use

  6. Step 1: A data set
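
To make the setup concrete, here is a minimal sketch of such a data set in code (the feature names and values are invented for illustration; class = 1 is the minority class):

```python
# Hypothetical stand-in for the slide's data set: each record has a score
# for features F1..F3 and a binary class label (1 = minority class).
records = [
    {"F1": 5.2, "F2": 0.8, "F3": 9.1, "class": 1},
    {"F1": 1.1, "F2": 2.4, "F3": 3.3, "class": 0},
    {"F1": 3.8, "F2": 1.9, "F3": 7.0, "class": 1},
    {"F1": 4.4, "F2": 0.2, "F3": 2.5, "class": 0},
]
```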

  7. Step 2: Scores replaced by Ranks
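
A minimal sketch of the score-to-rank replacement, assuming larger scores better predict the minority class (the helper name scores_to_ranks is ours):

```python
def scores_to_ranks(scores, higher_is_better=True):
    # Sort record indices so the most predictive score comes first,
    # then assign rank 1 to the best, rank 2 to the next, and so on.
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    ranks = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

f1_scores = [5.2, 1.1, 3.8, 4.4]   # hypothetical scores for feature F1
print(scores_to_ranks(f1_scores))  # -> [1, 4, 3, 2]
```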

  8. Step 3: Compute Feature Performance • Performance measures how well a feature predicts the minority class • We sort the rows by feature rank and measure performance on the top n%, where n% of the records belong to the minority class • In this case we evaluate on the top 3 rows; since 2 of the 3 are minority (class=1), performance = .66

  9. Step 3 continued
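
A minimal sketch of this performance computation, reproducing the 2-of-3 example from slide 8 (the helper name feature_performance is ours):

```python
def feature_performance(ranks, labels, minority=1):
    # Evaluate on the top n records, where n is the minority-class count.
    n = sum(1 for y in labels if y == minority)
    top = sorted(range(len(ranks)), key=lambda i: ranks[i])[:n]
    return sum(1 for i in top if labels[i] == minority) / n

ranks  = [1, 4, 3, 2, 5, 6]   # hypothetical ranks for one feature
labels = [1, 0, 1, 0, 1, 0]   # 3 of 6 records are minority (class = 1)
print(feature_performance(ranks, labels))  # 2 of top 3 are minority -> 0.666...
```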

  10. Step 4: Compute Performance of Feature Combinations • Let F6 be the fusion of F1F2F3F4F5 • The rank combination function is the average of the ranks • Compute the rank of F6 for each record • Compute the performance of F6 as in step 3
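
A sketch of this fusion step under the same conventions: average the ranks record by record, then re-rank the averages so the fused feature can be scored like any single feature (the helper name fuse and the two small rank lists are ours):

```python
def fuse(rank_lists):
    # Average the ranks of the constituent features for each record...
    avg = [sum(rs) / len(rs) for rs in zip(*rank_lists)]
    # ...then convert the averages back to ranks (lower average = better).
    order = sorted(range(len(avg)), key=lambda i: avg[i])
    ranks = [0] * len(avg)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

f1, f2 = [1, 4, 3, 2], [2, 3, 4, 1]
print(fuse([f1, f2]))  # averages [1.5, 3.5, 3.5, 1.5] -> ranks [1, 3, 4, 2]
                       # (ties broken by record order)
```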

  11. Step 5: What Combinations to Use? • Given n features there are 2^n – 1 possible combinations: C(n,1) + C(n,2) + … + C(n,n) • This “fully exhaustive” fusion strategy is practical for many values of n • We try other strategies for cases where it is not feasible • The k-exhaustive strategy selects the k best features and tries all combinations of them • The k-fusion strategy uses all n features but fuses at most k features at once
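
The three enumeration strategies are easy to sketch with itertools (the feature list and the choices k = 2 and k = 3 are illustrative):

```python
from itertools import combinations

features = ["F1", "F2", "F3", "F4", "F5", "F6"]

# Fully exhaustive: all 2^n - 1 non-empty subsets.
exhaustive = [c for r in range(1, len(features) + 1)
              for c in combinations(features, r)]

# k-fusion: all n features, but fuse at most k at a time (here k = 2).
k_fusion = [c for r in range(1, 3)
            for c in combinations(features, r)]

# k-exhaustive: all subsets of the k best single features
# (here, hypothetically, F1, F2, F3 with k = 3).
best = features[:3]
k_exhaustive = [c for r in range(1, len(best) + 1)
                for c in combinations(best, r)]

print(len(exhaustive), len(k_fusion), len(k_exhaustive))  # 63 21 7
```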

  12. Combinatorial Fusion Table

  13. Combinatorial Fusion Algorithm • The combinatorial strategy generates candidate features • The performance metric determines which are best • It is used to determine which k features to use for k-fusion • It is also used to determine the order in which features are added • We add a feature only if it leads to a statistically significant improvement (p ≤ .10) • As measured on validation data • This limits the number of features added • But it requires a lot of computation
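
A sketch of this greedy selection loop; select_fused_features, perf, build_and_score, and is_significant are hypothetical stand-ins for the rank-based performance metric, classifier training scored on validation data, and the p ≤ .10 significance test:

```python
def select_fused_features(candidates, perf, build_and_score, is_significant):
    # Try candidates in order of the rank-based performance metric; keep a
    # fused feature only if the classifier built with it scores significantly
    # better (p <= .10 in the paper) on the validation data.
    selected = []
    best = build_and_score(selected)
    for cand in sorted(candidates, key=perf, reverse=True):
        score = build_and_score(selected + [cand])
        if score > best and is_significant(score, best):
            selected.append(cand)
            best = score
    return selected
```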

  14. Example Run of Algorithm

  15. Description of Experiments • We use Weka’s DT, 1-NN, and Naïve Bayes methods • Analyze performance on 10 data sets • With and without fused features • Focus on AUC as the main metric • More appropriate than accuracy, especially with skewed data • Use 3 combinatorial fusion strategies • 2-fusion, 3-fusion, and 6-exhaustive
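
For illustration only (the experiments themselves used Weka), AUC can be computed directly from scores and labels, e.g. with scikit-learn:

```python
from sklearn.metrics import roc_auc_score

y_true  = [1, 0, 1, 0, 1, 0]               # hypothetical class labels
y_score = [0.9, 0.4, 0.8, 0.3, 0.5, 0.6]   # hypothetical classifier scores
print(roc_auc_score(y_true, y_score))      # ~0.889
```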

  16. Results Summary • Results over all 10 data sets • Results over the 4 most skewed data sets (< 10% minority)

  17. Discussion of Results • None of the 3 fusion schemes is clearly best • The methods seem to help, but the biggest improvement is clearly with the DT method • This may be explained by traditional DT methods having limited expressive power • They can only consider 1 feature at a time • They can never perfectly learn simple concepts like F1+F2 > 10, but they can with feature fusion • The improvement is bigger for highly skewed data sets • Identifying rare cases is difficult and may require looking at many features in parallel

  18. Future Work • More comprehensive experiments • More data sets, more skewed data sets, more combinatorial fusion strategies • Use of heuristics to more intelligently choose fused features • The performance measure is currently used only to order the features • Use of diversity measures • Avoid building a classifier to determine which fused features to add • Handle non-numerical features

  19. Conclusion • Showed how a method from information fusion can be applied to feature construction • Results encouraging but more study needed • Extending the method should lead to further improvements

  20. Questions?

  21. Detailed Results: Accuracy

  22. Detailed Results: AUC
