1 / 1

Data not in the pre-defined feature vectors that can be used to construct predictive models.

0. represent. ANN. Petal.Length< 2.45. |. F1 F2 F4 Data1 1 1 0 Data2 1 0 1 Data3 1 1 0 Data4 0 0 1 ………. 1. DT. setosa. Petal.Width< 1.75. versicolor. virginica. 0. 1. SVM. LR. ANN. Petal.Length< 2.45. |. represent. DT. setosa. Petal.Width< 1.75.

oren-garcia
Download Presentation

Data not in the pre-defined feature vectors that can be used to construct predictive models.

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 0 represent ANN Petal.Length< 2.45 | F1 F2 F4 Data1 1 1 0 Data2 1 0 1 Data3 1 1 0 Data4 0 0 1 ……… 1 DT setosa Petal.Width< 1.75 versicolor virginica 0 1 SVM LR ANN Petal.Length< 2.45 | represent DT setosa Petal.Width< 1.75 F1 F2 F4 Data1 1 1 0 Data2 1 0 1 Data3 1 1 0 Data4 0 0 1 ……… Any classifiers you can name versicolor virginica SVM LR Any classifiers you can name Direct Mining of Discriminative and Essential Frequent Patterns via Model-based Search Tree Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Yan, Jiawei Han, Philip S. Yu, Olivier Verscheure Motivation: Itemset Mining Motivation Experiments Proposed Algorithm • Data not in the pre-defined feature vectors that can be used to construct predictive models. • Applications: • Transactional database • Sequence database • Graph database • Frequent pattern is a good candidate for discriminative features, especially for data of complicated structures. • 5 Datasets:UCI Machine Learning Repository • Scalability Study: Problems Direct Mining & Selection via Model-based Search Tree • Case Study Procedures asFeature MinerOr Be Itself asClassifier 1 2 3 4 5 6 7 dataset mine & select 1 mine & select • Accuracy of Mined Itemsets 5 2 mine & select 6 7 3 4 …….. Few Data + + …….. • Why Frequent Patterns? • A non-linear conjunctive combination of single features • Increase the expressive and discriminative power of the feature space • Examples: • Exclusive OR problem & Solution Divide-and-Conquer Based Frequent Pattern Mining Mined Discriminative Patterns • Mine and Select most discriminative patterns; • Represent data in the feature space using such patterns; • Build classification models. • Take Home Message: • Highly compact and discriminative frequent patterns can be directly mined throughModel based Search Treewithout worrying about combinatorial explosion. • Software and datasets are available by contacting the authors. Graph Mining y • Analyses: • Scalability of pattern enumeration • Upper bound • “Scale down” ratio • Bound on number of returned features • Subspace pattern selection • Non-overfitting • Optimality under exhaustive search map data to higher space 1 0 • Scalability Study • 11 Datasets: • 9 NCI anti-cancer screen datasets • PubChem Project. • Positive class : 1% - 8.3% • 2 AIDS anti-viral screen datasets • URL: http://dtp.nci.nih.gov. • H1: 3.5%, H2: 1% L1 1 0 x L2 mine & transform Data is non-linearly separable in (x, y) Data is linearly separable in (x, y, xy) Conventional Frequent Pattern-Based Classification: Two-Step Batch Method • Predictive Quality of Mined Frequent Subgraphs Basic Flows: • Problems of Separated Mine & Select in Batch Method • Mine step:Issues of scalability and combinatorial explosion • Dilemma of setting minsupport • Promising discriminative candidate patterns? • Tremendous number of candidate patterns? • Select step:Issue of discriminative power AUC DataSet Frequent Patterns 1---------------------- ---------2----------3 ----- 4 --- 5 -------- --- 6 ------- 7------ Mined Discriminative Patterns 1 2 4 select mine AUC of MbT, DT MbT VS Benchmarks Mine frequent patterns; Select most discriminative patterns; Represent data in the feature space using such patterns; Build classification models.

More Related