Multi-Kernel Multi-Label Learning with Max-Margin Concept Network 1Wei Zhang, 1Xiangyang Xue, 2Jianping Fan, 1Xiaojing Huang, 1Bin Wu, 1Mingjie Liu 1Fudan University, China; 2UNCC, USA {weizh, xyxue}@fudan.edu.cn IJCAI-2011
Content • Motivation • Overview • Concept Network Construction • The Proposed Model • Multi-Kernel Multi-Label Learning • Experiments • Conclusions
Motivation • Semantic richness: a single label is often insufficient to describe a sample's content, so multiple labels per sample (multi-label learning) are necessary. • When multiple labels are attached to a single sample, there can be strong inter-label correlations. • Similarity diversity: a single kernel cannot effectively characterize the diverse similarity structures among samples, so multiple kernels (multi-kernel learning) are necessary.
Overview Inter-label dependency and similarity diversity are leveraged simultaneously in the proposed method. • A concept network is constructed to capture inter-label correlations for classifier training. • A maximal-margin approach is used to effectively formulate the feature-label associations and the label-label correlations. • Specific kernels are learned not only for each label but also for each pair of inter-related labels.
Concept Network Construction • A concept network is constructed to characterize the inter-label correlations and to learn the inter-related classifiers. • Each concept corresponds to one node in the concept network. • If two concepts are inter-related, there is an edge between the corresponding nodes. • Empirical conditional probabilities: $P(c_i \mid c_j) = N(c_i, c_j) / N(c_j)$, where $N(\cdot)$ counts the training samples annotated with the given concepts. If $\max\{P(c_i \mid c_j),\, P(c_j \mid c_i)\} > \theta$, then the nodes $c_i$ and $c_j$ are connected by an edge (see the sketch below).
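Below is a minimal Python sketch of this construction step, assuming a binary label matrix Y (samples × concepts) and a hypothetical threshold theta; the paper's exact thresholding rule is not reproduced here.

    import numpy as np

    def build_concept_network(Y, theta=0.3):
        """Build edges between inter-related concepts from a binary
        label matrix Y of shape (n_samples, n_concepts).
        `theta` is a hypothetical threshold, not taken from the paper."""
        n_concepts = Y.shape[1]
        counts = Y.T @ Y                # counts[i, j] = N(c_i, c_j); diagonal = N(c_i)
        edges = []
        for i in range(n_concepts):
            for j in range(i + 1, n_concepts):
                p_i_given_j = counts[i, j] / max(counts[j, j], 1)   # P(c_i | c_j)
                p_j_given_i = counts[i, j] / max(counts[i, i], 1)   # P(c_j | c_i)
                if max(p_i_given_j, p_j_given_i) > theta:
                    edges.append((i, j))
        return edges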
The Proposed Model • Our model captures the feature-concept associations and the inter-concept correlations in a unified framework: $F(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{c} y_i \langle \mathbf{w}_i, \varphi_i(\mathbf{x}) \rangle + \sum_{(i,j) \in E} y_i y_j \langle \mathbf{w}_{ij}, \varphi_{ij}(\mathbf{x}) \rangle$ • $\varphi_i(\cdot)$ and $\varphi_{ij}(\cdot)$ are functions mapping sample features $\mathbf{x}$ to kernel spaces with respect to the nodes and the edges, respectively. • $y_i \in \{-1, +1\}$, $i = 1, \dots, c$.
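As an illustration, a minimal Python sketch of this unified score under the form above; the names phi_node, phi_edge, w_node, and w_edge are assumptions for this sketch, not the paper's notation.

    import numpy as np

    def model_score(phi_node, phi_edge, w_node, w_edge, y, edges):
        """Score F(x, y) = sum_i y_i <w_i, phi_i(x)>
                         + sum_{(i,j) in E} y_i y_j <w_ij, phi_ij(x)>.
        phi_node[i] and phi_edge[(i, j)] are precomputed feature maps of x."""
        score = sum(y[i] * np.dot(w_node[i], phi_node[i]) for i in range(len(y)))
        score += sum(y[i] * y[j] * np.dot(w_edge[(i, j)], phi_edge[(i, j)])
                     for (i, j) in edges)
        return score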
Max-Margin Method for Model Learning • By considering both site and edge potentials in a unified framework, we fully leverage the associations between features and labels, as well as the correlations among labels and their dependence on the features. • To learn the proposed model, the objective function is defined as: $\min_{\mathbf{w}, \boldsymbol{\xi}} \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{n=1}^{N} \xi_n$ • subject to the max-margin constraints $F(\mathbf{x}_n, \mathbf{y}_n) - F(\mathbf{x}_n, \mathbf{y}) \geq \Delta(\mathbf{y}_n, \mathbf{y}) - \xi_n$ for all $\mathbf{y} \neq \mathbf{y}_n$, with $\xi_n \geq 0$, where $\Delta(\mathbf{y}_n, \mathbf{y})$ measures the loss between the ground-truth labeling $\mathbf{y}_n$ and a candidate labeling $\mathbf{y}$.
Learning Interdependent Classifiers • We factor the proposed global model as the sum of local models: $F(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{c} F_i(\mathbf{x}, \mathbf{y})$, where $F_i(\mathbf{x}, \mathbf{y}) = y_i \langle \mathbf{w}_i, \varphi_i(\mathbf{x}) \rangle + \frac{1}{2} \sum_{j:(i,j) \in E} y_i y_j \langle \mathbf{w}_{ij}, \varphi_{ij}(\mathbf{x}) \rangle$ • The optimization can then be approximately decoupled into $c$ interdependent sub-problems, one per label.
Similarity Diversity by Multi-Kernel • The dual of the optimization problem is as follows: $\max_{\boldsymbol{\alpha}} \sum_{n} \alpha_n - \frac{1}{2} \sum_{n,m} \alpha_n \alpha_m y_n y_m \tilde{K}(\mathbf{x}_n, \mathbf{x}_m)$, subject to $0 \leq \alpha_n \leq C$ • where the combined kernel $\tilde{K}$ aggregates the concept-specific kernels induced by $\varphi_i$ and the pairwise concept-specific kernels induced by $\varphi_{ij}$. We employ the multi-kernel technique to implement both the concept-specific and the pairwise concept-specific feature mappings, so that similarity diversity can be effectively characterized.
Multi-Kernel Learning • We first define an original kernel regardless of label information using a Gaussian kernel, and decompose the Gram matrix by spectral decomposition: $K = \sum_{k} \lambda_k \mathbf{v}_k \mathbf{v}_k^{\top}$ • To incorporate the label information, we learn a concept-specific kernel matrix $K_i$ for each label by maximizing the similarities between samples sharing that label. • To fully leverage the correlations among the concepts and their dependence on the input features, a pairwise label-specific kernel matrix $K_{ij}$ is learned analogously over samples sharing both labels. • Both the concept-specific kernel matrices and the pairwise label-specific kernel matrices share the common eigenbasis with the original kernel $K$: $K_i = \sum_k \mu_k^{(i)} \mathbf{v}_k \mathbf{v}_k^{\top}$, $K_{ij} = \sum_k \mu_k^{(ij)} \mathbf{v}_k \mathbf{v}_k^{\top}$
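A minimal numpy sketch of the shared-eigenbasis idea, assuming an RBF Gram matrix; the spectral coefficients mu would be learned from the label information, which is not reproduced here, so the placeholder choice below is purely illustrative.

    import numpy as np
    from scipy.spatial.distance import cdist

    def rbf_gram(X, gamma=1.0):
        """Original label-agnostic Gaussian (RBF) Gram matrix."""
        return np.exp(-gamma * cdist(X, X, "sqeuclidean"))

    K = rbf_gram(np.random.rand(50, 10))
    eigvals, eigvecs = np.linalg.eigh(K)   # spectral decomposition K = V diag(lambda) V^T

    def kernel_from_basis(mu, eigvecs):
        """Rebuild a label-specific kernel on the shared eigenbasis;
        `mu` are the learned spectral coefficients (hypothetical here)."""
        return eigvecs @ np.diag(mu) @ eigvecs.T

    K_i = kernel_from_basis(np.maximum(eigvals, 0), eigvecs)   # placeholder: mu = lambda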
Model Inference • For any new image, the inference problem is to find the optimal label configuration $\mathbf{y}^* = \arg\max_{\mathbf{y}} F(\mathbf{x}, \mathbf{y})$ • The size of the multi-label space is exponential in the number of classes, so it is intractable to enumerate all possible label configurations to find the best one. We employ an approximate inference technique, Iterated Conditional Modes (ICM), sketched below: • Initialize a multi-label configuration $\mathbf{y}^{(0)}$. • In each iteration, given the current labels of the other concepts $\mathbf{y}_{-i}$, sequentially update each $y_i$ using the local model: if $F_i(\mathbf{x}, y_i = +1, \mathbf{y}_{-i}) > F_i(\mathbf{x}, y_i = -1, \mathbf{y}_{-i})$ then $y_i \leftarrow +1$; otherwise $y_i \leftarrow -1$.
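A minimal Python sketch of the ICM loop under the local-model form above; local_score is an assumed callable returning $F_i(\mathbf{x}, \mathbf{y})$, and the iteration count and convergence test are simplified relative to the paper.

    import numpy as np

    def icm_inference(local_score, n_concepts, n_iters=10, seed=0):
        """Iterated Conditional Modes over labels y_i in {-1, +1}.
        `local_score(i, y)` returns F_i(x, y) with y_i already set in y;
        this is an illustrative sketch, not the paper's exact procedure."""
        rng = np.random.default_rng(seed)
        y = rng.choice([-1, 1], size=n_concepts)      # initial configuration
        for _ in range(n_iters):
            changed = False
            for i in range(n_concepts):
                y_pos, y_neg = y.copy(), y.copy()
                y_pos[i], y_neg[i] = 1, -1
                new_label = 1 if local_score(i, y_pos) > local_score(i, y_neg) else -1
                if new_label != y[i]:
                    y[i] = new_label
                    changed = True
            if not changed:                           # converged: no label flipped
                break
        return y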
Experiments We compare our method on web-page classification against state-of-the-art methods: • RML [Petterson and Caetano, 2010]; ML-KNN [Zhang and Zhou, 2007]; • Tang's method [Tang et al., 2009]; and RankSVM [Elisseeff and Weston, 2002].
Experiments • We also evaluate the method on other real-world applications: image annotation, music emotion tagging, and gene categorization.
Conclusions: Inter-label dependency and similarity diversity are simultaneously leveraged in multi-kernel multi-label learning. • A concept network is constructed to characterize the inter-label correlations effectively. • The max-margin technique effectively captures the feature-label associations and the label-label correlations. • By decoupling the multi-label learning task into interdependent sub-problems, label by label, the proposed method learns multiple interrelated classifiers jointly. • Specific kernels are learned not only for each label but also for each pair of inter-related labels, embedding the label information and the inter-label correlations.
Thanks a lot! Q & A ?