
TransRank: A Novel Algorithm for Transfer of Rank Learning



Presentation Transcript


  1. TransRank: A Novel Algorithm for Transfer of Rank Learning Depin Chen, Jun Yan, Gang Wang et al. University of Science and Technology of China, USTC Machine Learning Group, MSRA depin.chen@mail.ustc.edu.cn

  2. Content • Ranking for IR • Paper motivation • The algorithm: TransRank • Results & future work

  3. Ranking in IR • Ranking is crucial in information retrieval: it aims to move good results up and bad results down. • A well-known example: the web search engine

  4. Learning to rank • Ranking + Machine learning = Learning to rank • An early work is Ranking SVM, “Support Vector Learning for Ordinal Regression”, Herbrich et al. [ICANN 99].

  5. Learning to rank for IR

  6. Existing approaches • Early ones: Ranking SVM, RankBoost … • More recently: IRSVM, AdaRank, ListNet … • Many contributions from Tie-Yan Liu’s team at MSRA

  7. Content • Learning to rank in IR • Paper motivation • The algorithm: TransRank • Results & future work

  8. Training data shortage • Learning to rank relies on a sufficient supply of labeled training data. • In real-world practice …

  9. Transfer learning • Transfer learning definition: transfer knowledge learned from different but related problems to solve the current problem effectively, with less training data and less time [Yang, 2008]. • Learning to walk can help in learning to run • Learning to program in C++ can help in learning to program in Java • … • We follow the spirit of transfer learning in this paper.

  10. Content • Learning to rank in IR • Paper motivation • The algorithm: TransRank • Results & future work

  11. Problem formulation • St: training data in the target domain; Ss: auxiliary training data from a source domain • Note that St is small relative to Ss (the data-shortage setting above). • What do we want? A ranking function for the target domain
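To make the setup concrete, here is a minimal sketch (not from the paper) of how the query-grouped training sets St and Ss could be represented; the class and field names are illustrative assumptions.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Query:
    """One query with its candidate documents (feature vectors) and relevance labels."""
    qid: str
    features: np.ndarray   # shape (n_docs, n_features)
    labels: np.ndarray     # shape (n_docs,), graded relevance, e.g. 0/1/2

# S_t: a small amount of labeled data in the target domain
# S_s: a larger amount of auxiliary labeled data from a related source domain
S_t: list[Query] = []
S_s: list[Query] = []
```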

  12. TransRank • Three steps of TransRank: (1) K-best query selection, (2) feature augmentation, (3) Ranking SVM

  13. Step 1: K-best query selection • A query’s ranking direction (illustrated on the slide with query 11 and query 41 from OHSUMED)

  14. The goal of step 1: we want to select the queries from the source domain whose ranking directions are most similar to those of the target-domain data. • These queries are treated as the ones most like the target-domain training data.
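The slides do not reproduce the paper’s exact definition of a query’s ranking direction; one simple proxy, shown here purely as an illustration and building on the Query sketch above, is the normalized vector from the mean of the non-relevant documents to the mean of the relevant ones.

```python
import numpy as np

def ranking_direction(q: Query) -> np.ndarray:
    """Illustrative proxy for a query's ranking direction:
    the unit vector pointing from non-relevant toward relevant documents.
    Assumes the query has both relevant and non-relevant documents."""
    relevant = q.features[q.labels > 0]
    non_relevant = q.features[q.labels == 0]
    d = relevant.mean(axis=0) - non_relevant.mean(axis=0)
    return d / (np.linalg.norm(d) + 1e-12)
```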

  15. Utility function (1) • Preprocess Ss: select the k best queries and discard the rest. • A “best” query is one whose ranking direction is confidently similar to that of the queries in St. • The utility function combines two parts: confidence and similarity.

  16. Utility function (2) • Confidence is measured by a separation value: the better the different relevance classes of instances are separated, the more confident the ranking direction.

  17. Utility function (3) • Similarity between ranking directions is measured by cosine similarity.
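Putting the two parts together, here is a hedged sketch of the k-best query selection. The separation measure below (gap between class means relative to within-class spread along the ranking direction) and the multiplicative combination of confidence and similarity are assumptions for illustration, not the paper’s exact formulas.

```python
import numpy as np

def separation_confidence(q: Query) -> float:
    """Illustrative separation value: how well relevant and non-relevant
    documents separate when scored along the query's ranking direction."""
    d = ranking_direction(q)
    scores = q.features @ d
    rel, nonrel = scores[q.labels > 0], scores[q.labels == 0]
    spread = rel.std() + nonrel.std() + 1e-12
    return float((rel.mean() - nonrel.mean()) / spread)

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def select_k_best(S_s: list[Query], S_t: list[Query], k: int) -> list[Query]:
    """Keep the k source queries whose ranking directions are most
    confidently similar to the target-domain ranking directions."""
    target_dirs = [ranking_direction(q) for q in S_t]

    def utility(q: Query) -> float:
        sim = float(np.mean([cosine(ranking_direction(q), dt) for dt in target_dirs]))
        return separation_confidence(q) * sim   # assumed combination of the two parts

    return sorted(S_s, key=utility, reverse=True)[:k]
```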

  18. Step 2: Feature augmentation • Daumé implemented cross-domain classification in NLP through a method called “feature augmentation” [ACL 07]. • For a source-domain document vector: (1, 2, 3) → (1, 2, 3, 1, 2, 3, 0, 0, 0) • For a target-domain document vector: (1, 2, 3) → (1, 2, 3, 0, 0, 0, 1, 2, 3)
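A small sketch of this augmentation, following the mapping shown on the slide (a common copy, then a source-specific copy, then a target-specific copy):

```python
import numpy as np

def augment(x: np.ndarray, domain: str) -> np.ndarray:
    """Daumé-style feature augmentation:
    source: x -> (x, x, 0); target: x -> (x, 0, x)."""
    zeros = np.zeros_like(x)
    if domain == "source":
        return np.concatenate([x, x, zeros])
    return np.concatenate([x, zeros, x])

augment(np.array([1.0, 2.0, 3.0]), "source")  # -> [1 2 3 1 2 3 0 0 0]
augment(np.array([1.0, 2.0, 3.0]), "target")  # -> [1 2 3 0 0 0 1 2 3]
```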

  19. Step 3: Ranking SVM • Ranking SVM is a state-of-the-art learning-to-rank algorithm, proposed by Herbrich et al. [ICANN 99].
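For reference, Ranking SVM can be trained by reducing ranking to binary classification on within-query document pairs. The sketch below uses scikit-learn’s LinearSVC as a stand-in for the SVMlight implementation used in the experiments; it is an approximation of that setup, not the paper’s exact training procedure.

```python
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(queries: list[Query]):
    """Build difference vectors x_i - x_j for every within-query pair with
    different relevance labels; the sign of the label difference is the class."""
    X, y = [], []
    for q in queries:
        for i in range(len(q.labels)):
            for j in range(len(q.labels)):
                if q.labels[i] > q.labels[j]:
                    X.append(q.features[i] - q.features[j]); y.append(+1)
                    X.append(q.features[j] - q.features[i]); y.append(-1)
    return np.array(X), np.array(y)

def train_ranking_svm(queries: list[Query]) -> np.ndarray:
    """Train a linear SVM on pairwise differences; the learned weight vector
    is then used to score and rank documents for unseen queries."""
    X, y = pairwise_transform(queries)
    model = LinearSVC(C=1.0, fit_intercept=False).fit(X, y)  # no bias for difference vectors
    return model.coef_.ravel()
```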

  20. Content • Learning to rank in IR • Paper motivation • The heuristic algorithm: TransRank • Results & future work

  21. Experimental settings • Datasets: OHSUMED (the LETOR version), WSJ, AP • Features: the feature set defined in OHSUMED; the same features are extracted on WSJ and AP • Evaluation measures: NDCG@n, MAP • For Ranking SVM, we use SVMlight by Joachims. • Two groups of experiments: WSJ → OHSUMED and AP → OHSUMED
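For completeness, a minimal sketch of NDCG@n as it is commonly computed for graded relevance; the exact gain and discount variant used in the paper is not stated on the slide, so the 2^rel − 1 gain and log2 discount below are an assumption.

```python
import numpy as np

def ndcg_at_n(labels_in_ranked_order: np.ndarray, n: int) -> float:
    """NDCG@n with exponential gain 2^rel - 1 and log2 position discount."""
    def dcg(labels):
        labels = np.asarray(labels, dtype=float)[:n]
        gains = 2.0 ** labels - 1.0
        discounts = np.log2(np.arange(2, labels.size + 2))
        return float(np.sum(gains / discounts))

    ideal = dcg(np.sort(labels_in_ranked_order)[::-1])  # best possible ordering
    return dcg(labels_in_ranked_order) / ideal if ideal > 0 else 0.0
```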

  22. Compared algorithms • Baseline: run Ranking SVM on St • TransRank • Directly Mix: Step 1 + Step 3

  23. Performance comparison • 40% of target labeled data, k = 10 • Results shown for source domain WSJ and source domain AP (charts on slide)

  24. Impact of target labeled data • Varying the amount of target labeled data from 5% to 100%, k = 10 • Results shown for source domain WSJ and source domain AP (charts on slide)

  25. Impact of k • 40% of target labeled data (charts on slide)

  26. Future work • Web-scale experiments, i.e. data from search engines • A more integrated algorithm using machine learning techniques • Theoretical study of transfer of rank learning
