
Cost-Sensitive Learning for Large-Scale Hierarchical Classification of Commercial Products

Jianfu Chen, David S. Warren (Stony Brook University)



Presentation Transcript


  1. Cost-Sensitive Learning for Large-Scale Hierarchical Classification of Commercial Products Jianfu Chen, David S. Warren Stony Brook University

  2. Classification is a fundamental problem in information management. Two examples: email content is classified as Spam or Ham, and a product description is classified into the UNSPSC taxonomy, which has four levels:
     • Segment: Vehicles and their Accessories and Components (25); Food Beverage and Tobacco Products (50); Office Equipment and Accessories and Supplies (44)
     • Family: Marine transport (11); Motor vehicles (10); Aerospace systems (20)
     • Class: Product and material transport vehicles (16); Safety and rescue vehicles (17); Passenger motor vehicles (15)
     • Commodity: Buses (02); Automobiles or cars (03); Limousines (06)

  3. How should we design a classifier for a given real world task?

  4. Method 1: No design. Split the data into a training set and a test set, then try off-the-shelf classifiers f(x): SVM, logistic regression, decision tree, neural network, ... Implicit assumption: we are trying to minimize the error rate or, equivalently, maximize accuracy.

  5. Method 2: Optimize what we really care about. What is the classifier used for? How do we evaluate its performance according to our interests? First quantify what we really care about, then optimize it.

  6. Hierarchical classification of commercial products: a textual product description is assigned to a node of the UNSPSC taxonomy.
     • Segment: Vehicles and their Accessories and Components (25); Food Beverage and Tobacco Products (50); Office Equipment and Accessories and Supplies (44)
     • Family: Marine transport (11); Motor vehicles (10); Aerospace systems (20)
     • Class: Product and material transport vehicles (16); Safety and rescue vehicles (17); Passenger motor vehicles (15)
     • Commodity: Buses (02); Automobiles or cars (03); Limousines (06)

  7. Product taxonomy helps customers to find desired products quickly. • Facilitates exploring similar products • Helps product recommendation • Facilitates corporate spend analysis. Example: looking for gift ideas for a kid? Browse Toys & Games: dolls, puzzles, building toys, ...

  8. We assume misclassification of products leads to revenue loss. Example: the textual product description of a mouse. Classified correctly under Desktop computer and accessories (next to keyboard), the product realizes its expected annual revenue; misclassified under pet, it loses part of the potential revenue.

  9. What do we really care about? A vendor’s business goal is to maximize revenue, or equivalently, minimize revenue loss

  10. Observation 1: the misclassification cost of a product depends on its potential revenue.

  11. Observation 2: the misclassification cost of a product depends on how far apart the true class and the predicted class are in the taxonomy. Example: for the textual product description of a mouse, predicting keyboard (a sibling under Desktop computer and accessories) is a smaller error than predicting pet (a distant branch).
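One way to make "how far apart" concrete is sketched below. It assumes classes are identified by 8-digit UNSPSC-style codes (two digits per level: segment, family, class, commodity) and takes the distance to be the number of levels between the leaves and their deepest shared ancestor. Both the function name and this particular distance definition are illustrative assumptions, not something stated on the slides.

```python
def hierarchical_distance(code_a: str, code_b: str, level_width: int = 2) -> int:
    """Levels between two leaf codes and their deepest shared ancestor.

    Assumption: each taxonomy level occupies `level_width` digits of the code,
    so 0 means "same commodity" and 4 means "different segment" for 8-digit codes.
    """
    if len(code_a) != len(code_b) or len(code_a) % level_width != 0:
        raise ValueError("codes must have equal length, a multiple of level_width")
    depth = len(code_a) // level_width
    shared = 0
    for i in range(depth):
        start = i * level_width
        if code_a[start:start + level_width] != code_b[start:start + level_width]:
            break  # first level where the two paths diverge
        shared += 1
    return depth - shared


# Codes built from the level numbers shown in the taxonomy slides:
# Vehicles (25) > Motor vehicles (10) > Passenger motor vehicles (15) > commodity
buses = "25101502"
cars = "25101503"
print(hierarchical_distance(buses, cars))        # siblings under the same class
print(hierarchical_distance(buses, "50101501"))  # hypothetical code in segment 50
```

Under this definition, confusing buses with cars costs one level, while landing in a different segment costs the full depth of the tree.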

  12. The proposed performance evaluation metric: average revenue loss. The revenue loss of product x is its example weight times an error function: • the example weight is the potential annual revenue of product x • the error function is the loss ratio, the percentage of the potential revenue a vendor will lose due to misclassification from class y to class y' • the loss ratio is a monotonically non-decreasing function of the hierarchical distance between y and y', f(d(y, y'))
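The metric described on this slide can be sketched as a short function: each product contributes its potential revenue times a loss ratio, and the contributions are averaged. The linear loss ratio used here (distance divided by the maximum distance) is only a placeholder assumption; the slide requires only that f be non-decreasing in the hierarchical distance. All names are hypothetical.

```python
from typing import Callable, Sequence


def linear_loss_ratio(distance: int, max_distance: int = 4) -> float:
    """Placeholder f(d): fraction of revenue lost, growing linearly with distance."""
    return distance / max_distance


def average_revenue_loss(
    revenues: Sequence[float],
    distances: Sequence[int],
    loss_ratio: Callable[[int], float] = linear_loss_ratio,
) -> float:
    """Mean over products of revenue(x) * f(d(y, y')).

    `distances[i]` is the hierarchical distance between the true and the
    predicted class of product i (0 for a correct prediction).
    """
    if len(revenues) != len(distances):
        raise ValueError("revenues and distances must have the same length")
    if not revenues:
        raise ValueError("need at least one product")
    total = sum(v * loss_ratio(d) for v, d in zip(revenues, distances))
    return total / len(revenues)


# Three products: one classified correctly, two misclassified at distances 2 and 4.
print(average_revenue_loss([1000.0, 2000.0, 4000.0], [0, 2, 4]))
```

A correct prediction contributes nothing, so a classifier that is wrong only on low-revenue products, or only by a short distance, scores well even if its plain error rate is unremarkable.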

  13. Learning: minimizing average revenue loss. The average revenue loss is not convex in the model parameters, so we minimize a convex upper bound instead.

  14. Multi-class SVM with margin re-scaling

  15. Multi-class SVM with margin re-scaling: the objective is a convex upper bound of the empirical loss, and any loss function can be plugged in.
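The formulas on these two slides did not survive in the transcript. As a rough illustration of the idea, the sketch below computes the standard margin-rescaled multi-class hinge loss for a single example with a linear model: the required margin over each wrong class grows with the cost of predicting that class. The cost matrix `delta` and all names are assumptions; in the setting of this talk, an entry would presumably be the product's revenue times the loss ratio.

```python
from typing import Sequence


def margin_rescaled_hinge(
    weights: Sequence[Sequence[float]],  # one weight vector per class
    x: Sequence[float],                  # feature vector of one example
    y: int,                              # index of the true class
    delta: Sequence[Sequence[float]],    # delta[y][k]: cost of predicting k when truth is y
) -> float:
    """max over k of delta[y][k] + score_k - score_y, with delta[y][y] == 0.

    The value is at least the cost of the class the model actually predicts
    (its highest-scoring class), which is why it serves as an upper bound.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    return float(max(delta[y][k] + scores[k] - scores[y] for k in range(len(scores))))


weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
delta = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
x = [1.0, 0.0, 0.0]  # the model scores class 0 highest

print(margin_rescaled_hinge(weights, x, y=0, delta=delta))  # correct, but margin to class 2 is too small
print(margin_rescaled_hinge(weights, x, y=2, delta=delta))  # wrong prediction, actual cost delta[2][0]
```

Because the loss is a maximum of functions that are linear in the weights, it is convex, and swapping in a different `delta` changes what is being bounded without changing the optimization machinery.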

  16. Dataset • UNSPSC (United Nations Standard Products and Services Code) dataset • Product revenues are simulated • revenue = price * sales

  17. Experimental results: average revenue loss (in K$) of different algorithms.

  18. What's wrong? The revenue loss per product ranges from a few K$ to several M$.

  19. Loss normalization • Linearly scale the loss function to a fixed range, say [1, 10]. • The objective now upper-bounds both the 0-1 loss and the average normalized loss.
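A minimal sketch of the linear scaling step, assuming the input is the list of raw misclassification costs (correct predictions keep a cost of zero and are not passed in). The function name and the handling of the all-equal case are illustrative choices, not taken from the slides.

```python
from typing import List, Sequence


def normalize_losses(
    losses: Sequence[float], low: float = 1.0, high: float = 10.0
) -> List[float]:
    """Linearly map raw misclassification costs onto [low, high].

    Assumption: `losses` holds only costs of actual misclassifications.
    With low >= 1, every misclassification costs at least as much as under
    the 0-1 loss, so a bound on the normalized loss also bounds the 0-1 loss.
    """
    if not losses:
        return []
    smallest, largest = min(losses), max(losses)
    if largest == smallest:
        return [low for _ in losses]  # nothing to spread out
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in losses]


# Raw costs spanning three orders of magnitude, as on the "What's wrong?" slide.
raw = [2_000.0, 502_000.0, 2_002_000.0]
print(normalize_losses(raw))
```

After scaling, the largest cost is only ten times the smallest, so a handful of very high-revenue products no longer dominates the objective.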

  20. Final results: 7.88% reduction in average revenue loss! Average revenue loss (in K$) of different algorithms.

  21. Conclusion • What do we really care about for this task? Minimize error rate? Minimize revenue loss? The answer defines the performance evaluation metric: the empirical risk, i.e. the average misclassification cost. • How do we approximate the performance evaluation metric to make it tractable? Model + tractable loss function: regularized empirical risk minimization. • Optimization: find the best parameters. • A general method: multi-class SVM with margin re-scaling and loss normalization.

  22. Thank you! Questions?
