Estimating Business Targets

Advisor: Dr. Hsu. Graduate: Yung-Chu Lin. Data Source: Datta et al., KDD01, pp. 420-425. Abstract: Propose a new solution to the classical econometric task of frontier analysis. Combine nearest neighbor methods and classical statistical methods.

Presentation Transcript


  1. Estimating Business Targets Advisor: Dr. Hsu Graduate: Yung-Chu Lin Data Source: Datta et al., KDD01, pp. 420-425. IDSL seminar

  2. Abstract • Propose a new solution to the classical econometric task of frontier analysis • Combine nearest neighbor methods and classical statistical methods • Identify under-marketed customers • Benchmark regional directory divisions

  3. Outline • Motivation • Objective • Historical approaches • Target estimation methodology • Case study • Conclusion • Personal opinion

  4. Motivation • Setting targets is a critical task • Traditionally, each entity's target is set to the average across all entities • Two challenges • The characteristics of each entity heavily influence its outcome • The problem is inherently unsupervised

  5. Objective • Provide a methodology for estimating unsupervised maximal or minimal targets • Setting revenue target expectations for individual customers • Revenue target setting for regional yellow page directories

  6. Historical Approaches • Mathematical programming • Economics

  7. Mathematical Programming • [formula not preserved in the transcript], where [symbol not preserved] is the target for xi, the vector for the ith observation • Drawback: sensitive to errors and outliers, since it assumes that all observed targets define the possible space

  8. Economics • [formula not preserved in the transcript], where [symbol not preserved] is a non-negative error term • Drawback: requires a model for both the error term and g
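
For orientation, a common textbook form of a deterministic econometric frontier (for a maximal frontier) is sketched below. The slide's own formula is missing from the transcript, so this is not necessarily the expression the presenter showed; the symbols y_i (observed value) and u_i (error term) are assumptions, while g and x_i follow the slide text.

```latex
y_i = g(x_i) - u_i, \qquad u_i \ge 0
```

Under this form, every observation lies on or below the frontier g, which is why both g and the distribution of u_i have to be specified in advance.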

  9. Target Estimation Methodology • Nearest neighbor vs. clustering • The neighborhoods • The distance function • Target estimation from the neighborhoods • A heuristic for comparing neighborhoods

  10. Nearest Neighbor vs. Clustering • Time complexity: clustering is better than nearest neighbor • Problem with clustering: two similar entities can fall into different clusters, and the effect becomes more serious as the dimensionality grows • Nearest neighbor does not have this problem

  11. The Neighborhoods • xi: the ith observation • yi: the variable containing its target value • ni: the neighborhood for xi, a set of observations {xi, xj, …}

  12. The Distance Function • Continuous variables → standardize • Examples (the computed distances are not preserved in the transcript): continuous (2,1) vs. (3,4); nominal (a,b) vs. (a,c)
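
The slide's worked distances are missing, so the following is only a minimal sketch of one plausible mixed distance. It assumes Euclidean distance on the continuous variables and a 0/1 mismatch on the nominal ones; that combination, and the names `standardize` and `mixed_distance`, are assumptions for illustration rather than the presenter's definition.

```python
import math

def standardize(values):
    """Rescale numbers to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std > 0 else 0.0 for v in values]

def mixed_distance(cont_a, cont_b, nom_a, nom_b):
    """Assumed combination: squared differences on continuous values
    plus one unit per mismatching nominal value, under a square root."""
    cont_sq = sum((a - b) ** 2 for a, b in zip(cont_a, cont_b))
    nom_mismatch = sum(1 for a, b in zip(nom_a, nom_b) if a != b)
    return math.sqrt(cont_sq + nom_mismatch)

# The slide's example pairs, used raw here for readability
# (in practice the continuous columns would be standardized first).
d = mixed_distance((2, 1), (3, 4), ("a", "b"), ("a", "c"))  # sqrt(10 + 1)
```

Standardizing first keeps a variable with a large numeric range from dominating the distance.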

  13. Target Estimation From the Neighborhoods • Let yi(1), yi(2), …, yi(k) be the order statistics of the target values in the neighborhood of xi, so that yi(1) is the largest
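
The estimator that the slide builds from these order statistics is not preserved in the transcript. As a rough sketch of the ordering step only, the code below sorts a neighborhood's target values in descending order and, as a hypothetical estimator, averages the `top` largest ones (with `top=1` this is simply the neighborhood maximum). The function names and the sample values are made up for illustration.

```python
def order_statistics_desc(targets):
    """Return target values sorted so that index 0 holds the largest."""
    return sorted(targets, reverse=True)

def estimate_target(neighbor_targets, top=1):
    """Hypothetical estimator: mean of the `top` largest neighbor targets."""
    if not 1 <= top <= len(neighbor_targets):
        raise ValueError("top must be between 1 and the neighborhood size")
    ordered = order_statistics_desc(neighbor_targets)
    return sum(ordered[:top]) / top

# Made-up revenues for a neighborhood of k = 4 observations.
neighborhood = [120.0, 95.0, 150.0, 110.0]
maximal_target = estimate_target(neighborhood, top=1)   # the largest value
smoothed_target = estimate_target(neighborhood, top=2)  # mean of the two largest
```

Averaging more than one order statistic is one way to reduce sensitivity to a single outlier in the neighborhood.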

  14. A Heuristic for Comparing Neighborhoods • Maximal frontier → E(xi) ranges from 0 to 1 • Minimal frontier → E(xi) ≥ 1
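
The definition of E(xi) is not preserved in the transcript. One reading consistent with the stated ranges is the ratio of the observed value to the estimated target; the sketch below assumes that ratio, and the function name `efficiency` and the sample numbers are illustrative only.

```python
def efficiency(actual, target):
    """Assumed definition: observed value divided by the estimated target."""
    if target <= 0:
        raise ValueError("target must be positive")
    return actual / target

# Maximal frontier: the target is an upper bound, so the ratio is at most 1.
e_max = efficiency(actual=90.0, target=150.0)

# Minimal frontier: the target is a lower bound, so the ratio is at least 1.
e_min = efficiency(actual=120.0, target=100.0)
```

Because the ratio is unitless, it lets entities from different neighborhoods be compared on one scale.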

  15. Case Study • Target revenues for directory book advertisers • Target revenue for regional directories

  16. (1) Target Revenues for Directory Book Advertisers • Goal: find businesses that have low spending relative to those with otherwise similar characteristics • Three categories of data available • Advertiser: e.g. number of employees • Directory: e.g. distribution size • Market: e.g. median household income

  17. Calculating Nearest Neighbors • Standardize continuous data: natural log • K=4 • Weight the variables equally by default, but decrease the weights for many of the directory and market variables
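
The steps on this slide can be sketched roughly as follows: log-transform the continuous variables, apply per-variable weights, and take the K=4 nearest observations. The weighted Euclidean distance, the specific weights, the sample rows, and the choice to exclude the query observation from its own neighbor list are all assumptions made for illustration.

```python
import math

def log_transform(row):
    """Apply the natural log to every (positive) continuous value."""
    return [math.log(v) for v in row]

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (an assumed form of the weighting)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def nearest_neighbors(data, index, weights, k=4):
    """Indices of the k rows closest to data[index], excluding the row itself."""
    query = data[index]
    scored = [(weighted_distance(query, row, weights), j)
              for j, row in enumerate(data) if j != index]
    scored.sort()
    return [j for _, j in scored[:k]]

# Hypothetical rows: [number of employees, directory distribution size].
raw = [[10, 1000], [12, 1100], [200, 50000], [11, 900], [9, 1200], [15, 1000]]
logged = [log_transform(row) for row in raw]

# Advertiser variable at full weight, directory variable down-weighted.
weights = [1.0, 0.5]
neighbors_of_first = nearest_neighbors(logged, 0, weights, k=4)
```

Since slide 11 lists xi as a member of its own neighborhood, the query row could be added back to the returned indices if that convention is wanted.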

  18. Distribution of E(x) for Advertisers

  19. A Decision Tree to Predict phi -xi

  20. (2) Target Revenue for Regional Directories • Goal: benchmark regional directory divisions • Separate the data into two sets • Training set: 80% • Test set: 20% • K=4
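
The slide does not say how the 80/20 split was drawn. A minimal sketch, assuming a simple random split with a fixed seed for repeatability, is shown below; the function name `split_train_test` and the placeholder records are hypothetical.

```python
import random

def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it at the requested fraction."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Ten placeholder directory records, identified only by an index.
train, test = split_train_test(range(10))
```

Targets would then be estimated from neighborhoods in the training set and checked against the held-out 20%.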

  21. Book Type • System book: an entire serving area • System-neighborhood book: a smaller number of geographic areas in the franchise area • Neighborhood book: areas outside of the telephone company’s franchise area

  22. Four Different Distributions labeled according to the legend

  23. The x-axis shows log(distribution) and the y-axis shows E(x) • Legend: Neighborhood books, System books, Non-system books

  24. Conclusion • Present a general data mining methodology for estimating business targets by frontier analysis • First case • Increase sales focus on the under-marketed customers • Increase the potential revenue by several million • Second case • Estimate optimal revenue performance targets for directory divisions • The increase for directory books is a minimum of several million dollars

  25. Personal opinion • Combining several existing methodologies or disciplines can produce a powerful new one
