Mining customer ratings for product recommendation using the support vector machine and the latent class model
William K. Cheung, James T. Kwok, Martin H. Law, Kwok-Ching Tsui
Intelligent Systems Research Group, BT Laboratories · Hong Kong Baptist University
What is a Recommender System?
[Diagram: the records of other customers (possibly with ratings) feed into a recommender system, which produces recommendations for the current customer.]
Product Recommendation in E-commerce
[Screenshot: products and recommendations on www.amazon.com]
Product Recommendation in E-commerce
[Screenshot: products and recommendations on www.cdnow.com]
Overview
[Diagram: a content-based recommender system matches a personal profile against product content using the Support Vector Machine (SVM); a collaborative recommender system matches a customer's ratings against the records of other customers (possibly with ratings) using the Extended Latent Class Model (ELCM).]
Presentation Outline
• Content-based Recommendation
  • Existing Solutions and Their Limitations
  • Our Proposed Solution - the SVM
• Collaborative Recommendation
  • Existing Solutions and Their Limitations
  • Our Proposed Solution - the Extended LCM
• Experimental Evaluation
• Conclusion and Future Work
Content-based Recommendation
[Diagram: a content-based recommender system matches a personal profile against product descriptions.]
• Matching between the personal profile and the features extracted from product descriptions.
• Assumptions:
  • Customer personal profiles are available.
  • Detailed product descriptions are available, so that a set of representative features can be extracted.
  • Both the profiles and the product descriptions share the same representation.
Some Existing Solutions
• Keyword Matching
  • suffers from the problems of synonymy and polysemy.
• Pattern Classification Approaches
  • f(y) = {f_1(y), f_2(y), ..., f_m(y)}: the set of features for product y
  • a_x(f(y)): the classifier output for customer x's interest in y, obtained via training.
  • Examples of classifiers: Naïve Bayes, k-NN, C4.5 (decision tree)
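To make the pattern-classification view concrete, here is a minimal sketch, assuming scikit-learn and toy data (the movie descriptions and labels are illustrative, not from the paper): a per-customer Naïve Bayes classifier is trained on the descriptions of products the customer has rated, then used to score unseen products.

```python
# A toy content-based recommender: train a per-customer classifier a_x on
# rated product descriptions, then score unseen products.
# Assumptions: scikit-learn is available; all data below is illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

descriptions = [
    "action thriller car chase",       # products rated by customer x
    "romantic comedy wedding",
    "sci-fi space action adventure",
    "romantic drama tears",
]
liked = [1, 0, 1, 0]                   # 1 = interested, 0 = not interested

vectorizer = CountVectorizer()         # extracts f(y) = {f_1(y), ..., f_m(y)}
F = vectorizer.fit_transform(descriptions)

clf = MultinomialNB().fit(F, liked)    # a_x, learned from x's ratings

new_products = ["space action sequel", "wedding comedy"]
scores = clf.predict_proba(vectorizer.transform(new_products))[:, 1]
for title, s in zip(new_products, scores):
    print(f"{title}: P(interested) = {s:.2f}")
```

Any of the classifiers listed above (k-NN, C4.5) could be swapped in for MultinomialNB in the same pattern.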
Feature Selection Problem
• The performance of content-based recommendation depends heavily on the discriminative power of the selected features.
  • Too few features => useful profiles are hard to learn (the analysis is too shallow).
  • Too many features => the classifier's parameters are hard to estimate with good generalisation performance.
Our Proposed Solution - the use of SVM
• The Support Vector Machine has been shown to achieve good generalisation performance on high-dimensional classification problems, and its training can be framed as solving a quadratic programming problem.
• => one can simply use all extracted features as the input; no feature selection is needed at all.
Support Vector Machine (SVM)
• Intuitively, maximize the margin between classes
• Theoretically sound
  • related to minimizing the VC-dimension under the theory of structural risk minimization
[Figure: two classes separated by a line; the margin is the gap between the closest points of each class.]
Solving for the line
• Computationally, this leads to a quadratic programming problem:
  • maximize a quadratic objective function subject to some linear constraints
  • no local maxima (cf. neural networks)
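For reference, the standard formulation behind these two slides (textbook SVM material, not copied from the slides; t_i ∈ {−1, +1} denotes the class label, chosen here to avoid clashing with the product symbol y used elsewhere in this talk):

```latex
% Maximizing the margin 2/\|w\| is equivalent to the primal QP
\min_{w,b}\ \tfrac{1}{2}\|w\|^2
\quad \text{s.t.} \quad t_i\,(w^\top x_i + b) \ge 1,\quad i = 1,\dots,n.
% Its dual is a QP in the multipliers \alpha_i with only linear constraints:
\max_{\alpha}\ \sum_i \alpha_i
  - \tfrac{1}{2}\sum_{i,j}\alpha_i\alpha_j\, t_i t_j\, x_i^\top x_j
\quad \text{s.t.} \quad \alpha_i \ge 0,\ \ \textstyle\sum_i \alpha_i t_i = 0.
```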
Support Vectors
• The separating line depends only on a small number of training examples: the support vectors, i.e. those with nonzero multipliers α_i in the dual.
Nonlinear Cases
• use another coordinate system such that the "curve" becomes a "line"
Kernels
• Only inner products, φ(x)ᵀφ(y), are involved in the calculation.
• Under certain conditions, there exists a kernel K such that K(x,y) = φ(x)ᵀφ(y)
  • e.g. polynomial of degree d: K(x,y) = (xᵀy + 1)^d
• replace xᵀy in the dual by K(x,y) = φ(x)ᵀφ(y)
Overlapping Cases
• When it is impossible to perfectly separate the two classes, include an error term.
• Instead of maximizing the margin alone, minimize a weighted sum of the training error and 1/margin.
• Again, this involves only quadratic programming.
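Pulling the last few slides together, a minimal scikit-learn sketch (an illustration on synthetic data, not the paper's implementation): kernel="poly" with gamma=1 and coef0=1 gives exactly the polynomial kernel K(x,y) = (xᵀy + 1)^d above, and C weighs the error term of the soft margin.

```python
# Soft-margin SVM with a polynomial kernel, as sketched in the slides.
# Assumptions: scikit-learn available; synthetic data stands in for the
# high-dimensional product-feature vectors used in the paper.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))              # 200 products, 50 features
w_true = rng.normal(size=50)
t = (X @ w_true + 0.5 * rng.normal(size=200) > 0).astype(int)  # noisy labels

# K(x, y) = (x.y + 1)^2; C controls the error/margin trade-off
clf = SVC(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=1.0)
clf.fit(X[:150], t[:150])

print("test accuracy:", clf.score(X[150:], t[150:]))
print("number of support vectors:", clf.n_support_.sum())
```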
Collaborative Recommendation
[Diagram: a collaborative recommender system matches a customer's product ratings against the records of other customers (possibly with ratings).]
• Matching between the customer's ratings and the ratings of others (the word-of-mouth approach).
• Assumptions:
  • Ratings from a reasonably large group of customers are available.
  • Each product has been rated by some of the customers.
  • The customers' ratings overlap to a certain degree.
Some Existing Solutions
• Memory-based Approach
  • Pearson correlation coefficient and its variants
  • suffers from the sparsity and first-rater problems.
• Model-based Approach
  • solves the sparsity problem by incorporating a priori models.
  • e.g., Naïve Bayes classifier, Bayesian network, Latent Class Model
Limitations
• The sparsity problem (lacking sufficient ratings)
• The first-rater problem (encountering new products)

                    y1  y2  y3  y4  y5  y6  y7  y8
  Customer x1        5   -   -   4   -   -   -   -
  Customer x2        -   5   4   -   -   -   -   -
  Customer x3        1   -   4   -   4   -   -   -
  A New Customer xn  5   -   -   -   -   -   -   -
Grouping Preference Ratings - to solve the sparsity problem
[Illustration: the ratings matrix from the previous slide, with customers grouped into Preference Pattern #1 and Preference Pattern #2; within each group, products rated highly by other members are marked "Recommended!" for the new customer xn.]
Integrating Product Contents - to solve the first-rater problem
[Illustration: the same ratings matrix; a product with no ratings yet is linked to a preference pattern through its content and marked "Recommended!" for the customers matching that pattern.]
Our Proposed Solution - the use of LCM
• The latent class model was proposed by Thomas Hofmann et al. (IJCAI'99) for clustering preference ratings, with promising results.
• Limitation: it can only recommend products to customers in the training set.
• We extend the model so that
  a) existing products can be recommended to customers not in the training set;
  b) new products can be recommended to existing customers (not described in the paper).
Latent Class Model
[Diagram: customer X and product Y are observed variables; the preference pattern Z is hidden, with X and Y conditionally independent given Z.]
• Model training: learn P(z), P(x|z) and P(y|z) using the EM algorithm.
• The model is initialized by k-means clustering.
Existing Products to Existing Customers
• Compute the probability that x is interested in y: P(y|x) = Σ_z P(y|z) P(z|x), with P(z|x) ∝ P(z) P(x|z).
• Products can then be sorted according to P(y|x) for recommendation.
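The training slide and this recommendation step, as one compact numpy sketch (illustrative, not the paper's code): the EM updates are the standard aspect-model ones; random initialization stands in for the k-means initialization mentioned on the training slide, and ratings are treated as co-occurrence weights, both simplifying assumptions.

```python
# EM for the latent class (aspect) model: P(x,y) = sum_z P(z) P(x|z) P(y|z).
# Assumptions: random init (the paper uses k-means) and ratings used as
# co-occurrence weights; all data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
N = rng.integers(0, 6, size=(90, 500)).astype(float)  # customers x products
K = 8                                                 # number of patterns z

Pz = np.full(K, 1.0 / K)
Px_z = rng.random((90, K));  Px_z /= Px_z.sum(axis=0)
Py_z = rng.random((500, K)); Py_z /= Py_z.sum(axis=0)

for _ in range(50):
    # E-step: responsibilities P(z|x,y), shape (customers, products, K)
    R = Pz * Px_z[:, None, :] * Py_z[None, :, :]
    R /= R.sum(axis=2, keepdims=True)
    # M-step: re-estimate P(x|z), P(y|z), P(z) from rating-weighted counts
    W = N[:, :, None] * R
    Px_z = W.sum(axis=1); Px_z /= Px_z.sum(axis=0)
    Py_z = W.sum(axis=0); Py_z /= Py_z.sum(axis=0)
    Pz = W.sum(axis=(0, 1)); Pz /= Pz.sum()

# Recommendation: P(y|x) = sum_z P(y|z) P(z|x), with P(z|x) ∝ P(z) P(x|z)
Pz_x = Pz * Px_z; Pz_x /= Pz_x.sum(axis=1, keepdims=True)  # (customers, K)
Py_x = Pz_x @ Py_z.T                                       # (customers, products)
print("top-5 products for customer 0:", np.argsort(-Py_x[0])[:5])
```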
Extension 1: Existing Products to New Customers
• xn is not in the training set, so P(z|xn) is not available.
• It is approximated by the inner product of the pdf of pattern z and the ratings of xn.
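A hedged reading of Extension 1 as code (the slide gives only the inner-product idea; the final normalisation and the random stand-in for a trained P(y|z) are assumptions made to keep the snippet self-contained):

```python
# Extension 1 (sketch): recommend existing products to a new customer x_n.
# P(z|x_n) is taken as the inner product of each pattern's product pdf
# P(.|z) with x_n's rating vector, then normalised (an assumed step).
import numpy as np

rng = np.random.default_rng(1)
Py_z = rng.random((500, 8)); Py_z /= Py_z.sum(axis=0)  # stand-in for trained P(y|z)

r_new = np.zeros(500); r_new[[0, 3]] = [5, 4]          # x_n rated y0 = 5, y3 = 4
Pz_xn = Py_z.T @ r_new                                 # inner product per pattern z
Pz_xn /= Pz_xn.sum()                                   # normalise to get P(z|x_n)
Py_xn = Py_z @ Pz_xn                                   # P(y|x_n), used for ranking
print("top-5 products for x_n:", np.argsort(-Py_xn)[:5])
```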
Extension 2: New Products to Existing Customers
• yn is not in the training set, so P(yn|z) is not available.
• It is estimated from the distance between yn and z in the (content) feature space.
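A hedged reading of Extension 2 (the slide states only that a feature-space distance is used; the Gaussian-style conversion from distance to P(yn|z), the per-pattern centroids, and the random stand-ins are all assumptions):

```python
# Extension 2 (sketch): estimate P(y_n|z) for a brand-new product y_n from its
# distance to each pattern z in the content-feature space, then score existing
# customers. All quantities below are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(2)
K, d = 8, 50
mu = rng.normal(size=(K, d))       # assumed centroid of pattern z's products
f_new = rng.normal(size=d)         # content features f(y_n) of the new product

dist2 = ((mu - f_new) ** 2).sum(axis=1)
Pyn_z = np.exp(-0.5 * dist2)       # closer pattern => larger P(y_n|z)
                                   # (each column P(.|z) would be renormalised
                                   #  after appending y_n to the model)

Pz_x = rng.dirichlet(np.ones(K), size=90)  # stand-in for the trained P(z|x)
scores = Pz_x @ Pyn_z                      # sum_z P(y_n|z) P(z|x) per customer
print("customers most likely to want y_n:", np.argsort(-scores)[:5])
```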
Performance Measures
• Accuracy: the percentage of correct recommendations.
• Recall: the percentage of interesting products that appear in the output list.
• Precision: the percentage of products in the output list that are really interesting to the customer.
• Break-even point: the point where recall = precision.
• Expected utility: high when highly-rated products appear early in the output list (see the sketch below).
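A small sketch of how these measures can be computed. The precision/recall definitions here are the generic top-k ones, and the expected utility follows the half-life formulation of Breese et al. (1998); whether the paper uses exactly these parameterisations is not shown on the slide, so the defaults d=3 and alpha=5 are assumptions.

```python
# Evaluation-metric sketch (illustrative parameterisations, see lead-in).
import numpy as np

def precision_recall(recommended, relevant, k):
    """Top-k precision and recall for one customer."""
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / k, hits / len(relevant)

def expected_utility(ranked_ratings, d=3.0, alpha=5.0):
    """Half-life utility: ratings above the neutral rating d contribute
    more the earlier they appear, with viewing half-life alpha."""
    decay = 2.0 ** (-np.arange(len(ranked_ratings)) / (alpha - 1))
    return float(np.sum(np.maximum(np.asarray(ranked_ratings) - d, 0) * decay))

ranked = [5, 2, 4, 1, 5]            # ratings of products in recommended order
print(expected_utility(ranked))     # high when high ratings come early
print(precision_recall([3, 7, 1, 9], [1, 2, 3], k=3))
```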
Experiment One: Setup (content-based, SVM)
• Product ratings data set: EachMovie (from DEC)
• Product description data set: Internet Movie Database (http://www.imdb.com)
• Size of feature set = 6620, including release date, runtime, language, director, producer, original music, writing credit, ...
• No. of products = 1628
• 5-fold cross-validation: ~1200 for training, the remainder for testing
• No. of customers = 100
Experiment Two: Setup (collaborative, ELCM)
• Ratings data set: EachMovie (from DEC)
• Training: no. of products = 500, no. of customers = 90
• Testing: no. of customers = 10, no. of products = 250
• Size of the product set whose ratings are considered for matching: L ∈ {10, 63, 83, 125, 250}
Conclusion and Future Work
• SVM and ELCM are empirically shown to be promising for content-based and collaborative recommendation, respectively.
• Future work
  • ELCM
    • model enhancement - BiELCM, hierarchical variants, ...
    • scalability of the EM algorithm for ELCM
    • modelling dynamic preference patterns
    • applications to cross-selling?
  • integration of SVM and ELCM for further improvement