A Vector Space Model for Automatic Indexing (G. Salton, A. Wong and C. S. Yang) • Enhanced Vector Space Models for Content-based Recommender Systems (Cataldo Musto) • Presenter: Sawood Alam <salam@cs.odu.edu>
A Vector Space Model for Automatic Indexing • G. Salton, A. Wong and C. S. Yang • Cornell University
Introduction • In document retrieval, the best indexing space is one where each entity lies as far away from the others as possible • The density of the object space becomes a measure of the quality of the indexing system • Retrieval performance correlates inversely with space density
Document Space • Di = (di1, di2, di3, …, dit), where dij is the weight of the j-th index term in document Di and t is the number of index terms
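The link between document vectors and space density can be illustrated with a minimal sketch. The slides do not say how density is measured, so this example assumes cosine similarity and defines density as the average similarity between each document and the centroid of the collection. The function names and the toy vectors are illustrative only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def centroid(docs):
    """Component-wise mean of all document vectors."""
    n = len(docs)
    return [sum(column) / n for column in zip(*docs)]

def space_density(docs):
    """Assumed density measure: mean similarity of documents to the centroid."""
    c = centroid(docs)
    return sum(cosine(c, d) for d in docs) / len(docs)

# Toy collections over t = 3 index terms (made-up weights).
spread = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
clustered = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.1], [1.0, 1.0, 0.0]]

print(space_density(spread))     # lower density: documents are far apart
print(space_density(clustered))  # higher density: documents are close together
```

Under this assumed measure, the spread-out collection has the lower density, which is the configuration the slides associate with better retrieval performance.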
Enhanced Vector Space Models for Content-based Recommender Systems • Cataldo Musto • Dept. of Computer Science, University of Bari, Italy • cataldomusto@di.uniba.it
Introduction • Using Vector Space Models (VSM) in Information Retrieval is an established practice • This work investigates the impact of vector space models in Information Filtering • Target application: recommender systems
Problems of VSM • High dimensionality • Becoming more serious as emerging social apps and micro-blogging generate large amounts of web content and new vocabulary • Inability to manage document semantics • The order in which terms occur in the document is ignored
Components • Context vector for each term, with values in {-1, 0, 1} • Vector Space representation of a term (t) • Vector Space representation of a document (d) • Vector Space representation of a user profile (pu)
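The components above can be sketched in the style of Random Indexing. The slides list the components but not how they are combined, so the following assumes a common construction: each term gets a sparse random context vector with values in {-1, 0, 1}, a term vector is the sum of the context vectors of the terms it co-occurs with (here, terms in the same document), a document vector is the sum of its term vectors, and a user profile is the sum of the vectors of the documents the user liked. The dimension, the number of non-zero entries, and all function names are illustrative assumptions.

```python
import random

def context_vector(dim, nonzeros, rng):
    """Sparse ternary vector: `nonzeros` random positions, alternating +1 and -1."""
    vec = [0] * dim
    for k, pos in enumerate(rng.sample(range(dim), nonzeros)):
        vec[pos] = 1 if k % 2 == 0 else -1
    return vec

def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

def build_vectors(docs, dim=16, nonzeros=4, seed=0):
    """Return (context vectors, term vectors, document vectors)."""
    rng = random.Random(seed)
    vocab = sorted({t for tokens in docs.values() for t in tokens})
    context = {t: context_vector(dim, nonzeros, rng) for t in vocab}
    # Assumption: a term co-occurs with every other token of the same document.
    term_vecs = {t: [0] * dim for t in vocab}
    for tokens in docs.values():
        for i, t in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    term_vecs[t] = add(term_vecs[t], context[other])
    # A document vector is the sum of the term vectors of its tokens.
    doc_vecs = {}
    for doc_id, tokens in docs.items():
        vec = [0] * dim
        for t in tokens:
            vec = add(vec, term_vecs[t])
        doc_vecs[doc_id] = vec
    return context, term_vecs, doc_vecs

def user_profile(doc_vecs, liked_ids):
    """Profile vector: sum of the vectors of the documents the user liked."""
    dim = len(next(iter(doc_vecs.values())))
    profile_vec = [0] * dim
    for doc_id in liked_ids:
        profile_vec = add(profile_vec, doc_vecs[doc_id])
    return profile_vec

# Toy item descriptions (made up for illustration).
docs = {
    "d1": ["space", "war", "alien"],
    "d2": ["alien", "ship", "space"],
    "d3": ["love", "wedding"],
}
context, term_vecs, doc_vecs = build_vectors(docs)
profile = user_profile(doc_vecs, ["d1", "d2"])
print(profile)
```

Because every vector has the same fixed dimension regardless of vocabulary size, this kind of construction addresses the high-dimensionality problem listed earlier.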
Indexing Technique • Random Indexing-based model • Weighted Random Indexing-based model • Semantic Vector-based model • Weighted Semantic Vector-based model
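The difference between the plain and the weighted variants can be illustrated with a small sketch. The slides do not define the weighting, so this assumes one plausible reading: the unweighted profile sums the vectors of liked documents, while the weighted profile scales each liked document by its rating divided by the maximum rating. The like threshold, the rating scale, and the function name are hypothetical. The semantic vector variants are not sketched because the slides give no detail about them.

```python
def build_profile(doc_vectors, ratings, like_threshold=3, max_rating=5, weighted=False):
    """Sum the vectors of documents rated above `like_threshold`.

    If `weighted` is True, each document contributes rating / max_rating
    of its vector (an assumed weighting, not taken from the slides).
    """
    dim = len(next(iter(doc_vectors.values())))
    profile_vec = [0.0] * dim
    for doc_id, rating in ratings.items():
        if rating <= like_threshold:
            continue  # ignore documents the user did not like
        weight = rating / max_rating if weighted else 1.0
        profile_vec = [p + weight * d for p, d in zip(profile_vec, doc_vectors[doc_id])]
    return profile_vec

# Toy document vectors and ratings on a 1-5 scale (made up for illustration).
doc_vectors = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
ratings = {"d1": 5, "d2": 4, "d3": 1}

plain = build_profile(doc_vectors, ratings)
weighted = build_profile(doc_vectors, ratings, weighted=True)
print(plain, weighted)
```

In the weighted profile the document rated 5 counts fully, while the document rated 4 contributes only four fifths of its vector.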
Conclusions • The first prototype, with a naive weighting scheme, is comparable to other content-based filtering techniques such as a Bayesian classifier • More complex weighting schemes are expected to perform better • User profiles based on Linked Data, rather than keyword-based profiles, may be studied in future work