Algorithms for the Web: “A Unified Approach to Personalization Based on Probabilistic Latent Semantic Models of Web Usage and Content”
Presentation outline • Introduction • Probabilistic Latent Semantic Models of Web User Navigations • A Recommendation Framework Based on the Joint PLSA Model • Description of Data Sets • Conclusion Hanieh Fakhfouri
Introduction • What is web usage mining? • Different categories of user behavior • Different kinds of data mining techniques • LSA, SVD, PLSA
Probabilistic Latent Semantic Models of Web User Navigations • Usage data preprocessing phase • Pageviews: P = {p1, p2, . . . , pn} • User sessions: U = {u1, u2, . . . , um} • Web session data: the m×n session-pageview matrix UP • Content preprocessing techniques • Applying text mining and information retrieval techniques lets us represent each pageview as an attribute vector • Content preprocessing yields the attribute set A = {a1, a2, . . . , as}, which holds the content observation data • The s×n attribute-pageview matrix AP
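As an illustration of the session-pageview matrix UP described on this slide, here is a minimal sketch. The page names, the toy sessions, and the use of visit counts as weights are assumptions made for the example; the slide does not say how the weights are defined.

```python
import numpy as np

def build_session_matrix(sessions, pageviews):
    """Return the m x n matrix UP, where UP[i, j] is the weight of pageview j in session i."""
    index = {page: j for j, page in enumerate(pageviews)}
    up = np.zeros((len(sessions), len(pageviews)))
    for i, session in enumerate(sessions):
        for page, weight in session.items():
            up[i, index[page]] = weight
    return up

# Hypothetical toy data: three sessions over four pageviews, weights are visit counts.
pageviews = ["home", "courses", "faculty", "admissions"]
sessions = [
    {"home": 1, "courses": 2},
    {"home": 1, "admissions": 3},
    {"faculty": 1},
]

UP = build_session_matrix(sessions, pageviews)
print(UP.shape)  # (3, 4): m = 3 sessions, n = 4 pageviews
```

The attribute-pageview matrix AP could be built the same way, with one row per content attribute instead of one row per session.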
Probabilistic Latent Semantic Models of Web User Navigations • Content preprocessing techniques
Probabilistic Latent Semantic Models of Web User Navigations • Latent variable • A latent variable zk ∈ Z = {z1, z2, · · · , zl} is associated • with each observation (ui, pj) • with each observation (at, pj) • Our goal: • Find Z = {z1, z2, · · · , zl}
The probabilistic latent factor model • Can be described as the following generative process: • select a user session ui from U with probability Pr(ui); • select a latent factor zk associated with ui with probability Pr(zk|ui); • given the factor zk, generate a pageview pj from P with probability Pr(pj|zk).
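The three generative steps above can be sketched as a small sampler. All probability tables below are hypothetical toy values chosen for the example, not parameters from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy parameters: 2 sessions, 2 latent factors, 3 pageviews.
pr_u = np.array([0.5, 0.5])                 # Pr(u_i)
pr_z_given_u = np.array([[0.9, 0.1],        # Pr(z_k | u_i), one row per session
                         [0.2, 0.8]])
pr_p_given_z = np.array([[0.7, 0.2, 0.1],   # Pr(p_j | z_k), one row per factor
                         [0.1, 0.3, 0.6]])

def sample_observation():
    """Draw one (session, pageview) pair by following the three generative steps."""
    i = rng.choice(len(pr_u), p=pr_u)                         # select a session u_i
    k = rng.choice(pr_z_given_u.shape[1], p=pr_z_given_u[i])  # select a factor z_k given u_i
    j = rng.choice(pr_p_given_z.shape[1], p=pr_p_given_z[k])  # generate a pageview p_j given z_k
    return int(i), int(j)  # z_k stays hidden: only (u_i, p_j) is observed

# Summing out the latent factor gives the joint probability Pr(u_i, p_j).
joint = pr_u[:, None] * (pr_z_given_u @ pr_p_given_z)
print(joint.sum())
```

Because the factor is never observed, only the pairs (ui, pj) reach the data; the `joint` table is what the model assigns to each such pair.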
The probabilistic latent factor model likelihood
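The formula for this slide is not present in the text. As a hedged reconstruction, the generative process for (ui, pj) implies the joint probability below, and the usual PLSA log-likelihood for the usage data follows from it. Treating the entry UP_ij of the session-pageview matrix as the observation weight is an assumption.

```latex
\Pr(u_i, p_j) = \sum_{k=1}^{l} \Pr(z_k)\,\Pr(u_i \mid z_k)\,\Pr(p_j \mid z_k)

L(U, P) = \sum_{i=1}^{m} \sum_{j=1}^{n} UP_{ij}\,\log \Pr(u_i, p_j)
```

The content observations (at, pj) would contribute an analogous term built from AP_tj and Pr(at, pj). How the two terms are combined in the joint model cannot be recovered from the slide.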
Expectation-Maximization (EM) algorithm • Two alternating steps: the Expectation (E) step and the Maximization (M) step • Result: Pr(zk), Pr(ui|zk), Pr(at|zk), Pr(pj|zk) for each zk ∈ Z, ui ∈ U, at ∈ A, and pj ∈ P.
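A minimal sketch of the E and M steps, restricted to the usage matrix UP for brevity. This is standard PLSA fitting, not necessarily the exact joint update used in the presentation; the function name `plsa_em`, the toy matrix, and the iteration count are assumptions. The content term Pr(at|zk) would be estimated analogously from AP.

```python
import numpy as np

def plsa_em(up, n_factors, n_iter=50, seed=0):
    """Fit Pr(z), Pr(u|z), Pr(p|z) to a session-pageview matrix with EM (usage part only)."""
    rng = np.random.default_rng(seed)
    m, n = up.shape
    eps = 1e-12  # guards against division by zero

    # Random initialization, normalized so each distribution sums to 1.
    pr_z = np.full(n_factors, 1.0 / n_factors)
    pr_u_z = rng.random((n_factors, m))
    pr_u_z /= pr_u_z.sum(axis=1, keepdims=True)
    pr_p_z = rng.random((n_factors, n))
    pr_p_z /= pr_p_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: Pr(z | u, p) is proportional to Pr(z) Pr(u|z) Pr(p|z); shape (l, m, n).
        joint = pr_z[:, None, None] * pr_u_z[:, :, None] * pr_p_z[:, None, :]
        post = joint / (joint.sum(axis=0, keepdims=True) + eps)

        # M-step: re-estimate the parameters from the weighted posteriors.
        weighted = up[None, :, :] * post
        totals = weighted.sum(axis=(1, 2))
        pr_u_z = weighted.sum(axis=2) / (totals[:, None] + eps)
        pr_p_z = weighted.sum(axis=1) / (totals[:, None] + eps)
        pr_z = totals / totals.sum()

    return pr_z, pr_u_z, pr_p_z

# Hypothetical toy matrix: two groups of sessions visiting disjoint groups of pages.
UP = np.array([[3, 2, 0, 0],
               [2, 3, 0, 0],
               [0, 0, 2, 3],
               [0, 0, 3, 2]], dtype=float)

pr_z, pr_u_z, pr_p_z = plsa_em(UP, n_factors=2)
print(np.round(pr_z, 3))
```

EM only reaches a local maximum of the likelihood, so in practice the fit depends on the random initialization and is often repeated with several seeds.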
A Recommendation Framework Based on the Joint PLSA Model • Characterizing Web User Segments • What is a “user segment”? • “Prototypical” user sessions: those with the highest Pr(u|zk) • Using the Joint Probability Model for Personalization
Characterizing Web User Segments • Pr(ui|zk)
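Selecting the prototypical sessions of a factor amounts to ranking sessions by Pr(ui|zk). A minimal sketch follows; the probability table and the helper name `prototypical_sessions` are hypothetical.

```python
import numpy as np

def prototypical_sessions(pr_u_given_z, k, top_n=2):
    """Return the indices of the top_n sessions with the highest Pr(u | z_k)."""
    order = np.argsort(pr_u_given_z[k])[::-1]  # session indices, most probable first
    return order[:top_n].tolist()

# Hypothetical Pr(u_i | z_k): one row per latent factor, one column per session.
pr_u_given_z = np.array([[0.45, 0.40, 0.10, 0.05],
                         [0.05, 0.15, 0.30, 0.50]])

print(prototypical_sessions(pr_u_given_z, 0))  # [0, 1]
print(prototypical_sessions(pr_u_given_z, 1))  # [3, 2]
```

The sessions returned for a factor can then be inspected or aggregated to describe the user segment that the factor represents.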
Using the Joint Probability Model for Personalization
Experiments • Description of the Data Sets • CTI data: based on the server log data from the host Computer Science department; 21,299 user sessions (U) and 692 Web pageviews (P), where each user session consists of 9.8 pageviews on average. • Realty data: based on server logs of a local affiliate of a national real estate company; 24,000 user sessions from 3,800 unique users.
Experiments • The first example generates the latent factors using the PLSA model
Experiments • Use of WAVP (weighted average visit percentage)
Conclusion • Use of complex formulas • Interesting results and a flexible model • The experimental results clearly show that the PLSA model yields a nearly accurate representation of user behavior.