Research Paper Recommender System Scienstein Monica Dăgădiţă
Outline • Article recommender systems • Why Scienstein? • Citation analysis methods • Text mining • Document rating • User interface • Conclusions
Article recommender systems • Purpose: find relevant articles • Methods used • Content-based filtering • Collaborative filtering • Key elements of an article • Citations • Author • Content
Why Scienstein? • 2008 - PhD students Béla Gipp and Jöran Beel • Appeared as an alternative to academic search engines • Improves on simple keyword-based search • Citation analysis • Distance Similarity Index (DSI) • In-text Impact Factor (ItIF) • Author analysis • Source analysis • Implicit/explicit ratings
Citation analysis methods • Problems • Homographs • The Matthew effect • Self-citations • Citation circles • Ceremonial citations • Scienstein’s approach – 4 citation analysis methods
Citation analysis methods (2) • Cited by • Papers that cite the input document – A&B • Reference list • Papers referenced in the input document – C&D • Bibliographic coupling • Papers that cite the same article(s) – BibCo • Co-citation • Papers cited in the same document – CoCit
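The four relations above can be sketched on a toy citation graph. This is only an illustration, not Scienstein's implementation: the graph `CITES`, the paper labels, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy citation graph: each paper maps to the set of papers it cites.
CITES = {
    "A": {"X"},
    "B": {"X"},
    "X": {"C", "D"},
    "Y": {"C", "E"},
}

def cited_by(paper, cites):
    """Papers whose reference list contains `paper`."""
    return {p for p, refs in cites.items() if paper in refs}

def reference_list(paper, cites):
    """Papers referenced by `paper`."""
    return set(cites.get(paper, set()))

def bibliographic_coupling(paper, cites):
    """Other papers sharing at least one reference with `paper`,
    mapped to the number of shared references."""
    own = cites.get(paper, set())
    return {
        p: len(own & refs)
        for p, refs in cites.items()
        if p != paper and own & refs
    }

def co_citation(paper, cites):
    """Papers cited together with `paper` in at least one document,
    mapped to the number of documents citing both."""
    counts = defaultdict(int)
    for refs in cites.values():
        if paper in refs:
            for other in refs - {paper}:
                counts[other] += 1
    return dict(counts)
```

With `X` as the input document, `cited_by` returns A and B, `reference_list` returns C and D, and `bibliographic_coupling` finds Y, because X and Y both cite C. Calling `co_citation` on C finds D and E, each cited alongside C in one document.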
Citation analysis methods (3) • In-text citation frequency analysis (ICFA) • The frequency with which a research paper is cited within a document • In-text Impact Factor (ItIF) • The higher the ItIF, the more closely the input document is related to the cited document
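A minimal sketch of the counting step behind ICFA, assuming numeric citation markers such as `[3]` or `[1, 2]`. The slides do not give the ItIF formula, so the raw count below is only a stand-in; `MARKER` and `in_text_citation_counts` are hypothetical names.

```python
import re
from collections import Counter

# Assumption: citations appear as bracketed numbers, e.g. "[3]" or "[1, 2]".
MARKER = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")

def in_text_citation_counts(text):
    """Count how often each reference number is cited in the body text."""
    counts = Counter()
    for match in MARKER.finditer(text):
        for ref in match.group(1).split(","):
            counts[int(ref)] += 1
    return counts

sample = "Foo [1]. Bar [1, 2]. Baz [3] and again [1]."
counts = in_text_citation_counts(sample)
```

In `sample`, reference 1 is cited three times and references 2 and 3 once each, so under this stand-in measure reference 1 would be treated as the most closely related to the citing document.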
Citation analysis methods (4) • In-text citation distance analysis (ICDA) • The distance between references within a text indicates the degree of their similarity • Distance Similarity Index (DSI) • Calculates the similarity of two documents based on the citation distance
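The idea of citation distance can be sketched as follows, assuming single numeric markers such as `[2]` and measuring distance in sentences. The slides do not define how DSI is computed, so `toy_similarity` is a made-up placeholder, not the DSI; all names here are hypothetical.

```python
import re

# Assumption: each citation is a single bracketed number, e.g. "[2]".
MARKER = re.compile(r"\[(\d+)\]")

def citation_positions(text):
    """Map each reference number to the indices of the sentences citing it."""
    positions = {}
    sentences = re.split(r"(?<=[.!?])\s+", text)
    for index, sentence in enumerate(sentences):
        for match in MARKER.finditer(sentence):
            positions.setdefault(int(match.group(1)), []).append(index)
    return positions

def min_sentence_distance(positions, ref_a, ref_b):
    """Smallest sentence gap between any citation of ref_a and of ref_b,
    or None if either reference is never cited."""
    if ref_a not in positions or ref_b not in positions:
        return None
    return min(abs(a - b) for a in positions[ref_a] for b in positions[ref_b])

def toy_similarity(distance):
    """Placeholder score (NOT the DSI): closer citations score higher."""
    return 1.0 / (1.0 + distance)

sample = "Method one [1] extends [2]. Unrelated background. Later work [3] differs."
pos = citation_positions(sample)
```

Here references 1 and 2 are cited in the same sentence (distance 0), while references 1 and 3 are two sentences apart, so the placeholder score ranks the pair 1 and 2 as more similar.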
Text mining • Existing techniques • Additional features • Classification based on details given in the acknowledgements section • Collaborative annotations and classifications • Creating new categories • classifying publications about archaeological sites according to their geographic location -> Google Maps Extension
Document rating • Explicit ratings • Improve a user’s own recommendation accuracy • Problem: a large number of ratings is needed • Implicit ratings • Time spent with the mouse over a paragraph • Time spent reading an article • Printed articles
Conclusions • Scienstein - the first hybrid recommender system for research papers • Known methods • Keyword analysis • Ratings • New methods • In-text Impact Factor (ItIF) • Distance Similarity Index (DSI) • Hybrid system (content-based and collaborative filtering) => more powerful tool