CSE 450 – Web Mining Seminar, Professor Brian D. Davison, Fall 2005. A Presentation on "When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics" by K. Bharat & G. A. Mihaila (WWW10 Conference, May 2001, Hong Kong), presented by Osama Ahmed Khan, 10/06/2005.
Problem • Query on Popular Topic • Content Analysis
Solution • Most Authoritative Pages
Technical Terms • Expert • Recommendation • Non-affiliation
Hilltop Algorithm • Expert Lookup • Detecting Host Affiliation • Expert Selection • Expert Indexing • Target Ranking • Computing Expert Score • Computing Target Score
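Detecting Host Affiliation • Conditions (two hosts are affiliated if either holds) • Same first 3 octets of IP address (e.g. 127.0.0.1 and 127.0.0.15) • Same rightmost non-generic token of hostname (e.g. www.ibm.com and www.ibm.co.mx) • Union-Find Algorithm (affiliation is transitive, so hosts are grouped into classes)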
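The two affiliation conditions and the union-find step can be sketched in Python as below. This is a minimal illustration, not the authors' implementation: the names `GENERIC_TOKENS`, `ip_prefix`, `host_key`, `UnionFind` and `affiliation_classes`, the contents of the generic-token set, and the input format (a dict from hostname to IPv4 string) are all assumptions. The example addresses come from reserved documentation ranges.

```python
# Hypothetical set of "generic" hostname tokens; the real list is an assumption.
GENERIC_TOKENS = {"com", "co", "org", "net", "edu", "gov", "mx", "uk"}


def ip_prefix(ip):
    """Return the first three octets of a dotted IPv4 address."""
    return ".".join(ip.split(".")[:3])


def host_key(hostname):
    """Return the rightmost non-generic token of a hostname."""
    for token in reversed(hostname.lower().split(".")):
        if token not in GENERIC_TOKENS:
            return token
    return hostname.lower()  # every token was generic: fall back to the full name


class UnionFind:
    """Disjoint-set structure over arbitrary hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def affiliation_classes(hosts):
    """hosts: dict hostname -> IPv4 string. Returns a UnionFind over hostnames.

    Each host is merged with a node for its IP prefix and a node for its
    hostname token, so hosts sharing either one end up in the same class.
    """
    uf = UnionFind()
    for name, ip in hosts.items():
        uf.union(name, "ip:" + ip_prefix(ip))
        uf.union(name, "tok:" + host_key(name))
    return uf


hosts = {
    "www.ibm.com": "192.0.2.1",
    "www.ibm.co.mx": "198.51.100.7",   # same token "ibm", different network
    "example.org": "192.0.2.200",      # same first three octets as www.ibm.com
    "www.acme.net": "203.0.113.3",     # unrelated
}
uf = affiliation_classes(hosts)
same = uf.find("www.ibm.co.mx") == uf.find("example.org")
```

Because affiliation is merged transitively, `www.ibm.co.mx` and `example.org` land in the same class even though they share neither an IP prefix nor a hostname token directly.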
Expert Selection • Retrieve all webpages with: Out-degree > Threshold (k) (e.g. k = 5) • Expert will have: URLs pointing to k distinct non-affiliated hosts
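The selection rule above can be sketched as a simple filter. This is a hedged illustration only: `is_expert` and its parameters are hypothetical names, and `affiliation_class` is assumed to be any callable that maps a hostname to an affiliation-class id (for example, a union-find root).

```python
from urllib.parse import urlparse


def is_expert(page_host, out_urls, affiliation_class, k=5):
    """Return True if the page has out-degree > k and links to at least k
    distinct affiliation classes other than its own.

    affiliation_class: hypothetical callable, hostname -> class id.
    """
    if len(out_urls) <= k:  # out-degree must exceed the threshold
        return False
    own_class = affiliation_class(page_host)
    classes = set()
    for url in out_urls:
        host = urlparse(url).hostname
        if host is None:  # skip URLs without a host part
            continue
        cls = affiliation_class(host)
        if cls != own_class:
            classes.add(cls)
    return len(classes) >= k


# Toy affiliation function (assumption): second-to-last hostname token.
toy_class = lambda h: h.split(".")[-2]

distinct = ["http://site%d.com/" % i for i in range(6)]
same_host = ["http://site1.com/page%d" % i for i in range(6)]
print(is_expert("portal.com", distinct, toy_class))   # six distinct hosts
print(is_expert("portal.com", same_host, toy_class))  # six links, one host
```

Counting affiliation classes rather than raw hostnames is what keeps a page that links heavily within one organization from qualifying.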
Expert Indexing • Inverted Index • Mapping Keywords to Experts • Key Phrases • Match Positions
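A keyword-to-expert inverted index that records key phrases and match positions might look like the sketch below. The function name `build_expert_index`, the input layout, and the level labels "title" and "anchor" are assumptions for illustration; the slide does not specify a storage format.

```python
from collections import defaultdict


def build_expert_index(experts):
    """experts: dict expert_url -> list of (level, phrase_text) key phrases.

    Returns a dict keyword -> list of postings
    (expert_url, phrase_no, level, position_in_phrase).
    """
    index = defaultdict(list)
    for url, phrases in experts.items():
        for phrase_no, (level, text) in enumerate(phrases):
            for position, word in enumerate(text.lower().split()):
                index[word].append((url, phrase_no, level, position))
    return index


experts = {
    "http://e1.example/": [
        ("title", "Best Car Reviews"),
        ("anchor", "car dealers"),
    ],
}
index = build_expert_index(experts)
print(index["car"])
```

Each posting keeps the phrase number and the position inside the phrase, so a later scoring step can tell which key phrase matched and where.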
Computing Expert Score • Condition • At least 1 URL with all query keywords • Expert Score: (S0, S1, S2), for a query q with k keywords • Si = SUM{key phrases p with k-i query terms} LevelScore(p) * FullnessFactor(p,q) • Expert_Score = 2^32 * S0 + 2^16 * S1 + S2
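The expert score formula can be worked through in a short sketch. Several pieces here are assumptions, since the slide does not define them: the level scores are passed in by the caller, `fullness_factor` uses an assumed definition (1 when a phrase has at most two terms outside the query, otherwise reduced in proportion to the surplus), and the condition on URLs containing all query keywords is omitted.

```python
def fullness_factor(phrase_words, query_terms):
    """Assumed definition (not given on the slide): penalize phrases with
    more than two terms that are not in the query."""
    surplus = sum(1 for w in phrase_words if w not in query_terms)
    if surplus <= 2:
        return 1.0
    return 1.0 - (surplus - 2) / len(phrase_words)


def expert_score(phrases, query):
    """phrases: list of (level_score, phrase_words); query: list of keywords.

    S[i] sums LevelScore * FullnessFactor over key phrases that contain
    exactly k - i of the k query terms, for i = 0, 1, 2.
    """
    k = len(query)
    query_terms = set(query)
    s = [0.0, 0.0, 0.0]
    for level_score, words in phrases:
        matched = query_terms & set(words)
        i = k - len(matched)
        if not matched or i > 2:
            continue  # phrase contributes to none of S0, S1, S2
        s[i] += level_score * fullness_factor(words, query_terms)
    return 2**32 * s[0] + 2**16 * s[1] + s[2]


phrases = [
    (16, ["best", "car", "reviews"]),  # hypothetical level score 16
    (1, ["car", "dealers"]),           # hypothetical level score 1
]
score = expert_score(phrases, ["car", "reviews"])
```

The large weights 2^32 and 2^16 make the combination effectively lexicographic: any gain in S0 outweighs S1 and S2 as long as those stay below 2^16.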
Computing Target Score • Condition • At least 2 non-affiliated experts • Target Score: • Edge_Score(E,T) = Expert_Score(E) * SUM{query keywords w} occ(w,T) • Target_Score(T) = SUM{experts E} Edge_Score(E,T)
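As a rough sketch of the target scoring step, the code below sums edge scores per target and keeps only targets reached from at least two affiliation classes. The function name `target_scores`, the tuple layout of `edges`, and the use of a precomputed affiliation-class id per expert are assumptions; this simplified version sums every edge and does not treat affiliated experts specially beyond the two-expert condition.

```python
from collections import defaultdict


def target_scores(edges, min_experts=2):
    """edges: list of (expert_class, expert_score, target_url, occ), where
    occ maps each query keyword w to occ(w, T) for that expert/target edge.

    Returns target_url -> score for targets pointed to by experts from at
    least min_experts distinct affiliation classes.
    """
    totals = defaultdict(float)
    classes = defaultdict(set)
    for expert_class, score, target, occ in edges:
        totals[target] += score * sum(occ.values())  # Edge_Score(E, T)
        classes[target].add(expert_class)
    return {t: s for t, s in totals.items() if len(classes[t]) >= min_experts}


edges = [
    ("A", 10.0, "t1", {"car": 2, "reviews": 1}),
    ("B", 5.0, "t1", {"car": 1, "reviews": 1}),
    ("A", 10.0, "t2", {"car": 1, "reviews": 1}),  # only one class points to t2
]
ranked = target_scores(edges)
```

Here `t1` scores 10 * 3 + 5 * 2 = 40, while `t2` is dropped because only one affiliation class recommends it.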
Evaluation • Locating Specific Popular Targets
Evaluation (Contd.) • Gathering Relevant Pages
Conclusion • Characteristics • Popular Queries • Expert Subset • Hilltop vs. • PageRank • Topic Distillation