A Survey on Social Network Search Ranking
Web vs. Social Networks • Limitations of web (hyperlink-based) search • It underestimates recently published content • It is biased in favor of large communities (e.g., is "Michael Jordan" the basketball player or the computer scientist?)
Roadmap • PeerSpective 1.0 (HotNets ’06) • Demonstrates why social network search matters • Network-Aware Searching (VLDB ’08) • Query content + importance of each user (relative to the querying user) • Efficient Search Ranking in Social Networks (CIKM ’07) • Raises some challenges of social (network-aware) searching
PeerSpective 1.0 (HotNets ’06) • An experiment that uses social networks to search the Web • Idea: users can query the pages their friends have viewed • Results from friends appear alongside Google results • Ranking:
PeerSpective Experimental Results • Ran PeerSpective with 10 users for 1 month • 51,410 distinct URLs viewed • 1,730 Google searches • Google contains only 62.5% of the URLs • 30.4% of URLs were previously viewed by someone in the network • 13.3% of URLs were previously viewed but not in Google • 7.7% of (top 10) result clicks are on PeerSpective-only results
Network-Aware Searching (VLDB ’08) • Query content + importance of users (relative to the querying user) • Overlap-based similarity • Indirectly connected users • Add a uniform background • Social frequency • tf_u(d,t) is typically 0 or 1
Network-Aware Searching Example • O(A,A)=1, O(A,B)=2/4, O(B,C)=2/4, O(C,D)=2/5, O(A,E)=2/4, O(E,D)=0/5 • P_A(D)=max(1/10, 0)=1/10 • F_A(D)=0.1×1/5+(1−0.1)×1/10=0.11 • Similarly, F_A(A)=0.92, F_A(B)=0.47, F_A(C)=0.245, F_A(E)=0.47 • sf_A(z,a)=0.92×1 + 0.47×1 + 0.245×0 + 0.11×1 + 0.47×0 = 1.5 • [Figure: tags assigned to document z by users A, B, C, D, and E; the tag sets shown are {c,d}, {a,c}, {a,d,e}, {a,b}, and {b,c}]
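The arithmetic in this example can be reproduced with a short Python sketch. It assumes, based only on the numbers on the slide, that P_A(u) is the best product of overlap values along any path from A to u, that the uniform background weight is 0.1 spread over the 5 users, and that users A, B, and D are the ones who tagged z with a (the three terms multiplied by 1 in the sf_A sum). The function names are hypothetical and not taken from the paper.

```python
# Overlap values O(u, v) copied from the example; edges are treated as undirected.
OVERLAP = {
    ("A", "B"): 2 / 4,
    ("B", "C"): 2 / 4,
    ("C", "D"): 2 / 5,
    ("A", "E"): 2 / 4,
    ("E", "D"): 0 / 5,
}
USERS = ["A", "B", "C", "D", "E"]
ALPHA = 0.1  # assumed weight of the uniform background (the 0.1 in the example)


def neighbors(user):
    """Yield (neighbor, overlap) pairs for every edge touching `user`."""
    for (x, y), weight in OVERLAP.items():
        if x == user:
            yield y, weight
        elif y == user:
            yield x, weight


def path_score(src, dst, visited=frozenset()):
    """P_src(dst): best product of overlaps over all simple paths (assumed reading)."""
    if src == dst:
        return 1.0
    visited = visited | {src}
    best = 0.0
    for nxt, weight in neighbors(src):
        if nxt not in visited:
            best = max(best, weight * path_score(nxt, dst, visited))
    return best


def importance(src, dst):
    """F_src(dst): mix the uniform background with the path score."""
    return ALPHA / len(USERS) + (1 - ALPHA) * path_score(src, dst)


def social_frequency(src, tf):
    """sf_src(d, t): importance-weighted sum of per-user tag frequencies tf_u(d, t)."""
    return sum(importance(src, user) * count for user, count in tf.items())


# tf_u(z, a) for each user, inferred from the 1/0 factors in the sf_A(z, a) sum.
tf_z_a = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0}
print(round(importance("A", "D"), 3))            # F_A(D)
print(round(social_frequency("A", tf_z_a), 3))   # sf_A(z, a)
```

Enumerating all simple paths is only practical on a toy graph like this one; it is used here to keep the sketch close to the slide's max-over-paths computation.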
Efficient Search Ranking in Social Networks (CIKM ’07) • Considers only usernames as query terms • Idea: search ranking is based on the path length between users • Challenge: the large size of social networks prevents efficient shortest-path computation at query time • Orkut: 40 million users • Facebook: more than 200 million active users
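Ranking by path length can be illustrated with a hop-limited breadth-first search. The sketch below is a minimal illustration, not the method from the paper: the adjacency-list graph, the user names, and the `max_hops` cutoff are all assumptions made for the example.

```python
from collections import deque


def bfs_distances(graph, source, max_hops=3):
    """Return {user: hop distance} for users within max_hops of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        if dist[user] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in graph.get(user, ()):
            if friend not in dist:
                dist[friend] = dist[user] + 1
                queue.append(friend)
    return dist


def rank_by_distance(graph, source, candidates, max_hops=3):
    """Sort matching usernames by hop distance; out-of-range users go last."""
    dist = bfs_distances(graph, source, max_hops)
    out_of_range = max_hops + 1
    return sorted(candidates, key=lambda u: (dist.get(u, out_of_range), u))


# Hypothetical toy friendship graph (undirected, stored as adjacency lists).
graph = {
    "ann": ["bob"],
    "bob": ["ann", "carl"],
    "carl": ["bob", "dave"],
    "dave": ["carl"],
    "eve": [],
}
print(rank_by_distance(graph, "ann", ["dave", "eve", "bob", "carl"], max_hops=2))
```

With `max_hops=2`, "dave" (three hops away) and "eve" (disconnected) both fall outside the search radius and are ordered alphabetically after the reachable users. The cost of the traversal grows with the number of users inside the radius, which is the scalability concern raised above.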
Efficient Search Ranking in Social Networks: Approaches • Pre-compute the distance between every pair of users • Trivial • Non-scalable: 40 million users → 40,000,000^2 = 16×10^14 pairs • On-the-fly ranking • BFS in real time • If each user has 100 friends, distance 3 → 100^3 = 1,000,000 users • Co-friend ranking • A mixture of the two approaches above • Store “friends of friends” for each user and search within that list
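The co-friend idea can be sketched as a small precomputed index: for each user, store friends (distance 1) and friends of friends (distance 2), then answer username queries from that list alone. This is an illustrative sketch under assumed data structures; the function names, the prefix-matching rule, and the toy graph are hypothetical and are not described in the source.

```python
def build_cofriend_index(graph):
    """Precompute, per user, a {other_user: distance} map for distances 1 and 2."""
    index = {}
    for user, friends in graph.items():
        reach = {friend: 1 for friend in friends}
        for friend in friends:
            for fof in graph.get(friend, ()):
                # Keep the shorter distance if fof is already a direct friend.
                if fof != user and fof not in reach:
                    reach[fof] = 2
        index[user] = reach
    return index


def cofriend_search(index, user, name_prefix):
    """Return usernames matching the prefix, nearest first, using only the index."""
    reach = index.get(user, {})
    hits = [u for u in reach if u.startswith(name_prefix)]
    return sorted(hits, key=lambda u: (reach[u], u))


# Hypothetical toy friendship graph (undirected, stored as adjacency lists).
graph = {
    "ann": ["bob", "cara"],
    "bob": ["ann", "carl"],
    "cara": ["ann"],
    "carl": ["bob", "dave"],
    "dave": ["carl"],
}
index = build_cofriend_index(graph)
print(cofriend_search(index, "ann", "car"))
```

Under the slide's assumption of 100 friends per user, each stored list holds at most 100 + 100 × 100 = 10,100 entries, far fewer than the roughly 1,000,000 users a distance-3 BFS could touch, at the cost of missing matches more than two hops away (such as "dave" for "ann" here).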
Conclusion • Network-aware search is not a big problem • However, how can we search “in real time”? • Search a limited number of hops • Approximate shortest paths • Pre-compute (partial) data