Authoritative Sources in a Hyperlinked Environment • Jon M. Kleinberg • Presented By: Talin Kevorkian • Summer 2010
Overview • Why Do We Care? • Introduction Information • Objective • Approaches and Observed Results • Related Work • Generalization • Conclusion • Evaluation of Pros and Cons
Why Do We Care? • Complexity of the WWW as a Hypertext Corpus • Nature of the Hyperlinked Environment Structure • Efficiency (Longer Response Times) and Storage Problems Caused by the Huge Number of Results Returned to the User
Introduction Information • Query Types • Specific • E.g. “Does Windows 7 Support Oracle 10g?” • Scarcity Problem • Broad-Topic • E.g. “SQL Programming Language” • Abundance Problem • Authority Notion • Similar-Page • E.g. “Similar Pages to Oracle.com”
Introduction Information • Link-Based Model • Encoding Latent Human Judgment • Conferred Authority • Creating a Balance Between Popularity and Relevance • Relation Between Authorities and Hubs
Objective • Presenting a Link-Based Model for the Conferral of Authority • Exploring Authoritative WWW Sources on a Global Scale
Approaches and Observed Results • Focused Subgraph Algorithm for WWW • Authorities and Hubs Computation • Approach for Similar-Page Queries • Sample Observed Results
Focused Subgraph Algorithm for WWW • Inputs: • Query String σ • Text-Based Search Engine • Outputs: • Set of Hyperlinked Pages as a Directed Graph G(V,E) • Root Set Rσ • Base Set Sσ • Relatively Small in Size • Rich in Relevant Pages • Containing Most of the Strongest Authorities • Link Types in G[Sσ] • Transverse (Between Pages in Different Domains) • Intrinsic (Between Pages in the Same Domain)
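The construction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the web is a toy in-memory dictionary, `search_top` is a hypothetical stand-in for the text-based search engine, and `domain_of` is a hypothetical helper that extracts a domain name. The parameters `t` (size of the root set) and `d` (maximum number of in-linking pages added per root page) follow the paper's description, where the in-linking pages are an arbitrary subset; sorting is used here only to make the toy run reproducible.

```python
from typing import Callable, Dict, List, Set, Tuple


def focused_subgraph(
    query: str,
    search_top: Callable[[str, int], List[str]],  # hypothetical search-engine stand-in
    out_links: Dict[str, Set[str]],               # page -> pages it points to
    in_links: Dict[str, Set[str]],                # page -> pages pointing to it
    t: int = 200,
    d: int = 50,
) -> Set[str]:
    """Grow the root set R_sigma into the larger set S_sigma."""
    root = search_top(query, t)   # R_sigma: top-t text-search results
    base = set(root)              # S_sigma starts as a copy of R_sigma
    for page in root:
        base |= out_links.get(page, set())            # all pages the root page points to
        pointing = sorted(in_links.get(page, set()))  # sorted only for reproducibility
        base |= set(pointing[:d])                     # at most d pages pointing to it
    return base


def transverse_edges(
    base: Set[str],
    out_links: Dict[str, Set[str]],
    domain_of: Callable[[str], str],              # hypothetical domain extractor
) -> Set[Tuple[str, str]]:
    """Keep only links between different domains; intrinsic links are dropped."""
    return {
        (p, q)
        for p in base
        for q in out_links.get(p, set())
        if q in base and domain_of(p) != domain_of(q)
    }


# Toy web (all page names are made up for illustration).
toy_out = {
    "a.com/1": {"b.com/1", "a.com/2"},
    "c.com/1": {"a.com/1"},
    "d.com/1": {"a.com/1"},
    "e.com/1": {"a.com/1"},
}
toy_in: Dict[str, Set[str]] = {}
for src, targets in toy_out.items():
    for dst in targets:
        toy_in.setdefault(dst, set()).add(src)

base = focused_subgraph("toy query", lambda q, t: ["a.com/1"][:t], toy_out, toy_in, t=1, d=2)
edges = transverse_edges(base, toy_out, lambda url: url.split("/")[0])
```

With `d=2`, only two of the three pages pointing to `a.com/1` enter the set, and the same-domain link from `a.com/1` to `a.com/2` is discarded as intrinsic.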
Authorities and Hubs Computation • Remedy for the Weakness of Ordering Pages by Their In-Degree • Confusion Between Strong “Authorities” and “Universally Popular” Pages • Based on the Mutually Reinforcing Relationship Between Hubs and Authorities
Authorities and Hubs Computation • Iterate Algorithm • Input: • Set of n Linked Pages Gσ • Outputs: • Updated Authority Weights (via Operation I) • Updated Hub Weights (via Operation O) • Filter Algorithm • Input: • Set of n Linked Pages Gσ • Outputs: • Pages with the Top c Authority Weights • Pages with the Top c Hub Weights
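A minimal sketch of the Iterate and Filter steps may help here. It assumes the graph is given as a list of page names and a list of directed `(source, target)` edges; the function names `iterate` and `filter_top` and the toy graph are illustrative choices, not taken from the slides. Operation I sets each authority weight to the sum of the hub weights of the pages pointing to it, operation O sets each hub weight to the sum of the authority weights of the pages it points to, and both vectors are normalized so that their squares sum to 1.

```python
import math
from typing import Dict, List, Tuple


def normalize(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale the weights so that their squares sum to 1."""
    norm = math.sqrt(sum(v * v for v in weights.values()))
    return {p: (v / norm if norm else 0.0) for p, v in weights.items()}


def iterate(pages: List[str], edges: List[Tuple[str, str]], k: int = 20):
    """Run k rounds of the I and O operations; return (authority, hub) weights."""
    x = {p: 1.0 for p in pages}  # authority weights
    y = {p: 1.0 for p in pages}  # hub weights
    for _ in range(k):
        new_x = {p: 0.0 for p in pages}
        for q, p in edges:       # I operation: q points to p
            new_x[p] += y[q]
        new_y = {p: 0.0 for p in pages}
        for p, q in edges:       # O operation: p points to q, uses the new x
            new_y[p] += new_x[q]
        x, y = normalize(new_x), normalize(new_y)
    return x, y


def filter_top(pages: List[str], edges: List[Tuple[str, str]], k: int, c: int):
    """Report the c pages with the largest authority and hub weights."""
    x, y = iterate(pages, edges, k)
    top_authorities = sorted(pages, key=lambda p: (-x[p], p))[:c]
    top_hubs = sorted(pages, key=lambda p: (-y[p], p))[:c]
    return top_authorities, top_hubs


# Toy graph (made-up names): three hub-like pages and two authority-like pages.
pages = ["h1", "h2", "h3", "a1", "a2"]
edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a1"), ("h2", "a2"), ("h3", "a1")]
top_authorities, top_hubs = filter_top(pages, edges, k=20, c=2)
```

In this toy graph `a1` outranks `a2` as an authority because all three hub-like pages point to it, while `h1` and `h2` outrank `h3` as hubs because they point to both authorities.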
Approach for Similar-Page Queries • First Step: Ask What WWW Users Consider Related to a Page When They Create Pages and Hyperlinks • Second Step: Apply Link Structure to the Concept of “Similarity” • Third Step: Use the Concept of Authorities and Hubs
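The three steps can be sketched as follows, assuming the paper's variant in which the root set is built from pages that point to the page of interest instead of from text-search results. The function name `similar_pages`, the toy page names, and the omission of intrinsic-link filtering are simplifications made for this illustration.

```python
import math
from collections import defaultdict
from typing import List, Tuple


def similar_pages(p: str, edges: List[Tuple[str, str]],
                  t: int = 200, d: int = 50, k: int = 20, c: int = 3) -> List[str]:
    """Return the c top authorities in the neighborhood of page p."""
    out_links, in_links = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        out_links[src].add(dst)
        in_links[dst].add(src)

    # Root set R_p: up to t pages pointing to p (sorted only for reproducibility).
    root = sorted(in_links[p])[:t]
    base = set(root)
    for page in root:
        base |= out_links[page]                  # pages the root page points to
        base |= set(sorted(in_links[page])[:d])  # at most d pages pointing to it

    sub_edges = [(u, v) for u, v in edges if u in base and v in base]
    auth = {q: 1.0 for q in base}
    hub = {q: 1.0 for q in base}
    for _ in range(k):
        new_auth = {q: 0.0 for q in base}
        for u, v in sub_edges:                   # authority update
            new_auth[v] += hub[u]
        new_hub = {q: 0.0 for q in base}
        for u, v in sub_edges:                   # hub update, using new authorities
            new_hub[u] += new_auth[v]
        a_norm = math.sqrt(sum(w * w for w in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(w * w for w in new_hub.values())) or 1.0
        auth = {q: w / a_norm for q, w in new_auth.items()}
        hub = {q: w / h_norm for q, w in new_hub.items()}
    return sorted(base, key=lambda q: (-auth[q], q))[:c]


# Toy data (made-up names): three list pages that link to car makers.
toy_edges = [
    ("list1.example", "honda.example"), ("list1.example", "toyota.example"),
    ("list1.example", "ford.example"),
    ("list2.example", "honda.example"), ("list2.example", "toyota.example"),
    ("list3.example", "honda.example"),
]
related = similar_pages("honda.example", toy_edges)
```

The page of interest itself usually appears among the top authorities, since every page in the root set points to it.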
Sample Observed Results (For Broad-Topic Queries)
Sample Observed Results (For Similar-Page Queries)
Related Work • Link Structure is Related to: • Definitions of Standing, Impact, and Influence • WWW Ranking Techniques • Data Clustering
Standing, Impact, and Influence Concepts • Social Networks • Proposed Standing Measures • Katz Theory: Based on Path Counting • Hubbell Theory: Based on Propagating Node Weights • Scientific Citations • Proposed Impact/Influence Measures • Garfield’s Impact Theory • Pinski-Narin Influence Theory
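As an illustration of the path-counting idea behind Katz's measure, the sketch below sums, for each node, the number of paths of every length r that end at it, with each path weighted by b^r. The truncation at `max_len`, the default value of `b`, and the function name are assumptions made for this example. Paths are counted through powers of the adjacency matrix, so on graphs with cycles the same node can appear more than once on a path, and `b` must be small enough for the untruncated sum to converge.

```python
from typing import Dict, List, Tuple


def katz_standing(nodes: List[str], edges: List[Tuple[str, str]],
                  b: float = 0.5, max_len: int = 10) -> Dict[str, float]:
    """Truncated Katz standing: sum over r <= max_len of b^r * (paths of length r ending at the node)."""
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        adj[index[src]][index[dst]] += 1.0

    def matmul(m1, m2):
        return [[sum(m1[i][l] * m2[l][j] for l in range(n)) for j in range(n)]
                for i in range(n)]

    standing = {node: 0.0 for node in nodes}
    power = adj   # adjacency matrix to the power r; entry [i][j] counts paths i -> j of length r
    weight = b    # b to the power r
    for _ in range(max_len):
        for j, node in enumerate(nodes):
            standing[node] += weight * sum(power[i][j] for i in range(n))
        power = matmul(power, adj)
        weight *= b
    return standing


# Toy chain a -> b -> c (made-up example).
scores = katz_standing(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

Node `c` is reached by one path of length 1 and one of length 2, so it accumulates 0.5 + 0.25, while `b` is reached only by a single path of length 1.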
WWW Ranking Techniques • Ranking Measure Proposals: • Botafogo-Rivlin-Shneiderman Theory • Carrière-Kazman Theory • Brin-Page Theory and Its Contrast with This Paper's Approach
Data Clustering • Clustering Needs: • Similarity Functions • Bibliographic Coupling • Co-Citation • Cluster-Producing Functions • Small-Griffith Approach • Dimension Reduction • Spectral Graph Partitioning • Centroid Scaling
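The two similarity functions named on this slide can be illustrated with a short sketch. It assumes the standard definitions: the bibliographic coupling of two documents is the number of documents both of them cite, and their co-citation count is the number of documents that cite both. The function names and the toy citation data are made up for this example.

```python
from typing import List, Tuple


def bibliographic_coupling(edges: List[Tuple[str, str]], p: str, q: str) -> int:
    """Number of documents cited by both p and q."""
    cited_by_p = {dst for src, dst in edges if src == p}
    cited_by_q = {dst for src, dst in edges if src == q}
    return len(cited_by_p & cited_by_q)


def co_citation(edges: List[Tuple[str, str]], p: str, q: str) -> int:
    """Number of documents that cite both p and q."""
    citing_p = {src for src, dst in edges if dst == p}
    citing_q = {src for src, dst in edges if dst == q}
    return len(citing_p & citing_q)


# Toy citation graph (made-up names): each edge is (citing document, cited document).
citations = [("d1", "x"), ("d1", "y"), ("d2", "x"), ("d2", "y"), ("d3", "y")]
coupling_d1_d2 = bibliographic_coupling(citations, "d1", "d2")
coupling_d1_d3 = bibliographic_coupling(citations, "d1", "d3")
cocited_x_y = co_citation(citations, "x", "y")
```

In matrix terms, with A the adjacency matrix of the citation graph, these counts are the entries of A·Aᵀ (bibliographic coupling) and Aᵀ·A (co-citation), the same two matrices whose eigenvectors give the hub and authority weights.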
Generalization • Specific Queries • Diffusion Concept • Multiple Sets of Hubs and Authorities Can Be Separated from Each Other Because: • The Query String Has Several Meanings, e.g. “Jaguar” • The Query String Is a Highly Polarized Subject, e.g. “Abortion” • The Query String Is Used in Multiple Communities, e.g. “Randomized Algorithms”
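One way to picture how separate sets of authorities emerge is through the non-principal eigenvectors of the matrix AᵀA, which is the route the paper takes. The sketch below is a simplified illustration, assuming plain power iteration with deflation on a small dense matrix; the function names, the number of steps, and the toy graph with two disconnected communities are all assumptions made for this example.

```python
import math
from typing import List, Tuple


def authority_matrix(pages: List[str], edges: List[Tuple[str, str]]) -> List[List[float]]:
    """M = A^T A: M[i][j] counts the pages that point to both page i and page j."""
    idx = {p: i for i, p in enumerate(pages)}
    n = len(pages)
    m = [[0.0] * n for _ in range(n)]
    targets_of = {}
    for src, dst in edges:
        targets_of.setdefault(src, []).append(idx[dst])
    for targets in targets_of.values():
        for i in targets:
            for j in targets:
                m[i][j] += 1.0
    return m


def power_iteration(m: List[List[float]], steps: int = 100):
    """Approximate the dominant eigenvalue and eigenvector of a symmetric PSD matrix."""
    n = len(m)
    v = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(steps):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        value = math.sqrt(sum(x * x for x in w))
        if value == 0.0:
            break
        v = [x / value for x in w]
    return value, v


def top_eigenvectors(m: List[List[float]], count: int = 2):
    """Find `count` eigenpairs by repeatedly removing the dominant one (deflation)."""
    m = [row[:] for row in m]
    n = len(m)
    pairs = []
    for _ in range(count):
        value, v = power_iteration(m)
        pairs.append((value, v))
        for i in range(n):
            for j in range(n):
                m[i][j] -= value * v[i] * v[j]
    return pairs


# Toy graph (made-up names): two communities with no links between them.
pages = ["h1", "h2", "h3", "a1", "a2", "g1", "g2", "b1"]
edges = [(h, a) for h in ("h1", "h2", "h3") for a in ("a1", "a2")]
edges += [("g1", "b1"), ("g2", "b1")]
(first_value, first_vec), (second_value, second_vec) = top_eigenvectors(
    authority_matrix(pages, edges))
strongest_first = pages[max(range(len(pages)), key=lambda i: abs(first_vec[i]))]
strongest_second = pages[max(range(len(pages)), key=lambda i: abs(second_vec[i]))]
```

Here the principal eigenvector concentrates on the authorities of the larger community and the second eigenvector on the authority of the smaller one. On real data a non-principal eigenvector has both positive and negative entries, and the paper inspects the largest entries at each end.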
Generalization: Sample Results
Conclusion • Basic Elements of the Paper's Approach • Applying the Notion of Authoritative Sources • Selecting High-Quality Results • Dealing with the Scale Problem • Exploring the Structure of Hubs and Authorities
Evaluation of Pros and Cons • Pros: • Clearly Describes the Algorithms and Approaches Applied • Provides Tangible Examples and Results • Connects Sufficiently to Related Work • Cons: • Ignores the Textual Content of Pages • Inherent Complexity of Judging Quality • Concentrates Mostly on Broad-Topic Queries
Q & A