Evaluating Hierarchical Clustering of Search Results SPIRE 2005, Buenos Aires Departamento de Lenguajes y Sistemas Informáticos UNED, Spain Juan Cigarrán Anselmo Peñas Julio Gonzalo Felisa Verdejo nlp.uned.es
Overview • Scenario • Assumptions • Features of a Good Hierarchical Clustering • Evaluation Measures • Minimal Browsing Area (MBA) • Distillation Factor (DF) • Hierarchy Quality (HQ) • Conclusion
Scenario • Complex information needs • Compile information from different sources • Inspect the whole list of documents • More than 100 documents • Help to • Find the relevant topics • Discriminate relevant from irrelevant documents • Approach • Hierarchical Clustering – Formal Concept Analysis
Problem • How to define and measure the quality of a hierarchical clustering? • How to compare different clustering approaches?
Previous assumptions • Each cluster contains only those documents fully described by its descriptors • [Figure: two hierarchies over the same documents; e.g. Physics {d1} with children Astrophysics {d4} and Nuclear physics {d2, d3}, versus Physics {d1, d2, d3, d4}]
Previous assumptions • ‘Open world’ perspective • [Figure: hierarchy with Physics {d1} and Jokes {d2} sharing the common child Jokes about physics {d3}]
Good Hierarchical Clustering • The content of the clusters • Clusters should not mix relevant with non-relevant information • [Figure: clusters of relevant (+) and irrelevant (−) documents]
Good Hierarchical Clustering • The hierarchical arrangement of the clusters • Relevant information should be in the same path • [Figure: two hierarchies; in the good one, the relevant (+) documents lie along a single path]
Good Hierarchical Clustering • The number of clusters • Number of clusters substantially lower than the number of documents • How clusters are described • Cognitive load of reading a cluster description • Ability to predict the relevance of the information that it contains (not addressed here)
Evaluation Measures • Criterion • Minimize the browsing effort for finding ALL relevant information • Baseline • The original document list returned by a search engine
Evaluation Measures • Consider • Content of clusters • Hierarchical arrangement of clusters • Size of the hierarchy • Cognitive load of reading a document (in the baseline): Kd • Cognitive load of reading a node descriptor (in the hierarchy): Kn • Requirement • Relevance assessments are available
Minimal Browsing Area (MBA) • The minimal set of nodes the user has to traverse to find ALL the relevant documents while minimising the number of irrelevant ones • [Figure: hierarchy with relevant (+) and irrelevant (−) documents, the MBA highlighted]
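As a rough illustration (not the paper's algorithm), the MBA of a hierarchy can be approximated as the union of the root-to-node paths that lead to relevant documents. The tree encoding and all names below are hypothetical:

```python
# Minimal sketch, assuming each node holds the documents exactly described by
# its descriptors; the MBA is approximated as the union of paths from the root
# to every node whose subtree contains at least one relevant document.

def mba(tree, node, relevant):
    """Return the set of nodes in the (approximate) minimal browsing area.

    tree: dict mapping node -> (docs, children); relevant: set of doc ids.
    """
    docs, children = tree[node]
    area = set()
    for child in children:
        area |= mba(tree, child, relevant)  # recurse into each branch
    # keep this node if it holds a relevant doc or leads to one below
    if area or (relevant & set(docs)):
        area.add(node)
    return area

# Toy hierarchy: root 'n0' with two leaf clusters.
tree = {
    "n0": ([], ["n1", "n2"]),
    "n1": (["d1", "d2"], []),   # d1 is relevant, d2 is not
    "n2": (["d3"], []),         # d3 is not relevant
}
print(sorted(mba(tree, "n0", {"d1"})))  # ['n0', 'n1']
```

With this toy input only the branch containing d1 is traversed; the node n2 is discarded, which is the effort saving the measures below try to quantify.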
Distillation Factor (DF) • Ability to isolate relevant information compared with the original document list (a gain when DF > 1) • DF(L) = Precision(MBA) / Precision(L), i.e. the ratio of documents read in the list to documents read in the MBA • Considers only the cognitive load of reading documents
Distillation Factor (DF) • Example • Document list: Doc 1 (+), Doc 2 (−), Doc 3 (+), Doc 4 (+), Doc 5 (−), Doc 6 (−), Doc 7 (+) • Precision = 4/7 • Precision MBA = 4/5 • DF(L) = (4/5) / (4/7) = 7/5 = 1.4
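The example above can be checked directly; DF is just the ratio of the two precisions (function name and argument layout are ours, not the paper's):

```python
# Distillation Factor as described on the slides:
# DF = Precision(MBA) / Precision(original list).
def distillation_factor(list_rel, list_total, mba_rel, mba_total):
    p_list = list_rel / list_total   # precision of the baseline list
    p_mba = mba_rel / mba_total      # precision over the MBA
    return p_mba / p_list

# Slide example: 7 documents in the list (4 relevant);
# the MBA exposes 5 documents (4 relevant).
print(round(distillation_factor(4, 7, 4, 5), 6))  # 1.4
```

A DF of 1.4 means the user inspects 1.4 times fewer documents through the hierarchy than by scanning the ranked list.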
Distillation Factor (DF) • Counterexample: a bad clustering with a good DF • Precision = 4/8, Precision MBA = 4/4, DF = 8/4 = 2 • Solution: extend the DF measure with the cognitive cost of taking browsing decisions → HQ
Hierarchy Quality (HQ) • Assumption: when a node (in the MBA) is explored, all its lower neighbours have to be considered: some will be in turn explored, some will be discarded • Nview: subset of lower neighbours of each node belonging to the MBA • [Figure: example hierarchy and its MBA, with |Nview| = 8]
Hierarchy Quality (HQ) • Kn and Kd are directly related to the retrieval scenario in which the experiments take place • The researcher must tune K = Kn/Kd before conducting the experiment • HQ > 1 indicates an improvement of the clustering over the original list
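The HQ formula itself was shown as a figure and is not recoverable from this text. A plausible form, assumed here only to make the ingredients concrete (baseline cost of reading the whole list versus the cost of reading the MBA's documents plus its |Nview| node descriptors), would be:

```latex
HQ(L) \;=\; \frac{|L| \cdot K_d}{\,|D_{MBA}| \cdot K_d \;+\; |N_{view}| \cdot K_n\,}
       \;=\; \frac{|L|}{\,|D_{MBA}| \;+\; K\,|N_{view}|\,},
\qquad K = \frac{K_n}{K_d}
```

where |L| is the number of documents in the baseline list and |D_MBA| the number of documents read inside the MBA; this reduces towards DF as the node-reading cost K approaches zero, which matches the slides' description of HQ as DF extended with browsing-decision cost.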
Hierarchy Quality (HQ) • Example • [Figure: the MBA example and the counterexample clustering, with relevant (+) and irrelevant (−) documents]
Conclusions and Future Work • Framework for comparing different clustering approaches taking into account: • Content of clusters • Hierarchical arrangement of clusters • Cognitive load of reading document and node descriptions • Adaptable to the retrieval scenario in which experiments take place • Future work • Conduct user studies to compare their results with the automatic evaluation • Results will reflect the quality of the descriptors • Will be used to fine-tune the Kd and Kn parameters