Ch 9 Relevance feedback and Query expansion

Ch 9 Relevance feedback and Query expansion 2009. 11. 30. 최성빈

Overview • Synonymy - the same concept can be expressed with different words, which hurts the recall of most retrieval systems ex> plane vs. aircraft • Users often refine their queries manually • This chapter covers ways for the system itself to perform query refinement

Overview • Global methods: expand or reformulate the query independently of the query and its search results - Query expansion / reformulation with a thesaurus or WordNet - Query expansion via automatic thesaurus generation - Techniques like spelling correction (discussed in Chapter 3) • Local methods: adjust the query relative to the documents that initially matched it - Relevance feedback - Pseudo relevance feedback, also known as blind relevance feedback - (Global) indirect relevance feedback

9.1 Relevance feedback and pseudo relevance feedback • Relevance feedback (RF): involve the user in the retrieval process to improve the final result set. Procedure • The user issues a (short, simple) query. • The system returns an initial set of retrieval results. • The user marks some returned documents as relevant or nonrelevant. • The system computes a better representation of the information need based on the user feedback. • The system displays a revised set of retrieval results. This process can be repeated. • RF can also be effective in tracking a user's evolving information need: while viewing particular documents, users can refine their understanding of the information they are looking for.

[Figure: relevance feedback example for the query 'bike']

9.1.1 The Rocchio algorithm for relevance feedback • Incorporates relevance feedback information into the vector space model • The underlying theory: we want the query vector that maximizes similarity with the relevant documents while minimizing similarity with the nonrelevant documents • That is, the optimal query is the vector difference between the centroids of the relevant and nonrelevant documents • However, this observation is not terribly useful, precisely because the full set of relevant documents is not known: it is what we want to find
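
The underlying theory can be written out as follows. This is a sketch in the standard textbook notation, which the slide text itself does not spell out: C_r and C_nr are assumed names for the full sets of relevant and nonrelevant documents, and sim is assumed to be cosine similarity.

```latex
\vec{q}_{opt} = \arg\max_{\vec{q}} \left[ \mathrm{sim}(\vec{q}, C_r) - \mathrm{sim}(\vec{q}, C_{nr}) \right]

% Under cosine similarity this becomes the difference of the two centroids:
\vec{q}_{opt} = \frac{1}{|C_r|} \sum_{\vec{d}_j \in C_r} \vec{d}_j
              - \frac{1}{|C_{nr}|} \sum_{\vec{d}_j \in C_{nr}} \vec{d}_j
```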


9.1.1 The Rocchio algorithm for relevance feedback • The Rocchio (1971) algorithm: introduced in Salton's SMART system and widely known since the 1970s
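
A minimal sketch of the Rocchio update, assuming documents and queries are sparse term-weight dictionaries. The function names, the toy vectors, and the default weights (alpha = 1, beta = 0.75, gamma = 0.15) are illustrative choices, not part of any library.

```python
from collections import defaultdict


def centroid(docs):
    """Mean of sparse term-weight vectors (dicts); an empty list gives {}."""
    if not docs:
        return {}
    total = defaultdict(float)
    for doc in docs:
        for term, weight in doc.items():
            total[term] += weight
    return {term: weight / len(docs) for term, weight in total.items()}


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """q_m = alpha * q_0 + beta * centroid(D_r) - gamma * centroid(D_nr)."""
    c_r = centroid(relevant)
    c_nr = centroid(nonrelevant)
    modified = {}
    for term in set(query) | set(c_r) | set(c_nr):
        weight = (alpha * query.get(term, 0.0)
                  + beta * c_r.get(term, 0.0)
                  - gamma * c_nr.get(term, 0.0))
        if weight > 0:  # negative term weights are dropped (treated as 0)
            modified[term] = weight
    return modified


# Hypothetical toy data: one query, two marked relevant, one marked nonrelevant.
q0 = {"bike": 1.0}
rel = [{"bike": 1.0, "cycling": 1.0}, {"bike": 1.0, "helmet": 1.0}]
nonrel = [{"bike": 1.0, "motor": 1.0}]
q_m = rocchio(q0, rel, nonrel)
```

In this toy run the modified query gains the terms "cycling" and "helmet" from the relevant documents, while "motor" ends up with a negative weight and is dropped.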

9.1.1 The Rocchio algorithm for relevance feedback • Relevance feedback can improve both precision and recall, but it has been reported to be most useful for increasing recall. • This is partly because the technique expands the query, and partly an effect of the use case - users who want high recall review the results and search iteratively. • Positive feedback has proven more valuable than negative feedback -> most IR systems set γ < β; α = 1, β = 0.75, γ = 0.15 are reasonable values

9.1.1 The Rocchio algorithm for relevance feedback Relevance feedback variants • Image retrieval systems: positive feedback only -> γ = 0 • Use only the marked nonrelevant document which received the highest ranking from the IR system as negative feedback, so |D_nr| = 1 • Experiments comparing relevance feedback variants have not been conclusive, but one study suggested that the Ide dec-hi variant is the most effective, or at least the most consistent performer. • Ide dec-hi: minimal negative feedback, taken from the nonrelevant documents

9.1.2 Probabilistic relevance feedback • Instead of reweighting the query in a vector space, build a classifier from the documents the user has marked relevant and nonrelevant. One way to do this is a Naïve Bayes probabilistic model • R : Boolean indicator variable expressing the relevance of a document • N : the total number of documents • df_t : the number of documents that contain t • VR : the set of known relevant documents • VR_t : the subset of VR containing t This gives a basis for another way of changing the query term weights
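
The estimates built from these quantities can be sketched as below, using the standard textbook form P(x_t=1|R=1) = |VR_t| / |VR| and P(x_t=1|R=0) = (df_t - |VR_t|) / (N - |VR|). The function name and the sample counts are hypothetical. The second estimate assumes that every document not marked relevant is nonrelevant, which is only reasonable when relevant documents are a small fraction of the collection.

```python
def term_probabilities(n_docs, df_t, n_relevant, n_relevant_with_t):
    """Estimate how likely term t is to occur in relevant vs. nonrelevant documents.

    n_docs            -- N, total number of documents
    df_t              -- number of documents containing t
    n_relevant        -- |VR|, number of known relevant documents
    n_relevant_with_t -- |VR_t|, known relevant documents containing t
    """
    p_given_relevant = n_relevant_with_t / n_relevant
    # Assumption: all documents outside VR are treated as nonrelevant.
    p_given_nonrelevant = (df_t - n_relevant_with_t) / (n_docs - n_relevant)
    return p_given_relevant, p_given_nonrelevant


# Hypothetical counts: t occurs in 8 of 10 known relevant documents
# and in 50 of 1000 documents overall.
p_rel, p_nonrel = term_probabilities(
    n_docs=1000, df_t=50, n_relevant=10, n_relevant_with_t=8
)
```

A term that is much more likely under R = 1 than under R = 0, as in this example, is a candidate for a higher query term weight.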

9.1.3 When does relevance feedback work? Cases that relevance feedback alone cannot solve • Misspellings If the user spells a term differently from how it is spelled in any document in the collection, then relevance feedback is unlikely to be effective. This can be addressed by the spelling correction techniques of Chapter 3 • Cross-language information retrieval Documents in another language are not nearby in a vector space based on term distribution. Rather, documents in the same language cluster more closely together • Mismatch of searcher's vocabulary versus collection vocabulary If the user searches for laptop but all the documents use the term notebook computer, then the query will fail, and relevance feedback is again most likely ineffective.

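9.1.3 When does relevance feedback work? • The success of relevance feedback depends on several assumptions • 1. The user must know enough to formulate an initial query - the initial query must be reasonably close to the documents they want to find • 2. Relevant documents must be similar enough to each other to cluster together - the Rocchio model effectively treats the relevant documents as a single cluster -> this approach struggles when the relevant documents in the initial results form a multimodal class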

9.1.3 When does relevance feedback work? Examples of multimodal classes • Subsets of the documents using different vocabulary, such as Burma vs. Myanmar • A query for which the answer set is inherently disjunctive, such as 'Pop stars who once worked at Burger King' • Instances of a general concept, which often appear as a disjunction of more specific concepts, for example, felines

9.1.3 When does relevance feedback work? • Relevance feedback is not popular with users - they are reluctant to provide feedback. • The long queries that relevance feedback generates raise the computing cost and lengthen response time. - One study proposed limiting the number of terms used in reweighting as one remedy

9.1.4 Relevance feedback on the web Indirect relevance feedback • Use clickstream data (which links users actually clicked) • Use the web link structure - here the feedback comes from page authors rather than readers

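9.1.5 Evaluation of relevance feedback strategies • There is some subtlety to evaluating the effectiveness of relevance feedback in a sound and enlightening way Evaluation strategy 1 • Compute precision-recall graphs for retrieval with the initial query and with the modified query, and compare them • This typically shows gains of 50% or more in MAP, but most of the gain comes from the documents the user already marked relevant being ranked higher • A fair evaluation must measure performance on documents the user has not selected.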

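9.1.5 Evaluation of relevance feedback strategies Evaluation strategy 2 • From the second round on, evaluate on the residual collection (the collection minus the documents the user has judged) rather than on the full collection. • When there are few relevant documents, the initial retrieval may already have used up a large share of them • Different relevance feedback methods can be compared with each other this way, but a fair assessment of relevance feedback itself is difficult Evaluation strategy 3 • Build two collections: obtain the initial results and the modified query on one, then evaluate the initial query and the modified query on the other.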
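
Strategy 2 can be illustrated with a small sketch. The document ids, the judged set, and the helper names are hypothetical, and precision at k stands in for whichever metric is actually used.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are relevant (k > 0)."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def residual_view(ranking, relevant, judged):
    """Remove every document the user already judged from the ranking
    and from the relevant set, leaving the residual collection."""
    residual_ranking = [doc for doc in ranking if doc not in judged]
    return residual_ranking, relevant - judged


# Hypothetical round-two ranking produced by the modified query.
relevant = {"d1", "d3", "d5"}
judged = {"d1", "d2"}  # documents the user marked in round one
revised_ranking = ["d1", "d3", "d4", "d5", "d2"]

full_p2 = precision_at_k(revised_ranking, relevant, 2)
res_ranking, res_relevant = residual_view(revised_ranking, relevant, judged)
residual_p2 = precision_at_k(res_ranking, res_relevant, 2)
```

On the full collection the modified query is credited for ranking the already-judged d1 first. On the residual collection that credit disappears, so the measured precision is lower.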

9.1.5 Evaluation of relevance feedback strategies Evaluation strategy 4 - user study • How quickly do users find relevant documents? • How many relevant documents do users find within a fixed amount of time? • This is the fairest evaluation method and the closest to real system use

9.1.6 Pseudo relevance feedback • Also called blind relevance feedback • Assume that the top k documents of the initial results are relevant and perform relevance feedback on them -> automates the manual part of relevance feedback • Dangers of an automatic process ex> for the query 'copper mine', if the top k documents are all about mines in Chile, the query may drift toward Chile
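
A minimal sketch of pseudo relevance feedback, assuming term-frequency vectors, dot-product scoring, and a positive-only Rocchio-style update. The scoring function, the weights, and the three-document collection are all illustrative assumptions.

```python
from collections import Counter


def score(query, doc):
    """Dot product between a weighted query and a term-frequency vector."""
    return sum(weight * doc.get(term, 0) for term, weight in query.items())


def rank(query, docs):
    """Document ids sorted by descending score."""
    return sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)


def pseudo_feedback(query, docs, k=2, alpha=1.0, beta=0.75):
    """Treat the top-k documents as relevant and add their centroid to the query."""
    top = rank(query, docs)[:k]
    expanded = {term: alpha * weight for term, weight in query.items()}
    for name in top:
        for term, tf in docs[name].items():
            expanded[term] = expanded.get(term, 0.0) + beta * tf / len(top)
    return expanded


# Hypothetical collection in which the best matches happen to be about Chile.
docs = {
    "d1": Counter("copper mine chile copper".split()),
    "d2": Counter("copper mine chile export".split()),
    "d3": Counter("copper wire price".split()),
}
query = {"copper": 1.0, "mine": 1.0}
expanded = pseudo_feedback(query, docs, k=2)
```

Here "chile" enters the expanded query with a substantial weight even though the user never asked about it, which is the drift the 'copper mine' example warns about.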

9.1.7 Indirect relevance feedback • Uses indirect sources of evidence as feedback • Also called implicit feedback • Less reliable than explicit feedback, but more useful than pseudo feedback, which contains no evidence of user judgments. • Although users are reluctant to give feedback, web search engines can collect large volumes of implicit feedback.

9.2 Global methods for query reformulation Query expansion • In relevance feedback, the user provides additional information beyond the initial query • Here, the system suggests related search terms for the query

9.2 Global methods for query reformulation Methods for building a thesaurus for query expansion • Use of a controlled vocabulary that is maintained by human editors

9.2 Global methods for query reformulation • A manual thesaurus: human editors build sets of synonyms for each concept • An automatically derived thesaurus: built from word co-occurrence statistics • Query reformulations based on query log mining: well suited to web search, where query volume is high

9.2 Global methods for query reformulation Automatic thesaurus generation Two approaches • Use word frequency (co-occurrence) statistics • Use grammatical analysis of the text to exploit grammatical relations
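
The co-occurrence approach can be sketched as follows. This version assumes binary term-document incidence and scores term pairs by cosine similarity over the documents they share. That normalization is one choice among several, and the four-document collection is a toy example.

```python
import math
from collections import defaultdict


def cooccurrence_thesaurus(docs, top_n=2):
    """Map each term to the top_n terms it most often shares documents with."""
    postings = defaultdict(set)  # term -> ids of documents containing it
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            postings[term].add(doc_id)

    thesaurus = {}
    for u in postings:
        scored = []
        for v in postings:
            if u == v:
                continue
            shared = len(postings[u] & postings[v])
            if shared:
                # Cosine similarity between the binary incidence rows of u and v.
                sim = shared / math.sqrt(len(postings[u]) * len(postings[v]))
                scored.append((sim, v))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by term
        thesaurus[u] = [term for _, term in scored[:top_n]]
    return thesaurus


docs = [
    "apple computer laptop",
    "apple computer software",
    "apple fruit red",
    "banana fruit yellow",
]
thesaurus = cooccurrence_thesaurus(docs)
```

Because the statistics are purely distributional, an ambiguous term such as "apple" is linked to neighbours from whichever sense dominates the collection.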

9.2 Global methods for query reformulation • While some of the thesaurus terms are good or at least suggestive, others are marginal or bad • Term ambiguity easily introduces irrelevant statistically correlated terms ex> apple computer -> apple red fruit computer • Because the terms in an automatic thesaurus are already highly correlated in the documents, this form of query expansion may retrieve few additional documents.

9.2 Global methods for query reformulation • Query expansion is useful for increasing recall, but building and updating a thesaurus manually is costly. • When the query uses ambiguous terms, query expansion can lower precision. ex> 'interest rate' -> 'interest rate fascinate evaluate' • Overall, query expansion is less successful than relevance feedback, but it has the advantage of being easier for users to understand.
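
Thesaurus-based expansion, and the way ambiguity can hurt it, can be shown with a tiny sketch. The thesaurus entries are hypothetical and chosen to reproduce the 'interest rate' example; a real thesaurus lookup would be far richer.

```python
def expand_query(query, thesaurus):
    """Append thesaurus entries for each query term, keeping the original
    terms first and removing duplicates while preserving order."""
    terms = query.split()
    extras = [syn for term in terms for syn in thesaurus.get(term, [])]
    return " ".join(dict.fromkeys(terms + extras))


# Hypothetical entries: each synonym fits a sense the user did not intend.
toy_thesaurus = {
    "interest": ["fascinate"],
    "rate": ["evaluate"],
}
expanded = expand_query("interest rate", toy_thesaurus)
```

Each term is expanded in isolation, so the financial sense of the phrase is lost and the added terms pull in unrelated documents.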
