1. Information Filtering: Evaluation of Filtering Systems
IEEE Paper Contest Fall 2002
2. Introduction to information filtering
What is filtering
Other information-seeking processes
Paradigms
Profile Modeling
Evaluation of filtering systems
Privacy in filtering systems
3. Other information-seeking processes
4. Filtering vs. Retrieval
5. Three subtasks of filtering
Collection
Active
Passive
Selection
Display
Interactive
Non Interactive
6. Two paradigms of filtering systems
Content-Based
SIFT
InfoScope
Social
Tapestry
Uses a Client/Server mechanism to generate a ranked list
GroupLens
Chicken-and-egg problem
7. A Typical filtering system
8. User modeling & machine learning
User Model
Explicit (like SIFT)
Implicit (in machine learning)
User’s behavior
Elements of the environment
Evidence of User’s behavior
Explicit feedback
Implicit feedback (InfoScope)
9. Sources of implicit evidence about the user's interests
Read/Ignored
Saved/Deleted
Replied or not
Reading time
10. Machine learning approaches
Rule induction
Instance based
Statistical classification
Neural networks
Genetic algorithms
and more
11. Evaluation strategies
Precision and Recall
Problems:
Recall requires the total number of relevant documents.
Precision alone does not tell the whole story.
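A minimal sketch of the two measures for a single topic. The function name and document IDs are illustrative assumptions, not taken from any system named in these slides. Note that recall cannot be computed without the full set of relevant documents.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one topic.

    retrieved: IDs the filter delivered to the user.
    relevant:  IDs judged relevant (the full set, which recall needs).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents delivered, 3 relevant overall, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(p, r)
```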
12. Utility Functions
Linear Utility Functions:
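R = relevant documents retrieved, N = non-relevant documents retrieved
LF1 = 3R - 2N (retrieve if p(rel) > 0.4)
LF2 = 3R - N (retrieve if p(rel) > 0.25)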
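A short sketch of where the two probability thresholds come from. Retrieving a document with relevance probability p has expected utility credit*p - penalty*(1 - p), which is positive when p > penalty / (credit + penalty). The function names are illustrative assumptions.

```python
def linear_utility(r, n, credit, penalty):
    """Linear utility: credit per relevant document retrieved (r),
    penalty per non-relevant document retrieved (n)."""
    return credit * r - penalty * n


def retrieval_threshold(credit, penalty):
    """Smallest p(rel) at which retrieving a document has non-negative
    expected utility: credit*p - penalty*(1 - p) >= 0."""
    return penalty / (credit + penalty)


# LF1 uses credit 3 and penalty 2; LF2 uses credit 3 and penalty 1.
lf1_threshold = retrieval_threshold(3, 2)
lf2_threshold = retrieval_threshold(3, 1)

# Hypothetical run: 10 relevant and 5 non-relevant documents retrieved.
lf1_score = linear_utility(10, 5, credit=3, penalty=2)
lf2_score = linear_utility(10, 5, credit=3, penalty=1)
print(lf1_threshold, lf2_threshold, lf1_score, lf2_score)
```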
13. Major problems
The average will be dominated by topics with large retrieved sets.
Difficult to compare performance across topics
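A small illustration of the first problem, using LF1 = 3R - 2N and made-up per-topic counts. One topic with a large retrieved set swamps the average, even though the system does reasonably on the other two.

```python
def lf1(r, n):
    """LF1 utility: 3 per relevant retrieved, -2 per non-relevant retrieved."""
    return 3 * r - 2 * n


# Hypothetical (R, N) counts per topic; topic C has a very large retrieved set.
topics = {"A": (5, 2), "B": (3, 1), "C": (200, 400)}
scores = {name: lf1(r, n) for name, (r, n) in topics.items()}

mean_all = sum(scores.values()) / len(scores)
mean_without_c = (scores["A"] + scores["B"]) / 2

print(scores)          # {'A': 11, 'B': 7, 'C': -200}
print(mean_all)        # strongly negative, driven almost entirely by topic C
print(mean_without_c)  # 9.0
```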
14. Solutions
Nonlinear utility functions:
NF1 = 6R^0.5 - N
NF2 = 6R^0.8 - N
Scaling
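A sketch of the two nonlinear functions. The exponent below 1 gives diminishing credit for each additional relevant document, which limits how much a single large topic can contribute to the average. Function names are illustrative.

```python
def nf1(r, n):
    """NF1 = 6 * R^0.5 - N."""
    return 6 * r ** 0.5 - n


def nf2(r, n):
    """NF2 = 6 * R^0.8 - N."""
    return 6 * r ** 0.8 - n


# With no non-relevant documents, quadrupling R only doubles the NF1 score.
for r in (4, 16, 64):
    print(r, nf1(r, 0), round(nf2(r, 0), 2))
```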
15. Scaling
Divide each topic's utility by the maximum utility score for that topic
Problems:
Negative scores break the scaling.
The result can be inconsistent with precision and recall.
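A sketch of why negative scores are a problem for this kind of scaling. It assumes LF1 = 3R - 2N and assumes the maximum utility for a topic is reached by retrieving every relevant document and nothing else. Both the helper names and the counts are hypothetical.

```python
def lf1(r, n):
    """LF1 utility: 3 per relevant retrieved, -2 per non-relevant retrieved."""
    return 3 * r - 2 * n


def max_lf1(total_relevant):
    """Assumed best case: all relevant documents retrieved, no others."""
    return 3 * total_relevant


def scaled_by_max(r, n, total_relevant):
    """Utility divided by the topic's maximum utility."""
    return lf1(r, n) / max_lf1(total_relevant)


# Topic P: 50 relevant documents exist; system retrieves R=25, N=10.
scaled_p = scaled_by_max(25, 10, total_relevant=50)
# Topic Q: only 2 relevant documents exist; system retrieves R=1, N=20.
scaled_q = scaled_by_max(1, 20, total_relevant=2)

# Scaled scores are capped at 1 from above but have no lower bound,
# so a topic with few relevant documents can still dominate the average.
print(scaled_p, scaled_q)
```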
16. Suppose we have two systems X and Y where:
Precision(X) > Precision(Y)
Recall(X) > Recall(Y)
If U(X) and U(Y) are negative, or if we use a nonlinear utility, we can still have:
U(X) < U(Y)
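Two worked cases of this inconsistency with made-up counts, assuming 100 relevant documents exist for the topic. Case 1 uses the linear LF1 = 3R - 2N with negative scores; case 2 uses the nonlinear NF1 = 6R^0.5 - N with positive scores. In both, X beats Y on precision and recall yet scores lower on utility.

```python
TOTAL_RELEVANT = 100  # hypothetical number of relevant documents for the topic


def precision(r, n):
    return r / (r + n)


def recall(r):
    return r / TOTAL_RELEVANT


def lf1(r, n):
    return 3 * r - 2 * n


def nf1(r, n):
    return 6 * r ** 0.5 - n


# Each system is described by (R, N): relevant and non-relevant retrieved.
# Case 1: linear utility, both scores negative.
x1, y1 = (10, 18), (1, 2)
# Case 2: nonlinear utility, both scores positive.
x2, y2 = (25, 21), (9, 8)

for label, util, x, y in (("LF1", lf1, x1, y1), ("NF1", nf1, x2, y2)):
    print(label,
          "precision", round(precision(*x), 3), ">", round(precision(*y), 3),
          "| recall", recall(x[0]), ">", recall(y[0]),
          "| utility", util(*x), "<", util(*y))
```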
17. A more sophisticated formula
Us(S,T) = (max(U(S,T), U(S)) - U(S)) / (max U(T) - U(S))
Problem:
Evaluation is highly dependent on the value of S.
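A sketch of the scaled formula above. It reads U(S) as a floor utility fixed by the parameter S, which is an interpretation of the slide's notation rather than something the slide states; the argument names and numbers are hypothetical. The same raw score maps to very different scaled scores as the floor changes.

```python
def scaled_utility(u, u_floor, u_max):
    """(max(u, u_floor) - u_floor) / (u_max - u_floor).

    u:       raw utility of the system on the topic, U(S,T) on the slide
    u_floor: floor utility, U(S) on the slide (assumed to depend only on S)
    u_max:   maximum achievable utility for the topic, max U(T) on the slide
    """
    return (max(u, u_floor) - u_floor) / (u_max - u_floor)


raw, best = -10, 30
low_floor = scaled_utility(raw, u_floor=-20, u_max=best)    # 10 / 50
deep_floor = scaled_utility(raw, u_floor=-100, u_max=best)  # 90 / 130
clamped = scaled_utility(-200, u_floor=-20, u_max=best)     # below the floor

print(low_floor, round(deep_floor, 3), clamped)
```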
18. TREC-9: Resorting to the good old friend
Precision-oriented function:
T9P = (relevant retrieved docs) / max(target, retrieved docs)
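A sketch of the T9P measure. The target is left as a parameter; the value 50 below is only an illustrative choice. Using max(target, retrieved) in the denominator means a system cannot get a high score by retrieving only a handful of documents.

```python
def t9p(relevant_retrieved, total_retrieved, target):
    """Precision-oriented score: relevant retrieved / max(target, retrieved)."""
    return relevant_retrieved / max(target, total_retrieved)


TARGET = 50  # illustrative target number of retrieved documents

under_target = t9p(20, 30, TARGET)  # denominator is the target: 20 / 50
over_target = t9p(40, 80, TARGET)   # denominator is the retrieved count: 40 / 80
tiny_run = t9p(2, 2, TARGET)        # plain precision is 1.0, T9P is 2 / 50

print(under_target, over_target, tiny_run)
```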
19. Privacy
Privacy becomes an issue when a system collects information about its users
It is important in both commercial and personal applications
20. Privacy in content-based filtering
Preventing unauthorized access to profiles
Password
Encryption
Preventing reconstruction of useful information about the user profile
Traffic analysis problem
21. Privacy in social filtering
Using pseudonyms
Encrypted transmission of annotations to authorized users
22. Resources
A Conceptual Framework for Text Filtering
Douglas W. Oard & Gary Marchionini
Information filtering and information retrieval: two sides of the same coin?
Nicholas J. Belkin & W. Bruce Croft
The TREC-7 Filtering Track Final Report
The TREC-8 Filtering Track Final Report
David A. Hull & Stephen Robertson
The TREC-9 Filtering Track Final Report
Ellen M. Voorhees