A Recommender System based on the Immune Network
Dr Uwe Aickelin
• The Recommendation Problem
• The AIS Approach
• Algorithm Walkthrough
• Results and Discussion
The Recommendation Problem: “What movies would you predict/recommend?”
• Prediction: What rating would I give this film? Prediction quality can be assessed by absolute error.
• Recommendation: Give me a ‘top 10’ list of films I might like. Recommendation quality can be assessed by a ranking ‘discordance’ metric.
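
The slides do not define the ranking 'discordance' metric. As a rough sketch only, assuming a Kendall-tau-style measure (the fraction of item pairs that the predicted ranking orders the opposite way from the user's actual ranking), it could look like this. The function name `discordance` and the dictionary layout are illustrative assumptions, not taken from the talk.

```python
from itertools import combinations

def discordance(predicted, actual):
    """Fraction of item pairs ranked in opposite order (assumed metric).

    predicted, actual: dicts mapping item -> score; only items present
    in both are compared. Tied pairs count as not discordant.
    """
    items = sorted(predicted.keys() & actual.keys())
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    discordant = sum(
        1 for a, b in pairs
        if (predicted[a] - predicted[b]) * (actual[a] - actual[b]) < 0
    )
    return discordant / len(pairs)
```

A value of 0 would mean the recommended order agrees with the user's own order on every pair; a value of 1 would mean it is fully reversed.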
The Biological Immune System: How do we protect the body against infection? (Antigens)
• Innate vs Acquired
• Humoral vs Cell Mediated
• T Cell (CD-4, Helper): binds to MHC-antigen complex; secretes cytokines to help…
• B Cell: secretes antibody, which binds to antigen and recruits phagocytes (innate)
• T Cell (CD-8, Killer): kills cell (viruses)
The Recommendation Problem
• EachMovie database: user profiles (3M votes, 70k users)
• User Profile: set of tuples {movie, rating}
• Me: my user profile
• Neighbour: user profile of someone else
• Similarity metric: correlation score between user profiles
• Neighbourhood: group of neighbours similar to me
• Recommendations: generated from neighbourhood
The AIS Approach
• EachMovie database: user profiles
• User Profile: set of tuples {movie, rating}
• Me (my user profile) → Antigen
• Neighbour (user profile of someone else) → Antibody
• Similarity metric (correlation score between user profiles) → Antibody-Antigen binding (stimulation) and Antibody-Antibody binding (suppression)
• Neighbourhood (group of neighbours similar to me) → Group of antibodies similar to antigen and dissimilar to other antibodies
• Recommendations: generated from neighbourhood
The AIS Algorithm
[Diagram: antigen Ag surrounded by antibodies Ab1 to Ab4]
Start with empty AIS
Encode target user as an antigen Ag
WHILE (AIS not full) && (More users) DO
    Add next user as an antibody Ab
    IF (AIS at full size)
        Iterate AIS
    FI
OD
Generate recommendations from AIS
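
The control flow of this loop can be sketched in Python. This is only a skeleton under stated assumptions: `build_ais`, `capacity` and the `iterate` callback are hypothetical names, and `iterate` stands in for the concentration update and elimination step, which the pseudocode does not spell out.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
_NO_MORE_USERS = object()  # sentinel marking an exhausted database

def build_ais(candidates: Iterable[T], capacity: int,
              iterate: Callable[[List[T]], List[T]]) -> List[T]:
    """Fill the AIS with antibodies, iterating whenever it is full.

    iterate(ais) is assumed to return the surviving antibodies; if it
    eliminates any, the loop continues and pulls in further users.
    """
    ais: List[T] = []                      # start with empty AIS
    users = iter(candidates)
    while len(ais) < capacity:             # AIS not full ...
        user = next(users, _NO_MORE_USERS)
        if user is _NO_MORE_USERS:         # ... and more users
            break
        ais.append(user)                   # add next user as antibody
        if len(ais) == capacity:           # AIS at full size
            ais = iterate(ais)
    return ais
```

For example, with a capacity of 3 and an `iterate` that eliminates one antibody, the loop frees a slot and carries on with the next user in the database; it stops once an iteration leaves the AIS full or the database runs out.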
Algorithm walkthrough: Encoding
Suppose we have 5 users and 4 movies.
DATABASE
u1 = {(m1,v11), (m2,v12), (m3,v13)}
u2 = {(m1,v21), (m2,v22), (m3,v23), (m4,v24)}
u3 = {(m1,v31), (m2,v32), (m4,v34)}
u4 = {(m1,v41), (m4,v44)}
u5 = {(m1,v51), (m2,v52), (m3,v53), (m4,v54)}
• We do not have user votes for every film
• We want to predict the vote of user u4 on movie m3
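
One possible encoding of this database is a dictionary of sparse profiles. The numeric votes below are made-up placeholders for the symbolic v11 … v54 (the slides give no values), and `co_rated` and `users_with_vote` are illustrative helper names.

```python
# Hypothetical vote values standing in for v11 ... v54.
database = {
    "u1": {"m1": 0.8, "m2": 0.2, "m3": 0.6},
    "u2": {"m1": 0.4, "m2": 0.6, "m3": 1.0, "m4": 0.8},
    "u3": {"m1": 1.0, "m2": 0.0, "m4": 0.2},
    "u4": {"m1": 0.4, "m4": 1.0},
    "u5": {"m1": 0.6, "m2": 0.8, "m3": 0.2, "m4": 0.4},
}

def co_rated(a, b):
    """Movies that both profiles have voted on."""
    return sorted(a.keys() & b.keys())

def users_with_vote(db, movie, exclude):
    """Users (other than `exclude`) who have voted on `movie`."""
    return sorted(u for u, p in db.items() if u != exclude and movie in p)

target, movie = "u4", "m3"
assert movie not in database[target]            # the vote we want to predict
print(users_with_vote(database, movie, target)) # users who can inform it
print(co_rated(database["u4"], database["u3"])) # overlap used for matching
```

Missing votes are simply absent keys, so sparsity costs nothing to represent and the overlap between two users is a set intersection.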
Algorithm walkthrough (1)
Start with empty AIS.
AIS: (empty)   DATABASE: u1, u2, u3, u4, u5
Encode user for whom to make predictions as an antigen Ag.
AIS: Ag (u4)   DATABASE: u1, u2, u3, u4, u5
Algorithm walkthrough (2)
Add antibodies until AIS is full…
Add next user as an antibody Ab1.
AIS: Ag, Ab1 (u1)   DATABASE: u1, u2, u3, u4, u5
Add users 2 and 3…
AIS: Ag, Ab1, Ab2, Ab3 (u2, u3)   DATABASE: u1, u2, u3, u4, u5
Algorithm walkthrough (3)
[Diagram: antigen Ag with antibodies Ab1, Ab2, Ab3]
After some more iterations… the AIS has filled up:
Table of matching scores between Ab and Ag: MS14, MS24, MS34
Table of matching scores between antibodies:
MS12 = CorrelCoef(Ab1, Ab2)
MS13 = CorrelCoef(Ab1, Ab3)
MS23 = CorrelCoef(Ab2, Ab3)
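
A minimal sketch of `CorrelCoef`, assuming a plain Pearson correlation computed over the movies both users have rated. The amendment to the measure mentioned later in the deck is not reproduced here, and returning 0 for tiny overlaps or constant votes is a choice made for this sketch only.

```python
from math import sqrt

def correl_coef(a, b):
    """Pearson correlation over co-rated movies (dicts movie -> vote).

    Returns 0.0 when fewer than two movies overlap or when either
    user's overlapping votes are constant (an assumption of this sketch).
    """
    common = sorted(a.keys() & b.keys())
    n = len(common)
    if n < 2:
        return 0.0
    xs = [a[m] for m in common]
    ys = [b[m] for m in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def matching_table(profiles):
    """Pairwise matching scores, keyed by (i, j) with i < j."""
    names = sorted(profiles)
    return {(p, q): correl_coef(profiles[p], profiles[q])
            for i, p in enumerate(names) for q in names[i + 1:]}
```

The same function serves both tables: antibody against antigen for stimulation, and antibody against antibody for suppression.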
Algorithm walkthrough (4)
AIS is now at full size so begin iterations…
Calculate new CONCENTRATION for each Ab, considering interactions with Ag (STIMULATION) and other Ab (SUPPRESSION).
[Diagram: AIS before iteration (Ag, Ab1, Ab2, Ab3) and after (Ag with copies of Ab1 and Ab2 only, Ab2 the most numerous)]
Notice that antibody 3 has been eliminated.
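
The slide does not give the update rule. The sketch below assumes one simple form in which each antibody gains concentration in proportion to its match with the antigen, loses concentration in proportion to its similarity to the other antibodies, and decays at a constant rate. The rate constants `k1`, `k2`, `k3`, the single-step update and the clamp at zero are all assumptions made for illustration.

```python
def update_concentrations(conc, ag_match, ab_match,
                          k1=0.3, k2=0.3, k3=0.1, antigen_conc=1.0):
    """One update step for all antibody concentrations (assumed form).

    conc[i]        current concentration of antibody i
    ag_match[i]    matching score between antibody i and the antigen
    ab_match[i][j] matching score between antibodies i and j
    """
    n = len(conc)
    updated = []
    for i in range(n):
        stimulation = k1 * ag_match[i] * conc[i] * antigen_conc
        suppression = (k2 / n) * sum(
            ab_match[i][j] * conc[i] * conc[j] for j in range(n) if j != i
        )
        death = k3 * conc[i]
        # Concentrations are clamped so they never become negative.
        updated.append(max(0.0, conc[i] + stimulation - suppression - death))
    return updated
```

Under this form, an antibody that matches the antigen well but resembles no other antibody grows fastest, which is the diversity-promoting behaviour the walkthrough describes.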
Algorithm walkthrough (5)
If AIS not yet full and more users available, repeat.
Otherwise: GENERATE RECOMMENDATION from CONCENTRATION and ANTIGEN correlation.
[Diagram: final AIS with Ag and copies of Ab1 and Ab2, Ab2 the most numerous]
The recommendation for user u4 on movie m3 will be based largely on user u2's vote on m3.
Results
• Tested against EachMovie database (15000 users, 1628 films)
• Results compared to standard method (Pearson k-nearest neighbours)
• Prediction: results of same quality
• Recommendation: improved results, 4 out of 5 films correct versus 3 out of 5 for the standard method
1. Stimulation and suppression affect neighbourhood size and number of users looked at
Evaluation
• General purpose recommendation tool (e.g. Bookmarks)
• Collaborative Filtering is a useful vehicle for examination of AIS dynamics:
  - Idiotypic effect for more varied population
  - Potential for distribution
  - Smaller neighbourhoods (vs computational cost)
• Wider applicability (e.g. online community formation)
Speculation: online community formation
• Idiotypic effects alter nature of community
• How important is diversity?
• Are there other network effects that can be used? (hubs, routers etc)
• Distribution: the snowball effect
• What about interacting communities?
• Application areas: ad-hoc community formation, knowledge management, P2P routing…
AIS for Security
• Change detection (Checksums)
• ‘Self’: files, network traffic, system calls
• Antibody creation: positive vs negative selection
• Collaboration between different populations/sites
• Representation: binary string or symbolic (rules)
• Other IS features:
  - activation thresholds (vs false positives)
  - co-stimulation (vs false positives)
  - memory detectors (secondary response)
  - MHC masks to cover ‘holes’ (similar to self)
AIS for Security: Evaluation
• Applied to network intrusion, virus detection…
• Good results on test systems
BUT…
• Negative Selection doesn’t scale
• Inefficient to map entire non-self universe
• Changes over time
• Appropriate representation of self
• Appropriate matching
• Primary response requires infection?
Traditional Self - Non Self Distinction
An immune response is triggered when the body encounters something foreign.
The difference between self and non-self is learnt early in life, e.g. by eliminating those T- and B-cells that react to self.
Problems:
• No reaction to foreign bacteria in gut
• No reaction to food we eat
• The human body changes over its life
• Auto-immune diseases
• Tumours / Transplants
The Danger Theory
Need for discrimination: What should be responded to?
Respond to danger, not to “foreignness”. No need to attack everything that is foreign.
Danger is measured by damage / distress signals.
Advantages:
• Can take care of non-self but harmless
• Can take care of self but harmful
Danger Model Conclusions
• Self-Nonself discrimination still useful.
• Nonself does not cause immune response.
• Danger signals trigger immune response.
• A question of semantics?
• Can this model help us build an AIS for security applications?
• What would be ‘danger signals’?
Discussion
Uwe Aickelin: http://www.aickelin.com/
Steve Cayzer: http://www-uk.hpl.hp.co.uk/people/steve_cayzer/
AIS Models - Idiotypic
[Diagram: two antibodies and an antigen, with paratope and epitope labelled]
• Farmer et al 1986
• Paratope/Epitopes
• Lock and Key
• Interchangeable?
• Behaviour
• Matching
• Idiotypic (Memory, auto-immune)
AIS Models - Idiotypic
Jerne’s Big Idea (1974)
• Idiotype: specificity of antibody (epitopes to which it will bind)
• Idiotope: an idiotypic epitope
• Evidence: antibodies produced against antibodies of same species (cf individual)
[Diagram: antigen, idiotypic set, anti-idiotypic set and internal image of antigen, with labels P1 to P3 and I1 to I3 and +/- interactions]
AIS Models - Idiotypic
In Words…
• The idiotypic network hypothesis (Jerne 1974) builds on the recognition that antibodies can match other antibodies as well as antigens.
• A group of antibodies which match an antigen may be matched by other antibodies, which may in turn be matched by yet other antibodies. This stimulatory effect will set up activation chains or loops.
• Matched antibodies are suppressed, and this effect will encourage diversity.
In Formulae…
AIS Models - Idiotypic
• For N antibodies, n antigens.
• x_i is the concentration of antibody i
• p and e stand for ‘paratope’ and ‘epitope’.
• s is the matching threshold.
• G is a rectifier function which outputs 0 for all negative input.
• k is the allowable overlap
Recommendation Approaches
• Simple user comparisons (Pearson, cosine, k-Nearest Neighbour)
• Problems: sparsity, curse of dimensionality
• Memory vs Model based approaches
• Transformative and Transitive functions
• Default votes, Content based, Learning algorithms
• Challenge of distribution (vs centralization)
System Description: Encoding Users are represented as a set of tuples which represent their votes:
System Description: Matching
We use the Pearson correlation measure. The measure is amended as follows:
System Description: Prediction We predict a rating by using a weighted average over the neighbourhood of a user:
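
The formula itself is not reproduced in this transcript. As a sketch only, the code below assumes the common mean-centred weighted average used in neighbourhood-based collaborative filtering, where each neighbour's deviation from its own mean vote is weighted and added to the target user's mean. In the AIS setting the weight could plausibly combine antigen correlation and antibody concentration, but that combination, like the name `predict_vote`, is an assumption.

```python
def predict_vote(target_mean, neighbours, movie):
    """Weighted-average prediction of the target user's vote on `movie`.

    target_mean: mean vote of the target user (the antigen)
    neighbours:  list of (profile, weight) pairs, where profile is a
                 dict movie -> vote and weight is an assumed scalar,
                 e.g. correlation times concentration
    """
    numerator = 0.0
    denominator = 0.0
    for profile, weight in neighbours:
        if movie not in profile:
            continue  # this neighbour has no vote to contribute
        neighbour_mean = sum(profile.values()) / len(profile)
        numerator += weight * (profile[movie] - neighbour_mean)
        denominator += abs(weight)
    if denominator == 0:
        return target_mean  # no usable neighbours: fall back to the mean
    return target_mean + numerator / denominator
```

A neighbour with a large weight pulls the prediction towards its own opinion of the film, which matches the walkthrough's remark that the prediction for u4 on m3 depends mostly on u2.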
System Description: Evaluation
• Mean Absolute Error
• Variance
• Precision vs Recall
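
Mean absolute error is the average size of the gap between predicted and actual votes. A minimal sketch follows; the function name and list-based interface are illustrative choices.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over paired votes."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

For instance, predictions of 1, 2 and 3 against actual votes of 2, 2 and 5 give errors of 1, 0 and 2, so the mean absolute error is 1.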