1 / 11

A Model of Information Foraging via Ant Colony Simulation

Matthew Kusner. A Model of Information Foraging via Ant Colony Simulation. Information Foraging. Theory Background People search for information in roughly the same way that animals search for food in their surroundings. Information Scent Ex: “the text associated with Web links” (Fu, 2007)

leoma
Download Presentation

A Model of Information Foraging via Ant Colony Simulation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Matthew Kusner A Model of Information Foraging via Ant Colony Simulation

  2. Information Foraging • Theory Background • People search for information in roughly the same way that animals search for food in their surroundings. • Information Scent • Ex: “the text associated with Web links” (Fu, 2007) • Background knowledge • Recommendations

  3. Ant Colony Simulation • Pheromone trails • Laid by ants who've found food. • Followed by other ants with probability p. • Path Evaporation • Path Optimization • Simulation specifics

  4. AOL Data Set • 21 million queries (March 1– May 31, 2006) • 650k users19 million click-through events • Quantities: querytime of query • click URLuser IDclicked link rank

  5. Information Foraging → Ant Colony • user → ant • clicked link → food • information scent → pheromone path • website importance → food distance • where website importance is defined by: • 1. Rank • 2. Popularity of website • 3. Combination of above methods

  6. Distancing Methods • Ranking • Popularity • Combination [based on data in Joachims et al., 2005]

  7. Results • AOL user-visit per website vector • [numWvisits1, numWvisits2, ..., numWvisitsn] • Simulation ant-visit per food vector • [numAvisits1, numAvisits2, ..., numAvisitsn] • Pearson Correlation Score (PCS) • Permutation Test → 95% Coverage Interval • (AOL_datai, simulation_datai) selection with replacement • Bootstrapping → p-value • Shuffle AOL vector

  8. Results • Queries with significant p-values: • vacation” (ranking), “baseball” (ranking), “reebok” (ranking), “adidas” (ranking), “marbles” (ranking), “helicopter” (ranking), “car” (ranking), “potatoes” (ranking), “coffee” (ranking), “farming” (ranking), “rock” (popularity), “shirts” (ranking), “playstation” (ranking), “sega” (popularity), “tom cruise” (ranking), “mel gibson” (ranking), “burger king” (ranking), “chicago” (ranking), “los angeles” (ranking), and “paris” (ranking) • Distancing methods without 95% CI overlap: • Ranking: • “potatoes” - neither popularity, nor combination • “shirts” - not popularity • “playstation” - not popularity • “burger king” - not combination

  9. Discussion • Disadvantages of popularity and combination methods • “vacation” example • Possible reasons for 95% CI overlap • Randomness • Disregard of structure • Significance of queries with low p-values • Search engine matching • Future directions • Different Simulation • Other similarity metrics • Random beginnings

  10. References • Fu, W., & Pirolli, P. (2007). SNIF-ACT: a cognitive model of user navigation on the World Wide Web. Human-Computer Interaction, 22(4), 355-412. • T. Joachims, L. Granka, B. Pang, H. Hembrooke, and G. Gay (2005). Accurately Interpreting Clickthrough Data as Implicit Feedback, Proceedings of the ACM Conference on Research and Development on Information Retrieval (SIGIR).

More Related