WWW Conference Paper Review


Presentation Transcript


  1. WWW Conference Paper Review. Jonathan, Artificial Intelligence Lab, University of Arizona. 2014/9/9

  2. Outline
     • Overview of WWW
     • Paper Review
     • Summary

  3. Overview
     • Annual conference
       • 2008: Beijing, China
       • 2009: Madrid, Spain
       • 2010: Raleigh, US
       • 2011: Hyderabad, India
     • Submission tracks
       • Research papers
       • Posters
       • Other: demo proposals, workshop proposals, etc.

  4. Areas and Topics
     • Data Mining and Machine Learning: deriving actionable insight from Web information sources such as query logs, the Web graph, click trails and text documents.
     • Social Networks: models, algorithms, systems and issues around social networks and collaborative environments.
     • Internet Monetization: markets, auctions, games, pricing, advertising and other Web-specific economic activities.

  5. Areas and Topics (continued)
     • Security and Privacy
     • Semantic Web
     • Search
     • Bridging Structured and Unstructured Data
     • Software Architecture and Infrastructure
     • Performance, Scalability and Availability
     • Networking and Mobility
     • User Interfaces and Rich Interaction
     • Rich Media
     • Web Services and Service-Oriented Computing

  6. Major Groups in Data Mining
     • Academia: University of Illinois at Urbana-Champaign (11), Stanford University (4), Cornell University (3), Arizona State University (3)
     • Industry: Google (8), Microsoft (7), Yahoo! (5), IBM (2)

  7. Best Papers
     • 2008: IRLbot: Scaling to 6 Billion Pages and Beyond. Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang and Dmitri Loguinov (Texas A&M University)
     • 2009: Hybrid Keyword Search Auctions. Ashish Goel (Stanford University), Kamesh Munagala (Duke University)
     • 2010: Factorizing Personalized Markov Chains for Next-Basket Recommendation. Steffen Rendle (Osaka University), Christoph Freudenthaler and Lars Schmidt-Thieme (University of Hildesheim)

  8. Trends
     • A growing share of research innovates by combining and integrating different approaches to the same problem.
     • In data mining, an increasing number of studies combine text mining with social network analysis.

  9. Factorizing Personalized Markov Chains for Next-Basket Recommendation. S. Rendle (Osaka University, Japan), C. Freudenthaler (University of Hildesheim, Germany), L. Schmidt-Thieme (University of Hildesheim, Germany)

  10. Overview (2010 Best Paper)
      • Research question: Can we provide better product recommendations for different users?
      • Building blocks: Matrix Factorization (MF) and Markov Chains (MC)
      • Methodology: Factorized Personalized Markov Chain (FPMC) model
      • Conclusion: The proposed method outperforms the MF model and the non-personalized MC model.
      • [Figure: non-personalized MC compared with FPMC]

  11. FPMC
      • Task: estimate each unknown entry ("?") in the transition cube. There are too many unknowns and too little data for each user.
      • Solution: FPMC factorizes the cube, so that each user u's transition probability from item i to item j is influenced by the other transitions by the same user, from the same item i, and to the same item j.
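
A minimal sketch of the factorization idea on this slide, not the authors' implementation. It assumes a pairwise factorization in which the score of a candidate next item j is the sum of a user-to-item dot product and a previous-item-to-next-item dot product averaged over the last basket. The factor matrices, their names, the toy sizes and the random (untrained) values are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 3, 5, 4  # toy sizes, chosen only for illustration

# Hypothetical latent factors (random here; a real model would learn them).
V_user_next = rng.normal(scale=0.1, size=(n_users, k))  # user u, facing next item j
V_next_user = rng.normal(scale=0.1, size=(n_items, k))  # next item j, facing user u
V_prev_next = rng.normal(scale=0.1, size=(n_items, k))  # previous item i, facing next item j
V_next_prev = rng.normal(scale=0.1, size=(n_items, k))  # next item j, facing previous item i


def fpmc_scores(user, last_basket):
    """Score every candidate next item j for `user`, given the items in the last basket."""
    # Personalized part: how much this user likes each candidate item.
    user_term = V_next_user @ V_user_next[user]  # shape (n_items,)
    # Sequential part: transition affinity from each previous item i to each
    # candidate j, averaged over the items in the last basket.
    trans_term = (V_next_prev @ V_prev_next[last_basket].T).mean(axis=1)  # shape (n_items,)
    return user_term + trans_term


scores = fpmc_scores(user=0, last_basket=[1, 3])
ranking = np.argsort(-scores)  # candidate items, highest score first
```

Because every user and every item shares the same small factor matrices, a transition that was never observed for a given user still receives a score, which is how factorization addresses the sparsity of the cube.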

  12. Take-away
      • May be helpful when modeling sequential data that can be grouped (personalized).
      • Potential application in AI Lab: combine authorship analysis and sequential text mining, for example to predict the next word, sentence or paragraph of a particular author.

  13. Topic Modeling with Network Regularization. Q. Mei, D. Cai, D. Zhang, C. Zhai (all University of Illinois at Urbana-Champaign)

  14. Overview
      • Research question: Can we improve topic modeling by incorporating knowledge of the network structure?
      • Methodology: Topic Modeling with Network Structure (TMN). Network Probabilistic Latent Semantic Analysis (NetPLSA) is used as the example.
      • Conclusion: The proposed approach outperforms both purely text-oriented and network-oriented methods.

  15. Topic Modeling with Network Structure
      • In general, TMN is a framework for combining an arbitrary topic model with network constraints.
      • It builds an objective function that balances maximizing the likelihood of the generated topic model against minimizing the differences in topic distribution between adjacent nodes of the network graph.
      • [Figure: geographic topic distribution for Hurricane Katrina]
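
The balance described on this slide can be sketched as a single objective value. This is an illustrative form, assuming a trade-off parameter `lam` and a squared-difference penalty over weighted edges. The function name, the parameter names and the toy numbers are hypothetical, and the paper's exact objective may differ.

```python
import numpy as np


def regularized_objective(log_likelihood, theta, edges, lam=0.5):
    """Return a value to minimize: likelihood term plus network smoothness penalty.

    theta[d] is the topic distribution of node d; edges holds (u, v, weight) triples.
    """
    # Penalize adjacent nodes whose topic distributions differ.
    smoothness = 0.5 * sum(
        w * np.sum((theta[u] - theta[v]) ** 2) for u, v, w in edges
    )
    # lam = 0 ignores the network; lam = 1 ignores the text likelihood.
    return -(1.0 - lam) * log_likelihood + lam * smoothness


# Toy example: three nodes, two topics; nodes 0 and 1 agree, node 2 does not.
theta = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
edges = [(0, 1, 1.0), (1, 2, 1.0)]
value = regularized_objective(log_likelihood=-10.0, theta=theta, edges=edges, lam=0.5)
```

In this toy graph almost all of the penalty comes from the edge between nodes 1 and 2, whose topic distributions disagree, so an optimizer would be pushed to pull those two distributions closer.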

  16. Take-away
      • The TMN framework may be helpful for text data that has a network structure attached.
      • Potential applications in AI Lab: geopolitical topic modeling; incorporating the reply network into forum topic modeling.

  17. Exploiting Social Context for Review Quality Prediction. Y. Lu (University of Illinois at Urbana-Champaign), P. Tsaparas (Microsoft), A. Ntoulas (Microsoft), L. Polanyi (Microsoft)

  18. Overview
      • Research question: Can we improve review quality prediction by incorporating social context alongside text features?
      • Methodology: Linear regression with regularization constraints.
      • Conclusion: Prediction accuracy is greatly increased.

  19. Regression with Social Context Constraints
      • Besides textual features, the regression model uses social context features as regularization constraints:
        • Author consistency
        • Trust consistency
        • Co-citation consistency
      • The model minimizes both the mean squared error and the conflicts with the three consistency conditions above.
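
A sketch of what such a regularized objective could look like, showing only the author-consistency term. It assumes the constraint is expressed as a squared difference between predicted quality scores of reviews by the same author. The function, the weight `beta` and the toy data are hypothetical; the trust and co-citation terms would be added in the same way with their own pair lists and weights.

```python
import numpy as np


def objective(w, X, y, same_author_pairs, beta):
    """Mean squared error plus a penalty for inconsistent same-author predictions."""
    pred = X @ w
    mse = np.mean((pred - y) ** 2)
    # Conflict: reviews written by the same author receive different predicted quality.
    conflict = sum((pred[i] - pred[j]) ** 2 for i, j in same_author_pairs)
    return mse + beta * conflict


# Toy data: four reviews with two text features each (values are made up).
X = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0, 1.0, 3.0])
w = np.array([1.0, 1.0])
pairs = [(0, 1), (2, 3)]  # reviews 0 and 1 share an author, as do 2 and 3

loss = objective(w, X, y, pairs, beta=0.1)
```

Here `w` fits the labels exactly, so the whole loss comes from the consistency penalty. Minimizing it trades a little fit for predictions that agree more within each author.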

  20. Take-away
      • A good example of using text data with a network structure attached.
      • When we want to assign numerical scores to textual data, we can use relationships between items to adjust those scores.
      • Potential application in AI Lab: sentiment analysis.

  21. Topic Initiator Detection on the World Wide Web. X. Jin (University of Illinois at Urbana-Champaign), S. Spangler (IBM), R. Ma (IBM), J. Han (University of Illinois at Urbana-Champaign)

  22. Overview
      • Research question: How can we find the initiator of a topic in online media?
      • Methodology: InitRank
      • Conclusion: The proposed method outperforms baseline models such as sorting the documents by time.

  23. InitRank: TCL Graph
      • After initiator-indicator attributes are extracted for all documents on a topic, a TCL graph is constructed (TCL = Time + Content + Link).
      • Two kinds of relationship exist between document nodes in the graph:
        • Link (solid edge): points to the referenced document.
        • Document similarity (dashed edge): points to the earlier document.
      • Initiator values for nodes are initialized from other attributes such as centrality, novelty, originality and document length. These values are then optimized on the graph.

  24. Take-away
      • Again, a good example of combining text mining and social network analysis.
      • Potential application in AI Lab: this framework may be useful for modeling the "paths" in information diffusion.

  25. AdHeat: An Influence-based Diffusion Model for Propagating Hints to Match Ads. H. Bao (Google), E. Chang (Google)

  26. Overview
      • Research question: In a social network, is targeting ads to a user based on other users' influence better than targeting based on the user's own features?
      • Motivation: empirically, a user with expertise in an area tends to show no interest in ads for that area. The research therefore targets ads using information from the other users who influence the target user most.
      • Methodology: Heat diffusion model
      • Conclusion: The influence-based model outperforms the traditional model in terms of click-through rate (CTR).

  27. AdHeat Model
      1) Social network construction
         a. Edge weights are calculated from relationship attributes.
         b. An influence score for each user is calculated with HITS (Hypertext Induced Topic Selection).
      2) Hint-word generation: LDA (Latent Dirichlet Allocation)
      3) Influence propagation: heat diffusion equation
      [Figure: example network of users u1 to u4 with weighted edges and a heat scale]
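
The influence-propagation step can be illustrated with a generic discrete heat diffusion on a weighted graph. This is a textbook-style sketch, not the AdHeat update rule: it assumes a symmetric weight matrix, and the edge weights, the step size `alpha` and the number of steps are hypothetical values that are not recovered from the slide's figure.

```python
import numpy as np


def diffuse(heat, W, alpha=0.1, steps=10):
    """Spread heat along weighted edges using the graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    h = np.asarray(heat, dtype=float).copy()
    for _ in range(steps):
        # Each node loses heat to cooler neighbours and gains from hotter ones.
        h = h - alpha * (L @ h)
    return h


# Four users; entry W[i, j] is the (hypothetical) tie strength between i and j.
W = np.array([
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.6, 0.4],
    [0.0, 0.6, 0.0, 0.0],
    [0.0, 0.4, 0.0, 0.0],
])
heat0 = np.array([1.0, 0.0, 0.0, 0.0])  # all interest ("heat") starts at user 0
heat = diffuse(heat0, W)
```

With a symmetric `W` the total heat is conserved, so the result can be read as how one user's interest in a hint word is shared out among the users they are connected to.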

  28. Take-away
      • Attributes of an instance (user) can also be modeled indirectly from other nodes by looking at their relationships.
      • Potential application in AI Lab: when clustering stakeholder groups, besides writing style, we can also look at which topics an author reads most to help identify the author's group.

  29. Incorporating Site-Level Knowledge to Extract Structured Data from Web Forums. J. Yang (Microsoft), R. Cai (Microsoft), Y. Wang (Chinese Academy of Sciences), J. Zhu (Tsinghua University), L. Zhang (Microsoft), W. Ma (Microsoft)

  30. Overview
      • Research question: Can we create a general forum crawler that extracts structured data from any forum?
      • Methodology: Markov Logic Networks (MLN)
      • Conclusion: The proposed mechanism is shown to be quite promising.

  31. Markov Logic Networks
      • A probabilistic extension of first-order logic.
      • A Markov logic network contains multiple assertions, called formulas, each of which is assigned a weight.
      • An instance does not have to satisfy all the formulas to confirm the final assertion. This "fuzziness" handles the differences between forum designs and contributes to the generalizability of the forum crawler.
      • Example of detecting a thread title (h: an HTML element)
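
A toy illustration of how weighted formulas give a soft decision. It assumes each formula has the form "feature(h) implies IsTitle(h)" and compares the two possible worlds (h is a title, or it is not) by the exponentiated sum of weights of satisfied formulas. The three features and their weights are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical weighted formulas, each read as "feature(h) => IsTitle(h)".
FORMULAS = [
    ("h is a link", 1.5, lambda h: h["tag"] == "a"),
    ("h sits in a table row", 0.8, lambda h: h["in_table_row"]),
    ("h has reasonably long text", 0.6, lambda h: len(h["text"]) >= 10),
]


def world_weight(h, is_title):
    """Sum the weights of formulas satisfied when IsTitle(h) takes the given value."""
    # An implication is violated only if the feature holds and h is not a title.
    return sum(w for _, w, feature in FORMULAS if (not feature(h)) or is_title)


def title_probability(h):
    """Probability that h is a thread title, normalized over the two worlds."""
    pos = math.exp(world_weight(h, True))
    neg = math.exp(world_weight(h, False))
    return pos / (pos + neg)


strong = {"tag": "a", "in_table_row": True, "text": "How to install Python"}
partial = {"tag": "a", "in_table_row": False, "text": "Help"}
weak = {"tag": "div", "in_table_row": False, "text": "x"}

p_strong, p_partial, p_weak = (title_probability(h) for h in (strong, partial, weak))
```

An element that satisfies only some of the features still receives a probability above one half, which mirrors the point on the slide that not every formula has to hold.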

  32. Take-away
      • A promising framework for facilitating spidering and parsing in the future.
      • MLN may be useful when a system needs to handle many differing input formats.
      • Potential application in AI Lab: employ MLN to process textual information intelligently.

  33. Summary

  34. References
      • H. Bao, E. Y. Chang. 2010. AdHeat: An Influence-based Diffusion Model for Propagating Hints to Match Ads. In Proceedings of the 19th International Conference on World Wide Web.
      • S. Goel, R. Muhamad, D. Watts. 2009. Social Search in "Small-World" Experiments. In Proceedings of the 18th International Conference on World Wide Web.
      • X. Jin, S. Spangler, R. Ma, J. Han. 2010. Topic Initiator Detection on the World Wide Web. In Proceedings of the 19th International Conference on World Wide Web.
      • Y. Lu, P. Tsaparas, A. Ntoulas, L. Polanyi. 2010. Exploiting Social Context for Review Quality Prediction. In Proceedings of the 19th International Conference on World Wide Web.
      • S. Rendle, C. Freudenthaler, L. Schmidt-Thieme. 2010. Factorizing Personalized Markov Chains for Next-Basket Recommendation. In Proceedings of the 19th International Conference on World Wide Web.

  35. References (continued)
      • H. Lee, D. Leonard, X. Wang, D. Loguinov. 2008. IRLbot: Scaling to 6 Billion Pages and Beyond. In Proceedings of the 17th International Conference on World Wide Web.
      • A. Goel, K. Munagala. 2009. Hybrid Keyword Search Auctions. In Proceedings of the 18th International Conference on World Wide Web.
      • J. Yang, R. Cai, Y. Wang, J. Zhu, L. Zhang, W. Ma. 2009. Incorporating Site-Level Knowledge to Extract Structured Data from Web Forums. In Proceedings of the 18th International Conference on World Wide Web.

  36. Social Search in "Small-World" Experiments. S. Goel (Yahoo!), R. Muhamad (Columbia University), D. Watts (Yahoo!)

  37. Overview (2009 Best Paper Nominee)
      • Research question: Are individuals able to find the theoretically shortest path connecting them to anyone in the social network?
      • Background: every pair of individuals is said to be connected by about 6 intermediaries.
      • Two notions of distance: topological distance and search distance.
      • Methodology: Message-forwarding experiment; multilevel logistic regression.
      • Conclusion: The mean chain length in the algorithmic sense is much larger than 6.

  38. Attrition in Connectivity
      • Attrition rate: the probability that message forwarding stops at a given node.
      • Motivation: estimate the real algorithmic chain length. It cannot be obtained directly because, in the experiment, more than 99% of messages fail to reach their final recipients because of attrition.
      • The attrition rate can be affected by network topology and individual differences.
      • People with high social status (educated, wealthy, etc.) tend to have a lower attrition rate.
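
To see why attrition biases the observed chain lengths, consider a simplified correction that is not the paper's multilevel model. It assumes a constant per-hop attrition rate, so a chain of length L completes with probability (1 - rate)^L, and each completed chain is re-weighted by the inverse of that probability. The function name, the rate and the observed lengths are hypothetical.

```python
from collections import Counter


def corrected_mean_length(completed_lengths, attrition_rate):
    """Estimate mean chain length by inverse-probability weighting completed chains."""
    survive = 1.0 - attrition_rate  # chance a message is forwarded at each hop
    counts = Counter(completed_lengths)
    # Long chains rarely complete, so each completed one stands for many lost ones.
    weights = {length: n / survive ** length for length, n in counts.items()}
    total = sum(weights.values())
    return sum(length * w for length, w in weights.items()) / total


observed = [2, 2, 2, 3, 3, 4]  # lengths of chains that reached their target
naive = sum(observed) / len(observed)
corrected = corrected_mean_length(observed, attrition_rate=0.5)
```

The corrected mean exceeds the naive mean because short chains are over-represented among the few that complete, which is consistent with the slide's point that the real algorithmic chain length has to be estimated rather than read off the data.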

  39. Take-away
      • When modeling directed social relationships, we may take individual differences into account.
      • Potential application in AI Lab: consider attrition in opinion diffusion models.
