Mining Bulletin Board Systems Using Community Generation. Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou PAKDD’08 Reporter : Che-Wei, Liang Date: 2008.07.10. Outline. Introduction General Model Interest-Sharing Group Identification Predicting User Behavior Using Generated Community
Mining Bulletin Board Systems Using Community Generation Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou PAKDD’08 Reporter: Che-Wei, Liang Date: 2008.07.10
Outline • Introduction • General Model • Interest-Sharing Group Identification • Predicting User Behavior Using Generated Community • Experiment
Introduction • Bulletin Board System (BBS) • A platform for exchanging and sharing information • Consists of a number of boards • Users can read and post messages on different topics • Users with similar interests may take similar actions • Effectively discovering relationships between the users of a BBS is essential
General Model • Consider the posted messages • Assume the title fully determines the topics of a message • Extract keywords from the titles • Map the keywords to the collected topics • A BBS user tends to join discussions on topics that he or she is interested in • The messages a user posts may reflect that user's interests • Users' interests are time-dependent • The frequency of posted messages should also be assessed
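The title-to-topic step can be sketched as follows. This is a minimal illustration, not the authors' extraction method: the keyword table `KEYWORD_TOPICS` and the helper names are hypothetical, and whitespace tokenization stands in for whatever keyword extraction the paper uses.

```python
from collections import Counter

# Hypothetical keyword-to-topic table; the slides do not list the real one.
KEYWORD_TOPICS = {
    "tank": "modern weapons",
    "missile": "modern weapons",
    "python": "programming",
    "compiler": "programming",
    "warcraft": "computer games",
}


def topics_of_title(title):
    """Return the set of topics whose keywords appear in a message title."""
    words = title.lower().split()
    return {KEYWORD_TOPICS[w] for w in words if w in KEYWORD_TOPICS}


def topic_frequencies(titles):
    """Count how many titles touch each topic (each title counts once per topic)."""
    counts = Counter()
    for title in titles:
        counts.update(topics_of_title(title))
    return counts
```

Counting per title rather than per keyword keeps a title with two keywords of the same topic from being counted twice.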
General Model • Access pattern of BBS users • View of Topics • A set of topics and user access frequencies of the messages posted to different boards by different users along the timeline • View of Boards • A set of boards and frequencies of messages posted to the boards along the timeline
General Model • BBS model • A collection of users, each being represented by two timelines of actions on Boards view and Topics view
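One possible in-memory layout for this model is sketched below. It assumes the timeline is discretized into integer time slots (for example, day indices); the class name `User`, its fields, and `record_post` are illustrative choices, not names from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class User:
    """A BBS user represented by two timelines of action frequencies."""
    user_id: str
    # Each view maps a time slot (assumed integer index) to a Counter of frequencies.
    boards_view: dict = field(default_factory=dict)
    topics_view: dict = field(default_factory=dict)

    def record_post(self, slot, board, topics):
        """Register one posted message in both views for the given time slot."""
        self.boards_view.setdefault(slot, Counter())[board] += 1
        self.topics_view.setdefault(slot, Counter()).update(topics)
```

Keeping the two views separate mirrors the slide: one message increments a single board count but may increment several topic counts.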
Interest-Sharing Group Identification • Given two timelines of actions X and Y of two users idx and idy • A straightforward way • Similarity between Xi and Yj =
Interest-Sharing Group Identification • Average frequency differences of actions • Local similarity between Xi and Yj
Interest-Sharing Group Identification • Hybrid similarity between Xi and Y • Global similarity between X and Y
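The slide formulas are not preserved in this transcript, so the sketch below only illustrates the three-level structure (local, hybrid, global) with stand-in definitions. Every formula here is an assumption: local similarity is taken as 1 / (1 + average absolute frequency difference), hybrid similarity as the best time-discounted local match, and global similarity as the mean hybrid similarity. The `decay` parameter is hypothetical.

```python
def local_similarity(xi, yj):
    """Stand-in local similarity between two time slots (dicts of frequencies)."""
    keys = set(xi) | set(yj)
    if not keys:
        return 0.0
    avg_diff = sum(abs(xi.get(k, 0) - yj.get(k, 0)) for k in keys) / len(keys)
    return 1.0 / (1.0 + avg_diff)


def hybrid_similarity(xi, i, y, decay=0.5):
    """Best match of slot i of X against any slot j of Y, discounted by |i - j|."""
    if not y:
        return 0.0
    return max(decay ** abs(i - j) * local_similarity(xi, yj)
               for j, yj in enumerate(y))


def global_similarity(x, y, decay=0.5):
    """Mean hybrid similarity over all slots of X (assumed aggregation)."""
    if not x or not y:
        return 0.0
    return sum(hybrid_similarity(xi, i, y, decay) for i, xi in enumerate(x)) / len(x)
```

Under these assumptions a timeline compared with itself scores 1.0, and the score drops as the frequency profiles diverge or the matching slots drift apart in time.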
Predict User Behavior Using Generated Community • Given a user idi, • Predict what action idi may take in the near future • Actions that have been taken by idi may be closely related to idi’s future actions • Possible solution • Compute posterior probability
Predict User Behavior Using Generated Community • Resolved with interest-sharing groups • Similar users may take similar actions at some time instants
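As a rough illustration of using an interest-sharing group for prediction, the sketch below estimates action probabilities by a similarity-weighted vote over the recent actions of a user's neighbors. This is a simplified stand-in for the posterior computation mentioned on the slides, not the authors' model; `predict_action` and its arguments are hypothetical.

```python
from collections import Counter


def predict_action(neighbor_actions, similarities):
    """Similarity-weighted vote over the recent actions of similar users.

    neighbor_actions: dict mapping neighbor id -> list of recent actions
    similarities:     dict mapping neighbor id -> similarity to the target user
    Returns (most likely action, dict of normalized scores).
    """
    scores = Counter()
    for user, actions in neighbor_actions.items():
        weight = similarities.get(user, 0.0)
        for action in actions:
            scores[action] += weight
    total = sum(scores.values())
    if total == 0:
        return None, {}
    probs = {action: score / total for action, score in scores.items()}
    best = max(probs, key=probs.get)
    return best, probs
```

For example, with neighbors `u1` (similarity 0.5, two posts on "Games") and `u2` (similarity 0.25, one post on "Programming"), the vote favors "Games".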
Experiment • Data Set • BBS of Nanjing University • Messages collected from January 1st, 2003 to December 1st, 2005 on the 17 most popular boards • 4512 topics on 17 boards, 1109 users • Evaluation set • 42 volunteers: 18 users interested in modern weapons, 12 users fond of programming skills, and the remaining users interested in computer games
Experiments on Community Generation • Neighborhood accuracy • Describes how accurately the neighbors of a user in a generated community share that user's interests • Component accuracy • Measures how well the generated groups represent interests that are common to the individuals in each group
Experiments on Community Generation • Example • A generated community, 7 links between similar users, 10 links between dissimilar users • Neighborhood accuracy = (7+10)/21 ≈ 0.810 • Component accuracy = (7+0)/21 ≈ 0.333
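The arithmetic in the example can be checked directly. The snippet only reproduces the two ratios as written on the slide; the helper name `ratio` is illustrative, and the counts 7, 10, 0, and 21 are taken from the example as given.

```python
def ratio(correct, total):
    """Fraction of correct pairs out of all pairs, rounded to three decimals."""
    return round(correct / total, 3)


neighborhood_accuracy = ratio(7 + 10, 21)  # numerator as given on the slide
component_accuracy = ratio(7 + 0, 21)      # numerator as given on the slide
```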
Experiments on Community Generation • Compare with CORAL
Experiments on Community Generation • Running time comparison
Experiments on User Behavior Prediction • 1056 days for training the probability model • Last 10 days for testing
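The split is consistent with the collection period if both end dates are counted: January 1st, 2003 through December 1st, 2005 spans 1066 days, or 1056 + 10. A small sketch, assuming daily granularity and inclusive counting (neither is stated on the slides; `split_days` is a hypothetical helper):

```python
from datetime import date, timedelta

START = date(2003, 1, 1)
END = date(2005, 12, 1)


def split_days(start, end, test_days=10):
    """Split an inclusive date range into training days and the last test days."""
    total = (end - start).days + 1  # +1 because both endpoints are counted
    all_days = [start + timedelta(days=k) for k in range(total)]
    return all_days[:-test_days], all_days[-test_days:]


train_days, test_days = split_days(START, END)
```

Under these assumptions the test window runs from November 22nd to December 1st, 2005.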