Twitter Community Discovery & Analysis Using Topologies Andrew McClain Karen Aguar
Outline • Introduction • Motivation • Project Description • Community Discovery • Data Collection • Analysis & Application • Results
Introduction • Many people use services like Twitter to stay in contact with groups they belong to or to interact with other people who share their interests • These groups are considered “communities”
Community? • A network or group of nodes with stronger ties internally than to the rest of the network • Communities come in several varieties: • Some communities are tightly bound together • Others are loose associations of people
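The definition above can be checked on a small graph by counting ties inside a candidate group against ties that cross its boundary. The following is a minimal sketch, not the method used in the project; the function name and the toy edge list are hypothetical.

```python
def internal_external_ties(edges, members):
    """Count ties inside a candidate group vs. ties crossing its boundary."""
    members = set(members)
    internal = external = 0
    for u, v in edges:
        if u in members and v in members:
            internal += 1          # both endpoints are in the group
        elif u in members or v in members:
            external += 1          # exactly one endpoint is in the group
    return internal, external


# Toy example (made-up data): a, b, c form a triangle; only c links outward.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "x"), ("x", "y")]
internal, external = internal_external_ties(edges, {"a", "b", "c"})
print(internal, external)
```

A group whose internal count is well above its external count fits the slide's notion of a community; how large the gap must be is a judgment call.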
Motivation • To classify these communities & find real-world implications of their digital associations • Project Description: Discovering communities & examining the properties of their graphs to gain insight into the communities themselves. • Ex: Find the organizers of a hobby group from its Twitter activity
Our Project Our project is composed of 2 main sections • Twitter community discovery • Analysis of the community graphs & their correlation to the real-world community structure
Community Discovery • Collected data from a diverse set of individuals in known real-world communities • Generated graphs of the communities • Partitioned the graphs based on in/out degrees to isolate the community
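The slides do not say which degree thresholds were used for partitioning. As a sketch under that caveat, one simple approach is to keep only accounts that both follow and are followed by someone in the collected graph. The thresholds `min_in` and `min_out`, and the edge convention `(follower, followed)`, are assumptions for illustration.

```python
from collections import Counter


def filter_by_degree(edges, min_in=1, min_out=1):
    """Keep edges whose endpoints both meet minimum in- and out-degree.

    Each edge is (follower, followed). Degrees are computed once on the
    full graph; the filter is a single pass, not repeated to a fixed point.
    """
    in_deg = Counter(dst for _, dst in edges)
    out_deg = Counter(src for src, _ in edges)
    nodes = set(in_deg) | set(out_deg)
    keep = {n for n in nodes
            if in_deg[n] >= min_in and out_deg[n] >= min_out}
    return [(s, d) for s, d in edges if s in keep and d in keep]


# Toy example (made-up data): c only follows, d is only followed.
edges = [("a", "b"), ("b", "a"), ("c", "a"), ("a", "d")]
core = filter_by_degree(edges)
print(core)
```

Raising the thresholds removes more of the background; the right values depend on the community being studied.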
Community Discovery • Communities: • @CNN • @AthensGroupRide • @AthensChurch • @UniversityOfGA • @ChickFilA
Data Collection • Relationships Modeled: • Followed By / Following • Replies to • Mentions • Parameters • 1.5 Levels • Limit on # of people included in each network • Most networks limited to ~300 people
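A 1.5-level network is usually understood as the seed account, its direct neighbors, and the ties among those neighbors. The sketch below builds such a network from an already collected edge list; it does not call the Twitter API. The function name is hypothetical, and trimming neighbors alphabetically to honor the limit is an assumption made only to keep the example deterministic.

```python
def ego_network_1_5(edges, ego, limit=300):
    """Return edges among the ego, its neighbors, and between neighbors.

    Each edge is (source, target) for any modeled relationship
    (follows, replies to, mentions).
    """
    neighbors = {v for u, v in edges if u == ego}
    neighbors |= {u for u, v in edges if v == ego}
    neighbors.discard(ego)
    # Assumption: cap the network size by taking neighbors in sorted order.
    kept = set(sorted(neighbors)[:limit]) | {ego}
    return [(u, v) for u, v in edges if u in kept and v in kept]


# Toy example (made-up data): z and y are two steps from the ego and drop out.
edges = [("ego", "a"), ("b", "ego"), ("a", "b"), ("a", "z"), ("z", "y")]
print(ego_network_1_5(edges, "ego"))
```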
Analysis & Application • Manually reconstructed the hierarchy of the known real-world communities • Used Gephi to detect behavior patterns and structures in Twitter communities • Shape, interconnectivity, how information flows through them • Analyzed the relationships in the graphs against known community structures
Analysis via Gephi • Gephi: an open-source graph visualization platform • We used Gephi to isolate the community from the noisy background
Analysis via Gephi • After isolating the communities, labels were sized based on in-degree • The assumption is that the people who are listened to are followed most in the community • The spline on the right shows the scale of the labels • At this time, the analysis of importance is done visually
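The in-degree sizing described above was done inside Gephi; the sketch below only reproduces the idea outside the tool, using a linear scale between a minimum and maximum label size. The size bounds and the function name are hypothetical, and the edge convention `(follower, followed)` is an assumption.

```python
from collections import Counter


def label_sizes(edges, min_size=1.0, max_size=4.0):
    """Map each account to a label size that grows linearly with in-degree."""
    in_deg = Counter()
    for src, dst in edges:
        in_deg[dst] += 1
        in_deg[src] += 0   # make sure accounts with no followers appear
    if not in_deg:
        return {}
    lo, hi = min(in_deg.values()), max(in_deg.values())
    span = (hi - lo) or 1  # avoid division by zero when all degrees match
    return {n: min_size + (d - lo) * (max_size - min_size) / span
            for n, d in in_deg.items()}


# Toy example (made-up data): c is followed by two accounts, b by one.
edges = [("a", "c"), ("b", "c"), ("a", "b")]
sizes = label_sizes(edges)
for name in sorted(sizes, key=sizes.get, reverse=True):
    print(name, sizes[name])
```

Sorting by size gives a ranked list, which could complement the visual inspection the slide describes.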
Results What we found: • An interesting dichotomy between primarily online & primarily offline communities • “Celebrity” Noise Effect • Once a celebrity is introduced to a community, everyone follows them and they become a central individual in the community structure
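One way to flag possible celebrity noise is to look for accounts followed by nearly every other member of the graph. This is a sketch only: the 0.8 threshold and the function name are hypothetical, and the slides do not describe an automated test for this effect.

```python
def celebrity_candidates(edges, threshold=0.8):
    """List accounts followed by at least `threshold` of the other members.

    Each edge is (follower, followed).
    """
    nodes = {n for edge in edges for n in edge}
    if len(nodes) < 2:
        return []
    followers = {}
    for src, dst in edges:
        if src != dst:
            followers.setdefault(dst, set()).add(src)
    others = len(nodes) - 1
    return sorted(n for n, f in followers.items()
                  if len(f) / others >= threshold)


# Toy example (made-up data): every other account follows "star".
edges = [("a", "star"), ("b", "star"), ("c", "star"), ("a", "b")]
print(celebrity_candidates(edges))
```

Flagged accounts could then be removed or down-weighted before ranking members by in-degree.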
Results • Online Communities: • Athens Group Ride: the graph supports predictions about who is / is not important (by looking at in-degree) • Athens Church: most significant members are represented in the graph • A mega-church pastor introduces celebrity noise into the community • Offline Community: • ChickFilA’s information distribution is largely a uni-directional relationship. It doesn’t receive much information. • Semi-Online Communities (in between): • CNN, UniversityOfGA • Their graphs reveal information about the community structure, such as large organizations involved, but not much about the individuals in the network
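The uni-directional pattern noted for the offline community can be quantified as the share of an account's followers that it follows back. This is a sketch with a hypothetical function name and made-up data, not a measurement from the project.

```python
def follow_back_ratio(edges, account):
    """Fraction of an account's followers that the account follows back.

    Each edge is (follower, followed). Returns 0.0 for an account
    with no followers.
    """
    followers = {u for u, v in edges if v == account and u != account}
    following = {v for u, v in edges if u == account and v != account}
    if not followers:
        return 0.0
    return len(followers & following) / len(followers)


# Toy example (made-up data): a brand with four followers follows one back.
edges = [("f1", "brand"), ("f2", "brand"), ("f3", "brand"),
         ("f4", "brand"), ("brand", "f1")]
print(follow_back_ratio(edges, "brand"))
```

A ratio near 0 suggests a broadcast-style account; a ratio near 1 suggests two-way relationships.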
Results Athens Group Ride Ty_Magner, Philgaimon, Joeyrosskopf are determined to be most influential
Results University of GA
Results Athens Church Andy Stanley: celebrity effect
Results Chick-Fil-A Offline Community
Results CNN Extremely small community once filters are applied
Thank You! • Questions? References: • S. Parthasarathy, Y. Ruan, V. Satuluri, “Community Discovery in Social Networks: Applications, Methods and Emerging Trends” (2011) • gephi.com • nodexl.codeplex.com • twitter.com