“Never doubt that a small group of thoughtful, committed citizens can change the world. Indeed, it is the only thing that ever has.” --Margaret Mead. Thank You R Hackers of NYC. Harvesting & Analyzing Interaction Data in R: The Case of MyLyn. Sean P. Goggins, PhD Drexel University
Harvesting & Analyzing Interaction Data in R: The Case of MyLyn. Sean P. Goggins, PhD, Drexel University, outdoors@acm.org. MyLyn Research Collaborators: Peppo Valetto, PhD (PI) & Kelly Blincoe
I Study Small Groups I use electronic trace data, interviews, field notes, electronic content & surveys for raw data
Coolest Open* Data to Me • Groups Emerging & Evolving • Group Formation & Development • The long tail of social computing, which I describe as everything *except* Wikipedia & Facebook • Groups constructing knowledge, creating information and forming identity. *Available, but not always easy to get in an analyzable form
Points • Harvesting Small, Open Data [MyLyn] • Analyzing • Temporal Changes in the MyLyn Network • Work • Talk • Libraries Used & Source Code • StatNet • iGraph • TNET • R source code and data will be available for download at http://www.groupinformatics.org . If you use this data or these scripts, please cite: • Goggins, S. P., Laffey, J., Amelung, C., and Gallagher, M. 2010. Social Intelligence In Completely Online Groups. IEEE International Conference on Social Computing. 500-507. DOI=10.1109/SocialCom.2010.79. • Blincoe, K., Valetto, G., and Goggins, S. 2011. Leveraging Task Contexts for Managing Developers’ Coordination. Under Review.
Data for R An Example From the MyLyn Project
More About MyLyn: http://tasktop.com/blog/ http://www.eclipse.org/mylyn/ .zip file MyLyn Context Uploads Work Bug Database Talk MySQL Database Talk Talk HTML Parser
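The Talk side of this pipeline (bug pages, an HTML parser, a database) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual harvester: the HTML markup, the `comment-author` class name, the bug and developer IDs, and the `talk` table are all hypothetical, and an in-memory SQLite database stands in for the MySQL database named on the slide.

```python
import sqlite3
from html.parser import HTMLParser


class CommentAuthorParser(HTMLParser):
    """Collect the text of <span class="comment-author"> elements.

    The class name is an assumption made for this sketch; a real bug
    tracker page will use its own markup.
    """

    def __init__(self):
        super().__init__()
        self._in_author = False
        self.authors = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "comment-author") in attrs:
            self._in_author = True

    def handle_data(self, data):
        if self._in_author and data.strip():
            self.authors.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_author = False


# Invented page fragment standing in for one downloaded bug report.
page = (
    '<div><span class="comment-author">dev-a</span> first comment</div>'
    '<div><span class="comment-author">dev-b</span> second comment</div>'
)

parser = CommentAuthorParser()
parser.feed(page)

# SQLite replaces MySQL here only so the sketch runs without a server.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE talk (bug_id TEXT, author TEXT)")
db.executemany(
    "INSERT INTO talk VALUES (?, ?)",
    [("bug-1", author) for author in parser.authors],
)
rows = db.execute(
    "SELECT author, COUNT(*) FROM talk GROUP BY author ORDER BY author"
).fetchall()
print(rows)
```

Once the comments sit in a table like this, exporting an edge list for R is a matter of one query per release.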
Talk Cues Work Talk
Coordination Requirements & Dependencies • MyLyn data has two advantages for analysis compared to source control systems analysis: • You see files *viewed* together • Discourse on a bug is directly connected to the files read and edited • Closer connection between analysis of work & talk. Talk Work
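One way to turn "files viewed together" into a Work network is to link two developers whenever both touched the same file. The sketch below assumes that rule; the slides do not spell out the exact one used. It is plain Python rather than the R packages used in the talk, and the record layout, developer IDs, bug IDs and file paths are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical task-context records: (developer, bug, file touched).
contexts = [
    ("dev-a", "bug-1", "tasks/TaskList.java"),
    ("dev-b", "bug-2", "tasks/TaskList.java"),
    ("dev-b", "bug-2", "ui/Editor.java"),
    ("dev-c", "bug-3", "ui/Editor.java"),
    ("dev-d", "bug-4", "core/Sync.java"),
]


def work_network(records):
    """Return {(dev1, dev2): number of files both developers touched}."""
    devs_by_file = defaultdict(set)
    for dev, _bug, path in records:
        devs_by_file[path].add(dev)

    weights = defaultdict(int)
    for devs in devs_by_file.values():
        # Every pair of developers sharing this file gains one unit of weight.
        for a, b in combinations(sorted(devs), 2):
            weights[(a, b)] += 1
    return dict(weights)


edges = work_network(contexts)
print(edges)
```

The resulting weighted edge list is the kind of input that network packages such as those named in this deck can read.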
Harvesting Data for R An Example From the MyLyn Project
MyLyn Interaction Datamart Talk Work MyLyn Interaction Warehouse ETC CANS Work Talk
Analyzing Open Data with R An Example From the MyLyn Project
Analysis Tools • Eight Mylyn Releases (Temporal Analysis) • R Packages Used • TNET • iGraph • Statnet
The Dense Graph (Work) • Developers create a dense graph. Not a complete graph, but dense. Work
A Sparser Graph (Talk) • Commenters create a sparse graph Talk
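"Dense" and "sparse" can be made concrete with graph density: the share of possible ties that are actually present. The following is a small plain-Python sketch (not the R code from the talk), and both edge lists are invented toy data rather than MyLyn measurements.

```python
def density(n_nodes, edges):
    """Fraction of possible undirected ties that exist (0.0 to 1.0)."""
    possible = n_nodes * (n_nodes - 1) / 2
    # Treat (a, b) and (b, a) as the same tie and ignore duplicates.
    present = {tuple(sorted(edge)) for edge in edges}
    return len(present) / possible if possible else 0.0


# Toy networks on the same four people.
work_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]  # 5 of 6 possible ties
talk_edges = [(1, 2), (3, 4)]                          # 2 of 6 possible ties

work_density = density(4, work_edges)
talk_density = density(4, talk_edges)
print(work_density, talk_density)
```

A complete graph has density 1.0, so a dense but incomplete Work graph sits just below that, while a sparse Talk graph sits much lower.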
Release One (2.0) Analysis Release 1 Discussion Code Talk Work iGraph
STATNET for Discussion • StatNet Red = Bug Commenter Blue = Bug Opener Release 1 Talk StatNET
Release One Work & Talk
Release 1 (2.0) iGraph & Statnet Talk Red = Bug Commenter Blue = Bug Opener Clusters Release 1 StatNET In Degree & Out Degree iGraph
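In-degree and out-degree on a directed Talk network can be sketched as below. The sketch assumes an edge runs from a commenter to the person who opened the bug; the slides do not state the direction used, so treat that as an assumption. It is plain Python, and the names and ties are invented.

```python
from collections import Counter

# Hypothetical directed ties: (commenter, bug opener).
comment_ties = [
    ("dev-a", "dev-b"),
    ("dev-c", "dev-b"),
    ("dev-b", "dev-a"),
    ("dev-a", "dev-c"),
]

# Out-degree: how often a person comments on someone else's bug.
out_degree = Counter(src for src, _dst in comment_ties)
# In-degree: how often a person's bugs receive comments.
in_degree = Counter(dst for _src, dst in comment_ties)

print(out_degree)
print(in_degree)
```

People with high in-degree attract discussion, while people with high out-degree spread their comments across other people's bugs.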
Release One (2.0): Filtered Code Discussion Talk Release 1 Work Google Summer Coder 304, 373, 399 & 143 form the strongest connections in both networks Red = Bug Commenter Blue = Bug Opener
Release One (2.0): Filtered Code Discussion Work Talk Release 1 457, 391 & 159 – Comment & Open Google Summer Coder 304, 373, 399 & 143 form the strongest connections in both networks Red = Bug Commenter Blue = Bug Opener
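Filtering a network down to its strongest connections can be as simple as dropping ties below a weight threshold. The sketch below assumes that approach; the slides do not say which filter was applied. It is plain Python, and the weights and the threshold are invented.

```python
def filter_ties(weighted_edges, min_weight):
    """Keep only ties whose weight is at least min_weight."""
    return {pair: w for pair, w in weighted_edges.items() if w >= min_weight}


# Hypothetical tie weights, e.g. number of shared files or exchanged comments.
all_ties = {
    ("dev-a", "dev-b"): 12,
    ("dev-a", "dev-c"): 9,
    ("dev-b", "dev-d"): 2,
    ("dev-c", "dev-e"): 1,
}

strong_ties = filter_ties(all_ties, min_weight=5)
# People who still have at least one tie after filtering.
core_people = sorted({person for pair in strong_ties for person in pair})
print(strong_ties, core_people)
```

Running the same filter on the Work and Talk networks and comparing the surviving people is one way to see who is central in both.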
Compare Over Time First & Last Release
Release 1 (2.0) Compared to Release 8 (3.3) Release 1 Talk Release 8 304, 399, 143, 159, 173, 373 399, 118, 304, 159, 391, 416 StatNET & ordinary plotting
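The two ID lists on this slide can be compared directly. The plain-Python sketch below uses only the IDs shown here; it assumes each list names the participants highlighted in the Talk plot for that release, which the slide does not define precisely.

```python
# Participant IDs as listed on the slide for each release.
release_1 = {304, 399, 143, 159, 173, 373}
release_8 = {399, 118, 304, 159, 391, 416}

shared = release_1 & release_8        # highlighted in both releases
only_first = release_1 - release_8    # highlighted in Release 1 only
only_last = release_8 - release_1     # highlighted in Release 8 only

# Jaccard similarity: size of the overlap divided by size of the union.
jaccard = len(shared) / len(release_1 | release_8)

print(sorted(shared), sorted(only_first), sorted(only_last), round(jaccard, 3))
```

Three of the six IDs (159, 304 and 399) appear in both lists, while the rest change between the first and last release.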
Release 1 (2.0) Compared to Release 8 (3.3) Work Release 1 143 & 304 disengaged or missing entirely Release 8 304, 373, 399 & 143 Two disconnected graphs in release 8 iGraph
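Whether a release's network splits into disconnected graphs can be checked by counting connected components. Below is a minimal breadth-first-search sketch in plain Python rather than the iGraph code used for the slide; the edge list is invented toy data.

```python
from collections import defaultdict, deque


def components(edges):
    """Return the connected components of an undirected graph as sets."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from start.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(group)
    return groups


# Toy network with two groups that never touch each other.
toy_edges = [("dev-a", "dev-b"), ("dev-b", "dev-c"), ("dev-d", "dev-e")]
groups = components(toy_edges)
print(len(groups), groups)
```

A count above one means the network has split into separate groups, as the slide reports for Release 8.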
Release Eight Work & Talk
Release 8 (3.3): Filtered Discussion Talk Code Release 8 Nobody is “Just Blue” Work Red = Bug Commenter Blue = Bug Opener
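The red/blue coding and the observation that nobody is "just blue" can be expressed as a set comparison between bug openers and bug commenters. This is a plain-Python sketch with an invented event log, not the project's data or its StatNet plotting code.

```python
# Hypothetical event log: (person, action) with action "open" or "comment".
events = [
    ("dev-a", "open"),
    ("dev-a", "comment"),
    ("dev-b", "open"),
    ("dev-b", "comment"),
    ("dev-c", "comment"),
]

openers = {person for person, action in events if action == "open"}
commenters = {person for person, action in events if action == "comment"}

just_blue = openers - commenters   # open bugs but never comment
just_red = commenters - openers    # comment but never open a bug
both = openers & commenters        # do both

print(sorted(just_blue), sorted(just_red), sorted(both))
```

An empty "just blue" set means everyone who opened a bug in that release also took part in the discussion.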
Release 8 (3.3): Filtered Discussion Release 8 Talk Code Work Notice 416 in Talk & Second Coder Graph Red = Bug Commenter Blue = Bug Opener
Release 8 (3.3) iGraph & Statnet 399, 118 & 159 are significant, but play with different clusters of other people. Release 8 Talk Red = Bug Commenter Blue = Bug Opener Clusters StatNET In Degree & Out Degree Blue Cluster iGraph
Releases One to Eight: High-Level Views Over Time
Discussion, Releases 1 – 8 Where there is no color, there are multiple, incomplete graphs.
Code, Releases 1 – 8 One possible explanation: a few central people who slowly but observably begin to engage other contributors in an open source software development project. Structure evolves Key groups evolve iGraph
Next Step: The Story But that’s the research part, not the cool “R Stuff” Part
The People 373 304 399 159 143 Our next step is piecing together a narrative about the groups that emerged on this project, and describing each of the individuals. This is all open data. When we finish this part, we will publish one or more papers. For now, let’s look at the cool “R Stuff”
Interaction Traces from Small Groups: The Case of MyLyn. Sean P. Goggins, PhD, Drexel University, outdoors@acm.org. Collaborators: Peppo Valetto, PhD & Kelly Blincoe. Questions? In the after session.