SocialMapExplorer: Visualizing Social Networks of Massively Multiplayer Online Games in Temporal-Geographic Space
Y. Dora Cai, Iftekhar Ahmed, Andrew Pilny, Channing Brown, Yannick Atouba, M. Scott Poole
Growing Popularity of Online Games
• 135 million gamers are playing worldwide
• Thousands of game titles have been developed
• Enormous game logs have been generated and collected
• Game logs are a unique resource for social science studies
• Many researchers are working on game log analysis
A New Tool: SocialMapExplorer
• A web-based application for visualizing the social networks of online games
• Implemented using the Google Maps API, HTML, and JavaScript
• Highly interactive: users can choose analysis variables, aggregation levels, time periods, and location regions
• Uses visual features (color, size, shape, weight, and font) to represent various network features
• Visualizes data on a real map, tightly combining time and spatial information with other study attributes
• Capable of processing a terabyte-scale dataset with a complex data structure
• Applicable to the analysis of other data domains associated with temporal-geographic space
• 3 modules: NetViewer, GroupDetector, and CorrelationFinder
Data Source for SocialMapExplorer
• Anonymous game logs from EverQuest II
• Over 100,000 players over a nine-month period
• A data set (2.2 terabytes) hosted in an Oracle database
• A variety of logs collected at a resolution of seconds
• Hundreds of variables capturing players’ behavior
• Players’ zip codes, taken from user registration
Work Flow for SocialMapExplorer
• Step 1: Data summarization. Apply data-mining/data-warehouse techniques to construct materialized views on data cubes
• Step 2: Geocoding. Match players’ zip codes with an official USA zip-code book and assign latitude/longitude coordinates to each player
• Step 3: Data visualization. Visualize data on real maps
Step 1 and Step 2 are done on Gordon.
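A minimal sketch of Steps 1 and 2, for illustration only. The row fields (`zip`, `timestamp`), the function names, and the tiny `ZIP_BOOK` table with approximate coordinates are assumptions made for this example; the actual project used materialized views in an Oracle database and an official zip-code book.

```python
from collections import defaultdict

# Hypothetical zip-code book: zip code -> (latitude, longitude).
# The coordinates are approximate, illustrative values only.
ZIP_BOOK = {
    "61801": (40.11, -88.20),
    "92093": (32.88, -117.23),
}


def summarize(log_rows):
    """Step 1 (sketch): roll raw log rows up to (zip, hour) event counts."""
    counts = defaultdict(int)
    for row in log_rows:
        hour = row["timestamp"][:13]  # keep "YYYY-MM-DDTHH" of an ISO timestamp
        counts[(row["zip"], hour)] += 1
    return dict(counts)


def geocode(summary, zip_book):
    """Step 2 (sketch): attach coordinates; zips missing from the book are skipped."""
    located = []
    for (zip_code, hour), count in sorted(summary.items()):
        if zip_code in zip_book:
            lat, lng = zip_book[zip_code]
            located.append(
                {"zip": zip_code, "hour": hour, "count": count, "lat": lat, "lng": lng}
            )
    return located
```

Because both steps produce small summary tables, they only need to run once; the visualization step can then read the geocoded summaries instead of the raw logs.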
Module: NetViewer
• Designed for analyzing network dynamics by visualizing social networks as time series
• Traces networking events and links the involved parties
• Users can choose different data sets based on their interest
• Displays networks at different intervals: minute/hour/day
• Runs in two modes: dynamic and static
• Uses AJAX to automatically reload parts of the display
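One way to picture the minute/hour/day intervals is as a bucketing of timestamped events into per-interval edge lists. The sketch below assumes events arrive as `(timestamp, source, target)` tuples with ISO timestamps and treats links as undirected; these choices and the names `FORMATS` and `snapshots` are illustrative, not taken from NetViewer itself.

```python
from collections import defaultdict
from datetime import datetime

# Interval name -> strftime pattern used as the snapshot key.
FORMATS = {
    "minute": "%Y-%m-%d %H:%M",
    "hour": "%Y-%m-%d %H",
    "day": "%Y-%m-%d",
}


def snapshots(events, interval="hour"):
    """Group (timestamp, source, target) events into per-interval edge weights."""
    fmt = FORMATS[interval]
    frames = defaultdict(lambda: defaultdict(int))
    for ts, src, dst in events:
        key = datetime.fromisoformat(ts).strftime(fmt)
        edge = tuple(sorted((src, dst)))  # assumption: links are undirected
        frames[key][edge] += 1
    # Return plain dicts, ordered by snapshot key.
    return {key: dict(edges) for key, edges in sorted(frames.items())}
```

A dynamic mode could step through the returned snapshots in order, while a static mode could render a single one.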
Module: GroupDetector
• Designed to detect groups and visualize group evolution
• Scans game logs and identifies the trigger events for group reorganization
• Users can choose game tasks and time periods
• Displays a single group or multiple groups
• Runs in two modes: dynamic and static
• Uses AJAX to automatically reload parts of the display
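Group evolution can be sketched as a replay of trigger events. The slides do not say which log events trigger a reorganization, so the `"join"` and `"leave"` actions and the `(time, action, player, group)` tuple layout below are hypothetical.

```python
def group_timeline(events):
    """Replay (time, action, player, group) events in time order and
    record each group's membership after every trigger event."""
    members = {}   # group id -> set of current players
    timeline = []  # (time, group id, frozen membership) after each change
    for time, action, player, group in sorted(events):
        current = members.setdefault(group, set())
        if action == "join":       # hypothetical trigger event
            current.add(player)
        elif action == "leave":    # hypothetical trigger event
            current.discard(player)
        else:
            continue               # not a trigger event; membership unchanged
        timeline.append((time, group, frozenset(current)))
    return timeline
```

Each timeline entry corresponds to one frame of a group's evolution; filtering by group id gives the single-group view.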
Module: CorrelationFinder
• Designed to discover correlations between census data and game play
• Visualizes census variables as background colors at the county level, and players’ behaviors as foreground markers and links
• Reveals hidden correlations by overlaying the two layers
• Users can choose analysis variables from census data and game behavior data
• Users can select locations and regions based on their interest
• Visualizes variables in a quantitative manner
• Verifies correlations by statistical methods
Is there a correlation between them?
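The slides report correlation coefficients without naming the statistic. Assuming it is the Pearson coefficient (an assumption, not confirmed by the source), a check over paired per-region values could look like this:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Here `xs` would hold a census variable per region and `ys` the matching game-play variable for the same regions.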
CorrelationFinder – Overlapping Technique
Two layers:
• Each county of California is filled using gradient colors based on the population density
• Player volume (aggregated to the zip-code level) is represented as markers with gradient colors
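Both layers rely on mapping a numeric value to a gradient color. A simple linear interpolation between two RGB endpoints is sketched below; the endpoint colors and the function name `gradient_color` are arbitrary choices for this example, not the palette used by the tool.

```python
def gradient_color(value, lo, hi, start=(255, 255, 204), end=(128, 0, 38)):
    """Map value in [lo, hi] to a hex color between start and end (RGB tuples).
    Values outside the range are clamped to the nearest endpoint."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    t = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    r, g, b = (round(s + t * (e - s)) for s, e in zip(start, end))
    return f"#{r:02x}{g:02x}{b:02x}"
```

The same function could color a county fill from population density and a marker from player volume, each with its own `lo`/`hi` range and endpoint colors.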
CorrelationFinder: Population with Player Volume (in California) Correlation Coefficient = 0.923584123
CorrelationFinder: Age 18-44 with Chat Volume (in Arizona) Correlation Coefficient = 0.977620422
CorrelationFinder: Asian Population with Player Volume (in Washington) Correlation Coefficient = 0.864663465
CorrelationFinder: Farmer Population with Chat Volume (in New York) Correlation Coefficient = -0.49464465
Computation Complexity
Major computation cost:
• Data Summarization
• Geocoding
• Data Visualization
• m – number of rows (R) in game logs
• n – number of time and location attributes (A)
• p – number of aggregation levels (L)
• m – number of players (P) in game logs
• n – number of zip codes in the zip-code book (Z)
• x – number of snapshots in time series (T)
• m – number of edges (E) in drawing
• n – number of markers (R) in drawing
• p – number of links (L) in drawing
Data summarization and geocoding are done on Gordon. These two steps need to be done only once, and the results are stored in a database table.
Data Analysis on Gordon
• Gordon’s many large-memory compute nodes speed up the data processing for Steps 1 and 2
  On a standalone server: with 8 CPUs and 12 GB RAM, data summarization and geocoding took over 500 hours
  On Gordon: 8 parallel jobs, each using 16 cores, all finished within 48 hours
• The software stack supported on Gordon, especially R, allows the project to run lengthy and complex data analyses
• The system support group and consulting office at SDSC always provide prompt service
• We appreciate the effort of SDSC’s Gordon team
Potential Usage for Other Data Domains
• Epidemics of infectious diseases, such as influenza, tuberculosis, and hepatitis
• Dynamics of storms and hurricanes
• Evolution of an earthquake
• Trends in water contamination
• Correlation between census data and diabetes, stroke, and heart attack
• Correlation between census data and political election results
• Correlation between census data and crime rate
Future Work
• Develop new modules to enhance the functionality of SocialMapExplorer
• Apply this tool to the analysis of other data domains associated with temporal-geographic space
• Integrate this tool with new technologies: cloud, Hadoop, and non-relational databases
Try it: http://db.ncsa.uiuc.edu/~ycai/sme_menu/sme_menu.html
Acknowledgement
• National Science Foundation (NSF IIS-0729421 and NSF IIS-1247861)
• Army Research Institute (ARI W91WAW-08-C-0106)
• Air Force Research Lab (AFRL Contract No. FA8650-10-C-7010)
• Army Research Lab (ARL) Network Science – Collaborative Technology Alliance (NS-CTA) via BBN TECH/W911NF-09-2-0053
• National Science Foundation via the XSEDE project’s Extended Collaborative Support Service under Grant NSF-OCI 1053575
• The Gordon group at SDSC
• The Campus Cluster group at NCSA/UIUC
Population with Player Volume (in 48 States) Correlation Coefficient = 0.838687555