Research on Online Social Networks: Time to Face the Real Challenges
Walter Willinger (AT&T Research Labs)
Reza Rejaie, Mojtaba Torkjazi, Masoud Valafar (University of Oregon)
Mauro Maggioni (Duke University)
HotMetrics’09, Seattle WA
Motivation
• Online Social Networks (OSNs) are becoming increasingly popular on the Internet
• This growing popularity has motivated researchers to characterize user connectivity and user interaction in OSNs
• Example: Facebook
  • Launched in 2004 and opened up to the general public in 2006
  • More than 200M users as of early 2009, with 300K new users per day
  • By late 2008, 300K images per second and 10 billion photos in total
• Characterizing OSNs is critical for
  • Developing measurement and performance modeling/analysis tools
  • Improving OSN network architecture and system design
  • Understanding privacy and user behavior
• Many existing OSN research studies seem to have lost sight of this unique opportunity for characterizing OSNs
HotMetrics 2009 - Seattle WA
State-of-the-art in OSN Characterization
• Main focus has been on
  • Simple connectivity structures (e.g. friendship graphs or interaction graphs)
  • Graph metrics such as node degree distribution, clustering coefficient, density, diameter
• Little is known about the actual structure and dynamics
  • User arrival/departure to/from the system
  • User interactions
  • Growth rate of the system
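The static graph metrics named on this slide can be computed from a plain edge list. The following is a minimal sketch in standard-library Python; the four-user `toy_edges` graph and all function names are illustrative assumptions, not data or code from the talk.

```python
from collections import Counter, deque
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) friendship pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())

def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def diameter(adj):
    """Longest shortest path, via BFS from every node (assumes a connected graph)."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best

# Hypothetical toy graph: a triangle a-b-c with a pendant user d attached to c.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
toy_adj = build_adjacency(toy_edges)
```

All four functions take a single fixed graph as input, which is why metrics of this kind say nothing about arrivals, departures, or growth.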
This Paper
• We argue that OSN research has to change course
  • Abandon the traditional treatment of OSNs as static networks and become serious about dealing with the dynamic nature of real OSNs
  • Come up with new techniques/tools for collecting and analyzing relevant data
Static Friendship Graph
• Caveat emptor: our toy examples are for illustration purposes only; they are not meant to describe real-world OSNs
• Toy example of a static friendship graph (TOYFB)
  • Hierarchical Scale-Free (HSF) networks [Barabasi2002]
  • HSF(n, m), where n is the size of the cell and m is the number of levels
  • HSF graphs show a power-law node degree distribution, rich local clustering properties, and a well-defined cluster-within-cluster structure
[Figure: construction of TOYFB in three steps. Step 1: HSF(5,0); Step 2: HSF(5,1); Step 3: HSF(5,2)]
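The slide does not spell out the HSF construction rule. As a rough sketch, assume the commonly described hierarchical replication scheme: start from a fully connected cell of n nodes with one designated center, then at each level make n-1 copies of the current graph and link every peripheral node of each copy to the original center. The function name `hsf` and the integer node labelling are illustrative choices.

```python
def hsf(n, m):
    """Build a hierarchical graph under the ASSUMED replication rule.

    Returns (edges, size): edges is a set of (u, v) pairs with u < v,
    and size is the number of nodes, labelled 0 .. size-1 (node 0 is the center).
    """
    # Level 0: a fully connected cell of n nodes.
    edges = {(u, v) for u in range(n) for v in range(u + 1, n)}
    size = n
    peripheral = set(range(1, n))  # every node except the center

    for _ in range(m):
        new_edges = set(edges)
        new_peripheral = set()
        for copy in range(1, n):
            offset = copy * size
            # Replicate the current graph with shifted node labels.
            new_edges.update((u + offset, v + offset) for u, v in edges)
            # Link the copy's peripheral nodes to the original center.
            for p in peripheral:
                new_edges.add((0, p + offset))
                new_peripheral.add(p + offset)
        edges, peripheral, size = new_edges, new_peripheral, size * n
    return edges, size

edges, size = hsf(5, 2)
center_degree = sum(1 for u, _v in edges if u == 0)
```

Under this assumed rule, `hsf(5, 2)` has 5^3 = 125 nodes, which matches the 125 users of the toy OSN, and the center becomes a hub of much higher degree than the other nodes.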
Dynamic Friendship Graph (I)
• We show a very elementary and highly stylized evolutionary process
  • 125 users join our toy OSN over time
  • They become friends with other users over time
  • They become inactive after a while
• Consider the time interval [0, 1], where the graph structure changes at time points 1/16, 2/16, ..., 16/16
• At each of these discrete points, some friendship relations become inactive and some new friendships are established
Dynamic Friendship Graph (II)
[Figure sequence: snapshots of the toy friendship graph along a time axis, one slide per time point t = 1/16, 2/16, ..., 16/16]
Dynamic Friendship Graph (III)
• A full crawl of TOYFB is identical to HSF(5,2), since the friend lists maintained by users do not reflect any de-activation of friendship relations
• Properties of the temporal snapshots, TOYFB(t), are radically different from those of the static counterpart TOYFB
• What are effective and efficient methods for accurately capturing and systematically characterizing the dynamic nature of large-scale real-world OSNs?
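The gap between a full crawl and a temporal snapshot can be illustrated with a few timestamped friendship records. This is only a sketch: the four `records`, the half-open activity interval, and the choice of mean degree as the metric are all assumptions made for illustration, not the process used in the talk.

```python
from fractions import Fraction

STEP = Fraction(1, 16)  # the graph changes at t = 1/16, 2/16, ..., 16/16

# Hypothetical records: (user_a, user_b, established, deactivated).
# deactivated is None while the friendship is still active.
records = [
    (0, 1, 1 * STEP, 4 * STEP),
    (0, 2, 2 * STEP, None),
    (1, 2, 3 * STEP, 6 * STEP),
    (2, 3, 5 * STEP, None),
]

def full_crawl(records):
    """Edges visible in friend lists: every friendship ever established."""
    return {(a, b) for a, b, _start, _end in records}

def snapshot(records, t):
    """Edges active at time t: established by t and not yet deactivated."""
    return {
        (a, b)
        for a, b, start, end in records
        if start <= t and (end is None or t < end)
    }

def mean_degree(edges, num_users):
    """Average number of friends per user for an undirected edge set."""
    return 2 * len(edges) / num_users

crawl_degree = mean_degree(full_crawl(records), 4)
snapshot_degrees = [
    mean_degree(snapshot(records, k * STEP), 4) for k in range(1, 17)
]
```

In this toy data the crawl reports a mean degree that no single snapshot ever reaches, because the crawl accumulates friendships that were never active at the same time.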
Measurement - Static
• Crawling complete snapshots does not scale
  • Limited rate of crawling (e.g. 10 queries/sec in Flickr)
  • Large population of OSNs (millions of users)
• A partial snapshot is likely to be distorted and biased towards high-degree nodes
• Graph sampling is a promising approach for characterizing node properties [Stutzbach2006, Rasti2009]
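One well-known way to remove the bias towards high-degree nodes is a Metropolis-Hastings random walk that targets the uniform distribution over nodes. The sketch below is a generic illustration of that idea and is not claimed to be the exact method of the cited papers; the four-node `adj` graph and all function names are hypothetical.

```python
import random
from fractions import Fraction

# Hypothetical small graph as adjacency sets (node 0 is the high-degree node).
adj = {0: {1, 2, 3}, 1: {0}, 2: {0, 3}, 3: {0, 2}}

def mh_transition(adj, u):
    """Exact transition probabilities out of u for a walk targeting uniform."""
    probs = {}
    for v in adj[u]:
        # Propose v with probability 1/deg(u), accept with min(1, deg(u)/deg(v)).
        probs[v] = min(Fraction(1, len(adj[u])), Fraction(1, len(adj[v])))
    probs[u] = 1 - sum(probs.values())  # rejected proposals stay at u
    return probs

def is_stationary(adj, pi):
    """Check whether distribution pi is left unchanged by one walk step."""
    flow = {v: Fraction(0) for v in adj}
    for u in adj:
        for v, p in mh_transition(adj, u).items():
            flow[v] += pi[u] * p
    return flow == pi

def mh_walk(adj, start, steps, rng):
    """Run the walk and return the sequence of visited (sampled) nodes."""
    node, visited = start, []
    for _ in range(steps):
        candidate = rng.choice(sorted(adj[node]))
        if rng.random() < min(1.0, len(adj[node]) / len(adj[candidate])):
            node = candidate
        visited.append(node)
    return visited

uniform = {v: Fraction(1, len(adj)) for v in adj}
total_degree = sum(len(nbrs) for nbrs in adj.values())
degree_proportional = {v: Fraction(len(adj[v]), total_degree) for v in adj}
samples = mh_walk(adj, start=0, steps=50, rng=random.Random(0))
```

The exact check shows that the uniform distribution is stationary for this walk while the degree-proportional distribution, which is what a plain random walk converges to, is not. The sketch assumes a static graph; it does not address a graph that changes during the walk.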
Measurement - Dynamic
• Goal of sampling: collect a “representative” set of users
• In an unknown graph that changes underneath any measurement tool (e.g. a crawler), what does “representative” mean?
• Even with a solid definition of a “representative” snapshot, how do we develop appropriate sampling techniques to deal with the dynamic nature of OSNs?
• Given all the challenges of measuring OSNs, is there an innovative approach for future analysis?
Analysis (I)
• A new approach based on the following two observations about OSNs
  • Clustering at different spatial scales
  • Faster and noisier temporal dynamics at finer levels of resolution of the graph
• Multi-Resolution Analysis (MRA) for graphs
  • Start from a coarse-scale representation that is typically small in size and has slow dynamics
  • Use the insight gained at this scale to study the graph at the next finer level of resolution
• Diffusion Wavelets (DW) as a principled approach for graph MRA
  • DW provides the necessary mathematical framework for formalizing the above graph-coarsening intuition [Maggioni2004]
Analysis (II)
• TOYFB at scale 2 is smaller in size than TOYFB at scale 1
• TOYFB at scale 2 has slower dynamics than TOYFB at scale 1
[Figure: TOYFB shown at Scale 3, Scale 2, and Scale 1]
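The coarsening intuition (a coarser scale is smaller and changes more slowly) can be illustrated without Diffusion Wavelets. The sketch below is a deliberately simple stand-in that contracts a given, hand-picked clustering; it is not an implementation of DW, and the snapshots, cluster labels, and function names are all hypothetical.

```python
def coarsen(edges, cluster_of):
    """Contract each cluster to one node; keep only inter-cluster edges."""
    coarse = set()
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu != cv:
            coarse.add((min(cu, cv), max(cu, cv)))
    return coarse

def changes(snapshots):
    """Number of edges added or removed between consecutive snapshots."""
    return [len(a ^ b) for a, b in zip(snapshots, snapshots[1:])]

# Assumed clustering of six users into two groups (given, not computed).
cluster_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

# Hypothetical fine-scale snapshots at three consecutive time points.
fine_snapshots = [
    {(0, 1), (1, 2), (2, 3)},
    {(0, 1), (0, 2), (2, 3)},
    {(0, 2), (1, 2), (2, 3), (3, 4)},
]
coarse_snapshots = [coarsen(s, cluster_of) for s in fine_snapshots]

fine_changes = changes(fine_snapshots)
coarse_changes = changes(coarse_snapshots)
```

In this toy data the fine-scale graph changes at every step, while the coarse graph stays a single A-B link throughout. A principled method such as DW would derive the scales from the graph itself rather than from a hand-picked clustering.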
Analysis (III)
[Figure sequence along a time axis, one slide per time point t = 1/16, 5/16, 9/16, and 13/16]
Conclusion
• Stop ignoring the dynamic nature of OSNs
• For a better understanding of OSNs
  • Abandon current traditional measurement, modeling, analysis, and validation approaches
  • Replace them with new techniques that can account for most of the dynamic features of real-world OSNs
• New methodologies are required for advancing
  • OSN measurement: extend and further develop current graph sampling techniques to account for churn in OSNs
  • OSN analysis: apply Multi-Resolution Analysis (MRA) methodology to large-scale dynamic graph structures
Thank you!