Inferring Peer Centrality in Socially-Informed P2P Systems

11th IEEE International Conference on Peer-to-Peer Computing, Kyoto, Japan, 2011 • Inferring Peer Centrality in Socially-Informed P2P Systems • Nicolas Kourtellis, Adriana Iamnitchi • Department of Computer Science & Engineering, University of South Florida, Tampa, USA

Socially-aware Applications • Applications collect and use social information: • Location, collocation, history of interactions, etc. • Build (implicit/explicit) social network of users • Use: reduce spam, provide recommendations, etc. • Wide range of system architectures, moving toward decentralization of user social data: • Company servers • P2P networks: PeerSoN, LifeSocial.KOM, Safebook, Prometheus, … • Mobile devices: MobiClique, Yarta, ... • How does the social network of users affect the load in a P2P architecture?

Social Graphs & P2P Networks • Users connected with application-specific edges • User-contributed peers form a P2P network • User social graph is partitioned into subgraphs & stored on peers Questions: • How do applications traverse a distributed social graph? • What does it mean for the P2P routing?

Application Example • Invite user G's 2-hop hiking contacts to a trip => 1-hop={B, C, E}, 2-hops={A, D, F, I} • Social graph traversals => many P2P lookups • Application performance affected by projection of social graph on peers
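The traversal in this example can be sketched as a breadth-first search over a social graph whose adjacency lists are stored on different peers. This is a minimal illustration, not the authors' implementation: the edge list and the user-to-peer placement below are hypothetical, chosen only so that G's 1-hop and 2-hop sets match the slide, and the cost model (one P2P lookup per remote user whose contact list is read) is an assumption.

```python
# Hypothetical social edges, picked so that G's hop sets match the slide.
SOCIAL_EDGES = [
    ("G", "B"), ("G", "C"), ("G", "E"),
    ("B", "A"), ("C", "D"), ("E", "F"), ("E", "I"),
    ("A", "H"),
]
# Hypothetical placement of users on peers.
USER_TO_PEER = {
    "A": "P1", "B": "P1",
    "C": "P2", "D": "P2", "G": "P2",
    "E": "P3", "F": "P3",
    "H": "P4", "I": "P4",
}


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_hop_contacts(adj, user_to_peer, source, k):
    """Return ({hop: users first reached at that hop}, remote lookup count)."""
    home_peer = user_to_peer[source]
    seen = {source}
    frontier = [source]
    hops = {}
    lookups = 0
    for hop in range(1, k + 1):
        next_frontier = []
        for user in frontier:
            # Assumed cost model: reading a contact list stored on another
            # peer costs one P2P lookup.
            if user_to_peer[user] != home_peer:
                lookups += 1
            for contact in sorted(adj[user]):
                if contact not in seen:
                    seen.add(contact)
                    next_frontier.append(contact)
        hops[hop] = set(next_frontier)
        frontier = next_frontier
    return hops, lookups


adjacency = build_adjacency(SOCIAL_EDGES)
hops, lookups = k_hop_contacts(adjacency, USER_TO_PEER, "G", 2)
print(hops[1], hops[2], lookups)
```

Under this placement, only B and E live on peers other than G's, so the 2-hop query needs two remote lookups; a different projection of the same social graph would change that count.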

Projection Graph • How do the properties of the projection graph compare with the properties of the social graph projected? [Figure layers: Social Graph (SG), Projection Graph (PG), P2P Overlay]

Projection Graph Model • Uses: • Study properties of peers such as centrality • Study how the social graph topology affects P2P routing & system performance
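One way to make the projection concrete is to collapse each social edge onto the peers that store its two endpoints. The sketch below assumes, since the slides do not spell it out, that two peers are connected in the projection graph when at least one social edge links their users, and that the edge weight counts those social edges. The function name and the sample data are illustrative only.

```python
from collections import defaultdict


def projection_graph(social_edges, user_to_peer):
    """Map social edges to weighted peer-to-peer edges.

    Assumption: weight = number of social edges between users stored on
    the two peers; edges between users on the same peer are dropped.
    """
    weights = defaultdict(int)
    for u, v in social_edges:
        pu, pv = user_to_peer[u], user_to_peer[v]
        if pu != pv:
            weights[tuple(sorted((pu, pv)))] += 1
    return dict(weights)


# Hypothetical example: four users stored on two peers.
sample_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("b", "d")]
sample_placement = {"a": "P1", "b": "P1", "c": "P2", "d": "P2"}
pg = projection_graph(sample_edges, sample_placement)
print(pg)
```

Here the edges a-b and c-d stay inside a peer and vanish from the projection, while the three cross-peer edges collapse into a single P1-P2 edge of weight 3.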

Outline • Motivation • Projection Graph Model • Social Network Centrality Metrics • Degree Centrality • Node Betweenness Centrality • Edge Betweenness Centrality • Centrality Calculation: Limitations • Experimental Questions • Experimental Methodology • Experimental Results • Impacts on Applications & Systems

Degree Centrality • Number of edges of a node • High degree centrality peers: Network Hubs • Can be targeted to directly influence many other peers with a message broadcast or distribute a search query
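As a small illustration of the definition above, degree centrality can be computed by counting the edges incident to each node and ranking the result to find hubs. The peer names and edges are hypothetical, and the sketch uses unweighted degree; a weighted variant would sum edge weights instead.

```python
def degree_centrality(edges):
    """Count the number of edges incident to each node."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def top_hubs(degree, n):
    """Return the n highest-degree nodes, ties broken by name."""
    return sorted(degree, key=lambda node: (-degree[node], node))[:n]


# Hypothetical projection-graph edges between peers.
peer_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
peer_degree = degree_centrality(peer_edges)
print(peer_degree, top_hubs(peer_degree, 1))
```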

Node Betweenness Centrality • Measures the extent to which a node lies on the shortest paths between other pairs of nodes • High betweenness centrality peers: Control communication between distant peers • Can host data caches for reduced latency to locate data
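Node betweenness can be computed exactly on small graphs with Brandes' algorithm, sketched below for unweighted, undirected graphs. The slides do not say which algorithm or normalization the authors used, so this is only an assumed, unnormalized variant; the path graph is a made-up example.

```python
from collections import deque


def node_betweenness(adj):
    """Unnormalized node betweenness (Brandes) for an undirected graph.

    adj maps every node to an iterable of its neighbors.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []                    # nodes in non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: score / 2 for v, score in bc.items()}


# Hypothetical path graph a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
path_bc = node_betweenness(path)
print(path_bc)
```

On the path, b and c each sit on two shortest paths between other nodes, while the endpoints sit on none.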

Edge Betweenness Centrality • Measures the extent to which an edge lies on the shortest path between two nodes • High betweenness centrality edges: Connect distant parts of P2P network • Can be monitored to block malware traffic
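The same shortest-path accumulation can be credited to edges instead of nodes. The sketch below is an assumed, unnormalized variant of Brandes' algorithm for unweighted, undirected graphs; the two-triangle graph is a hypothetical example showing how a bridge between distant parts of a network receives the highest score.

```python
from collections import deque


def edge_betweenness(adj):
    """Unnormalized edge betweenness for an undirected, unweighted graph."""
    eb = {}
    for s in adj:
        sigma = {s: 1}
        dist = {s: 0}
        preds = {s: []}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                # Share of s-to-w (and beyond) shortest paths using edge (v, w).
                share = sigma[v] / sigma[w] * (1 + delta[w])
                edge = tuple(sorted((v, w)))
                eb[edge] = eb.get(edge, 0.0) + share
                delta[v] += share
    # Each unordered pair was counted from both endpoints.
    return {edge: score / 2 for edge, score in eb.items()}


# Hypothetical graph: triangles {a, b, c} and {d, e, f} joined by bridge c - d.
barbell = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
barbell_eb = edge_betweenness(barbell)
print(max(barbell_eb, key=barbell_eb.get), barbell_eb[("c", "d")])
```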

Calculating Peer Centrality • Challenging because of: • Limited access to user data (e.g., privacy settings) • P2P network scale • Peer churn • Through experimental analysis of the social and projection graphs, we investigate how to circumvent these limitations

Experimental Questions • Can we approximate the centrality of peers using the centrality scores of their users? • How does the number of users storing data per peer affect the centrality scores of their peers? • Social graph is less dynamic than the P2P network • Calculate users' centrality scores infrequently & use them to estimate their peers' centrality Spoiler Alert! • [1, ~150] users/peer: Can estimate degree & betweenness centrality of peers with good accuracy • Above 150 users/peer: The projection graph becomes highly connected => peers are no longer differentiated by centrality

Experimental Methodology • Naturally-formed communities offer incentives for resource sharing => 1 community subgraph mapped per peer • Projection graphs generated from 5 real social graphs • Communities detected via recursive Louvain algorithm* • Varied average community size: 5, 10, 20, …, 1000 users/peer • Calculate correlation of centralities of users and their peers • Compare average centralities of users and their peers • Identify top centrality peers from their users' scores *V. D. Blondel et al., "Fast unfolding of communities in large networks", Journal of Statistical Mechanics: Theory and Experiment, vol. 10, 2008.

Correlation of Centrality Scores • [1, ~150] users/peer: • Projection graph closely resembles the social graph • Highest correlation of social & projection graph metrics • Degree & node betweenness estimated from local information (cumulative scores) • Above 150 users/peer: • Projection graph topology loses social properties • Highly connected network • Peers participate equally in graph traversal [Plots: Users/Peer vs. Node Betweenness, Users/Peer vs. Edge Betweenness, Users/Peer vs. Degree]
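The "cumulative scores" idea can be sketched as summing each peer's users' centrality scores and correlating those sums with the peers' own centrality in the projection graph. The slides do not state which correlation coefficient was used; Pearson's is assumed here, and all scores below are made-up numbers for illustration only.

```python
import math


def cumulative_scores(user_scores, user_to_peer):
    """Sum the centrality scores of the users stored on each peer."""
    totals = {}
    for user, score in user_scores.items():
        peer = user_to_peer[user]
        totals[peer] = totals.get(peer, 0) + score
    return totals


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical user degrees, placement, and measured peer degrees.
user_degree = {"a": 2, "b": 3, "c": 1, "d": 4, "e": 1}
placement = {"a": "P1", "b": "P1", "c": "P2", "d": "P3", "e": "P3"}
peer_degree_measured = {"P1": 4, "P2": 1, "P3": 3}

estimate = cumulative_scores(user_degree, placement)
peers = sorted(estimate)
r = pearson([estimate[p] for p in peers], [peer_degree_measured[p] for p in peers])
print(estimate, round(r, 3))
```

A coefficient close to 1 would indicate that the locally computed sums track the peers' actual centrality, which is the condition the slide reports for up to roughly 150 users per peer.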

Comparison of Centrality Scores • Increase number of users/peer => turning point in projection graph • More connections with other peers => increase peer degree & betweenness to maximum • More social edges within peers => decrease edge betweenness to minimum [Plots: Users/Peer vs. Degree, Users/Peer vs. Node Betweenness, Users/Peer vs. Edge Betweenness]

Finding High Betweenness Peers • Placing data caches on high betweenness peers can reduce latency to locate data • Can we identify such peers, knowing the top betweenness users or communities? • Top 5% betweenness centrality users => top betweenness centrality peers with 80–90% accuracy [Plots: accuracy vs. Users/Peer, with Top-N% users and with Top-N% communities]
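The identification step can be sketched as follows: take the top-N% users by betweenness, mark the peers that store them as predicted top peers, and compare that set with the peers that actually rank highest. The slides do not define "accuracy"; the sketch assumes it is the fraction of true top peers that were predicted. All names and scores are hypothetical.

```python
import math


def top_fraction(scores, fraction):
    """Return the keys in the top `fraction` of scores (at least one key)."""
    count = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=lambda key: (-scores[key], key))
    return set(ranked[:count])


def peers_of_top_users(user_scores, user_to_peer, fraction):
    """Peers storing at least one of the top-`fraction` users."""
    return {user_to_peer[u] for u in top_fraction(user_scores, fraction)}


def overlap_accuracy(predicted_peers, actual_top_peers):
    """Assumed metric: share of the actual top peers that were predicted."""
    if not actual_top_peers:
        return 0.0
    return len(predicted_peers & actual_top_peers) / len(actual_top_peers)


# Hypothetical betweenness scores for users and peers.
user_bc = {"u1": 9.0, "u2": 7.0, "u3": 1.0, "u4": 0.5, "u5": 0.2}
user_peer = {"u1": "P1", "u2": "P2", "u3": "P2", "u4": "P3", "u5": "P3"}
peer_bc = {"P1": 5.0, "P2": 4.0, "P3": 1.0}

predicted = peers_of_top_users(user_bc, user_peer, 0.4)
actual = top_fraction(peer_bc, 0.5)
print(predicted, actual, overlap_accuracy(predicted, actual))
```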

Summary of Findings • [1, ~150] users/peer: • Projection graph closely resembles the social graph • Highest correlation of social & projection graph metrics • Degree & node betweenness can be estimated from local information (cumulative scores of users) • Edge betweenness cannot be estimated well • Above 150 users/peer: • Projection graph topology loses social properties • A highly connected projection graph • No differentiation in peer centrality • Top betweenness centrality users can pinpoint the top betweenness centrality peers with good accuracy • Overall: Applications can calculate users' centrality scores infrequently to estimate peer centrality • Social graph changes slowly compared to P2P network

Impact on Applications & Systems • Target high degree peers to: • Decrease search time • Increase breadth of search and diversity of results • Target high betweenness peers to: • Monitor information flow and collect traces • Place data caches and indexes of data location • Quarantine malware outbursts • Disseminate software patches • Tackle P2P churn • Predict centrality of peers to allocate resources • Reduce overlay overhead • Enhance routing tables with P2P edges for faster & more secure peer discovery

Thank you! • This work was supported by NSF Grants CNS 0952420 and CNS 0831785 • http://www.cse.usf.edu/dsg/ • nkourtel@mail.usf.edu
