Social and Information Networks Theory and Practice

Anirban Dasgupta, Isabelle Stanton

Topics • The Structure of Networks • Small world networks • Generative models • The Long Tail • Community Detection • Cascades and Viral Processes • Computation on Large Graphs • Sampling and Surveying • Crowdsourcing • ….?

Coursework • 1 (group) Project • 2 Reaction Papers • 2-3 Experimental Assignments • Scribing

Office Hours • Isabelle – Soda 645 Time TBD • Anirban – By appointment http://cs294socialnetworks.org

Available Data Sets • Yahoo! Webscope data • Will be available in a few weeks • Social Network Crawls • LiveJournal, Twitter, Orkut, Flickr, YouTube, Facebook • SNAP archive • Citation Networks • HEP-th, dblp (over time), theory… • Physical Systems • Power grid, autonomous systems… • Web graphs • Notre Dame, Berkeley/Stanford, Wikipedia…

Complex Systems Around Us • We are surrounded by complex systems • Society is the interaction of 7 billion individuals • Communication systems (e.g. the Internet) are formed by linking devices • Our cells function by the interaction of proteins • Thoughts in our brain are formed by interactions of neurons • What are some common properties of these systems? How can we study them?

Why study Networks? • Behind each of the complex systems, there is an underlying wiring diagram: the network • We will never understand the complex system without understanding the network behind it • Nodes: elements • Links: interactions • System: Graph/network
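The nodes-and-links view maps directly onto an adjacency structure. A minimal sketch, assuming undirected links; the helper name `build_graph` and the sample friendships are illustrative only.

```python
from collections import defaultdict

def build_graph(links):
    """Build an undirected graph: each node maps to the set of its neighbors."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    return adj

# Nodes are people, links are friendships (illustrative data).
friendships = [("Harry", "Ron"), ("Harry", "Hermione"), ("Ron", "Hermione")]
graph = build_graph(friendships)
print(sorted(graph["Harry"]))  # ['Hermione', 'Ron']
```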

Network: Online Social Networks Nodes: members Links: “friend”

Network: Internet Nodes: routers Links: connections

Network: US power grid Nodes: power stations Links: power lines

Network: Economy Nodes: Companies (Investment, Pharma, Research Labs, Public) Links: Collaborations (Financial, R&D) http://ecclectic.ss.uci.edu/~drwhite/Movie

Network: Human Disease Nodes: Disease class Links: share gene

Network: Yeast Proteins Nodes: Proteins Links: chemical interaction

Network: Brain The human brain has between 10 and 100 billion neurons. Nodes: neurons Links: connections

Why Study Networks?

Networks: Predicting the H1N1 outbreak

Satellite map of US

Result of US power grid outage

Network: US power grid Nodes: power stations Links: power lines

Without studying networks, we cannot … • stop cascading outages in power grids • forecast how disease spreads in a society • design search engines like Google • understand how the interaction of genes creates life • …

What do we study in networks? • Structure and evolution • What does a network look like? • How did it come to be like that? • Process and dynamics • Networks provide skeletons for information, for disease spreading, and for other dynamic processes

How would we study a network? • Empirical: Study network data to find out a particular principle • Data analysis, experiments, sociology surveys, … • Analyze: Is this principle surprising? How universal is this principle? • Statistics, probability, domain knowledge,… • Hypothesize: Build models that would explain the observed principle • Algorithms, graph theory, statistics, probability, domain knowledge…

Why now? • Data availability • Storage and computation are only getting cheaper • Massive amounts of data about human interaction • Universality • Networks arising from different fields of science and technology have surprisingly common properties • Shared Vocabulary • Statisticians, Cognitive Scientists, Physicists, Biologists, Computer Scientists, …

The story of “six degrees of separation” or the “small world phenomenon”

Before there was the Internet • There were still social networks • How can we measure anything about them? • What do social networks look like? • How connected are we?

Milgram’s Experiment (1967) • Stanley Milgram wanted to know about the global friendship network • If information is spreading through friends, how soon will it reach one particular person? • The entire friendship network cannot really be obtained, so he designed an experiment to estimate this quantity

Milgram’s Experiment (1967) • 300 people in the Midwest (NE) were each given a letter • Target: a stockbroker in Boston (MA) • Can only forward the letter to someone you know! • Goal: Reach the target

Milgram’s Experiment: Results • 300 people in the Midwest were each given a letter • Target: a stockbroker in Boston • Can only forward to someone you know! • Total no. of completed chains: 64 • Average number of steps: 6.5 • “six degrees of separation”

Six degrees of separation • For almost all random pairs among 6 billion individuals, there is a path with at most 6 steps

Experimental Problems • Selection bias • Starting points weren’t random but people who responded to an ad for ‘well-connected people’ • Highly disconnected groups aren’t sampled • Dropped chains • 232 of 296 never reached the target • 136 of 160 never reached the target • 16 of the 24 went through the same last hop

Was this a fluke? • Replicated by researchers using emails, Facebook • Similar property (short paths between pairs of nodes) also seen in other networks • protein-protein network, gene network • economic networks • language networks…

Six degrees of separation: Is this surprising? • The average number of steps in a chain was 6 • Why should there be 6 steps? • Hint: Suppose everyone has 100 friends, then? • But your friends are friends among themselves! (e.g. Hermione, Harry, Ron)

Small World Networks • High ‘clustering’: friends of my friends are likely to be my friends • Small diameter: I have ~100 friends, who each have ~100 friends, and so on. So I can reach everyone in s steps, where 100^s = n, i.e. s = log_100(n) = O(log n) • Alone, these two properties aren’t very surprising. Together, they are.
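A quick numeric check of the branching argument, as a sketch under the idealized assumption that everyone has exactly 100 friends and that friend circles never overlap (exactly what clustering contradicts). The function name `steps_to_reach` and the population of 6 billion are used for illustration.

```python
import math

def steps_to_reach(n, friends=100):
    """Smallest s with friends**s >= n, assuming friend circles never overlap."""
    s = 0
    reached = 1
    while reached < n:
        reached *= friends
        s += 1
    return s

population = 6_000_000_000
# Fractional solution of 100**s = n, i.e. s = log(n) / log(100).
print(math.log(population) / math.log(100))
# Whole number of steps needed to cover the population.
print(steps_to_reach(population))
```

Under this assumption about five steps suffice, so short paths by themselves are unsurprising; the puzzle is that they coexist with heavy overlap among friend circles.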

Six degrees of separation • People do have a moderately large (~100-1000) set of friends • But these friends typically occur in clusters • Everyone in a school, workplace, town… • In the presence of these properties, six degrees of separation is not obvious • Surprisingly, people can actually find the short paths…

The Small World concept, in simple terms, describes the fact that, despite their often large size, most networks have a relatively short path between any two nodes.

Why do we want to understand this?

Why Study the Small World property? • Purely scientific: • Why is there something this universal? • Many very concrete applications: • Designing peer-to-peer systems (Napster, Gnutella), building computer networks • How to spread information with a limited budget, say about an upcoming movie • How to stop the spread of viral infections?

How can we explain this? • What if we could hypothesize how networks are formed? • Basic intuition: models have to contain an element of structured relations as well as random elements • For example, in social networks • structured friendships: college classmates • different interests: people have different groups of friends • random friendships: met on a train ride • Still an ongoing area of research…

The Structure of Social Networks • Small diameter • Strongly connected (many short paths) • There exist highly connected people • High clustering coefficient • There are ‘short range’ and ‘long range’ edges • Local routing algorithms are successful What other types of networks have this property?

Erdős–Rényi Graphs • Classic random graph model • G(n, p): for n nodes, add every possible edge independently with probability p
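A minimal sketch of the G(n, p) construction using only the standard library. The function name `erdos_renyi` and the `seed` argument are illustrative; n is the number of nodes and p the edge probability.

```python
import itertools
import random

def erdos_renyi(n, p, seed=None):
    """Sample G(n, p): each of the n*(n-1)/2 possible edges appears with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

g = erdos_renyi(200, 0.05, seed=1)
avg_degree = sum(len(nbrs) for nbrs in g.values()) / len(g)
print(avg_degree)  # should be close to the expected degree (n - 1) * p = 9.95
```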

Erdős–Rényi Properties • Not connected (whp) unless p is above roughly ln(n)/n • No real clustering • Every vertex has the same expected degree • Doesn’t really have any underlying structure • Not a good model of a social network
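The degree and clustering claims can be checked with a short calculation, assuming the standard G(n, p) definition in which each possible edge is present independently with probability p (here u ~ v means that u and v are linked):

```latex
\begin{align*}
\mathbb{E}[\deg(v)] &= \sum_{u \neq v} \Pr[u \sim v] = (n-1)\,p
  && \text{(the same for every vertex } v\text{)} \\
\Pr[u \sim w \mid u \sim v,\ w \sim v] &= \Pr[u \sim w] = p
  && \text{(edges are independent)}
\end{align*}
```

With a fixed average degree c, p = c/(n-1), so the chance that two friends of v are themselves friends shrinks like c/n as the network grows, unlike in real social networks.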

Watts-Strogatz Model • Parameters: n, k, p • Construct a ring with n vertices. Connect each to its k nearest neighbors. • Rewire each edge with probability p
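A minimal sketch of the construction, assuming the usual parameter names n (vertices), k (an even number of nearest neighbors, with k < n) and p (rewiring probability). The function name `watts_strogatz` and the choice to rewire only the far endpoint of each edge are assumptions of this sketch.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring of n vertices, each linked to its k nearest neighbors, then rewired."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Step 1: ring lattice, k/2 neighbors on each side.
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            w = (v + offset) % n
            adj[v].add(w)
            adj[w].add(v)
    # Step 2: visit every lattice edge (v, w) once; with probability p,
    # replace it by an edge from v to a random vertex it is not yet linked to.
    for offset in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + offset) % n
            if rng.random() < p:
                candidates = [x for x in range(n) if x != v and x not in adj[v]]
                if candidates:
                    new_w = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new_w)
                    adj[new_w].add(v)
    return adj

g = watts_strogatz(20, 4, 0.1, seed=42)
print(sum(len(nbrs) for nbrs in g.values()) // 2)  # rewiring keeps n * k / 2 = 40 edges
```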

Watts-Strogatz Properties (n vertices, k neighbors each, rewiring probability p) • Has local and long-range edges • Path lengths: approach ln(n)/ln(k) as p → 1 • Clustering coefficient: starts at ¾, decreases to about k/n as p → 1 • Degree distribution: same as G(n,p) • Key feature of the model: rewiring allows ‘weak ties’
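The two quantities in question can be measured directly. The sketch below computes them on the unrewired ring lattice, where the clustering coefficient is 3(k-2)/(4(k-1)), which approaches ¾ for large k. The function names are illustrative, and the same two measurement functions apply to any adjacency-set graph.

```python
from collections import deque
from itertools import combinations

def ring_lattice(n, k):
    """Ring of n vertices, each linked to its k nearest neighbors (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            w = (v + offset) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def clustering_coefficient(adj):
    """Average over vertices of the fraction of neighbor pairs that are linked."""
    total = 0.0
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d < 2:
            continue  # vertices with fewer than two neighbors contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (d * (d - 1) / 2)
    return total / len(adj)

def average_path_length(adj):
    """Mean BFS distance over ordered pairs of distinct, reachable vertices."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

lattice = ring_lattice(100, 4)
print(clustering_coefficient(lattice))  # 0.5 for k = 4
print(average_path_length(lattice))     # grows linearly with n on the pure lattice
```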

What can we say about when short paths can be found with local information?

Kleinberg’s Small World Networks • How does the network structure affect being able to locally find short paths? • Start with an n × n grid. • Add a long-range edge (u, v) with probability proportional to d(u, v)^(-r), where d is the grid distance • As r changes, what happens?
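A minimal sketch of how a long-range contact could be drawn, assuming an n × n grid, lattice (Manhattan) distance d, and link probability proportional to d(u, v)^(-r) for an exponent r. The function names are illustrative.

```python
import random

def lattice_distance(a, b):
    """Number of grid steps between two (row, column) positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def contact_distribution(u, n, r):
    """Probability of each node v != u being u's long-range contact."""
    weights = {
        (i, j): lattice_distance(u, (i, j)) ** -r
        for i in range(n) for j in range(n) if (i, j) != u
    }
    z = sum(weights.values())  # normalizing constant
    return {v: w / z for v, w in weights.items()}

def sample_contact(u, n, r, rng=random):
    """Draw one long-range contact for u according to that distribution."""
    dist = contact_distribution(u, n, r)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

print(sample_contact((2, 2), 5, 2, random.Random(0)))
```

With r = 0 every other node is equally likely; as r grows, contacts concentrate near u.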

Decentralized Routing • A source s is given a message to send to a target t • s knows where t is on the grid • Try to get the message to t as fast as possible • Each node can only see its own links • Without the random edges, any message on an n × n grid can be routed in O(n) time.
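A sketch of one natural decentralized rule, greedy forwarding: each holder passes the message to whichever of its own contacts is closest to the target on the grid. The names `greedy_route` and `long_range` are hypothetical, and the grid is assumed to be n × n with lattice distance.

```python
def lattice_distance(a, b):
    """Number of grid steps between two (row, column) positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def grid_neighbors(u, n):
    """The up-to-four grid neighbors of u on an n x n grid."""
    i, j = u
    steps = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for a, b in steps if 0 <= a < n and 0 <= b < n]

def greedy_route(s, t, n, long_range=None):
    """Forward to the known contact closest to t; return the path taken."""
    long_range = long_range or {}
    path = [s]
    current = s
    while current != t:
        # Only the current holder's own links are visible.
        contacts = grid_neighbors(current, n) + list(long_range.get(current, []))
        current = min(contacts, key=lambda v: lattice_distance(v, t))
        path.append(current)
    return path

# Grid edges only: the number of steps equals the lattice distance, at most 2(n - 1).
print(len(greedy_route((0, 0), (4, 4), 5)) - 1)  # 8
# One lucky long-range link shortens the route.
print(len(greedy_route((0, 0), (4, 4), 5, {(0, 0): [(4, 3)]})) - 1)  # 2
```

Some grid neighbor is always strictly closer to the target, so the loop terminates.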

Kleinberg’s Results • When all long-range links are equally likely (r = 0): • Short paths exist (whp) • They can’t be found with local information • Thm: When r = 0, the expected delivery time of any decentralized algorithm is at least on the order of n^(2/3)

Kleinberg’s Lower Bound

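Algorithmically • When r = 2 (long-range link probability proportional to d^(-2)), decentralized routing delivers messages in an expected O(log^2 n) steps • All other values of r require time polynomial in n • Why?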
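A sketch of why the exponent 2 is special on a two-dimensional grid, following the standard argument for the upper bound. It assumes an n × n grid, lattice distance d, and one long-range link per node u chosen with probability proportional to d(u, v)^(-2); the constants are loose.

```latex
\begin{align*}
% at most 4j nodes lie at lattice distance exactly j from u
\sum_{v \neq u} d(u,v)^{-2}
  &\le \sum_{j=1}^{2n-2} 4j \cdot j^{-2}
   = 4 \sum_{j=1}^{2n-2} \frac{1}{j}
   \le 4 \ln(6n) \\
% the current holder u is at distance in (2^j, 2^{j+1}] from the target t;
% B_j is the set of nodes within distance 2^j of t, with |B_j| > 2^{2j-1},
% and every node of B_j is within distance 2^{j+1} + 2^j < 2^{j+2} of u
\Pr[\text{long-range link of } u \text{ lands in } B_j]
  &\ge \frac{2^{2j-1}}{4 \ln(6n) \cdot 2^{2j+4}}
   = \frac{1}{128 \ln(6n)}
\end{align*}
```

The bound does not depend on the scale j, so each halving of the distance to the target takes O(log n) expected steps. There are O(log n) halvings, which gives O(log^2 n) steps in total.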

Geometry of the Network (figure: nodes u and v, with long-range links x) • The expected length of a long-range link x is based on r
