Language networks
SI/EECS 767, Yang Liu, January 29, 2010
Introduction
Human language can be described as a complex network (Solé et al., 2005).
Incentives
• Analyzing statistical properties
• Building models to explain the patterns
• Studying the origins and evolution of human language
• Statistical approaches to natural language processing
Categorization
• Words as vertices
  • Co-occurrence networks (Dorogovtsev & Mendes, 2001; Masucci & Rodgers, 2008)
  • Semantic networks (Steyvers & Tenenbaum, 2005)
  • Syntactic networks (Ferrer i Cancho et al., 2004)
• Sentences as vertices (Erkan & Radev, 2004)
• Documents as vertices (Menczer, 2004)
Language as an evolving word web (Dorogovtsev & Mendes, 2001)
Introduction
• Proposes a theory of how language evolves.
• Treats human language as a complex network of distinct words, where words are connected to their nearest neighbors (a co-occurrence network).
• In the word webs of Ferrer and Solé (2001, 2002), the degree distribution consists of two power-law parts with different exponents.
The model
• Preferential attachment yields a power-law degree distribution, but in its basic form the average degree does not change.
• In the word web, the total number of connections increases more rapidly than the number of vertices, so the average degree grows.
The model
At each time step:
• a new vertex (word) is added; t, the total number of vertices, plays the role of time;
• the new vertex is connected to some old vertex i with probability proportional to its degree ki;
• ct new edges emerge between old words (c is a constant coefficient); each of these edges connects vertices i and j with probability p ∝ ki kj.
A minimal simulation sketch follows below.
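The following sketch simulates this growth rule in Python (networkx assumed; the p ∝ ki kj step is approximated by drawing two independent degree-proportional endpoints, and the parameter values are illustrative, not fitted to the word-web data):

```python
import random
import networkx as nx

def grow_word_web(T, c=0.01, seed=0):
    """Minimal sketch of the word-web growth rule: each step adds one
    word attached preferentially, plus about c*t new edges between old
    words, each end chosen with probability proportional to degree."""
    rng = random.Random(seed)
    G = nx.Graph([(0, 1)])  # small seed so degrees are nonzero
    ends = [0, 1]           # every edge contributes both endpoints, so a
                            # uniform draw from `ends` is degree-proportional
    for t in range(2, T):
        old = rng.choice(ends)
        G.add_edge(t, old)                 # new word joins by PA
        ends += [t, old]
        for _ in range(round(c * t)):      # ~c*t internal edges per step
            i, j = rng.choice(ends), rng.choice(ends)
            if i != j and not G.has_edge(i, j):
                G.add_edge(i, j)
                ends += [i, j]
    return G

G = grow_word_web(3000, c=0.05)
print(G.number_of_nodes(), G.number_of_edges())
```

Keeping the degree-weighted endpoint list avoids recomputing attachment weights, so each preferential draw is constant time.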
Data
• Two word webs built by Ferrer and Solé (2001, 2002) from about ¾ of a million words of the British National Corpus.
• About 470,000 vertices; average degree = 72.
Solving the model
• Continuum approximation: k(s,t) is the average degree at time t of the vertices born at time s.
• For the empirical word webs, ct ≈ 70 ≫ 1.
The governing rate equation is sketched below.
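Written out, the continuum approximation leads to a rate equation of the following form (a reconstruction from the model's rules, matching Dorogovtsev & Mendes, 2001, up to notation): each time step distributes 1 + 2ct preferentially placed edge ends among the old vertices, and the total degree equals twice the number of edges, t + ct²/2:

```latex
\frac{\partial k(s,t)}{\partial t}
  = \left(1 + 2ct\right)\frac{k(s,t)}{\int_0^t k(u,t)\,du},
\qquad
\int_0^t k(u,t)\,du = 2t + ct^2 .
```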
Solving the model
The degree distribution has two regions separated by a crossover point kcross.
Solving the model
• Below this point: a stationary degree distribution.
• Above this point: a non-stationary degree distribution.
• In the paper's figure, empty and filled circles show the degree distributions of the two word webs of Ferrer and Solé (2001, 2002).
The two limiting power laws are sketched below.
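The two regions correspond to two power laws; the exponents below are those derived in Dorogovtsev & Mendes (2001), and they roughly match the empirical slopes of about -1.5 and -2.7 in the word-web data:

```latex
P(k) \propto
\begin{cases}
  k^{-3/2}, & k \ll k_{\mathrm{cross}} \quad \text{(stationary region)}, \\
  k^{-3},   & k \gg k_{\mathrm{cross}} \quad \text{(non-stationary region)}.
\end{cases}
```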
Discussion
• The model is concerned only with the degree distribution; its clustering coefficients do not match the data.
• The total number of words of degree greater than kcross does not change: the size of the kernel lexicon does not depend on the total number of distinct words in the language.
Network properties of written human language (Masucci & Rodgers, 2008)
Topology of the network
Words (including punctuation marks) are vertices, and two vertices are linked if they are neighbors in the text. The network is directed.
Network statistics
• 8,992 vertices; 117,687 edges; mean degree <k> = 13.1.
• Degree distribution P(k) ∝ k^-1.9.
• Zipf's law with slope -1.2.
A construction sketch follows below.
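A sketch of the construction in Python (toy tokens; whether repeated word pairs add weight or collapse into a single arc is an assumption here, and they are collapsed):

```python
import networkx as nx

def cooccurrence_network(tokens):
    """Directed co-occurrence network in the style of Masucci & Rodgers
    (2008): every token (words and punctuation marks) is a vertex, and
    an arc runs from each token to the token immediately following it."""
    G = nx.DiGraph()
    for a, b in zip(tokens, tokens[1:]):
        G.add_edge(a, b)
    return G

tokens = "the dog saw the cat . the cat saw the dog .".split()
G = cooccurrence_network(tokens)
k_mean = sum(d for _, d in G.degree()) / G.number_of_nodes()  # in+out degree
print(G.number_of_nodes(), G.number_of_edges(), k_mean)
```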
Growth properties
The number of edges between words grows faster than the number of vertices: N(t) ∝ t^1.8.
Nearest neighbors' properties
The mean clustering coefficient <c> = 0.19.
Repeated binary structures of words are reproduced by local PA (preferential attachment).
The models (D-M model)
• Start with a chain of 20 connected vertices.
• At each time step, add a new vertex and connect it to some vertex i with p ∝ ki.
• m(t) - 1 new edges emerge between old words with p ∝ ki kj.
D-M model
<c(k)> = 0.16. Catches the average clustering and the global growth behavior, but misses the internal structure.
Model 2
• Includes local PA, with p(t) ≈ 0.1 t^0.16.
• Start with a chain of 20 connected vertices.
• At each time step, add a new vertex and connect it to some vertex i (not among its nearest neighbors) with p ∝ ki.
• m(t) - 1 times: with probability p(t), link the last vertex to an old vertex i in its nearest neighborhood through local PA (p ∝ ki); with probability 1 - p(t), link to an old vertex i outside its nearest neighborhood with global PA.
A simulation sketch follows below.
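A sketch of this rule in Python. Two points are assumptions not fixed by the slide: m is held constant rather than the paper's m(t), and the m - 1 extra edges are read as extending a walk from the most recently linked vertex:

```python
import random
import networkx as nx

def pa_pick(G, candidates, rng):
    """Degree-proportional (preferential attachment) choice."""
    return rng.choices(candidates, weights=[G.degree(v) for v in candidates], k=1)[0]

def grow_model2(T, m=3, seed=0):
    """Sketch of a Model-2-style rule: global PA for each new vertex,
    then m-1 links that are local PA (inside the last vertex's
    neighborhood) with probability p(t), and global PA otherwise."""
    rng = random.Random(seed)
    G = nx.path_graph(20)                       # initial chain of 20 vertices
    for t in range(20, T):
        last = pa_pick(G, list(G.nodes()), rng)  # global PA for the new word
        G.add_edge(t, last)
        p_t = min(1.0, 0.1 * t ** 0.16)          # p(t) ~ 0.1 * t^0.16
        for _ in range(m - 1):
            nbrs = list(G[last])
            others = [v for v in G if v != last and v not in G[last]]
            if rng.random() < p_t or not others:
                nxt = pa_pick(G, nbrs, rng)      # local PA in the neighborhood
            else:
                nxt = pa_pick(G, others, rng)    # global PA outside it
            G.add_edge(last, nxt)
            last = nxt
    return G

print(nx.average_clustering(grow_model2(2000)))
```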
Model 2
<c> = 0.08. Catches the global and nearest-neighbor behavior, but not the average clustering coefficient.
Model 3
Different words in written human language display different statistical distributions, according to their functions.
Model 3
• Start with a chain of 20 connected vertices.
• At each time step, add a new vertex and connect it to some vertex i (not among its nearest neighbors) with p ∝ ki.
• m(t) - 1 times: with probability q = 0.05, link the last linked vertex to one of three fixed vertices; with probability p(t), link the last vertex to an old vertex i in its nearest neighborhood through local PA (p ∝ ki); with probability 1 - p(t) - 3q, link to an old vertex i outside its nearest neighborhood with global PA.
A sketch of this modified step follows below.
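The following sketch extends the Model 2 sketch above, under the same walk assumption; which three vertices are preselected is also an assumption (the first three chain vertices here, standing in for very frequent function words):

```python
import random
import networkx as nx

def pa_pick(G, candidates, rng):
    return rng.choices(candidates, weights=[G.degree(v) for v in candidates], k=1)[0]

def grow_model3(T, m=3, q=0.05, seed=0):
    """Sketch of a Model-3-style rule: Model 2 plus three preselected
    vertices that each attract a link with probability q."""
    rng = random.Random(seed)
    G = nx.path_graph(20)
    fixed = [0, 1, 2]                            # assumed preselected vertices
    for t in range(20, T):
        last = pa_pick(G, list(G.nodes()), rng)  # global PA for the new word
        G.add_edge(t, last)
        p_t = min(1.0 - 3 * q, 0.1 * t ** 0.16)
        for _ in range(m - 1):
            r = rng.random()
            nbrs = list(G[last])
            others = [v for v in G if v != last and v not in G[last]]
            if r < 3 * q:
                nxt = rng.choice(fixed)          # one of the preselected vertices
            elif r < 3 * q + p_t or not others:
                nxt = pa_pick(G, nbrs, rng)      # local PA
            else:
                nxt = pa_pick(G, others, rng)    # global PA
            if nxt != last:
                G.add_edge(last, nxt)
            last = nxt
    return G

print(nx.average_clustering(grow_model3(2000)))
```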
Model 3
<c> = 0.20.
Conclusions
Two new growth mechanisms: (1) local PA; (2) the allocation of a set of preselected vertices.
The large-scale structure of semantic networks: statistical analyses and a model of semantic growth (Steyvers & Tenenbaum, 2005)
Introduction
• There are general principles governing the structure of network representations of natural language semantics.
• The small-world structure arises from a scale-free organization.
Model
• Concepts that enter the network early are expected to show higher connectivity.
• One aspect of semantic development is the growth of semantic networks by differentiation of existing nodes.
• The model grows through a process of differentiation analogous to mechanisms of semantic development, which allows it to produce both small-world and scale-free structure.
Analysis of semantic networks
• Free association norms
• WordNet
• Roget's thesaurus
Methods
Associative networks: two networks were created, one directed and one undirected.
Roget's thesaurus
• A bipartite graph with word nodes and semantic-category nodes.
• A connection is made between a word node and a category node when the word falls into that semantic category.
• Converted to a simple graph (a one-mode projection) for calculating the clustering coefficient, as sketched below.
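A sketch of the projection step in Python (the words and categories below are illustrative, not from Roget's thesaurus):

```python
import networkx as nx
from networkx.algorithms import bipartite

# Bipartite graph: word nodes on one side, semantic-category nodes on
# the other, with an edge when a word falls into a category.
B = nx.Graph()
words = ["happy", "glad", "bright", "light"]
categories = ["cheerfulness", "luminosity"]
B.add_nodes_from(words, bipartite=0)
B.add_nodes_from(categories, bipartite=1)
B.add_edges_from([
    ("happy", "cheerfulness"), ("glad", "cheerfulness"),
    ("bright", "cheerfulness"), ("bright", "luminosity"),
    ("light", "luminosity"),
])

# One-mode projection: two words are linked if they share a category,
# which makes the clustering coefficient well defined.
W = bipartite.projected_graph(B, words)
print(nx.average_clustering(W))
```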
WordNet
• 120,000+ word forms; 99,000+ word meanings.
• Links between forms and forms, meanings and meanings, and forms and meanings.
• Treated as an undirected graph.
Growing network model
Previous models:
• BA model: clustering coefficient too low.
• WS model: no scale-free structure.
Model A: undirected
At each time step, a new node with M links is added to the network by randomly choosing some existing node i for differentiation, and then connecting the new node to M randomly chosen nodes in the semantic neighborhood of node i.
• Set n equal to the size of the target network.
• Set M equal to ½ <k>.
A simulation sketch follows below.
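A sketch of Model A under the slide's uniform-random reading (the paper may weight these choices; including node i itself in the neighborhood pool is an assumption that guarantees M targets are always available, and the demo values of n and M are illustrative):

```python
import random
import networkx as nx

def grow_model_a(n, M, seed=0):
    """Sketch of Model A (undirected): each step picks an existing node
    i for differentiation and connects the new node to M nodes drawn
    from i's semantic neighborhood (here, i's neighbors plus i)."""
    rng = random.Random(seed)
    G = nx.complete_graph(M + 1)          # seed clique: every degree >= M
    for t in range(M + 1, n):
        i = rng.choice(list(G.nodes()))   # node to differentiate
        pool = list(G[i]) + [i]
        G.add_edges_from((t, v) for v in rng.sample(pool, M))
    return G

# n = size of the target network, M = half its mean degree
G = grow_model_a(n=5000, M=11)
print(G.number_of_nodes(), nx.average_clustering(G))
```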
Model B: directed
• Assume the direction of each arc is chosen randomly and independently of the other arcs.
• An arc points toward the old node with probability α and toward the new node with probability 1 - α, as in the snippet below.
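A minimal snippet for the orientation step (names are illustrative):

```python
import random

def orient_arc(new_node, old_node, alpha, rng):
    """Model B arc orientation: the arc points toward the old node with
    probability alpha and toward the new node with probability 1 - alpha."""
    return (new_node, old_node) if rng.random() < alpha else (old_node, new_node)

rng = random.Random(0)
print(orient_arc("new_word", "old_word", alpha=0.95, rng=rng))
```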
Results
• Models A and B were tested only on the association networks, with α = 0.95.
• Results are averaged over 50 simulations.
Patterns in syntactic dependency networks (Ferrer et al., 2004)
Introduction
• Co-occurrence networks fail to capture the characteristic long-distance correlations between words in sentences.
• The proportion of incorrect syntactic dependency links is high.
• A precise definition of a syntactic link is required.
The syntactic dependency network
Defined according to the dependency grammar formalism: vertices are words, and links go from a modifier to its head, as in the sketch below.
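A construction sketch in Python (the (modifier, head) pairs are illustrative, not from the paper's corpora):

```python
import networkx as nx

# Syntactic dependency network: vertices are words, and each arc runs
# from a modifier to its head, following the dependency-grammar
# convention described above.
deps = [
    ("the", "dog"), ("dog", "barked"), ("loudly", "barked"),  # "the dog barked loudly"
    ("the", "cat"), ("cat", "ran"),                           # "the cat ran"
]
G = nx.DiGraph()
G.add_edges_from(deps)
print(sorted(G.in_degree()))  # heads accumulate in-links from their modifiers
```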