The Large-Scale Structure of Semantic Networks
A. Tuba Baykara, Cognitive Science, 2002700187
Overview
1) Introduction
2) Analysis of 3 semantic networks and their statistical properties
- Associative Network
- WordNet
- Roget's Thesaurus
3) The Growing Network Model proposed by the authors
- Undirected Growing Network Model
- Directed Growing Network Model
4) Psychological Implications of the findings
5) General Discussion and Conclusions
1) Introduction
• Semantic Network: a network in which concepts are represented as hierarchies of inter-connected nodes, which are linked to characteristic attributes.
• Understanding their structure is important because they reflect the organization of meaning and language.
• Statistical similarities across networks are important because of their implications for language evolution and/or acquisition.
• Would a similarly grown model have the same statistical properties? → the Growing Network Model
1) Introduction: Predictions related to the model
1- It would have the same characteristics:
* The degree distribution would follow a power law → some concepts would have far more connections than others
* Adding new concepts would not change this structure → scale-free (not merely small-world!)
2- Previously added (early acquired) concepts would have higher connectivity than later added (acquired) concepts.
1) Introduction: Terminology
• Graph, network
• Node, edge (undirected link), arc (directed link), degree (k)
• Avg. shortest path (L), diameter (D), clustering coefficient (C), degree distribution P(k)
• Small-world network, random graph
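As a minimal sketch (not part of the original slides), these quantities can all be computed with the Python networkx library; the karate-club graph below is just a small stand-in for a real semantic network.

import networkx as nx

G = nx.karate_club_graph()                      # small stand-in graph

L = nx.average_shortest_path_length(G)          # avg. shortest path (L)
D = nx.diameter(G)                              # diameter (D)
C = nx.average_clustering(G)                    # clustering coefficient (C)
degrees = [d for _, d in G.degree()]            # raw data for P(k)

print(f"L={L:.2f}  D={D}  C={C:.2f}  <k>={sum(degrees)/len(degrees):.2f}")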
2) Analysis of 3 Semantic Networks: a. Associative Network
• "The University of South Florida Word Association, Rhyme and Word Fragment Norms"
• More than 6,000 participants; 750,000 responses to 5,019 cues (stimulus words)
• The great majority of these words are nouns (76%), but adjectives (13%), verbs (7%), and other parts of speech are also represented. In addition, 16% are identified as homographs.
2) Analysis of 3 Semantic Networks: a. Associative Network
Examples (the participant completes the blank with the first word that comes to mind):
• BOOK _______ → BOOK READ
• SUPPER _______ → SUPPER LUNCH
2) Analysis of 3 Semantic Networks: a. Associative Network
(When SUPPER was normed, it produced LUNCH as a target with a forward strength of .03.)
Note: for simplicity, the networks were constructed with all arcs and edges unlabeled and equally weighted. Forward & backward strengths only imply directions.
2) Analysis of 3 Semantic Networks: a. Associative Network
I) Undirected network
• Word nodes were joined by an edge if associatively related, regardless of associative direction.
(Figure: the shortest path from VOLCANO to ACHE is highlighted.)
2) Analysis of 3 Semantic Networks: a. Associative Network
II) Directed network
• Words x & y were joined by an arc from x to y if cue x evoked y as an associative response.
(Figure: all shortest directed paths from VOLCANO to ACHE are shown.)
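A sketch of this construction (the intermediate words LAVA and HOT are invented toy data, not the actual path from the norms):

import networkx as nx

pairs = [("VOLCANO", "LAVA"), ("LAVA", "HOT"), ("HOT", "ACHE")]  # toy cue-response norms
G = nx.DiGraph()
G.add_edges_from(pairs)                          # arc x -> y: cue x evoked response y

print(nx.shortest_path(G, "VOLCANO", "ACHE"))    # ['VOLCANO', 'LAVA', 'HOT', 'ACHE']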
2) Analysis of 3 Semantic Networks: b. Roget's Thesaurus
• The 1911 edition, with 29,000 words in about 1,000 semantic categories
• A connection is made only between a word and a semantic category, if that word appears within that category → a bipartite graph
2) Analysis of 3 Semantic Networks: b. Roget's Thesaurus
(Figure: the word-category bipartite graph and its unipartite projection, in which words are linked directly if they share a category.)
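A sketch of the bipartite-to-unipartite projection with networkx; the words and category names here are hypothetical toy data:

import networkx as nx
from networkx.algorithms import bipartite

B = nx.Graph()                   # toy bipartite word-category graph
B.add_edges_from([("supper", "Food"), ("lunch", "Food"), ("lunch", "Time")])
words = {"supper", "lunch"}
W = bipartite.projected_graph(B, words)   # link words that share a category
print(list(W.edges()))                    # [('supper', 'lunch')]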
2) Analysis of 3 Semantic Networks: c. WordNet
• Developed by George Miller at the Cognitive Science Laboratory at Princeton University: http://wordnet.princeton.edu
• Based on the relations between synsets; contained more than 120k word forms and 99k meanings
• Example: the noun "computer" has 2 senses in WordNet:
1. computer, computing machine, computing device, data processor, electronic computer, information processing system -- (a machine for performing calculations automatically)
2. calculator, reckoner, figurer, estimator, computer -- (an expert at calculation (or at operating calculating machines))
2) Analysis of 3 Semantic Networks: c. WordNet
• Links connect word forms and their meanings according to relationships between word forms such as:
• SYNONYMY
• POLYSEMY
• ANTONYMY
• HYPERNYMY (A computer is a kind of machine/device/object.)
• HYPONYMY (A digital computer / Turing machine ... is a kind of computer.)
• HOLONYMY (A computer is a part of a platform.)
• MERONYMY (A CPU/chip/keyboard ... is a part of a computer.)
• Links can be established in any desired way, so WordNet was treated as an undirected graph.
2) Analysis of 3 Semantic Networks: Statistical Properties
I) How sparse are the 3 networks?
• <k>: avg. number of connections per node
• In all 3, a node is connected to only a small percentage of the other nodes.
II) How connected are the networks?
• Undirected A/N: completely connected
• Directed A/N: the largest connected component contains 96% of all words
• WordNet & Thesaurus: 99%
→ All further analyses use these components!
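A small sketch of the component extraction that the analyses rely on, using networkx:

import networkx as nx

def largest_component(G):
    """Return the subgraph on the largest (strongly) connected component."""
    if G.is_directed():
        nodes = max(nx.strongly_connected_components(G), key=len)
    else:
        nodes = max(nx.connected_components(G), key=len)
    return G.subgraph(nodes).copy()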
2) Analysis of 3 Semantic Networks: Statistical Properties
III) Short path length (L) and diameter (D)
• In WordNet & the Thesaurus, L & D were estimated from a sample of 10,000 words; in the A/N, all words were considered.
• L & D are comparable to those of random graphs of equivalent size, as expected.
IV) Local clustering (C)
• To measure its C, the directed A/N was regarded as undirected.
• To calculate the C of the Thesaurus, the bipartite graph was converted into a unipartite graph.
• The C of all 4 networks is much higher than in comparable random graphs.
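A sketch of the sampling-based estimate of L and D used for the larger networks (the exact sampling scheme is an assumption; the slides only state that 10,000 words were sampled):

import random
import networkx as nx

def sampled_path_stats(G, n_samples=10_000, seed=0):
    """Approximate L and D from shortest paths out of sampled source nodes."""
    rng = random.Random(seed)
    sources = rng.sample(list(G), min(n_samples, len(G)))
    dists = []
    for u in sources:
        lengths = nx.single_source_shortest_path_length(G, u)
        dists.extend(d for d in lengths.values() if d > 0)
    return sum(dists) / len(dists), max(dists)   # (approx. L, approx. D)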
2) Analysis of 3 Semantic Networks: Statistical Properties
V) Power-law degree distribution P(k) ~ k^(-γ)
• All distributions are plotted in log-log coordinates, with the line showing the best-fitting power-law distribution.
• γ_in of the directed A/N is lower than the rest.
→ These semantic networks are scale-free!
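A sketch of the log-log least-squares estimate of the exponent γ; a maximum-likelihood fit (e.g. the powerlaw package) would be more robust, but this mirrors the best-fitting-line approach described above:

import numpy as np

def powerlaw_exponent(degrees):
    """Estimate gamma in P(k) ~ k^(-gamma) by a log-log least-squares fit."""
    ks, counts = np.unique(np.asarray(degrees), return_counts=True)
    pk = counts / counts.sum()
    mask = ks > 0                                 # log undefined at k = 0
    slope, _ = np.polyfit(np.log(ks[mask]), np.log(pk[mask]), 1)
    return -slope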
2) Analysis of 3 Semantic Networks: Statistical Properties / Summary
• Sparsity & high connectivity → on average, words are related to only a few other words
• Local clustering → connections between words are coherent and transitive: if x→y and y→z, then often x→z
• Short path length and diameter → language is expressive and flexible (through polysemy, homonymy, ...)
• Power-law degree distribution → language hosts hubs as well as many words connected to few others
3) The Growing Network Model
• Inspired by Barabási & Albert (1999)
• Incorporates both growth and preferential attachment
• Aim: to see whether the same mechanisms are at work in real-life semantic networks as in artificial ones
• Might be applied to lexical development in children, plus the growth of semantic structures across languages, or even language evolution
3) The Growing Network Model
Assumptions:
• Children learn concepts through semantic differentiation: a new concept differentiates an already existing one, acquiring a similar but distinct meaning, with a different pattern of connectivity.
• More complex concepts get differentiated more often.
• More frequent concepts get involved in differentiation more often.
3) The Growing Network Model: Structure
• Nodes are words, and connections are semantic associations/relations.
• Nodes differ in their utility → frequency of use.
• Over time, new nodes are added and attached to existing nodes probabilistically, according to:
• Locality principle: new links are added only into a local neighborhood → a set of nodes with a common neighbor
• Size principle: new connections go to neighborhoods that already have a large number of connections
• Utility principle: new connections within a neighborhood go to nodes with high utility (a rich-get-richer phenomenon)
3) The Growing Network Model: a. Undirected GN Model
• Aim: to grow a network with n nodes; the number of nodes at time t is n(t).
• Start with a fully connected network of M nodes (M << n).
• At each step t, add a new node with M links (M chosen for a desired avg. density of connections) into a local neighborhood H_i → the set of neighbors of an existing node i, including i itself.
• Choose a neighborhood according to the size principle:
P(H_i) = k_i(t) / Σ_j k_j(t)
where k_i(t) is the degree of node i at time t and the sum ranges over all n(t) nodes currently in the network.
3) The Growing Network Model: a. Undirected GN Model
• Connect to a node j in the neighborhood H_i according to the utility principle:
P(j | H_i) = U_j / Σ_{l in H_i} U_l
where U_j = log(f_j + 1), with f_j taken from the Kučera & Francis (1967) frequency counts, and the sum ranges over all nodes in H_i.
• If all utilities are equal, make the connection uniformly at random.
• Stop when n nodes are reached.
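A runnable sketch of the undirected GN model as described on these two slides; the random utilities are stand-ins for the Kučera & Francis log-frequencies, and the exact sampling of M distinct targets is an assumption:

import random
import networkx as nx

def grow_network(n, M, utilities=None, seed=0):
    rng = random.Random(seed)
    G = nx.complete_graph(M)                    # start fully connected, M << n
    # stand-in utilities; the paper uses U_j = log(f_j + 1)
    U = utilities or {v: rng.random() for v in range(n)}
    for new in range(M, n):
        # size principle: pick node i with probability k_i(t) / sum_j k_j(t)
        nodes = list(G)
        host = rng.choices(nodes, weights=[G.degree(v) for v in nodes])[0]
        # locality principle: candidates are the host's neighborhood, incl. the host
        H = list(set(G[host]) | {host})
        # utility principle: connect to M distinct nodes with probability ~ U_j
        targets = set()
        while len(targets) < min(M, len(H)):
            targets.add(rng.choices(H, weights=[U[v] for v in H])[0])
        G.add_edges_from((new, t) for t in targets)
    return G

G = grow_network(n=150, M=2)   # the toy setting shown on the next slide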
3) The Growing Network Model: a. Undirected GN Model
• The growth process and a small resulting network with n=150, M=2:
(Figure: snapshots of the growing network.)
3) The Growing Network Model: b. Directed GN Model
• Very similar to the undirected GN model: insert nodes with M arcs instead of edges.
• The same equations apply for the locality, size, and utility principles, since k_i = k_i^in + k_i^out.
• Difference → the direction principle: the majority (!) of arcs point from new nodes to existing nodes. The probability that an arc points away from the new node is α, where α > 0.5 is assumed, so most arcs point towards existing nodes.
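The direction principle in code, a small sketch following the reading above (the default α = 0.95 anticipates the setting on the next slide):

import random

def orient_arc(new, old, alpha=0.95, rng=random):
    """With probability alpha the arc points away from the new node
    (new -> old); otherwise it points back (old -> new)."""
    return (new, old) if rng.random() < alpha else (old, new)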
3) The Growing Network Model: Model Results
• Due to computational constraints, the GN model was compared only with the A/N.
• n=5,018; M=11 and M=12 in the undirected and directed GN models, respectively.
• The only free parameter of the directed GN model, α, was set to 0.95.
• The networks produced by the model are similar to the A/N in terms of their L, D, and C, with the same low γ_in as in the directed A/N.
3) The Growing Network Model: Model Results
• Also checked whether the same results would be produced when the directed GN model was converted into an undirected one (why?).
• All arcs were converted into edges, with M=11 and α=0.95.
• Results are similar to the undirected GN model.
• The degree distribution follows a power law.
3) The Growing Network Model: Argument
• L, C, and γ from the artificial networks were expected to be comparable to the real-life networks, because of:
• the incorporation of growth
• the incorporation of preferential attachment (the locality, size & utility principles)
• Do models without growth fail to produce such power laws?
• Analyze the co-occurrence of words within a large corpus → Latent Semantic Analysis (LSA): the meanings of words can be represented by vectors in a high-dimensional space.
• Landauer & Dumais (1997) had already shown that local neighborhoods in semantic space capture semantic relations between words.
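One plausible way to turn LSA-style word vectors into a network for this comparison, sketched under the assumption that each word is linked to its nearest neighbors by cosine similarity (the paper's exact thresholding may differ):

import numpy as np
import networkx as nx

def lsa_network(words, vectors, k=10):
    """Link each word to its k nearest neighbors by cosine similarity."""
    V = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = V @ V.T
    np.fill_diagonal(sims, -np.inf)            # ignore self-similarity
    G = nx.Graph()
    G.add_nodes_from(words)
    for i, w in enumerate(words):
        for j in np.argsort(sims[i])[-k:]:     # indices of the k most similar
            G.add_edge(w, words[j])
    return G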
3) The Growing Network Model: LSA Results
• Higher L, D, and C than in the real-life semantic networks
• Very different degree distributions: they do not follow a power law, and it is difficult to interpret the slope of the best-fitting line.
3) The Growing Network Model: LSA Results
• Analysis of the TASA corpus (>10 million words) using LSA vector representations.
(Figure panels: all words from LSA (>92k) represented as vectors; all words from the A/N that occur in TASA; the most frequent words in TASA.)
3) The Growing Network Model: LSA Results
• The absence of a power-law degree distribution implies that LSA does not produce hubs.
• In contrast, a growing model provides a principled explanation for the origin of the power law: words with high connectivity acquire even more connections over time.
4) Psychological Implications
• The number of connections a node has is related to the time at which the node was introduced into the network.
• Predictions:
• Concepts learned early in life will have more connections than concepts learned later.
• Concepts with high utility (frequency) will receive more links than concepts with lower utility.
4) Psychological Implications: Analysis of AoA-related data
To test the predictions, two data sets were analyzed:
I) Age-of-Acquisition ratings (Gilhooly & Logie, 1980)
• AoA effect: early acquired words are retrieved from memory more rapidly than late acquired words.
• An experiment with 1,944 words.
• Adults were asked to estimate the age at which they thought they had first learned each word, on a rating scale from 100 to 700 (700 = a very late-learned concept).
II) Picture-naming norms (Morrison, Chappell & Ellis, 1997)
• An estimate of the age at which 75% of children could successfully name the object depicted in a picture.
4) Psychological Implications: Analysis of AoA-related data
The predictions are confirmed!
(Figure: group means with standard error bars.)
4) Psychological Implications: Discussion
• Important consequences for psychological research on AoA and word frequency.
• Weakens the claims that:
• AoA affects mainly the speech output system
• AoA & word frequency exert their effects on behavioral tasks independently
• Confirms the findings that:
• early acquired words show short naming latencies and lexical-decision latencies
• AoA affects semantic tasks
• AoA is mere cumulative frequency
4) Psychological Implications: Correlational Analysis of Findings
• Early acquired words have more semantic connections (they are more central in the underlying semantic network) → early acquired words have higher degree centrality.
• Centrality can also be measured by computing the eigenvector of the adjacency matrix with the largest eigenvalue.
• Analysis of how degree centrality, word frequency, and AoA from the previous rating & naming studies correlate with two databases:
• a naming-latency database of 796 words
• a lexical-decision-latency database of 2,905 words
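A sketch of that eigenvector-centrality measure via power iteration on the adjacency matrix (networkx offers the same quantity as nx.eigenvector_centrality):

import numpy as np

def eigenvector_centrality(A, iters=100):
    """Power iteration: converges to the eigenvector of the adjacency
    matrix A with the largest eigenvalue (for a connected graph)."""
    x = np.ones(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x = x / np.linalg.norm(x)
    return x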
4) Psychological Implications: Correlational Analysis of Findings
• Centrality correlates negatively with latencies.
• AoA correlates positively with latencies.
• Word frequency correlates negatively with latencies.
• When the effects of word frequency and AoA are partialled out, the centrality-latency correlations remain significant → there must be other variables at work.
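A sketch of what "partialling out" amounts to, via regression residuals; the variable names in the usage comment are hypothetical:

import numpy as np

def partial_corr(x, y, *controls):
    """Correlate x and y after regressing out the control variables."""
    Z = np.column_stack([np.ones(len(x)), *controls])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # residualize x
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # residualize y
    return np.corrcoef(rx, ry)[0, 1]

# e.g. partial_corr(centrality, latency, frequency, aoa)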
5) General Discussion and Conclusions
• Weakness of the correlational analysis: the direction of causation is unknown:
• because a word is acquired early, it has more connections, vs.
• because a word has more connections, it is acquired early.
• A connectionist model can produce similar results: early acquired words are learned better.
5) General Discussion and Conclusions
• Power-law degree distributions in semantic networks can be understood through semantic growth processes → hubs emerge.
• Non-growing semantic representations such as LSA do not produce such a distribution per se.
• Early acquired concepts have richer connections → confirmed by the AoA norms.
References
• Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509-512.
• Gilhooly, K. J., & Logie, R. H. (1980). Age of acquisition, imagery, concreteness, familiarity and ambiguity measures for 1,944 words. Behavior Research Methods and Instrumentation, 12, 395-427.
• Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
• Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.
• Morrison, C. M., Chappell, T. D., & Ellis, A. W. (1997). Age of acquisition norms for a large set of object names and their relation to adult estimates and other variables. Quarterly Journal of Experimental Psychology, 50A, 528-559.
Thanks for your attention! Questions / comments are appreciated.
2) Analysis of 3 Semantic Networks: c. WordNet
Number of words (unique strings), synsets, and word-sense pairs:
POS        Unique Strings   Synsets   Total Word-Sense Pairs
Noun       114,648          79,689    141,690
Verb       11,306           13,508    24,632
Adjective  21,436           18,563    31,015
Adverb     4,669            3,664     5,808
Totals     152,059          115,424   203,145
2) Analysis of 3 Semantic Networks: Statistical Properties
For a random graph with N nodes and avg. degree <k> = pN:
• If <k> < 1, the graph is composed of isolated trees.
• If <k> > 1, a giant cluster appears.
• If <k> ≥ ln(N), the graph is totally connected.
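A quick sketch checking these thresholds empirically on Erdős-Rényi random graphs (ln(1000) ≈ 6.9):

import networkx as nx

N = 1000
for k_avg in (0.5, 2.0, 8.0):      # below 1, above 1, above ln(N)
    G = nx.gnp_random_graph(N, k_avg / N, seed=1)
    giant = max(nx.connected_components(G), key=len)
    print(f"<k> = {k_avg}: largest component has {len(giant)}/{N} nodes")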
Roget’s Thesaurus • WORDS EXPRESSING ABSTRACT RELATIONS • WORDS RELATING TO SPACE • WORDS RELATING TO MATTER • WORDS RELATING TO THE INTELLECTUAL FACULTIES • WORDS RELATING TO THE VOLUNTARY POWERS • WORDS RELATING TO THE SENTIMENT AND MORAL POWERS