Term Dependence on the Semantic Web
Gong Cheng and Yuzhong Qu
Institute of Web Science (IWS), Southeast University, P.R. China
at ISWC 2008, Karlsruhe, Germany
What is happening on the hypertext Web
How about the Semantic Web
Figure: foaf:Agent, foaf:Person and wn:Agent-3, linked by two rdfs:subClassOf arcs
Term Dependence Graph (TDG)
Figure: nodes foaf:Agent, foaf:Person, wn:Agent-3 and rdfs:subClassOf
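The slide shows the four TDG nodes but does not spell out the dependence rule. As a minimal sketch, assume that a term depends on the predicate and the object of every triple in which it is the subject. This rule, the function name `build_tdg` and the arc directions in the toy triples are assumptions made for illustration, not the authors' definition.

```python
from collections import defaultdict

def build_tdg(triples):
    """Map each described term to the set of other terms it depends on.

    Assumption (not from the slides): a term depends on the predicate and
    the object of every triple in which it appears as the subject.
    """
    depends_on = defaultdict(set)
    for subject, predicate, obj in triples:
        for other in (predicate, obj):
            if other != subject:  # self-loops are dropped
                depends_on[subject].add(other)
    return dict(depends_on)

# Toy input; the arc directions are assumed for illustration.
triples = [
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
    ("foaf:Agent", "rdfs:subClassOf", "wn:Agent-3"),
]
tdg = build_tdg(triples)

# Flatten the adjacency map into a sorted list of (source, target) arcs.
arcs = sorted((src, dst) for src, dsts in tdg.items() for dst in dsts)
```

Under this assumption the two triples yield four arcs over the same four nodes that the slide lists.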
Data Set
• As of April 2008
• 9.8 million well-formed RDF documents
• 114,408 hosts, or 7,290 registered domain names
• 400 million RDF triples
http://iws.seu.edu.cn/services/falcons/
Distribution of RDF documents on websites
Figure labels: bio2rdf.org, dbpedia.org, openlinksw.com, buzznet.com, bibsonomy.org, l3s.de, …
Size distribution of RDF documents
Figure annotations: maximum at 5; NCI Thesaurus; WordNet
How large is the TDG analyzed
• 1,278,233 terms as nodes (from 3,039 vocabularies)
  • 1,158,480 classes (90.6%)
  • 118,808 properties (9.3%)
  • 945 both (0.1%)
• 7,312,657 arcs (after removing self-loops)
Distribution of terms in vocabularies
Figure labels: EthanAnimals, DBpedia properties, FMA, OpenCyc
Degree analysis of TDG
• In-degree: direct influence degree
• Out-degree: direct dependence degree
• Average in-/out-degree: 5.72
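Each arc adds one to the out-degree of its source and one to the in-degree of its target, so the average in-degree and the average out-degree both equal the number of arcs divided by the number of nodes. The sketch below counts degrees on a toy arc list (the arcs and the name `degree_tables` are illustrative assumptions) and recomputes the reported average from the node and arc counts given for the TDG.

```python
from collections import Counter

def degree_tables(arcs):
    """Return (in_degree, out_degree) counters for a list of (source, target) arcs."""
    in_degree = Counter()
    out_degree = Counter()
    for src, dst in arcs:
        out_degree[src] += 1  # src depends on one more term
        in_degree[dst] += 1   # dst influences one more term
    return in_degree, out_degree

# Toy arcs, assumed for illustration only.
arcs = [
    ("foaf:Person", "rdfs:subClassOf"),
    ("foaf:Person", "foaf:Agent"),
    ("foaf:Agent", "rdfs:subClassOf"),
    ("foaf:Agent", "wn:Agent-3"),
]
in_degree, out_degree = degree_tables(arcs)
nodes = {term for arc in arcs for term in arc}
average_degree = len(arcs) / len(nodes)

# Average recomputed from the TDG size reported in the slides.
reported_average = round(7_312_657 / 1_278_233, 2)
```

The recomputed value rounds to the 5.72 stated on the slide.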
In-degree
Figure labels: rdf:type, rdfs:subClassOf, owl:Class, rdfs:label, rdfs:comment, rdfs:Class, owl:equivalentClass, …, cyc:guid, …
Figure annotation: 64.9%
Out-degree
Figure annotations: 40.9% at 5; focal classes
Correlation
• Pearson’s correlation coefficient between in-degrees and out-degrees
• 0.006 ∈ [-1, 1]
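For reference, Pearson's coefficient of two paired sequences is their covariance divided by the product of their standard deviations. The sketch below implements it directly; the function name `pearson` and the toy degree lists are assumptions for illustration and are not taken from the data set.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("coefficient is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)

# Toy per-node degrees (illustrative): in-degree falls as out-degree rises.
in_degrees = [0, 1, 2, 3]
out_degrees = [3, 2, 1, 0]
r = pearson(in_degrees, out_degrees)
```

A value near 0, such as the 0.006 reported here, indicates almost no linear relationship between how much a term is depended on and how much it depends on others.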
Distance analysis of TDG
Figure: nodes foaf:Agent, foaf:Person, wn:Agent-3 and rdfs:subClassOf
Dependence depth
• Average dependence depth: 10
Figure annotation: leaf classes in FMA
Connectivity analysis of TDG
Figure annotation: 93.4% in FMA
Connectivity analysis of TDG
• Over 40,000 disconnected pieces after 16 nodes are removed
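One way to reproduce this kind of measurement is to delete a chosen set of nodes and count the connected pieces that remain. The sketch below assumes that "pieces" means weakly connected components (arc direction ignored); the slides do not say which 16 nodes were removed, so the removed set is a parameter and the star-shaped toy graph is purely illustrative.

```python
def count_weak_components(nodes, arcs, removed=frozenset()):
    """Count weakly connected components after deleting the nodes in `removed`."""
    remaining = set(nodes) - set(removed)
    # Undirected adjacency over the surviving nodes only.
    neighbours = {node: set() for node in remaining}
    for src, dst in arcs:
        if src in remaining and dst in remaining:
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    seen = set()
    components = 0
    for start in remaining:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return components

# Toy star graph: three terms that all depend on one hub term.
nodes = ["hub", "a", "b", "c"]
arcs = [("a", "hub"), ("b", "hub"), ("c", "hub")]
before = count_weak_components(nodes, arcs)
after = count_weak_components(nodes, arcs, removed={"hub"})
```

In the toy graph a single hub holds everything together, which mirrors on a small scale how removing a handful of heavily used terms can break a graph into many pieces.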
Vocabulary Dependence Graph (VDG)
Figure: terms foaf:Agent, foaf:Person, wn:Agent-3 and rdfs:subClassOf, grouped into the vocabularies foaf, wn and rdfs
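A VDG of this kind can be sketched by collapsing every term into its vocabulary and keeping one arc per pair of distinct vocabularies. The helper `vocabulary_of`, which treats the prefix before the colon as the vocabulary, is a hypothetical simplification, and the toy term arcs are assumed for illustration.

```python
def vocabulary_of(term):
    """Hypothetical helper: treat the prefix before ':' as the term's vocabulary."""
    return term.split(":", 1)[0]

def build_vdg(term_arcs):
    """Collapse term-level arcs into a set of vocabulary-level arcs without self-loops."""
    vocab_arcs = set()
    for src, dst in term_arcs:
        v_src, v_dst = vocabulary_of(src), vocabulary_of(dst)
        if v_src != v_dst:  # arcs inside one vocabulary become self-loops and are dropped
            vocab_arcs.add((v_src, v_dst))
    return vocab_arcs

# Toy term-level arcs, assumed for illustration only.
term_arcs = [
    ("foaf:Person", "foaf:Agent"),
    ("foaf:Person", "rdfs:subClassOf"),
    ("foaf:Agent", "rdfs:subClassOf"),
    ("foaf:Agent", "wn:Agent-3"),
]
vdg = build_vdg(term_arcs)
```

Using a set means that many term-level arcs between the same two vocabularies count as a single vocabulary-level arc.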
How large is the VDG analyzed
• 3,039 vocabularies as nodes
• 11,392 arcs (after removing self-loops)
Degree analysis of VDG
• Average in-/out-degree: 3.75
Figure labels: rdf, rdfs, owl, daml
Main results
• Power-laws
• Complex structures within ontologies
  • long distance between terms
• Only a few links between ontologies (excluding language-level ontologies)
• All the data are available online.
More words
• Linking Open Data (LOD)
• Linking Ontologies on the Web (LOW)
  • third-party links, e.g. ontology matching
  • publishing links online (collaboratively)