Explore term dependence on the Semantic Web through an analysis of RDF documents and ontologies. The slides introduce the Semantic Web Term Dependence Graph (TDG) and Vocabulary Dependence Graph (VDG), with results on connectivity, degree, and power laws. The full report is available online.
Term Dependence on the Semantic Web. Gong Cheng and Yuzhong Qu, Institute of Web Science (IWS), Southeast University, P.R. China. Presented at ISWC 2008, Karlsruhe, Germany.
How about the Semantic Web? [Figure: foaf:Agent, foaf:Person, and wn:Agent-3 linked by rdfs:subClassOf]
Term Dependence Graph (TDG) [Figure: nodes foaf:Agent, foaf:Person, wn:Agent-3, rdfs:subClassOf]
Data Set • As of April 2008 • 9.8 million well-formed RDF documents • 114,408 hosts, or 7,290 registered domain names • 400 million RDF triples http://iws.seu.edu.cn/services/falcons/
Distribution of RDF documents on websites [Figure: labeled sites include bio2rdf.org, dbpedia.org, openlinksw.com, buzznet.com, bibsonomy.org, l3s.de, …]
Size distribution of RDF documents • Maximum at 5 • Labeled in the figure: NCI Thesaurus, WordNet
How large is the analyzed TDG? • 1,278,233 terms as nodes (from 3,039 vocabularies): 1,158,480 classes (90.6%), 118,808 properties (9.3%), 945 both (0.1%) • 7,312,657 arcs (after removing self-loops)
Distribution of terms in vocabularies [Figure: labeled vocabularies include EthanAnimals, DBpedia properties, FMA, OpenCyc]
Degree analysis of TDG • In-degree: direct influence degree • Out-degree: direct dependence degree • Average in-/out-degree: 5.72
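The average follows from the graph size alone: every arc adds one unit of in-degree and one unit of out-degree, so both averages equal arcs divided by nodes (7,312,657 / 1,278,233 ≈ 5.72). A minimal Python sketch of the degree computation, assuming the TDG is available as (source, target) pairs read as "source depends on target". That arc direction, the function name, and the toy arcs are illustrative assumptions, not taken from the data set.

```python
from collections import Counter


def degree_stats(arcs):
    """Return (in_degree, out_degree, average_degree) for a directed graph.

    Each arc is a (source, target) pair, read here as "source depends on
    target" (an assumed convention). Self-loops and duplicate arcs are
    dropped. Only nodes that appear in some arc are counted.
    """
    arcs = {(s, t) for s, t in arcs if s != t}
    nodes = {n for arc in arcs for n in arc}
    in_degree = Counter(t for _, t in arcs)    # direct influence degree
    out_degree = Counter(s for s, _ in arcs)   # direct dependence degree
    average = len(arcs) / len(nodes) if nodes else 0.0
    return in_degree, out_degree, average


# Hypothetical toy arcs, for illustration only.
toy_arcs = [
    ("foaf:Person", "foaf:Agent"),
    ("foaf:Person", "rdfs:subClassOf"),
    ("foaf:Agent", "rdfs:subClassOf"),
    ("foaf:Agent", "foaf:Agent"),  # self-loop, removed before counting
]
in_deg, out_deg, avg = degree_stats(toy_arcs)

# Sanity check against the figures on the slides.
reported_average = 7_312_657 / 1_278_233
```

Because in-degrees and out-degrees both sum to the number of arcs, a single average describes both.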
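In-degree [Figure: in-degree distribution; labeled terms include rdf:type, rdfs:subClassOf, owl:Class, rdfs:label, rdfs:comment, rdfs:Class, owl:equivalentClass, …, cyc:guid, …; annotation: 64.9%]
Out-degree [Figure: out-degree distribution; annotations: 40.9% at 5; focal classes]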
Correlation • Pearson's correlation coefficient between in-degrees and out-degrees: 0.006 (on a scale of [-1, 1]), i.e. almost no linear correlation
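For reference, Pearson's coefficient can be computed from two equal-length degree lists as sketched below. This is a plain textbook implementation in Python, not the authors' code, and the sample degree lists are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-term degrees: one entry per term, same order in both lists.
in_degrees = [1, 2, 3, 4]
out_degrees = [3, 1, 1, 3]
r = pearson(in_degrees, out_degrees)
```

A value near 0, like the reported 0.006, means that knowing how many terms depend on a term says almost nothing (linearly) about how many terms it depends on.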
Distance analysis of TDG [Figure: nodes foaf:Agent, foaf:Person, wn:Agent-3, rdfs:subClassOf]
Dependence depth • Average dependence depth: 10 • Labeled in the figure: leaf classes in FMA
Connectivity analysis of TDG [Figure annotation: 93.4% in FMA]
Connectivity analysis of TDG • Over 40,000 disconnected pieces after 16 nodes are removed
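The effect of removing a few hub nodes can be reproduced on any arc list by counting connected pieces before and after the removal. The Python sketch below treats arcs as undirected (weak connectivity), which is an assumption. The slide does not say which 16 nodes were removed, so the `removed` set here is a hypothetical choice on a toy graph.

```python
from collections import defaultdict


def count_weak_components(arcs, removed=frozenset()):
    """Count weakly connected components after deleting `removed` nodes."""
    neighbors = defaultdict(set)
    nodes = set()
    for s, t in arcs:
        for n in (s, t):
            if n not in removed:
                nodes.add(n)
        if s in removed or t in removed or s == t:
            continue  # skip arcs touching removed nodes, and self-loops
        neighbors[s].add(t)
        neighbors[t].add(s)  # ignore direction for weak connectivity

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:  # iterative depth-first search
            node = stack.pop()
            for nxt in neighbors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return components


# Hypothetical toy graph: three terms that all depend on one hub term.
toy_arcs = [
    ("ex:A", "rdf:type"),
    ("ex:B", "rdf:type"),
    ("ex:C", "rdf:type"),
    ("ex:A", "ex:B"),
]
before = count_weak_components(toy_arcs)
after = count_weak_components(toy_arcs, removed={"rdf:type"})
```

In the toy graph the single hub holds everything together; once it is removed, `ex:C` is cut off from the other two terms.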
Vocabulary Dependence Graph (VDG) [Figure: terms foaf:Agent, foaf:Person, wn:Agent-3, rdfs:subClassOf grouped into vocabularies foaf, wn, rdfs]
How large is the analyzed VDG? • 3,039 vocabularies as nodes • 11,392 arcs (after removing self-loops)
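One way to picture the step from TDG to VDG is to map each term to its vocabulary and keep one arc per pair of distinct vocabularies. The Python sketch below assumes a vocabulary can be identified by the prefix of a prefixed name such as `foaf:Person`; the slides do not state the actual rule, and the helper names and toy arcs are hypothetical.

```python
def vocabulary_of(term):
    """Map a prefixed name such as 'foaf:Person' to its prefix 'foaf'.

    Using the prefix as the vocabulary identifier is an assumption made
    for this sketch.
    """
    prefix, sep, _ = term.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {term!r}")
    return prefix


def build_vdg(term_arcs):
    """Collapse term-level arcs into vocabulary-level arcs, no self-loops."""
    vdg = set()
    for source, target in term_arcs:
        vs, vt = vocabulary_of(source), vocabulary_of(target)
        if vs != vt:  # arcs within one vocabulary become self-loops
            vdg.add((vs, vt))
    return vdg


# Hypothetical term-level arcs, for illustration only.
toy_term_arcs = [
    ("foaf:Person", "foaf:Agent"),
    ("foaf:Person", "rdfs:subClassOf"),
    ("foaf:Agent", "rdfs:subClassOf"),
    ("foaf:Agent", "wn:Agent-3"),
]
toy_vdg = build_vdg(toy_term_arcs)

# Sanity check against the figures on the slides.
reported_vdg_average = 11_392 / 3_039
```

The reported VDG size gives an average in-/out-degree of 11,392 / 3,039 ≈ 3.75.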
Degree analysis of VDG • Average in-/out-degree: 3.75 • Labeled in the figure: rdf, rdfs, owl, daml
Main results • Power laws • Complex structures within ontologies: long distances between terms • Only a few links between ontologies (excluding language-level ontologies) • All the data are available online.
More words • Linking Open Data (LOD) • Linking Ontologies on the Web (LOW): third-party links (e.g. ontology matching); publishing links online (collaboratively)