
Creating and Sharing Structured Semantic Web Contents through the Social Web

Creating and Sharing Structured Semantic Web Contents through the Social Web (Main Evaluation). Aman Shakya. Advisor: Prof. Hideaki Takeda. Sub-advisors: Assoc. Prof. Nigel Collier, Assoc. Prof. Kenro Aihara.


Presentation Transcript


  1. Creating and Sharing Structured Semantic Web Contents through the Social Web (Main Evaluation) • Aman Shakya • Advisor: Prof. Hideaki Takeda • Sub-advisors: Assoc. Prof. Nigel Collier, Assoc. Prof. Kenro Aihara

  2. Outline • Introduction • Social Semantic Web • State of the Art and Problems • Proposed approach • The StYLiD system • Concept consolidation • Concept grouping • Evaluation • Practical applications • Conclusions

  3. Introduction

  4. Background • Information Sharing • Information publishing • Understandable semantics • Information dissemination • Shared information • Better utilization → Increased value • Shared information put together • Valuable knowledge

  5. Social Web and Web 2.0 • Easy to publish, understand and use • Information sharing platform • User generated contents • Connecting people • Collaboration • Mass participation – Power of People • Wisdom of the crowds

  6. Current Limitations and Needs • Data processing and automation • Unstructured data only for humans • Interoperability • Sharing data across different applications • Integration • Combining data from different applications

  7. The Semantic Web • Web of Structured Data • Machine understandable semantics • Ontologies • Represent Conceptualizations of things • Consensus and common formats • Enables • Automated processing • Interoperation and Integration • Effective search and browsing

  8. Challenges • Difficult to publish on the Semantic Web • Wide variety of data to share • Long Tail of information domains (Huynh et al., 2007) • Not enough ontologies • Ontology creation is a difficult process • Goal - To enable people to easily share a wide variety of semantically structured data

  9. Social Semantic Web • Social software + Semantic Web • Web 3.0 • (Diagram with axes Social connectivity and Information connectivity, locating the Social Semantic Web; adapted from Decker, 2005)

  10. State of the Art: Social Semantic Web • Structured content creation on the Social Semantic Web • Direct Structured Contents • Instance Data Creation: Semantic Blogging, Semantic Bookmarking, Semantic Desktop, Semantic Annotation • Ontology + Instance Data creation: Semantic Wikis, Collaborative Ontology Creation • Derived Structured Contents • Semantification of Social Data: Semantics of Tags, Semantics from Text, Emergent Semantics • Data Exporters • Scrapers

  11. Collaborative Knowledge Base Creation • Knowledge base = ontology + instance data • (Diagram: Users and a Collaborative Knowledge Base)

  12. Collaborative Knowledge Base Creation Systems

  13. Problems • Complexity and learning curve • Powerful collaborative systems difficult for ordinary people • Difficult to create perfect concept definitions and ontologies • Difficult to accommodate all requirements • Strict constraints can make the model rigid • Existence of multiple conceptualizations • Different perspectives or contexts • Difficulty of collaboration and consensus

  14. Proposed Approach

  15. Proposed Collaborative Knowledge Base Creation • (Diagram: Users, three Local KBs, and the Collaborative Knowledge Base)

  16. Overview of Proposed Approach • Structured Data Collection • Social Platform for Structured Data Authoring • Concepts • Instances • Concept Consolidation • Schema Alignment • Concept Grouping • Grouped concepts • Structured Linked Data • Browsing, Searching, Services • Emerging Lightweight Ontologies • User Community

  17. StYLiD • Structure Your own Linked Data • http://www.stylid.org • Social Software for Sharing a wide variety of Structured Data • Users freely define their own concepts • Easy for ordinary people • Consolidate multiple concept schemas • Group and organize similar concepts • Popular, evolving concept definitions

  18. Creating a new Concept • Example: “Hotel” Concept • List of Attributes • Description • Suggested Value Range • Or Reuse / Modify existing Concept

  19. Instance Data • Example: Shinjuku Prince Hotel • Literal value • Pick value from Suggested range • Resource URI • External URI • Multiple Values

  20. Concept Consolidation • Hotel 1: Name, Amenities, Capacity, Contact, Price, Access, Rating • Hotel 2: Name, Facilities, No. of rooms, Phone-number, Single room price, Double room price, Nearest station, Category, Address • Hotel 3: Name, Price, Rating, City, Country, Near-by attractions • Hotel 4: Name, Phone-number, Zip-code, Latitude, Longitude, No. of stories • Relations between attributes: same; synonymous / different labels; different contexts / perspectives; many-to-one; complementary

  21. Hotel (Consolidated Concept) • Name • Facilities • Capacity • Contact • Single room price • Double room price • Access • Rating • Address • Zip-code • Latitude • Longitude • Near-by attractions • No. of stories
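
The consolidated view of instances can be illustrated with a minimal sketch. The attribute pairings below (for example Amenities with Facilities, or Capacity with No. of rooms) are assumptions read off the Hotel 1 and Hotel 2 attribute lists, not alignments stated on the slides; the names `ALIGNMENT` and `to_consolidated` and the sample hotels are hypothetical.

```python
# Assumed alignment: consolidated attribute -> {source concept: [source attributes]}.
# The pairings are illustrative guesses, not taken from the presentation.
ALIGNMENT = {
    "Name": {"Hotel 1": ["Name"], "Hotel 2": ["Name"]},
    "Facilities": {"Hotel 1": ["Amenities"], "Hotel 2": ["Facilities"]},
    "Capacity": {"Hotel 1": ["Capacity"], "Hotel 2": ["No. of rooms"]},
    "Contact": {"Hotel 1": ["Contact"], "Hotel 2": ["Phone-number"]},
}


def to_consolidated(source, instance):
    """Translate one source instance into the consolidated attribute names."""
    view = {}
    for attr, sources in ALIGNMENT.items():
        # Collect the values of every source attribute aligned to this attribute.
        values = [instance[a] for a in sources.get(source, []) if a in instance]
        if values:
            view[attr] = values[0] if len(values) == 1 else values
    return view


# Two made-up instances defined under different conceptualizations.
rows = [
    to_consolidated("Hotel 1", {"Name": "Hotel A", "Amenities": "Pool", "Capacity": 120}),
    to_consolidated("Hotel 2", {"Name": "Hotel B", "Facilities": "Spa", "No. of rooms": 80}),
]
for row in rows:
    print(row)
```

Both rows end up with the same column names, which is what allows instances from different schemas to be shown in one table.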

  22. Concept Consolidation • A concept consolidation C is defined as a triple $\langle \bar{C}, S, A \rangle$ where • $\bar{C}$ is the consolidated concept • $S$ is the set of constituent concepts $\{C_1, C_2, \ldots, C_n\}$ • $A$ is the attribute alignment between $\bar{C}$ and $S$ • Based on Global-as-View (GAV) approach for data integration (Lenzerini, 2002) • Global schema defined as views on source schemas • Consolidated Concept with consolidated attributes • aligned to source concept attributes as views

  23. Concept Consolidation • (Figure: the triple $\langle \bar{C}, S, A \rangle$, with the alignment $A$ drawn as a set of aligned(·, ·) relations; diagram labels: image, view)

  24. Concept Consolidation • Consolidated view of instances • Translation of instances • From one conceptualization to another • Query Unfolding (Advantage of GAV over LAV) • Queries over $\bar{C}$ (in terms of its attributes) are unfolded into queries over $\{C_1, C_2, \ldots, C_n\}$ • Using alignment $A$ • Union of results • Translation of queries

  25. Concept Cloud • Consolidated concept • Sub-Cloud

  26. Experiment on Conceptualization • Hypothesis • Multiple conceptualizations by different people for the same thing can be consolidated • Methodology • Participants (6) given short text passages • List down facts structured as an (Attribute, Value) table, giving a concept schema • All concept schemas aligned manually

  27. Observations • (Tables: Types of alignment relations found; Attribute label similarity)

  28. Remarks • People can express their conceptualizations in terms of schema • Different people have different conceptualizations • No one covers all possible attributes • Conceptualizations overlap significantly • Most parts can be aligned • Most have simple alignment relations • Multiple conceptualizations can be consolidated

  29. Alignment of Concept Schemas • Attribute alignments suggested automatically • Alignment API implementation (with WordNet extension) (Euzenat, 2004) • Community-supported alignment • Human intelligence + Machine intelligence • Alignments are represented and saved • Alignment ontology (Hughes and Ashpole, 2004) • Alignment API alignment specification language (Euzenat et al., 2004) • Other formats: C-OWL, SWRL, OWL axioms, XSLT, SEKT-ML and SKOS • Incremental alignment (maintained collaboratively) • A Unified View • Consolidated concept with Consolidated Attributes • Homogeneous table of data

  30. Semi-automatic Schema Alignment • (Screenshot: two Hotel concepts × consolidated attributes)

  31. Consolidated Structured Search • Find all hotels with location “Tokyo” and type “luxury” • Search on Consolidated Concept • Hotel 1 → Hotel 2 • location → address • type → category
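
A rough sketch of how such a consolidated search could work by query unfolding: the query over the consolidated attributes is rewritten into one query per source concept using the alignment, and the results are unioned. It assumes the consolidated attributes are named location and type, as in the example query; the instance data and the names `ALIGNMENT`, `SOURCES`, `unfold` and `search` are hypothetical.

```python
# Alignment from the slide: location <-> address, type <-> category.
ALIGNMENT = {
    "location": {"Hotel 1": "location", "Hotel 2": "address"},
    "type": {"Hotel 1": "type", "Hotel 2": "category"},
}

# Made-up instances stored under each source concept's own attribute names.
SOURCES = {
    "Hotel 1": [
        {"name": "Hotel A", "location": "Tokyo", "type": "luxury"},
        {"name": "Hotel B", "location": "Osaka", "type": "luxury"},
    ],
    "Hotel 2": [
        {"name": "Hotel C", "address": "Tokyo", "category": "luxury"},
        {"name": "Hotel D", "address": "Tokyo", "category": "budget"},
    ],
}


def unfold(query):
    """Rewrite a query over consolidated attributes into one query per source."""
    return {
        source: {ALIGNMENT[attr][source]: value for attr, value in query.items()}
        for source in SOURCES
    }


def search(query):
    """Run each unfolded query on its source and return the union of the hits."""
    hits = []
    for source, source_query in unfold(query).items():
        for instance in SOURCES[source]:
            if all(instance.get(a) == v for a, v in source_query.items()):
                hits.append(instance["name"])
    return hits


hits = search({"location": "Tokyo", "type": "luxury"})
print(hits)
```

Because each consolidated attribute is defined as a view over source attributes (the GAV style), the rewriting is a plain substitution of attribute names.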

  32. Concept Grouping • Concept Similarity • ConceptSim(C1, C2) = w1 * NameSim(N1, N2) + w2 * SchemaSim(S1, S2) • NameSim • WordNet-based similarity - Lin’s algorithm (1998) • Levenshtein distance • SchemaSim • Average similarity of best matching pairs of attributes • Calculate ConceptSim between all pairs of concepts • Group similar concepts above Threshold
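
The weighted combination and threshold grouping can be sketched as follows. Several simplifications are assumed here and are not from the slides: NameSim uses only normalized Levenshtein distance (no WordNet), SchemaSim is replaced by a plain Jaccard overlap of attribute names, and groups are formed as connected components of the "similar" relation, since the slide does not say how groups are built. The weights 0.7 / 0.3 and threshold 0.8 are the values used later in the evaluation slide; the concepts are made up.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def name_sim(a, b):
    """Levenshtein distance normalized to a similarity in [0, 1]."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def schema_sim(s1, s2):
    """Stand-in for SchemaSim: Jaccard overlap of attribute names (assumption)."""
    s1, s2 = set(s1), set(s2)
    return len(s1 & s2) / len(s1 | s2) if s1 | s2 else 0.0


def concept_sim(n1, s1, n2, s2, w1=0.7, w2=0.3):
    return w1 * name_sim(n1, n2) + w2 * schema_sim(s1, s2)


def group_concepts(concepts, threshold=0.8):
    """Merge every pair whose ConceptSim reaches the threshold (union-find)."""
    names = list(concepts)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if concept_sim(a, concepts[a], b, concepts[b]) >= threshold:
                parent[find(b)] = find(a)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


concepts = {
    "Recipe": ["ingredients", "steps", "author"],
    "Recipes": ["ingredients", "steps", "author", "taste"],
    "Hotel": ["name", "price"],
}
groups = group_concepts(concepts)
print(groups)
```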

  33. Schema Similarity • Calculate NameSim for all pairs of attributes to create an n1 × n2 matrix M = [NameSim(a1, a2)] for a1 in A1, a2 in A2 • Find best matching pairs by applying the Hungarian Algorithm to M (Kuhn, 1955; Munkres, 1957) • Calculate matching average • SchemaSim(S1, S2) = 2 × (sum of similarities of best matching pairs) / (|A1| + |A2|) • Adapted from semantic similarity between sentences (Simpson and Dao, 2005) • (Diagram: schemas S1 and S2 with attribute sets A1 and A2)
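
A small sketch of the matching average, under stated simplifications: NameSim is normalized Levenshtein similarity only, and the best one-to-one matching is found by exhaustive search over assignments rather than by the Hungarian algorithm. Both yield the same optimal matching, but the exhaustive search is only practical for small schemas. Function names and sample attributes are hypothetical.

```python
from itertools import permutations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def name_sim(a, b):
    """Levenshtein distance normalized to a similarity in [0, 1]."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def schema_sim(attrs1, attrs2):
    """Matching average: 2 * (best matching total) / (|A1| + |A2|)."""
    if not attrs1 or not attrs2:
        return 0.0
    short, long_ = sorted((list(attrs1), list(attrs2)), key=len)
    # Exhaustive stand-in for the Hungarian algorithm: try every one-to-one
    # assignment of the shorter attribute list into the longer one.
    best = max(
        sum(name_sim(a, b) for a, b in zip(short, perm))
        for perm in permutations(long_, len(short))
    )
    return 2 * best / (len(attrs1) + len(attrs2))


same = schema_sim(["name", "price"], ["price", "name"])
partial = schema_sim(["name"], ["name", "rating", "city"])
print(same, partial)
```

Dividing by |A1| + |A2| rather than by the number of matched pairs penalizes schemas of very different sizes, since unmatched attributes lower the score.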

  34. Visualization of Concept Grouping • Cytoscape

  35. Experiments on Freebase Data • Purpose • Evaluate automatic schema alignment • Evaluate proposed concept grouping method • Observations about user-defined concepts • Freebase • Community-driven database of the world’s information • User-defined Types – concept schemas • Queried out (May 20, 2008) • Cleaning • Filter out test types, stop-words, types without instances

  36. Observations • After cleaning • 1,412 concepts • 500 users who defined concepts • People want to share a wide variety of data • People define their own concept schemas • Most people define only a few concepts (1-5) • Long tail of information types

  37. Freebase Concept Consolidation • Concepts with same name, synonyms, morphological variants • 57 consolidated concepts formed • Multiple versions of concept by different users • Up to 6 versions of the same concept • Same user also defines multiple versions • Alignments suggested automatically • 51 alignment relations (44 aligned attribute sets) • Human judgement • Precision 88.24% • Recall 67.16%
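
As a consistency check on the reported figures, the arithmetic can be worked through. Only the 51 suggested relations and the two percentages come from the slide. The counts 45 (suggestions judged correct) and 67 (relations a human would align) are not stated anywhere in the presentation; they are inferred here as the integers that reproduce the reported percentages, assuming the 51 suggestions are the denominator of precision.

```python
suggested = 51   # alignment relations suggested automatically (from the slide)
correct = 45     # ASSUMED: inferred from the reported precision, not stated
relevant = 67    # ASSUMED: inferred from the reported recall, not stated

precision = correct / suggested   # share of suggestions judged correct
recall = correct / relevant       # share of true alignments that were suggested

print(f"precision = {precision:.2%}, recall = {recall:.2%}")
```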

  38. Concept Consolidation Example • Aligned attribute sets (adapted from Freebase) • Constituent concepts {Recipe(user1), Recipe(user2), Recipes(user3), …}, labelled r1, r2, r3 • Consolidated concept - Recipe • Consolidated attributes • {r1#ingredient, r2#ingredients, r3#materials} • {r1#steps, r2#instructions} • r3#directions • r2#tools_required • r3#taste • r3#author • …

  39. Evaluation of Concept Grouping • ConceptSim(C1, C2) = w1 * NameSim(N1, N2) + w2 * SchemaSim(S1, S2) • Concept grouping with different thresholds (w1 = 0.7, w2 = 0.3) • Concept grouping with different weights (threshold = 0.8)

  40. Emergence of Lightweight Ontologies • Concepts contributed by community • Concept consolidation • Concept grouping • Popularity of concepts (as in Tag clouds) • Common vocabulary for structured information sharing • Conceptual schemas (class/property) • Informal organization by similarity

  41. Informal Lightweight Ontology • Source: Schaffert et al. (2005), p. 7

  42. Evaluation

  43. Evaluation of Usability • Hypothesis • StYLiD is more usable than Freebase (for given tasks) • Methodology • Tasks performed with StYLiD and Freebase • Task 1 - Structured data authoring • Task 2 - Concept schema creation • Tasks 3, 4 - Modifying and reusing concepts • Task 5 - Structured concepts and instances authoring • Task 6 - Searching • Observations • Questionnaires, screen logs, comments, etc.

  44. Example (Task 1) • Input: Band – The Beatles

  45. Participants • Total 15 participants • Including 6 without IT background • Different backgrounds • Public policy, international relations, psychology, telecommunication, networks, hotel staff, etc. • From 10 countries • Age: 22 – 43 (avg. 28.3) • Most did not know the systems before

  46. Results • System Usability Scale (SUS) (Digital Equipment Corp.) • Average scores: StYLiD – 69.7%, Freebase – 39.3% • Enhanced Semantic MediaWiki – 54.8% (Pfisterer et al., 2008) • Aggregated results from the Tasks (score: 0-4)
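
For readers unfamiliar with the scale, a short sketch of standard SUS scoring: ten items answered on a 1 to 5 scale, odd-numbered items contributing (response − 1), even-numbered items contributing (5 − response), and the total multiplied by 2.5 to give a score out of 100. The questionnaire responses below are invented for illustration and are not data from this study; `sus_score` is a hypothetical helper name.

```python
def sus_score(responses):
    """Score one SUS questionnaire (10 responses, each from 1 to 5)."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly 10 responses in the range 1-5")
    total = 0
    for item, response in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5


# Invented example questionnaire, not taken from the evaluation.
example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
score = sus_score(example)
print(score)
```

Averaging such per-participant scores gives system-level figures comparable to the percentages reported on the slide.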

  47. Results for non-IT participants • 6 participants • SUS scores • StYLiD (71.67%), Freebase (50.42%)

  48. Observations • StYLiD quite usable without any training, knowledge or help • Most users preferred StYLiD to Freebase • Specifying attribute value range not easy • Strict data type constraints can cause problems • Many people modify and reuse concepts • People try to input all data in minimum steps • Data entry can be made easier and quicker • Auto-complete mechanisms would be helpful

  49. Comparison with some systems

  50. Practical Applications
