This article explores the use of knowledge graphs and text analytics in data-driven publishing, focusing on the Ontotext Platform. Learn how knowledge graphs enhance traditional relational databases and how text analytics can provide semantic disambiguation and annotation. Discover how the combination of graph-based reasoning and vector space similarity can improve content search, recommendation, and relevance ranking. Gain insights into dynamic semantic publishing and its benefits for content curation, search, recommendation, and workflow optimization.
Towards Data Driven Publishing: Leveraging Knowledge Graphs and Text Analytics
Contech 2018
Jem Rayfield, November 2018
Outline
• From: Unstructured Ambiguous Content
• Knowledge Graphs
• Ontotext Platform
• To: Data Driven Publishing
From: Unstructured Ambiguous Content
How can I get OP?
[Figure: two parse trees (nodes S, NP, VP, PP, Adj, N, V, P) for the ambiguous sentence "Stolen painting found by tree"]
Graphs treat the connections between information with equal importance.
Knowledge graphs represent information in a manner similar to how a human understands information.
Ontotext GraphDB uses graph statements to reason and infer additional knowledge, and vector space indices for similarity.
Graph: Reasoning & Inference
• DATA (RDF): S = Berners-Lee, P = type, O = Person
• KNOWLEDGE (ONTOLOGY): S = Person, P = subClassOf, O = Mammal
• NEW implied DATA (RDF): S = Berners-Lee, P = type, O = Mammal
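The inference on this slide can be sketched in a few lines of Python. This is a minimal illustration of the subClassOf rule only, using plain tuples; the function name `infer_types` is hypothetical, and it is not how GraphDB implements reasoning.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def infer_types(triples: Set[Triple]) -> Set[Triple]:
    """Return the input triples plus every `type` statement implied by `subClassOf`."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new statement is produced (fixed point)
        changed = False
        sub_class = {(s, o) for s, p, o in inferred if p == "subClassOf"}
        types = {(s, o) for s, p, o in inferred if p == "type"}
        for instance, cls in types:
            for sub, sup in sub_class:
                implied = (instance, "type", sup)
                if cls == sub and implied not in inferred:
                    inferred.add(implied)
                    changed = True
    return inferred


# The statements from the slide.
data = {("Berners-Lee", "type", "Person")}           # DATA (RDF)
ontology = {("Person", "subClassOf", "Mammal")}      # KNOWLEDGE (ONTOLOGY)

new_facts = infer_types(data | ontology) - data - ontology
print(new_facts)  # the NEW implied DATA
```

The loop runs to a fixed point, so longer subclass chains (for example a hypothetical Mammal subClassOf Animal) would also be followed.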
Graph & Vector Space: Entity Awareness + Similarity
Big Knowledge Graphs: Provide Awareness
• Important airports near London?
• Most popular banks in the UK
• People mentioned together with Apple in the news
Vector Space: Similarity & Concordance
• Find similar content
• Find similar concepts and link
• Find relevant concepts for content
GraphDB Vector Space: Similarity & Concordance
[Figure: documents annotated with graph IDs (urn:Car, urn:Make, urn:Engine, urn:Tires, urn:Model, urn:SUV, urn:MLModel, urn:Markov) feed a vector space index, shown as a 0/1 matrix with one row per concept, which is used to compute similarity]
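As a rough illustration of the idea, the sketch below turns concept annotations into 0/1 vectors and compares documents by cosine similarity. The three documents and their concept assignments are hypothetical (the slide's matrix values are not reproduced here), and cosine similarity is an assumed measure, not a statement about GraphDB's index.

```python
import math
from typing import Dict, List


def concept_vector(concepts: List[str], vocabulary: List[str]) -> List[int]:
    """Binary vector: 1 where the document is annotated with the concept."""
    present = set(concepts)
    return [1 if concept in present else 0 for concept in vocabulary]


def cosine(a: List[int], b: List[int]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents annotated with graph IDs.
docs: Dict[str, List[str]] = {
    "doc_a": ["urn:Car", "urn:Make", "urn:Engine", "urn:Tires"],
    "doc_b": ["urn:Car", "urn:Make", "urn:Model", "urn:SUV", "urn:Tires"],
    "doc_c": ["urn:MLModel", "urn:Markov", "urn:Model"],
}

vocabulary = sorted({c for concepts in docs.values() for c in concepts})
vectors = {doc_id: concept_vector(c, vocabulary) for doc_id, c in docs.items()}


def most_similar(doc_id: str) -> str:
    """Return the other document with the highest cosine similarity."""
    scores = {
        other: cosine(vectors[doc_id], vec)
        for other, vec in vectors.items()
        if other != doc_id
    }
    return max(scores, key=scores.get)


print(most_similar("doc_a"))
```

Because the vectors are built from disambiguated graph IDs rather than raw words, the two car documents end up close together while the machine-learning document, which only shares urn:Model with one of them, stays distant.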
Analyses content
[Diagram: Content is sent to the Text Analytics API, which returns Concept Suggestions, Classification, Sentiment, Relationships, ...]
TA: Vocabulary Aware Semantic Disambiguation
Operations: Annotate Content, Get Suggestions
Example content: "Apple CEO Tim Cook was at a conference with the CEO of Samsung. Tim explained how smart phones are changing the consumer electronics market."
NLP Pipeline stages shown: Language Detection, POS, Entity Detection from Vocab, Disambiguation, Suggestions, ...
Vocabulary Gazetteer (Dynamic Vocabulary from GraphDB)
• Apple : Organisation
• Tim Cook : Person, CEO
• Tim Cook : Person, Footballer
• Samsung : Organisation
• ...
Vocabulary Disambiguation (ML Model) and Relevance Ranking (Statistical)
• 87% - Tim Cook : Person, CEO
• 68% - Apple : Organisation
• 56% - Samsung : Organisation
• ...
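A toy version of the gazetteer lookup and disambiguation step might look like the following. The slide attributes disambiguation to an ML model; this sketch swaps in a simple keyword-overlap heuristic instead, and the `Candidate` class, the `context_terms` hints and the `annotate` function are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Candidate:
    label: str                      # surface form looked up in the text
    types: Tuple[str, ...]          # vocabulary types, as on the slide
    context_terms: FrozenSet[str]   # hypothetical hints used to disambiguate


# Gazetteer entries from the slide; the context terms are invented for illustration.
GAZETTEER: List[Candidate] = [
    Candidate("Apple", ("Organisation",), frozenset({"ceo", "phones", "electronics"})),
    Candidate("Tim Cook", ("Person", "CEO"), frozenset({"ceo", "apple", "conference", "electronics"})),
    Candidate("Tim Cook", ("Person", "Footballer"), frozenset({"goal", "match", "club", "league"})),
    Candidate("Samsung", ("Organisation",), frozenset({"ceo", "phones", "electronics"})),
]


def annotate(text: str, gazetteer: List[Candidate]) -> Dict[str, Candidate]:
    """Pick, for each label found in the text, the candidate whose hints overlap the text most."""
    lowered = text.lower()
    tokens = set(re.findall(r"[a-z]+", lowered))
    best: Dict[str, Tuple[Candidate, int]] = {}
    for candidate in gazetteer:
        if candidate.label.lower() not in lowered:
            continue  # label does not occur in the content
        score = len(candidate.context_terms & tokens)
        if candidate.label not in best or score > best[candidate.label][1]:
            best[candidate.label] = (candidate, score)
    return {label: candidate for label, (candidate, _) in best.items()}


content = (
    "Apple CEO Tim Cook was at a conference with the CEO of Samsung. "
    "Tim explained how smart phones are changing the consumer electronics market."
)
result = annotate(content, GAZETTEER)
for label, candidate in result.items():
    print(label, ":", ", ".join(candidate.types))
```

The surrounding words ("CEO", "conference", "electronics") are what separate the two Tim Cook entries here; a trained model would learn such signals from a corpus rather than from hand-written hints.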
Automated (Governed) Machine Learning
[Diagram: a feedback loop. Text Analytics loads the Machine Learnt Model and suggests annotations; Curation (Accept | Reject | Modify) moderates them and modifies the Gold Standard Corpus [W3C Open Annotation]; the corpus is used to re-train and update the model]
Annotates content with knowledge
[Diagram: Content, Open Annotation API, Content with Semantic Fingerprint]
Content, Vocabulary and Annotation Graph
Annotations (each of type Annotation, with a target and a tag):
• tag Tim Cook (Person), relevance:87%, textpos:123,142
• tag Apple (Organisation), relevance:68%, textpos:123,142
• tag Samsung (Organisation), relevance:56%, textpos:123,142
Content links shown: about, mentions
Vocabulary nodes shown: Apple, Samsung, Tim Cook, USA, NASDAQ, Computer Hardware
Vocabulary edge labels shown: type, ceo, competitor, location, exchange, sector
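One way to picture the annotation graph is as a small record per annotation that is flattened into graph statements. The sketch below is an assumption-laden illustration: the `Annotation` class, the `urn:` identifiers and the predicate names are hypothetical stand-ins for the properties on the slide, not the platform's Open Annotation schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

Statement = Tuple[str, str, object]  # (subject, predicate, object)


@dataclass
class Annotation:
    annotation_id: str         # hypothetical identifier
    target: str                # the content being annotated
    tag: str                   # the vocabulary concept
    relevance: float           # relevance score, 0..1
    textpos: Tuple[int, int]   # start and end offsets in the text


def to_statements(annotation: Annotation) -> List[Statement]:
    """Flatten one annotation into graph statements."""
    start, end = annotation.textpos
    return [
        (annotation.annotation_id, "type", "Annotation"),
        (annotation.annotation_id, "target", annotation.target),
        (annotation.annotation_id, "tag", annotation.tag),
        (annotation.annotation_id, "relevance", annotation.relevance),
        (annotation.annotation_id, "textpos", f"{start},{end}"),
    ]


# Values taken from the slide; identifiers are made up.
annotations = [
    Annotation("urn:annotation:1", "urn:content:1", "Tim Cook", 0.87, (123, 142)),
    Annotation("urn:annotation:2", "urn:content:1", "Apple", 0.68, (123, 142)),
    Annotation("urn:annotation:3", "urn:content:1", "Samsung", 0.56, (123, 142)),
]

graph = [statement for a in annotations for statement in to_statements(a)]

# Concepts attached to the content with relevance of at least 60%.
strong_tags = [a.tag for a in annotations if a.relevance >= 0.6]
print(strong_tags)
```

Keeping relevance and text position on the annotation, rather than on the content or the concept, lets the same concept be attached to many documents with different scores.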
Understands content
[Diagram: Content nodes linked (about, mentions) into the Knowledge Graph. Nodes shown: Apple, Samsung, Tim Cook, NASDAQ, Computer Hardware, USA, UK. Edge labels shown: ceo, industry, exchange, headquarters, located in]
Understands users
[Diagram: User Data linked into the Knowledge Graph. Nodes shown: User, Apple Inc, Samsung, Tim Cook, NASDAQ, Computer Hardware, USA, UK. Edge labels shown: interested in, employed by, lives in, located in, headquartered in, exchange, industry, ceo]
Captures behaviour
[Diagram: User, Events, Event API, Event Index]
Understands behaviour
• User Behaviour: concept:follow, content:view, content:scroll, content:dwell
• Social Behaviour: tweet:view, hashtag:follow
Mine social behaviour
[Diagram: User, Events, Social API, Event Index, User Behaviour]
Behavioral + Contextual recommendation
[Figure: behavioral similarity based on what users read]
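A hedged sketch of how behavioral and contextual signals could be blended is shown below. The Jaccard measure, the 50/50 weighting, the `recommend` function and the sample users and articles are all assumptions made for illustration; the slides do not specify how the platform combines the two.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets, 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(
    user: str,
    reads: Dict[str, Set[str]],      # user -> items the user has read
    concepts: Dict[str, Set[str]],   # item -> concepts annotated on the item
    weight: float = 0.5,             # share given to the behavioral score
) -> List[str]:
    """Rank unread items by a blend of behavioral and contextual similarity."""
    seen = reads[user]
    # Contextual profile: every concept on the items this user has read.
    profile: Set[str] = set().union(*(concepts[item] for item in seen))
    scores: Dict[str, float] = {}
    for item, item_concepts in concepts.items():
        if item in seen:
            continue
        contextual = jaccard(profile, item_concepts)
        # Behavioral: the most similar other reader who has read this item.
        behavioral = max(
            (jaccard(seen, other_reads)
             for other, other_reads in reads.items()
             if other != user and item in other_reads),
            default=0.0,
        )
        scores[item] = weight * behavioral + (1 - weight) * contextual
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical reading history and annotations.
reads = {"alice": {"a1", "a2"}, "bob": {"a1", "a2", "a3"}, "carol": {"a4"}}
concepts = {
    "a1": {"Apple", "Tim Cook"},
    "a2": {"Apple", "Samsung"},
    "a3": {"Samsung", "Computer Hardware"},
    "a4": {"BBC Sport"},
}

print(recommend("alice", reads, concepts))
```

Here article a3 ranks first for alice on both counts: a reader with similar history has read it, and it shares the Samsung concept with her profile.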
Increased Engagement
[Diagram: User Behaviour and Social Behaviour + User Data Knowledge Graph + Content Knowledge Graph = Increased Engagement]
Architecture
[Diagram components: Unstructured Content, Structured Reference data, Users + Events, Text Analytics, Annotation, Semantic Fingerprint, Knowledge Graph, User Events, Content, Concordance, Search, Recommendation, OP APIs, Tools & Visualisations]
Dynamic Semantic Publishing
Authoring
• Rapid, high-value, lower-cost content curation
• Capture knowledge and meaning as re-usable data
Search & Discovery
• Unambiguous semantic search
• Recommendation and Similarity
Product
• Re-purpose and aggregate with Business context
• Generate new revenue streams
Enhanced Publishing Workflow
Stages: Authoring, Editorial, Production, Delivery
• Discover Related Content
• Annotate With Concepts & Relations
• Dynamic Data driven products
• Contextual Semantic Search
• Recommend Related Content
• Organise & Improve Workflow
• Content Transformation
• Add references
• Link to products & archive
• Domain Modelled IA
• Personalised Content Streams
• Add Context
DSP - BBC Sport
Goals
• Create a dynamic semantic publishing platform that assembles web pages on-the-fly using a variety of data sources
• Deliver highly relevant data to web site visitors with sub-second response
"The goal is to be able to more easily and accurately aggregate content, find it and share it across many sources. From these simple relationships and building blocks you can dynamically build up incredibly rich sites and navigation on any platform."
John O’Donovan, Chief Technical Architect, BBC
The IET
Goals
• Manageable, discoverable, searchable journals, research papers and articles
• Semantic search using existing taxonomies
• Intelligent citations and data provenance
• Automated, dynamic repurposing of content assets
• Enable new revenue opportunities
Thank you!
Experience the technology with our demonstrators
• NOW: Semantic News Portal http://now.ontotext.com
• RANK: News popularity ranking for companies http://rank.ontotext.com
• FactForge: Knowledge graph of linked open data and news about People and Organizations http://factforge.net