
Semantic Zoo: Smart Data Hubs, Knowledge Bases, and Data Catalogs

Explore the world of semantic architectures and their applications such as knowledge graphs, smart data hubs, semantic data catalogs, ontology managers, semantic classifiers, and smart contracts.

gladysr

Presentation Transcript


  1. The Semantic Zoo: Smart Data Hubs, Knowledge Bases and Data Catalogs

  2. Big Data Isn’t Always Smart Data • Enterprise Data Warehouses and Data Lakes aggregate data, but do not usually create a consistent model • Schematic information is often very limited. • Annotating content is hard • Data in the aggregate is not portable, and often not directly addressable • Any data outside the database does not exist • Big Data has only very limited context.

  3. A Lack of Focus • Context indicates where the focus is in a data space. • In relational databases, context is determined by the cursor. • In NoSQL databases, context is the combination of a document URL and a path, indicating one or more nodes. • Application-oriented data establishes a bias in the data. • This bias in turn makes it hard to create an enterprise-wide dataspace. You are here.
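The NoSQL case above (context as a document URL plus a path) can be sketched in a few lines. This is only an illustration: the `STORE` dictionary, the document URL and the `resolve` helper are hypothetical stand-ins for a real document database.

```python
# Hypothetical in-memory document store, keyed by document URL.
STORE = {
    "/orders/1001.json": {
        "customer": {"name": "Ada"},
        "items": [{"sku": "A-17"}, {"sku": "B-02"}],
    },
}

def resolve(url, path):
    """Walk from the document root down the path to the node in focus."""
    node = STORE[url]
    for step in path:
        node = node[step]  # a dict key or a list index
    return node

# The context is the pair (document URL, path), not a global identifier.
focus = resolve("/orders/1001.json", ["items", 1, "sku"])
print(focus)
```

Note that the node in focus has no identity outside its document, which is the bias the slide describes.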

  4. Being Context-Free • Context-free data has no explicit bias. • This means that the context can be set on any resource within the space, including on properties, classes and related metadata. • Semantic applications are typically context-free. • Semantics makes assertions by exploding row identifiers, columns, and values as “triples”. • RDF is a minimal (mostly) context-free framework. Its underlying bias is formal logic, with a healthy smattering of REST.
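As a rough sketch of what "exploding" a row into triples might look like: the row identifier becomes the subject, each column a predicate, and each value an object. The base URI, table name and `row_to_triples` helper are assumptions made for illustration, not part of the presentation or of any RDF library.

```python
# Hypothetical namespace; any stable base URI would serve.
BASE = "http://example.org/"

def row_to_triples(table, key_column, row):
    """Yield one (subject, predicate, object) triple per non-key column."""
    subject = f"{BASE}{table}/{row[key_column]}"
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # the key is already in the subject; nulls assert nothing
        yield (subject, f"{BASE}{table}#{column}", value)

row = {"id": 42, "name": "Ada", "city": "London"}
triples = list(row_to_triples("person", "id", row))
for triple in triples:
    print(triple)
```

Once the row is in this form, the subject, either predicate, or either value can serve as the starting point of a query, which is the sense in which the data is context-free.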

  5. Building Semantic Architectures • Semantic Architectures work best in a context-free manner. This means several things: • Most use a combination of node-by-node navigation and search. • Applications generally will make use of schematic metadata to determine interfaces at run-time. • The boundaries between data objects will likely be fluid, heterogeneous and time-dependent. • They work best at larger scale – enterprise or inter-enterprise. • They are sensitive to models, but are not necessarily bound by them.
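One way to read "schematic metadata determines interfaces at run-time" is that a form is generated from the model rather than hard-coded. The sketch below assumes a tiny schema expressed as (class, property, range) tuples; the `ex:` names, the `WIDGETS` table and `form_fields` are all hypothetical.

```python
# Hypothetical schema metadata: (class, property, range).
SCHEMA = [
    ("ex:Person", "ex:name", "xsd:string"),
    ("ex:Person", "ex:birthDate", "xsd:date"),
    ("ex:Place", "ex:name", "xsd:string"),
]

# Assumed mapping from datatype to UI widget.
WIDGETS = {"xsd:string": "text", "xsd:date": "date"}

def form_fields(cls, schema=SCHEMA):
    """Derive the input fields for a class from the schema at run-time."""
    return [
        {"property": prop, "widget": WIDGETS.get(rng, "text")}
        for c, prop, rng in schema
        if c == cls
    ]

fields = form_fields("ex:Person")
print(fields)
```

Adding a property to the schema changes the interface without touching application code, which is why such applications are sensitive to models without being bound by them.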

  6. The Beasties of the Semantic Zoo • Knowledge Graphs • Smart Data Hubs • Semantic Data Catalogs • Ontology Managers • Semantic Classifiers • Smart Contracts

  7. Knowledge Graphs – Semantic Encyclopedias • A Knowledge Graph is one of the most common semantic applications. • Most knowledge graphs look much like the new Google Search or Wikipedia • Each “page” corresponds one to one with a node in the graph. • Knowledge graphs have rich body content and links to both images and to other pages/links in the body. • Outbound links are attributes or associations and provide the “data” of the page or card. • Inbound links are typically collections, such as related images, people, places, and so forth.
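The page-per-node idea above can be sketched as follows: outbound links (the node as subject) supply the page's own data, and inbound links (the node as object) supply the related collections. The triples and the `build_page` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical triples using an illustrative "ex:" prefix.
TRIPLES = [
    ("ex:Ada_Lovelace", "ex:bornIn", "ex:London"),
    ("ex:Ada_Lovelace", "ex:knownFor", "ex:Analytical_Engine"),
    ("ex:Charles_Babbage", "ex:designed", "ex:Analytical_Engine"),
    ("ex:Portrait_1840", "ex:depicts", "ex:Ada_Lovelace"),
]

def build_page(node, triples):
    """Assemble the 'page' for one node from its outbound and inbound links."""
    outbound = defaultdict(list)  # attributes and associations of the node
    inbound = defaultdict(list)   # collections of things that point at the node
    for s, p, o in triples:
        if s == node:
            outbound[p].append(o)
        if o == node:
            inbound[p].append(s)
    return {"node": node, "outbound": dict(outbound), "inbound": dict(inbound)}

page = build_page("ex:Ada_Lovelace", TRIPLES)
print(page)
```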

  8. Smart Data Hubs – Databases • A Smart Data Hub is much more like a traditional database. • Context here retrieves data documents from within the hub. • This data can be encoded as RDF (for greatest format flexibility) or as semi-structured content such as JSON or XML (for greatest performance). • A semantic layer handles mastering (key resolution between source and target ontologies). • Data is stored WITHIN the database, although in pure RDF implementations some federation can take place. • Pure RDF systems are better for search via SPARQL. Hybrid systems in JSON or XML may end up using other query languages such as GraphQL or XPath.
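Mastering, as described above, amounts to resolving each source system's local key to a single canonical identifier before the records are merged. The following is a minimal sketch assuming a hand-maintained cross-reference table; `KEY_MAP`, the `urn:hub:` identifiers and the record layout are hypothetical.

```python
# Hypothetical cross-reference: (source system, local key) -> canonical identifier.
KEY_MAP = {
    ("crm", "C-1001"): "urn:hub:customer:1",
    ("billing", "88213"): "urn:hub:customer:1",
    ("crm", "C-1002"): "urn:hub:customer:2",
}

def merge_records(records):
    """Fold source records into one document per canonical identifier."""
    merged = {}
    for source, local_key, fields in records:
        canonical = KEY_MAP.get((source, local_key))
        if canonical is None:
            continue  # unresolved keys would need review in a real hub
        merged.setdefault(canonical, {}).update(fields)
    return merged

records = [
    ("crm", "C-1001", {"name": "Acme Ltd"}),
    ("billing", "88213", {"balance": 250.0}),
    ("crm", "C-9999", {"name": "Unknown"}),
]
hub = merge_records(records)
print(hub)
```

A production hub would typically derive the cross-reference from matching rules rather than a static table; the static table just keeps the sketch short.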

  9. Semantic Data Catalogs – Library Systems • Semantic Data Catalogs serve much the same role as library card catalogs, and are a hybrid of knowledge graphs and data hubs. • They hold just enough information about concepts (resources) to allow for reasonably sophisticated searches. • At the same time, they don’t actually hold the data; they only contain references (URL links) to the data. • This same approach also works with electronic links to digital asset management and content management systems. • In this case, ingestion needs to capture the provenance, format and schema of the source systems. • This works best when most of the data resources are in external data systems.
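A catalog entry of the kind described above might carry only descriptive metadata plus a reference to where the data actually lives. The sketch below is an assumption about shape, not a real catalog API; the `CatalogEntry` fields mirror the slide (reference URL, provenance, format, schema) and the URLs are placeholders.

```python
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    concept: str       # human-readable label for the resource
    keywords: set      # just enough description to support search
    url: str           # reference to the data; the data itself stays external
    provenance: str    # captured at ingestion
    data_format: str   # captured at ingestion
    schema: str        # captured at ingestion

# Hypothetical entries pointing at external systems.
CATALOG = [
    CatalogEntry("Customer orders", {"orders", "sales"},
                 "https://data.example.org/warehouse/orders",
                 "ERP export", "parquet",
                 "https://data.example.org/schemas/orders-v2"),
    CatalogEntry("Product images", {"images", "products"},
                 "https://dam.example.org/products",
                 "Digital asset management system", "jpeg",
                 "https://dam.example.org/schemas/asset"),
]

def search(catalog, term):
    """Return the entries whose concept or keywords mention the term."""
    term = term.lower()
    return [e for e in catalog if term in e.keywords or term in e.concept.lower()]

hits = search(CATALOG, "orders")
print([entry.url for entry in hits])
```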

  10. Semantic Classifiers – Categorization • A Semantic Classifier extracts metadata from a digitally published work, from EXIF data, or from similar media. • The metadata is normally encoded as RDF and stored within other semantic applications. • Classifiers typically require the use of existing semantic taxonomies or ontologies to perform classification. • Increasingly, these are also turning to machine learning to identify content using clustering and other algorithms. • Classifiers are normally an ingestion (or reingestion) process.
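The taxonomy-driven case above can be sketched as simple term matching that emits subject triples for the document. This is deliberately naive (no stemming, no machine learning); the `TAXONOMY` concepts, the `ex:` identifiers and the `dcterms:subject` predicate choice are assumptions for illustration.

```python
import re

# Hypothetical taxonomy: concept identifier -> indicative terms.
TAXONOMY = {
    "ex:Graph_Database": {"graph", "sparql", "triple"},
    "ex:Photography": {"camera", "exposure", "lens"},
}

def classify(doc_uri, text, taxonomy=TAXONOMY, min_hits=1):
    """Return (document, predicate, concept) triples for matching concepts."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    triples = []
    for concept in sorted(taxonomy):
        if len(words & taxonomy[concept]) >= min_hits:
            triples.append((doc_uri, "dcterms:subject", concept))
    return triples

tags = classify("ex:doc/1", "Query the graph store with SPARQL.")
print(tags)
```

Because the output is itself a set of triples, it can be loaded into a knowledge graph or catalog during ingestion.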

  11. Ontology Managers – Harmonization • An Ontology Manager serves to migrate multiple external schemas or ontologies to a single canonical model. • Ontology managers can edit ontologies (models), but their primary purpose is to create translations between models. • Ontology managers are often used to harmonize ontologies that arose from different acquisitions of roughly similar data spaces – such as parts catalogs from different vendors. • Ontology managers generally do not contain much actual data. Instead, they are often used as part of a pipeline with semantic data catalogs to better manage translation of data algorithmically.
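Taking the parts-catalog example above, a translation between models can be sketched as a per-vendor property mapping onto one canonical vocabulary. The vendor names, field names and `to_canonical` helper are hypothetical; real ontology managers would usually express such mappings declaratively.

```python
# Hypothetical mappings from each vendor's schema to the canonical model.
MAPPINGS = {
    "vendor_a": {"part_no": "partNumber", "desc": "description", "wt_kg": "weightKg"},
    "vendor_b": {"sku": "partNumber", "title": "description", "weight": "weightKg"},
}

def to_canonical(vendor, record):
    """Rename mapped fields to canonical names and drop unmapped ones."""
    mapping = MAPPINGS[vendor]
    return {mapping[key]: value for key, value in record.items() if key in mapping}

a = to_canonical("vendor_a", {"part_no": "A-17", "desc": "Hex bolt", "wt_kg": 0.02})
b = to_canonical("vendor_b", {"sku": "A-17", "title": "Hex bolt", "weight": 0.02, "color": "grey"})
print(a == b)
```

The manager holds only the mappings, not the parts data, which matches the point that ontology managers generally contain little actual data.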

  12. Smart Contracts – IoT and Blockchain • A Smart Contract is a specialized semantic application which ties together parties, actions, constraints, definitions and penalties in order to facilitate processes. • Blockchain is often used as the foundation of smart contracts in order to provide verity of transactions, but blockchain primarily stores pointers (which are just another form of URI). • Smart contracts are also tied intimately with financial transactions, notification systems, rights management and geospatial systems. • Smart contracts will likely be the foundation for the Internet of Things as well, as legal code becomes more tightly intertwined with software code.
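As a loose data-structure sketch of the elements named above (parties, actions, constraints, definitions, penalties), consider the following. It does not model any blockchain platform; the `Clause` and `Contract` classes and the `settle` method are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Clause:
    action: str                         # what the obligated party must do
    constraint: Callable[[dict], bool]  # True when the recorded event meets the terms
    penalty: float                      # amount owed if the constraint is not met

@dataclass
class Contract:
    parties: tuple
    definitions: dict
    clauses: list

    def settle(self, events):
        """Sum the penalties for clauses whose action is missing or out of terms."""
        owed = 0.0
        for clause in self.clauses:
            event = events.get(clause.action)
            if event is None or not clause.constraint(event):
                owed += clause.penalty
        return owed

contract = Contract(
    parties=("supplier", "buyer"),
    definitions={"on_time": "delivered within 3 days"},
    clauses=[
        Clause("deliver", lambda e: e["days"] <= 3, penalty=100.0),
        Clause("invoice", lambda e: e["amount"] > 0, penalty=25.0),
    ],
)
owed = contract.settle({"deliver": {"days": 5}, "invoice": {"amount": 900}})
print(owed)
```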

  13. Questions? • Kurt Cagle, author, editor, thought leader • Hashtag #theCagleReport for latest articles and posts • kurt.cagle@gmail.com • Contributing Writer for Forbes Cognitive World • Available for consulting at 443-837-8725
