Explore the world of semantic architectures and their applications such as knowledge graphs, smart data hubs, semantic data catalogs, ontology managers, semantic classifiers, and smart contracts.
The Semantic Zoo: Smart Data Hubs, Knowledge Bases and Data Catalogs
Big Data Isn’t Always Smart Data • Enterprise Data Warehouses and Data Lakes aggregate data, but do not usually create a consistent model • Schematic information is often very limited. • Annotating content is hard • Data in the aggregate is not portable, and often not directly addressable • Any data outside the database does not exist • Big Data has only very limited context.
A Lack of Focus • Context indicates where the focus is in a data space. • In relational databases, context is determined by the cursor. • In NoSQL databases, context is the combination of a document URL and a path, indicating one or more nodes. • Application-oriented data establishes a bias in the data. • This bias in turn makes it hard to create an enterprise-wide dataspace.
Being Context-Free • Context-free data has no explicit bias. • This means that the context can be set on any resource within the space, including on properties, classes and related metadata. • Semantic applications are typically context-free. • Semantics makes assertions by exploding row identifiers, columns, and values as “triples”. • RDF is a minimal (mostly) context-free framework. Its underlying bias is formal logic, with a healthy smattering of REST.
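As a rough illustration of how a row identifier, its columns, and its values can be "exploded" into triples, here is a minimal Python sketch. The table name, the column names, and the `http://example.org/` base URI are hypothetical and chosen only for illustration; a real system would typically use an RDF library and typed literals.

```python
from typing import Iterator

BASE = "http://example.org/"  # hypothetical namespace, not from the slides


def row_to_triples(table: str, key: str, row: dict) -> Iterator[tuple]:
    """Yield one (subject, predicate, object) triple per non-key column."""
    subject = f"{BASE}{table}/{row[key]}"       # row identifier becomes the subject
    for column, value in row.items():
        if column == key:
            continue                            # the key is already in the subject
        predicate = f"{BASE}{table}#{column}"   # column becomes the predicate
        yield (subject, predicate, value)       # cell value becomes the object


# Hypothetical row from a "product" table.
row = {"id": 42, "name": "Widget", "price": 9.99}
triples = list(row_to_triples("product", "id", row))
for triple in triples:
    print(triple)
```

Because every triple stands on its own, the focus can then be placed on the row, on a property, or on a value without privileging the original table layout.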
Building Semantic Architectures • Semantic Architectures work best in a context-free manner. This means several things: • Most use node-by-node navigation combined with search. • Applications generally will make use of schematic metadata to determine interfaces at run-time. • The boundaries between data objects will likely be fluid, heterogeneous and time-dependent. • They work best at larger scale – enterprise or inter-enterprise. • They are sensitive to models, but are not necessarily bound by them.
The Beasties of the Semantic Zoo • Knowledge Graphs • Smart Data Hubs • Semantic Data Catalogs • Ontology Managers • Semantic Classifiers • Smart Contracts
Knowledge Graphs – Semantic Encyclopedias • A Knowledge Graph is one of the most common semantic applications. • Most knowledge graphs look much like the new Google Search or Wikipedia. • Each “page” corresponds one to one with a node in the graph. • Knowledge graphs have rich body content, with links in the body both to images and to other pages. • Outbound links are attributes or associations and provide the “data” of the page or card. • Inbound links are typically collections, such as related images, people, places, and so forth.
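The distinction between outbound and inbound links can be sketched in a few lines of Python. This is only an illustration under assumed data: the node names and predicates below are invented, and a production knowledge graph would sit on a triple store rather than a list.

```python
from collections import defaultdict

# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("CityA", "capitalOf", "CountryB"),
    ("CityA", "label", "City A"),
    ("TowerX", "locatedIn", "CityA"),
    ("MuseumY", "locatedIn", "CityA"),
]


def build_card(node, triples):
    """Collect the outbound and inbound links for one page (node)."""
    outbound = {}                 # attributes and associations of the node itself
    inbound = defaultdict(list)   # collections of other nodes pointing at it
    for subject, predicate, obj in triples:
        if subject == node:
            outbound[predicate] = obj
        if obj == node:
            inbound[predicate].append(subject)
    return {"node": node, "outbound": outbound, "inbound": dict(inbound)}


card = build_card("CityA", TRIPLES)
print(card)
```

Here the outbound links supply the "data" of the card, while the inbound `locatedIn` links form a collection of related places.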
Smart Data Hubs - Databases • A Smart Data Hub is much more like a traditional database. • Context here retrieves data documents from within the hub. • This data can be encoded as RDF (for greatest format flexibility) or as semi-structured content such as JSON or XML (for greatest performance). • A semantic layer handles mastering (key resolution between source and target ontologies). • Data is stored WITHIN the database, although in pure RDF implementations some federation can take place. • Pure RDF systems are better for search via SPARQL. Hybrid systems in JSON or XML may end up using other query languages such as GraphQL or XPath.
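Mastering, in the sense of key resolution, might look roughly like the following Python sketch. The source system names, local keys, and `hub:` identifiers are all hypothetical; a real semantic layer would derive the key map from ontology mappings rather than hard-code it.

```python
# Hypothetical key map: (source system, local key) -> canonical hub identifier.
KEY_MAP = {
    ("crm", "C-1001"): "hub:customer/1",
    ("billing", "88231"): "hub:customer/1",
    ("crm", "C-1002"): "hub:customer/2",
}


def master(records):
    """Merge source records that resolve to the same canonical identifier."""
    merged = {}
    for system, local_key, fields in records:
        canonical = KEY_MAP.get((system, local_key))
        if canonical is None:
            continue  # unresolved keys are skipped here; a real hub would flag them
        merged.setdefault(canonical, {}).update(fields)
    return merged


records = [
    ("crm", "C-1001", {"name": "Jane Doe"}),
    ("billing", "88231", {"balance": 120.0}),
    ("crm", "C-1002", {"name": "John Roe"}),
    ("web", "u77", {"name": "Unknown"}),
]
hub = master(records)
print(hub)
```

The two records for the same customer arrive under different local keys and end up as one document in the hub.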
Semantic Data Catalogs – Library Systems • Semantic Data Catalogs serve much the same role as library card catalogs, and are a hybrid of knowledge graphs and data hubs. • They hold just enough information about concepts (resources) to allow for reasonably sophisticated searches. • At the same time, they don’t actually hold the data; they only contain references (URL links) to it. • This same approach also works with electronic links to digital asset management and content management systems. • In this case, ingestion needs to capture the provenance, format and schema of the source systems. • This works best when most of the data resources are in external data systems.
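One way to picture a catalog entry is as a small record that describes a resource and points to it without containing it. The Python sketch below assumes a hypothetical `CatalogEntry` shape and made-up `example.org` URLs; the field names simply mirror the provenance, format and schema mentioned above.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    concept: str       # what the resource is about
    keywords: set      # just enough description to support search
    url: str           # reference to the data, which lives elsewhere
    provenance: str    # source system captured at ingestion
    data_format: str   # format of the external resource
    schema: str        # reference to the source schema


# Hypothetical entries; the catalog stores links, never the data itself.
CATALOG = [
    CatalogEntry("Invoice", {"invoice", "billing"},
                 "https://example.org/data/invoices.csv",
                 "billing-system", "text/csv",
                 "https://example.org/schemas/invoice.json"),
    CatalogEntry("Shipment", {"shipment", "freight"},
                 "https://example.org/data/shipments.parquet",
                 "logistics-system", "application/x-parquet",
                 "https://example.org/schemas/shipment.json"),
]


def search(catalog, term):
    """Return entries whose concept or keywords match the search term."""
    term = term.lower()
    return [e for e in catalog
            if term in e.keywords or term == e.concept.lower()]


hits = search(CATALOG, "Billing")
print([e.url for e in hits])
```

A caller follows the returned URL to fetch the data from the external system.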
Semantic Classifiers - Categorization • A Semantic Classifier extracts metadata from a digital published work, EXIF data or similar media. • The metadata is normally stored and encoded as RDF within other semantic applications. • Classifiers typically require the use of existing semantic taxonomies or ontologies to perform classification. • Increasingly, these are also turning to machine learning to identify content using clustering and other algorithms. • Classifiers are normally an ingestion (or reingestion) process.
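A very simple taxonomy-driven classifier can be sketched as keyword matching against concept terms. The taxonomy, the concept identifiers, and the document URI below are hypothetical, and real classifiers are considerably more sophisticated; the sketch only shows how extracted categories could end up as triples, here using the Dublin Core `dcterms:subject` property as an assumed choice of predicate.

```python
import re

# Hypothetical taxonomy: concept identifier -> terms that signal it.
TAXONOMY = {
    "concept:Finance": {"invoice", "payment", "ledger"},
    "concept:Logistics": {"shipment", "warehouse", "freight"},
}


def classify(text, taxonomy=TAXONOMY):
    """Return the concepts whose terms appear in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(c for c, terms in taxonomy.items() if tokens & terms)


def to_triples(doc_uri, concepts):
    """Encode the classification as (subject, predicate, object) triples."""
    return [(doc_uri, "dcterms:subject", c) for c in concepts]


concepts = classify("The invoice covers freight charges.")
metadata = to_triples("https://example.org/docs/17", concepts)
print(metadata)
```

Running this at ingestion (or re-running it when the taxonomy changes) matches the ingestion and reingestion role described above.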
Ontology Managers - Harmonization • An Ontology Manager serves to migrate multiple external schemas or ontologies to a single canonical model. • Ontology managers can edit ontologies (models), but their primary purpose is to create translations between models. • Ontology managers are often used to harmonize ontologies that arose from different acquisitions of roughly similar data spaces – such as parts catalogs from different vendors. • Ontology managers generally do not contain much actual data. Instead, they are often used as part of a pipeline with semantic data catalogs to better manage translation of data algorithmically.
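The translation role can be illustrated with field mappings from two vendor schemas to one canonical model. Everything in this Python sketch is assumed for illustration: the vendor names, their field names, and the canonical property names are invented, and real ontology mappings also handle units, types, and structural differences.

```python
# Hypothetical mappings from each vendor's schema to the canonical model.
VENDOR_MAPPINGS = {
    "vendor_a": {"part_no": "partNumber", "desc": "description", "wt_kg": "weightKg"},
    "vendor_b": {"sku": "partNumber", "title": "description", "weight": "weightKg"},
}


def to_canonical(vendor, record):
    """Rename a vendor record's fields to the canonical property names."""
    mapping = VENDOR_MAPPINGS[vendor]
    # Fields with no mapping are dropped in this sketch.
    return {mapping[k]: v for k, v in record.items() if k in mapping}


a = to_canonical("vendor_a", {"part_no": "A-100", "desc": "Hex bolt", "wt_kg": 0.02})
b = to_canonical("vendor_b", {"sku": "B-7", "title": "Hex nut", "weight": 0.01, "bin": "Z9"})
print(a)
print(b)
```

Note that the mappings hold no part data themselves, which is consistent with an ontology manager containing little actual data.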
Smart Contracts – IoT and Blockchain • A Smart Contract is a specialized semantic application which ties together parties, actions, constraints, definitions and penalties in order to facilitate processes. • Blockchain is often used as the foundation of smart contracts in order to provide verification of transactions, but blockchain primarily stores pointers (which are just another form of URI). • Smart contracts are also tied intimately to financial transactions, notification systems, rights management and geospatial systems. • Smart contracts will likely be the foundation for the Internet of Things as well, as legal code becomes more inextricably intertwined with software code.
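To make the parties, actions, constraints and penalties concrete, here is a minimal Python sketch of a contract clause and a check against an incoming event. The `Clause` shape, the event fields, and the penalty amount are hypothetical, and nothing here involves an actual blockchain; it only models the data structure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Clause:
    party: str                          # who is obligated
    action: str                         # what they must do
    constraint: Callable[[dict], bool]  # condition the action must satisfy
    penalty: float                      # amount owed if the constraint fails


def evaluate(clauses, event):
    """Return (party, penalty) for each clause the event breaches."""
    return [(c.party, c.penalty) for c in clauses
            if c.action == event["action"] and not c.constraint(event)]


# Hypothetical clause: the supplier must deliver on time.
clauses = [Clause("supplier", "deliver", lambda e: e["days_late"] <= 0, 500.0)]

breaches = evaluate(clauses, {"action": "deliver", "days_late": 3})
on_time = evaluate(clauses, {"action": "deliver", "days_late": 0})
print(breaches, on_time)
```

In a fuller system the event would arrive from a device or notification feed, and the breach would trigger a financial transaction.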
Questions? • Kurt Cagle, author, editor, thought leader • Hashtag #theCagleReport for latest articles and posts • kurt.cagle@gmail.com • Contributing Writer for Forbes Cognitive World • Available for consulting at 443-837-8725