
Case study: The Practicalities of building an enterprise knowledge graph

Case study: The Practicalities of building an enterprise knowledge graph. Edgar Zalite, Head of Metadata Management – Chief Data Office, Deutsche Bank. What is a Knowledge Graph? Why Build One?


Presentation Transcript


  1. Case study: The Practicalities of building an enterprise knowledge graph. Edgar Zalite, Head of Metadata Management – Chief Data Office, Deutsche Bank

  2. What is a Knowledge Graph? Why Build One?
     • Technologies exist that bring together disparate data. Why look to yet another new technology?
     • Definition: “A knowledge graph represents a collection of interlinked descriptions of entities – real-world objects, events, situations or abstract concepts.” *
     • Why?
        • The need to find data
        • The need to understand data
        • The need to link data
     • Yes, and?
        • There is a *lot* of data
        • The data constantly changes
        • The data models constantly change
        • Relational constructs (Warehouses, catalogs, lakes, etc.) struggle to keep up
        • Data is not always consistent
     * Source: Wikipedia, Entry for Ontology
     Slide footer: 2010 DB Blue template
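The definition on this slide can be made concrete with a minimal sketch, assuming the graph is held as subject-predicate-object triples in memory. The class, the entity identifiers (`dataset:trades`, `team:markets`) and the predicate names are hypothetical and do not come from the presentation; a production graph would normally sit in a dedicated graph store.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class KnowledgeGraph:
    """A tiny in-memory graph of interlinked entity descriptions (illustrative only)."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()
        self._by_subject: dict[str, set[Triple]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Record one link between two entities (or an entity and a value)."""
        triple = Triple(subject, predicate, obj)
        self._triples.add(triple)
        self._by_subject[subject].add(triple)

    def describe(self, subject: str) -> dict[str, list[str]]:
        """Return every known fact about one entity, grouped by predicate."""
        description: dict[str, list[str]] = defaultdict(list)
        for triple in self._by_subject.get(subject, ()):
            description[triple.predicate].append(triple.obj)
        return {predicate: sorted(values) for predicate, values in description.items()}

    def find(self, predicate: str, obj: str) -> list[str]:
        """Return the subjects linked to `obj` through `predicate`."""
        return sorted(
            t.subject for t in self._triples if t.predicate == predicate and t.obj == obj
        )


# Hypothetical metadata; none of these names appear in the slides.
kg = KnowledgeGraph()
kg.add("dataset:trades", "storedIn", "system:warehouse")
kg.add("dataset:trades", "ownedBy", "team:markets")
kg.add("dataset:clients", "ownedBy", "team:markets")
# A new kind of relationship is just another triple; no schema migration is needed.
kg.add("dataset:trades", "hasQualityScore", "0.97")

print(kg.find("ownedBy", "team:markets"))
print(kg.describe("dataset:trades"))
```

Because every fact has the same three-part shape, adding a new source or a new relationship type does not require changing existing structures, which is the property the slide contrasts with relational constructs.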

  3. Semantic Search – Google Example
     • Knowledge Graph understands the meaning of your data, and presents it accordingly
     [Figure: sample Google search for "Latvia", comparing classic text-based search results with Knowledge Graph results]
     Slide footer: 9/24/2019

  4. Why data is more valuable when semantically connected
     • Semantic data presentation provides capabilities not easily obtained with traditional approaches.
        • “Things not strings”: understanding the meaning of your data
        • Contextualizes content: where data comes from & what it applies to
        • Inference: relationship discovery creates automated enrichment
        • Extensible: not limited by relational model updates
     [Figure: time to add new sources versus number of sources, for Warehouse and Graph; diagram labels: Existing Graph, New Graph]
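As one possible illustration of the "Inference" point, the sketch below derives relationships that are implied but never stated, assuming a single transitive relationship ("feeds") between metadata entities. The rule, the function name and the entity names are assumptions made for this example; the presentation does not say which reasoner or rules were used.

```python
def infer_transitive(edges: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Return the edges implied by transitivity that are not already stated.

    If A feeds B and B feeds C, then A is (indirectly) upstream of C.
    Iterates to a fixed point, so chains of any length are covered.
    """
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure - edges


# Hypothetical lineage facts as stated by the source systems.
stated = {
    ("source:crm", "dataset:clients"),
    ("dataset:clients", "report:kyc"),
    ("report:kyc", "dashboard:risk"),
}

inferred = infer_transitive(stated)
for upstream, downstream in sorted(inferred):
    print(f"{upstream} is upstream of {downstream}")
```

The three stated links yield three further links that no source recorded directly, which is the kind of automated enrichment the slide refers to. The quadratic loop is adequate for a sketch; a real reasoner would use indexed lookups.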

  5. Guiding Principles
     • Look to industry best practices
     • Design goals for the Knowledge Graph seek to implement a flexible architecture using current best practices:
        • Simplicity - The system must be simple so that new metadata can be created, shared and managed quickly. The system must not interfere with the Bank's ability to create new types of metadata and share them across the bank. The metadata format is easy to create and sets a low barrier to technical expertise.
        • Self-Service - The system must provide the bank with mechanisms to publish, discover and integrate metadata across the bank. This allows any system to create, publish and manage its metadata, and also to join and share in the broader community of metadata.
        • Many Forms - The forms of metadata will support all communities of humans and machines in the bank. Metadata will uniquely reference any topic and resolve to a web address. That web address serves both human- and machine-readable forms, either on explicit request or through embedded markup.
        • Transparency - Metadata across the organization is mostly opaque and needs to be found. It wants to be found and used, but the lack of a coherent architecture to make it available is stopping it. The architecture must give users the ability to find metadata and therefore the data it describes. The architecture should make publishing metadata easy and a priority, and give transparency to the hidden value.
     • Capabilities necessary for success
        • Semantic Search – The core mechanism for access by human users
        • Security – Especially in banking, access security must be built in from the start
        • Standardized Name Space – Retain consistency of data by mapping new sources to a common set of schemas
        • Registration - The warehouse is distributed. Any new source can register with the warehouse. After registration, the search and query tooling will extend into the distributed stores of data.
        • API/Query – Necessary to support serious data analytics.
        • Reasoner and Machine Learning – Enrich data & generate new insights
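The "Standardized Name Space" and "Registration" capabilities can be sketched together, assuming each source registers a fetch function plus a mapping from its local field names to a shared vocabulary. `SourceRegistry`, the common terms and the two sample sources are hypothetical names chosen for illustration, not the bank's design.

```python
from typing import Callable

Record = dict[str, str]
Fetch = Callable[[], list[Record]]


class SourceRegistry:
    """Registers distributed sources and queries them through common terms."""

    def __init__(self, common_terms: set[str]) -> None:
        self._common_terms = common_terms
        self._sources: dict[str, tuple[Fetch, dict[str, str]]] = {}

    def register(self, name: str, fetch: Fetch, field_map: dict[str, str]) -> None:
        """Accept a source only if all its fields map onto the common vocabulary."""
        unknown = set(field_map.values()) - self._common_terms
        if unknown:
            raise ValueError(f"terms not in the common schema: {sorted(unknown)}")
        self._sources[name] = (fetch, field_map)

    def query(self, term: str, value: str) -> list[Record]:
        """Fan the query out to every registered source, in registration order."""
        results: list[Record] = []
        for name, (fetch, field_map) in self._sources.items():
            for raw in fetch():
                # Translate local field names into the common vocabulary.
                normalized = {field_map[k]: v for k, v in raw.items() if k in field_map}
                if normalized.get(term) == value:
                    results.append({"source": name, **normalized})
        return results


# Hypothetical common schema and sources.
registry = SourceRegistry(common_terms={"dataset", "owner"})
registry.register(
    "catalog",
    lambda: [{"ds_name": "trades", "steward": "alice"}],
    {"ds_name": "dataset", "steward": "owner"},
)
registry.register(
    "quality",
    lambda: [
        {"table": "trades", "owned_by": "alice"},
        {"table": "clients", "owned_by": "bob"},
    ],
    {"table": "dataset", "owned_by": "owner"},
)

hits = registry.query("owner", "alice")
print(hits)
```

The data stays in the registered sources; only the mapping is centralized, so a query written against the common terms reaches a new source as soon as it registers.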

  6. Building an Enterprise Knowledge Graph - Roadmap
     Implementation using agile practices – build quickly, deliver incremental change, constant user interaction

  7. Getting an EKG Successfully Built & Used – Start Fast with the Data at Hand and Well Understood
     Our approach: Focus on CDO metadata
     • DESIGN-TIME perspective
        • Architects
        • Data governance team
        • Data Stewards
     • Integrated UI & Access Layer
     • Knowledge Graph: orchestrates information across sources and serves up a single view
     • Metadata Sources (uses CDO core data): Technology Estate, Governance Data, Data Quality, Lineage, Data Issues, etc.
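A rough sketch of what "serves up a single view" could mean in practice, assuming every metadata source identifies systems with a shared key. The key name `system_id`, the source names and the sample records are hypothetical; the slides do not describe how entities are matched across sources.

```python
from collections import defaultdict


def single_view(
    sources: dict[str, list[dict[str, str]]], key: str = "system_id"
) -> dict[str, dict[str, dict[str, str]]]:
    """Combine per-source records into one entry per entity.

    The result maps entity id -> source name -> attributes, so each fact
    keeps its provenance.
    """
    view: dict[str, dict[str, dict[str, str]]] = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            entity = record[key]
            attributes = {k: v for k, v in record.items() if k != key}
            view[entity].setdefault(source_name, {}).update(attributes)
    return dict(view)


# Hypothetical extracts from three of the metadata sources named on the slide.
sources = {
    "technology_estate": [{"system_id": "SYS-1", "name": "Trade Store"}],
    "data_quality": [{"system_id": "SYS-1", "score": "0.97"}],
    "data_issues": [
        {"system_id": "SYS-1", "open_issues": "2"},
        {"system_id": "SYS-2", "open_issues": "0"},
    ],
}

view = single_view(sources)
print(view["SYS-1"])
```

Keeping the source name on every attribute group lets an access layer show where each piece of information came from.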

  8. Getting an EKG Successfully Built & Used – Expand Sources & Capabilities Driven by User Demand
     • An integrated approach: address multiple user groups, multiple data sets, link it all.
     • User perspectives (diagram labels: Defense, Offense)
        • DESIGN-TIME perspective: Architects; Data governance team; Data Stewards
        • RUNTIME ACCESS perspective: Application consistency; Quality metrics; Machine Learning
        • ANALYSIS-TIME perspective: Business Development; Revenue Generation; Anti-money laundering & financial crimes
     • Integrated UI & Access Layer
     • Knowledge Graph: orchestrates information across sources and serves up a single view
     • Metadata Sources (uses CDO core data): Technology Estate, Governance Data, Data Catalog, Data Quality, Lineage, Data Issues, etc.
     • Production Environment [Semantic Data Lake]
     • Further sources shown in the diagram: Master Data Management; Systems; Reference data; Prod data repository; [Data Lake]; Access Control; Data Lifecycle controls; Sensitive information access

  9. Cautionary Notes – What Made Success
     • Success comes from continuous response to a broad business user base. It must add new value.
     “The EKG is enabling our business - we will support it! … oh, and can you add source xxxx?”

  10. Thank You. Edgar Zalite, Head of Metadata Management – Chief Data Office, Deutsche Bank
