Learn about the Semantic MediaWiki approach to managing metadata for improved data findability, ownership, quality, and consistency. Explore the benefits and future opportunities of this approach.
Semantic MediaWiki Approach to Metadata Scott E. Thompson Manager - Data Architecture Ontario Teachers’ Pension Plan
Agenda • Why? • Mashup of slides I’ve used before… • What is Semantic MediaWiki? • Proof of Concept • The Unexpected • Wrap Up • Questions
Ontario Teachers’ Pension Plan • Fixed Income • Public Equities • Private Capital • Real Estate • Infrastructure • Foreign Currency • Commodities • Hedge Funds
Current: Low Confidence (diagram labels: 42? • ETL • Correct Trade • Data Warehouse • Reload • IT • Reload Data • Rerun Report)
Business Requirements • Findability of Data • Ownership of Data • Data Quality • Consistent Business Terminology Added later… • Ownership of Metadata • Metadata Quality
Business Requirements: Value of Metadata & Metadata Tool • Allows business users / end users to gain the required insight into what the data and reports they are looking at mean • Makes data available and visible to others • Creates a searchable set of information about the firm’s data, so data developers and users can search for existing data and avoid duplicating it • Provides a platform for sharing and publicizing data, which reduces the workload of developers (interfaces, reports, etc.) and users and increases efficiency • Quality control, data restrictions and uses can be applied to the entire data set • Metadata documentation transcends people and time: it mitigates staff turnover and the balancing of multiple projects, providing data permanence and documenting institutional knowledge
MDM? MDM could stand for Master Data Management or Meta Data Management… coincidence? “Let’s go get all the key pieces of data and put them in one place, which is really more of an enterprise data warehouse, but master data management then says… it’s almost a map… here is what each of those data fields is, here is how you can find them, here is what they mean, here is where they came from.” Blake Johnson, Consulting Professor, Stanford University, “The Truth and Power of Master Data Management” (Teradata) http://www.youtube.com/watch?feature=player_embedded&v=p6VHpIlDfu4#!
One Truth? V = f(trade, market context, model, business context) (diagram: functions from Pre-Trade to Post-Trade, each drawing on Trades, Market Context, Model and Business Context) • Investment Strategy & Planning • Portfolio Research & Analytics • Trade & Deal Management • Securities Operations • Collateral & Cash Management • Portfolio Accounting • Reconciliation • Total Fund Reporting • Market Risk Management • Credit & Counterparty Risk Management • Liquidity Risk Management • Performance • Compliance
What is a Wiki? • Hawaiian for “quick” • Allows large numbers of people to create and edit the same content • Effective for reaching a credible consensus from a large group • Wikipedia is the world’s largest collaboratively edited source of encyclopedic knowledge
Future Opportunities Simple search algorithms would suffice to provide a precise answer to the question…
Graphs (relate/infer) • otpp:Index-Linked Bond subClassOf otpp:Debt • otpp:Fixed-Rate Bond subClassOf otpp:Debt • otpp:Amortizing Index-Linked Bond subClassOf otpp:Index-Linked Bond • sameAs links in the diagram involve otpp:subtypeOf and dbpedia:Inflation-Linked Bond
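The "infer" half of this slide can be illustrated with a small sketch. It assumes the graph is read as three subClassOf edges (Index-Linked Bond and Fixed-Rate Bond under Debt, Amortizing Index-Linked Bond under Index-Linked Bond) and follows them transitively. The function name `superclasses` is hypothetical, and a real deployment would more likely rely on the triple store's own reasoning than on hand-written code.

```python
from collections import defaultdict

# subClassOf edges as read from the slide's graph (child, parent).
# This reading of the flattened diagram is an assumption.
SUBCLASS_OF = [
    ("otpp:Index-Linked Bond", "otpp:Debt"),
    ("otpp:Fixed-Rate Bond", "otpp:Debt"),
    ("otpp:Amortizing Index-Linked Bond", "otpp:Index-Linked Bond"),
]


def superclasses(term, edges):
    """Return every class reachable from `term` by following subClassOf."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Nobody stated that an amortizing index-linked bond is debt;
# it follows from the two edges above it.
inferred = superclasses("otpp:Amortizing Index-Linked Bond", SUBCLASS_OF)
```

A search for otpp:Debt could then return amortizing index-linked bonds even though no page says so directly.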
Proof of Concept Build a knowledgebase about: • Our structured data (schemas, tables, columns) • Our business terminology (business process, products, attributes) Prove that the technology could: • Automatically load technical metadata and relate it with business metadata • Customize workflow to collect and govern the manual business input
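As a rough illustration of the first goal (automatically loading technical metadata), the sketch below reads column definitions from a database catalog and renders one wiki page body per column using Semantic MediaWiki's `[[Property::value]]` annotation syntax. It is not the loader used in the proof of concept: SQLite stands in for the real warehouse, the function `column_annotations` and the properties `belongsToTable` and `hasDataType` are hypothetical, and only `belongsToSchema` appears elsewhere in this deck.

```python
import sqlite3


def column_annotations(conn, schema_name):
    """Map 'schema.table.column' page titles to SMW annotation text."""
    pages = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # pragma_table_info is SQLite's table-valued form of PRAGMA table_info.
        columns = conn.execute(
            "SELECT name, type FROM pragma_table_info(?)", (table,)
        ).fetchall()
        for column, col_type in columns:
            title = f"{schema_name}.{table}.{column}"
            pages[title] = (
                f"[[belongsToSchema::{schema_name}]] "
                f"[[belongsToTable::{table}]] "   # hypothetical property
                f"[[hasDataType::{col_type}]]"    # hypothetical property
            )
    return pages


# Stand-in for a warehouse schema; the table and columns are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trade (trade_id INTEGER, notional REAL)")
pages = column_annotations(conn, "core")
```

Business users would then relate these generated column pages to product attributes by hand, which is where the customized workflow in the second goal comes in.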
% Sourced from Core Schemas?
{{#sparql:
SELECT DISTINCT ?Product ?Product_attribute ?Column ?Schema
WHERE {
  ?Product property:HasAttribute ?Product_attribute .
  ?Product_attribute property:GetsDataFrom ?Column .
  ?Column MDM:belongsToSchema ?Schema .
}
|merge=true|link=all}}
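The query lists product attributes with the columns and schemas that feed them; the percentage in the slide title still has to be derived from those rows. A minimal sketch of that step follows, assuming the result has been exported as (product, attribute, column, schema) tuples. The function `percent_from_core`, the sample rows and the schema name "core" are all hypothetical.

```python
def percent_from_core(rows, core_schemas):
    """Share of (product, attribute) pairs with at least one core-schema source.

    rows: iterable of (product, product_attribute, column, schema) tuples,
    shaped like the query's result set.
    """
    attributes = {(product, attribute) for product, attribute, _col, _schema in rows}
    from_core = {
        (product, attribute)
        for product, attribute, _col, schema in rows
        if schema in core_schemas
    }
    if not attributes:
        return 0.0
    return 100.0 * len(from_core) / len(attributes)


# Illustrative rows only, not real results.
rows = [
    ("Bond", "Coupon", "core.bond.coupon", "core"),
    ("Bond", "Maturity", "core.bond.maturity", "core"),
    ("Bond", "Rating", "stage.ratings.rating", "staging"),
    ("Swap", "Notional", "core.swap.notional", "core"),
]
pct = percent_from_core(rows, {"core"})
```

Because the query has no OPTIONAL clauses, attributes with no GetsDataFrom link never appear in the rows, so this figure covers only attributes that already have a documented source.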
SMW+ in a nutshell • Semantic MediaWiki • MediaWiki • WYSIWYG extension • Enhanced Retrieval Extension • Deployment Framework • Web Server
“The smartest organizations are not those with the smartest people but those with the quickest access to their collective knowledge” - Rod Collins (wiki-management.com)