Data Architecture at CIA. Dave Roberts, Chief Technical Officer, Application Services, CIO CIA. Davecr@ucia.gov
Topics • Enterprise Data Architecture at the CIA • Applicability across enterprises
Data Architecture Mission Statement Enable the mission by enhancing the value of data.
A Framework for Data Architecture • Enterprise Data Model: What things are important to us and how they are related • Master Data Management: Starting points and priorities • Data Services Strategy: Shielding applications from changes in data structure • Repository Strategy: Increasing data value by reducing data fragmentation • Uniform Resource Identifier: Telling different things apart • Semantic Technology Strategy: Knowing what we know and what it means Some Business Drivers: • “make it easy to share within and outside the Agency” • “centralized data repository” • “enable more effective linguistic search and data manipulation” • “reuse the data, so multiple entry of the same information is eliminated”
When We Started • No enterprise data model to use for standardization • Projects underway changing major data stores • If we developed an EDM and then standardized, it would be too late to influence the major projects • So we are aligning projects with the EDM as we build it • We must address not just technical characteristics of data but also the data strategy for the enterprise • We are releasing a draft of our EDA every quarter in FY07; the final EDA is to be complete at the end of FY07
Enterprise Data Strategy • We are changing our data culture, from project-centric to enterprise-aware • The scope of the change is huge--top management support is essential (and we have it) • Enterprise Data Layer project provides public focus for our efforts to improve data value • The business understands “cleaning up” data and putting it “into” the EDL • Although data may not move to “enter” the EDL, nevertheless it’s a useful construct
Application View of the EDL • Always available • Subject to strong, consistent access control • Discoverable • Physically protected • High data integrity • Sharable • Consistently represented
Enterprise Data Layer Defined The Enterprise Data Layer is a collection of data of interest to the enterprise, software used to access, manage and control it and hardware used to house and access it. The Enterprise Data Layer • is always available • makes all data discoverable • includes entity types, attributes and relationships in alignment with the Enterprise Data Model • has duplicate entity instances resolved and there is a unique enterprise identifier for each instance • is accessed through a set of enterprise access interfaces • has access to it controlled by enterprise access control • is physically secure
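One way to picture "duplicate entity instances resolved" with "a unique enterprise identifier for each instance" is the minimal sketch below. It is an illustration only, not the Agency's method: the `EnterpriseRegistry` class, the name-normalization matching rule, and the `urn:example:entity:` identifier scheme are all assumptions made for this example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class EnterpriseRegistry:
    """Hypothetical registry that hands out one enterprise ID per entity."""

    _ids_by_key: dict = field(default_factory=dict)

    @staticmethod
    def match_key(record: dict) -> tuple:
        # Assumed matching rule: same entity type and same name after
        # lowercasing and collapsing whitespace. Real resolution would
        # need far richer matching than this.
        normalized = " ".join(record["name"].lower().split())
        return (record["entity_type"], normalized)

    def resolve(self, record: dict) -> str:
        # Reuse the existing ID for a known entity, otherwise mint a new one.
        key = self.match_key(record)
        if key not in self._ids_by_key:
            self._ids_by_key[key] = f"urn:example:entity:{uuid.uuid4()}"
        return self._ids_by_key[key]


registry = EnterpriseRegistry()
a = registry.resolve({"entity_type": "Organization", "name": "Acme  Corp"})
b = registry.resolve({"entity_type": "Organization", "name": "acme corp"})
c = registry.resolve({"entity_type": "Person", "name": "Acme Corp"})
```

Here `a` and `b` come out as the same identifier because the two records are treated as one instance, while `c` gets its own identifier because the entity type differs.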
Enterprise Data Layer and Applications [Diagram; labels: Application, Service, Enterprise Access Control, Middleware Data Mapping, Enterprise Data Stores]
Enterprise Data Model • The EDM is conceptual and shows only entity types and principal relationships • There’s a lot that it doesn’t show and doesn’t control • For master data entities, a logical data model will be developed that will show attributes and details of relationships • Master data entities include the objects of interest to the community (Person, Organization, etc.)
Use of Data Models in CIA Data Architecture • Data models are not used to control storage structures • Data model constraints apply to interfaces to the EDL, not to physical storage • Data model constraints can be met logically through middleware or other multipurpose services, or at the storage level • This flexibility is used to deal with legacy and with COTS
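The idea that model constraints bind the interface rather than physical storage can be sketched as a thin adapter. This is a hedged illustration, not a description of any actual system: the `PersonInterface` protocol, the legacy column names (`NM_LAST`, `NM_FIRST`), and the target attribute names are hypothetical.

```python
from typing import Protocol


class PersonInterface(Protocol):
    """Assumed interface shape that callers rely on, per the data model."""

    def get_person(self, enterprise_id: str) -> dict: ...


# Hypothetical legacy storage: the column names do not match the data model.
LEGACY_ROWS = {"P-001": {"NM_LAST": "DOE", "NM_FIRST": "JANE"}}


class LegacyPersonAdapter:
    """Middleware-style mapping: storage stays as-is, the interface conforms."""

    FIELD_MAP = {"NM_LAST": "family_name", "NM_FIRST": "given_name"}

    def __init__(self, rows: dict) -> None:
        self._rows = rows

    def get_person(self, enterprise_id: str) -> dict:
        row = self._rows[enterprise_id]
        # Rename legacy columns and normalize casing at the interface.
        person = {self.FIELD_MAP[col]: value.title() for col, value in row.items()}
        person["enterprise_id"] = enterprise_id
        return person


adapter: PersonInterface = LegacyPersonAdapter(LEGACY_ROWS)
person = adapter.get_person("P-001")
```

The legacy rows are never rewritten; only the view exposed through `get_person` follows the model, which is the flexibility the slide describes for legacy and COTS stores.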
How We’re Getting There • We work with projects in development • Every project is given an Enterprise Data Model Maturity Assessment • Assessment can be from 1 (project-centric data management) to 5 (entirely enterprise-aware and compliant) • Assessments are carried along with project status data • Maturity level assessment provides a management tool to set goals and track progress
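Carrying a 1 to 5 assessment alongside project status might look like the sketch below. The `ProjectStatus` record, the `target_level` field, the `gap` calculation, and the sample projects are assumptions for illustration; the slides do not say how assessments are stored or compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectStatus:
    """Hypothetical status record that carries the maturity assessment."""

    name: str
    maturity_level: int  # 1 = project-centric, 5 = enterprise-aware and compliant
    target_level: int    # assumed management goal for the project

    def __post_init__(self) -> None:
        for level in (self.maturity_level, self.target_level):
            if not 1 <= level <= 5:
                raise ValueError("maturity levels run from 1 to 5")

    @property
    def gap(self) -> int:
        # Levels still to climb before the goal is met (never negative).
        return max(self.target_level - self.maturity_level, 0)


projects = [
    ProjectStatus("Project A", maturity_level=2, target_level=4),
    ProjectStatus("Project B", maturity_level=4, target_level=4),
]
behind = [p.name for p in projects if p.gap > 0]
```

A list such as `behind` is one simple way the assessment could serve as a management tool for setting goals and tracking progress.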
Next Step: Legacy • We are inventorying legacy data stores • For each, we will develop an appropriate plan to bring it into the EDL • Some will have storage structures changed; others will have storage structures emulated through middleware or other enterprise services
Duration of our Effort • We can measure progress by EDAMM level and by amount of data “in the EDL” • The effort won’t be completed in a year; or five; or ten • We will deliver mission benefit in the current FY and on a continuing basis • This is like a quality effort; you don’t stop
What’s Next • We are constructing artifacts to make our EDAMM assessment as objective as possible: a features checklist, mandatory content indicated on the EDM, and logical data models for master entity types • We are writing our EDA to describe the whole process • The document will be specific about EDAMM assessments • To be completed September 30, 2007; drafts quarterly • March 31 issue will be the first to include EDAMM
What About a Community? • If you’re talking about the whole federal government or the whole Intelligence Community, what should you do? • We believe that it is not practical to force compliance with storage structures even within a single enterprise, much less across a federation of enterprises • We are standardizing on interfaces, even within the enterprise • In a community, why not standardize on information exchange and ignore how it’s stored?
The Conglomerate Model • Enterprise Technical Architecture uses a concept called the conglomerate model • Individual pieces are separate businesses • Standardization is intended to allow interchange of information, not common infrastructure
Data Architecture for a Conglomerate • Information exchange formats are required • XML is the obvious choice for exchange • But shared semantics for the most important data are also needed, and XML alone does not convey them • Agreement on a conceptual data model for principal entities is needed
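The point that XML supplies syntax but not shared meaning can be shown with a small sketch. Everything in it is hypothetical: the two message layouts, the tag names, and the `CONCEPT_MAP` agreement stand in for whatever conceptual model the participating enterprises would actually agree on.

```python
import xml.etree.ElementTree as ET

# Two well-formed messages from different (hypothetical) enterprises.
# Both parse fine, but nothing in the XML says the tags mean the same thing.
MSG_A = "<person><surname>Doe</surname></person>"
MSG_B = "<individual><familyName>Doe</familyName></individual>"

# Assumed agreement on a conceptual model:
# (root tag, child tag) -> (entity type, attribute)
CONCEPT_MAP = {
    ("person", "surname"): ("Person", "family_name"),
    ("individual", "familyName"): ("Person", "family_name"),
}


def to_conceptual(xml_text: str) -> dict:
    """Translate a message into the shared conceptual vocabulary."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        entity, attribute = CONCEPT_MAP[(root.tag, child.tag)]
        result[(entity, attribute)] = child.text
    return result


same_meaning = to_conceptual(MSG_A) == to_conceptual(MSG_B)
```

Only after the mapping to the agreed concepts do the two messages compare as equal; the mapping table is where the shared semantics live, not the XML itself.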