Data Integration at the FBI: “Connecting the Dots” for Intelligence, Homeland Security & Law Enforcement. Michael F. Reed, Managing Partner, EWSolutions.
Data Integration at the FBI: “Connecting the Dots” for Intelligence, Homeland Security & Law Enforcement Michael F. Reed Managing Partner, EWSolutions
EWSolutions
Awards: Best Business Intelligence Application, Information Integration (Client: Department of Defense); World Class Solutions Award, Data Management.
EWSolutions is a Chicago-headquartered strategic partner and full life-cycle systems integrator providing both award-winning strategic consulting and full-service implementation services. This combination affords our clients a full range of services for any size enterprise architecture, managed metadata environment, and/or data warehouse/business intelligence initiative. Our notable client projects have been featured in the Chicago Tribune, Federal Computer Weekly, and Crain’s Chicago Business, and have won Intelligent Enterprise’s 2004 RealWare award and DM Review’s 2005 World Class Solutions award.
Our partial client list includes: Arizona Supreme Court; Bank of Montreal; Becton, Dickinson and Company; Blue Cross Blue Shield; Branch Banking & Trust (BB&T); British Petroleum (BP); California DMV; Corning Cable Systems; Countrywide Financial; Defense Logistics Agency (DLA); Delta Dental; Department of Defense (DoD); Driehaus Capital Management; Eli Lilly and Company; Federal Bureau of Investigation (FBI); Fidelity Information Services; Ford Motor Company; GlaxoSmithKline; Harris Bank; Harvard Pilgrim HealthCare; HP (Hewlett-Packard); Information Resources Inc.; Janus Mutual Funds; Johnson Controls; Key Bank; Loyola Medical Center; Manulife Financial; Mayo Clinic; Microsoft; National City Bank; Nationwide; Neighborhood Health Plan; NORC; Pillsbury; Secretary of Defense/Logistics; SunTrust Bank; Target Corporation; The Regence Group; Thomson Multimedia (RCA); United States Air Force; United States Navy; United States Transportation Command; USAA; Wells Fargo; Wisconsin Department of Transportation; Zurich Cantonal Bank.
For more information on our Strategic Consulting Services, Implementation Services, or World-Class Training, call toll free at 866.EWS.1100 (866.397.1100), main number 630.920.0005, or email us at Info@EWSolutions.com
Speaker Biography • Michael F. Reed, Managing Partner – EWSolutions • 30 years of IT industry experience • Expertise in data warehousing, metadata management and Enterprise Architecture • Editor of Real-World Decision Support newsletter • Project Manager & Lead Architect – Federal Bureau of Investigation • PMO System Architect for the Sentinel Program since early 2006
Project Background “We don’t know what we know.” Robert Mueller Director Federal Bureau of Investigation (FBI)
Goals for this Session • Share EWSolutions’ experience partnering with the FBI • Describe the FBI’s ambitious data integration effort to address one of its core intelligence problems • Challenges, solutions, and successes achieved
Agenda • Project overview • The FBI Issues the Challenge • EWSolutions Accepts the Challenge • The Vision • Specific Hurdles • Implemented solution • Environment overview • Deployment Strategy • Key benefits • Best Practices • Future direction • Q & A
The FBI Issues the Challenge • A visionary program manager saw an opportunity to build a data warehouse • The concept was put out for bid • Over a year-long process, EWSolutions won out over a number of competitors • The Program Manager issued the following challenge – “Don’t repeat our past mistakes – use metadata, industry standards, and do not write custom code”
EWS Takes the FBI Challenge • The FBI asked EWSolutions to create a new vision • EWSolutions decided to take an enterprise approach • Created a completely new logical data model describing the business instead of the data • Within two years this model was designated the official “FBI Data Model” • EWS’ customer received the FBI Director’s Award for our work
Vision • A cohesive view of “all that we know” • Conformed definitions, rationalized elements • Give Agents access to ALL the data • Eliminate the distinction between “structured” (table) data, “unstructured” (document) data, and “semi-structured” (e.g. XML) data • Information is not lost in the din of “data overload” • A highly complex data model with carefully designed relationships
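To make the "eliminate the distinction" bullet concrete, here is a minimal Python sketch of one common envelope for table rows, documents, and XML, so a single search covers all three. It is an illustration only, not the FBI's or any vendor's design; the `Item` type, the `from_*` helpers, and the source names are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical common envelope: every input ends up as searchable text."""
    source: str
    kind: str   # "structured", "unstructured", or "semi-structured"
    text: str


def from_row(source, row):
    # Table data: join the column values into one searchable string.
    return Item(source, "structured", " ".join(str(v) for v in row.values()))


def from_document(source, body):
    # Document data: the body is already free text.
    return Item(source, "unstructured", body)


def from_xml(source, xml_string):
    # XML data: collect the text of every element, in document order.
    root = ET.fromstring(xml_string)
    words = [t.strip() for t in root.itertext() if t.strip()]
    return Item(source, "semi-structured", " ".join(words))


def search(items, term):
    # One case-insensitive query across all three kinds of data.
    return [item for item in items if term.lower() in item.text.lower()]


items = [
    from_row("case_db", {"name": "John Doe", "city": "Chicago"}),
    from_document("memo_store", "Subject seen in Chicago."),
    from_xml("feed", "<report><subject>Jane Roe</subject><place>Boston</place></report>"),
]
hits = search(items, "chicago")
```

Here `hits` contains the table row and the memo, regardless of how each was originally stored.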
Project Challenges • Volume of data – petabyte potential within one year • Velocity of data – updates at increasing rates • Variety of data – literally hundreds of data sources, of all possible types & formats • Very “dirty” data
Project Challenges • Certification & Accreditation (C&A) • COTS (“Commercial Off-The-Shelf software”) data integration tools are new to many Federal environments • C&A is a difficult, time-consuming process • Enterprise Metadata repositories are very new to the Federal Government – it is difficult for Security to certify a tool whose function it does not understand
Project Challenges • Product interfaces • Goal of 100% COTS solution • Attained 99% COTS – one other vendor’s tool could not be called from the command line
The Solution • Multi-vendor COTS solution • Interfaces with other vendors’ COTS tools both pre- and post-process • Populates normalized data warehouse (data model looks more like an ODS) • Post-process to correct addresses, foreign names, localization, genderization
Deployment Strategy • Replace custom code with COTS ETL for existing data extraction programs • Use existing “staging” data model (target looks like source data) • Gradually migrate to consolidated data model • Users switch over to new system one source at a time
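The one-source-at-a-time switchover can be pictured with a small Python sketch. It assumes a per-source column mapping and a set of already-migrated sources; the source names, column names, and `load_row` function are hypothetical, and in the project itself this logic would be configured in the COTS ETL tool rather than written by hand.

```python
# Hypothetical mappings from staging column names (which mirror the source)
# to consolidated-model names.
STAGING_TO_CONSOLIDATED = {
    "source_a": {"SUBJ_NM": "person_name", "SUBJ_DOB": "birth_date"},
    "source_b": {"NAME": "person_name"},
}

# Sources whose users have already switched to the consolidated model.
MIGRATED = {"source_a"}


def load_row(source, row):
    """Return (target_model, row) for one incoming record."""
    if source not in MIGRATED:
        # Not cut over yet: the target still looks like the source data.
        return "staging", dict(row)
    mapping = STAGING_TO_CONSOLIDATED[source]
    # Cut over: rename mapped columns, pass unmapped ones through unchanged.
    return "consolidated", {mapping.get(col, col): val for col, val in row.items()}


migrated_result = load_row("source_a", {"SUBJ_NM": "John Doe", "CASE_NO": "17"})
pending_result = load_row("source_b", {"NAME": "Jane Roe"})
```

Adding a source to `MIGRATED` is the only change needed to move its users to the new model, which is what keeps each cutover small and reversible.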
Key Benefits • Lower costs from COTS solution • Changes to data sources don’t require experienced programmers • Workflow can be more easily automated • Leverage person-years of COTS ETL development • Better performance from multi-threaded, multi-processor ETL applications
Key Benefits (cont.) • Metadata integration • Ability to track operational and business metadata is a key differentiator • End-to-end metadata analysis and reporting (especially data lineage) • Higher data quality • More robust error handling, exception reporting, and data profiling are a good start on Data Governance
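As a rough picture of what a data lineage report answers, the Python sketch below stores "target is derived from sources" edges and walks them upstream. The element names and the `LINEAGE` table are made up for illustration; a real metadata repository would hold this information and expose its own query interface.

```python
from collections import deque

# Hypothetical lineage metadata: each element maps to the elements it is derived from.
LINEAGE = {
    "report.person_summary.name": ["warehouse.person.full_name"],
    "warehouse.person.full_name": ["staging.case_file.subj_nm"],
    "staging.case_file.subj_nm": ["source_a.SUBJ_NM"],
}


def upstream(element, lineage=LINEAGE):
    """List every element that feeds `element`, nearest first (breadth-first walk)."""
    found = []
    visited = {element}
    queue = deque([element])
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, []):
            if parent not in visited:  # guard against cycles and repeats
                visited.add(parent)
                found.append(parent)
                queue.append(parent)
    return found


trace = upstream("report.person_summary.name")
```

Here `trace` runs from the report field back through the warehouse and staging columns to the original source column.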
Best Practices • 100% COTS • Keep the developers in line! • Custom code increases O & M costs • Always design for scalability • Always design for scalability • Always design for scalability • As you may have noted, I think this is very important • Use parallelism, multi-threading, partitioning, clustering…
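The partitioning and parallelism bullet can be sketched in a few lines of Python: split records by a stable hash of a key, transform each partition in its own worker, then recombine. This is only a toy illustration of the idea; the record layout and function names are hypothetical, and a thread pool is used for simplicity even though CPU-bound work in CPython would normally need processes or a tool built for parallel ETL.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition(records, key, n):
    """Hash-partition records so the same key always lands in the same partition."""
    parts = [[] for _ in range(n)]
    for rec in records:
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        idx = zlib.crc32(str(rec[key]).encode("utf-8")) % n
        parts[idx].append(rec)
    return parts


def transform(part):
    # Stand-in for the per-record work: normalize the name field.
    return [{**rec, "name": rec["name"].strip().upper()} for rec in part]


def run_parallel(records, key="id", n=4):
    parts = partition(records, key, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(transform, parts))  # one partition per worker
    return [rec for part in results for rec in part]


records = [{"id": i, "name": f" person {i} "} for i in range(10)]
output = run_parallel(records)
```

Scaling up then means raising `n` and adding workers, with no change to the per-record logic.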
Best Practices (cont.) • Make maximum use of managed metadata environment • Self-documenting system (well, sort of) • Reporting is a way to reduce O & M costs • Pay careful attention to input data • It’s even worse than you thought • Good error handling is a way to reduce O & M • Data Quality Root Cause Analysis • Well crafted ICDs (Interface Control Documents)
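The points about dirty input, error handling, and root cause analysis fit together roughly as in the Python sketch below: bad rows are routed to an exception list with a reason instead of stopping the load, and the reasons are tallied. The field names and validation rules are hypothetical examples, not rules from the project.

```python
from collections import Counter
from datetime import datetime


def validate(row):
    """Return a list of problems found in one input row (empty if clean)."""
    problems = []
    if not (row.get("name") or "").strip():
        problems.append("missing name")
    try:
        datetime.strptime(str(row.get("birth_date") or ""), "%Y-%m-%d")
    except ValueError:
        problems.append("bad birth_date")
    return problems


def load(rows):
    """Split rows into accepted and rejected, and summarize rejection reasons."""
    accepted, rejected = [], []
    for row in rows:
        problems = validate(row)
        if problems:
            rejected.append({"row": row, "problems": problems})  # exception report entry
        else:
            accepted.append(row)
    # Counting reasons is a starting point for root cause analysis.
    summary = Counter(p for entry in rejected for p in entry["problems"])
    return accepted, rejected, summary


rows = [
    {"name": "John Doe", "birth_date": "1980-01-02"},
    {"name": " ", "birth_date": "02/01/1980"},
    {"name": "Jane Roe", "birth_date": None},
]
accepted, rejected, summary = load(rows)
```

A reason that dominates `summary` usually points at one upstream source or interface, which is where a well-crafted ICD pays off.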
The Future • Full metadata integration with all tools in the solution • Big Honkin’ Box (BHB) • That’s the technical term for “we’re going to scale the system up to very large UNIX boxes” • Major component of Federated Query and Service-Oriented Architecture • SENTINEL, CTISS, PM-ISE, DNI DMC
Questions? Michael F. Reed Managing Partner EWSolutions Phone: 202.295.7283 mreed@ewsolutions.com Michael.Reed@ic.fbi.gov