Sentry Brief for Ms. Hollis December 14, 2011 UNCLASSIFIED//FOR OFFICIAL USE ONLY
Meeting Objective • Provide understanding of the Sentry system requirements, capabilities, associated costs, and operational successes to date
Sentry
• Sentry was developed to support the Afghan Threat Finance Cell (ATFC) in processing and analyzing handwritten and computer-generated Afghan hawala and bank financial transactions, ledgers, and supporting data.
[Figures: Volume of Data; Example of Data]
Requirements
• A system located on-site for local ingest, processing, and querying
• Automate parsing, cleansing, and standardization of ingested data
• Automate OCR of machine-generated documents
• Automate OCR of handwritten Pashto and Dari documents
• Translation and transliteration of Dari and Pashto documents
• Store parsed and standardized data in a relational database
• Mask all US persons information
• Federated data sharing architecture
• Row and entity level database security
• Query and visualize data in a variety of formats (nodal, temporal, etc.)
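The cleansing and standardization requirement can be illustrated with a minimal sketch. The brief does not describe how Sentry implements this step; the function names and the list of accepted date layouts below are assumptions made purely for illustration.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical input date layouts; a real ledger feed would need its own list.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def standardize_amount(raw):
    """Strip the currency symbol and thousands separators; return a Decimal or None."""
    cleaned = raw.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None  # leave unparseable values for manual review


def standardize_date(raw):
    """Try each known layout in turn; return an ISO 8601 date string or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Returning None instead of raising keeps a bulk ingest running when a single field is malformed, so that rejected values can be collected and reviewed separately.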
Capabilities
• System deployed in June 2011
• Dell server cluster hosting virtual machines residing in-theater
• Dedicated Internet pipe with Cisco VPN for access between DC and ATFC
• SQL Server database with WebTAS analytic software providing query and display capabilities
• Unique capabilities of Sentry system
  • Automatic image parsing/binning
  • OCR of handwritten Dari/Pashto ledgers/forms
  • Integrated translation/transliteration
  • Content Examiner
Image Processing
• The majority of the financial transactions are recorded on paper
• Seized information must be returned within 72 hours
• More than 75,000 images of documents seized
• Developed software to automatically identify image type and group images so high-value information can be processed first
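The grouping and prioritization idea can be sketched as follows. This assumes each image has already been assigned a type by an upstream classifier, which is not shown; the type names and priority values are hypothetical and are not taken from the Sentry software.

```python
import heapq
from collections import defaultdict

# Hypothetical image types and priorities (lower number = processed sooner).
PRIORITY = {"ledger": 0, "form": 1, "receipt": 2, "other": 9}


def bin_images(images):
    """Group (image_id, image_type) pairs by type; unknown types go to 'other'."""
    bins = defaultdict(list)
    for image_id, image_type in images:
        key = image_type if image_type in PRIORITY else "other"
        bins[key].append(image_id)
    return bins


def processing_order(bins):
    """Yield image ids from the highest-value bin first, keeping order within a bin."""
    heap = [(PRIORITY[image_type], image_type) for image_type in bins]
    heapq.heapify(heap)
    while heap:
        _, image_type = heapq.heappop(heap)
        yield from bins[image_type]
```

With a fixed return deadline, ordering by bin means the most valuable document types are processed before time runs out, even if the lower-priority bins are never reached.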
Content Examiner
• Highly automated data profiling and analysis program
• Codifies the analytic processes used to generate domain knowledge, enabling automatic determination of field content, value, and standardization requirements
• Incorporates data definitions and processing rules for industry-standard information
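A rough sketch of what automatic determination of field content could look like is given below. It is not the Content Examiner itself: the rule set, the labels, and the 80% agreement threshold are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical rules, checked in order; a value must match a pattern in full.
RULES = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "amount": re.compile(r"\$?(\d{1,3}(,\d{3})+|\d+)(\.\d+)?"),
    "phone": re.compile(r"\+?\d[\d\- ]{6,}\d"),
}


def profile_column(values, threshold=0.8):
    """Return the label matched by at least `threshold` of non-empty values, else 'unknown'."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return "unknown"
    hits = Counter()
    for value in non_empty:
        for label, pattern in RULES.items():
            if pattern.fullmatch(value):
                hits[label] += 1
                break  # first matching rule wins
    label, count = hits.most_common(1)[0] if hits else ("unknown", 0)
    return label if count / len(non_empty) >= threshold else "unknown"
```

Once a column is labeled, the matching standardization rule can be applied without a hand-written configuration file for that particular format.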
Operational Successes
• Processed nearly half a terabyte of ATFC information in support of operational partners
• Provides a holistic view of suspect threat finance information
Shaheen Exchange/Kabul Bank • 80,000 Excel spreadsheets containing 1.2 million financial transactions from offices around the world
Strategic View • Decade of information offers unique perspective
Strategic View
• The anomalies found in the first strategic view of the information by the ATFC director were “eye-opening”
• 2.4 billion US dollars were moved during these three spikes in activity
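The brief does not say how the spikes were identified. One simple way to flag such periods, shown here only as a sketch, is to mark any period whose total exceeds the mean by more than k population standard deviations. The function name, the period keys, and the default k are assumptions.

```python
from statistics import mean, pstdev


def find_spikes(period_totals, k=2.0):
    """Return the periods whose total exceeds mean + k * population std dev."""
    values = list(period_totals.values())
    if len(values) < 2:
        return []  # not enough data to define a baseline
    cutoff = mean(values) + k * pstdev(values)
    return [period for period, total in period_totals.items() if total > cutoff]
```

A long time span helps here, because a decade of totals gives a more stable baseline against which a spike stands out.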
Tactical View
• Still provides transaction-level detail for targeting
* Mahmood Karzai is Afghan President Hamid Karzai’s brother
Structuring Payoffs
• Sherkhan Farnood ($1,500,000): Largest Shareholder of Kabul Bank (28%)
• Faridah Farnood ($943,036): Farnood’s Wife
Sequential Transactions, August 20, 2007, Azdarak Capital Account
• Dr Ahmad Jawid: $173,096
• Mohd Tahir: $450,724
• Kefayat LTD: $11,802
• Jamal Khail: $154,269
• Abdul Rab: $119,425
• Qushqar: $118,582
• Jora Bek: $96,448
• Abdul Fahim: $106,780
• Rabiullah Kakar: $47,489
• Mahmood Karzai: $592,910
• Mohd Ehsan Rafat: $71,093
• Shokrullah: $59,010
Developing New Targets
• Jora Bek is the largest individual money mover in the data set
• First Noted: January 3, 2001
• Last Noted: July 30, 2010
• Number of transactions: 6,664
• Total amount: $694,073,982
• Fax: 7-3772214421 (Kazakhstan)
• Mobile: 7-3779010096 (Kazakhstan)
• H/L: 971-42666450 (Dubai)
• Fax: 971-42620285 (Dubai)
• Mobile: 971-504574410 (Dubai)
• Mobile: 971-504545867 (Dubai)
• Mobile: 998-711864138 (Uzbekistan)
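The summary fields on this slide (first noted, last noted, number of transactions, total amount) amount to a per-entity aggregation over the transaction table. A minimal sketch follows, assuming rows of (entity, date, amount); the Summary type and function names are hypothetical, and in the deployed system this would more plausibly be a query against the SQL Server database.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class Summary:
    first_noted: date
    last_noted: date
    count: int
    total: Decimal


def summarize(transactions):
    """Aggregate (entity, date, amount) rows into one Summary per entity."""
    summaries = {}
    for entity, when, amount in transactions:
        s = summaries.get(entity)
        if s is None:
            summaries[entity] = Summary(when, when, 1, amount)
        else:
            s.first_noted = min(s.first_noted, when)
            s.last_noted = max(s.last_noted, when)
            s.count += 1
            s.total += amount
    return summaries


def largest_mover(summaries):
    """Return the entity with the highest total amount."""
    return max(summaries, key=lambda entity: summaries[entity].total)
```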
Upcoming Milestones
• Operational
  • Update in-theater database by January 15th
• Research & Development
  • Field Content Examiner prototype by June 1st
Content Examiner
• Today’s Extract, Transform, and Load (ETL) tools are inadequate
• Individuals with domain knowledge must still describe data
• Configuration files need to be developed for every format
• Changes in file and field formats cause significant problems
• Metadata is often lost
• Resource intensive (man hours)
• We should be buying capabilities, not hours