OSIC - A Cost-Sharing Approach to Open Source Information • Presented by: Scott Mutton, PM – Security & Intelligence, xwave • 13 April 2004
Agenda • Fly through basic concepts (2 min) • The OSIC Solution (5 min) • Functionality of an OSIC (15 min) • Savings and Benefits (5 min) • Costs and Risks (5 min) • Questions as time (Robert) permits
OSINF & OSINT - Basic Concepts • OSINF = Open Source Information • Information that can be legally and morally obtained, either freely or by paying a fee • Some question the “morally” caveat • OSINT = Open Source Intelligence • OSINT = OSINF that has been analyzed, categorized, filtered, or validated through some intelligence-driven process
Security Implications - Basic Concepts • By its nature, OSINF is “unclassified” • Sometimes, though, the very fact that a particular person or topic is of interest is classified • OSINT may be classified • Depends on the nature of the analysis, categorization, etc.
Some Sources of OSINF - Basic Concepts • Meetings and presentations • Places of worship • Books & plays • Can be fact or fiction • Fiction can provide insight into culture • Newspapers & periodicals • Also flyers, brochures, etc. • Movies • Radio & television • Unclassified reports and studies • Internet
The OSIC Solution • Establish an OSINF infrastructure which • Greatly improves the ability of people to get info from the Internet • Provides tools to help them understand the info they’ve found • Organizations in the S & I community • Share the OSINF infrastructure • Each does its own processing to generate OSINT • xwave • Owns & operates the infrastructure • Negotiates collective data purchasing agreements • Conducts ongoing tool reviews and infrastructure improvements
Chain of OSINT Processes - OSIC Solution • [Figure: process chain of Access, Collect, Collate, Reformat (Optional), Analyze, Publish, Distribute, Feedback; roles shown are Researchers, Librarians, and Analysts; steps carry the labels Mandatory, Helpful, or Optional; RFI info enters the chain] • Note: The process likely becomes classified at the point where RFI info is introduced
A Shared OSINF Infrastructure - OSIC Solution • [Figure: Org A, Org B, and Org C each perform their own organization-specific functions (Reformat, Analyze, Publish, Distribute, Feedback) on top of shared generic functions provided by the technology infrastructure (Access, Collect, Collate)]
Components of OSIC - Functionality • Federated Search • Focus on specific websites • Access to fee-for-data sites • Multi-lingual • Multi-media • Local OSINF repository • Data Mining • Collaboration
Federated Search - Functionality • A “federated search engine” is one that • Takes a single user query • Passes it on to other independent search engines • Collects the returns from each search engine • Removes duplicates • Prioritizes the collection • Presents the user with a single return list • The OSIC incorporates Copernic Empower • A commercial federated search engine that can fan out to over 1,000 different Internet search engines • Can also “log into” fee-for-service sites • Xerox’s “AskOnce” is an alternative under consideration
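The federated search steps listed on this slide (fan out one query, collect the returns, remove duplicates, prioritize, present a single list) can be sketched in a few lines. This is only an illustration of the general technique, not how Copernic Empower or AskOnce work internally. The stand-in engines, the crude URL normalisation, and the reciprocal-rank scoring rule are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Result = Tuple[str, str]                 # (url, title)
Engine = Callable[[str], List[Result]]   # query -> ranked results


def normalize(url: str) -> str:
    """Crude normalisation (an assumption) so near-identical URLs match."""
    return url.lower().rstrip("/")


def federated_search(query: str, engines: Dict[str, Engine]) -> List[Result]:
    """Send one query to every engine and merge the returns into one list."""
    scores: Dict[str, float] = {}
    titles: Dict[str, str] = {}
    for engine in engines.values():
        for rank, (url, title) in enumerate(engine(query), start=1):
            key = normalize(url)              # duplicates collapse onto one key
            # Reciprocal-rank scoring: results ranked highly by several
            # engines float to the top of the merged list.
            scores[key] = scores.get(key, 0.0) + 1.0 / rank
            titles.setdefault(key, title)     # keep the first title seen
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return [(key, titles[key]) for key in ordered]


# Hypothetical stand-ins for independent search engines.
def engine_a(query: str) -> List[Result]:
    return [("http://a.example/", "A"), ("http://b.example", "B")]


def engine_b(query: str) -> List[Result]:
    return [("http://B.example/", "B again"), ("http://c.example", "C")]


if __name__ == "__main__":
    for url, title in federated_search("osint", {"a": engine_a, "b": engine_b}):
        print(url, title)
```

A production engine would also query the sources in parallel and handle timeouts; the sketch leaves that out to keep the merge logic visible.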
Focus on Websites - Functionality • May want to spider and index some sites directly, rather than relying on commercial search engines • May want to ignore the robots.txt file • May want to access sites not indexed by commercial engines • May want to data mine the info • The OSIC uses Autonomy for indexing, data mining, and other advanced functionality • May want to monitor certain sites/pages for changes • Empower or AskOnce
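Monitoring pages for changes, as mentioned on this slide, usually comes down to comparing a stored fingerprint of each page against a fresh one. The sketch below shows that idea under stated assumptions: page text has already been fetched by some other component, and the function names and whitespace normalisation are hypothetical, not features of Empower, AskOnce, or Autonomy.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(page_text: str) -> str:
    """Hash the page after collapsing whitespace, so purely cosmetic
    spacing changes do not register as a content change."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(
    stored: Dict[str, str], fetched: Dict[str, str]
) -> Tuple[List[str], Dict[str, str]]:
    """Compare freshly fetched pages against stored fingerprints.

    stored  maps URL -> fingerprint from the previous run
    fetched maps URL -> page text from the current run
    Returns the sorted list of new or changed URLs and the updated state.
    """
    changed: List[str] = []
    updated = dict(stored)
    for url, text in fetched.items():
        digest = fingerprint(text)
        if stored.get(url) != digest:
            changed.append(url)
        updated[url] = digest
    return sorted(changed), updated
```

On the first run every page is reported as changed because nothing is stored yet; later runs report only pages whose normalised text differs.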
Fee-for-Data Sites - Functionality • A great deal of good OSINF needs to be purchased • BBC World Monitoring (FBIS), Canadian Press, LexisNexis, etc. • Significant economies of scale for bulk data • A quote from one data vendor • 11-30 users cost “X” dollars • 31-300 users cost “2X” dollars • 15 users pay 5 times as much per user as 150 users • Tools exist to allow seats to be “shared” for sites that require interactive access • For example, might share 5 seats across 150 users • Price per shared seat is high, but much less than the 30 normal seats it replaces • The OSIC is looking at the Tarantella tool to control use • A similar solution is in use in Canada’s Foreign Affairs department
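The arithmetic behind the vendor quote on this slide can be checked directly. The sketch below works in units of “X” (the actual dollar figure is not given in the deck) and uses exact fractions; the variable names are chosen for illustration only.

```python
from fractions import Fraction


def per_user_cost(total_cost_in_x: int, users: int) -> Fraction:
    """Cost per user, expressed as a fraction of the vendor's price X."""
    return Fraction(total_cost_in_x) / users


# 15 users fall in the 11-30 tier, which costs X in total.
small = per_user_cost(1, 15)      # X/15 per user
# 150 users fall in the 31-300 tier, which costs 2X in total.
large = per_user_cost(2, 150)     # X/75 per user

ratio = small / large             # how much more each of the 15 users pays

# Sharing 5 seats across 150 users means each shared seat stands in
# for this many individual seats.
users_per_shared_seat = 150 // 5

if __name__ == "__main__":
    print(f"15 users:  {small} X per user")
    print(f"150 users: {large} X per user")
    print(f"ratio: {ratio}")
    print(f"users per shared seat: {users_per_shared_seat}")
```

The ratio works out to 5, matching the slide, and each shared seat replaces 30 individual seats.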
Multi-Lingual - Functionality • A tremendous amount of needed information is not in English • Need to be able to • Recognize the language used in a document • Conduct “native searches” • E.g. an Arabic query to get Arabic info • Conduct “cross-language” searches • E.g. an English query to get Arabic info • Do “gist” translations from one language to another
Multi-Media - Functionality • There can be significant benefit in being able to apply automation to • Monitor radio or television broadcasts • Search stored audio/video files for content • Voice-to-text transcription is improving • A key factor is being able to maintain the link between the text and the original audio/video • Another situation where multi-lingual functionality is needed
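One simple way to keep the link between transcribed text and the original audio/video is to store each piece of transcript together with the media file and time offsets it came from. The sketch below is an assumption about how such a record might look; the `Segment` fields and the `find_in_transcript` helper are hypothetical and not taken from any tool named in this deck.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """A piece of transcript tied to its position in the source media."""
    media_file: str
    start_s: float   # offset of the segment start, in seconds
    end_s: float     # offset of the segment end, in seconds
    text: str


def find_in_transcript(segments: List[Segment], term: str) -> List[Segment]:
    """Case-insensitive search that returns whole segments, so a hit
    can be played back from the original recording."""
    needle = term.casefold()
    return [seg for seg in segments if needle in seg.text.casefold()]


if __name__ == "__main__":
    broadcast = [
        Segment("news.wav", 0.0, 4.5, "Good evening"),
        Segment("news.wav", 4.5, 9.0, "Flooding in the region"),
    ]
    for hit in find_in_transcript(broadcast, "flooding"):
        print(f"{hit.media_file} @ {hit.start_s}s: {hit.text}")
```

Because each hit carries the file name and offsets, an analyst can jump straight to the original audio to verify a doubtful transcription.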
Local Repository - Functionality • Unclassified OSINF/OSINT that is not posted to the Internet • Could be product created by OSIC users • Some Internet docs may be of such lasting value that they’re worth saving • Data mining tools can be readily applied
Data Mining - Functionality • Tools exist to analyze volumes of numeric and text data and • Identify trends and clusters • Allow users to easily “walk through” the data • A significant benefit to the OSIC is that currently such tools are expensive and technically challenging to set up
Collaboration - Functionality • Simple file sharing • Organizations connecting to the OSIC add UNCLASS product to local repository • Connecting users with questions to users with answers • Possibly some vehicle for chat • Likely security issues
Savings and Benefits • Cost sharing across multiple user groups • Savings through economies of scale • Savings through shared access • Ongoing evaluation and adoption of new technologies • Sharing of knowledge • Easily implemented
Cost Sharing - Savings and Benefits • One set of HW and SW serves entire community • Users access through Web Browser • Shared operations and maintenance • One set of HW/SW support agreements • One system administrator • If scale/need dictates, may have more than one and provide 24/7 support
Economies of Scale - Savings and Benefits • When purchasing data, size matters • A quote from one data vendor: • 11-30 users cost “X” dollars • 31-300 users cost “2X” dollars • 15 users pay 5 times as much per user as 150 users
Shared Access - Savings and Benefits • Some data vendors require individual licenses to log into their systems • The OSIC can potentially arrange for shared licenses • Higher cost per license, but many fewer licenses needed • Technology can enforce sharing limits, so vendors can be confident they’re not being abused
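Enforcing a sharing limit amounts to capping how many users hold a licensed seat at the same time. The sketch below illustrates that idea with a small in-memory pool; it is an assumption about the general mechanism, not a description of how Tarantella or any vendor's licensing works, and the `SeatPool` class is hypothetical.

```python
import threading
from typing import Set


class SeatPool:
    """Caps the number of users holding a shared seat at any one time."""

    def __init__(self, seats: int) -> None:
        self._seats = seats
        self._holders: Set[str] = set()
        self._lock = threading.Lock()   # guards the holder set across threads

    def acquire(self, user: str) -> bool:
        """Give the user a seat if one is free; return False when full."""
        with self._lock:
            if user in self._holders:
                return True             # user already holds a seat
            if len(self._holders) >= self._seats:
                return False            # licensed limit reached
            self._holders.add(user)
            return True

    def release(self, user: str) -> None:
        """Return the user's seat to the pool (no-op if none is held)."""
        with self._lock:
            self._holders.discard(user)

    def in_use(self) -> int:
        with self._lock:
            return len(self._holders)


if __name__ == "__main__":
    pool = SeatPool(seats=2)
    print(pool.acquire("alice"), pool.acquire("bob"), pool.acquire("carol"))
    pool.release("alice")
    print(pool.acquire("carol"), pool.in_use())
```

Because the pool never hands out more seats than were licensed, the vendor can audit a single number, the peak value of `in_use()`, against the agreement.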
Access to New Technologies - Savings and Benefits • OSIC has a mandate to investigate new tools and technologies for • Search & retrieve • Data mining • Knowledge sharing • If a tool is put into OSIC, it can be immediately available to all • May be some benefits to the entire community using a common toolset
Knowledge Sharing - Savings and Benefits • If one individual finds a useful document or URL, it can be made available to the entire community • If users wish, they can make available to others • Their specific queries • Their areas of interest • If OSIC is used to “contract out” unclassified research, then expertise on who to call for what is centralized
Ease of Implementation - Savings and Benefits • Subscribers don’t have to deal with • Evaluations and processes associated with a system purchase or development • Staffing to support • Licensing with SW or data vendors • Training preparation or delivery • Service can be available within days of signing up for subscription
Costs and Risks • Operating model • Different organizations have different requirements • Security concerns
Operating Model - Costs and Risks • The OSIC, as a contractor-owned user-subscription model, has some difficulties • Affordable fees depend on significant community participation • 50-100 users likely a viable starting number • Subscription could be per-seat or per-organization • Problem: A number of orgs want to wait until “others go first” • Problem: This is a model new to Canadian Federal contracting processes • A government owned & operated model also has difficulties • Arriving at a fair way to distribute costs can be a problem • Particularly if different organizations have very different data acquisition or system utilization needs • The contracting process to establish such a capability is very long
Organization Requirements - Costs and Risks • There may be issues if parts of the community have substantially different needs for such things as • Security • 24/7 • Solutions almost certainly exist, but how would they be cost-shared?
Security Concerns - Costs and Risks • Internet-based OSINT requires connectivity • Some agencies air-gap • May leave “footprints” • All collection methods have a comparable problem • Some believe their work is so secret that the merest hint of their interests is classified • In many cases, this is an exaggeration