Protecting Confidential Information in Corporate Search
Mark Bennett
Agenda
• Business Drivers
• Levels of Security “Granularity”
• “Early” vs. “Late” Binding – why it matters!
• Vendor round up
• Organizational and Technical Challenges
• Patching Search Security Holes
• Trends
• Wrap Up / Q & A
The ES Security Paradox
As Search is deployed further and further into the Enterprise, the likelihood of having a security problem increases.
An Experiment You Should Try
• You’ll be amazed what you can find on your own company’s network. Try searching for:
  • confidential
  • highly confidential
  • salaries
  • performance review
  • Excel spreadsheets (.xls)
  • Access databases (.mdb)
• Also look for:
  • Obscenities
  • Racial and gender slurs
Shifts in Thinking
• From technical security to Business Viability
  • IP, financial/SEC, regulatory, espionage, privacy
  • Downsides include: loss of competitive advantage, degradation of company reputation, impact of fraud and misuse, decisions made on faulty information, loss of access to critical information, legal and contract liability, regulatory fines, public safety
  • Forrester interview with Michael Rasmussen
• From “perimeter-focused” to “distributed”
  • Must protect some data internally
  • Some systems must trust other security providers
  • Burton Group
Enterprise Search Security Summer 2008
Enterprise Search and Corporate Security: The Current State of Affairs
• The Good: SSO, SAML, LDAP, Active Directory
• The Bad: Spidering, Org Boundaries
• The Ugly: Holes, Lack of Awareness
Levels of Security “Granularity”
• Summary:
  • Application / Collection
  • Document
  • Field / Sub-Document
  • Sub-Field / “Redaction”
Granularity: Collection Level
Granularity: Document Level
“Early Binding” vs. “Late Binding” Security
This choice affects performance and security infrastructure load
Defining “Early” vs. “Late” Binding
• Early-Binding
  • Search engine Index includes ACL info
  • Forrester: “Caching security credentials”
• Late-Binding
  • ALL security work done at Search Time
  • Forrester: “Run-time access validation”
• Hybrid: combines Early and Late
• Federated: leverage indigenous engines
  • May require complex security mapping
Security Infrastructure Interaction
Late Binding: Index Time
• No work needed at Index time
• Would appear to be a simpler/better design
Early Binding: Index Time
• I have document “http://corp.acme.com/sales/forcast.html”, what are the group IDs for it? (ACLs, etc.)
Early Binding: Search Time
• I have Session ID “14729834416”, which User is that for?
• I have User “Jones”, which groups is he in?
• Transform the list of Group IDs into a Native Query Filter (with ACLs, etc.)
Late Binding: Search Time
• I have Session ID “14729834416”, can I access document “http://corp.acme.com/sales/forcast.html”, Yes or No? (repeat for every match)
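
The questions on this slide can be sketched in a few lines of Python. This is a toy illustration only, not any vendor's implementation: `USER_GROUPS`, `LIVE_ACLS`, `DOC_TEXT`, and every function name are hypothetical stand-ins for a real directory service and search index, and the Session ID to User step is left out. The `calls` counter tracks round trips to the security infrastructure.

```python
# Hypothetical stand-ins for the security infrastructure (LDAP, Active Directory, ...).
USER_GROUPS = {"jones": {"sales", "staff"}, "smith": {"staff"}}
LIVE_ACLS = {
    "http://corp.acme.com/sales/forcast.html": {"sales"},
    "http://corp.acme.com/hr/handbook.html": {"staff"},
}
DOC_TEXT = {
    "http://corp.acme.com/sales/forcast.html": "quarterly sales forecast",
    "http://corp.acme.com/hr/handbook.html": "staff handbook and sales contacts",
}

calls = {"count": 0}  # round trips to the security infrastructure


def groups_for_user(user):
    """Ask the security system: which groups is this user in?"""
    calls["count"] += 1
    return USER_GROUPS.get(user, set())


def can_access(user, url):
    """Ask the security system: can this user see this document, Yes or No?"""
    calls["count"] += 1
    return bool(USER_GROUPS.get(user, set()) & LIVE_ACLS.get(url, set()))


def build_index():
    """Early binding, index time: copy each document's ACL into the index."""
    return {url: (text.split(), frozenset(LIVE_ACLS[url]))
            for url, text in DOC_TEXT.items()}


def search_early(index, user, term):
    """Early binding, search time: one group lookup, then filter in the engine."""
    groups = groups_for_user(user)
    return sorted(url for url, (words, acl) in index.items()
                  if term in words and acl & groups)


def search_late(user, term):
    """Late binding, search time: one Yes/No check for every matching document."""
    hits = [url for url, text in DOC_TEXT.items() if term in text.split()]
    return sorted(url for url in hits if can_access(user, url))
```

In this sketch an early-binding search costs one security lookup no matter how many documents match, while a late-binding search costs one lookup per match. The flip side is that `search_early` keeps honoring the ACL copy taken at index time until the document is re-indexed, whereas `search_late` sees ACL changes immediately.
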
Vendor: FAST Search & Transfer
• Supports Early and Late binding
  • Can use BOTH together
  • Hybrid approach “Best of both Worlds”
• Gets along very well with Microsoft Active Directory
• FAST SAM = Security Access Module
  • Based on Windows technology
• Can still use your own application level logic if you prefer
Vendor: Autonomy
• IDOL supports both Early and Late binding:
  • Hybrid approach “Best of both Worlds”
  • IDOL: Early Binding = “Mapped”
  • IDOL: Late Binding = “Unmapped”
• Ultraseek
  • Ultraseek is Late Binding only
Vendor: Google Appliance
• Google Appliance
  • Late-Binding only
  • “spin” is low latency – but actually a compromise...
  • Could heavily load security infrastructure
  • Does use some caching to lighten the load
    • Caching decreases response time = good
    • Caching increases latency before ACL changes take effect
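
The caching trade-off can be made concrete with a small time-to-live cache in front of a late-binding access check. This is a generic sketch, not a description of how the Google Appliance caches internally; the class name `AccessCache`, its parameters, and the `check` callback are all hypothetical.

```python
import time


class AccessCache:
    """Cache Yes/No access decisions for a limited time (hypothetical sketch)."""

    def __init__(self, check, ttl_seconds, clock=time.monotonic):
        self._check = check          # callable(user, url) -> bool, hits the security system
        self._ttl = ttl_seconds      # how long a cached decision is trusted
        self._clock = clock          # injectable so the behavior can be tested
        self._entries = {}           # (user, url) -> (decision, time stored)
        self.backend_calls = 0       # load placed on the security infrastructure

    def allowed(self, user, url):
        key = (user, url)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]          # fast path: no security round trip
        self.backend_calls += 1
        decision = self._check(user, url)
        self._entries[key] = (decision, now)
        return decision
```

A longer `ttl_seconds` means fewer calls to the security infrastructure and faster responses, but a revoked permission can keep working for up to that long.
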
Vendor: Endeca
• Out of the box is Early Binding only
  • Mitigated by low latency for document changes
  • Provides accurate document counts by user
• General term is “Record Filters”
• Or can use “joins” to a full-text ACL index
  • RRN: Relational Record Navigation
• Late binding via custom code
“Vendor”: Lucene / Solr / Nutch
Roll your own…
Organizational and Technical Challenges
“They won’t let me in!”
Access Issues
• Spider may need “Über Login”
• Divisions worried about loss of control
  • Worried about cached copies of data
• Several Approaches
  • Global Indexing – single Monolithic Search
  • Federated Search – leverage what’s already there
  • “Deferred Search”
Deferred Search
Check List
• Limit access to Disk files
  • Use File / SSH restrictions
  • Don’t recommend total file encryption
  • (exception for password files of course)
• Files to keep in mind
  • Config files, Scripts
  • LOGS
  • Search Engine Indices
  • In some search engines DOCUMENTS CAN BE RECONSTRUCTED from the Words Index
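
One way to audit the files on this check list is to scan the search engine's directories for anything other local users can read or write. The sketch below assumes a POSIX system with standard permission bits; the function name `world_accessible` is hypothetical, and the directory to scan depends on where your engine keeps its config files, logs, and indices.

```python
import stat
from pathlib import Path


def world_accessible(root):
    """Return regular files under root that 'other' users can read or write."""
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        mode = path.stat().st_mode
        # S_IROTH / S_IWOTH are the read and write bits for users outside the owner and group.
        if stat.S_ISREG(mode) and mode & (stat.S_IROTH | stat.S_IWOTH):
            flagged.append(path)
    return flagged
```

Run it against the index, log, and config directories; any path it returns is a candidate for tighter file restrictions.
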
Other “Gotchas”
• Secure the Search Admin UI!
  • May require other back end changes
• Secure the Search Analytics UI
  • Can assign various “roles” as appropriate
• Secure TCP/IP traffic where appropriate
  • Searches, spider, logging, admin UI
  • Overkill in some cases
• Beware of Cached Data
  • Can violate automatic retention policy
Editing Search Engine URLs
• Form-Based Filtering: http://www.acme.com/go?coll=public
• Hackable View URLs: http://www.acme.com/go?viewdoc=100
• DOCUMENT HIGHLIGHTING represents a potential Security Hole
  • Results List Summaries
  • Full-Document highlighting
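
The risk with both URL styles is that the user can edit the parameter by hand, so the server has to re-check access on every request instead of trusting what the search form sent. A minimal sketch, assuming collection-level permissions: the `USER_COLLECTIONS` and `DOC_COLLECTION` tables and the `authorize` function are hypothetical, and only the `coll` and `viewdoc` parameter names come from the slide.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical permission data; a real system would ask the security infrastructure.
USER_COLLECTIONS = {"jones": {"public", "sales"}, "guest": {"public"}}
DOC_COLLECTION = {"100": "sales", "101": "public"}


def authorize(user, url):
    """Re-check on the server every collection or document named in the URL."""
    params = parse_qs(urlparse(url).query)
    allowed = USER_COLLECTIONS.get(user, set())
    requested = set(params.get("coll", []))
    # Unknown document IDs map to None, which is never in `allowed`, so they are denied.
    requested |= {DOC_COLLECTION.get(doc_id) for doc_id in params.get("viewdoc", [])}
    return bool(requested) and requested <= allowed
```

With this check in place, changing `coll=public` to another collection, or `viewdoc=100` to a neighboring ID, gets the same answer the search engine would have given, not whatever the URL asks for.
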
Gotchas: Misc.
• Results Navigators show Meta Data
  • Employees see “Upcoming Layoff”, etc.
• Detecting FAILED pages with status 200
  • Some Web Servers give back nicely formatted error screens or redirects, instead of an HTTP error code
• Desktop Search Holes
  • Peer-to-peer may not be properly controlled
  • May bypass Office file/doc passwords
• User Data: To Log or Not to Log?
  • Potential liability with either choice
  • Employee Privacy Concerns
  • De Facto Notification
  • Disclaimer: We are not lawyers
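
For the "FAILED pages with status 200" item, a spider can apply a few heuristics before indexing a page. The sketch below is one possible approach, not a complete detector: the phrase list, the URL pattern, the 500-character threshold, and the function name `looks_failed` are all assumptions to be tuned against the error pages your own servers produce.

```python
import re

# Hypothetical phrases and URL fragments that suggest an error or login screen.
ERROR_TEXT = re.compile(
    r"page not found|\b404\b|access denied|not authorized|session (has )?expired",
    re.IGNORECASE,
)
LOGIN_OR_ERROR_URL = re.compile(r"/(login|signin|error)\b", re.IGNORECASE)


def looks_failed(status, requested_url, final_url, title, body):
    """Heuristically decide whether a fetched page is really an error page."""
    if status >= 400:
        return True  # the server reported the failure honestly
    if final_url != requested_url and LOGIN_OR_ERROR_URL.search(final_url):
        return True  # redirected to a login or error screen
    if ERROR_TEXT.search(title):
        return True  # status 200, but the title reads like an error
    # A very short body that mentions an error is suspicious as well.
    return len(body.strip()) < 500 and bool(ERROR_TEXT.search(body))
```

Pages flagged this way can be skipped or queued for review so that error screens do not end up in the index in place of the real document.
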
The Near Future: Enterprise Search and Corporate Security
• Search & Security tied to SOX/HIPAA
• Search Logs get Regulatory Interest
  • Who Saw What, When
  • Failure to Spot Trends becomes Negligence
• Distributed Credentials Management
  • Not as big a factor in the Enterprise
  • More cooperation between e-commerce sites
  • Government employees accessing other agencies
Call to Action! Enterprise Search and Corporate Security
• Run some test searches!
• Do you know your company’s current policies?
• If confused, talk to your vendor, or get some professional help
Resources
• Search Dev Newsgroup: www.SearchDev.org
• Newsletter & Whitepapers: www.ideaeng.com/current
• Blog: www.EnterpriseSearchBlog.com
Finish Line: Review & Questions
• General Info: info@ideaeng.com
• Mark Bennett: mbennett@ideaeng.com