Search Engine Hacking
Steve at SnakeOilLabs dot com
Search Engine Hacking
1. What is SEH?
2. Tools Armoury
3. Exploiting SEH
4. Countermeasures
What is SEH?
Definition: Search Engine Hacking (SEH)
Function: noun
SEH is the malicious use of indexing technologies in order to identify, fingerprint and exploit at-risk systems, data and people.
In other words: using search engines and other indexing facilities to find juicy information and 0wnable b0x3n/w4r3z/d00dz
What is SEH? How much data are we talking about? http://searchenginewatch.com/reports/article.php/2156481
What is SEH?
But there’s more… (Whaddya mean you only thought there was Google?)
Only now there’s much more to contend with:
• IRC search engines
• BitTorrent/P2P search engines
• FTP search engines
• Flickr.com
• Blogs
• Your.application.here/search/
• Oh, and Google
Tools Armoury
• SiteDigger
• Apollo
• Wikto
• Athena
Tools Armoury
SiteDigger (http://www.foundstone.com)
• The ‘original’ Google scanning tool (other than a web browser, of course)
• Requires a Google API Key
• Uses FSDB and GHDB
• Searches deliberately restricted
• The ‘Internet Scanner’ of SEH tools
Tools Armoury
SiteDigger
• Pros
  • Slick reporting
  • Well maintained
  • FSDB sometimes outdated, but well categorized
• Cons
  • Needs Google API Key
  • Google-specific
  • Restricted searches mean stuff gets missed
• Overall
  • A good tool, ultimately crippled by restrictions
Tools Armoury
Apollo (http://worm.ccert.edu.cn/GoogleHacking/Apollo/)
• Written by Mimi & Spark of the Good Cat Studio
• No Google Key required, but still Google only
• No restrictions on search
• Similar functionality to SiteDigger, minus the snazzy reporting
Tools Armoury
Apollo
• Pros
  • No restrictions
  • No Google API Key needed
  • Auto-updates the GHDB
• Cons
  • Google-specific
  • Clunky interface
  • No direct link in results
• Overall
  • Better than SiteDigger, but needs a better reporting interface
Tools Armoury
Wikto (http://www.sensepost.com/research/wikto/)
• Port of Nikto to Windows with bells and whistles
• Google Hacking functionality a la GooScan
• Needs Google API Key
• Site orientated
• Requires registration with Foundstone’s portal!!!!
Tools Armoury Wikto • Uses a ‘Googler’ to identify directories worth investigating
Tools Armoury Wikto • ‘BackEnd’ module imports data from Googler for use in data mining…
Tools Armoury Wikto • ‘Wikto’ module functions as Nikto on other systems, with ability to import dirs from Googler and BackEnd
Tools Armoury Wikto • ‘GoogleHacks’ Module provides an automated GoogleDork searching facility
Tools Armoury
Wikto
• Pros
  • Directory harvesting via Google
  • Nikto port
• Cons
  • Google Key required
  • Complicated
  • Google-specific
• Overall
  • Feels like several tools bundled into one
Tools Armoury
Athena (http://www.snakeoillabs.com)
• The ‘original’ Search Engine Hacking tool (other than a web browser, of course)
• No API Key required
• Features GHDB editor and extensive logging functionality
• Not Google-specific!
• Manual tool
Tools Armoury
Athena
• Pros
  • Cool logging/note-taking functionality
  • Can edit GHDB information within Athena
  • Use datagrid or raw XML editing facilities
  • Designed for non-techies as well as power users
  • Suitable for Yahoo, Altavista, <your search facility here>
• Cons
  • No automation
  • Tabbed browsing would be nice
• Overall
  • Unique … so far.
Exploiting SEH
It’s easy as 1-2-3
• Load the GHDB.xml into Athena
• Select your query type
  • (and enter any filters)
• Hit Search
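The load, select, filter, search flow above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Athena works internally: the XML layout in SAMPLE_DB, the function names and the default search URL are all assumptions made for the example, and the real GHDB.xml schema may look quite different.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Hypothetical query database. This schema is illustrative only and is
# NOT the real GHDB.xml layout.
SAMPLE_DB = """
<queries>
  <query name="Directory listings">intitle:"index of"</query>
  <query name="Backup files">filetype:bak</query>
</queries>
"""

def load_queries(xml_text):
    """Step 1: load the database and return a {name: query string} mapping."""
    root = ET.fromstring(xml_text)
    return {q.get("name"): (q.text or "").strip() for q in root.iter("query")}

def build_search_url(query, filters="", base="https://www.google.com/search"):
    """Steps 2-3: combine the selected query with optional filters.

    `base` is an assumption; point it at whichever search facility
    you are testing, provided it takes the query in a `q` parameter.
    """
    terms = " ".join(part for part in (query, filters) if part)
    return base + "?" + urllib.parse.urlencode({"q": terms})

if __name__ == "__main__":
    queries = load_queries(SAMPLE_DB)
    # Restrict the check to a domain you are authorised to test.
    print(build_search_url(queries["Directory listings"], "site:example.com"))
```

Keeping the search base as a parameter reflects the point made about Athena: the same stored queries can be pointed at search facilities other than Google.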
Exploiting SEH
Thinking of buying a digital camera?
• Load Digicams.xml into Athena
• Select your camera manufacturer
  • (and enter any filters, e.g. wedding, holiday, ‘amateur’)
• Hit Go!
Exploiting non-Google SEH
An example
• Create a Catalog in Indexing Server for the file store
• Associate the Catalog with the default web site via the catalog properties
• Use the index server query object in ASP (ixsso.Query)
• Voila! Instant search facility!
Exploiting non-Google SEH Indexing Service MMC Snap-in
Exploiting non-Google SEH Example query
Exploiting non-Google SEH What happens when you’re not sure what you’re indexing?
Exploiting non-Google SEH
Things to try on your own app
• .htaccess/.htpasswd stuff
  • GET POST
  • Deny from all
• IIS Indexing
  • REM (from autoexec.bat)
  • SELECT (from backup .asp and .aspx files)
• Other stuff
  • <?php
  • #!/usr/bin/perl
  • root:0:
  • .inc, .htm, .txt, .bak
  • </>
  • <div> (try other html tags)
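Rather than typing these strings into your own search box one at a time, you can look for them in the file store before it gets indexed. The following is a minimal sketch under stated assumptions: the marker list is a subset of the slide above, and the names MARKERS, RISKY_SUFFIXES, RISKY_NAMES and audit_tree are made up for this example. It reads each file whole, so it suits small document trees rather than large binary stores.

```python
from pathlib import Path

# Tell-tale strings taken from the slide; extend for your own application.
MARKERS = ["Deny from all", "<?php", "#!/usr/bin/perl", "root:0:", "SELECT "]
# File names and extensions an indexer probably should not be serving.
RISKY_SUFFIXES = {".inc", ".bak"}
RISKY_NAMES = {".htaccess", ".htpasswd"}

def audit_tree(root):
    """Yield (path, reason) for each file that looks unsafe to index."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.name in RISKY_NAMES or path.suffix.lower() in RISKY_SUFFIXES:
            yield path, "risky file name"
            continue
        # Undecodable bytes are dropped so binary files do not raise.
        text = path.read_text(errors="ignore")
        for marker in MARKERS:
            if marker in text:
                yield path, f"contains {marker!r}"
                break

if __name__ == "__main__":
    # "." is a placeholder; pass the directory your catalog actually indexes.
    for path, reason in audit_tree("."):
        print(f"{path}: {reason}")
```

Anything this reports is a candidate for moving out of the indexed directory, which is the same advice the Countermeasures slides give.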
Countermeasures
Google-specific countermeasures
• Add the following to specific pages to be left out
  • <META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">
• Remove ‘snippets’ but still index link
  • <META NAME="GOOGLEBOT" CONTENT="NOSNIPPET">
• Stop archiving
  • <META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
• Remove my page NOW!
  • http://services.google.com:8882/urlconsole/controller
  • http://www.google.com/remove.html
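It is easy to add these tags to a template and then forget which pages actually carry them. A minimal sketch of a checker follows, using Python's standard html.parser module. The class and function names are invented for this example, and it also accepts the generic name "robots" alongside "GOOGLEBOT", which goes slightly beyond what the slide lists.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="googlebot"> / <meta name="robots">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() not in ("googlebot", "robots"):
            return
        for item in attr.get("content", "").split(","):
            if item.strip():
                self.directives.add(item.strip().lower())

def robots_directives(html):
    """Return the set of lowercase directives found in the page source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

if __name__ == "__main__":
    page = '<head><META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW"></head>'
    print(sorted(robots_directives(page)))
```

Run it over the pages you meant to hide and flag any where "noindex" or "noarchive" is missing from the result.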
Countermeasures
HTTP Server configuration countermeasures
• Robots.txt
  • Some indexing systems obey it
  • Some don’t
• .htaccess/.htpasswd
  • Make sure it’s configured properly!
• Indexing Services
  • Make sure indexed files are held in a specific directory, not the web root!
  • Figure out what you’re indexing – you’re only indexing files with specific extensions, right?
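To see what a well-behaved crawler would make of your robots.txt, Python's standard urllib.robotparser can evaluate the rules offline. The sketch below is illustrative: SAMPLE_ROBOTS, the paths and the agent name ExampleBot are made up, and, as noted above, it only tells you what an indexer that obeys the file would skip.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute your site's real file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /backup/
Disallow: /test/
"""

def blocked_paths(robots_txt, paths, agent="ExampleBot"):
    """Return the paths a compliant crawler named `agent` would not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]

if __name__ == "__main__":
    candidates = ["/index.html", "/backup/db.sql", "/test/"]
    print(blocked_paths(SAMPLE_ROBOTS, candidates))
```

Bear in mind that robots.txt is itself publicly readable, so listing a sensitive directory there also advertises it. Access control via .htaccess, or moving the files out of the web root, is the real fix.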
Countermeasures
Procedural countermeasures
• Newsgroups/Mailing lists
  • Use a hushmail/hotmail account
  • Use X-No-Archive: Yes headers in Usenet postings
  • Don’t post information about your systems, data or people
    • (e.g. specify Solaris rather than specific Solaris patch levels)
• Check for information leakage periodically
  • Don’t use site: restrictions – you want to find all occurrences that affect you, not just the ones on your site!
• Web sites
  • Ensure that backups, test data etc. are held outside of the web root.
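For scripted postings, the X-No-Archive header can be set when the message is built. This is a minimal sketch using Python's standard email.message module. The function name, addresses and newsgroup are placeholders, and the header is only a request: archives honour it voluntarily, so it complements rather than replaces the advice not to post system details.

```python
from email.message import EmailMessage

def build_posting(sender, newsgroup, subject, body):
    """Build a Usenet-style message that asks archives not to keep it."""
    msg = EmailMessage()
    msg["From"] = sender          # a throwaway account, per the slide
    msg["Newsgroups"] = newsgroup
    msg["Subject"] = subject
    msg["X-No-Archive"] = "Yes"   # honoured voluntarily by some archives
    msg.set_content(body)
    return msg

if __name__ == "__main__":
    posting = build_posting(
        "throwaway@example.invalid",   # placeholder address
        "comp.security.misc",          # placeholder newsgroup
        "Hardening question",
        "Which hardening guides do people recommend for Solaris?",
    )
    print(posting.as_string())
```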