Open Proxy Servers Kevin Guthrie & David Yakimischak January 2003. Outline. Background: what are “open proxies”? What’s the exposure? What happened? How was it done? Not an isolated case What to do. What has been taken: 51,392 Articles from 11 Titles. Proxy Servers.
Open Proxy Servers Kevin Guthrie & David Yakimischak January 2003
Outline • Background: what are “open proxies”? • What’s the exposure? • What happened? • How was it done? • Not an isolated case • What to do
Proxy Servers A proxy server is a web server that acts as an intermediary or relay station between a workstation user and the Internet.
[Diagram: a user (IP: 1.2.3.4) requests http://www.jstor.org/browse through proxy.inst.edu (IP: 2.3.4.5), which relays the same request to www.jstor.org.]
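The relay in the diagram can be modeled in a few lines of Python. This is only an illustrative sketch, not JSTOR's or any proxy product's actual code: the `Request` type, the `relay` and `origin_allows` functions, and the licensed range `2.3.4.0/24` are hypothetical, and it assumes the publisher authorizes requests by source IP address.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical licensed range covering the proxy in the diagram.
LICENSED_NETWORKS = [ip_network("2.3.4.0/24")]


@dataclass(frozen=True)
class Request:
    source_ip: str  # the address the receiving server sees
    url: str


def relay(request: Request, proxy_ip: str) -> Request:
    """Forward a request; the origin now sees the proxy's address."""
    return Request(source_ip=proxy_ip, url=request.url)


def origin_allows(request: Request) -> bool:
    """Assumed IP-based authorization check at the publisher's server."""
    addr = ip_address(request.source_ip)
    return any(addr in net for net in LICENSED_NETWORKS)


direct = Request("1.2.3.4", "http://www.jstor.org/browse")
via_proxy = relay(direct, proxy_ip="2.3.4.5")
```

Under these assumptions the same URL is refused when it arrives directly from 1.2.3.4 and accepted when it arrives from the proxy, because the origin can only see the proxy's address.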
Proxy Servers Common Reasons for Their Use • Caching • Remote access • Usage tracking • Controlled access • Approved filtering
What is an “open” proxy server? • A proxy server has a configuration process that specifies who is authorized to use it, similar to the configuration process for any web server. • When a proxy server is not set up with the appropriate access controls, anyone can access that machine and “assume its identity.”
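A minimal sketch of the access control described above, assuming the proxy decides by client IP address. The function name `proxy_accepts` and the convention that an empty allowlist means "accept everyone" are hypothetical, chosen only to show how a missing configuration produces an open proxy.

```python
from ipaddress import ip_address, ip_network


def proxy_accepts(client_ip, allowed_networks):
    """Return True if the proxy would relay traffic for this client.

    An empty allowlist models the misconfigured ("open") case:
    no access control is applied, so every client is accepted.
    """
    if not allowed_networks:
        return True
    addr = ip_address(client_ip)
    return any(addr in ip_network(net) for net in allowed_networks)
```

With a campus range such as `["1.2.3.0/24"]` configured, an outside address is refused; with no range configured, the same outside address is relayed.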
“Open” Proxy Servers: How and Why Are They Created • Some are organizational or departmental proxy servers that were incorrectly configured. • Some are set up intentionally to provide access to restricted resources (probably for convenience). • We believe many are set up accidentally as an unknown by-product of setting up a web server.
A List of Open .edu Proxies [The server hostnames have been edited to protect the institutions with open proxy servers listed on this page.]
JSTOR Monitors Use • We have triggers to alert us to unusual levels of usage activity • We investigate when usage seems unusual
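The slides do not say how the triggers work. One simple possibility, shown purely as an assumption, is a per-IP download count compared against a threshold; the function name `flag_unusual`, the log shape, and the threshold value are all hypothetical.

```python
from collections import Counter


def flag_unusual(download_log, threshold):
    """Return the IPs whose download count exceeds the threshold.

    download_log is an iterable of (ip, article_id) pairs, a
    hypothetical simplification of a web server access log.
    """
    counts = Counter(ip for ip, _article in download_log)
    return sorted(ip for ip, n in counts.items() if n > threshold)
```

A flagged address would then be handed to staff for the manual investigation the slide mentions, rather than blocked automatically.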
The Abuse: What Happened • August 22nd to the 27th: 13,413 articles were downloaded from Proxy #1. On August 27th we denied this IP address access to JSTOR. • August 26th to September 4th: 3,859 articles were downloaded from Proxy #2 at a different participating site. On September 4th we denied the IP address of this second proxy.
The Abuse: What Happened • It appeared the two abuse situations were related: • There was an overlap in journals downloaded, but not an overlap in articles downloaded. • Analysis of our log files showed that the URLs being downloaded via Proxy #2 were created through use at Proxy #1.
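The overlap comparison above amounts to two set intersections. The sketch below assumes each proxy's downloads have already been extracted from the logs as (journal, article) pairs; the function name `overlap` and that data shape are hypothetical, not a description of JSTOR's actual log analysis.

```python
def overlap(downloads_a, downloads_b):
    """Compare two sets of (journal, article_id) pairs.

    Returns (shared_journals, shared_articles). Shared journals with
    no shared articles suggests one job split across two proxies.
    """
    journals_a = {journal for journal, _ in downloads_a}
    journals_b = {journal for journal, _ in downloads_b}
    shared_journals = journals_a & journals_b
    shared_articles = set(downloads_a) & set(downloads_b)
    return shared_journals, shared_articles
```

The pattern reported on the slide corresponds to a non-empty first result and an empty second one.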
The Abuse: The Pattern Continues • Between August 27th and October 31st, downloads occurred from 27 open proxy servers at 16 different sites. • As JSTOR staff denied each proxy server, the abuse moved on. • In total, about 51,000 articles were downloaded from 11 journals.
How Was It Done? Automate the Process • Download lists of open proxies. • Automate a process to probe each one to see whether it has access to restricted resources. • Identify the set of open proxy servers with such access and set them aside. • Automate a process to download content. • Commence downloading from the “confirmed” list.
Not an Isolated Case We have found web pages providing explicit instructions for others to help them exploit open proxies in order to download content.
Not an Isolated Case - Translations • “The Bible for Downloading Journal Articles” • “To be blunt about it, you find an overseas proxy. The institution that the proxy server belongs to has spent money to buy the electronic edition of some journal, and then you use this proxy, (so) of course you can download the entire text of that journal!” • “I cannot deny that some servers can download complete texts from many journals, but please, everyone, let’s not grab onto the ones which are easy to use and use them madly. The result of doing so will be to hasten the death of that server! So when you are using them, it’s best to do so equitably!”
What to do? • Shibboleth http://shibboleth.internet2.edu/ • DLF Certificates http://www.diglib.org/architectures/digcert.htm • Education • Drive all campus access through a set of properly authenticated proxy servers