Choosing a Proxy - Don’t roll the D20!. Leif Hedstrom Cisco WebEx. Who am I?. Unix developer since 1985 Yeah, I’m really that old, I learned Unix on BSD 2.9 Long time SunOS/Solaris/Linux user Mozilla committer (but not active now) VP of Apache Traffic Server PMC ASF member
Choosing a Proxy - Don’t roll the D20! Leif Hedstrom Cisco WebEx
Who am I? • Unix developer since 1985 • Yeah, I’m really that old, I learned Unix on BSD 2.9 • Long time SunOS/Solaris/Linux user • Mozilla committer (but not active now) • VP of Apache Traffic Server PMC • ASF member • Overall hacker, geek and technology addict zwoop@apache.org @zwoop +lhedstrom
Plenty of Proxy Servers PerlBal
Answer: the one that solves your problem! http://mihaelasharkova.files.wordpress.com/2011/05/5steploop2.jpg
But first… • While you are still awake, and the coffee is fresh: My crash course in HTTP proxy and caching!
Why Cache is King • The fastest content to serve is data the user already has locally, on their computer or in their browser • This is near zero cost and zero latency! • The speed of light is still a limiting factor • Reduce the latency -> faster page loads • Serving out of cache is computationally cheap • At least compared to e.g. PHP or any other higher level page generation system • It’s easy to scale caches horizontally
Plenty of Proxy Servers PerlBal
Plenty of Free Proxy Servers PerlBal
The problem • You can basically not buy a computer today with less than 2 CPUs or cores • Things will only get “worse”! • Well, really, it’s getting better • Typical server deployments today have at least 8 – 16 cores • How many of those can you actually use?? • And are you using them efficiently?? • NUMA turns out to be kind of a bitch…
Problems with multi-threading • It’s a wee bit difficult to get it right! http://www.flickr.com/photos/stuartpilbrow/3345896050/
Problems with Event Processing • It hates blocking APIs and calls! • Hating it back doesn’t help :/ • Still somewhat complicated • It doesn’t scale on SMP by itself
Where are we at ? *) Can use blocking calls, with (large) thread pool
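The footnote about blocking calls and a thread pool can be illustrated with a minimal Python sketch. It is not taken from any of the proxies discussed here: `blocking_lookup`, `handle_request` and `serve` are hypothetical names, and the 0.2 second sleep merely stands in for a blocking API. The idea is that the event loop never runs the blocking call itself; it hands it to a pool of worker threads and keeps processing other events.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_lookup(key):
    """Hypothetical stand-in for a blocking API (disk read, DNS, legacy library)."""
    time.sleep(0.2)
    return f"value-for-{key}"


async def handle_request(loop, pool, key):
    # Hand the blocking call to the thread pool so the event loop stays free.
    return await loop.run_in_executor(pool, blocking_lookup, key)


async def serve(keys):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=8) as pool:
        start = time.monotonic()
        # All requests are in flight at once; each one blocks a pool thread,
        # not the event loop.
        results = await asyncio.gather(
            *(handle_request(loop, pool, k) for k in keys)
        )
        elapsed = time.monotonic() - start
    return list(results), elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(serve(["a", "b", "c", "d"]))
    print(results, f"{elapsed:.2f}s")
```

Four lookups run one after another would take at least 0.8 seconds; with the pool they overlap. The trade-off is that the pool has to be large enough for the number of blocking calls that can be outstanding at once.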
Proxy Cache test setup • AWS Large instances, 2 CPUs • All on RFC 1918 network (“internal” net) • 8GB RAM • Access logging enabled to disk (except on Varnish) • Software versions • Linux v3.2.0 • Traffic Server v3.3.1 • Nginx v1.3.9 • Squid v3.2.5 • Varnish v3.0.3 • Minimal configuration changes • Cache a real (Drupal) site
ATS configuration • etc/trafficserver/remap.config: map / http://10.118.154.58 • etc/trafficserver/records.config: CONFIG proxy.config.http.server_ports STRING 80
Nginx configuration try 1, basically defaults (broken, don’t use)
    worker_processes 2;
    access_log logs/access.log main;
    proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                     max_size=16384m inactive=600m;
    proxy_temp_path /mnt/nginx_temp;
    server {
        listen 80;
        location / {
            proxy_pass http://10.83.145.47/;
            proxy_cache my-cache;
        }
    }
Nginx configuration try 2 (works but really slow, 10x slower)
    worker_processes 2;
    access_log logs/access.log main;
    proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                     max_size=16384m inactive=600m;
    proxy_temp_path /mnt/nginx_temp;
    gzip on;
    server {
        listen 80;
        location / {
            proxy_pass http://10.83.145.47/;
            proxy_cache my-cache;
            # Always fetch uncompressed from the origin; nginx gzips on the way out
            proxy_set_header Accept-Encoding "";
        }
    }
Nginx configuration try 3 (works and reasonably fast, but WTF!)
    worker_processes 2;
    access_log logs/access.log main;
    proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                     max_size=16384m inactive=600m;
    proxy_temp_path /mnt/nginx_temp;
    server {
        listen 80;
        # Collapse Accept-Encoding into two classes: "gzip" or ""
        set $ae "";
        if ($http_accept_encoding ~* gzip) {
            set $ae "gzip";
        }
        location / {
            proxy_pass http://10.83.145.47/;
            proxy_cache my-cache;
            # Strip conditional headers so the origin always returns a full response
            proxy_set_header If-None-Match "";
            proxy_set_header If-Modified-Since "";
            proxy_set_header Accept-Encoding $ae;
            # Cache gzip and non-gzip variants under separate keys
            proxy_cache_key $uri$is_args$args$ae;
        }
        # proxy_cache_purge comes from the third-party ngx_cache_purge module
        location ~ /purge_it(/.*) {
            # Purge key must match proxy_cache_key, with $1 standing in for $uri
            proxy_cache_purge my-cache $1$is_args$args$ae;
        }
    }
Thanks to Chris Ueland at NetDNA for the snippet
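To make the trick in try 3 easier to follow, here is a small Python sketch of what the `set $ae` / `proxy_cache_key` pair computes. It is only an illustration, not nginx code: `normalize_accept_encoding` and `cache_key` are hypothetical names, and the substring test mirrors the case-insensitive `~* gzip` match, including its weakness that a header such as `gzip;q=0` would still count as gzip.

```python
def normalize_accept_encoding(header_value):
    """Mirror of: set $ae ""; if ($http_accept_encoding ~* gzip) { set $ae "gzip"; }"""
    if header_value and "gzip" in header_value.lower():
        return "gzip"
    return ""


def cache_key(uri, args, accept_encoding):
    """Mirror of: proxy_cache_key $uri$is_args$args$ae;"""
    is_args = "?" if args else ""  # $is_args is "?" only when a query string exists
    return f"{uri}{is_args}{args}{normalize_accept_encoding(accept_encoding)}"


if __name__ == "__main__":
    print(cache_key("/node/1", "", "gzip, deflate"))        # /node/1gzip
    print(cache_key("/node/1", "", "deflate, GZIP;q=0.8"))  # /node/1gzip
    print(cache_key("/node/1", "page=2", None))             # /node/1?page=2
```

Every client ends up in one of two buckets per URL, so the cache holds at most two variants per page instead of one per distinct `Accept-Encoding` string.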
Squid configuration
    http_port 80 accel
    http_access allow all
    cache_mem 4096 MB
    workers 2
    memory_cache_shared on
    cache_dir ufs /mnt/squid 100 16 256
    cache_peer 10.83.145.47 parent 80 0 no-query originserver
Varnish configuration
    backend default {
        .host = "10.83.145.47";
        .port = "80";
    }
RFC 2616 is not optional! • Neither is the new BIS revision! • Understanding HTTP and how it relates to Proxy and Caching is important • Or you will get it wrong! I promise.
How things can go wrong: Vary!
    $ curl -D - -o /dev/null -s --compressed http://10.118.73.168/
    HTTP/1.1 200 OK
    Server: nginx/1.3.9
    Date: Wed, 12 Dec 2012 18:00:48 GMT
    Content-Type: text/html; charset=utf-8
    Content-Length: 8051
    Connection: keep-alive
    X-Powered-By: PHP/5.4.9
    X-Drupal-Cache: HIT
    Etag: "1355334762-0-gzip"
    Content-Language: en
    X-Generator: Drupal 7 (http://drupal.org)
    Cache-Control: public, max-age=900
    Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
    Expires: Sun, 19 Nov 1978 05:00:00 GMT
    Vary: Cookie,Accept-Encoding
    Content-Encoding: gzip
How things can go wrong: Vary!
Note: no gzip support
    $ curl -D - -o /dev/null -s http://10.118.73.168/
    HTTP/1.1 200 OK
    Server: nginx/1.3.9
    Date: Wed, 12 Dec 2012 18:00:57 GMT
    Content-Type: text/html; charset=utf-8
    Content-Length: 8051
    Connection: keep-alive
    X-Powered-By: PHP/5.4.9
    X-Drupal-Cache: HIT
    Etag: "1355334762-0-gzip"
    Content-Language: en
    X-Generator: Drupal 7 (http://drupal.org)
    Cache-Control: public, max-age=900
    Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
    Expires: Sun, 19 Nov 1978 05:00:00 GMT
    Vary: Cookie,Accept-Encoding
    Content-Encoding: gzip
EPIC FAIL! (A gzip-encoded body is served from cache to a client that never sent Accept-Encoding: gzip.)
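What a cache is expected to do with `Vary` can be sketched in a few lines of Python. This is a toy model under simplifying assumptions, not the implementation of any proxy in this comparison: the `VaryAwareCache` class and its methods are hypothetical, header values are compared as plain strings, and freshness and eviction are ignored. The point is that the request headers named in `Vary` become part of the cache key, so a gzip variant is never handed to a client that did not ask for it.

```python
class VaryAwareCache:
    """Toy cache that keys stored responses on the headers listed in Vary."""

    def __init__(self):
        self._vary = {}      # url -> tuple of lowercased header names from Vary
        self._variants = {}  # (url, request header values) -> (headers, body)

    @staticmethod
    def _parse_vary(response_headers):
        raw = response_headers.get("Vary", "")
        return tuple(sorted(h.strip().lower() for h in raw.split(",") if h.strip()))

    @staticmethod
    def _secondary_key(names, request_headers):
        lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
        return tuple(lowered.get(name, "") for name in names)

    def store(self, url, request_headers, response_headers, body):
        names = self._parse_vary(response_headers)
        self._vary[url] = names
        key = (url, self._secondary_key(names, request_headers))
        self._variants[key] = (response_headers, body)

    def lookup(self, url, request_headers):
        names = self._vary.get(url)
        if names is None or "*" in names:
            return None  # nothing stored, or "Vary: *" which never matches
        return self._variants.get((url, self._secondary_key(names, request_headers)))


if __name__ == "__main__":
    cache = VaryAwareCache()
    response = {"Vary": "Cookie,Accept-Encoding", "Content-Encoding": "gzip"}
    cache.store("/", {"Accept-Encoding": "gzip"}, response, b"<gzip bytes>")
    print(cache.lookup("/", {"Accept-Encoding": "gzip"}) is not None)  # True
    print(cache.lookup("/", {}))                                       # None
```

With this keying, the second curl request above (no `Accept-Encoding`) would be a cache miss and go back to the origin rather than receive the stored gzip body.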
What type of proxy do you need? • Of our candidates, only two fully support all proxy modes!
CoAdvisor HTTP protocol quality tests for reverse proxies 49% 81% 51% 68%
CoAdvisor HTTP protocol quality tests for reverse proxies 25% 6% 27% 15%
ATS – The good • Good HTTP/1.1 support, including SSL • Tunes itself very well to the system / hardware at hand • Excellent cache features and performance • Raw disk cache is fast and resilient • Extensible plugin APIs, quite a few plugins • Used and developed by some of the largest Web companies in the world
ATS – The bad • Load balancing is incredibly lame • Seen as difficult to set up (I obviously disagree) • Developer community is still too small • Code is complicated • By necessity? Maybe …
ATS – The ugly • Too many configuration files! • There’s still legacy code that has to be replaced or removed • Not a whole lot of commercial support • But there’s hope (e.g. OmniTI recently announced packaged support)
Nginx – The good • Easy to understand the code base, and software architecture • Lots of plugins available, including SPDY • Excellent Web and Application server • E.g. Nginx + fpm (fcgi) + PHP is the awesome, according to a very reputable source • Commercial support available from the people who wrote and know it best. Huge!
Nginx – The bad • Adding extensions implies rebuilding the binary • Requires by far the most configuration “out of the box” to do anything remotely useful • It does not make good attempts to tune itself to the system • No good support for conditional requests
Nginx – The ugly • The cache is a joke! Really • The protocol support as an HTTP proxy is rather poor. It fares the worst in the tests, and can be outright wrong if you are not very careful • From docs: “nginx does not handle "Vary" headers when caching.” Seriously?