APV Technical Training Chapter 7 - Reverse Proxy Cache
Objectives • Learn Web caching concepts and configuration.
Topics • Unit 1: Caching Concepts • Unit 2: Array Cache Overview • Unit 3: Array Cache Configuration
Unit 1: Caching Concepts • What is a Proxy Cache? • Proxy Cache Types • Cache Control Headers • Cacheable content
What is a Proxy Cache? • Proxy caching is the transient storage of HTML data and graphic files on an intermediary device between the client and the origin Web server. • Two main types of proxy cache: • Forward proxy cache • Reverse proxy cache • The Array appliance supports reverse proxy caching.
Forward Proxy Cache • A forward proxy cache relays requests from clients on the local network to other sites on the Internet and caches the responses, thus reducing the amount of data transferred on external links. • Non-transparent mode: • Requires clients to make requests directly to the proxy. • Example: the Web browser must be configured to point to the proxy cache. • Transparent mode: • Clients have no knowledge of the proxy cache. • Client traffic is redirected to the proxy by a content switch. • Example: Array ADC appliance, tProxy feature.
Forward Proxy Cache • Example 1 • Non-transparent proxy cache in enterprise DMZ • Reduces WAN bandwidth utilization.
Forward Proxy Cache • Example 2 • Transparent proxy cache at ISP POP • Reduces backbone utilization.
Reverse Proxy Cache • A reverse proxy cache receives requests from clients all over the Internet and responds to them in conjunction with the fixed set of origin servers it is configured to interact with. • Positioned directly in front of the Web servers; helps accelerate HTTP and SSL traffic. • Reduces resource utilization on the origin Web servers. • Reduces content delivery latency. • Reduces network traffic with advanced cache directives (304 Not Modified).
Reverse Proxy Cache • Example: reverse proxy cache • Reduces load on origin servers.
How Web Caches Work • Caches use a set of cache rules to determine: • Whether or not to cache a server response • Whether to respond to the client with a cached object, or let the client use its own cached object • Basic cache rules are set by the HTTP protocol (HTTP 1.0 and 1.1). The Array ADC appliance supports most of HTTP 1.1. • Some cache rules are set by the administrator. The Array ADC appliance lets the administrator set a “Cache Policy” to override protocol cache rules. • HTTP traffic falls into two categories: • Cacheable content • Non-cacheable content
What Is Cacheable? • Whether content is cacheable is determined by: • HTTP headers • A “Cache-Control: private” header indicates that content is intended for a single user and should not be stored in a public cache. • A “Cache-Control: public” header indicates that any cache may store the content. • HTTP response status code • A response with 200 (OK) status is cached unless HTTP headers disallow it. • A response with 404 (Not Found) status is not cached. • Response size (implementation specific, configurable on Array) • User-configurable parameter with a default value of 5 MB
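The three checks on this slide (headers, status code, size) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the appliance's actual logic: the function name `is_cacheable` is hypothetical, only 200 (OK) is treated as cacheable, and `no-store` is included as a standard HTTP directive even though the slide does not mention it.

```python
MAX_OBJECT_SIZE = 5 * 1024 * 1024  # 5 MB default quoted on the slide


def is_cacheable(status, cache_control, size):
    """Return True if a shared (proxy) cache may store the response.

    Hypothetical helper for illustration; real caches apply more rules.
    """
    # Split "public, max-age=60" into {"public", "max-age=60"}.
    directives = {d.strip().lower() for d in cache_control.split(",") if d.strip()}
    if status != 200:
        # Simplification: only 200 (OK) responses are stored.
        return False
    if "private" in directives or "no-store" in directives:
        # "private" is meant for a single user; "no-store" forbids storing at all.
        return False
    # Objects larger than the configured maximum are not stored.
    return size <= MAX_OBJECT_SIZE
```

For example, `is_cacheable(200, "public, max-age=60", 1024)` is accepted, while the same response marked `private` is rejected.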
Content Expiration • Basic formula: • A cached response is fresh while Freshness Lifetime > Age • Freshness lifetime: • Time for which a cached response can be used to service requests • Default value is 23 hours • Determined by values in the “Cache-Control” and “Expires” headers • Example: “Expires: Tue, 17 Dec 2002 18:00:00 GMT” tells the cache to re-fetch the response from the origin server when a request for that object arrives after Tue, 17 Dec 2002 18:00:00 GMT • Age: • Time since the response was generated by the server • Determined by the “Age” and “Date” headers and the arrival times of the request and response
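A minimal Python sketch of the freshness check follows. It assumes the standard HTTP precedence (a `max-age` directive wins over `Expires`) and falls back to the 23-hour default from the slide; the function names are hypothetical and the age is passed in directly rather than derived from the “Age” and “Date” headers.

```python
from email.utils import parsedate_to_datetime

DEFAULT_LIFETIME = 23 * 3600  # 23-hour default from the slide, in seconds


def freshness_lifetime(headers):
    """Freshness lifetime in seconds: max-age, else Expires - Date, else default."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        # A response that expired before it was generated has zero lifetime.
        return max(0, int((expires - date).total_seconds()))
    return DEFAULT_LIFETIME


def is_fresh(headers, age_seconds):
    """Basic formula from the slide: fresh while lifetime > age."""
    return freshness_lifetime(headers) > age_seconds


headers = {
    "Date": "Tue, 17 Dec 2002 12:00:00 GMT",
    "Expires": "Tue, 17 Dec 2002 18:00:00 GMT",
}
print(freshness_lifetime(headers))  # 21600 (six hours)
print(is_fresh(headers, 3600))      # True: one hour old
print(is_fresh(headers, 30000))     # False: older than six hours
```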
Content Revalidation • A client browser can cache web objects. When the client requests an object that exists in its local cache, the request carries information about the local copy (such as an ETag or checksum). Caches and servers can then decide whether or not to send the complete object to the client. • The proxy revalidates a cached response when: • The response has expired. • The freshness lifetime of the response becomes less than the response age. • Request headers require revalidation. • “Cache-Control: max-age=0” in the request indicates that the cached response needs to be revalidated with the origin server. • Response headers require revalidation. • “Cache-Control: no-cache” indicates that the proxy cache may store the content, but may only serve the cached content after first revalidating it with the origin server.
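The three revalidation triggers listed on this slide can be expressed as one small decision function. The Python below is a sketch only; `needs_revalidation` and its parameters are hypothetical names, and the freshness lifetime and age are assumed to be already computed in seconds.

```python
def needs_revalidation(age, lifetime, request_cc="", response_cc=""):
    """Return True if the proxy must revalidate with the origin server."""

    def directives(value):
        # "no-cache, max-age=0" -> {"no-cache", "max-age=0"}
        return {d.strip().lower() for d in value.split(",") if d.strip()}

    if lifetime <= age:
        return True  # the response has expired
    if "max-age=0" in directives(request_cc):
        return True  # the client asked for revalidation
    if "no-cache" in directives(response_cc):
        return True  # the server requires revalidation before every reuse
    return False
```

A fresh response with no special directives is served from the cache; any one of the three conditions sends a conditional request to the origin server first.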
Unit 2: Array Cache Overview • Array Cache Basic Operation Principle • Array Cache Functions • Array Cache Benefits • Implementation Example
Array Cache Operation • Cache lookup is by “Host Name” + “URL” • Vary header support needs more than these two keys • The URL may include a dynamic part (such as query data) • Cache memory size is around 1/2 of the unit memory size • In TM 8.x it is 1/4 of physical memory • 64K cache object entries • Array recommends caching the following content: • Static Web pages: shareable HTML • Static images and documents: JPG, GIF, PNG, PDF, DOC, etc. • Files delivered via HTTP: MPG, Flash, QuickTime, etc. • Requests for shareable objects generated by dynamic scripts are passed to the server • “/…/a.php?...” (CGI, Perl, PHP, JSP, ASP, etc.)
Array Cache Functions • Standards compliant • HTTP 1.0 and HTTP 1.1 (RFC 2616) compliant • Supports all Cache-Control headers • Range header support • The client can request part of an object. • Vary header support • Multiple copies with different content may be cached per server negotiation. • Flexible, customer-defined cache filter that can override protocol cache control • Management functions • Cached object names and associated information can be listed • Cached objects can be manually expired • Cached content can be manually flushed • Cached objects can be locked (prevents manual flushing of the object) • Logging in Squid, common, combined or custom format
Array Cache Benefits • Copes efficiently with bursts of heavy HTTP traffic • Deals with “flash crowd/spike” events • Mitigates Denial of Service (DoS) attacks against the origin server • Unavailable for TM8.1. • Accelerates content delivery • Content is cached only in RAM for quick access (not on hard disk). • No general-purpose OS overhead; cached content is stored in network buffers (not on disk or in virtual memory) so it can be placed on the wire very fast. • Optimized TCP/HTTP protocol stack able to deliver content for a million connections • Reduces bandwidth usage • No need to fetch the same content from the server repeatedly • Can instruct the client to use its own locally cached data (304 Not Modified) without transmitting the object itself. • Reduces resource utilization on origin Web servers. • Requests for cached content are served by the cache instead of the origin servers.
Implementation Example • Without Caching enabled • High load on real servers because they must handle all client connections. • Adding more servers to handle the load increases capital and operational costs. • With Array Caching enabled • All requested files can be stored in the cache and served directly to web clients, thus eliminating much of the load on the real servers. • Number of servers can be reduced significantly. • Caching license (cost = 2 servers) • Load handling capacity substantially higher than 2 servers.
Unit 3: Array Cache Configuration • Enabling/disabling the Cache • Cache Settings • Displaying Cache Settings • Displaying Cache Status • Displaying Cache Statistics • Clearing the Cache • Locking/Expiring Cache • Displaying Cache Content • Setting Cache Policies • Cache Filter Configuration • Cache Override Rules
Cache Configuration • Cache configuration commands allow administrators to specify which cacheable elements will be stored. • The administrator may need to understand which content of the web application is cacheable and which is non-cacheable to make cache operation more effective. • To turn on the cache: • cache on • Content will be served from the cache depending on its cache-control headers. • To turn off the cache: • cache off • The Array appliance will rely completely on the SLB subsystem for every request that comes to the device until the cache function is reactivated.
Cache Settings • To set cache tuning parameters: • cache settings expire <hh:mm:ss | ss> • hh:mm:ss | ss: time value used by the Array appliance as the expiration time of a cached object if the server does not provide expiration information in the response (default=23:00:00). • cache settings objectsize <max_size> • max_size: maximum object size to cache (minimum=1 KB, default=5120 KB) • Adjust this parameter according to the size of the content stored on the servers. • cache settings flowthrough <on | off> • on: send data to the client while it is still being received. off: wait until the whole object has been received before responding to the client with the cached data. • Removed from TM 8.1.x (always on in TM 8.1.x) • cache settings lowresource <keepclient | closeclient> • Ignore (closeclient) or factor in (keepclient) the presence of attached clients when selecting objects to replace (default=keepclient). • Removed from TM 8.1.x • cache settings replacement <number_of_objects> • number_of_objects: number of cached objects to be replaced when the cache becomes full (minimum=1, default=10). • Removed from TM 8.1.x
Cache Settings • Example: • Expire cache entries after 12 hours. • Set a maximum cacheable object size of 10000 KB. • Factor in attached clients when selecting objects to replace.
AN(config)# cache settings expire 12:00:00
AN(config)# cache settings objectsize 10000
AN(config)# cache settings lowresource keepclient
Cache Settings • To display cache settings: • show cache settings
AN(config)#show cache settings
Cache Configuration:
Cache Default Expiration: 82800 seconds
Maximum Cacheable Object Size: 5120 KB
Cache Replacement Number: 10
Cache Replacement Selection: factor in clients
Send cache object to client: AFTER caching
Cache Status • To show cache status: • show cache status
AN(config)#show cache status
reverse proxy cache: enable
Cache Statistics • To show cache statistics: • show statistics cache
AN(config)#show statistics cache
Reverse Proxy Cache Basic Statistics:
Requests received: 9
Requests with GET method: 7
Requests with HEAD method: 0
Number of open client connections: 1
Number of open server connections: 2
Cache miss, new entry created: 4
Cache miss, noncacheable requests: 2
Cache revalidate: 0
Cache hit, reply using cache: 2
Cache hit, reply with "Not Modified": 1
Hit ratio: 42.85%
(Reverse Proxy Cache Advanced Statistics follows)
Advanced Statistics:
Number of cache objects: 202
Number of cache frames used: 260
Successful cache probes: 1
Cache revalidate, request with "no-cache": 0
Cache revalidate, client IMS forward: 0
Cache revalidate, proxy IMS forward: 0
Cache revalidate, not modified: 0
Cache miss, requests with cookies: 0
Cache miss, requests with range: 0
Cache miss, HTTP version mismatch: 0
Cache miss, IMS mismatch: 0
Cache miss, server driven negotiation: 1
Cache miss, negative entry hit: 0
Cache miss, requests with content: 0
Cache miss, mismatch vary header: 0
Requests redirected to HTTPS: 0
Requests redirected based on regex match: 0
Requests forwarded with rewritten url: 0
Locations rewritten to HTTPS: 0
Locations rewritten based on regex match: 0
Cache/Network buffer size (KB): 2031360
Percent cache/network buffer used: 8.54%
Cache Management • To flush all cache content: • clear cache content • All cached objects will be removed. • To flush specific cache content: • cache evict <hostname> <regex> • Cached objects matching the hostname and regular expression will be removed from the cache. • (Removed in TM 8.1) • To lock an object in the cache: • [no] cache lock <hostname> <regex> • Cached objects matching the hostname and regular expression will be immune to “clear cache content” and “cache evict”. • (Removed in TM 8.1)
Cache Management • To expire an object in the cache: • [no] cache expire <hostname> <URL regex> <hh:mm:ss | ss> • Cached objects matching the hostname and regular expression will be assigned an expiration time. This expiration time takes precedence over the expiration time given by the server. • Integrated into “cache filter …” for TM8.x. • cache filter rule <hostname> <URL regex> <ttl=seconds> • Available for both TM6.x and TM8.x • To view information associated with cached objects: • show cache content <hostname> <URL regex> • The URL and other information associated with cached objects matching the hostname and regular expression will be displayed.
Cache Filter • The Array ADC appliance supports a “cache filter” that lets the unit administrator define custom caching rules to take full advantage of caching. • Without a cache filter, applications that follow the HTTP cache protocol cannot benefit from caching when: • A client request or server response carries Cache-Control “no-cache” for shared objects such as logo.jpg. • The client request URL contains dynamic query information that cannot match any cached object. • The client request carries cookie information, which is assumed to require server processing to generate a dynamic object. • Etc.
Cache Filter Overview • The Array ADC appliance provides a simple “cache filter …” command to enable cache policy. • The cache filter bypasses HTTP header cache-control information for both requests and responses. • The cache filter is designed for simple customer use: it applies the best (longest) match rule, not the first match, to specify which web objects, e.g. “*.jpg”, are cached or not. • The cache filter is a convenient way to control how long an object stays cached. This is useful for dynamic objects or web objects that do not conform to HTTP 1.1.
Cache Filter – Command Format • Cache filter CLI command: cache filter rule <hostname> <URL> "cache=yes|no" "override=yes|no" "ttl=n" • “hostname” and “URL” define the address the cache filter is imposed on. • Hostname must be an exact match; it cannot contain a regular expression. • URL is based on a Perl-like regular expression. • "/": matches all URLs (since “/” is always at the start of a URL, "/" = "/*") • "/upload/": matches URLs that contain the “/upload/” directory • "/*.exe": matches all exe files • "/image/*.jpg": matches all jpg files under the /image directory • cache=yes|no: force caching (yes) or no caching (no). • override=yes|no: ignore the cache-control header. • urlquery=yes: ignore the query part of the URL. New in TM8.1.x. • ttl=n: time to live, in seconds
Cache Filter – Option Combinations • "cache=yes override=yes" • The Array appliance forces the object to be cached and ignores headers or cache-control directives such as “no-store”, “no-cache” and “private”. • "cache=no override=yes" • Forces the object not to be cached even if it is “public” or “max-age” is set. • "cache=yes override=no" • The Array appliance first decides whether to cache the object according to the headers and cache-control directives returned by the server. If there is no cache-related header or cache-control directive, the Array appliance caches the object and sets its freshness time to n seconds (the ttl value).
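The combinations on this slide can be read as a small decision table. The Python sketch below is one possible reading, not the appliance's implementation: the function `should_cache` and its parameters are hypothetical, and the behaviour for "cache=no override=no" (follow the server, otherwise do not cache) is an assumption, since the slide does not describe that combination.

```python
def should_cache(cache, override, server_decision):
    """Decide whether an object matched by a filter rule is cached.

    cache, override: booleans for the rule's cache=yes|no and override=yes|no.
    server_decision: True/False if the server's headers settle the question,
                     None if the response carries no cache-related headers.
    """
    if override:
        # The rule wins regardless of what the server's headers say.
        return cache
    if server_decision is not None:
        # override=no: the server's headers are consulted first.
        return server_decision
    # No cache-related headers: fall back to the rule
    # (for cache=no this fallback is an assumption).
    return cache
```

With `cache=True, override=True` the object is cached even when the server says otherwise; with `cache=True, override=False` it is cached only if the server allows it or is silent.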
Cache Filter - Examples • Specific types of file should be cached; all other files follow the server’s cache directives • cache filter rule www.xyz.com "/*.jpg" "cache=yes" • cache filter rule www.xyz.com "/*.gif" "cache=yes" "ttl=200000" • cache filter rule www.xyz.com "/*.html" "cache=yes" "ttl=200000" • Specific types of file should be cached; all other files should NOT be cached • cache filter rule www.xyz.com "/*.jpg" "cache=yes" • cache filter rule www.xyz.com "/*.gif" "cache=yes" "ttl=200000" • cache filter rule www.xyz.com "/*.html" "cache=yes" "ttl=200000" • cache filter rule www.xyz.com "/" "cache=no" • Specific types of file should NOT be cached; all other files follow the server’s cache directives • cache filter rule www.xyz.com "/*.jpg" "cache=no" • cache filter rule www.xyz.com "/*.gif" "cache=no" • cache filter rule www.xyz.com "/*.html" "cache=no" • Specific types of file should NOT be cached; all other files should be cached • cache filter rule www.xyz.com "/*.jpg" "cache=no" • cache filter rule www.xyz.com "/*.gif" "cache=no" • cache filter rule www.xyz.com "/*.html" "cache=no" • cache filter rule www.xyz.com "/" "cache=yes"
Cache Policies • To enable/disable/show caching of responses with cookies: • [show] cache policy response setcookie <cache|nocache> • Enables (cache) or disables (nocache) caching of responses with a Set-Cookie header. • TM8.x: cache filter rule <host> <url> "cache=yes" • To enable/disable/show serving responses to requests with cookies: • [show] cache responsecookierequest <enable/disable> • Enables or disables serving responses from the cache to requests with a Cookie header. • TM8.x: cache filter rule <host> <url> "cache=yes" • To enable/disable/show byte range support: • [show] cache responsebyterange <enable/disable> • Enables or disables serving responses to range requests from the cache. • Not available in TM8.x; with TM 8.x, byte ranges are supported by the server.
Cache Policy – URL Query • A request URL can carry dynamic query information; the query must be ignored to get a cache hit • http://www.a.com/test/logo.jpg/php?test=1234&ID=45679... • Configuration example • To enable cache support for requests to a specific URL with varying query information: • TM 6.x • AN(config)# cache filter rule www.a.com "/*.jpg" "cache=yes" • AN(config)# cache override request urlquery settings rules • AN(config)# cache override request urlquery rules 1 www.a.com "/test/logo.jpg" • TM 8.x • AN(config)# cache filter rule www.a.com "/*.jpg" "cache=yes" • AN(config)# cache filter rule www.a.com "/*.jpg" "urlquery=yes"
Cache Policy – Vary Header • The Vary header is an HTTP header generated by the HTTP server to indicate (for use by intermediate caches) that the content of the URI varies based on specified aspects of the client request. • For example, the same URI requested from different mobile phones may return different objects because the phones' display sizes differ. • The server may generate content based on the preferred language in the client request: English, Japanese, Chinese, etc. • The Vary header and the client request “aspects” must match for a cache hit. • Available in TM 6.x; not in TM8.x.
Cache Policies – Vary Header • Compared with a traditional cache system, in a cache system that supports the Vary header: • A cache entry can no longer be identified by Hostname and URL alone. The same Hostname and URL may hold many versions of a cache object, depending on the result of server negotiation. • A cache entry cannot be completely filled in from the client request. The server negotiation result is dynamic and must be used in cache entry retrieval. • The selection must be delayed until the response arrives and the Vary header list in the response has been checked.
Cache Policies – Vary Header • Once Vary header support is enabled, the cache system processes a request as follows: • Parse the Hostname and URL of the client request. • Use the Hostname and URL to find a matching cache object. • If there is no match, create a new (empty) cache entry and forward the request to the server. When the server replies, if a Vary header is part of the response, add the Vary header and the matching client request information to the cache data structure (and adjust the cache object list position, if needed). • If a match is found and it carries Vary header information, extract the information indicated by the Vary header from the client request. If all of the Vary header information matches, the cached object is used. Otherwise, continue to the next cache entry belonging to the same Hostname and URL. If all existing entries in the list have been checked and no match is found, a new entry needs to be created and the request forwarded to the server.
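The lookup procedure above can be sketched as a small in-memory cache keyed by (Hostname, URL), where each key holds a list of variants. This Python is an illustrative model only: the class `VaryCache` and its methods are hypothetical, header names are compared case-insensitively, and list reordering, expiration and eviction are left out.

```python
class VaryCache:
    """Toy cache that keeps several variants per (host, url), selected by Vary."""

    def __init__(self):
        # (host, url) -> list of (vary_values, body)
        self._entries = {}

    @staticmethod
    def _get(headers, name):
        # Case-insensitive header lookup.
        lowered = {k.lower(): v for k, v in headers.items()}
        return lowered.get(name.lower())

    def lookup(self, host, url, request_headers):
        """Walk the variants for this host and URL; return the first full match."""
        for vary_values, body in self._entries.get((host, url), []):
            if all(self._get(request_headers, name) == value
                   for name, value in vary_values.items()):
                return body
        return None  # miss: the caller forwards the request to the server

    def store(self, host, url, request_headers, response_headers, body):
        """Record the request values named by the response's Vary header."""
        vary = self._get(response_headers, "Vary") or ""
        names = [n.strip() for n in vary.split(",") if n.strip()]
        vary_values = {n: self._get(request_headers, n) for n in names}
        self._entries.setdefault((host, url), []).append((vary_values, body))


cache = VaryCache()
zh = {"Accept-Language": "zh-cn"}
en = {"Accept-Language": "en"}
cache.store("172.16.63.42", "/foo/foo.html.var", zh,
            {"Vary": "Accept-Language"}, "Chinese page")
print(cache.lookup("172.16.63.42", "/foo/foo.html.var", zh))  # Chinese page
print(cache.lookup("172.16.63.42", "/foo/foo.html.var", en))  # None
```

An entry stored without a Vary header has no values to compare, so it matches every request for that host and URL, which is the traditional Hostname + URL behaviour.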
Cache Policies – Vary Header • New CLI commands for Vary header support: • cache policy varyheader {off|on} • Turns the HTTP cache Vary header function on or off. • show cache policy varyheader • Displays the current HTTP cache Vary header status. • show statistics varyheader • Displays the global HTTP cache Vary header statistics.
Cache Policies – Vary Header • Vary Header Example • Configuration on Array TMX/APV • AN(config)# slb real http "r202" 172.16.63.202 8080 • AN(config)# slb virtual http "v42" 172.16.63.42 80 • AN(config)# slb policy static "v42" "r202" • AN(config)# cache on • AN(config)# cache policy varyheader on
Cache Policies – Vary Header • Vary Header Example • Right panel (headers received): • This is the response header from an HTTP server; the “Vary” header defines the set of request headers on which the response varies. • The “Vary” header says that for the same URL, different combinations of Accept-Language, Accept-Encoding and User-Agent may require different content (e.g. when Accept-Language is “en”, an English version is returned; when Accept-Language is “zh-cn”, a Chinese version is returned). • Left panel (headers sent): • This is the request header from an HTTP client. • The Accept-Encoding, Accept-Language and User-Agent headers are used together with the URL to determine a unique cache object.
Cache Policies – Vary Header • Vary Header Example
AN(config)#show cache content "172.16.63.42"
Cache Content information:
URL: /foo/foo.html.var
Host: 172.16.63.42
Vary Headers List:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Maxthon; .NET CLR 1.1.4322)
Accept-Language: zh-cn
Accept-Encoding: gzip
Object compressed type: gzip.
Page is not locked
Expiration time is not configured
Cache Probes: 0
Content Length: 215
Frame count: 2
Response code: 200
Server: Not present
show statistics varyheader
Cache Override Commands • NOTE: Because cache override rules force the proxy cache to violate RFC 2616, use these commands with caution. • To set/unset/show cache response override rules: • [show|no|clear] cache override response <rule_id> <hostname> <regex> <private/cache> • cache: default cacheability will be modified and responses to requests matching hostname and regular expression will not be cached. • private: “Cache-Control: private” will be ignored in responses to requests matching hostname and regular expression, and these responses will be cached unless other response characteristics prohibit caching.
Cache Override Commands • To set/unset/show cache request override rules for the Cache-Control header: • [show|no] cache override request cachecontrol nocache • Enables serving responses to requests with a “Cache-Control: no-cache” header from the cache without revalidating the responses with the backend server. • To set/unset/show cache request override rules to ignore the hostname for caching: • [show|no] cache override request host • Forces the proxy cache to ignore the hostname in cache lookups.
Cache Override Commands • A query string that is unique to each login/user renders the cache useless • http://mail.abc.com/dc/launch?gx=1&.rand=649a0h0fti2oj • cache override request urlquery settings { off | all | rules } • To set/unset/show whether cache requests ignore the URL query string • off: Default. No partial URL match. • all: The cache uses partial (query excluded) URL matches for all requests. • rules: The cache uses partial URL matches only for requests that match one of the rules configured with the “cache override request urlquery rules” command. • cache override request urlquery rules <rule_id> <hostname> <url_prefix> • The URL query will be ignored in requests that match hostname and URL prefix. • rule_id: An identifier (1 – 1000) for the rule. • hostname: Hostname of the objects. • url_prefix: The part of the URL before the first question mark ‘?’.
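The effect of ignoring the query string can be illustrated with a cache key function. The Python below is a sketch under the slide's definition of the URL prefix (everything before the first ‘?’); the function name `cache_key` and the `ignore_query` flag are hypothetical and do not correspond to appliance internals.

```python
def cache_key(host, url, ignore_query):
    """Build the lookup key: hostname plus full URL, or plus URL prefix only."""
    if ignore_query:
        # Keep only the part of the URL before the first question mark.
        url = url.split("?", 1)[0]
    return (host, url)


a = "/dc/launch?gx=1&.rand=649a0h0fti2oj"
b = "/dc/launch?gx=1&.rand=zzzzzzzzzzzzz"  # hypothetical second random token

# With the query included, every request produces a different key (no hits).
print(cache_key("mail.abc.com", a, False) == cache_key("mail.abc.com", b, False))  # False
# With the query ignored, both requests map to the same cached object.
print(cache_key("mail.abc.com", a, True) == cache_key("mail.abc.com", b, True))    # True
```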
Summary • Caching Concepts • Array Cache Overview • Array Cache Configuration