A Programmable Caching Cluster Director
Submitting: Barak Pinhas 037584042, Gil Fiss 031801731, Laurent Levy 921012076
Caching by proxy servers
• Since its introduction in 1990, the World-Wide Web has evolved from a simple client-server model into a complex distributed architecture.
• This evolution has been driven largely by the scaling problems associated with exponential growth.
• One of the core infrastructure components that has been employed to meet the demands of this growth is caching by proxy servers.
A proxy: A proxy server is an intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. The proxy’s cache stores cacheable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests.
Proxy cache capacity
The proxy cache must use a clean-up algorithm to determine which documents to throw away when its capacity limit is reached. When the proxy's free space falls to a certain limit, a number of web pages are cleared from the proxy.
Proxy cluster
[Diagram: user, director, Proxy #1, Proxy #2, ..., Proxy #n, web server]
1. The user requests a web page.
2. The director selects a proxy to handle the request.
3. The proxy checks whether it has the requested page.
4. If the requested web page is not in the proxy's cache, the proxy searches the neighboring proxies for the web page. If the web page is not found in the neighboring proxies, the proxy takes the page from the web server.
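The four steps of the proxy cluster can be sketched roughly as follows. This is a minimal illustration, not the project's code: the names `Proxy`, `handle_request` and `fetch_from_web_server` are hypothetical, and caching the page locally after an origin fetch (but not after a neighbor hit) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Proxy:
    name: str
    cache: dict = field(default_factory=dict)  # URL -> page body


def fetch_from_web_server(url):
    # Hypothetical stand-in for the real request to the web server.
    return f"<page {url}>"


def handle_request(url, selected, cluster):
    """Serve `url` through the proxy the director selected (step 2).

    Returns (page, source), where source is 'local', 'cluster' or 'origin'.
    """
    # Step 3: the selected proxy checks its own cache.
    if url in selected.cache:
        return selected.cache[url], "local"
    # Step 4, first half: search the neighboring proxies.
    for neighbor in cluster:
        if neighbor is not selected and url in neighbor.cache:
            return neighbor.cache[url], "cluster"
    # Step 4, second half: take the page from the web server.
    page = fetch_from_web_server(url)
    selected.cache[url] = page  # assumption: the response is cached locally
    return page, "origin"
```

The `source` value distinguishes a local hit from a cluster hit, which is the distinction the expected results rely on.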
Proxy cache capacity
Methods to determine the web pages that would be cleared:
• LFU aging (Least Frequently Used aging): the document which has been requested least frequently is cleared from the cache.
• LRU aging (Least Recently Used aging): the document which has been requested least recently is cleared from the cache.
• Perfect LFU (Perfect Least Frequently Used): as in LFU, the document which has been requested least frequently is cleared from the cache, but the usage frequency record of each document is remembered even after the document has been erased.
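The difference between the three pruning methods can be illustrated with a small sketch. It rests on several simplifying assumptions: capacity is counted in documents rather than bytes, one document is cleared at a time, ties in frequency are broken by recency, and no aging factor is applied to the counts. The class name `PruningCache` and the helper `replay` are hypothetical.

```python
class PruningCache:
    def __init__(self, capacity, policy):
        assert policy in ("lru", "lfu", "plfu")
        self.capacity = capacity  # assumption: capacity in documents
        self.policy = policy
        self.docs = {}        # url -> document
        self.counts = {}      # url -> number of requests seen
        self.last_used = {}   # url -> logical time of the last request
        self.clock = 0

    def request(self, url, load):
        """Return (document, hit); `load` fetches the document on a miss."""
        self.clock += 1
        self.counts[url] = self.counts.get(url, 0) + 1
        self.last_used[url] = self.clock
        if url in self.docs:
            return self.docs[url], True
        if len(self.docs) >= self.capacity:
            self._evict()
        self.docs[url] = load(url)
        return self.docs[url], False

    def _evict(self):
        if self.policy == "lru":
            victim = min(self.docs, key=lambda u: self.last_used[u])
        else:
            # LFU and Perfect LFU: least frequently requested, oldest first.
            victim = min(self.docs,
                         key=lambda u: (self.counts[u], self.last_used[u]))
        del self.docs[victim]
        if self.policy != "plfu":
            # Only Perfect LFU remembers the count of an erased document.
            del self.counts[victim]


def replay(policy, urls, capacity=2):
    """Replay a request sequence and return the set of cached URLs."""
    cache = PruningCache(capacity, policy)
    for url in urls:
        cache.request(url, load=lambda u: f"<page {u}>")
    return set(cache.docs)
```

With the request sequence `"aaabccccdae"` and a capacity of two documents, plain LFU ends up holding `c` and `e`, while Perfect LFU holds `a` and `e`, because the earlier requests for `a` still count after `a` was erased.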
Project's aims and measurements
Proxy servers that cache web pages can potentially reduce:
• The number of requests that arrive at the web servers (the load on web servers in units of web pages).
• The volume of network traffic resulting from document requests (the load on web servers in units of bytes).
• The latency that an end user experiences in retrieving a document.
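The first two measurements correspond to the usual hit ratio (counted in pages) and byte hit ratio (counted in bytes). A minimal sketch of how they could be computed from a request log is given below; the log format of `(size_in_bytes, was_hit)` pairs and the function name `hit_ratios` are assumptions made for illustration.

```python
def hit_ratios(log):
    """Compute (hit ratio, byte hit ratio) from a request log.

    `log` is an iterable of (size_in_bytes, was_hit) pairs, one per request.
    """
    requests = hits = total_bytes = hit_bytes = 0
    for size, was_hit in log:
        requests += 1
        total_bytes += size
        if was_hit:
            hits += 1
            hit_bytes += size
    # Guard against empty logs and zero-byte traffic.
    object_ratio = hits / requests if requests else 0.0
    byte_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return object_ratio, byte_ratio
```

The two ratios can differ considerably: if the hits are mostly small documents, the hit ratio is high while the byte hit ratio stays low.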
Selection methods for the Director
1. RRM - Round Robin selection method
When a page request arrives, the director selects the proxy that follows the last selected proxy in a predetermined fixed order.
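A minimal sketch of round robin selection, assuming the fixed order is simply the order of a list; the class name `RoundRobinDirector` is hypothetical.

```python
class RoundRobinDirector:
    def __init__(self, proxies):
        self.proxies = list(proxies)  # the predetermined fixed order
        self.last = -1                # index of the last selected proxy

    def select(self):
        # Move to the proxy after the last selected one, wrapping around.
        self.last = (self.last + 1) % len(self.proxies)
        return self.proxies[self.last]
```

Note that the selection ignores the requested URL, so the same page may be sent to a different proxy on each request.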
Selection methods for the Director
2. Hash function
The hash key is the URL. When a new request arrives with a given URL, the request goes to proxy number hash(URL).
[Diagram: the director applies the hash function to www.cs.technion.ac.il, obtains index k, and sends the request to Proxy #k out of Proxy #1 ... Proxy #n]
Selection methods for the Director
• The hash function is:
Hash(URL number) = (sum of the digits of the URL number) mod (number of proxies)
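The hash function can be written in a few lines. The sketch assumes that each URL is identified by a non-negative integer (the "URL number") and that proxies are indexed from 0; the names `hash_select` and `load_per_proxy` are hypothetical.

```python
from collections import Counter


def hash_select(url_number, num_proxies):
    """Index of the proxy for a URL: digit sum modulo the number of proxies."""
    digit_sum = sum(int(digit) for digit in str(url_number))
    return digit_sum % num_proxies


def load_per_proxy(url_numbers, num_proxies):
    """Count how many of the given URL numbers map to each proxy index."""
    return Counter(hash_select(n, num_proxies) for n in url_numbers)
```

For example, `hash_select(1234, 5)` gives `(1 + 2 + 3 + 4) mod 5 = 0`. Because the same URL number always maps to the same proxy, a page cached anywhere in the cluster is cached on the proxy that receives the request. The tally from `load_per_proxy` can be used to check how evenly a given set of URL numbers spreads over the proxies.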
Selection methods for the Director
3. Unfinished jobs
The director evaluates the number of jobs that each proxy is currently handling and gives the request to the proxy with the fewest jobs at the time of the request.
[Diagram: tasks over time for each proxy. At the time of the request, proxy #1 has 1 task, proxy #2 has 2 tasks and proxy #3 has 2 tasks, so the director selects proxy #1.]
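A minimal sketch of the unfinished-jobs selection, assuming the director keeps a counter of jobs in progress per proxy and breaks ties in favor of the proxy listed first; the class name `TasksDirector` and its methods are hypothetical.

```python
class TasksDirector:
    def __init__(self, proxies):
        # Number of unfinished jobs per proxy, in a fixed proxy order.
        self.unfinished = {proxy: 0 for proxy in proxies}

    def select(self):
        # Pick the proxy with the fewest unfinished jobs right now;
        # on a tie, min() returns the first such proxy in the order.
        proxy = min(self.unfinished, key=self.unfinished.get)
        self.unfinished[proxy] += 1
        return proxy

    def finish(self, proxy):
        # Called when a proxy completes a job.
        self.unfinished[proxy] -= 1
```

With the counts from the slide (1, 2 and 2 tasks), `select()` returns proxy #1.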
Selection methods for the Director
4. PLFU - perfect LFU
In this method each proxy has a PLFU value. (In the PLFU method the system remembers the usage frequency record of each request after that request was erased.)
Proxies with a PLFU value higher than a certain threshold are ignored.
Selection methods for the Director
5. BLFU - byte least frequently used
In this method each proxy has a BLFU value. (In the PLFU method the system remembers the usage frequency record of each request after that request was erased.)
Proxies with a BLFU value higher than a certain threshold are ignored.
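The slides do not say how the PLFU and BLFU values are computed, so the sketch below only illustrates the rule that proxies above a certain value (the "verge", i.e. a threshold) are ignored. It assumes the values are given, that the remaining proxies are chosen in round robin order, and that the director falls back to plain round robin when every proxy is above the threshold. The function name `select_with_threshold` is hypothetical.

```python
def select_with_threshold(proxies, values, threshold, last_index):
    """Return the index of the next proxy that is not above the threshold.

    `values` maps each proxy to its PLFU or BLFU value (assumed given).
    `last_index` is the index of the previously selected proxy.
    """
    n = len(proxies)
    for step in range(1, n + 1):
        i = (last_index + step) % n
        if values[proxies[i]] <= threshold:
            return i
    # Assumption: if all proxies are above the threshold, ignore it.
    return (last_index + 1) % n
```

For example, with values `{"p1": 5, "p2": 50, "p3": 7}`, a threshold of 10 and `p1` selected last, `p2` is skipped and `p3` is chosen.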
Results and Conclusions
Expected results
Object hit ratio
• We expected the round-robin oriented selection methods (RRM, TASKS and PLFU/BLFU) to get a lower local hit ratio than the hash selection method, mainly because with the hash method each URL is always sent to the same proxy, so a cluster hit always comes with a local hit. For the same reason, a hash function that is not even might lead to a lower cluster hit rate.
• Among the round-robin oriented selection methods, we expected the "even load distribution" methods such as TASKS and PLFU/BLFU to get a better hit ratio than plain RRM, because we expected these methods to give an even distribution and therefore less pruning and a more adaptive cluster (a better cluster hit ratio).
Expected results
Byte hit ratio
• We expected the methods that take request sizes into consideration to get a better byte hit ratio.
Load distribution
• We expected the methods that strive to achieve a better distribution (such as RRM, TASKS and PLFU/BLFU) to get a better load distribution than the hash method.
• We expected the following ranking, from best to worst:
1. BLFU/PLFU: because of the "perfect" technique used.
2. TASKS: because of the load distribution.
3. RRM: because of a simple distribution.
4. HASH: because the hash keying might not be even.
The results
Comparing the director selection methods
In this graph we see a significant advantage for the hash selection method: its average hit rate is about 62%-65%, while all the other methods produced a hit rate of about 50%. A disadvantage of the hash method is that its load distribution is not even.
Comparing all the possible director selection and proxy pruning methods
Hit rate
As we can see in the graph, the best hit rate was obtained with hash as the director method and LFU, LRU and PLFU as the proxy pruning methods. In the next place we see the tasks method for the director with PLFU as the proxy pruning method.
Comparing all the possible director selection and proxy pruning methods
Byte hit rate
In this graph we can see again that the hash method for the director got the best results. The gap between the hash method and the other methods is larger in the byte hit rate graph than in the hit rate graph.
Comparing all the possible director selection and proxy pruning methods
Load distribution
In this graph we see that the load is not distributed evenly among the proxies. Proxy #1 received more requests than the other proxies when the hash method was chosen for the director, and proxy #5 received more requests when the tasks method was chosen.