Cluster Load Balancing for Fine-grain Network Services
Kai Shen, Tao Yang, and Lingkun Chu
Department of Computer Science, University of California at Santa Barbara
http://www.cs.ucsb.edu/projects/neptune

Cluster-based Network Services
• Emerging deployment of large-scale, complex clustered services:
  • Google: 150M searches per day; index of more than 2B pages; thousands of Linux servers.
  • Teoma search (powering Ask Jeeves search): a Sun/Solaris cluster of hundreds of processors.
  • Web portals: Yahoo!, MSN, AOL, etc.
• Key requirements: availability and scalability.
IPDPS 2002

Architecture of a Clustered Service: Search Engine
[Figure: architecture diagram. Components shown: firewall/Web switch; Web server/query handlers; index servers (partition 1); index servers (partition 2); doc servers; local-area network.]

“Neptune” Project
http://www.cs.ucsb.edu/projects/neptune
• A scalable cluster-based software infrastructure to shield clustering complexities from service authors:
  • Scalable clustering architecture with load-balancing support.
  • Integrated resource management.
  • Service replication: replica consistency and performance scalability.
• Deployment:
  • At Internet search engine Teoma (www.teoma.com) for more than a year.
  • Serving Ask Jeeves search (www.ask.com) since December 2001 (6-7M searches per day as of January 2002).

Neptune Clustering Architecture: Inside a Node
[Figure: node-internal diagram. Components shown: service access point; service availability directory; service load-balancing subsystem; service availability subsystem; service availability publishing; service runtime; service consumers; services; network to the rest of the cluster.]

Cluster Load Balancing
• Design goals:
  • Scalability: scalable performance; non-scaling overhead.
  • Availability: no centralized node/component.
• For fine-grain services:
  • Already widespread.
  • Additional challenges:
    • Severe system state fluctuation makes load balancing more sensitive to load information delay.
    • More frequent service requests call for low per-request load-balancing overhead.

Evaluation Traces
• Traces of two service cluster components from the Internet search engine Teoma, collected during one week in July 2001; the peak-time portion is used.

Broadcast Policy
• Broadcast policy:
  • An agent at each node collects the local load index and broadcasts it at various intervals.
  • Another agent listens to broadcasts from other nodes and maintains a directory locally.
  • Each service request is directed to the node with the lightest load index in the local directory.
  • Load index: number of active service requests.
• Advantages:
  • Requires no centralized component;
  • Very low per-request overhead.
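
The directory lookup at the heart of the broadcast policy can be sketched as follows. This is a minimal illustration and not Neptune's code: `LoadDirectory`, `on_broadcast`, and `pick_lightest` are hypothetical names, the network broadcast is replaced by direct method calls, and breaking ties by node name is an assumption made only to keep the sketch deterministic.

```python
class LoadDirectory:
    """Local directory of load indexes learned from other nodes' broadcasts."""

    def __init__(self):
        self._load = {}  # node name -> last broadcast load index

    def on_broadcast(self, node, active_requests):
        # Load index = number of active service requests on that node.
        self._load[node] = active_requests

    def pick_lightest(self):
        # Purely local lookup, so there is no per-request network traffic.
        # Ties are broken by node name (an assumption of this sketch).
        return min(self._load, key=lambda n: (self._load[n], n))


directory = LoadDirectory()
directory.on_broadcast("node-a", 4)
directory.on_broadcast("node-b", 1)
directory.on_broadcast("node-c", 2)
chosen = directory.pick_lightest()  # node-b has the lightest known load
```

The choice is only as good as the last broadcast from each node, which is why the broadcast interval matters so much for this policy.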

Broadcast Policy with Varying Broadcast Frequency (16-node)
[Figure: two panels, (A) server 50% busy and (B) server 90% busy. x-axis: mean broadcast interval (in ms), 31.25 to 1000. y-axis: mean response time (normalized to Centralized). Curves: MediumGrain, FineGrain, Centralized.]
• The broadcast policy depends too heavily on frequent broadcasts for fine-grain services at high load.
• Reasons: load index staleness, flocking effect.
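
The flocking effect can be illustrated with a small sketch: between two broadcasts the local directory is frozen, so every request issued in that window goes to the same apparently lightest node. The function name and the load numbers below are made up for illustration and do not come from the traces.

```python
from collections import Counter


def dispatch_between_broadcasts(snapshot, num_requests):
    """Route requests using a load snapshot that is not refreshed in between.

    `snapshot` maps node name -> load index from the last broadcast.
    Returns how many of the requests each node received.
    """
    received = Counter()
    for _ in range(num_requests):
        # The directory is stale: the choice ignores requests just dispatched.
        target = min(snapshot, key=lambda n: (snapshot[n], n))
        received[target] += 1
    return received


stale = {"node-a": 3, "node-b": 2, "node-c": 3}
received = dispatch_between_broadcasts(stale, 6)
# Real load after the window, assuming none of the requests has finished yet.
actual = {n: stale[n] + received[n] for n in stale}
```

Here all six requests land on `node-b`, raising its real load from 2 to 8 while the other two nodes stay at 3. A shorter broadcast interval shrinks the window in which this can happen.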

Random Polling Policy
• For each service request, a polling agent on the service consumer node:
  • randomly polls a certain number (the poll size) of service nodes for load information;
  • picks the node responding with the lightest load.
• Random polling with a small poll size:
  • Requires no centralized components;
  • Per-request overhead is limited by the poll size;
  • Small load information delay due to just-in-time polling.
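
The two steps of the random polling policy can be sketched as below. This is a minimal sketch under stated assumptions: `random_polling_pick` is a hypothetical name, and `get_load(node)` stands in for one load-inquiry round trip to a service node, which in a real system would be a network call.

```python
import random


def random_polling_pick(service_nodes, poll_size, get_load, rng=random):
    """Pick a service node by polling `poll_size` randomly chosen nodes.

    `get_load(node)` is a hypothetical stand-in for a load-inquiry round trip.
    `rng` is the `random` module or a `random.Random` instance.
    """
    polled = rng.sample(service_nodes, min(poll_size, len(service_nodes)))
    # Just-in-time load information: queried at request time, not cached.
    loads = {node: get_load(node) for node in polled}
    return min(polled, key=lambda n: loads[n])


current_load = {"n1": 5, "n2": 0, "n3": 7, "n4": 2}
nodes = list(current_load)
choice = random_polling_pick(nodes, 2, current_load.__getitem__, random.Random(0))
```

With a poll size of 1 this degenerates to purely random selection; with a poll size of 2 or more the most heavily loaded node can never be chosen, since at least one lighter node is always in the sample.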

Is a Small Poll Size Enough?
Service nodes are kept 90% busy on average.
[Figure: two panels, (A) MediumGrain trace and (B) FineGrain trace. x-axis: number of service nodes, 0 to 100. y-axis: mean response time (in ms). Curves: Random, Polling 2, Polling 3, Polling 4, Centralized.]
• In principle, the results match the analytical results on the supermarket model [Mitzenmacher96].
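
For reference, the supermarket model analyzed in [Mitzenmacher96] assumes Poisson arrivals at rate λ < 1 per server, exponential service times with mean 1, and each arrival joining the shortest of d randomly chosen queues, in the limit of many servers. Under those assumptions the expected time in the system is the sum over i ≥ 1 of λ^((d^i − d)/(d − 1)) for d ≥ 2, compared with 1/(1 − λ) for d = 1. The sketch below evaluates that formula; the function name is hypothetical, and these are the model's assumptions, not properties of the trace-driven runs above.

```python
def supermarket_mean_time(lam, d, terms=200):
    """Expected time in system in the supermarket model (mean service time 1).

    lam: arrival rate per server, 0 <= lam < 1.
    d:   number of queues sampled per arrival (the poll size).
    """
    if not 0 <= lam < 1:
        raise ValueError("arrival rate per server must be in [0, 1)")
    if d == 1:
        return 1.0 / (1.0 - lam)  # random placement: each server is an M/M/1 queue
    total = 0.0
    for i in range(1, terms + 1):
        exponent = (d ** i - d) // (d - 1)  # always an integer
        term = lam ** exponent
        total += term
        if term < 1e-15:  # remaining terms are negligible
            break
    return total


# Model predictions at 90% load for poll sizes 1 to 4.
predictions = {d: supermarket_mean_time(0.9, d) for d in (1, 2, 3, 4)}
```

At λ = 0.9 this gives about 10 for d = 1, 2.6 for d = 2, 2.0 for d = 3, and 1.8 for d = 4 (in units of the mean service time): most of the gain comes from going from one choice to two, which is the qualitative pattern the slide refers to.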

System Implementation of Random Polling Policies
• Configuration:
  • 30 dual-processor Linux servers connected by a Fast Ethernet switch.
• Implementation:
  • Service availability announcements are made through IP multicast;
  • Application-level services are loaded into the Neptune runtime module as DLLs and run as threads;
  • For each service request, polls are made concurrently over UDP.

Experimental Evaluation of Random Polling Policy (16-node)
[Figure: two panels, (A) MediumGrain trace and (B) FineGrain trace. x-axis: server load level, 50% to 90%. y-axis: mean response time (in ms). Curves: Random, Polling 2, Polling 3, Polling 4, Polling 8, Centralized.]
• For the FineGrain trace, a large poll size performs even worse due to excessive polling overhead and long polling delay.

Discarding Slow-responding Polls
• Polling delay with a poll size of 3:
  • 290 µs polling delay when service nodes are idle.
  • In a typical run when service nodes are 90% busy:
    • Mean polling delay: 3 ms;
    • 8.1% of polls are not returned within 10 ms.
  • This is significant for fine-grain services (service time in tens of ms).
• Discarding slow-responding polls shortens the polling delay: 8.3% reduction in mean response time.
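
One way to discard slow-responding polls is to issue the polls concurrently and decide as soon as a deadline passes, using only the replies received by then. The sketch below uses Python's `asyncio` for this. It is an illustration and not the Neptune implementation: `pick_with_deadline` and `poll` are hypothetical names, the deadline value is arbitrary, and falling back to a random polled node when no reply arrives in time is an assumption of this sketch.

```python
import asyncio
import random


async def pick_with_deadline(nodes, poll_size, poll, deadline_s):
    """Poll `poll_size` random nodes concurrently; ignore replies after `deadline_s`.

    `poll(node)` is a hypothetical coroutine returning that node's load index.
    Returns the lightest node among the timely replies, or a random polled
    node if no reply arrived in time (a fallback chosen for this sketch).
    """
    polled = random.sample(nodes, min(poll_size, len(nodes)))
    tasks = {asyncio.create_task(poll(n)): n for n in polled}
    done, pending = await asyncio.wait(tasks, timeout=deadline_s)
    for task in pending:
        task.cancel()  # discard slow-responding polls
    replies = {tasks[t]: t.result() for t in done if t.exception() is None}
    if not replies:
        return random.choice(polled)
    return min(replies, key=replies.get)


async def demo():
    # Made-up loads and reply delays (in seconds) for three service nodes.
    load = {"fast-busy": 5, "fast-idle": 1, "slow-idle": 0}
    delay = {"fast-busy": 0.001, "fast-idle": 0.001, "slow-idle": 0.5}

    async def fake_poll(node):
        await asyncio.sleep(delay[node])  # simulated network and server delay
        return load[node]

    return await pick_with_deadline(list(load), 3, fake_poll, deadline_s=0.05)


chosen = asyncio.run(demo())
```

In this run `slow-idle` reports the lightest load but answers too late, so its reply is discarded and `fast-idle` is chosen. The request gives up a slightly better placement in exchange for a bounded polling delay.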

Related Work
• Clustering middleware and distributed systems: Neptune, WebLogic/Tuxedo, COM/DCOM, MOSIX, TACC, MultiSpace.
• HTTP switching: Alteon, ArrowPoint, Foundry, Network Dispatcher.
• Load balancing for distributed systems: [Mitzenmacher96], [Goswami93], [Kunz91], MOSIX, [Zhou88], [Eager86], [Ferrari85].
• Low-latency network architecture: VIA, InfiniBand.

Conclusions
• Random-polling-based load balancing policies are well suited for fine-grain network services.
• A small poll size provides sufficient information for load balancing, while an excessively large poll size may even degrade performance.
• Discarding slow-responding polls can further improve system performance.
http://www.cs.ucsb.edu/projects/neptune