Secure and High-performance Web Server System for Shared Hosting Service Daisuke Hara and Yasuichi Nakayama The University of Electro-Communications, Tokyo, Japan ICPADS 2006@Minneapolis
Outline • Introduction • Background • Problems of large-scale hosting service and web server • Proposal - Hi-sap • Design • Implementation • Evaluation • Conclusions
Introduction • Problem of existing web servers • Server embedded interpreters cannot be used safely in large-scale environments like a shared hosting service. • Proposal - Hi-sap • Web objects that are stored in a server are divided into partitions*. • Server processes run under the privilege of different users in every partition. • Achievement • Hi-sap solves the problem. • It achieves high performance & scalability. (*) “partition” is a unit of division of web objects. (e.g. site, content, QUERY_STRING)
Background • More people are creating their own websites as the Internet grows in popularity. • weblog, wiki, CMS • Shared hosting services are widely used. • Many customers share a server. • 100s - 1000s sites/server • low price & flexible • custom CGI, etc.
Server embedded interpreters • e.g. PHP, mod_ruby, mod_perl • Because the language interpreter runs inside the server processes, • they can improve performance in processing dynamic content like weblogs and wikis.
Problem of existing web servers • Read permission must be granted to “other” (rw-r--r--). • Internal users can steal & delete authentication content without authentication (cp, rm commands or malicious CGI scripts). [Figure: a server hosting A’s, B’s, and C’s websites; a browser reaches auth content through ID & Pass authentication, while an internal user steals & deletes that auth content directly.]
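A minimal sketch of the permission problem on this slide, assuming a POSIX file system. The helper names (`readable_by_other`, `mode_string`, `demo`) are hypothetical and the temporary file merely stands in for authentication content; only standard-library calls are used.

```python
import os
import stat
import tempfile


def readable_by_other(path):
    """Return True if the 'other' permission class may read the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def mode_string(path):
    """Return the ls-style mode string, e.g. '-rw-r--r--'."""
    return stat.filemode(os.stat(path).st_mode)


def demo():
    # Throwaway file standing in for authentication content (illustrative only).
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.chmod(path, 0o644)  # rw-r--r--: every local user can read it
        exposed = (mode_string(path), readable_by_other(path))
        os.chmod(path, 0o600)  # rw-------: only the file owner can read it
        locked = (mode_string(path), readable_by_other(path))
    finally:
        os.remove(path)
    return exposed, locked
```

With `rw-------` the file is hidden from other customers, but a server process running as a separate dedicated user cannot read it either, which is why the plain setup ends up granting read permission to “other”.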
Problem of existing web servers (cont.) • Existing solution: POSIX ACL & suEXEC • CGI scripts run under the privilege of the site owner by using suEXEC. • Permissions on publicly accessible files are granted only to the dedicated user* by using POSIX ACL. • Read permission for “other” is no longer required. (*) “dedicated user” is the user account that runs server processes. e.g. www, apache, www-data
Problem of existing web servers (cont.) • Even if POSIX ACL & suEXEC are used, the problem occurs when server embedded interpreters are used. • Dynamic content that uses server embedded interpreters (e.g. PHP, mod_ruby, mod_perl) also runs under the privilege of the dedicated user. • Malicious PHP scripts can steal & delete authentication content.
Harache ([13][14]) • Predecessor of Hi-sap • Server processes run under the privilege of the site owner. • ① A browser sends a request to user A's website (GET /~userA/). • ② The privilege of the server process is changed from root to user A. • ③ The server process processes the request. • ④ It returns a response to the browser. [Figure: Harache server processes running as root; one of them switches to userA to serve the browser's request.]
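The four steps can be sketched as follows. This is an illustration under assumptions, not Harache's actual code: the slide does not say how the privilege switch is implemented, so a fork followed by `setgid`/`setuid` is assumed here, and the names `site_owner` and `serve_as_owner` are hypothetical. `serve_as_owner` only works when the parent runs as root on a POSIX system.

```python
import os
import pwd
import re

# Matches user-directory style paths such as /~userA/ or /~userA/index.php
_USERDIR = re.compile(r"^/~([A-Za-z0-9_][A-Za-z0-9_.-]*)(/|$)")


def site_owner(request_path):
    """Extract the site owner's account name from a /~user/ path, or None."""
    match = _USERDIR.match(request_path)
    return match.group(1) if match else None


def serve_as_owner(request_path, handler):
    """Run handler(request_path) in a child process with the owner's privilege.

    Assumption: the caller is root, as in the slide's figure.
    Returns the child's exit status (0 on success).
    """
    owner = site_owner(request_path)
    if owner is None:
        raise ValueError("request path does not name a site owner")
    account = pwd.getpwnam(owner)
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            os.setgroups([])            # drop supplementary groups first
            os.setgid(account.pw_gid)   # group before user, while still root
            os.setuid(account.pw_uid)   # irreversible drop to the site owner
            handler(request_path)       # process the request as the owner
            status = 0
        finally:
            os._exit(status)            # the child ends after one request
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)
```

Because the child drops privileges irreversibly and exits after a single request, this sketch also shows why such a design behaves like CGI in terms of process lifetime.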
Harache (cont.) • Server embedded interpreters can be used safely. • File permissions to a dedicated user are not necessary. • It is required to grant permissions only to the site owner. • But, it cannot fully use the increased speed of server embedded interpreters. • Server processes terminate after each session. (= CGI) Hi-sap solves Harache’s performance problem.
Goal • Realization of a secure, high-performance, and scalable web server system, Hi-sap • Secure: Scripts of a partition cannot access other partitions. • High performance: Dynamic content can be processed at high speed by fully using the increased speed of server embedded interpreters. • Scalable: A large number of partitions can be housed in a server.
Design • Security • Server processes run under the privilege of different users in every partition. (= Harache) • The system brings access control into operation with a secure OS. • Performance • The system pools server processes that run under the privilege of the different users. (!= Harache) • Scalability • The system controls the creation and termination of server processes. (Content Access Scheduler)
Content Access Scheduler • Web-server level scheduler • [aim] It enhances the scalability of the number of partitions in a server. • [method] It controls the creation and termination of server processes. By using a scheduler suited to the purpose, it achieves high scalability.
Implementation • OS: Linux OS with SELinux • dispatcher • reverse proxy server • Apache 2.0.55 + mod_hisap • workers • Each worker runs under the privilege of a different user and processes requests for a specific dedicated partition. • Apache 2.0.55 x 1000 • Any web server software can be used. • hisapd • Content Access Scheduler
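Overview of request processing [Figure: A browser sends “GET / HTTP/1.1, Host: www.C.net” over HTTP to the dispatcher (reverse proxy). The dispatcher and hisapd communicate over a UNIX domain socket: the dispatcher confirms whether worker C is active and asks hisapd to activate it. Under heavy load, hisapd terminates worker A, which has no requests, then activates worker C and answers OK. The worker processes the request, and the response is sent back to the browser. Other labels in the figure: Server, workers A, B, and C, and the users www and root.]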
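The request flow in this figure can be sketched as a toy model. Everything here is an assumption for illustration: the partition is taken to be the site named in the Host header, the UNIX domain socket exchange is replaced by an in-process method call, and the names `HisapdSketch`, `partition_of`, `dispatch`, and the backend port numbers are hypothetical, not part of mod_hisap or hisapd.

```python
class HisapdSketch:
    """Toy stand-in for hisapd: remembers which workers are active."""

    def __init__(self, first_port=9000):
        self.active = {}              # partition -> backend port (hypothetical)
        self._next_port = first_port

    def ensure_active(self, partition):
        """Activate the partition's worker on demand and return its port."""
        if partition not in self.active:
            self.active[partition] = self._next_port
            self._next_port += 1
        return self.active[partition]


def partition_of(raw_request):
    """Return the lower-cased host name from the Host header, or None."""
    for line in raw_request.split("\r\n")[1:]:
        if not line:                  # blank line ends the header section
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Drop an optional :port suffix (IPv6 literals are not handled).
            return value.strip().split(":")[0].lower()
    return None


def dispatch(raw_request, hisapd):
    """Pick the worker for a request, as the dispatcher would before proxying."""
    partition = partition_of(raw_request)
    if partition is None:
        raise ValueError("request has no Host header")
    port = hisapd.ensure_active(partition)
    return partition, port
```

A real dispatcher would then forward the request to the chosen worker over HTTP; the sketch stops at the routing decision.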
Scheduling algorithm • We developed Content Access Scheduler to avoid thrashing. • Thrashing decreases the performance of web servers dramatically. • Algorithm of worker activation • hisapd dynamically activates workers after requests from the dispatcher. • Algorithm of worker termination • When thrashing seems to occur, hisapd terminates workers that have not been requested recently.
Scheduling algorithm (cont.) • Conditions under which hisapd judges that thrashing seems to occur • A swap-in occurs. • A swap-out occurs. • Memory use is 99% or more. • Conditions under which hisapd chooses workers to terminate • The worker is active. • The worker is not recorded in the most recent 10,000 requests.
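The conditions above can be sketched as a small scheduler model. This is not hisapd's code: the class and method names are hypothetical, swap and memory figures are passed in as arguments because the slides do not say how hisapd measures them, and the slides do not state whether the three thrashing conditions must all hold or whether any one suffices, so that choice is exposed as the `combine` parameter (defaulting to "all" as an assumption).

```python
from collections import deque


class ContentAccessSchedulerSketch:
    """Illustrative model of worker activation and termination."""

    def __init__(self, history_size=10_000, memory_threshold=0.99, combine=all):
        self.recent = deque(maxlen=history_size)  # most recent requests only
        self.active = set()                       # partitions with a live worker
        self.memory_threshold = memory_threshold
        self.combine = combine                    # all or any: an assumption

    def record_request(self, partition):
        """Activate the worker on demand and log the request."""
        self.active.add(partition)
        self.recent.append(partition)

    def thrashing_suspected(self, swap_ins, swap_outs, memory_used):
        """Apply the slide's three conditions to externally supplied metrics."""
        signals = [
            swap_ins > 0,
            swap_outs > 0,
            memory_used >= self.memory_threshold,
        ]
        return self.combine(signals)

    def workers_to_terminate(self):
        """Active workers absent from the recent-request history."""
        return sorted(self.active - set(self.recent))

    def tick(self, swap_ins, swap_outs, memory_used):
        """Terminate idle workers if thrashing seems to occur; return them."""
        if not self.thrashing_suspected(swap_ins, swap_outs, memory_used):
            return []
        victims = self.workers_to_terminate()
        self.active.difference_update(victims)
        return victims
```

A bounded `deque` keeps exactly the most recent `history_size` requests, so a worker drops out of the history automatically once enough newer requests for other partitions have arrived.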
Evaluation • Experimental environments [Figure: experimental machines connected via Gigabit Ethernet.]
Evaluation (cont.) • Basic performance evaluation • We evaluated the basic performance in processing dynamic content. • Scalability evaluation • We evaluated the scalability of the number of partitions in a server in processing dynamic content. • Target content • We sent requests to a PHP script that calls phpinfo(). • The script displays the system information of the PHP language processor. (40 KB per request)
Basic performance evaluation • Aim • to determine the practical performance of our system • Systems for comparison • Apache • One-to-one • It uses a reverse proxy, and has a dispatcher and many workers, each dedicated to processing requests for one partition. • Although it is similar to our system, mod_hisap and hisapd are not installed. • Apache with suEXEC • Benchmark • httperf benchmark ver. 0.8
Basic performance evaluation (cont.) • The system loses an avg. of 28.0% of the throughput relative to Apache. • This overhead is due to the reverse proxy. • However, the system has high throughput relative to suEXEC. • The system loses an avg. of 1.0% of the throughput relative to One-to-one. • The overhead of mod_hisap & hisapd is very low.
Scalability evaluation • Aim • to determine the effectiveness of Content Access Scheduler • Comparison system • One-to-one • mod_hisap and hisapd (Content Access Scheduler) are not installed. • Benchmark • Apache benchmark ver. 2.0.41-dev
Scalability evaluation (cont.) • Our system’s scalability is high. • The throughput decrement due to an increase in the number of partitions was low. • For One-to-one, the OS crashed due to a memory shortage when the number of partitions was 600.
Scalability evaluation (cont.) • The swap use of One-to-one dramatically increases due to an increase in the number of partitions. • This is the reason for the OS crash. • Our system does not use swap space as much because of Content Access Scheduler.
Conclusions • Proposal: Hi-sap • Secure and high-performance web server system • Implementation: • On a Linux OS with SELinux. • Achievement: • High performance • High scalability
Future Work • Creating various Content Access Schedulers • for wiki • for weblog • for CMS, etc. • Evaluating these schedulers
Thank you. Any questions/comments?