ITEC 400: Apache Web Server. George Vaughan, Franklin University
Topics • Background information • HTTP, URLs, Web History • Apache Web Server: • History, how it works • Configuring, Running and Administering the Apache Web Server. • Perl scripts and the Apache Web Server.
The Web • The WWW and the Internet are not the same. • We can view the WWW as a very general client-server application running on the Internet. • The WWW is based on a client-server architecture: • The browser is the client • The web server is the server
HTTP • The browser and the web server communicate with each other using the Hypertext Transfer Protocol (HTTP). • HTTP defines how messages are formatted and transmitted, and what actions the browser and server perform when such messages are received. • HTTP is a stateless protocol: the server handles each request independently and keeps no memory of earlier requests.
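As a rough illustration of what "a message" means here, the sketch below builds the text of a minimal HTTP/1.1 GET request. The function name `build_get_request` and the host `www.example.com` are made up for this example; a real browser sends many more headers.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request (illustrative only)."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, protocol version
        f"Host: {host}",         # the Host header is mandatory in HTTP/1.1
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    # www.example.com is a placeholder host, not one used in the slides.
    print(build_get_request("www.example.com", "/index.html").decode("ascii"))
```

Because the protocol is stateless, every request has to carry everything the server needs (here, the target path and the host name); nothing is remembered from one request to the next.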
URL • The browser uses HTTP to request objects identified by a Uniform Resource Locator (URL). • The URL is the global address of an object.
URL • An example of a URL: http://cs.franklin.edu:80/some_dir/webPage.html • Where: • ‘http’ specifies the protocol • ‘cs.franklin.edu’ specifies the web server host. • ‘:80’ specifies the host port (80 is default) • The first slash after the port is the ‘Doc Root’ • ‘some_dir’ is a directory under ‘Doc Root’ • ‘webPage.html’ is the target object
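The breakdown above can be reproduced with Python's standard `urllib.parse` module. This is only a sketch: the helper name `split_url` is an assumption made for this example, and falling back to port 80 reflects the default mentioned on the slide.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Split a URL into the parts named on the slide (illustrative helper)."""
    parts = urlsplit(url)
    # Everything before the last slash is the directory; the rest is the target object.
    directory, _, target = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or 80,  # 80 is the default when no port is given
        "directory": directory,
        "target": target,
    }


if __name__ == "__main__":
    for key, value in split_url("http://cs.franklin.edu:80/some_dir/webPage.html").items():
        print(f"{key}: {value}")
```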
A Little History of the WWW* *Notes from http://www.w3.org/History.html • 1960s • Ted Nelson coins the word Hypertext in "A File Structure for the Complex, the Changing, and the Indeterminate", 20th National Conference, New York, Association for Computing Machinery, 1965 • Doug Engelbart prototypes an "oNLine System" (NLS) which does hypertext browsing, editing, email, etc. He invents the mouse for this purpose. • To see the 1968 demo, visit: http://sloan.stanford.edu/mousesite/1968Demo.html
A Little History of the WWW* *Notes from http://www.w3.org/History.html • 1980 • While consulting for CERN June-December of 1980, Tim Berners-Lee writes a notebook program, "Enquire-Within-Upon-Everything", which allows links to be made between arbitrary nodes. • CERN - European Organization for Nuclear Research • 1990 • October - Tim Berners-Lee starts work on a hypertext GUI browser+editor • Coins the term "WorldWideWeb" as a name for the program.
A Little History of the WWW* *Notes from http://www.w3.org/History.html • 1993 • February - NCSA releases the first alpha version of Marc Andreessen's "Mosaic for X". • September - WWW (port 80 HTTP) traffic measures 1% of NSF backbone traffic. • NCSA - National Center for Supercomputing Applications • 1994 • March - Marc Andreessen and colleagues leave NCSA to form "Mosaic Communications Corp" (later Netscape). • 2003 - More than 80 percent of all Internet traffic is WWW traffic
History of Apache • Like the Mosaic and Netscape browsers, the Apache Web Server can trace its roots to NCSA. • Apache was originally based on code from an NCSA web server (circa 1995). • The original Apache web server was ‘a patchy’ version of the NCSA web server - hence the name.
Apache Web Server • The name of the Apache binary in Unix/Linux is: httpd (goes back to NCSA) and is located at /usr/sbin/httpd. • httpd executes under the login ID of ‘apache’, except for the first instance which is run under ‘root’. • When the Apache project was started (1995), the NCSA web server was the most popular web server.
Apache Web Server • The Apache Web Server is free. • It can be downloaded from: www.apache.org • It comes bundled with Red Hat distributions (Apache 2.0 comes with Fedora Core 3).
How Apache Works • Apache listens on the IP addresses and ports specified in its config file. • The default port is 80. • It can be configured to listen on other ports.
Apache Market Share (data from: http://www.netcraft.com/survey/) • [Figure: Market Share for Top Servers Across All Domains, August 1995 - February 2006]
Running Apache
• The following slides are based on Apache 2.0 running on Fedora Core 3.
• Below is a check to see whether Apache is already installed:
    [root@microtel bin]# rpm -q httpd
    httpd-2.0.52-3.1
• Edit the config file /etc/httpd/conf/httpd.conf
• Search for the line that begins with:
    #ServerName www.example.com:80
• Uncomment the line and replace www.example.com with the name or IP address of the server, for example:
    ServerName 192.168.1.12:80
• Save the file.
Running Apache • The Apache binary is located at: /usr/sbin/httpd • You must be root to start httpd, because only root can bind to ports below 1024. • By default, web servers listen on port 80. • You may start the server by typing: service httpd start
Running Apache
    0001: [root@localhost root]# service httpd start
    0002: Starting httpd: [ OK ]
    0003:
    0004: [root@localhost root]# service httpd status
    0005: httpd (pid 3108 3107 3106 3105 3104 3103 3102 3101 3098) is running...
    0006:
    0007: [root@localhost root]# service httpd stop
    0008: Stopping httpd: [ OK ]
Notes:
• Line 1: start the web server
• Line 4: check web server status (PIDs of the server processes)
• Line 7: stop the web server
Running Apache
• There will be one instance of httpd per request being serviced, plus some additional instances that are waiting.
• One instance will be owned by root.
• The remaining instances will be owned by the default user name (apache):
    [root@microtel httpd]# ps -ef | fgrep httpd
    root     3314    1  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3395 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3396 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3397 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3398 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3399 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3400 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3401 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
    apache   3402 3314  0 13:20 ?  00:00:00 /usr/sbin/httpd
• Depending on the multi-processing module (MPM) in use, each server process may be multi-threaded (the worker MPM is threaded; prefork is not).
• The initial number of servers is tunable.
• The number of server processes may grow or shrink automatically, based on load.
Running Apache • You can test your web server by using the local host URL: http://127.0.0.1/ which will produce the web page on next slide • You do not have to have your server on a network to perform this test.
Running Apache • We can create our own default web page • Name the file index.html • Place file in the directory: /var/www/html • Now we can see our own default web page when we use the local host URL: http://127.0.0.1/ which will produce the web page on next slide.
Apache Auto-Start
• Apache does not automatically start at boot as delivered by Fedora.
• It is easy to make Apache start at boot with the chkconfig command (see man page):
    [root@localhost init.d]# chkconfig --list httpd
    httpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

    [root@localhost root]# chkconfig httpd on

    [root@localhost root]# chkconfig --list httpd
    httpd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Apache Manual
• You may also install an HTML-based manual for Apache (this is a separate package from Apache itself).
• Below is a check to see whether the manual is installed:
    [root@microtel ~]# rpm -q httpd-manual
    httpd-manual-2.0.52-3.1
• If the manual is not installed, you may use the ‘rpm’ installation mechanism to install it after it has been downloaded.
• Once installed, the manual can be accessed locally on the server using the following URL: http://127.0.0.1/manual/
• The next slide shows the home page for the manual.
Configuring Apache • Configuration file • located at /etc/httpd/conf/httpd.conf • ASCII file – can be edited with ‘vi’ • contains such things as: • server name • port(s) to listen on • default user name (apache) • Document Root (/var/www/html) • error log location (/var/log/httpd/error_log) • many, many other parameters
Configuring Apache • apacheconfig: • Program to generate config file • Located at /usr/share/apacheconf • Will overwrite any manually created/modified config file. • It is probably best to learn how to configure Apache manually using: vi /etc/httpd/conf/httpd.conf
/etc/httpd/conf/httpd.conf
• The Apache config file has many directives.
• One directive defines the ‘Document Root’. The concept is the same as the ‘root’ directory in Unix.
• The document root is the top-level directory for web content.
• By default, the document root in Apache is /var/www/html, but it can be changed.
• The directive appears below:
    #
    # DocumentRoot: The directory out of which you will serve your
    # documents. By default, all requests are taken from this directory, but
    # symbolic links and aliases may be used to point to other locations.
    #
    DocumentRoot "/var/www/html"
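To make the idea of a document root concrete, here is a small sketch of how a URL path could be turned into a file path under /var/www/html. The function `resolve` is hypothetical and greatly simplified: real Apache also applies aliases, symbolic-link rules, and per-directory settings, none of which are modeled here. Serving index.html for "/" follows the default-page convention described earlier in the slides.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"  # value of the DocumentRoot directive shown above


def resolve(url_path: str, doc_root: str = DOCUMENT_ROOT) -> str:
    """Map a URL path onto a file path under the document root (simplified)."""
    # Normalising against "/" collapses "." and ".." segments, so the result
    # can never climb above the document root.
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    if clean == "/":
        clean = "/index.html"  # default page for the top-level URL
    return doc_root + clean


if __name__ == "__main__":
    for path in ("/", "/some_dir/webPage.html", "/../../etc/passwd"):
        print(path, "->", resolve(path))
```

The last example shows why the document root behaves like a Unix root directory: a request cannot use ".." to reach files outside it.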
/etc/httpd/conf/httpd.conf
• The following directive allows users to have user-level doc roots at ~/public_html
• Disabled by default for security, since it can be used to confirm lognames on the system
    <IfModule mod_userdir.c>
        UserDir public_html
        UserDir disabled root
    </IfModule>
/etc/httpd/conf/httpd.conf
• The following server directive allows a browser to collect server information.
• For security, it only works from a browser running on the server itself.
• URL: http://127.0.0.1/server-info
    <Location /server-info>
        SetHandler server-info
        Order Deny,Allow
        Deny from all
        Allow from 127.0.0.1
    </Location>
• Next slide shows browser contents
/etc/httpd/conf/httpd.conf
• The following server directive allows a browser to view server status.
• For security, it only works from a browser running on the server itself.
• URL: http://127.0.0.1/server-status
    <Location /server-status>
        SetHandler server-status
        Order Deny,Allow
        Deny from all
        Allow from 127.0.0.1
    </Location>
• Next slide shows browser contents
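The effect of the `Order Deny,Allow` rules used for /server-info and /server-status can be sketched as follows. With this ordering, access is allowed by default, Deny rules are checked first, and a matching Allow rule then overrides them. The function `is_allowed` is an assumption made for this example, and it only understands the keyword `all` and exact IP addresses, whereas Apache also accepts host names and network ranges.

```python
def is_allowed(client_ip: str, deny: list, allow: list) -> bool:
    """Evaluate simplified 'Order Deny,Allow' rules for one client address."""

    def matches(rules: list) -> bool:
        # Only "all" and exact addresses are modeled in this sketch.
        return "all" in rules or client_ip in rules

    allowed = True        # default when neither list matches
    if matches(deny):     # Deny rules are evaluated first ...
        allowed = False
    if matches(allow):    # ... and a matching Allow rule overrides them
        allowed = True
    return allowed


if __name__ == "__main__":
    deny_rules, allow_rules = ["all"], ["127.0.0.1"]
    for ip in ("127.0.0.1", "192.168.1.101"):
        print(ip, "allowed" if is_allowed(ip, deny_rules, allow_rules) else "denied")
```

With `Deny from all` and `Allow from 127.0.0.1`, only a browser on the server itself gets through, which matches the behaviour described on the slides.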
Webalizer • Webalizer, a log-analysis tool packaged alongside Apache, produces usage statistics from the server's access log. • Start at command line: webalizer • View from browser: http://127.0.0.1/usage • Next page contains a snapshot
Apache Log Files
• Apache maintains an access log and an error log.
• The access log is located at: /var/log/httpd/access_log
• The access log logs who accessed what and from where:
    192.168.1.101 - - [16/Mar/2003:23:19:45 -0500] "GET /icons/text.gif HTTP/1.1" 304 -
    192.168.1.101 - - [16/Mar/2003:23:19:45 -0500] "GET /icons/back.gif HTTP/1.1" 304 -
    192.168.1.101 - - [16/Mar/2003:23:19:45 -0500] "GET /icons/image2.gif HTTP/1.1" 304 -
    192.168.1.103 - - [17/Mar/2003:20:18:23 -0500] "GET / HTTP/1.1" 200 1228
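Each access log entry above has the same fields: client address, two identity fields (shown as "-"), timestamp, request line, status code, and response size. The sketch below pulls those fields apart with a regular expression. The names `LOG_PATTERN` and `parse_line` are made up for this example, and the pattern only covers the layout shown in the sample lines.

```python
import re

# Pattern for the log layout shown above (illustrative, not exhaustive).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" size means no response body was sent (for example with status 304).
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


if __name__ == "__main__":
    sample = '192.168.1.103 - - [17/Mar/2003:20:18:23 -0500] "GET / HTTP/1.1" 200 1228'
    print(parse_line(sample))
```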
Apache Log Files
• The Apache error log is located at: /var/log/httpd/error_log
• The error log tracks server operations (not necessarily errors):
    [Sun Mar 16 23:59:55 2003] [notice] caught SIGTERM, shutting down
    [Mon Mar 17 19:23:07 2003] [notice] SIGHUP received. Attempting to restart
    [Mon Mar 17 19:23:08 2003] [notice] Apache/1.3.20 (Unix) (Red-Hat/Linux) mod_ssl/2.8.4 OpenSSL/0.9.6b DAV/1.0.2 PHP/4.0.6 mod_perl/1.24_01 configured -- resuming normal operations
CGI Scripts • CGI scripts go in /var/www/cgi-bin and not /var/www/html • More secure - /var/www/cgi-bin is not under doc root (harder to find) • Doc Root has general access - /var/www/cgi-bin can be made more restrictive. • The following config parameter maps the URL path /cgi-bin/ to that directory, even though it lies outside Doc Root: ScriptAlias /cgi-bin/ "/var/www/cgi-bin/" • The URL for a CGI script might look like this: http://192.168.1.102/cgi-bin/welcome.cgi
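The ScriptAlias mapping can be pictured as a prefix substitution on the URL path. The sketch below is a simplified model, with the hypothetical function `map_path` returning both the file path and whether the request would be treated as a script; it reuses the two directories named on the slide and ignores everything else Apache does during path translation.

```python
SCRIPT_ALIAS = ("/cgi-bin/", "/var/www/cgi-bin/")  # from the ScriptAlias line above
DOCUMENT_ROOT = "/var/www/html"


def map_path(url_path: str):
    """Return (file_path, is_cgi) for a URL path under the simplified rules."""
    prefix, target = SCRIPT_ALIAS
    if url_path.startswith(prefix):
        # Aliased requests are served from outside the document root
        # and executed as scripts rather than sent as files.
        return target + url_path[len(prefix):], True
    return DOCUMENT_ROOT + url_path, False


if __name__ == "__main__":
    print(map_path("/cgi-bin/welcome.cgi"))
    print(map_path("/index.html"))
```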
Apache and Perl Scripts • The Apache architecture is flexible. • New functionality can be added by loading Apache modules. • One such module is: mod_perl • Perl scripts can still be executed without mod_perl. • Without mod_perl, Perl scripts are executed as CGI scripts • Installed Apache modules are located at: /etc/httpd/modules
Apache and Perl Scripts • The disadvantage of running a Perl script using CGI: • Each time a Perl script runs, Apache has to start a new process and load the Perl interpreter. • If our site has only one page (onePage.pl) and we have 100,000 visitors, the interpreter has to be loaded 100,000 times and the script has to be loaded and compiled 100,000 times.
Apache and Perl Scripts • mod_perl basically embeds the Perl interpreter into Apache • Perl scripts run within Apache rather than as a separate process. • Running scripts using mod_perl can be 100 times faster than using CGI • mod_perl allows perl scripts to interact with the Apache web server itself.
Apache and Perl Scripts • Perl scripts can also be cached using mod_perl. • This means that Perl scripts are compiled only once. • See perl.apache.org • Apache 2.0 in Red Hat 9 comes with mod_perl already configured.
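The difference between compiling a script on every request and compiling it once can be sketched with a small Python analogy. This is not how mod_perl is implemented; the class `ScriptRunner` and its methods are invented purely to count compilations under the two strategies, and the cache here never checks whether the script source has changed.

```python
class ScriptRunner:
    """Counts how often a script is compiled under two strategies (analogy only)."""

    def __init__(self):
        self.compiles = 0
        self._cache = {}

    def _compile(self, name: str, source: str):
        self.compiles += 1
        return compile(source, name, "exec")

    def run_cgi_style(self, name: str, source: str) -> None:
        # CGI-like: the script is compiled again for every single request.
        exec(self._compile(name, source), {})

    def run_cached(self, name: str, source: str) -> None:
        # mod_perl-like: compile on first use, then reuse the compiled code.
        if name not in self._cache:
            self._cache[name] = self._compile(name, source)
        exec(self._cache[name], {})


if __name__ == "__main__":
    page = "total = sum(range(10))"  # stand-in for the onePage.pl example
    cgi, cached = ScriptRunner(), ScriptRunner()
    for _ in range(100_000):
        cgi.run_cgi_style("onePage", page)
        cached.run_cached("onePage", page)
    print("CGI-style compiles:", cgi.compiles)
    print("cached compiles:", cached.compiles)
```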
Virtual Hosts • Apache can be configured to support multiple ‘virtual hosts’. • In other words, Apache can be configured to support multiple web sites on a single machine. • When a request comes in, Apache uses the IP address, port, and hostname to determine which virtual host should service the request. • Each virtual host can have its own server name, doc root, error log, transfer log, config file, etc…
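The selection step described above (IP address, port, then hostname) can be sketched as a simple lookup. The `VirtualHost` record, the `select_vhost` function, and the example.com host names are all assumptions for illustration; Apache's real matching rules have more cases. Falling back to the first matching entry mirrors Apache's convention that the first name-based virtual host listed for an address acts as the default.

```python
from dataclasses import dataclass


@dataclass
class VirtualHost:
    ip: str            # "*" means any local address
    port: int
    server_name: str
    doc_root: str


def select_vhost(vhosts, ip: str, port: int, host_header: str):
    """Pick the virtual host for a request, or None if no address/port matches."""
    # Step 1: keep only hosts configured for this address and port.
    candidates = [v for v in vhosts if v.ip in (ip, "*") and v.port == port]
    if not candidates:
        return None
    # Step 2: among those, look for a server name equal to the requested host name.
    name = host_header.split(":")[0].lower()
    for vhost in candidates:
        if vhost.server_name.lower() == name:
            return vhost
    # Step 3: no name matched, so fall back to the first candidate.
    return candidates[0]


if __name__ == "__main__":
    # Placeholder sites; neither appears in the slides.
    sites = [
        VirtualHost("*", 80, "www.example.com", "/var/www/html"),
        VirtualHost("*", 80, "docs.example.com", "/var/www/docs"),
    ]
    chosen = select_vhost(sites, "192.168.1.12", 80, "docs.example.com")
    print(chosen.doc_root)
```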
Tux • Tux is a web server that comes with Red Hat. • Tux does not replace Apache; it works with Apache. • Tux runs at the kernel level - very fast. • Tux is used to serve static pages. • Tux forwards more complex requests to Apache • See http://people.redhat.com/mingo/TUX-patches/2.1-docs/index.html
References • Red Hat Fedora and Enterprise Linux 4 Bible, Christopher Negus, 2005 • http://www.w3.org/History.html • Apache – The Definitive Guide, Ben Laurie and Peter Laurie, 2003.