HTTP / Apache
Intro to Apache / HTTP / Modules
Introduction to Apache
• Two thirds of web servers run Apache.
• Excellent tool for WWW hosting.
• May not perform well in some benchmark tests, but it does perform in the field.
• It is stable and heavily tested in different environments and on different platforms.
• It's FREE!
A little bit of History
• Apache is based on NCSA's httpd, a daemon that was first used in the early years of the Internet.
• Rob McCool, creator of the NCSA web server, worked on it until 1994; afterwards no single programmer took his place.
• The server continued to grow in popularity, but incompatibilities between versions began to develop.
History Cont'd
• Eventually, a group of administrators began working together to regain control.
• The project, developed as a series of patches, came to be known as "a patchy server", or the Apache server.
• Today, Apache possesses a level of complexity that easily surpasses that of some operating systems.
• It is not necessary to learn every feature, but a functional understanding is needed.
TCP / IP
• Transmission Control Protocol / Internet Protocol
• The main protocol suite that allows computers on the Internet to communicate.
• Name-to-number translation is possible with the Domain Name System (DNS).
• DNS is a group of distributed servers that exist to translate names into numeric IP addresses.
• Every time the name of a Web site is entered, the name must be resolved to an IP address through a lookup on a DNS server to determine where to send the request.
• Apache relies on TCP/IP for:
  • Communication with Internet browsers
  • A common language between web servers
• TCP port 80 is used to carry the request/response protocol known as HTTP.
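As a minimal sketch of how the port and the DNS name show up in Apache's own configuration, the fragment below uses the standard Listen and ServerName directives. The host name www.example.com is a placeholder, not something from these slides.

```apache
# Accept HTTP connections on the standard TCP port.
Listen 80

# Hypothetical host name; DNS must resolve it to this machine's IP address.
ServerName www.example.com
```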
HTTP
• Hypertext Transfer Protocol: the protocol used to submit and serve requests.
• Uniform Resource Identifier (URI), Uniform Resource Locator (URL), and Uniform Resource Name (URN) are all part of the communication between clients and servers using HTTP.
• Each request and response is preceded by a bit of information about it, contained in a header.
Headers
• The client and the server know they are actually dealing with hypertext when they transfer information back and forth.
• When they send data, they start the transmission with headers.
• A header is a piece of information that tells the client or server what kind of information it is receiving.
• This is how each side knows the difference between plain text, HTML, JPEG images, and so on.
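On the server side, the content type announced in the response headers is usually derived from the file extension. A rough sketch using mod_mime's AddType directive follows; the extensions chosen are only illustrative, and most installations already get these mappings from a mime.types file.

```apache
# Map file extensions to the MIME type sent in the Content-Type header.
AddType text/html  .html .htm
AddType text/plain .txt
AddType image/jpeg .jpg .jpeg
```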
Headers Cont'd
• Apache makes significant use of headers.
• Headers can indicate what kind of Web browser is hitting the Web site.
• Apache can use this information to do specific things based on the browser information contained within the headers.
• More later…
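One way Apache acts on browser information is the BrowserMatch directive from mod_setenvif, which matches the User-Agent request header. The lines below are the kind of workaround entries that shipped in older default httpd.conf files; treat them as a sketch that assumes mod_setenvif is loaded.

```apache
# Disable keep-alive for an old browser known to handle it badly.
BrowserMatch "Mozilla/2" nokeepalive

# Fall back to HTTP/1.0 behavior for a specific early MSIE beta.
BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force-response-1.0
```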
Apache
• On a Web server, a server process listens for incoming requests on a specified port of the network interface and responds appropriately.
• At first, web sites consisted of static web pages in a directory tree, and special functions could be handled by CGI scripts executed on request.
• Today, variations have been developed, making it necessary for developers to rethink and retool the original Web server definition.
Apache Cont'd
• Most users expect a certain level of participation by the server.
• Dynamic content is everywhere.
• Servers must create web pages as well as serve them.
• This places greater demands on the server.
Apache Conventions
• How does Apache work?
• By replicating itself.
• It lets the Apache copy, called a child process, handle the request.
• These child processes are the ones that dole out the content.
• When Apache receives a request, it chooses one of its idle child processes to handle it.
• When the child is done, it returns to an idle state, ready for more work.
• A child can be set to handle a certain number of requests before dying.
• Killing a child process after it handles a number of requests also kills any side effects, for example memory leaks.
• The user can determine the number of children Apache spawns at one time, as well as the life span of each child.
• Normally, a child handles an unlimited number of requests unless a limit is configured.
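The child-process behavior described above maps onto a handful of httpd.conf directives. The sketch below uses the directive names from Apache 1.3 and the 2.x prefork model; the numbers are illustrative, not recommendations. In Apache 2.4, MaxClients and MaxRequestsPerChild were renamed MaxRequestWorkers and MaxConnectionsPerChild, with the old names still accepted.

```apache
# Number of child processes created at startup.
StartServers          5

# Keep between 5 and 10 idle children waiting for requests.
MinSpareServers       5
MaxSpareServers      10

# Upper limit on children serving requests at the same time.
MaxClients          150

# Retire each child after 1000 requests to contain memory leaks.
# A value of 0 means a child never expires.
MaxRequestsPerChild 1000
```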
Apache Configurations
• A good way to learn about Apache is to study how it is configured.
• The first level of configuration occurs during compilation.
• Set at this point:
  • Whether Apache uses Dynamic Shared Objects (DSOs) or statically linked modules
  • If static modules are used, which modules to compile into the server
  • Default locations
  • Layout of Apache's files and directories
httpd.conf
• The next level of configuration is the httpd.conf file.
• After installation, it is a global preferences file that controls all aspects of the server:
  • Who can access the Web server
  • What dynamic modules the server loads
  • Where certain functions occur
  • What kind of information Apache will log
  • The location of the virtual web sites Apache controls
  • What content is allowed
.htaccess file
• The last level of configuration occurs here.
• These are small, local files that contain specific orders that apply to the directories in which they reside.
• httpd.conf contains a great deal of information about Apache's users.
• Comments are marked using #.
• Apache offers different levels of control within httpd.conf, based on the user's needs.
• These levels are called scopes.
• A scope can apply to an entire system, a virtual host, an absolute directory path, or even a single file.
• Wildcards are used with scopes to give greater freedom.
• Containers are used to define each scope.
Four Main Containers

<Directory /path>
    # This container applies to the filesystem location of a directory.
    # For example, /home/user/public_html would
    # point to user's public_html directory.
    Directives
</Directory>

<Location /~*>
    # Location applies to a URL.
    # For example, /~user would point to user's public_html directory.
    Directives
</Location>

<VirtualHost host_name>
    # VirtualHost applies to a virtual web site you are hosting,
    # such as mysite.com.
    Directives
</VirtualHost>

<Files file_name>
    # Applies to a file name, no matter where it is located.
    # If file_name were index.html, then all instances of index.html
    # (no matter what their location) would be affected by your directives.
    Directives
</Files>
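To illustrate wildcards in scopes, here is a small sketch with concrete directives filled in. The paths and the *.bak pattern are hypothetical. The Order/Deny access syntax is the one used through Apache 2.2; Apache 2.4 expresses the same thing with "Require all denied".

```apache
# Applies to every user's public_html directory under /home.
<Directory /home/*/public_html>
    Options Indexes
</Directory>

# Applies to any file ending in .bak, wherever it lives.
<Files "*.bak">
    Order allow,deny
    Deny from all
</Files>
```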
Directives
• A directive is a command given to Apache that controls its behavior.
• Each directive is a single line of text with the following format:
  Directive [option [option]]
• A directive keyword can take more than one option.
• There can be more than one directive in a given scope, but each directive must occur on its own line.
• Directives that do not reside in any container apply to all scopes.
• More on directives later…
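A short sketch of the difference between a directive outside any container and one inside a scope. The e-mail address and directory path are placeholders.

```apache
# Outside any container: these apply server-wide.
ServerAdmin webmaster@example.com
DirectoryIndex index.html

# Inside a container: these apply only to this directory tree.
<Directory /home/user/public_html>
    Options Indexes FollowSymLinks
    # Ignore any .htaccess files found below this directory.
    AllowOverride None
</Directory>
```

Changing AllowOverride to a value such as All lets a .htaccess file in that directory override the settings given here.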
Handlers
• When a user requests a certain file, Apache decides what to do with it.
• If the user accesses a file called hiccup.cgi, Apache must know to execute the .cgi file rather than try to display it as an HTML file.
• Handlers help you direct Apache to the correct helper application.
• Example: add a handler called cgi-script and apply it to all files with an extension of .cgi.
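The cgi-script example above might look like this in httpd.conf. This is a sketch that assumes the CGI module is available; the directory path is hypothetical.

```apache
# Treat every file ending in .cgi as a CGI program.
AddHandler cgi-script .cgi

# CGI execution must also be permitted where the scripts live.
<Directory /var/www/html/scripts>
    Options +ExecCGI
</Directory>
```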
Logging
• Apache logs almost everything that happens to it, from errors to HTTP transactions and even cookies.
• The user can set the level of logging and what is logged in httpd.conf.
• More on logging later…
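As a rough sketch of the logging controls in httpd.conf: the file names below are relative to the server root and are only conventional choices, and the "withcookie" format name is made up for this example.

```apache
# Where errors go, and how severe a message must be to get logged.
ErrorLog logs/error_log
LogLevel warn

# Define a named format (Common Log Format) and log each request with it.
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

# A second format that also records the Cookie request header.
LogFormat "%h %t \"%r\" %>s \"%{Cookie}i\"" withcookie
CustomLog logs/cookie_log withcookie
```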
Modules
• Modules are chunks of software (Perl or C) that allow customization of Apache.
• For instance, for Apache to use Server Side Includes, the user must install that module.
• By comparison, Linux kernel modules might enable support for a network card (NIC), some protocol, or another resource.
• Linux kernel modules can be loaded and unloaded for quick system customization.
• Apache was originally developed with the above concept in mind. It is not quite there yet, but it can easily incorporate dynamic module loading.

Static Versus Dynamic Modules
• In early versions, modules were statically linked into the httpd application during compilation.
• The specific configuration of a server was defined during the configuration process before compilation.
• This method still allowed numerous modules to be developed for customizing the server.
• Since version 1.3, it is possible to compile the server and the modules to support dynamic loading.
• With this option compiled in, the configuration of the server can be changed by simply restarting the server after editing the config file, rather than recompiling the server.
• Today, two modules must be compiled into the server: http_core.c, which provides the core functionality of the server, and mod_so.c, which supports DSO.
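With DSO support compiled in, adding a module comes down to a LoadModule line in httpd.conf followed by a restart. The sketch below assumes Apache 2.x module names and a modules/ directory under the server root; Apache 1.3 installations typically use different identifiers and paths.

```apache
# LoadModule <module identifier> <path to shared object>
LoadModule include_module modules/mod_include.so
LoadModule cgi_module     modules/mod_cgi.so
```

Running `httpd -l` lists the modules that were compiled statically into the server binary, which is a quick way to see what does not need a LoadModule line.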
Static Module Advantages
• It may be desirable to compile other modules into the server so that they are always available, giving a slightly faster response and guaranteeing that a specific version of the module is present at all times.
Static Module Disadvantages
• Compiling a module statically into the server takes up space within the server, even if its features are not used.
• Upgrading the module also requires the user to recompile the server.
DSO Advantages
• The benefits are being able to add or remove a feature, or to update a feature to a newer version, without having to recompile the server.
DSO Disadvantages
• DSO files cause the server to be approximately 20% slower at startup, because the module loader must find all the modules, load them, and resolve relocation symbols.
• However, startup is infrequent.
• On some platforms, performance is 5% slower, because the relative addressing of position-independent code is slower than absolute addressing.
• A few other potential problems apply to platforms other than Linux.
Using DSO Files
• Since the advent of DSO, it has become more common for distributions to include the Apache server with binaries already installed and to make installation of the source code optional.
• This means that the full set of files necessary to compile a new module acquired from another source is not initially present on the system.
• To prevent this problem, the authors developed the APache eXtenSion tool (APXS).
APXS
• A Perl program.
• It is created at installation and gives access to all Apache header files, as well as platform-dependent compiler and linker flags.
• The user is then able to compile Apache modules without the Apache source tree and without struggling with platform-dependent linker and compiler flags.

Using APXS
$ cd /path/to/the_module
$ apxs -c mod_new_module.c
$ apxs -i -a -n foo mod_new_module.so
HTTP
• Broken down: Hypertext is the format of content viewed by a Web browser.
• Transfer Protocol is the means by which Web servers and clients share hypertext.
• Version: HTTP/1.1
• Lynx: www.slcc.edu/lynx/release
• Info: www.lynx-brower.org