Generating Dynamic Content for the Web University of Georgia CSCI 4800/6800
Technologies for generating dynamic content • CGI • Servlets • JSP • Struts • JSF
Web Content Types • Three types of content: • Static • Dynamic • Active
Static Content • defined in text file by page author • remains unchanged until edited
Dynamic content • generated on demand by HTTP server • program on server returns output to client • counters, database searching, search engines, questionnaires, up-to-date info
Active content • executes code on the client computer • user interaction, display updating, remote connections, smart forms
Dynamic Content • Server must be able to execute a program • The program generates the document dynamically • Server programs can be written in any language • Shell scripts, C, C++, Java, Perl, Tcl, PHP, Python, ASP, etc. • Program output returned to web client via HTTP server • Output must be in the form of a static document, labelled with its content type • e.g., Content-type: text/html, image/gif etc. • Some types of content can contain dynamic components • Server needs to recognize dynamic document request • On a per-directory basis, e.g., /cgi-bin/* • Or via file names, e.g., *.jsp
Common Gateway Interface (CGI) • CGI standard defines server-program interaction • Developed at the National Center for Supercomputing Applications (NCSA) • CGI was the first way of generating dynamic content • Based on the Unix shell model • Input passed via stdin and shell environment variables; output returned via stdout • Typically, a special directory is used on the server for CGI programs • /cgi-bin/ • URL selects program to run • http://host/cgi-bin/program
CGI • [Figure: the WWW client sends a request over the internet to the WWW server; the server invokes the CGI program and returns the CGI output to the client as the response]
CGI: Pros and Cons • Pros of CGI: • Simple; suitable for small once-off tasks • Supported by all web servers • Cons of CGI: • Slow; web server forks new process for every request • Parameter decoding tedious
HTML Forms • Dynamic content is often generated in response to HTML forms • Example: • http://www.random.org/nform.html
HTML Forms
<form method="get" action="http://www.random.org/cgi-bin/randbyte">
  <p>Generate <input type="text" name="nbytes"/> random bytes (maximum 16384).</p>
  <p>Format:</p>
  <input type="radio" name="format" value="hex" checked/> Hexadecimal <br/>
  <input type="radio" name="format" value="dec"/> Decimal <br/>
  <input type="radio" name="format" value="oct"/> Octal <br/>
  <input type="radio" name="format" value="bin"/> Binary <br/>
  <input type="radio" name="format" value="file"/> Download to a file <br/>
  <input type="submit" value="Get Bytes"/>
  <input type="reset" value="Reset Form"/>
</form>
HTML Forms and Parameters • Each form field has a name • Fields passed as (name, value) pairs • Names and values separated by ‘=’ • Multiple pairs separated by ‘&’ • e.g., nbytes=256&format=hex • Called the query string • Reserved and non-printable characters are encoded • Space encoded as ‘+’ or ‘%20’ • Any character can be encoded as %XX, where XX is the character’s two-digit ASCII value in hex, e.g., %26 for ‘&’
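Sketch: the encoding rules above, using only the standard java.net.URLEncoder and java.net.URLDecoder classes. The class name QueryStrings and its method names are made up for this illustration; treat it as a minimal sketch, not as a complete form-data implementation.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class QueryStrings {

    // Joins (name, value) pairs with '=' and '&', encoding each part.
    // URLEncoder writes a space as '+' and reserved characters as %XX.
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Splits on '&', then on the first '=', and decodes '+' and %XX escapes.
    // A name may occur more than once, so each name maps to a list of values.
    static Map<String, List<String>> decode(String query) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return out;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            out.computeIfAbsent(URLDecoder.decode(name, StandardCharsets.UTF_8),
                                k -> new ArrayList<>())
               .add(URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return out;
    }
}
```

Decoding each pair only after splitting matters: an encoded ‘&’ (%26) inside a value must not be mistaken for a pair separator.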
HTML Forms and Parameters • With GET requests, the query string is appended to the base URL as follows: • path?querystring • e.g., GET /cgi-bin/randbyte?nbytes=256&format=hex HTTP/1.0 • Query string appears in browser’s URL bar • Query string can be bookmarked • Query string can be embedded in links in web pages
HTML Forms and Parameters • With POST requests, the query string is sent in the optional data field (the body) of the HTTP request • Query length is not constrained by URL length limits • Query string not part of URL • POST requests containing query strings cannot be bookmarked or used as hyperlinks
Comparison • In both cases, the server-side program must decode the data supplied by the client • CGI just gives you the raw query string • Decoding can be tedious • Other approaches to dynamic content generation do this for you: • Example: Java Servlets: • HttpServletRequest.getParameter(name)
More about query strings • Query strings can be generated by an HTML form or written into a fixed URL, e.g., embedded in a page or bookmarked in the browser • Query strings constructed from forms follow the name-value pair format • Otherwise, the format is defined by the programmer • e.g., http://www.eboard.com/show.jsp?234873
Parameter passing with CGI • When invoked with GET, the query string is passed as a shell environment variable called QUERY_STRING • CGI program must evaluate the variable and parse the string • When invoked with POST, the query string is passed through standard input • CGI program must read from stdin and parse the string • In both cases, the CGI program outputs the response to stdout
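Sketch: how a CGI program might pick up the raw query string in either case. QUERY_STRING is named above; REQUEST_METHOD and CONTENT_LENGTH are further variables from the CGI specification that the slides do not mention. The class name CgiInput is made up, and the environment and input stream are passed in as parameters so the logic can be tried outside a web server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class CgiInput {

    // Returns the still-encoded query string for a GET or POST invocation.
    static String rawQuery(Map<String, String> env, InputStream stdin) {
        String method = env.getOrDefault("REQUEST_METHOD", "GET");
        if ("POST".equalsIgnoreCase(method)) {
            // POST: the server supplies the body on stdin;
            // CONTENT_LENGTH gives the number of bytes to read.
            String len = env.getOrDefault("CONTENT_LENGTH", "0").trim();
            int length = len.isEmpty() ? 0 : Integer.parseInt(len);
            byte[] body = new byte[length];
            int read = 0;
            try {
                while (read < length) {
                    int n = stdin.read(body, read, length - read);
                    if (n < 0) {
                        break; // input ended early
                    }
                    read += n;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return new String(body, 0, read, StandardCharsets.US_ASCII);
        }
        // GET: the server places the query string in the environment.
        String query = env.get("QUERY_STRING");
        return query == null ? "" : query;
    }

    public static void main(String[] args) {
        String query = rawQuery(System.getenv(), System.in);
        // Response on stdout: header line, blank line, then the body.
        System.out.print("Content-type: text/plain\r\n\r\n");
        System.out.println("Raw query string: " + query);
    }
}
```

The returned string still has to be split and decoded by the program itself.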
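Simple CGI script example
#!/bin/sh
# Header line, then an empty line to separate the headers from the body
echo "Content-type: text/html"
echo ""
# Body: echo the query string passed in by the server
echo "<html><body><p>"
echo "Your query string was: $QUERY_STRING"
echo "</p></body></html>"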
HTTP and State • Recall that HTTP is stateless • Server maintains no state about clients between successive HTTP requests • Statelessness is an attractive feature, because it makes servers less vulnerable to client failures (and vice versa) • However, state is useful • Maintain a history of previous invocations or visits • Correlate information from several requests • Trace users through a web site
HTTP and State • State can take any form • In HTTP, typically one or more (name, value) pairs • Short-term state can be encoded in a variety of ways: • in URLs sent to the browser (URL rewriting) • in HTML documents served (hidden fields) • in cookies • Long-term state can be encoded: • Kept on the server, e.g., a record of host addresses in a file • Stored in cookies
HTTP State: URL Rewriting • Server stores state in URLs embedded in content • State encoded as GET-style HTTP parameters • Subsequent requests for those URLs will include the parameters • e.g., http://www.random.org/essay.php?id=212 • Server generates content dynamically • All local links in the content (page) are translated by the web server to include the specified state • e.g., parameter (name, value) pair ‘id=212’
URL Rewriting: Example
• The request to www.random.org:
GET /essay.php?id=212 HTTP/1.0
• Could result in the following page:
<html>
<head>...</head>
<body>
<p>There is the <a href="/users.php?id=212">users page</a>,
there is the <a href="/clients/?id=212">client archive</a>
and we here at <a href="/?id=212">random.org</a> are grateful to
<a href="http://www.tcd.ie/">Trinity College</a>...</p>
</body>
</html>
• Note: Only local links are rewritten
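Sketch: one way the link translation in this example could be done. The state parameter is appended to every local href, while absolute links to other sites are left alone. The class UrlRewriter and its methods are made up for illustration, and the regular expression only handles double-quoted href attributes; a real server would more likely build the links while generating the page, or use an HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class UrlRewriter {

    // Matches href="..." with double quotes only (a simplifying assumption).
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    // A link is treated as local if it has no scheme (http:, mailto:, ...)
    // and does not start with "//" (scheme-relative link to another host).
    static boolean isLocal(String url) {
        return !url.matches("(?i)[a-z][a-z0-9+.-]*:.*") && !url.startsWith("//");
    }

    // Appends an already-encoded "name=value" pair, using '?' or '&' as
    // needed and keeping any #fragment at the end of the URL.
    static String addParam(String url, String param) {
        int hash = url.indexOf('#');
        String base = hash < 0 ? url : url.substring(0, hash);
        String fragment = hash < 0 ? "" : url.substring(hash);
        return base + (base.contains("?") ? "&" : "?") + param + fragment;
    }

    // Rewrites every local link in the page to carry the state parameter.
    static String rewrite(String html, String param) {
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String url = m.group(1);
            String target = isLocal(url) ? addParam(url, param) : url;
            m.appendReplacement(out,
                    Matcher.quoteReplacement("href=\"" + target + "\""));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling UrlRewriter.rewrite(page, "id=212") on a page with plain links such as /users.php would produce links like those shown in the example.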
URL Rewriting: Example • With URL rewriting, you need a way of creating the first URL • Typically done via a login procedure using an HTML form • If the server receives a request without parameters, it returns a login form instead of the content. • If the login form is submitted (and details validate), the server returns the first page with rewritten URLs.
URL Rewriting • With URL rewriting, the hyperlinks are personalised • Support for URL rewriting in some technologies for dynamic content generation: • e.g., Java Servlets: HttpServletResponse.encodeURL()
URL Rewriting: Advantages • URL rewriting works just about everywhere, even when cookies are turned off. • Multiple simultaneous sessions are possible for a single user. • Session information is local to each browser instance, since it is stored in URLs in each page being displayed. • Caveat: entirely static pages cannot be used with URL rewriting, since every link must be dynamically written with the session state.
URL Rewriting: Disadvantages • Every URL on a page which needs the session information must be rewritten each time a page is served • Computationally expensive • Can increase communication overhead • State stored in URLs is not persistent • Can make sharing of URLs difficult • URL rewriting limits the client's interaction with the server to HTTP GET requests • Unless used in combination with hidden fields
HTTP State: Hidden Fields • If the content contains forms (e.g., a multi-form questionnaire), state can be saved in the form(s) • Special ‘hidden’ form fields not displayed by the browser • Parameters encoded by the browser in the same way as for ordinary fields
Hidden Fields: Example
• First form:
<form method="post" action="form-handler.php">
  <p>Enter your name: <input type="text" name="user" /> </p>
  <input type="hidden" name="stage" value="1" />
  <input type="submit" value="Next" />
</form>
• Server encodes the state (including values submitted by the user) in the second form:
<form method="post" action="form-handler.php">
  <p>Enter your age: <input type="text" name="age" /> </p>
  <input type="hidden" name="stage" value="2" />
  <input type="hidden" name="user" value="Joe Random" />
  <input type="submit" value="Next" />
</form>
• When at the last stage, all data is processed
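Sketch: how the server side might generate the hidden fields of the second form from the state collected so far. The class HiddenFields and its methods are made up for illustration. Values are HTML-escaped because they originate from user input and are placed inside attribute values.

```java
import java.util.Map;

// Hypothetical helper class, for illustration only.
class HiddenFields {

    // Escapes the characters that are unsafe inside an HTML attribute value.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Renders one hidden input per (name, value) pair of accumulated state.
    static String render(Map<String, String> state) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : state.entrySet()) {
            sb.append("<input type=\"hidden\" name=\"")
              .append(escape(e.getKey()))
              .append("\" value=\"")
              .append(escape(e.getValue()))
              .append("\" />\n");
        }
        return sb.toString();
    }
}
```

Hidden fields come back from the client like any other field, so the server should validate them again on each stage rather than trust that they are unchanged.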
Hidden Fields: Pros and Cons • Pros: • State processing on the server side easier than URL rewriting; hidden fields simply treated as ordinary fields • Supported by all browsers, regardless of user’s (cookie) preferences • Cons: • Requires forms; not suitable for plain links • Others?
HTTP State: Cookies • Cookies are: • Small pieces of information • Sent by web servers to web clients • Stored by the clients • Read back by the server who sent the cookie • Cookies are used to maintain state on the client side
HTTP State: Cookies • Cookies are often used to store: • User IDs and passwords • Info about preferences or start pages • Contents of shopping baskets • But also for: • User tracking within a web site • Building user profiles • Targeted marketing (advertising)
HTTP State: Cookies
• Cookies are set (by the server) via HTTP Response headers
Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN_NAME; secure
• And sent back (by the client) via HTTP Request headers
Cookie: NAME=VALUE; NAME=VALUE; ...
• Date format: DAY, DD-MMM-YYYY HH:MM:SS GMT
• Path format: / separated
• Domain format: hostname.subdomain.tld
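Sketch: splitting a Cookie request header into its NAME=VALUE pairs on the server side. The class name CookieHeader is made up for illustration. Each pair is split on the first ‘=’ only, so a value that itself contains ‘=’ stays intact.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class CookieHeader {

    // Parses the value of a "Cookie:" header, e.g. "a=1; b=2".
    static Map<String, String> parse(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (headerValue == null) {
            return cookies;
        }
        for (String part : headerValue.split(";")) {
            String pair = part.trim();
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or nameless entries
            }
            // Split on the first '=' only; the rest belongs to the value.
            cookies.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return cookies;
    }
}
```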
HTTP State: Cookies • A client will send along a cookie with an HTTP request provided that: • The server host name from the URL matches the domain for the cookie • The path name from the URL matches the path for the cookie • The cookie has not expired • Limitation: • Cookies are bound to the server that originally set them • A cookie is only ever returned to hosts within that server’s domain
Cookies: Example
• First request:
POST /basket-add.php HTTP/1.0
...
uid=12&pid=9828
• Could mean “user 12 adds product 9828 to her shopping basket.”
• Server response:
HTTP/1.0 200 OK
Set-Cookie: basket=uid=12&pid=9828&pid=7884; expires=Wednesday, 23-Nov-2005 14:42:12 GMT; path=/books/; domain=www.ammozon.com; secure
...
<html><body><p>The content of your shopping basket is...</p></body></html>
• 7884 was in the basket already?
Cookies: Example
• Second request:
GET /books/special-offers.php HTTP/1.0
Cookie: basket=uid=12&pid=9828&pid=7884
...
• Third request:
GET / HTTP/1.0
...
• Fourth request:
GET /gnus/ HTTP/1.0
...
• No cookie is sent with the third and fourth requests because the paths ‘/’ and ‘/gnus/’ do not match ‘/books/’
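Sketch: the client-side decision behind these requests, written as three checks (domain, path, expiry). This is a simplified reading of the matching rules: the domain is compared by its tail, the path by its prefix, and the secure flag is ignored. The class CookieMatch, its method names, and the timestamps used to try it out are all made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class CookieMatch {

    // The host matches if it equals the cookie domain or ends with it.
    static boolean domainMatches(String host, String cookieDomain) {
        String h = host.toLowerCase(Locale.ROOT);
        String d = cookieDomain.toLowerCase(Locale.ROOT);
        if (d.startsWith(".")) {
            d = d.substring(1);
        }
        return h.equals(d) || h.endsWith("." + d);
    }

    // The request path matches if it starts with the cookie path.
    static boolean pathMatches(String requestPath, String cookiePath) {
        return requestPath.startsWith(cookiePath);
    }

    // A cookie is sent only if all three conditions hold.
    static boolean shouldSend(String host, String requestPath, long nowMillis,
                              String cookieDomain, String cookiePath,
                              long expiresMillis) {
        return domainMatches(host, cookieDomain)
                && pathMatches(requestPath, cookiePath)
                && nowMillis < expiresMillis;
    }
}
```

With the cookie path /books/, pathMatches accepts /books/special-offers.php and rejects both / and /gnus/, which is the behaviour described in the example.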
Cookies: Pros and Cons • Pros • Highly transparent to user • Avoids server getting clogged with state • Great for personalizing content • Cons • Specific to the computer not the user • Privacy issues
HTTP State: Cookies • Seemingly innocent • Originally designed by Netscape as a simple way of letting users identify themselves • Has many uses • Some uses are less friendly to users • Privacy Issues • Can track every single movement of a user through a web site! • Can be used to analyze (and improve) web sites • But also to build profiles of users • Try surfing with cookie warnings enabled
Dynamic Content • HTML in Code • CGI scripts (any language) • Java Servlets • AOLserver’s Tcl support • Code in HTML • JavaServer Pages (JSP) • Microsoft Active Server Pages (ASP) • PHP: Hypertext Preprocessor (PHP) • AOLserver Dynamic Pages (ADP) • mod_perl • Others?
Java Servlets • Java on the server side • Request/response based API • More efficient than CGI • Loaded once, stays resident • Multiple requests = multiple threads • Java Servlet Development Kit (JSDK) • Java Servlet API Specification, v2.3 • August 2001: Final Version • Implemented by Tomcat 4.x, Jigsaw & others
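Basic Servlet Interaction • [Figure: the web client sends an HTTP request to the web server, which passes it to the servlet through the Servlet API; the servlet's output is returned to the client as the HTTP response]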
Java Servlets • Servlets can be used to extend web servers in a modular fashion • Extra functionality is kept outside the web server core • Increased web server reliability • Increased modularity
Java Servlets • Servlets can use the entire Java language • In particular the Java Database Connectivity (JDBC) API • Standard API means: • Servlets, once written, can be used with any web server implementing the Java Servlet API • Apache, iPlanet, Microsoft IIS, etc. • This is an advantage over some other server-side languages (e.g., ASP), which are (often) bound to a particular server
Servlet Basics • Servlets work with three types of objects • Requests • Responses • Sessions • Request objects • Methods to parse out name/value parameters • HTTP request header fields available
Servlet Basics • Response objects • Can set HTTP response headers, status codes and content • HTTP session objects • Methods to identify requests from same client • Typically implemented with cookies (or URL rewriting) • Unique identifier allocated for each session
Servlet API and Lifecycle • A servlet is an instance of a class implementing the javax.servlet.Servlet interface • Most servlets extend one of two classes • javax.servlet.GenericServlet • javax.servlet.http.HttpServlet • The servlet API includes these methods: • init() is called when the servlet is loaded • service() processes requests (concurrently!) • destroy() is called when the servlet is unloaded
Example Servlet Lifecycle • [Figure: timeline in which init is called once, service is then called repeatedly and concurrently on Thread 1, Thread 2 and Thread 3, and destroy is called once at the end]
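Sketch: the lifecycle imitated in plain Java, without a servlet container. MiniServlet is a made-up stand-in interface, not javax.servlet.Servlet, and the thread pool plays the role of the container's request threads. The point is that one instance serves all requests, so shared state such as the hit counter must be thread-safe.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demonstration class, for illustration only.
class LifecycleDemo {

    // Made-up stand-in for the lifecycle contract; NOT the real Servlet API.
    interface MiniServlet {
        void init();
        void service(String request);
        void destroy();
    }

    static class CountingServlet implements MiniServlet {
        // Shared by all request threads, so it must be thread-safe.
        final AtomicInteger hits = new AtomicInteger();

        public void init() { /* acquire resources once */ }

        public void service(String request) {
            hits.incrementAndGet();
        }

        public void destroy() { /* release resources once */ }
    }

    // Plays the container: init once, many concurrent service calls,
    // destroy once after all requests have finished.
    static int run(int threads, int requests) {
        CountingServlet servlet = new CountingServlet();
        servlet.init();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            final String request = "GET /hello?n=" + i;
            pool.execute(() -> servlet.service(request));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests still running");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        servlet.destroy();
        return servlet.hits.get();
    }
}
```

Replacing the AtomicInteger with a plain int field could lose updates when two threads increment at the same time, which is the practical consequence of service() running concurrently.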
Servlet API • In HttpServlet, the service() method dispatches each request to the method matching its HTTP method • e.g., doGet, doPut, doPost, doDelete • These methods are passed two parameters • One of type HttpServletRequest • One of type HttpServletResponse • The parameters are objects whose methods can be invoked to: • Read info about the HTTP request • Generate the HTTP response • Sessions are maintained via HttpSession objects • Accessed via the HttpServletRequest object
Servlet Example #1
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class Hello extends HttpServlet {

    // Called by service() for HTTP GET requests
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String title = "Snoop Servlet";

        // Request headers; either is null if the client did not send it
        String ua = request.getHeader("User-Agent");
        String ref = request.getHeader("Referer");

        // Set the content type before obtaining the writer
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();

        out.println("<html><head><title>");
        out.println(title);
        out.println("</title></head><body>");
        out.println("<h1>" + title + "</h1>");
        out.println("<p>Hello!</p>");
        out.println("<p>Your browser is " + ua + " and you got here via " + ref + "</p>");
        out.println("</body></html>");
        out.close();
    }
}