Generating Dynamic Content for the Web


  1. Generating Dynamic Content for the Web University of Georgia CSCI 4800/6800

  2. Technologies for generating dynamic content • CGI • Servlets • JSP • Struts • JSF

  3. Web Content Types • Three types of content: • Static • Dynamic • Active

  4. Static Content • Defined in a text file by the page author • Remains unchanged until edited

  5. Dynamic content • Generated on demand by the HTTP server • A program on the server returns its output to the client • Examples: counters, database searching, search engines, questionnaires, up-to-date info

  6. Active content • executes code on the client computer • user interaction, display updating, remote connections, smart forms

  7. Dynamic Content • The server must be able to execute a program • The program generates the document dynamically • Server programs can be written in any language • Shell scripts, C, C++, Java, Perl, Tcl, PHP, Python, ASP, etc. • Program output is returned to the web client via the HTTP server • Output must be in the form of a static page with a declared content type • e.g., Content-type: text/html, image/gif, etc. • Some types of content can contain dynamic components • The server needs to recognize requests for dynamic documents • On a per-directory basis, e.g., /cgi-bin/* • Or via file names, e.g., *.jsp
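
The two recognition rules on this slide (by directory and by file name) can be sketched in a few lines of Java. This is only an illustration: `DynamicRequestMatcher` and its configuration lists are hypothetical names, not part of any real web server.

```java
import java.util.List;

// Hypothetical helper for illustration; real servers configure this differently.
class DynamicRequestMatcher {
    // Assumed configuration, taken from the slide's two examples.
    private static final List<String> DYNAMIC_DIRS = List.of("/cgi-bin/");
    private static final List<String> DYNAMIC_SUFFIXES = List.of(".jsp");

    /** Returns true if the URL path should be handled by a program instead of being served as a file. */
    static boolean isDynamic(String path) {
        // Ignore any query string when classifying the request.
        int q = path.indexOf('?');
        String p = (q >= 0) ? path.substring(0, q) : path;
        for (String dir : DYNAMIC_DIRS) {
            if (p.startsWith(dir)) return true;   // per-directory rule
        }
        for (String suffix : DYNAMIC_SUFFIXES) {
            if (p.endsWith(suffix)) return true;  // file-name rule
        }
        return false;
    }
}
```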

  8. Common Gateway Interface (CGI) • The CGI standard defines the interaction between server and program • Developed at the National Center for Supercomputing Applications (NCSA) • CGI was the first way of generating dynamic content • Based on the Unix shell model • Parameters are passed via stdin and shell environment variables; output is returned via stdout • Typically, a special directory on the server is used for CGI programs • /cgi-bin/ • The URL selects the program to run • http://host/cgi-bin/program

  9. CGI • [Diagram: the WWW client sends a request over the internet to the WWW server; the server invokes the CGI program and returns the CGI output to the client as the response]

  10. CGI: Pros and Cons • Pros of CGI: • Simple; suitable for small one-off tasks • Supported by all web servers • Cons of CGI: • Slow; the web server forks a new process for every request • Parameter decoding is tedious

  11. HTML Forms • Dynamic content is often generated in response to HTML forms • Example: • http://www.random.org/nform.html

  12. HTML Forms
      <form method="get" action="http://www.random.org/cgi-bin/randbyte">
        <p>Generate <input type="text" name="nbytes"/> random bytes (maximum 16384).</p>
        <p>Format:</p>
        <input type="radio" name="format" value="hex" checked/> Hexadecimal <br/>
        <input type="radio" name="format" value="dec"/> Decimal <br/>
        <input type="radio" name="format" value="oct"/> Octal <br/>
        <input type="radio" name="format" value="bin"/> Binary <br/>
        <input type="radio" name="format" value="file"/> Download to a file <br/>
        <input type="submit" value="Get Bytes"/>
        <input type="reset" value="Reset Form"/>
      </form>

  13. HTML Forms and Parameters • Each form field has a name • Fields are passed as (name, value) pairs • Names and values are separated by ‘=’ • Multiple pairs are separated by ‘&’ • e.g., nbytes=256&format=hex • This is called the query string • Reserved and non-printable characters are encoded • A space is encoded as ‘+’ or ‘%20’ • Any character can be encoded as %XX, where XX is the character’s ASCII value as two hex digits, e.g., %26 for ‘&’
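
As a rough illustration of these encoding rules, the sketch below builds a query string from (name, value) pairs with the standard `java.net.URLEncoder`, which encodes a space as `+` and `&` as `%26`. The class name `QueryStringBuilder` is an assumption made for this example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: joins form fields into name=value pairs separated by '&'.
class QueryStringBuilder {
    static String build(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            // Encode names and values separately so that '=' and '&' inside them are escaped.
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }
}
```

Passing the pairs in insertion order (for example with a `LinkedHashMap`) keeps the fields in the same order as the form.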

  14. HTML Forms and Parameters • With GET requests, the query string is appended to the base URL in the form path?querystring • e.g., GET /cgi-bin/randbyte?nbytes=256&format=hex HTTP/1.0 • The query string appears in the browser’s URL bar • The query string can be bookmarked • The query string can be contained in links in web pages

  15. HTML Forms and Parameters • With POST requests, the query string is sent in the optional data field (body) of the HTTP request • HTTP itself places no limit on the query length • The query string is not part of the URL • POST requests containing query strings cannot be bookmarked or used as hyperlinks
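
To make the GET/POST difference concrete, the following sketch writes out the raw request text for the same query string in both styles. `FormRequests` is a hypothetical name; the `Content-Type` and `Content-Length` headers are the usual ones for form posts but do not appear on the slides.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that formats raw HTTP/1.0 requests for illustration.
class FormRequests {
    /** GET: the query string travels in the request line, after '?'. */
    static String get(String path, String query) {
        return "GET " + path + "?" + query + " HTTP/1.0\r\n\r\n";
    }

    /** POST: the query string travels in the request body, after the blank line. */
    static String post(String path, String query) {
        int length = query.getBytes(StandardCharsets.US_ASCII).length;
        return "POST " + path + " HTTP/1.0\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n"
                + query;
    }
}
```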

  16. Comparison …. • In both cases, server side program must decode the data supplied by the client • CGI just gives you the raw query string • Decoding can be tedious • Other approaches to dynamic content generation do this for you: • Example: Java Servlets: • HttpServletRequest.getParameter(name)
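
The decoding that a CGI program has to do by hand might look roughly like the sketch below, which uses only `java.net.URLDecoder` from the standard library. `QueryStringParser` is a hypothetical name, and keeping the first value for a repeated name is a choice made for this example.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns a raw query string into decoded (name, value) pairs.
class QueryStringParser {
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) return params;
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) continue;
            int eq = pair.indexOf('=');
            String name = (eq >= 0) ? pair.substring(0, eq) : pair;
            String value = (eq >= 0) ? pair.substring(eq + 1) : "";
            // Decoding turns '+' into a space and %XX into the corresponding character.
            // A malformed %XX sequence makes URLDecoder throw IllegalArgumentException.
            params.putIfAbsent(URLDecoder.decode(name, StandardCharsets.UTF_8),
                               URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }
}
```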

  17. More about query strings • Query strings can be part of a fixed URL (e.g., embedded in a page or bookmarked by the browser) or constructed from an HTML form • Query strings constructed from forms follow the name-value pair format • Otherwise, the format is defined by the programmer • e.g., http://www.eboard.com/show.jsp?234873

  18. Parameter passing with CGI • When invoked with GET, the query string is passed as a shell environment variable called QUERY_STRING • The CGI program must evaluate the variable and parse the string • When invoked with POST, the query string is passed through standard input • The CGI program must read from stdin and parse the string • In both cases, the CGI program outputs the response to stdout
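
A CGI program following these rules could be sketched as below. The slides only name QUERY_STRING; REQUEST_METHOD and CONTENT_LENGTH are the standard CGI variables usually consulted to choose between the two cases, and `CgiInput` is a hypothetical class name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical CGI program skeleton; the environment is passed in so it can be tested.
class CgiInput {
    static String readQuery(Map<String, String> env, InputStream stdin) {
        String method = env.getOrDefault("REQUEST_METHOD", "GET");
        if ("POST".equalsIgnoreCase(method)) {
            // POST: the query string arrives on standard input, CONTENT_LENGTH bytes long.
            int length = Integer.parseInt(env.getOrDefault("CONTENT_LENGTH", "0"));
            try {
                return new String(stdin.readNBytes(length), StandardCharsets.US_ASCII);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // GET: the query string arrives in the QUERY_STRING environment variable.
        return env.getOrDefault("QUERY_STRING", "");
    }

    public static void main(String[] args) {
        String query = readQuery(System.getenv(), System.in);
        // The response goes to stdout: header, blank line, then the body.
        System.out.print("Content-type: text/plain\r\n\r\n");
        System.out.println("Your query string was: " + query);
    }
}
```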

  19. Simple CGI script example
      #!/bin/sh
      # HTTP header, then a blank line, then the HTML body
      echo "Content-type: text/html"
      echo ""
      echo "<html><body><p>"
      echo "Your query string was: $QUERY_STRING"
      echo "</p></body></html>"

  20. HTTP and State • Recall that HTTP is stateless • The server maintains no state about clients between successive HTTP requests • Statelessness is an attractive feature, because it makes servers less vulnerable to client failures (and vice versa) • However, state is useful • Maintain a history of previous invocations or visits • Correlate information from several requests • Trace users through a web site

  21. HTTP and State • State can take any form • In HTTP, typically one or more (name, value) pairs • Short-term state can be encoded in a variety of ways: • In URLs sent to the browser (URL rewriting) • In the HTML documents served (hidden fields) • In cookies • Long-term state can be kept: • On the server, e.g., as a record of host addresses in a file • On the client, stored in cookies

  22. HTTP State: URL Rewriting • The server stores state in URLs embedded in the content • State is encoded as GET-style HTTP parameters • Subsequent requests for those URLs will include the parameters • e.g., http://www.random.org/essay.php?id=212 • The server generates the content dynamically • All local links in the content (page) are rewritten by the web server to include the specified state • e.g., the parameter (name, value) pair ‘id=212’

  23. URL Rewriting: Example • The request to www.random.org:
      GET /essay.php?id=212 HTTP/1.0
      • Could result in the following page:
      <html>
      <head>...</head>
      <body>
      <p>There is the <a href="/users.php?id=212">users page</a>,
      there is the <a href="/clients/?id=212">client archive</a>
      and we here at <a href="/?id=212">random.org</a> are grateful to
      <a href="http://www.tcd.ie/">Trinity College</a>...</p>
      </body>
      </html>
      • Note: Only local links are rewritten
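
One way a server-side program might perform this rewriting is sketched below with a regular expression over `href` attributes. It is a deliberate simplification (a real implementation would parse the HTML and URL-encode the state), and `LinkRewriter` is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: appends a state parameter to local links only.
class LinkRewriter {
    // Matches href="..." values that start with "/", which this sketch treats as local.
    private static final Pattern LOCAL_HREF = Pattern.compile("href=\"(/[^\"]*)\"");

    /** Assumes name and value are already URL-safe, e.g. "id" and "212". */
    static String rewrite(String html, String name, String value) {
        Matcher m = LOCAL_HREF.matcher(html);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String url = m.group(1);
            // Use '?' for the first parameter, '&' (written as &amp; in HTML) otherwise.
            String sep = url.contains("?") ? "&amp;" : "?";
            String replacement = "href=\"" + url + sep + name + "=" + value + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Absolute links such as http://www.tcd.ie/ do not start with "/" and are therefore left untouched, matching the note on the slide.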

  24. URL Rewriting: Example • With URL rewriting, you need a way of creating the first URL • Typically done via a login procedure using an HTML form • If the server receives a request without parameters, it returns a login form instead of the content • If the login form is submitted (and the details validate), the server returns the first page with rewritten URLs

  25. URL Rewriting • With URL rewriting, the hyperlinks are personalised • Some technologies for dynamic content generation support URL rewriting: • e.g., Java Servlets: HttpServletResponse.encodeURL()

  26. URL Rewriting: Advantages • URL rewriting works just about everywhere, even when cookies are turned off • Multiple simultaneous sessions are possible for a single user • Session information is local to each browser instance, since it is stored in URLs in each page being displayed • Caveat: entirely static pages cannot be used with URL rewriting, since every link must be dynamically written with the session state

  27. URL Rewriting: Disadvantages • Every URL on a page which needs the session information must be rewritten each time a page is served • Computationally expensive • Can increase communication overhead • State stored in URLs is not persistent • Can make sharing of URLs difficult • URL rewriting limits the client's interaction with the server to HTTP GET requests • Unless used in combination with hidden fields

  28. HTTP State: Hidden Fields • If the content contains forms (e.g., a multi-form questionnaire), state can be saved in the form(s) • Special ‘hidden’ form fields are not displayed by the browser • Their parameters are encoded by the browser in the same way as for ordinary fields

  29. Hidden Fields: Example • First form:
      <form method="post" action="form-handler.php">
        <p>Enter your name: <input type="text" name="user" /> </p>
        <input type="hidden" name="stage" value="1" />
        <input type="submit" value="Next" />
      </form>
      • Server encodes the state (including values submitted by the user) in the second form:
      <form method="post" action="form-handler.php">
        <p>Enter your age: <input type="text" name="age" /> </p>
        <input type="hidden" name="stage" value="2" />
        <input type="hidden" name="user" value="Joe Random" />
        <input type="submit" value="Next" />
      </form>
      • When at the last stage, all data is processed
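
A server-side program could generate the hidden inputs of the second form along the lines of the sketch below. `HiddenFieldForm` is a hypothetical name; the HTML escaping is an addition not shown on the slide, included because user-submitted values are echoed back into the page.

```java
import java.util.Map;

// Hypothetical helper: renders accumulated state as hidden form fields.
class HiddenFieldForm {
    /** Escapes the characters that are special inside HTML attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** One hidden input per (name, value) pair, in the map's iteration order. */
    static String hiddenFields(Map<String, String> state) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : state.entrySet()) {
            sb.append("<input type=\"hidden\" name=\"").append(escape(e.getKey()))
              .append("\" value=\"").append(escape(e.getValue())).append("\" />\n");
        }
        return sb.toString();
    }
}
```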

  30. Hidden Fields: Pros and Cons • Pros: • State processing on the server side is easier than with URL rewriting; hidden fields are simply treated as ordinary fields • Supported by all browsers, regardless of the user’s (cookie) preferences • Cons: • Requires forms; not suitable for plain links • Others?

  31. HTTP State: Cookies • Cookies are: • Small pieces of information • Sent by web servers to web clients • Stored by the clients • Read back by the server that sent the cookie • Cookies are used to maintain state on the client side

  32. HTTP State: Cookies • Cookies are often used to store: • User IDs and passwords • Info about preferences or start pages • Contents of shopping baskets • But also for: • User tracking within a web site • Building user profiles • Targeted marketing (advertising)

  33. HTTP State: Cookies • Cookies are set (by the server) via HTTP Response headers Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN_NAME; secure • And sent back (by the client) via HTTP Request headers Cookie: NAME=VALUE; NAME=VALUE; ... • Date format: DAY, DD-MMM-YYYY HH:MM:SS • Path format: / separated • Domain format: hostname.subdomain.tld
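
The two header formats on this slide can be produced and consumed with plain string handling, as in the following sketch. `CookieHeaders` is a hypothetical name, and the code makes no attempt to validate attribute values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for the Set-Cookie (response) and Cookie (request) headers.
class CookieHeaders {
    /** Builds a Set-Cookie header line; null attributes are left out. */
    static String setCookie(String name, String value, String expires,
                            String path, String domain, boolean secure) {
        StringBuilder sb = new StringBuilder("Set-Cookie: ").append(name).append('=').append(value);
        if (expires != null) sb.append("; expires=").append(expires);
        if (path != null) sb.append("; path=").append(path);
        if (domain != null) sb.append("; domain=").append(domain);
        if (secure) sb.append("; secure");
        return sb.toString();
    }

    /** Parses the value of a Cookie request header: NAME=VALUE; NAME=VALUE; ... */
    static Map<String, String> parseCookieHeader(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String part : headerValue.split(";")) {
            String p = part.trim();
            // Split at the first '=' only, since the value may itself contain '='.
            int eq = p.indexOf('=');
            if (eq > 0) cookies.put(p.substring(0, eq), p.substring(eq + 1));
        }
        return cookies;
    }
}
```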

  34. HTTP State: Cookies • A client will send a cookie along with an HTTP request provided that: • The server host name from the URL matches the domain for the cookie • The path name from the URL matches the path for the cookie • The cookie has not expired • Limitation: • Cookies are bound to the server that originally set them • This limits the use of a cookie to that server's domain

  35. Cookies: Example • First request:
      POST /basket-add.php HTTP/1.0
      ...
      uid=12&pid=9828
      • Could mean “user 12 adds product 9828 to her shopping basket.” • Server response:
      HTTP/1.0 200 OK
      Set-Cookie: basket=uid=12&pid=9828&pid=7884; expires=Wed, 23-Nov-2005 14:42:12 GMT; path=/books/; domain=www.ammozon.com; secure
      ...
      <html><body><p>The content of your shopping basket is...</p></body></html>
      • 7884 was in the basket already?

  36. Cookies: Example • Second request:
      GET /books/special-offers.php HTTP/1.0
      Cookie: basket=uid=12&pid=9828&pid=7884
      ...
      • Third request:
      GET / HTTP/1.0
      ...
      • Fourth request:
      GET /gnus/ HTTP/1.0
      ...
      • No cookies are sent with the third and fourth requests because the paths ‘/’ and ‘/gnus/’ do not match ‘/books/’
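
The client-side decision illustrated by these four requests can be sketched as a single predicate. This is a simplified reading of the matching rules (tail match on the domain, prefix match on the path, expiry check); it ignores the `secure` attribute, and `CookieMatcher` is a hypothetical name.

```java
import java.time.Instant;

// Hypothetical helper: decides whether a stored cookie accompanies a request.
class CookieMatcher {
    static boolean shouldSend(String cookieDomain, String cookiePath, Instant expires,
                              String requestHost, String requestPath, Instant now) {
        // Treat ".example.com" and "example.com" alike for the tail match.
        String domain = cookieDomain.startsWith(".") ? cookieDomain.substring(1) : cookieDomain;
        String host = requestHost.toLowerCase();
        boolean domainOk = host.equals(domain.toLowerCase())
                || host.endsWith("." + domain.toLowerCase());
        // Simplified path rule: the cookie path must be a prefix of the request path.
        boolean pathOk = requestPath.startsWith(cookiePath);
        // A cookie without an expiry date is treated as still valid.
        boolean fresh = expires == null || now.isBefore(expires);
        return domainOk && pathOk && fresh;
    }
}
```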

  37. Cookies: Pros and Cons • Pros • Highly transparent to the user • Avoids the server getting clogged with state • Great for personalizing content • Cons • Specific to the computer, not the user • Privacy issues

  38. HTTP State: Cookies • Seemingly innocent • Originally designed by Netscape as a simple way of letting users identify themselves • Have many uses • Some uses are less friendly to users • Privacy issues • Can track every single movement of a user through a web site! • Can be used to analyze (and improve) web sites • But also to build profiles of users • Try surfing with cookie warnings enabled

  39. Dynamic Content • HTML in Code • CGI scripts (any language) • Java Servlets • AOLServer’s TCL support • Code in HTML • JavaServer Pages (JSP) • Microsoft Active Server Pages (ASP) • PHP: Hypertext Preprocessor (PHP) • AOLServer Dynamic Pages (ADP) • mod_perl • Others?

  40. Java Servlets • Java on the server side • Request/response based API • More efficient than CGI • Loaded once, stays resident • Multiple requests = multiple threads • Java Servlet Development Kit (JSDK) • Java Servlet API Specification, v2.3 • August 2001: Final Version • Implemented by Tomcat 4.x, Jigsaw and others

  41. Basic Servlet Interaction

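  42. Basic Servlet Interaction • [Diagram: the web client sends an HTTP request to the web server, which passes it to the servlet through the Servlet API; the HTTP response returns along the same path]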

  43. Java Servlets • Servlets can be used to extend web servers in a modular fashion • Extra functionality is kept outside the web server core • Increased web server reliability • Increased modularity

  44. Java Servlets • Servlets can use the entire Java language • In particular the Java Database Connectivity (JDBC) API • Standard API means: • Servlets, once written, can be used with any web server implementing the Java Servlet API • Apache, iPlanet, Microsoft IIS, etc. • This is an advantage over some other server-side languages (e.g., ASP), which are (often) bound to a particular server

  45. Servlet Basics • Servlets work with three types of objects • Requests • Responses • Sessions • Request objects • Methods to parse out name/value parameters • HTTP request header fields available

  46. Servlet Basics • Response objects • Can set HTTP response headers, status codes and content • HTTP session objects • Methods to identify requests from the same client • Typically implemented with cookies • A unique identifier is allocated for each session
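
The idea behind session objects (a unique identifier per session, carried back by the client in a cookie) can be sketched with the standard library alone. This is not how any particular servlet container implements `HttpSession`; `SessionStore` and its methods are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session table keyed by a unique session identifier.
class SessionStore {
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    /**
     * Returns the id unchanged if it names a known session; otherwise
     * allocates a new session and returns its freshly generated id.
     * The caller would send this id to the client in a cookie.
     */
    String resolve(String idFromCookie) {
        if (idFromCookie != null && sessions.containsKey(idFromCookie)) {
            return idFromCookie;
        }
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    /** Per-session attributes, or null if the id is unknown. */
    Map<String, Object> attributes(String id) {
        return sessions.get(id);
    }
}
```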

  47. Servlet API and Lifecycle • A servlet is an instance of a class implementing the javax.servlet.Servlet interface • Most servlets extend one of two classes • javax.servlet.GenericServlet • javax.servlet.http.HttpServlet • The servlet API includes these methods: • init() is called when the servlet is loaded • service() processes requests (concurrently!) • destroy() is called when the servlet is unloaded

  48. Example Servlet Lifecycle • [Timeline diagram: init is called once, service is then called repeatedly and concurrently on Thread 1, Thread 2 and Thread 3, and destroy is called once at the end]

  49. Servlet API • The service() method dispatches each request to a method chosen by the HTTP method, such as • doGet, doPut, doPost, doDelete • These methods are passed two parameters • One of type HttpServletRequest • One of type HttpServletResponse • The parameters are objects whose methods can be invoked to: • Read info about the HTTP request • Generate the HTTP response • Sessions are maintained via HttpSession objects • Accessed via the HttpServletRequest object
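
The dispatch pattern and the init/service/destroy lifecycle can be imitated in plain Java, without the servlet API on the classpath. `MiniServlet` below is a hypothetical stand-in, not `javax.servlet.http.HttpServlet`; its string results and the "501" fallback are choices made for this sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in that mimics the shape of a servlet for illustration only.
class MiniServlet {
    // service() may run on several threads at once, so shared state must be thread-safe.
    private final AtomicInteger served = new AtomicInteger();
    private volatile boolean initialized;

    void init() { initialized = true; }      // called once when loaded
    void destroy() { initialized = false; }  // called once when unloaded

    /** Dispatches on the HTTP method name, as service() does on the slide. */
    String service(String method) {
        if (!initialized) throw new IllegalStateException("init() has not been called");
        served.incrementAndGet();
        switch (method) {
            case "GET":    return doGet();
            case "POST":   return doPost();
            case "PUT":    return doPut();
            case "DELETE": return doDelete();
            default:       return "501 Not Implemented";
        }
    }

    String doGet()    { return "200 OK (GET)"; }
    String doPost()   { return "200 OK (POST)"; }
    String doPut()    { return "200 OK (PUT)"; }
    String doDelete() { return "200 OK (DELETE)"; }

    int servedCount() { return served.get(); }
}
```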

  50. Servlet Example #1
      import java.io.*;
      import javax.servlet.*;
      import javax.servlet.http.*;

      public class Hello extends HttpServlet {
          public void doGet(HttpServletRequest request, HttpServletResponse response)
                  throws ServletException, IOException {
              String title = "Snoop Servlet";
              // Request headers; getHeader() returns null if a header is absent
              String ua = request.getHeader("User-Agent");
              String ref = request.getHeader("Referer");

              // Set the content type before obtaining the writer
              response.setContentType("text/html");
              PrintWriter out = response.getWriter();
              out.println("<html><head><title>");
              out.println(title);
              out.println("</title></head><body>");
              out.println("<h1>" + title + "</h1>");
              out.println("<p>Hello!</p>");
              out.println("<p>Your browser is " + ua + " and you got here via " + ref + "</p>");
              out.println("</body></html>");
              out.close();
          }
      }
