World Wide Web (WWW) A Distributed Document-Based System

World Wide Web (WWW) A Distributed Document-Based System. Group E Ricky Tong (D-A0-1611) Eddy Leong (D-A0-1623) Dick Lei (D-A0-1658). Schedule of Presentation. Overview of World Wide Web Document Model HTML DOM XML Document Type MIME Architectural Overview Discussion Time.

Presentation Transcript


  1. World Wide Web (WWW): A Distributed Document-Based System Group E Ricky Tong (D-A0-1611) Eddy Leong (D-A0-1623) Dick Lei (D-A0-1658)

  2. Schedule of Presentation • Overview of World Wide Web • Document Model • HTML • DOM • XML • Document Type • MIME • Architectural Overview • Discussion Time

  3. The World Wide Web • The WWW is a document-based system. • It can be viewed as a huge distributed system consisting of millions of clients and servers for accessing linked documents. • Servers maintain collections of documents, while clients provide users with an easy-to-use interface for presenting and accessing those documents.

  4. Overview of World Wide Web • Documents are stored as files on the servers. • Servers receive requests and send the requested files to the clients. • The client usually interacts with the Web server through a browser.

  5. The overall organization of the Web

  6. Document Model • Some documents are represented as plain ASCII text files. • Some are expressed as a collection of scripts that run in the browser automatically. • Some contain references to other documents, such as hyperlinks. • When a link is followed, the new document may replace the current one or open in a new browser window.

  7. HTML • Most Web documents are expressed in HTML. • An HTML file contains small markup tags telling the Web browser how to display the page. • An HTML file conventionally has an .htm or .html extension. • An HTML file can be created with a simple text editor.

  8. Example of HTML <html> <head> <title>Title of page</title> </head> <body> This is my first homepage. <b>This text is bold</b> </body> </html>

  9. Document Object Model • DOM provides a standard programming interface to parsed web documents. • The interface is specified in CORBA IDL. • The interface is used by the scripts embedded in a document. • Scripts can be used to inspect and modify the document that they are part of.
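
The inspect-and-modify idea can be sketched outside a browser with Python's standard `xml.dom.minidom` module, which implements a subset of the DOM interface. This is only an illustration: the page is the one from the HTML example slide, the helper names `retitle` and `bold_text` are hypothetical, and a real embedded script would use the browser's own DOM binding instead.

```python
from xml.dom import minidom

# The page from the HTML example slide. It is well-formed, so the
# XML-based minidom parser accepts it.
PAGE = (
    "<html><head><title>Title of page</title></head>"
    "<body>This is my first homepage. <b>This text is bold</b></body></html>"
)


def retitle(markup, new_title):
    """Parse the markup, replace the <title> text, and serialize it again."""
    doc = minidom.parseString(markup)
    title = doc.getElementsByTagName("title")[0]
    title.firstChild.data = new_title  # modify a text node in place
    return doc.documentElement.toxml()


def bold_text(markup):
    """Inspect the tree: collect the text inside every <b> element."""
    doc = minidom.parseString(markup)
    return [b.firstChild.data for b in doc.getElementsByTagName("b")]
```

Calling `bold_text(PAGE)` inspects the parsed tree, while `retitle(PAGE, "New title")` changes one node and returns the updated markup.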

  10. XML (Extensible Markup Language) • XML is a meta-markup language providing a format for describing structured data • This facilitates more precise declarations of content and more meaningful search results across multiple platforms.

  11. XML Example <?xml version="1.0" ?> <?xml-stylesheet href="greeting.xsl" type="text/xsl"?> <message> <greeting>Hi</greeting> <target>you all</target> </message>
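
As a rough sketch of why structured data is easier to process, the message above can be read with Python's standard `xml.etree.ElementTree` module. The function name `read_message` is an assumption made for this example, and the stylesheet instruction is simply ignored by the parser here.

```python
import xml.etree.ElementTree as ET

# The document from the XML example slide.
MESSAGE = """<?xml version="1.0" ?>
<?xml-stylesheet href="greeting.xsl" type="text/xsl"?>
<message>
  <greeting>Hi</greeting>
  <target>you all</target>
</message>"""


def read_message(xml_text):
    """Return the child elements of <message> as a tag -> text mapping."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```

Because the tags describe the content rather than its appearance, a program can ask for the `target` field directly instead of searching through presentation markup.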

  12. Other Document Types There are many types of documents besides HTML and XML: • Audio: .mp3 • Image: .gif and .jpeg • Others: .pdf, etc.

  13. MIME (Multipurpose Internet Mail Extensions) • It was originally developed to provide information on the content of a message body that was sent as part of E-mail. • It is a specification for enhancing the capabilities of standard Internet E-mail. • It offers a simple standardized way to represent and encode a wide variety of media types for transmission via Internet mail.

  14. The 7 Content-types defined in MIME • Text - represents textual information • Image - transmits still images • Audio - transmits audio or voice data • Video - transmits video or moving image data • Message - encapsulates an entire RFC 822 format message • Multipart - combines several body parts of possibly different types and subtypes • Application - transmits application or binary data

  15. CGI (Common Gateway Interface) • It is a standard for interfacing external applications with information servers, such as HTTP or Web servers. • A CGI program is executed in real time, so it can output dynamic information.
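
A small sketch of three of these content types in use, built with Python's standard `email` package. The subject line, the attachment bytes and the file name `data.bin` are made up for the example.

```python
from email.message import EmailMessage


def build_message():
    """Build a multipart message with a text part and a binary part."""
    msg = EmailMessage()
    msg["Subject"] = "MIME sketch"
    msg.set_content("Plain text body")  # becomes a text/plain part
    msg.add_attachment(                 # becomes an application/octet-stream part
        b"\x00\x01\x02",
        maintype="application",
        subtype="octet-stream",
        filename="data.bin",
    )
    return msg


msg = build_message()
# walk() visits the container first, then each body part in order.
types = [part.get_content_type() for part in msg.walk()]
```

After the attachment is added the message itself is of type Multipart, and its two body parts are of type Text and Application.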
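
A minimal sketch of a CGI-style program, assuming the Web server passes the query string in the `QUERY_STRING` environment variable and reads the response from standard output. The parameter `name` and the function `handle_request` are hypothetical and chosen only for illustration.

```python
import os
from html import escape
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response from the environment the server passes in."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape user input before placing it in the generated page.
    body = "<html><body>Hello, %s!</body></html>" % escape(name)
    # A CGI program writes headers, a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(handle_request(os.environ), end="")
```

The page is generated anew for every request, which is what makes the output dynamic.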

  16. The principle of using server-side CGI programs

  17. Server-side script • A server-side script is executed by the server when the document has been fetched locally. Client-side using JavaScript <script language="JavaScript"> <!-- script code here // --> </script> Server-side using ASP <% ' 'script code here ' %>

  18. Client-side script • A client-side script is simply software designed to be run by the browser.

  19. Applet • It is another method to pass precompiled programs to a client. • An applet is a small Java program embedded in an HTML page. • For security reasons, applets normally cannot read or write data on the client computer. • An applet can only be executed if the browser supports Java.

  20. Servlet • A servlet is a precompiled program that is executed in the address space of the server. • Servlets are Java technology's answer to CGI programming. • They are useful when the Web page is based on data submitted by the user. • They are also useful when the underlying data change frequently.

  21. Architectural details of a client and server in the Web

  22. HTTP Connections • HTTP is a client-server protocol by which two machines can communicate over a TCP/IP connection. • HTTP is the protocol used for document exchange in the World-Wide-Web. • Everything that happens on the web happens over HTTP transactions.
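
To make the "HTTP over a TCP connection" point concrete, here is a sketch that starts a tiny local server from Python's standard library and then fetches a page from it over a plain socket. The handler class, the page body and the helper names are assumptions for the example; the server binds to 127.0.0.1 on a free port.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the example quiet
        pass


def fetch(host, port, path):
    """Send one HTTP/1.0 GET over a TCP connection and return the raw reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch("127.0.0.1", server.server_address[1], "/")
    finally:
        server.shutdown()
        server.server_close()
```

The request and the reply are both plain text sent over the same TCP connection: a status line, header lines, a blank line, and then the document.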

  23. HTTP Headers • General Header Fields (used in both request and response messages) • Request Header Fields (used in request messages only) • Response Header Fields (used in response messages only) • Entity Header Fields (used in both request and response messages, containing information about the entity-body of the message)

  24. Request Header Example GET /articles/news/today.asp HTTP/1.1 Accept: */* Accept-Language: en-us Connection: Keep-Alive Host: localhost Referer: http://localhost/links.asp User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Accept-Encoding: gzip, deflate
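
The structure of such a request can be sketched with a few lines of Python that split the message into its request line and header fields. The function `parse_request` is a hypothetical helper for this example, not a complete HTTP parser.

```python
# The request from the slide, with the CRLF line endings HTTP uses on the wire.
REQUEST = (
    "GET /articles/news/today.asp HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Accept-Language: en-us\r\n"
    "Connection: Keep-Alive\r\n"
    "Host: localhost\r\n"
    "Referer: http://localhost/links.asp\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "\r\n"
)


def parse_request(raw):
    """Return (method, target, version, headers) for a raw request message."""
    head = raw.split("\r\n\r\n", 1)[0]   # everything before the blank line
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers
```

Header names are stored in lower case because HTTP treats them as case-insensitive.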

  25. Response Header Example HTTP/1.1 200 OK Server: Microsoft-IIS/5.0 Date: Thu, 13 Jul 2000 05:46:53 GMT Content-Length: 2291 Content-Type: text/html Set-Cookie: ASPSESSIONIDQQGGGNCG=LKLDFFKCINFLDMFHCBCBMFLJ; path=/ Cache-control: private
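
The four header categories can be illustrated by classifying the fields of the response above. The sets below follow the field lists of the HTTP/1.1 specification (RFC 2616); the function `classify` and the label "other" for fields defined elsewhere, such as Set-Cookie, are choices made for this sketch.

```python
GENERAL = {"cache-control", "connection", "date", "pragma", "trailer",
           "transfer-encoding", "upgrade", "via", "warning"}
RESPONSE = {"accept-ranges", "age", "etag", "location", "proxy-authenticate",
            "retry-after", "server", "vary", "www-authenticate"}
ENTITY = {"allow", "content-encoding", "content-language", "content-length",
          "content-location", "content-md5", "content-range", "content-type",
          "expires", "last-modified"}


def classify(name):
    """Return the header category for a field seen in a response message."""
    key = name.lower()  # header names are case-insensitive
    if key in GENERAL:
        return "general"
    if key in RESPONSE:
        return "response"
    if key in ENTITY:
        return "entity"
    return "other"


FIELDS = ["Server", "Date", "Content-Length", "Content-Type",
          "Set-Cookie", "Cache-control"]
categories = {name: classify(name) for name in FIELDS}
```

In this response, Server is a response header, Date and Cache-control are general headers, and the two Content fields are entity headers.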

  26. Web Server • A Web server uses the client/server model and the WWW Hypertext Transfer Protocol (HTTP). • Every computer on the Internet that hosts a Web site must run a Web server program. • Two leading Web servers are Apache, the most widely installed Web server, and Microsoft's Internet Information Server (IIS).

  27. Apache Server

  28. Processing HTTP Requests in Apache Server • 1. Resolving the document reference to a local file name. • 2. Client authentication. • 3. Client access control. • 4. Request access control. • 5. MIME type determination of the response. • 6. General phase for handling leftovers. • 7. Transmission of the response. • 8. Logging data on the processing of the request.
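
The phased processing can be sketched as a chain of functions that each handle one step of a request. This is not Apache's actual module API: the document root, the blocked-client list and all function names are hypothetical, and only four of the eight phases are modeled, plus logging.

```python
import mimetypes
import posixpath

DOC_ROOT = "/var/www"            # hypothetical document root
BLOCKED_CLIENTS = {"10.0.0.66"}  # hypothetical access-control list


def resolve_name(req):
    """Phase 1: map the document reference to a local file name."""
    req["file"] = posixpath.join(DOC_ROOT, req["path"].lstrip("/"))


def access_control(req):
    """Phase 3: refuse clients that are on the blocked list."""
    if req["client"] in BLOCKED_CLIENTS:
        req["status"] = 403


def determine_mime_type(req):
    """Phase 5: pick the MIME type of the response from the file name."""
    guessed = mimetypes.guess_type(req["file"])[0]
    req["mime"] = guessed or "application/octet-stream"


def transmit(req):
    """Phase 7: stand-in for sending the response."""
    req.setdefault("status", 200)


PHASES = [resolve_name, access_control, determine_mime_type, transmit]


def process(req, log):
    for phase in PHASES:
        phase(req)
        if req.get("status", 200) >= 400:
            break  # a failed phase short-circuits the remaining ones
    # Phase 8: logging runs whether or not the request succeeded.
    log.append((req["client"], req["path"], req["status"]))
    return req
```

Keeping each phase as a separate function mirrors the idea that independent modules can hook into individual steps of request processing.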

  29. Server Cluster

  30. The principle of TCP handoff

  31. Scalable content-aware cluster of web servers

  32. Uniform Resource Identifiers (URI) • A URI (Uniform Resource Identifier) is a way to identify a point of content. • The most common form of URI is the Web page address. • A URI typically describes: (1) the mechanism used to access the resource, (2) the specific computer on which the resource is housed, and (3) the specific name of the resource (a file name) on that computer.
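
These three parts can be picked out of a Web address with Python's standard `urllib.parse` module. The function `describe` and its dictionary keys are names invented for this sketch; the address is the one that appears in the Referer field of the request header example.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Split a URI into the three parts listed on the slide."""
    parts = urlsplit(uri)
    return {
        "mechanism": parts.scheme,    # how to access the resource
        "computer": parts.hostname,   # where the resource is housed
        "resource": parts.path,       # its name on that computer
    }


page = describe("http://localhost/links.asp")
```

Here the mechanism is `http`, the computer is `localhost`, and the resource name is `/links.asp`.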

  33. Uniform Resource Locator (URL) • A URL contains information on how and where to access a document.

  34. Uniform Resource Name (URN) • A URN names an Internet resource with a name that has persistent significance. • A URN looks something like a Web page address or URL. • Example: urn:def://blue_laser • Both URN and URL are types of a concept called the URI. • The URN is still being developed by members of the Internet Engineering Task Force (IETF).

  35. Web Distributed Authoring and Versioning (WebDAV) • WebDAV is an extension to HTTP. • WebDAV provides a simple means to lock a shared document, and to create, delete, copy, and move documents on remote Web servers. • WebDAV supports a simple locking mechanism. • There are two types of write locks: the exclusive write lock and the shared write lock.

  36. Web Proxy Caching • Simple caching facility of the browser • Web-proxy caching • Caches that cover a region or even a country, leading to hierarchical caching.
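
The difference between the two write locks can be sketched with a small in-memory lock table. This illustrates only the locking rules, not the WebDAV protocol itself: the class `LockTable`, its methods and the owner names are hypothetical.

```python
class LockTable:
    """Tracks which owners hold a write lock on each resource."""

    def __init__(self):
        self._locks = {}  # resource -> (kind, set of owners)

    def acquire(self, resource, owner, kind):
        """Try to take an 'exclusive' or 'shared' write lock."""
        if kind not in ("exclusive", "shared"):
            raise ValueError("unknown lock kind: %r" % kind)
        held = self._locks.get(resource)
        if held is None:
            self._locks[resource] = (kind, {owner})
            return True
        held_kind, owners = held
        if kind == "shared" and held_kind == "shared":
            owners.add(owner)  # shared locks may have several holders
            return True
        return False           # any other combination conflicts

    def release(self, resource, owner):
        held = self._locks.get(resource)
        if held and owner in held[1]:
            held[1].discard(owner)
            if not held[1]:
                del self._locks[resource]

    def may_write(self, resource, owner):
        """Unlocked resources are writable; locked ones only by lock holders."""
        held = self._locks.get(resource)
        return held is None or owner in held[1]
```

An exclusive lock admits a single writer, while a shared lock lets several cooperating authors hold the lock at once and leaves coordination of their writes to them.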
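
A minimal sketch of hierarchical caching, assuming each cache asks its parent on a miss and the top-level cache asks the origin server. The class `Cache` and the level names are hypothetical, and the sketch leaves out expiry and validation, which a real proxy needs.

```python
class Cache:
    """One level in a cache hierarchy (browser, proxy, regional cache)."""

    def __init__(self, name, parent=None, origin=None):
        self.name = name
        self.parent = parent  # next cache up the hierarchy, if any
        self.origin = origin  # function that fetches from the origin server
        self.store = {}

    def get(self, url):
        """Return (document, name of the level that served it)."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            doc, source = self.parent.get(url)
        else:
            doc, source = self.origin(url), "origin"
        self.store[url] = doc  # keep a copy for later requests
        return doc, source


fetched = []


def origin_server(url):
    fetched.append(url)  # record every request that reaches the origin
    return "doc:" + url


regional = Cache("regional", origin=origin_server)
proxy_a = Cache("proxy-a", parent=regional)
proxy_b = Cache("proxy-b", parent=regional)
browser = Cache("browser", parent=proxy_a)
```

Once one client has fetched a document, a client behind a different proxy is served from the shared regional cache and the origin server is not contacted again.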

  37. Neighbor Proxy Caching

  38. Server Replication • Fault tolerance in the Web is mainly achieved through client-side caching and server replication. • High availability in the Web is achieved through redundancy that makes use of generally available techniques in crucial services such as DNS.

  39. Security • Most of the security issues in the Web deal with setting up a secure channel between a client and a server. • The predominant approach for setting up a secure channel in the Web is to use the Secure Sockets Layer (SSL). • Transport Layer Security (TLS) is an update of SSL.
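
As a sketch of the client side of such a secure channel, Python's standard `ssl` module can prepare a context that authenticates the server. The function name `make_client_context` and the choice of TLS 1.2 as the minimum version are assumptions for this example.

```python
import ssl


def make_client_context():
    """Create a client-side TLS context that authenticates the server."""
    # The default context verifies the server certificate and host name.
    ctx = ssl.create_default_context()
    # Refuse SSL and early TLS versions (a choice made for this sketch).
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
```

A TCP socket is then upgraded with `ctx.wrap_socket(sock, server_hostname=host)`, after which HTTP is spoken over the encrypted channel.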

  40. The position of TLS in the Internet protocol stack

  41. TLS with mutual authentication

  42. The End
