HTML and HTTP Based on Chapter 32 in Computer Networks and Internets, Comer (Third Edition)
Hypertext • HTML stands for HyperText Markup Language and HTTP stands for HyperText Transfer Protocol, which raises the question: what is hypertext? • Hypertext is “a method of storing data through a computer program that allows a user to create and link fields of information at will and to retrieve the data non-sequentially.” (Webster’s) • A hyperlink is a region of one document (page) that, when clicked, brings up another document for the user. • The idea of hypertext was developed by Ted Nelson in the 1960s.
URL • The “resources” (data or program files) are located on many computers across an internet or the Internet, which makes this a “distributed” system. • The location of a resource is given by its URL (Uniform Resource Locator). • Example: http://www.lasalle.edu:1234/it/fake.htm#attach
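To make the parts of a URL concrete, the sketch below splits the example URL from this slide with Python's standard urllib.parse module. The URL itself is only the illustration used above; whether such a host, port or file actually exists is not assumed.

```python
from urllib.parse import urlsplit

# Example URL from the slide; it is illustrative and is never contacted here.
url = "http://www.lasalle.edu:1234/it/fake.htm#attach"
parts = urlsplit(url)

print(parts.scheme)    # protocol used to retrieve the resource
print(parts.hostname)  # computer on which the resource is located
print(parts.port)      # port on which that computer's server listens
print(parts.path)      # location of the file on that server
print(parts.fragment)  # named position inside the document
```

Only the scheme, host, port and path are used to locate the resource; the fragment after # is interpreted by the browser once the document has arrived.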
Browser • Hypertext is generally viewed in a web browser, an application used to locate web pages (linked or otherwise) and display them. • Some browsers, such as Lynx, display only text. • But when most people think of browsers they think of Netscape Navigator and/or Microsoft Internet Explorer, which support more than just text.
Hypermedia • Modern browsers link information in non-textual formats (graphics, sound, video, etc.) and so are “multimedia” or “hypermedia” programs. • The browser may need a plug-in to support some formats. A plug-in adds a particular feature or service to a larger system. • Browser plug-ins are selected based on MIME types.
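As a rough sketch of how a MIME type can decide what handles a file, the code below guesses the type from the file name with Python's standard mimetypes module and then looks it up in a small table. The PLUGINS table and the handler descriptions are hypothetical; a real browser keeps its own registry and normally relies on the type announced by the server.

```python
import mimetypes

# Hypothetical plug-in table keyed by MIME type; the entries are made up.
PLUGINS = {
    "application/pdf": "pdf-viewer plug-in",
    "audio/mpeg": "audio-player plug-in",
}

def choose_handler(filename):
    """Describe what would display the named file."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return "unknown type: offer to save the file"
    # Text and images are assumed to be handled by the browser itself.
    if mime_type.startswith("text/") or mime_type.startswith("image/"):
        return "browser displays " + mime_type + " itself"
    return PLUGINS.get(mime_type, "no plug-in installed for " + mime_type)

print(choose_handler("index.html"))
print(choose_handler("paper.pdf"))
```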
Mosaic • The first widely used multimedia browser was Mosaic. • Marc Andreessen is credited with initiating the development of Mosaic. • Mosaic moved the Internet out of the realm of academics and computer hobbyists by making it accessible to a much more general audience. • It helped the Internet maintain its exponential growth in number of users.
Fig. 2.1: Computers connected to the Internet vs. Year (annotation on plot: Mosaic)
Mosaic (Cont.) • Andreessen started Mosaic while working for the National Center for Supercomputing Applications (NCSA) at the University of Illinois. • Andreessen helped found Netscape Communications, which was originally called Mosaic Communications. • Mosaic is distinct from Netscape. In fact, Mosaic is also licensed for commercial use and is provided to users by some Internet access providers.
HTML • Browsers interpret web documents, especially HTML documents • HyperText Markup Language is an “authoring” scheme for creating documents for the World Wide Web. • The World Wide Web (WWW) is the collection of resources available through HTTP to users on the Internet.
Markup • The M in HTML stands for “Markup” • Markup refers to the sequence of characters (or symbols) inserted in a document to indicate how the file should look when it is printed or displayed and/or to describe the document's logical structure. • The markup indicators are often called "tags."
Tags • These formatting instructions must be distinguishable from the text they are in. • In HTML, angle brackets < and > are used as delimiters to indicate the beginning and end of a tag. • For example, <b>bold</b> gives bold type. • As with the byte stuffing we saw in Ethernet frames (where soh and eot were special characters), angle brackets that are meant as ordinary text must be replaced in an HTML document with &lt; and &gt;
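The replacement of angle brackets can be tried out with Python's standard html module. This is only a sketch of the idea; the sample sentence is made up.

```python
import html

# Text that talks about tags, so its angle brackets must not be read as markup.
text = "Use <b> for bold & <i> for italics"

escaped = html.escape(text)       # < becomes &lt;, > becomes &gt;, & becomes &amp;
restored = html.unescape(escaped)  # what the browser shows to the reader

print(escaped)
print(restored)
```

The ampersand has to be escaped as well, because it is the character that introduces the replacements.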
Tags (Cont.) • The formatting or structure the tag indicates often refers to an entire region, so many HTML tags occur in pairs (leading and trailing). The trailing tag includes a slash. • An HTML document begins with an <HTML> tag and ends with an </HTML> tag. • An HTML document is broken into two pieces: the head and the body. • The head is the part between the head tags <head> and </head> • The body is the part between the body tags <body> and </body>
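The head/body structure can be sketched by feeding a tiny page to Python's standard html.parser and recording which section each piece of text falls in. The page content is made up, and the parser below assumes every tag is properly paired, which real pages do not always guarantee.

```python
from html.parser import HTMLParser

# A minimal, made-up document with the paired tags described above.
PAGE = ("<html><head><title>Page from my site</title></head>"
        "<body><p>Hello</p></body></html>")

class SectionParser(HTMLParser):
    """Sort text into the head or the body of the document."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags opened but not yet closed
        self.sections = {"head": [], "body": []}

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # The trailing tag (with the slash) closes the most recent leading tag.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        for section in ("head", "body"):
            if section in self.open_tags:
                self.sections[section].append(data)

parser = SectionParser()
parser.feed(PAGE)
parser.close()
print(parser.sections)
```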
Page from my site A space
HTML (Cont.) • There are many other tags used to format and lay out the information in a Web page. • For instance, <P> is used to make paragraphs and <I> … </I> is used to italicize text. • Tags are also used to specify hypertext links. • <a href="http://www.lasalle.edu">La Salle</a> • HTML is not the only markup language.
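A link tag carries two things: the destination (the href attribute) and the text the user clicks. The sketch below pulls both out of the example link with Python's standard html.parser; the surrounding sentence is made up.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (destination, clickable text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # destination of the link currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None

collector = LinkCollector()
collector.feed('<p>Visit <a href="http://www.lasalle.edu">La Salle</a> today.</p>')
collector.close()
print(collector.links)
```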
SGML • HTML has similarities to SGML, Standard Generalized Markup Language, a generic system for organizing and tagging elements of a document. • GML was started by IBM and became SGML when it was taken over by the International Organization for Standardization (ISO). • SGML is not about formatting; it is more general. SGML provides rules for tagging elements. • Those tags might be interpreted as formatting, as is done in HTML, but can be interpreted in other ways as well.
cHTML • Compact HTML is a reduced set of HTML used for hand-held and other devices with limited CPU, memory, storage and so on. • The display has limited color, no jpg files and the user moves and selects with “buttons” instead of a mouse. • XHTML has been taking its place.
XML • Extensible Markup Language • “Extensible” means capable of being extended, and markup language involves tags, so XML is a scheme in which the user can define his or her own tags. • For example, a company may elect to designate a social security number by placing it in tags defined for that purpose • <ssn>123456789</ssn> • This data can be transported from application to application and from system to system, and it carries a self-identifying tag with it.
XML (Cont.) • Unlike HTML tags, XML tags are not necessarily about formatting and presentation. • However, a presentation application can be instructed to represent a certain type of data (as identified by its XML tags) in a particular way. • On the other hand, a database interface program can be instructed to place the information into the appropriate field.
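The two uses described above can be sketched with Python's standard xml.etree.ElementTree. The employee record, the name in it and both helper functions are hypothetical; only the <ssn> tag comes from the earlier slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; only the <ssn> element is taken from the slides.
record = ET.fromstring(
    "<employee><name>Pat Doe</name><ssn>123456789</ssn></employee>"
)

def present(ssn):
    """A presentation program might display the number with dashes."""
    return f"{ssn[:3]}-{ssn[3:5]}-{ssn[5:]}"

def to_database_row(element):
    """A database interface might map each tag to a field of the same name."""
    return {child.tag: child.text for child in element}

print(present(record.findtext("ssn")))
print(to_database_row(record))
```

The same tagged data serves both programs; neither meaning is built into the tag itself.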
XHTML • Extensible Hypertext Markup Language is a mixture of HTML and XML designed for network display devices. • XHTML is written in XML; therefore, it is an XML application.
CFML • ColdFusion Markup Language is a proprietary markup language developed by Allaire for use with ColdFusion. • CFML is a tag-based scripting language supporting dynamic Web page creation and database access. • ColdFusion tags are embedded in HTML files. The HTML tags determine the page's layout while the CFML tags import content based on user input or the results of a database query. • Files created with CFML have the file extension .cfm
DOM • The Document Object Model is a set of specifications concerning how web-page objects (such as text, images, textboxes, buttons) are represented and how scripts can access and change them. • The DOM defines the attributes and events associated with each object, and so forth. • Dynamic HTML (DHTML) uses the DOM to alter the appearance of Web pages dynamically after they have been downloaded (client-side).
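The idea of changing a page through its objects can be sketched with xml.dom.minidom, a small DOM implementation in Python's standard library. This is an analogy only: in a browser the same kind of calls would be made by a client-side script, and the page fragment and its id are made up.

```python
from xml.dom.minidom import parseString

# A made-up, well-formed page fragment loaded into a tree of objects.
doc = parseString(
    '<html><body><p id="greeting" style="color: black">Hello</p></body></html>'
)

# Find the paragraph object, then change one attribute and its text.
paragraph = doc.getElementsByTagName("p")[0]
paragraph.setAttribute("style", "color: red")
paragraph.firstChild.data = "Hello again"

result = paragraph.toxml()
print(result)
```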
DOM (Cont.) • Alas, Netscape Navigator and Microsoft Internet Explorer use different DOMs. • This is why their implementations of DHTML are so different. • Both companies have submitted their DOMs to the World Wide Web Consortium (W3C) for standardization.
HTTP • HTML and other web documents are transported across the network using HTTP (Hypertext Transfer Protocol), originally developed by Dr. Tim Berners-Lee. • HTTP defines rules for how messages are formatted and transmitted, what actions are allowed by Web servers, what actions are allowed by clients, etc.
HTTP (Cont.) • A Web server has an HTTP daemon that waits for HTTP requests and handles them when they arrive. • A Web browser is an HTTP client, sending requests to server machines. • For example, entering a URL in the location field of a browser (client) sends an HTTP request to the appropriate Web server, which responds with the page.
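The request/response exchange can be sketched without any network at all by building the messages as bytes. The code below is a simplified stand-in for a browser and an HTTP daemon, not a real server: the DOCUMENTS table and both function names are hypothetical, and only the GET method and three status codes are handled.

```python
# Hypothetical documents held by the server, keyed by path.
DOCUMENTS = {"/index.html": b"<html><body>La Salle</body></html>"}

def build_request(host, path):
    """What a browser sends after the user enters a URL."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")

def handle_request(raw):
    """What the daemon does when a request arrives."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split(" ")
    if method != "GET":
        return b"HTTP/1.1 405 Method Not Allowed\r\nContent-Length: 0\r\n\r\n"
    body = DOCUMENTS.get(path)
    if body is None:
        return b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    header = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n\r\n"
    )
    return header.encode("ascii") + body

response = handle_request(build_request("www.lasalle.edu", "/index.html"))
print(response.decode("ascii"))
```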
HTTP (Cont.) • HTTP is a stateless protocol because each command is executed independently, without any knowledge of the commands that came before it. • This is good for keeping transmission lines available, since there are no ongoing sessions tying up resources. • This is bad for having a web site respond in an intelligent way to a user. • This shortcoming of HTTP is addressed in a variety of ways, including ActiveX, Java, JavaScript and cookies.
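Of the remedies listed, cookies are the easiest to sketch. In the example below each call to respond() stands for one independent request; the only thing linking two requests is the cookie the client sends back. The respond() function, the visit_counts table and the way session ids are made up are all hypothetical; the header handling uses Python's standard http.cookies module.

```python
from http.cookies import SimpleCookie

visit_counts = {}  # server-side memory, keyed by session id

def respond(cookie_header):
    """Handle one request; return (greeting, Set-Cookie header or None)."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if "session_id" in jar:
        # Returning visitor: the cookie identifies the earlier requests.
        session_id = jar["session_id"].value
        set_cookie = None
    else:
        # New visitor: hand out an id (hypothetical numbering scheme).
        session_id = "s" + str(len(visit_counts) + 1)
        reply = SimpleCookie()
        reply["session_id"] = session_id
        set_cookie = reply.output()
    visit_counts[session_id] = visit_counts.get(session_id, 0) + 1
    return f"visit number {visit_counts[session_id]}", set_cookie

first, header = respond(None)             # no cookie yet
second, _ = respond("session_id=s1")      # browser returns the cookie
print(first, header, second, sep="\n")
```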
HTTP 1.1 • Most modern browsers support HTTP 1.1 • Instead of opening and closing a connection for each application request, HTTP 1.1 provides a persistent connection that allows multiple requests to be batched or pipelined to an output buffer. • The underlying TCP layer can put multiple requests (and responses to requests) into one TCP segment. • Fewer segments, less overhead.
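Batching requests into one output buffer can be sketched as plain byte concatenation. The paths below are hypothetical, nothing is sent over a network, and how the bytes are finally divided into TCP segments is decided by the TCP layer, not by this code.

```python
def build_get(host, path):
    """Format one HTTP 1.1 GET request."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")

# Hypothetical resources needed to display one page.
paths = ["/it/fake.htm", "/it/logo.gif", "/it/style.css"]

# All three requests are written to one buffer for one persistent connection,
# instead of opening and closing a connection for each of them.
output_buffer = b"".join(build_get("www.lasalle.edu", p) for p in paths)

print(len(output_buffer), "bytes ready to send on a single connection")
```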
HTTP 1.1 (Cont.) • Compression: If a browser (client) indicates in its request that it can decompress HTML files, then a server may compress them for transport across the Internet. • Standard image files are already in a compressed format, so this improvement applies mainly to HTML and other non-image data types.
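A minimal sketch of this negotiation, assuming the client announces what it can decompress in an Accept-Encoding header and the server uses gzip. The encode_body() function is hypothetical and ignores details such as quality values; the page content is made up.

```python
import gzip

def encode_body(body, accept_encoding):
    """Compress the body only if the client said it can handle gzip."""
    offered = [token.split(";")[0].strip() for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}

# Repetitive HTML compresses well; the content is invented for the example.
page = b"<html><body>" + b"<p>La Salle</p>" * 200 + b"</body></html>"

compressed, headers = encode_body(page, "gzip, deflate")
plain, no_headers = encode_body(page, "")

print(len(page), "bytes before,", len(compressed), "bytes after compression")
```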
S-HTTP • Secure HTTP is an extension to the HTTP protocol for sending data securely over the Web. • Not all browsers and servers support S-HTTP. • Another technology for secure communications over the Web is Secure Sockets Layer (SSL). • SSL and S-HTTP have different designs and goals. SSL is designed to establish a secure connection between two computers, whereas S-HTTP is designed to send individual messages securely.
Cache • To increase speed, browsers cache web page documents locally. • There are also cache servers, machines on the local network that cache web page documents. • First, the page is looked for on the local machine, then on the local network (cache server) and then at the remote location.
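The lookup order can be sketched with three dictionaries standing in for the browser's cache, the cache server and the remote site. The fetch() function and the page content are hypothetical, and real caches also check whether a stored copy is still fresh, which is left out here.

```python
def fetch(url, browser_cache, cache_server, origin):
    """Return (page, where it was found), checking the nearest copy first."""
    if url in browser_cache:
        return browser_cache[url], "browser cache"
    if url in cache_server:
        browser_cache[url] = cache_server[url]  # keep a local copy
        return browser_cache[url], "cache server"
    page = origin[url]          # last resort: the remote location
    cache_server[url] = page    # remember it on the local network
    browser_cache[url] = page   # and on the local machine
    return page, "remote server"

url = "http://www.lasalle.edu/"
origin = {url: "<html>La Salle</html>"}  # stand-in for the remote site
cache_server = {}
my_browser, other_browser = {}, {}

_, first = fetch(url, my_browser, cache_server, origin)
_, second = fetch(url, my_browser, cache_server, origin)
_, third = fetch(url, other_browser, cache_server, origin)
print(first, second, third, sep=", ")
```

The third request comes from a different machine on the same network, so it misses its own cache but is still answered locally by the cache server.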
Other References • http://www.webopedia.com • http://www.whatis.com