XHTML Steven Pemberton CWI, Amsterdam Chair, W3C HTML Working Group
Overview • History • Philosophy • XML and related technologies • XHTML 1.0 • Modularisation • XHTML Basic • XHTML 1.1 • The Future
HTML 1 • The original HTML was designed in the early 1990s for scientific reports • Each document was a single resource (there was not even an <IMG> element) • (This explains much about HTTP, by the way)
(HTML 1) • It is amazing how much we have been able to do with a language with such beginnings • It was described using SGML
HTML as an SGML Application • SGML: an international standard since 1986 • It is a meta-language that describes data formats, using DTDs (Document Type Definitions) • Describes structure, not presentation, for example: <H1>HTML as SGML Application</H1>
Example of a DTD fragment
<!ELEMENT table (caption?, (col*|colgroup*), thead?, tfoot?, (tbody+|tr+))>
<!ELEMENT caption %Inline;>
<!ELEMENT thead (tr)+>
...
Attributes
<!ATTLIST TABLE
  %attrs;                      -- %coreattrs, %i18n, %events --
  summary   %Text;    #IMPLIED
  width     %Length;  #IMPLIED
  border    %Pixels;  #IMPLIED
  …
>
Entities
<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">
<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">
<!ENTITY % Length "CDATA" -- nn for pixels or nn% for percentage length -->
Problems with SGML • Arcane syntax • Very difficult to implement fully • No support for types
Changes to HTML • Netscape and Microsoft started adding to HTML: mostly presentation-oriented tags (like <BLINK>, <CENTER>), and frames • The World Wide Web Consortium (W3C) started an effort to: • Keep HTML pure • Do presentation via style sheets
Separating content and presentation • HTML was designed as a data-structuring language, but the later changes undermined this. • Separating content from presentation has distinct advantages
For the author • Easier to write your documents • Easier to change your documents • Easy to change the look of your documents • Access to professional designs • Your documents are smaller • Visible on more devices • Visible to more people
For the webmaster • Separation of concerns • Simpler HTML, less training • Cheaper to produce, easier to manage • Easy to change house style • Reach more people • Search engines find your stuff easier • Visible on more devices
For the reader • Faster download (one of the top 4 reasons for liking a site) • Easier to find information • You can actually read the information if you are sight-impaired • Information more accessible • You can use more devices
For the implementor • Improves the implementation (separation of concerns) • Can produce smaller browsers
Changes to HTML (2) • Another change that Netscape made, with insufficient thought, was frames • Frames create significant problems with web pages
The problems with frames • Can’t bookmark framesets • [Back] does odd things • [Page up] and [page down] work oddly • [Reload] often doesn’t work right • Security is compromised • Nested frames are hard to deal with (how do you get out?)
What frames can do • Search and show interfaces • Keeping script variables in a hidden frame
Style languages • The first thing W3C did was to start an activity on style sheets (Nov 1995) • This initially produced CSS1 (Dec 1996), then CSS2 (May 1998); CSS3 is in preparation • It later produced XSL, an XML-based language, as a complement to CSS
CSS • CSS is a separate language from HTML that allows you to specify how an HTML document, or set of documents, should look • Separates content from presentation • HTML can be a structure language again
Examples of CSS
h1 { font-weight: bold; font-size: 2em }
h2 { font-weight: bold; font-size: 1.5em }
em { background-color: yellow }
body { margin-left: 20% }
Using CSS • Use the following at the top of an XML document: <?xml-stylesheet type="text/css" href="mystyle.css"?> • Or this in the <head> of an HTML document: <link rel="stylesheet" type="text/css" href="mystyle.css" />
Advantages of CSS • Makes HTML easier to write (and read) • You can define a house style • Compatible: you can still see the content on non-CSS browsers • Pages are much smaller • Accessible to sight-impaired • ...
By the way... • Check your logs: more than 95% of people browsing now use a CSS-enabled browser • The current generation of browsers (IE 5, NS 6, Opera 4) has excellent support for CSS • You never need to use the <FONT> element (or its FACE attribute) again!
Documents • As mentioned, HTML was designed for just one sort of document (scientific reports), but is now being used for all sorts of different documents • You could use SGML to define other sorts of document, but SGML is notoriously hard to fully implement • Enter XML
Enter XML • XML is a W3C effort to simplify SGML • It is a meta-language: a language for defining languages • It is a subset of SGML • One of the aims is to allow everyone to invent their own tags • DTD is optional: a DTD can be inferred from a document
Consequences • The requirement of being able to infer a DTD from a document affects the languages you can define: • Closing tags are now required: <LI>....</LI> <P>....</P> • Empty tags are marked specially: <IMG SRC="pic.gif"/> <BR/> <HR/> (or <HR></HR> etc.)
Consequences 2 • CDATA sections must be marked as such (only necessary if they contain “<”, “&” etc.): <SCRIPT> <![CDATA[ ... script content ... ]]> </SCRIPT>
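A minimal sketch of why the CDATA wrapper matters, using Python's standard xml.etree.ElementTree parser as a stand-in for any XML parser. The script text and the helper name parses are made up for illustration; they are not from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing the characters "<" and "&".
SCRIPT_BODY = "if (a < b && c) { run(); }"


def parses(markup: str) -> bool:
    """Return True if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


bare = f"<script>{SCRIPT_BODY}</script>"
wrapped = f"<script><![CDATA[{SCRIPT_BODY}]]></script>"

print(parses(bare))     # False: the raw "<" and "&" are read as markup
print(parses(wrapped))  # True: the CDATA section shields them

# The parser hands back the script text unchanged.
print(ET.fromstring(wrapped).text == SCRIPT_BODY)  # True
```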
Not like this:
<H1>XML</H1>
An underlying problem with HTML is that ...
<P>
You could use SGML to define ...
But like this:
<H1>XML</H1>
<P>An underlying problem with HTML is that …</P>
<P>You could use SGML to define ...</P>
By the way: <P> is not like <BR>
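The closing-tag and empty-tag rules can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree; the sample strings are shortened stand-ins for the slide text, and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML style: <p> used as a separator and never closed.
html_style = "<body><h1>XML</h1>An underlying problem <p> You could use SGML</body>"
# XML style: each <p> wraps its paragraph and is closed.
xml_style = "<body><h1>XML</h1><p>An underlying problem</p><p>You could use SGML</p></body>"

# Empty elements need the trailing slash (or an explicit end tag).
empty_html = "<p>line one<br>line two</p>"
empty_xml = "<p>line one<br/>line two</p>"

for sample in (html_style, xml_style, empty_html, empty_xml):
    print(is_well_formed(sample))
# Prints False, True, False, True
```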
Consequence of XML • Anyone can now design their own (Web-delivered) languages • CSS makes them viewable
<address>
  <name>Steven Pemberton</name>
  <company>CWI</company>
  <street>Kruislaan 413</street>
  <postcode>1098 SJ</postcode>
  <city>Amsterdam</city>
  <speaker/>
</address>
So do we still need HTML? • Workshop in May 1998 • XML is still a meta-language • There is still a perceived need for a base-line mark-up • HTML has some useful semantics, both implied and explicit (search engines gladly use it, for instance)
HTML as XML application • Clean up (get rid of historical flotsam) • Modularise – split into separate parts • Allows other XML applications to use parts • Allows special purpose devices to use subset • Add any required new functionality (forms, better event handling, Ruby)
The HTML Working group • International membership, around 20 members • Many major players (IBM, Microsoft, Netscape, etc) • Meets weekly by phone, quarterly face-to-face
Group experience • There was more to be worked out than we anticipated • XHTML is the first major application of XML, so the world’s eyes are on us • XML still needs the wrinkles ironed out
Philosophy of XHTML • Transition from ‘old world’ to XML • Clean up the language • Return to structure only • Use generic XML as much as possible • Modularise • Address wider needs (International, Accessibility) • Add new functionality
Plan of action • HTML 4.01: corrected version • XHTML 1.0: transitional version of HTML 4.01 in 3 flavours • Modularisation: agreement on split and methodology • XHTML Basic: Small devices • XHTML 1.1: clean version of 1.0 strict
(plan of action) • Events: accessible and device-independent • Ruby: needed Asian markup • Forms: more control • XHTML 2.0: Putting it all together
Differences HTML:XHTML • Because of the differences between SGML and XML, some changes are necessary, for instance: • Use lower case: <p> not <P> • Attribute values are always quoted: <th colspan="2"> • Anchors use the id attribute, not name (and not just on <a>, by the way): <a id="index"> <p id="top">
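Two of these differences are plain XML well-formedness rules, so a generic parser already enforces them. A small sketch with Python's standard xml.etree.ElementTree follows; the sample fragments and the helper name are illustrative assumptions. Note that a consistently upper-case <P>...</P> is still well-formed XML: the lower-case requirement comes from the XHTML DTDs, which declare lower-case names, and this parser does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


checks = {
    "mixed-case tags": "<p>text</P>",             # names are case-sensitive
    "unquoted attribute": "<th colspan=2>x</th>",   # values must be quoted
    "lower case, quoted": '<th colspan="2">x</th>',
}

results = {label: is_well_formed(markup) for label, markup in checks.items()}
print(results)
# {'mixed-case tags': False, 'unquoted attribute': False, 'lower case, quoted': True}
```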
Example XHTML 1.0
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head><title>Virtual Library</title></head>
  <body>
    <p>Moved to <a href="http://vlib.org/">vlib.org</a>.</p>
  </body>
</html>
Namespaces • Namespaces have been added to XML to allow you to mix fragments from different languages (e.g. HTML + Maths) • In the same way that object-oriented languages allow you to identify which function you are using, namespaces allow you to identify which tags you are using.
Example of nesting
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>A Math Example</title></head>
  <body>
    <p>The following is MathML markup:</p>
    <math xmlns="http://www.w3.org/TR/REC-MathML">
      <apply><log/><logbase><cn> 3 </cn></logbase> <ci> x </ci></apply>
    </math>
  </body>
</html>
Example of colonising
<math xmlns="http://www.w3.org/TR/REC-MathML"
      xmlns:html="http://www.w3.org/1999/xhtml">
  <apply><log/><logbase><cn> 3 </cn></logbase> <ci> x </ci></apply>
  <html:p>This is a paragraph</html:p>
</math>
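To see what a namespace-aware processor makes of the nesting and colonising styles, here is a rough sketch using Python's standard xml.etree.ElementTree, which reports each element name as "{namespace-URI}local-name". The documents are trimmed versions of the two examples; the MathML URI is simply the one used on the slides.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/TR/REC-MathML"  # URI as written on the slides

# Nesting: each language sets the default namespace for its own subtree.
nested = f"""<html xmlns="{XHTML}">
  <body>
    <p>The following is MathML markup:</p>
    <math xmlns="{MATHML}"><apply><log/><ci> x </ci></apply></math>
  </body>
</html>"""

# Colonising: one default namespace, plus a prefix for the guest language.
colonised = f"""<math xmlns="{MATHML}" xmlns:html="{XHTML}">
  <html:p>This is a paragraph</html:p>
</math>"""

nested_tags = [el.tag for el in ET.fromstring(nested).iter()]
for tag in nested_tags:
    print(tag)

# Either way, a paragraph ends up with the same expanded name.
prefixed_p = ET.fromstring(colonised)[0]
print(prefixed_p.tag == f"{{{XHTML}}}p")  # True
```

The prefix itself (html: here) is only a local shorthand; what identifies the element is the pair of namespace URI and local name.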
Namespaced attributes • Attributes normally come from the element itself: <html:a href="next.xml"> • But you may also use ‘global’ attributes from a namespace: <pointer html:href="x.xml"> <music style="classical" html:style="color: red">Beethoven’s 5th</music>
XML ‘namespace’ • XML also has its own pseudo-namespace for reserved attributes: <para xml:lang="en">
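The difference between an element's own attributes, 'global' namespaced attributes, and the reserved xml: prefix shows up clearly in a namespace-aware parser. This is a sketch with Python's standard xml.etree.ElementTree, reusing the <music> and <para> fragments from the slides; the element text "Hello" is made up.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
# Namespace URI that the reserved "xml" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

music = ET.fromstring(
    f'<music xmlns:html="{XHTML}" style="classical" '
    'html:style="color: red">Beethoven\'s 5th</music>'
)

# The unprefixed attribute belongs to <music> itself ...
print(music.get("style"))                # classical
# ... while the prefixed one is a separate, namespace-qualified attribute.
print(music.get(f"{{{XHTML}}}style"))    # color: red

# xml:lang needs no xmlns:xml declaration; the prefix is predefined.
para = ET.fromstring('<para xml:lang="en">Hello</para>')
print(para.get(f"{{{XML_NS}}}lang"))     # en
```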
Using ‘generic’ XML • Presentation: use CSS • Links: use XLink or Schemas • Forms: use CSS? • Images etc.: use XLink or Schemas • (Natural) language of elements: use the xml:lang attribute
XLink? • HTML has several ‘built-in’ hyperlinks: <a>, <img>, <object>, <link>, etc. • Since XML allows you to define your own elements, a browser doesn’t know which are links • XLink was started to solve this problem.
XLink • XLink started as a method of describing which attributes of an element were a link • It later changed into a language of links, so it could no longer be used to describe XHTML • The current plan is to introduce types into Schemas to describe links
Example of XLink
<crossReference xmlns:xlink="http://www.w3.org/1999/xlink"
    xlink:type="simple"
    xlink:href="students.xml"
    xlink:role="studentlist"
    xlink:title="Student List"
    xlink:show="new"
    xlink:actuate="onRequest">
  Current List of Students
</crossReference>
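A rough sketch of how a generic processor could find the links in a vocabulary it has never seen, which is the problem XLink set out to solve: it looks only at the xlink: attributes, not at element names. It uses Python's standard xml.etree.ElementTree; the <course> and <title> wrapper elements and the function name simple_links are invented for the example.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical document in a made-up vocabulary; only the
# <crossReference> element comes from the slide.
doc = f"""<course xmlns:xlink="{XLINK}">
  <title>Intro</title>
  <crossReference xlink:type="simple" xlink:href="students.xml"
                  xlink:title="Student List" xlink:show="new"
                  xlink:actuate="onRequest">Current List of Students</crossReference>
</course>"""


def simple_links(root):
    """Yield (element name, href) for every element marked as a simple link."""
    for el in root.iter():
        if el.get(f"{{{XLINK}}}type") == "simple":
            yield el.tag, el.get(f"{{{XLINK}}}href")


links = list(simple_links(ET.fromstring(doc)))
print(links)  # [('crossReference', 'students.xml')]
```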
Schemas • Schemas are a new technology intended to replace much of what DTDs do • Schemas are expressed in XML • They have support for data types • Much easier to parse and implement than DTDs
Schemas: but • They don’t support the definition of entities (such as &eacute;) • Not easy to read (or write)