A Tour of XML SNU IDB Lab.
Table of Contents
• What is XML?
• The Origin of XML
• Elements and Attributes
• What is the DTD?
• Hypertext Links
• Document Formatting
What is XML?
• An acronym for ‘eXtensible Markup Language’
• A meta-language that describes other languages
• A data format for storing structured and semi-structured text for dissemination and ultimate publication, perhaps on a variety of media
• Properties
  • tags enclose identifiable parts of the document
  • self-describing
  • physical/logical structure
    • physical structure : allows the document to be divided into components, called entities
    • logical structure : allows a document to be divided into named units and sub-units, called elements
What is XML?
[Figure: physical structure (a document made up of internal and separate entities) beside logical structure (units and sub-units, i.e. elements)]
What is XML?
XML markup
<warning>
  <para> This substance is hazardous to health </para>
  <para> See procedure 12A.7 for information on protective clothing required. </para>
  <logo …/>
</warning>
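As a rough illustration of how tags enclose identifiable parts of a document, the warning fragment above can be loaded with Python's standard `xml.etree.ElementTree` module. This is only a sketch: the slide elides the attributes of `<logo …/>`, so a bare `<logo/>` stands in for it here, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The warning fragment from the slide. The attributes of <logo …/> are elided
# in the source, so an attribute-less <logo/> is used as a stand-in.
WARNING_XML = """
<warning>
  <para>This substance is hazardous to health</para>
  <para>See procedure 12A.7 for information on protective clothing required.</para>
  <logo/>
</warning>
"""

root = ET.fromstring(WARNING_XML)

# Each tag names the part of the document that it encloses.
child_tags = [child.tag for child in root]
first_para = root.find("para").text

print(root.tag)      # name of the outermost element
print(child_tags)    # names of the parts it encloses
print(first_para)    # text of the first paragraph
```

Because the markup is self-describing, the program can find the paragraphs by name without knowing anything else about the document.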
What is XML?
• DTD (Document Type Definition)
  • defines the elements allowed in a particular type of document
  • a parser uses it to check the validity of documents
• Style sheet
  • used to specify an output format for each element
The Origin of XML
[Figure: timeline from 1960 to 1997. GM (Generalized Markup) and the Internet, 1960; SGML, 1986; HTML and the WWW, 1992; XML, 1997]
Applications
• Data exchange applications
  • identified domain : XML-EDI
  • general meta-data part : MCF, XML-Data, RDF
• Document publishing applications
Applications
[Figure: layered diagram. Uses: interactive publishing, Web publishing, page layout, complex document layout, simple document layout. Technologies: XLL, XSL, HTTP, XML, CSS, SPDL, TCP/IP, ASCII / ISO 10646 / Unicode]
Elements
• An element consists of a start tag, an end tag, and data
  • e.g.) Are you going to <name>Scarborough</name> fair?
• Element names are case-sensitive
• Some hierarchical structures may be recursive (an element may contain elements of its own type, such as a list inside a list)
Elements
• Content types
  • element content
    • an element that does not directly contain text, but contains other elements
  • mixed content
    • an element that contains a mixture of elements and text
  • data content
    • an element that happens to contain only text
  • empty element
    • an element that has no content; its declaration may forbid it from containing data
Elements
<section>
  <p> … </p>
  <p> This paragraph contains an <em>emphasized phrase</em> in the middle. </p>
  <p> This paragraph contains a figure <fig …/> here. </p>
  <list> … </list>
</section>
• element content : <section>
• mixed content : the <p> that contains <em>
• data content : <em>
• empty element : <fig …/>
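The four content types can be told apart mechanically. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `content_type` is made up for illustration, the elided parts of the slide's example are left out, and whitespace-only text is treated as "no text", which is a simplifying assumption rather than a rule taken from the slides.

```python
import xml.etree.ElementTree as ET

def content_type(elem):
    """Classify an element by what it directly contains (illustrative helper)."""
    has_children = len(elem) > 0
    # Text may appear before the first child (elem.text) or after any child (child.tail).
    pieces = [elem.text] + [child.tail for child in elem]
    # Assumption: whitespace-only text does not count as text.
    has_text = any(piece and piece.strip() for piece in pieces)
    if has_children and has_text:
        return "mixed content"
    if has_children:
        return "element content"
    if has_text:
        return "data content"
    return "empty element"

SECTION_XML = """
<section>
  <p>This paragraph contains an <em>emphasized phrase</em> in the middle.</p>
  <p>This paragraph contains a figure <fig/> here.</p>
</section>
"""

section = ET.fromstring(SECTION_XML)
first_p, second_p = section[0], section[1]

for elem in (section, first_p, first_p[0], second_p[0]):
    print(elem.tag, "->", content_type(elem))
```

Note that this inspects one document instance only. Whether an element is required to be empty, or merely happens to be, is something only its declaration in a DTD can say.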
Attributes
• Provide refined information about an element
• Embedded in the element start-tag
• Consist of an attribute name and an attribute value
  • the value is enclosed in quotes
  • name and value are case-sensitive
Reserved Attributes
• Languages
  • ‘xml:lang’ is reserved for storage of both language and country details
    • e.g.) <para xml:lang="en"> … </para>
  • a sub-code specifies a country code
    • e.g.) <instruction xml:lang="en-GB"> … </instruction>
• Significant spaces
  • ‘xml:space’ is reserved for distinguishing space characters in elements that contain other elements from spaces in elements that contain text
  • possible values : ‘default’, ‘preserve’
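A short sketch of reading the reserved attributes from a program, using Python's standard `xml.etree.ElementTree`. That module reports attributes with the reserved `xml:` prefix under the fixed namespace URI `http://www.w3.org/XML/1998/namespace`. The `<manual>` wrapper and the sample text are invented for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the reserved xml: prefix through its fixed namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG = f"{{{XML_NS}}}lang"
SPACE = f"{{{XML_NS}}}space"

# Hypothetical document: <manual> and the text are made up for illustration.
DOC = """
<manual>
  <para xml:lang="en">Hello</para>
  <instruction xml:lang="en-GB" xml:space="preserve">  Mind   the gap  </instruction>
</manual>
"""

root = ET.fromstring(DOC)
para = root.find("para")
instruction = root.find("instruction")

# Split "en-GB" into the language code and the country sub-code.
language, _, country = instruction.get(LANG).partition("-")

print(para.get(LANG))
print(language, country)
print(instruction.get(SPACE))
```

The parser hands the text over unchanged in either case; `xml:space` is a signal to the application about whether those spaces matter.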
Declarations
• Contain instructions to the XML processor
• Delimited by ‘<!’ and ‘>’
• Types of declarations
  • document type declaration
    • e.g.) <!DOCTYPE MyBook>
  • comments
    • e.g.) <!-- This is a comment -->
  • character data sections
    • e.g.) <![CDATA[Press the <<<Enter>>> button.]]>
  • XML declaration (delimited by ‘<?’ and ‘?>’ instead)
    • e.g.) <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
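To see how a processor treats these constructs, the sketch below assembles the slide's examples into one small document and parses it with Python's standard `xml.etree.ElementTree`. The `<tip>` element is invented to hold the character data section; everything else comes from the examples above.

```python
import xml.etree.ElementTree as ET

# <tip> is a made-up element; the other constructs are the slide's examples.
DOC = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    "<!DOCTYPE MyBook>\n"
    "<MyBook>\n"
    "  <!-- This is a comment -->\n"
    "  <tip><![CDATA[Press the <<<Enter>>> button.]]></tip>\n"
    "</MyBook>"
)

# Bytes are passed in so that the parser applies the declared encoding itself.
root = ET.fromstring(DOC.encode("utf-8"))

# Inside a CDATA section, '<' and '>' are plain characters, not markup.
tip_text = root.find("tip").text
children = [child.tag for child in root]

print(tip_text)
print(children)   # the comment is not kept in the tree by the default parser
```

Without the CDATA section, the same text would have to be written with escapes such as `&lt;` and `&gt;`.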
Concepts of DTD
• DTD (Document Type Definition)
  • An optional but powerful feature of XML
  • Comprises a set of declarations that define a document structure tree
  • Some XML processors read the DTD and use it to build the document model in memory
  • Establishes formal document structure rules
    • It defines the elements and dictates where they may be applied in relation to each other
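As a small sketch of a processor reading DTD declarations, Python's standard `xml.parsers.expat` module reports each element type declaration through its `ElementDeclHandler` callback. The concert-poster DTD below is invented for illustration, and expat only reads the declarations here; it does not validate the document against them.

```python
import xml.parsers.expat

# Hypothetical DTD in the internal subset: a poster has a title and a date.
DOC = """<?xml version="1.0"?>
<!DOCTYPE poster [
  <!ELEMENT poster (title, date)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT date (#PCDATA)>
]>
<poster><title>Spring Concert</title><date>1 May</date></poster>
"""

declared = []

def on_element_decl(name, model):
    # Called once per <!ELEMENT ...> declaration; the content model is ignored here.
    declared.append(name)

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)

print(declared)   # element types declared by the DTD, in declaration order
```

A validating processor would go one step further and compare each element in the document with the content model it was declared with.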
Concepts of DTD
[Figure: Well-formed XML document; Valid XML document]
• Declare vs. Define
  • Declare : “This document is a concert poster”
  • Define : “A concert poster must have the following features”
• A DTD defines
  • Element types + Attributes + Entities
• Valid vs. Invalid
  • Valid : conforms to the DTD
  • Invalid : fails to conform to the DTD
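The distinction between well-formed and valid can be illustrated from the well-formedness side with Python's standard `xml.etree.ElementTree`, which rejects documents that break XML syntax but does not check them against a DTD. The helper name `is_well_formed` and the sample strings are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as XML (illustrative helper)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

good = "<poster><title>Spring Concert</title></poster>"
bad = "<poster><title>Spring Concert</poster>"   # <title> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

A document that passes this check may still be invalid. Deciding validity requires a validating parser that compares the document with the rules in its DTD, which this module does not do.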
Hypertext Links (XLL)
• Terminology
  • resource
    • target object
  • linking element
    • source
  • traversal
    • the act of moving from the linking element to the resource
Hypertext Links (XLL)
• Simple link & Extended link
  • simple link
    • the primitive one-directional linking scheme, but it makes it possible to traverse links between documents
  • extended link
    • resources can be cross-related
    • an extended link contains a number of locator elements, each of which points to a resource
Hypertext Links (XLL)
• Attributes in the linking element can influence
  • the means by which a link can be activated
    • a link could be activated by the user (‘user’ link)
    • or directly by the application (‘auto’ link)
  • the presentation technique required once it has been activated
    • the application may jump to the specified resource (‘replace’)
    • display the resource in another window (‘new’)
    • insert the resource into the original text (‘embed’)
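The way these attribute values steer an application can be sketched as a simple lookup. The slides give only the values (‘user’, ‘auto’, ‘replace’, ‘new’, ‘embed’), so the attribute names `href`, `actuate` and `show`, the `<link>` element, the file name and the defaults used below are all assumptions made for illustration, not the syntax of any particular XLL draft.

```python
import xml.etree.ElementTree as ET

# How the link is activated (values from the slide).
ACTUATE = {
    "user": "wait for the user to activate the link",
    "auto": "activate the link directly",
}

# How the resource is presented once the link is activated (values from the slide).
SHOW = {
    "replace": "jump to the resource",
    "new": "display the resource in another window",
    "embed": "insert the resource into the original text",
}

def describe_link(elem):
    """Return (target, activation, presentation) for a linking element."""
    # Attribute names and defaults are hypothetical.
    href = elem.get("href")
    actuate = elem.get("actuate", "user")
    show = elem.get("show", "replace")
    return href, ACTUATE[actuate], SHOW[show]

link = ET.fromstring('<link href="chapter2.xml" actuate="auto" show="embed">Next</link>')
plain = ET.fromstring('<link href="chapter2.xml">Next</link>')

print(describe_link(link))
print(describe_link(plain))
```

Keeping the two choices in separate tables mirrors the slide: how a link is activated and how its resource is presented are independent decisions.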