XML Introducing XML
What is XML?
• XML is the Extensible Markup Language. It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification.
• It is called extensible because it is not a fixed format like HTML (a single, predefined markup language). Instead, XML is actually a "metalanguage" (a language for describing other languages), which lets you design your own customized markup languages for limitless different types of documents. XML can do this because it is written in SGML, the international standard metalanguage for text markup systems (ISO 8879).
Why not just carry on extending HTML?
• HTML is already overburdened with dozens of interesting but incompatible inventions from different manufacturers, because it provides only one way of describing your information.
• XML allows groups of people or organizations to create their own customized markup applications for exchanging information in their domain (music, chemistry, electronics, hill-walking, finance, surfing, petroleum geology, linguistics, cooking, knitting, stellar cartography, history, engineering, rabbit-keeping, mathematics, genealogy, etc.).
• HTML is at the limit of its usefulness as a way of describing information, and while it will continue to play an important role for the content it currently represents, many new applications require a more robust and flexible infrastructure.
XML Parsers
• An XML processor (also called an XML parser) evaluates the document to make sure it conforms to all XML specifications for structure and syntax.
• XML parsers are strict. It is this rigidity built into XML that ensures XML code accepted by the parser will work the same everywhere.
• Microsoft's parser is called MSXML and is built into Internet Explorer 5.0 and above.
• Netscape 6.0 and above, which is based on the Mozilla codebase, also includes a built-in XML parser.
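The strictness described above can be observed with any conforming parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which is not one of the parsers named on this slide; the helper name is_well_formed and the sample strings are illustrative assumptions. Note that this parser checks well-formedness only and does not validate against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative inputs (assumptions, not taken from a real document).
good = "<Artist>Miles Davis</Artist>"
bad_case = "<Artist>Miles Davis</artist>"   # closing tag differs in case
bad_unclosed = "<Artist>Miles Davis"        # closing tag is missing

for sample in (good, bad_case, bad_unclosed):
    print(sample, "->", is_well_formed(sample))
```

Unlike a typical HTML browser, the parser does not try to repair the two faulty inputs; it rejects them outright.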
Creating an XML Document
• There are two categories of XML documents:
– Well-formed
– Valid
• An XML document is well-formed if it contains no syntax errors and fulfills all of the specifications for XML code as defined by the W3C.
• An XML document is valid if it is well-formed and also satisfies the rules laid out in the DTD or schema attached to the document.
The Structure of an XML Document
XML documents consist of three parts:
– The prolog
– The document body
– The epilog
• The prolog is optional and provides information about the document.
• The document body contains the document's content in a hierarchical tree structure.
• The epilog is also optional and contains any final comments or processing instructions.
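As a rough illustration of the three parts, the sketch below parses a small document with Python's standard xml.dom.minidom module and lists the nodes found at the top level. The document content and the catalog element name are invented for the example.

```python
from xml.dom import minidom

# Hypothetical document: a prolog (declaration + comment), a body
# (the root element), and an epilog (a trailing comment).
document = """<?xml version="1.0"?>
<!-- prolog: a comment before the root element -->
<catalog>
  <Artist>Miles Davis</Artist>
</catalog>
<!-- epilog: a comment after the root element -->"""

doc = minidom.parseString(document)

# Collect the names of the top-level nodes, skipping any whitespace text.
top_level = [
    node.nodeName
    for node in doc.childNodes
    if node.nodeType != node.TEXT_NODE
]

print(top_level)
print("root element:", doc.documentElement.tagName)
```

The XML declaration itself does not appear as a node; the comment before the root belongs to the prolog, the single root element is the body, and the comment after it is the epilog.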
The XML Declaration
The XML declaration at the beginning of an XML document is optional, but it is the best way to say "this is definitely an XML document and here is the version of XML it conforms to." The following is typical:
    <?xml version="1.0"?>
Elements and Attributes
XML supports two types of elements:
– Closed elements
– Empty elements
• A closed element has the following syntax: <element_name>Content</element_name>
• Example: <Artist>Miles Davis</Artist>
• An empty element has no content and is written as a single tag: <element_name/>
• An attribute is a feature or characteristic of an element. Attribute values are text strings and must be placed in single or double quotes. The syntax is: <element_name attribute="value"> … </element_name>
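To see how a program reads these constructs, here is a short sketch using Python's standard xml.etree.ElementTree module. The Album and Bonus elements and the instrument attribute are made up for illustration; only the Artist element comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet: one closed element with an attribute
# and one empty element.
snippet = """<Album>
  <Artist instrument="trumpet">Miles Davis</Artist>
  <Bonus/>
</Album>"""

album = ET.fromstring(snippet)
artist = album.find("Artist")
bonus = album.find("Bonus")

print(artist.tag)                 # element name
print(artist.text)                # content of the closed element
print(artist.get("instrument"))   # attribute value, without the quotes
print(bonus.text)                 # an empty element has no content
```

The quotes around the attribute value are part of the syntax only; the parser hands back the bare string.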
Linking to a Style Sheet
A style sheet is a separate document that provides hints and algorithms for rendering or transforming the data in the XML document.
1. The CSS style sheet language is general and powerful enough to be applied to XML documents, although it is oriented toward visual rendering of the document and does not allow for complex processing of the document's data.
2. A more complex and powerful style sheet language is XSLT, the Transformations part of the Extensible Stylesheet Language, which can be used to transform XML to other formats, including HTML, other forms of XML, and plain text.
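The link itself is normally expressed with an xml-stylesheet processing instruction placed in the prolog. The sketch below builds such a document with Python's standard xml.dom.minidom module; the root element name catalog and the file name catalog.css are assumptions chosen for the example.

```python
from xml.dom import minidom

# Create an empty document whose root element is <catalog> (hypothetical).
impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "catalog", None)

# The xml-stylesheet processing instruction names the style sheet type
# and location; "catalog.css" is an assumed file name.
pi = doc.createProcessingInstruction(
    "xml-stylesheet", 'type="text/css" href="catalog.css"'
)

# The instruction belongs in the prolog, so it goes before the root element.
doc.insertBefore(pi, doc.documentElement)

output = doc.toxml()
print(output)
```

For an XSLT style sheet the same instruction is used with a different type value, commonly type="text/xsl" in browsers.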
Binding XML Data with Internet Explorer
• Data in a data source is organized by fields, records, and recordsets.
• A field is an element that contains a single item of information, such as an employee's last name.
• A record is a collection of those fields.
• A recordset is a collection of records.
• The first step in data binding is to attach the Web page to a recordset. The attached data is called a data island. Data islands can be either external files or code entered into the HTML file.
• The syntax to create a data island from an external file is: <xml id="id" src="URL"></xml>
• For example: <xml id="Company" src="Company.xml"></xml>
• ActiveX Data Objects (ADO) is a data-access technology developed by Microsoft. ADO allows you to work with the Data Source Object (DSO) by applying a method or by changing one of the properties of the DSO.
• For example, if you want to display the last record in a DSO whose id is "Staff_Info", run the following method: Staff_Info.recordset.moveLast
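The field, record, and recordset vocabulary can be illustrated outside Internet Explorer as well. The following sketch is only an analogy in Python and does not use ADO or a DSO: it treats each child of the root as a record and each grandchild as a field. The element names, the sample employees, and the load_recordset helper are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents for a file such as Company.xml.
company_xml = """<Company>
  <Employee><FirstName>Ana</FirstName><LastName>Lopez</LastName></Employee>
  <Employee><FirstName>Ben</FirstName><LastName>Okafor</LastName></Employee>
  <Employee><FirstName>Cho</FirstName><LastName>Park</LastName></Employee>
</Company>"""


def load_recordset(text):
    """Return a list of records; each record maps field names to values."""
    root = ET.fromstring(text)
    return [{field.tag: field.text for field in record} for record in root]


staff_info = load_recordset(company_xml)

# Rough counterpart of moving to the last record: take the final list entry.
last_record = staff_info[-1]
print(len(staff_info), "records")
print(last_record["LastName"])
```

Here the list plays the role of the recordset, each dictionary is a record, and each key/value pair is a field.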
Creating a Valid Document
An XML document can be validated using either DTDs (Document Type Definitions) or schemas.
• A DTD can be used to:
– Ensure all required elements are present in the document
– Prevent undefined elements from being used
– Enforce a specific data structure
– Specify the use of attributes and define their possible values
– Define default values for attributes
– Describe how the parser should access non-XML or non-textual content
• DTDs define five different types of element content:
– Any elements. No restrictions on the element's content.
– Empty elements. The element cannot store any content.
– Character data. The element can only contain a text string.
– Elements. The element can only contain child elements.
– Mixed. The element contains both a text string and child elements.
How do I create my own DTD?
• You need to use the XML Declaration Syntax. It is very simple: declaration keywords begin with <! rather than just the open angle bracket, and the way the declarations are formed also differs slightly. Here is an example of a DTD for a shopping list:
    <!ELEMENT Shopping-List (Item)+>
    <!ELEMENT Item (#PCDATA)>
• It says that there shall be an element called Shopping-List and that it shall contain elements called Item: there must be at least one (that is the plus sign), but there may be more than one. It also says that the Item element may contain parsed character data (PCDATA, i.e. text).
• Because there is no other element which contains Shopping-List, that element is assumed to be the "root" element, which encloses everything else in the document.
• You can now use the DTD to create an XML file. Give your editor the declarations (assuming you put the DTD in the file shoplist.dtd):
    <?xml version="1.0"?>
    <!DOCTYPE Shopping-List SYSTEM "shoplist.dtd">
• Now your editor will let you create files according to the pattern:
    <Shopping-List>
      <Item>Chocolate</Item>
      <Item>Sugar</Item>
      <Item>Butter</Item>
    </Shopping-List>
See http://www.w3.org/QA/2002/04/valid-dtd-list.html
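What the two declarations demand can be spelled out as an explicit check. Python's standard library does not validate against DTDs, so the sketch below is a hand-written stand-in for these two rules only, not a DTD processor; the function name check_shopping_list and the error messages are assumptions.

```python
import xml.etree.ElementTree as ET


def check_shopping_list(text):
    """Return a list of rule violations; an empty list means the document fits."""
    root = ET.fromstring(text)
    if root.tag != "Shopping-List":
        return ["root element must be Shopping-List"]

    problems = []
    # <!ELEMENT Shopping-List (Item)+> : one or more Item children, no text.
    if len(root) == 0:
        problems.append("Shopping-List must contain at least one Item")
    if root.text and root.text.strip():
        problems.append("Shopping-List may not contain text directly")
    for child in root:
        if child.tag != "Item":
            problems.append(f"unexpected element: {child.tag}")
        elif len(child) > 0:
            # <!ELEMENT Item (#PCDATA)> : text only, no child elements.
            problems.append("Item may contain only text")
        if child.tail and child.tail.strip():
            problems.append("Shopping-List may not contain text directly")
    return problems


valid_list = """<Shopping-List>
  <Item>Chocolate</Item>
  <Item>Sugar</Item>
  <Item>Butter</Item>
</Shopping-List>"""

print(check_shopping_list(valid_list))
print(check_shopping_list("<Shopping-List></Shopping-List>"))
```

A validating parser performs the equivalent checks automatically from the DTD, which is the point of writing the rules declaratively.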
Working with Namespaces and Schemas
A namespace is a collection of element and attribute names identified by a Uniform Resource Identifier (URI) reference. The reference may appear in the root element as the value of the xmlns attribute. For example, the namespace reference for an XML document with a root element x might appear like this:
    <x xmlns="http://www.company.com/company-schema">
More than one namespace may appear in a single XML document, which allows the same name to be used with more than one meaning. Each reference can declare a prefix to be used with the names it covers, so the previous example might appear as
    <x xmlns:spc="http://www.company.com/company-schema">
which binds the namespace to the spc prefix, as in <spc:name>Mr. Big</spc:name>.
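A parser resolves each prefix to its URI, so a program sees the full namespace rather than the prefix. The sketch below shows this with Python's standard xml.etree.ElementTree module, reusing the example URI and the spc prefix from this slide; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

NS = "http://www.company.com/company-schema"

# The root <x> declares the spc prefix; <spc:name> uses it.
document = f'<x xmlns:spc="{NS}"><spc:name>Mr. Big</spc:name></x>'
root = ET.fromstring(document)

# Look the element up by prefix, supplying the prefix-to-URI mapping.
name = root.find("spc:name", {"spc": NS})

print(name.tag)    # the tag is reported as {namespace-URI}local-name
print(name.text)

# An unqualified lookup does not match the namespaced element.
print(root.find("name"))
```

The prefix is only shorthand: two documents may use different prefixes for the same URI and still refer to the same names.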
Schema
A schema is a model for describing the structure of information. It is a term borrowed from the database world, where it describes the structure of data in relational tables. In the context of XML, a schema describes a model for a whole class of documents. The model describes the possible arrangement of tags and text in a valid document. A schema might also be viewed as an agreement on a common vocabulary for a particular application that involves exchanging documents.
Schemas may sound a little technical, but we use them to analyze the world around us. For example, suppose I ask you, "Is this a valid postal address?"
    <address>
      <name>Youssef Hijazi</name>
      <street>1469 alphada ave</street>
      <city>Akron</city>
      <state>OH</state>
      <zip>12481</zip>
    </address>
Schema (cont.)
Mentally, you compare the address presented with a schema that you have in your head for addresses. It probably goes something like this: a postal address consists of a person, possibly at a company or organization, one or more lines of street address, a city, a state or province, a postal code, and an optional country. So, yes, this address is valid.
In schemas, models are described in terms of constraints. A constraint defines what can appear in any given context. There are basically two kinds of constraints: content model constraints describe the order and sequence of elements, and datatype constraints describe valid units of data.
For example, a schema might describe a valid <address> with the content model constraint that it consist of a <name> element, followed by one or more <street> elements, followed by exactly one <city>, <state>, and <zip> element. The content of a <zip> might have a further datatype constraint that it consist of either a sequence of exactly five digits, or a sequence of five digits followed by a hyphen and a sequence of exactly four digits. No other text is a valid ZIP code.
The purpose of a schema is to allow machine validation of document structure. Every specific, individual document which does not violate any of the constraints of the model is, by definition, valid according to that schema.
Using the schema described (informally) above, a parser would be able to detect that the following address is not valid:
    <address>
      <name>Youssef Hijazi</name>
      <street>1469 alphada ave</street>
      <city>Akron</city>
      <state>MA</state>
      <state>OH</state>
      <zip>blue</zip>
    </address>
It violates two constraints of our schema: it does not contain exactly one <state>, and the ZIP code is not of the proper form. A formal definition of this schema for addresses is presented in the syntax section.
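The two kinds of constraint described informally above can be sketched as a small hand-written checker. This is not a schema language and not the formal definition the slide refers to; it is a Python approximation of the same rules, and the function name validate_address, the pattern names, and the messages are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Content model: name, one or more street, then city, state, zip.
CONTENT_MODEL = re.compile(r"name( street)+ city state zip")
# Datatype: five digits, optionally followed by a hyphen and four digits.
ZIP_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def validate_address(text):
    """Return a list of violated constraints; empty means valid."""
    root = ET.fromstring(text)
    if root.tag != "address":
        return ["root element must be address"]

    problems = []
    sequence = " ".join(child.tag for child in root)
    if not CONTENT_MODEL.fullmatch(sequence):
        problems.append(f"content model violated: {sequence}")
    for zip_element in root.findall("zip"):
        value = (zip_element.text or "").strip()
        if not ZIP_PATTERN.fullmatch(value):
            problems.append(f"invalid ZIP code: {value}")
    return problems


good_address = """<address><name>Youssef Hijazi</name>
<street>1469 alphada ave</street><city>Akron</city>
<state>OH</state><zip>12481</zip></address>"""

bad_address = """<address><name>Youssef Hijazi</name>
<street>1469 alphada ave</street><city>Akron</city>
<state>MA</state><state>OH</state><zip>blue</zip></address>"""

print(validate_address(good_address))
print(validate_address(bad_address))
```

The first address passes both checks, while the second is reported for the repeated state element and for the malformed ZIP code, matching the two violations named in the text.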
How do I use graphics in XML?
• Graphics have traditionally just been links which happen to have a picture file at the end rather than another piece of text. They can therefore be implemented in any way supported by the XLink and XPointer specifications, including using syntax similar to existing HTML images.
• They can also be referenced using XML's built-in NOTATION and ENTITY mechanism, in a similar way to standard SGML, as external unparsed entities.
• The linking specifications, however, give you much better control over the traversal and activation of links, so an author can specify, for example, whether to have an image appear when the page is loaded, on a click from the user, or in a separate window, without having to resort to scripting.
• XML itself does not prescribe or restrict graphic file formats: GIF, JPG, TIFF, PNG, CGM, and SVG at a minimum would seem to make sense; however, vector formats are normally preferred for non-photographic images.
Examples
http://www.cs.kent.edu/~yhijazi/xml/question.xml
http://www.cs.kent.edu/~yhijazi/xml/home.html
Dr. Arvind Bansal's lectures
http://www.cs.kent.edu/~yhijazi/xml/xml-tutorials/a1.pdf
http://www.cs.kent.edu/~yhijazi/xml/xml-tutorials/a2.pdf
http://www.cs.kent.edu/~yhijazi/xml/xml-tutorials/a3.pdf
http://www.cs.kent.edu/~yhijazi/xml/xml-tutorials/a4.pdf