XML introduction • What is XML? • Examples of XML use • An XML document • Structuring data with tags • More about tags
What is XML? • eXtensible Markup Language • A markup language for making other markup languages: XML is a meta language • Examples of markup languages: • HTML • XHTML (an XML version of HTML) • RTF • (S)GML • Except for XHTML, these are not XML • With XML, tags specific to the domain can be defined. Sounds complicated? It just means you can make your own tags to describe your data in XML
In Other Words… • XML is a meta language: a language in which to define other languages • You define your language using XML as the meta language • Your XML language describes the data structure of your domain
XML tells nothing about layout • XML describes structure and semantics, not formatting • HTML mixes data and layout:
<BODY>
<DT>Hot Cop
<DD>by Jacques Morali, Henri Belolo, and Victor Willis
<UL>
<LI>Jacques Morali
<LI>PolyGram Records
<LI>6:20
<LI>1978
<LI>Village People
</UL>
</BODY>
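To make the contrast concrete, here is a minimal sketch of the same song data written as XML and read back with Python's standard library. The HTML list does not say what "6:20" or "1978" mean, so the tag names below (SONG, TITLE, COMPOSER, PRODUCER, PUBLISHER, LENGTH, YEAR, ARTIST) and the role given to each list item are assumptions made for illustration, not part of the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML version of the song data; every tag name is an assumption.
SONG_XML = """
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
""".strip()

song = ET.fromstring(SONG_XML)

# Because the tags name the data, a program can ask for it by meaning.
title = song.findtext("TITLE")
composers = [c.text for c in song.findall("COMPOSER")]
year = int(song.findtext("YEAR"))

print(title, composers, year)
```

Nothing in this document says how the song should look on screen; that is left to a stylesheet.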
But be careful …and created and edited using a text editor:
Data describes itself • What is described here:
Data exchange between applications • XML is a non-proprietary format • Meaning it is not owned by anybody and does not belong to a certain application, in contrast to doc (Word, WordPerfect...), pdf, mdb, qt, drw, etc. • It is easy for humans to read and write XML documents • It can be written in a simple editor • It can easily be sent over the network in a secured or unsecured format • Note: XML is not a smart choice if you send large amounts of data. Why?
XML standards • DTD (Document Type Definition): Definition of an XML language. The old way. Not an XML language itself. • XML Schema: Definition of an XML language. The new way. • XPath: Traversing / navigating in an XML document. • XSL(T): Stylesheet. Used for converting from one format to another, for instance from XML to HTML. • XSL-FO: Formatting stylesheets. Used for producing formatted output documents, e.g. PDF documents and SVG documents
Examples of use of XML • Data exchange • Handling of multiple client types on the WWW • SOAP (Simple Object Access Protocol): Services are described with XML. Data is transported in an XML document. Example: Web Services • RSS feeds: see for instance Visual Studio • Configuration files, e.g. web.config
Examples of XML languages • ODF: Open Document Format • OOXML: Office Open XML • CML: Chemical Markup Language • OFX: Open Financial eXchange • SVG: Scalable Vector Graphics • OSD: Open Software Description • XHTML: HTML in an XML version
XML can be written in Notepad • XML can be viewed in an MSIE browser • XML can also be viewed in Mozilla • Demo: Hello World
Format with a stylesheet • Figure panels: XML document, Stylesheet (XSL), XML with stylesheet
Example: TV schedule • Using this example, the following is discussed: • Good XML style • Structure and syntax in XML documents • Terminology and definitions • XML tree
Find candidates for tags • As in database modelling: • Find objects / entities in the user domain • Find keywords in the problem description • Look at existing output, e.g. look in the TV section of the paper • Map from an existing database • ... • Keywords: Title, Station, Date, Start time, Duration, Description, Category, Production year, Stars
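As a rough sketch of where these keywords lead, the following Python snippet turns some of them into tags and builds one entry of a TV schedule. The element names (tv_schedule, show, title, ...) and the sample programme data are hypothetical; the slides only give the keywords.

```python
import xml.etree.ElementTree as ET

def make_show(title, station, date, start_time, duration, category):
    """Build one <show> element; each keyword becomes a child element."""
    show = ET.Element("show")
    fields = [
        ("title", title),
        ("station", station),
        ("date", date),
        ("start_time", start_time),
        ("duration", duration),
        ("category", category),
    ]
    for tag, value in fields:
        ET.SubElement(show, tag).text = value
    return show

# Hypothetical sample data, invented for illustration only.
schedule = ET.Element("tv_schedule")
schedule.append(
    make_show("Evening News", "Channel 1", "2005-02-01", "18:30", "30", "News")
)

xml_text = ET.tostring(schedule, encoding="unicode")
print(xml_text)
```

The names follow the lowercase-with-underscore style; any other consistent naming convention would work equally well.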
Structuring • Figure labels: Declaration, Root element, Child element • The structure of an XML document: • The document is built hierarchically as a tree structure • This gives some advantages, but also disadvantages, compared with relational databases • There is exactly one root element, which contains everything else • Declarations <?..?> are not part of the tree • The tree can contain an arbitrary number of nodes
A little about XML style • As in programming, you should choose a consistent way of writing: • How should names be written? With _ or a capital first letter? Or only capitals? • When to use text between start and end tags, and when to use attributes? • Otherwise: use w3.org's way (only lowercase letters and _)
An XML document is always a tree structure • Figure labels: Root, Parent, Child, Siblings
Nesting of elements • Terminology: • Child elements – elements that are contained in other elements • Parent elements – elements that contain other elements • Sibling elements – elements that have the same parent element
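The three terms can be tried out on a small tree. This is a minimal sketch using Python's xml.etree.ElementTree; the document content and the helper names parent_of and siblings are made up for illustration. ElementTree elements do not store a link to their parent, so the sketch builds that mapping first.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <show> is the root and the parent of three children.
doc = ET.fromstring(
    "<show>"
    "<title>News</title>"
    "<station>Channel 1</station>"
    "<stars><star>A. Anchor</star></stars>"
    "</show>"
)

# Map every child element to its parent element.
parent_of = {child: parent for parent in doc.iter() for child in parent}

def siblings(element):
    """Return the other elements that share this element's parent."""
    parent = parent_of.get(element)
    if parent is None:  # the root element has no parent, hence no siblings
        return []
    return [other for other in parent if other is not element]

title = doc.find("title")
children = [child.tag for child in doc]
sibling_tags = [s.tag for s in siblings(title)]
star_parent = parent_of[doc.find("stars/star")].tag

print(children, sibling_tags, star_parent)
```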
XML Syntax • “Syntax” refers to the rules of the language • Syntax is necessary for making documents written in the language consistent • Programs that interpret documents assume that the rules of the syntax are satisfied; otherwise it can't be guaranteed that the document is interpreted correctly.
Components in an XML Document • XML Declaration • Elements • Attributes • Entities • Comments
<?declaration?>
<element attribute="value">
&lt; is a reference to a predefined entity (a character entity)
<!-- Comment: &lt; is the less-than symbol -->
</element>
Components: XML Declaration • XML Declaration: • Tells that the document is an XML document, plus other optional information • When present, the XML declaration must come first in the XML document • Attributes that can be used in the XML Declaration: • version • encoding • standalone • Example, followed by a stylesheet processing instruction:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="HelloWorld_v1.xsl"?>
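A small sketch of how a program can emit the declaration rather than typing it by hand. It assumes Python 3.8 or later, where ElementTree's tostring accepts xml_declaration; the greeting element is a made-up stand-in for the Hello World demo. Note that ElementTree writes single quotes and no standalone attribute, which is equally valid.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document, loosely modelled on the Hello World demo.
root = ET.Element("greeting")
root.text = "Hello World"

# Serialise to bytes with the XML declaration at the very start.
data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
first_line = data.split(b"\n", 1)[0]

print(first_line.decode("ascii"))

# The declaration is read by the parser and is not part of the element tree.
reparsed = ET.fromstring(data)
print(reparsed.tag, reparsed.text)
```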
Components: XML Elements • Elements: • Used for describing data. An element consists of: • start tag • contents • end tag • Example: <element>Content</element> • The “root” element in a document is the outermost element and contains all other elements in the document. A document can only contain one root element • An element without content is called an “empty element”, but is still an element • Example: <empty_element attribute="value"/> • In real-world HTML, <input> and <img> are empty elements
Components: XML Attributes • Attributes help to describe XML elements • Attributes are always a part of the start tag • Attributes are also known as “name-value pairs” • Instead of <SHOW> ... <START_TIME> 3:45 </START_TIME> ... </SHOW> you could write <SHOW start_time="3:45"> ... </SHOW> • It's your design decision whether something is an element or an attribute of some element (as in other data modelling processes)
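Both designs carry the same information; what differs is how a program gets at it. A minimal sketch with Python's xml.etree.ElementTree, using the SHOW example from the slide (the variable names are only illustrative):

```python
import xml.etree.ElementTree as ET

# The start time modelled as a child element...
as_element = ET.fromstring("<SHOW><START_TIME> 3:45 </START_TIME></SHOW>")
# ...and the same start time modelled as an attribute.
as_attribute = ET.fromstring('<SHOW start_time="3:45"></SHOW>')

# Element content keeps its surrounding whitespace, so it is stripped here.
time_from_element = as_element.findtext("START_TIME").strip()
# Attribute values are read by name from the element itself.
time_from_attribute = as_attribute.get("start_time")

print(time_from_element, time_from_attribute)
```

One practical difference: an element can be repeated or given children of its own later, while an attribute name can occur only once per start tag and can only hold text.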
Components: XML Entities • Two types of entities: • General – can contain information stored in an XML document • Parameter – used in a DTD for referring to an element group • Three types of general entities: • Character – used instead of special characters (e.g. &lt; for <) • Starts with ‘&’ and ends with ‘;’ • Content – used for reuse of blocks of text (variables) • Unparsed – used for binary or non-textual data (e.g. pictures)
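Examples of entities • Character entity: • Character: > • Reference: &gt; (entity reference) or &#62; (numeric character reference) • Usage: <formula> x &gt; y </formula> • Content entity: • Declaration: <!ENTITY address "123 Main St"> • Usage: <ship_address> &address; </ship_address> • Unparsed entity: • Declaration: <!ENTITY image SYSTEM "sunset.gif" NDATA GIF> • Usage: <picture source="image"/> (an unparsed entity cannot be referenced as &image;; it is named in an attribute that the DTD declares with type ENTITY; the attribute name source is only illustrative)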
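To see character entities at work, the sketch below parses the formula example in both its named and its numeric form, and goes the other way with escape from Python's xml.sax.saxutils, which replaces &, < and > in plain text. The sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The parser replaces both kinds of reference with the character itself.
named = ET.fromstring("<formula> x &gt; y </formula>").text
numeric = ET.fromstring("<formula> x &#62; y </formula>").text

# Going the other way: make plain text safe to place inside an element.
escaped = escape("x < y & y > z")

print(repr(named), repr(numeric), escaped)
```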
Defining entities • An entity is a name and an associated value • The value may be: • A special character code • A piece of text • A filename • Entity definitions go in the DTD (next session) or in the XML document itself: <!DOCTYPE DOCUMENT [<!ENTITY signature "Kys og Klap">]>
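A minimal sketch of an entity defined inside the document itself and then used. It relies on Python's xml.etree.ElementTree, whose underlying parser expands entities declared in the internal DTD subset; the LETTER element and its text are hypothetical additions around the signature entity from the slide.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE declares the entity; the document body refers to it by name.
DOCUMENT = """<!DOCTYPE DOCUMENT [
  <!ENTITY signature "Kys og Klap">
]>
<DOCUMENT>
  <LETTER>Best regards, &signature;</LETTER>
</DOCUMENT>"""

root = ET.fromstring(DOCUMENT)

# After parsing, the reference has been replaced by the entity's value.
letter_text = root.findtext("LETTER")
print(letter_text)
```

Entity expansion should only be allowed for documents from a trusted source, since nested entity definitions can be abused to exhaust memory.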
Components: Comments • Comments are ignored by the XML interpreter • The content of the comment is surrounded by <!-- and --> • Example: <!-- This is a comment -->
Well-Formed XML Documents • A “well-formed” document satisfies the syntax rules of XML: • An XML document contains exactly one root element • All elements must have a start and an end tag, except empty elements • Elements must be nested correctly • All attributes must have a value • Attributes can only be defined in start tags, and each attribute name must be unique within its tag • Element names are case-sensitive • Special characters (such as < and &) must be written as entity references • Names of elements must start with a letter or underscore and may only contain letters, numbers, hyphen, period and underscore
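The rules above can be checked mechanically. The sketch below is one possible approach, not the only one: it hands a few deliberately broken documents to Python's xml.etree.ElementTree and records whether parsing succeeds. The helper name is_well_formed and the sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# One sample per rule; only the first one is well formed.
samples = {
    "ok": "<a><b/></a>",
    "two root elements": "<a/><b/>",
    "missing end tag": "<a><b></a>",
    "bad nesting": "<a><b><c></b></c></a>",
    "attribute without value": "<a checked></a>",
    "duplicate attribute": '<a x="1" x="2"/>',
    "case mismatch": "<a></A>",
    "unescaped special character": "<a> 1 < 2 </a>",
    "name starts with a digit": "<1a/>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well formed' if ok else 'NOT well formed'}")
```

This only checks well-formedness; validating against a DTD or Schema needs a validating parser.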
Parser • An XML parser is a program that, at a minimum, can check whether an XML document is well formed. • Most parsers are also able to validate an XML document against an XML language defined by a DTD or Schema. • A quick way is to view the document in a browser: MSIE 5.5+, Netscape 6+ or Mozilla/Firefox. • The most common parsers are probably Xerces (Java) and MSXML (Microsoft). • And you can write your own...
Parsing with the browser • We have seen what happens when it goes well. Here is a document with errors • What is the error? • Figure panels: Source, Internet Explorer 6.0, Mozilla 1.7
Further reading • Books: • XML and Web Services, Ron Schmelzer et al., Sams • XML How to Program, Deitel et al., Prentice Hall • An Introduction to XML and Web Technologies, Anders Møller and Michael Schwartzbach, Addison-Wesley • Links: • www.zvon.org: References and tutorials • www.w3.org: Standards • www.w3school.com: Tutorials • www.topxml.com: Developer forum • www.ibiblio.org/xml: XML Bible • http://www.brics.dk/ixwt/: Web site accompanying the book by Møller & Schwartzbach