Goals of the XML chapter • Students understand the significance of XML. • Students get an overview of the XML family of languages. • Students learn to specify simple XML documents and their layout.
XML chapter: overview • XML in 7 points: overview of the XML design goals and of XML as a family of technologies • XML motivation and first examples: distinguishes XML from HTML and SGML and shows simple application examples • XML Schema and UML: detailed example of the mapping • XML tutorial: contains special examples for the XML family of languages
XML • Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents and partially describes the behavior of computer programs which process them. XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language [ISO 8879]. By construction, XML documents are conforming SGML documents. • see http://www.w3.org
XML • Definition: A software module called an XML processor is used to read XML documents and provide access to their content and structure. • Definition: It is assumed that an XML processor is doing its work on behalf of another module, called the application.
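The split between processor and application can be sketched in a few lines. This is a minimal illustration, assuming Python's standard `xml.etree.ElementTree` module in the role of the XML processor; the `order` document and the function `list_items` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element names are made up for this example.
DOCUMENT = """<order id="42">
  <item>pencil</item>
  <item>notebook</item>
</order>"""


def list_items(xml_text):
    """The 'application': it asks the XML processor for content and structure."""
    root = ET.fromstring(xml_text)  # the XML processor reads the document
    items = [item.text for item in root.findall("item")]
    return root.tag, root.get("id"), items


print(list_items(DOCUMENT))
```

The application never touches angle brackets itself; it only sees element names, attributes, and text handed over by the processor.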
XML • XML documents are made up of storage units called entities, which contain either parsed or unparsed data. • Parsed data is made up of characters, some of which form character data, and some of which form markup. • Markup encodes a description of the document's storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure.
Design Goals for XML (1) The design goals for XML are: 1. XML shall be straightforwardly usable over the Internet. 2. XML shall support a wide variety of applications. 3. XML shall be compatible with SGML. 4. It shall be easy to write programs which process XML documents. 5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
Design Goals for XML (2) 6. XML documents should be human-legible and reasonably clear. 7. The XML design should be prepared quickly. 8. The design of XML shall be formal and concise. 9. XML documents shall be easy to create. 10. Terseness in XML markup is of minimal importance.
XML in 7 points (see also http://www.w3.org/1999/XML-in-10-points) • XML, XLink, Namespace, DTD, Schema, CSS, XHTML, ... If you are new to XML, it may be hard to know where to begin. • XLink, XPointer: generalized link concepts • XSL: more powerful than CSS, serves formatting purposes for XML documents
1. XML is a method for putting structured data in a text file • "Structured data": spreadsheets, address books, configuration parameters, financial transactions, technical drawings, etc. • A text format allows one to look at the data without the program that produced it. • XML is a set of rules, guidelines, and conventions for designing text formats for such data, in a way that produces files that are easy to generate and read (by a computer), unambiguous, and platform-independent.
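As a rough illustration of "easy to generate and read", the sketch below writes a small address book to XML text and reads it back. It assumes Python's standard `xml.etree.ElementTree`; the names `addressbook`, `contact`, and `email`, as well as the sample data, are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical address book; names and fields are invented for illustration.
contacts = [("Ada", "ada@example.org"), ("Linus", "linus@example.org")]

# Generate: build a tree and serialize it to plain text.
book = ET.Element("addressbook")
for name, email in contacts:
    entry = ET.SubElement(book, "contact", name=name)
    ET.SubElement(entry, "email").text = email
xml_text = ET.tostring(book, encoding="unicode")

# Read: any program can recover the same data from the text alone.
recovered = [(c.get("name"), c.findtext("email"))
             for c in ET.fromstring(xml_text).iter("contact")]

print(xml_text)
print(recovered == contacts)
```

The round trip goes through nothing but text, so the reading side does not need the program that produced the file.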
2. XML looks a bit like HTML but isn't HTML • Like HTML, XML makes use of tags (words bracketed by '<' and '>') and attributes (of the form name="value") • While HTML specifies what each tag & attribute means (and often how the text between them will look in a browser), XML uses the tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it. • E.g.: If you see "<p>" in an XML file, don't assume it is a paragraph. Depending on the context, it may be a price, a parameter, a person, etc.
3. XML is text, but isn't meant to be read • XML files are text files, but they are not meant to be read by humans. They are text files because that allows experts (such as programmers) to debug applications more easily. • The rules for XML files are much stricter than for HTML. A forgotten tag or an attribute value without quotes makes the file unusable, while in HTML such practice is often explicitly allowed, or at least tolerated.
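The strictness is easy to observe with any conforming parser. The sketch below assumes Python's standard `xml.etree.ElementTree`; the helper `is_well_formed` and the `note` documents are invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


complete = '<note lang="en"><to>Ann</to></note>'   # every tag closed, value quoted
unclosed = '<note lang="en"><to>Ann</note>'        # <to> is never closed
unquoted = '<note lang=en><to>Ann</to></note>'     # attribute value without quotes

for text in (complete, unclosed, unquoted):
    print(is_well_formed(text), text)
```

Only the first document is accepted; the parser rejects the other two outright instead of guessing what the author meant, which is what a typical HTML browser would do.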
4. XML is a family of technologies • We will look at the following technologies: • XML • DTD (Document Type Definition) • XML Schema (XSchema) • XPath, XPointer • XInclude • XSLT, CSS • XLink
4. XML is a family of technologies • XML 1.0 is the specification that defines what "tags" and "attributes" are. Around XML 1.0 there is a growing set of optional modules that provide sets of tags and attributes, or guidelines for specific tasks.
4. XML is a family of technologies When starting with XML, it is important to realize that XML is not itself a markup language (like HTML); rather, it provides rules (like SGML) for defining markup languages. The names of the tags are up to the authors. Example: <myFirstTag> <Hello/> <World/> </myFirstTag> XML thus specifies which characters a tag may consist of and defines a set of rules for well-formedness (every opening tag must have a closing tag, …).
DTDs solve the problem of defining the structure of a document. Example: <myFirstTag myFirstAttribute="Hello World"> <Hello/> <World/> </myFirstTag> A DTD specifies how the tags must be arranged to form a valid document, and which attributes a tag can include. In our case a DTD states the following: • The name of the main tag is "myFirstTag" • myFirstTag has an attribute myFirstAttribute • Inside myFirstTag there must be two tags, "Hello" and "World" The main problem with the DTD concept is that it says nothing about the data type or format of a tag's text content (data type, format, pattern, …): <myFirstTag>12.3</myFirstTag> <mySecondTag>Text Content</mySecondTag>
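One possible DTD for the example is sketched below, embedded in the document's internal subset. The declarations are an assumption, not the author's DTD: the fixed order `(Hello, World)`, the empty content of both inner tags, and the `#REQUIRED` attribute are all choices made for illustration. Python's standard parser reads the DTD but does not validate against it, so the function `follows_rules` is a hand-written stand-in that checks the three rules listed above.

```python
import xml.etree.ElementTree as ET

# Assumed DTD for the slide's example, placed in the document's internal subset.
DOCUMENT = """<!DOCTYPE myFirstTag [
  <!ELEMENT myFirstTag (Hello, World)>
  <!ATTLIST myFirstTag myFirstAttribute CDATA #REQUIRED>
  <!ELEMENT Hello EMPTY>
  <!ELEMENT World EMPTY>
]>
<myFirstTag myFirstAttribute="Hello World"><Hello/><World/></myFirstTag>"""


def follows_rules(xml_text):
    """Hand-written stand-in for DTD validation: checks the three rules only."""
    root = ET.fromstring(xml_text)  # checks well-formedness, not validity
    return (root.tag == "myFirstTag"
            and "myFirstAttribute" in root.attrib
            and [child.tag for child in root] == ["Hello", "World"])


print(follows_rules(DOCUMENT))
print(follows_rules("<myFirstTag><World/></myFirstTag>"))
```

A validating parser would derive these checks from the DTD itself; the point of the sketch is only to show what such a DTD constrains and what it leaves open.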
4. XML is a family of technologies • XML Schema Parts 1 and 2 (Structures and Datatypes) help developers to precisely define their own XML-based formats. • XSchema fills the gaps of the DTD concept. In fact, you can replace your DTDs entirely with an XML Schema. • Beyond what a DTD can express, you can write exact rules about the content of attributes and tags. This includes: • data types (integer, string) • patterns (e.g. the content has to look like a valid email address) • lists of tokens
4. XML is a family of technologies • If XML is the set of rules that defines the syntax of a document, then the purpose of XSchema is to specify the structure and permitted content of documents for a particular purpose. • Example: myFirstTag has to contain a number between 5 and 10.7 with exactly 5 decimal places (e.g. 6.30020).
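The example constraint could be written as the schema fragment below. This is a sketch under assumptions: the fragment is one possible encoding (range facets plus a pattern for the five decimal places), and since Python's standard library has no schema validator, `satisfies` is a hand-written check of just these three facets. It also treats the pattern as a Python regular expression, which happens to coincide with the schema syntax for this simple case.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed schema fragment for the slide's example; a schema is itself XML.
SCHEMA = r"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="myFirstTag">
    <xs:simpleType>
      <xs:restriction base="xs:decimal">
        <xs:minInclusive value="5"/>
        <xs:maxInclusive value="10.7"/>
        <xs:pattern value="[0-9]+\.[0-9]{5}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>"""


def read_facets(schema_text):
    """Collect the facet names and values found under xs:restriction."""
    restriction = ET.fromstring(schema_text).find(f".//{XS}restriction")
    return {facet.tag[len(XS):]: facet.get("value") for facet in restriction}


def satisfies(value_text, facets):
    """Hand-written check of three facets; not a general schema validator."""
    if not re.fullmatch(facets["pattern"], value_text):
        return False
    value = Decimal(value_text)
    return Decimal(facets["minInclusive"]) <= value <= Decimal(facets["maxInclusive"])


facets = read_facets(SCHEMA)
for candidate in ("6.30020", "6.3", "12.00000"):
    print(candidate, satisfies(candidate, facets))
```

Because the schema is ordinary XML, the same parser that reads documents can read the rules, which is one practical advantage over the separate DTD syntax.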
4. XML is a family of technologies • "XPath (XML Path Language) is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer." • In other words: XPath provides a way to address a certain part of an XML document. It works like a bookmark that identifies a certain point or part of an XML document.
4. XML is a family of technologies • "XPointer, which is based on XPath, supports addressing into the internal structures of XML documents. It allows for traversals of a document tree and choice of its internal parts based on various properties, such as element types, attribute values, character content, and relative position." • Basically, this works like anchors in HTML: <link xlink:href="mydocument.xml#xpointer(//AAA/BBB[1])"/>
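A few path expressions in action, as a minimal sketch. It assumes Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath; the `AAA`/`BBB`/`CCC` document is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element names are placeholders.
root = ET.fromstring("""<AAA>
  <BBB id="b1">first</BBB>
  <BBB id="b2">second</BBB>
  <CCC><BBB id="b3">nested</BBB></CCC>
</AAA>""")

# ElementTree supports only a subset of XPath; these expressions stay inside it.
direct = [b.get("id") for b in root.findall("./BBB")]     # BBB children of the root
anywhere = [b.get("id") for b in root.findall(".//BBB")]  # BBB at any depth
first = root.find("./BBB[1]").text                        # positional predicate
by_attribute = root.find(".//BBB[@id='b3']").text         # attribute predicate

print(direct, anywhere, first, by_attribute)
```

Each expression names a location in the tree rather than a position in the text, which is why the same path keeps working when whitespace or formatting changes.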
4. XML is a family of technologies • XInclude allows documents, or parts of documents, to be included in an XML document (much as C-like programming languages do with #include).
4. XML is a family of technologies • CSS, the style sheet language, is applicable to XML as it is to HTML. • XSL (Extensible Stylesheet Language) is the advanced language for expressing style sheets. • A transformation expressed in XSLT describes rules for transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates. • A pattern is matched against elements in the source tree. • A template is instantiated to create part of the result tree. • The result tree is separate from the source tree. • The structure of the result tree can be completely different from the structure of the source tree.
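Inclusion can be tried out with the `xml.etree.ElementInclude` module from Python's standard library, which handles basic `xi:include` elements. In this sketch the file name `chapter1.xml`, the `book` document, and the in-memory loader `load_from_memory` are assumptions made so that the example needs no files on disk.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" so that the sketch needs no disk access.
FILES = {"chapter1.xml": "<chapter>Included text</chapter>"}

BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Demo</title>
  <xi:include href="chapter1.xml"/>
</book>"""


def load_from_memory(href, parse, encoding=None):
    """Loader callback: return a parsed element for the requested href."""
    return ET.fromstring(FILES[href])


book = ET.fromstring(BOOK)
ElementInclude.include(book, loader=load_from_memory)  # replaces xi:include in place

print([child.tag for child in book])
print(book.findtext("chapter"))
```

After processing, the `xi:include` element is gone and the included `chapter` element sits in its place, much as a preprocessor pastes in the contents of an included file.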
In constructing the result tree, elements from the source tree can be filtered and reordered, and arbitrary structure can be added. A transformation expressed in XSLT is called a stylesheet. This is because, in the case when XSLT is transforming into the XSL formatting vocabulary, the transformation functions as a stylesheet. In other words: XSL provides you with a way to transform any XML input to match a certain purpose. • Examples: • transform an order, saved in XML, to a delivery note • transform UML logic, saved in XML, to class definitions in a given programming language. • …
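The pattern and template idea can be imitated in plain Python. To be clear, the sketch below is not XSLT and uses no XSLT processor (Python's standard library has none); it only mimics the mechanism, with tag names standing in for patterns and functions standing in for templates. The `order` document and all function names are invented for the example, which follows the "order to delivery note" case.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; element names are invented for this example.
ORDER = """<order customer="Ann">
  <item qty="2" price="1.50">pencil</item>
  <item qty="1" price="4.00">notebook</item>
</order>"""


def order_template(source):
    """Template for <order>: creates the root of the result tree."""
    result = ET.Element("deliveryNote", recipient=source.get("customer"))
    result.extend(apply_templates(source))
    return result


def item_template(source):
    """Template for <item>: prices are filtered out of the delivery note."""
    line = ET.Element("line", quantity=source.get("qty"))
    line.text = source.text
    return line


# Pattern (here simply a tag name) -> template that builds part of the result.
TEMPLATES = {"order": order_template, "item": item_template}


def apply_templates(parent):
    """Instantiate the matching template for each child; drop unmatched ones."""
    return [TEMPLATES[child.tag](child) for child in parent
            if child.tag in TEMPLATES]


def transform(xml_text):
    source = ET.fromstring(xml_text)
    return TEMPLATES[source.tag](source)


note = transform(ORDER)
print(ET.tostring(note, encoding="unicode"))
```

The result tree is built from new elements, so the source tree stays untouched, and its structure (different names, no prices) is free to differ from the source.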
4. XML is a family of technologies • XLink (still in development) describes a standard way to add hyperlinks to an XML file. XPointer and XPath are syntaxes for pointing to parts of an XML document.
"XLink (XML Linking Language) allows elements to be inserted into XML documents in order to create and describe links between resources. It uses XML syntax to create structures that can describe links similar to the simple unidirectional hyperlinks of today's HTML, as well as more sophisticated links." With XLinks you can replace today's <a href> links and tie links together like resource chains. XLinks can carry information about the context in which they are used and about additional information that is available.
4. XML is a family of technologies (3) • The DOM is a standard set of function calls for manipulating XML (and HTML) files from a programming language. • XML Namespaces is a specification that describes how you can associate a URL with every single tag and attribute in an XML document. What that URL is used for is up to the application that reads the URL, though. • RDF, W3C's standard for metadata, uses it to link every piece of metadata to a file defining the type of that data.
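Both ideas, DOM calls and namespaces, can be seen together in a short sketch. It assumes `xml.dom.minidom` from Python's standard library as the DOM implementation; the namespace URL `http://example.org/shop` and the element names are invented for the example.

```python
from xml.dom import minidom

# Hypothetical namespace URL and element names, chosen only for this example.
SHOP_NS = "http://example.org/shop"
DOCUMENT = (f'<shop:order xmlns:shop="{SHOP_NS}">'
            '<shop:item>pencil</shop:item></shop:order>')

dom = minidom.parseString(DOCUMENT)

# Read through DOM calls: look elements up by namespace URL and local name.
items = dom.getElementsByTagNameNS(SHOP_NS, "item")
names = [item.firstChild.data for item in items]

# Manipulate through DOM calls: create a new element and append it.
new_item = dom.createElementNS(SHOP_NS, "shop:item")
new_item.appendChild(dom.createTextNode("notebook"))
dom.documentElement.appendChild(new_item)

item_count = len(dom.getElementsByTagNameNS(SHOP_NS, "item"))
print(names, item_count)
print(dom.documentElement.toxml())
```

The lookup uses the URL, not the `shop:` prefix, so another document could bind the same URL to a different prefix and the code would still find its elements.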
5. XML is verbose, but that is not a problem • Since XML is a text format, and it uses tags to delimit the data, XML files are nearly always larger than comparable binary formats. That was a conscious decision by the XML developers. The advantages of a text format are evident (see 3 above), and the disadvantages can usually be compensated for at a different level. • Communication protocols such as modem protocols and HTTP/1.1 (the core protocol of the Web) can compress data on the fly.
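How much the repeated tags cost after compression can be checked directly. The sketch below assumes Python's standard `gzip` module (gzip is one of the content codings available to HTTP/1.1); the `ledger` document is invented and deliberately repetitive, so the ratio it prints is not representative of every XML file.

```python
import gzip

# Hypothetical, deliberately repetitive document: the same tags recur in
# every record, which is typical of data-oriented XML.
records = "".join(
    f"<transaction><id>{i}</id><amount>10.00</amount></transaction>"
    for i in range(1000)
)
xml_bytes = f"<ledger>{records}</ledger>".encode("utf-8")

compressed = gzip.compress(xml_bytes)

print(len(xml_bytes), len(compressed))
assert gzip.decompress(compressed) == xml_bytes  # compression is lossless
```

The recurring tag names are exactly the kind of redundancy that general-purpose compressors remove, so the verbosity is paid for mainly in the uncompressed file.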
6. XML is new, but not that new • Development of XML started in 1996, and it has been a W3C standard since February 1998. • The technology isn't very new. Before XML there was SGML, developed in the early '80s, an ISO standard since 1986, and widely used for large documentation projects. • For HTML, development started in 1990. • The designers of XML simply took the best parts of SGML, guided by the experience with HTML, and produced something that is no less powerful than SGML, but vastly more regular and simpler to use.
7. XML is license-free, platform-independent and well-supported • By choosing XML as the basis for some project, you buy into a large and growing community of tools and engineers experienced in the technology. Opting for XML is a bit like choosing SQL for databases: you still have to build your own database and your own programs/procedures that manipulate it, but there are many tools available and many people that can help you. • XML isn't always the best solution, but it is always worth considering.