Document Object Model (DOM) MV4920 – XML 24 September 2001. Simon R. Goerger MAJ, US Army srgoerge@cs.nps.navy.mil. Goal DOM Description DOM Basics Tree Levels DOM Examples DOM Demo DOM Resources Summary. Outline.
Goal
The goal of the DOM group is to define a programmatic interface for XML and HTML.
What is a DOM?
The Document Object Model (DOM) is a programming interface for XML documents. It defines the way an XML document can be accessed and manipulated.
It is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of documents. The document can be further processed, and the results of that processing can be incorporated back into the presented page.
What a DOM is NOT
• Although the Document Object Model was strongly influenced by "Dynamic HTML", Level 1 does not implement all of "Dynamic HTML". In particular, events have not yet been defined. Level 1 is designed to lay a firm foundation for this kind of functionality by providing a robust, flexible model of the document itself.
• The Document Object Model is not a binary specification. DOM programs written in the same language will be source code compatible across platforms, but the DOM does not define any form of binary interoperability.
• The Document Object Model is not a way of persisting objects to XML or HTML. Instead of specifying how objects may be represented in XML, the DOM specifies how XML and HTML documents are represented as objects, so that they may be used in object-oriented programs.
• The Document Object Model is not a set of data structures; it is an object model that specifies interfaces. Although the DOM specification contains diagrams showing parent/child relationships, these are logical relationships defined by the programming interfaces, not representations of any particular internal data structures.
• The Document Object Model does not define "the true inner semantics" of XML or HTML. The semantics of those languages are defined by the W3C Recommendations for those languages. The DOM is a programming model designed to respect these semantics. The DOM does not have any ramifications for the way you write XML and HTML documents; any document that can be written in these languages can be represented in the DOM.
• The Document Object Model, despite its name, is not a competitor to the Component Object Model (COM). COM, like CORBA, is a language-independent way to specify interfaces and objects; the DOM is a set of interfaces and objects designed for managing HTML and XML documents. The DOM may be implemented using language-independent systems like COM or CORBA; it may also be implemented using language-specific bindings like the Java or ECMAScript bindings given in the DOM specification.
What can a DOM do?
With the XML DOM, a programmer can create an XML document, navigate its structure, and add, modify, or delete its elements.
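The four operations named above (create, navigate, modify, delete) can be sketched with the Java binding of the DOM. This is a minimal illustration, not part of the original slides: the class name DomEditSketch and the element and attribute names (course, title, draft, id) are made up, and it assumes a JAXP parser is available on the classpath, as it is in current JDKs.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class; all element and attribute names are illustrative.
class DomEditSketch {
    /** Builds a small document in memory, then modifies and prunes it. */
    static Document buildAndEdit() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }

        // Create: a root element with two children.
        Element course = doc.createElement("course");
        doc.appendChild(course);
        Element title = doc.createElement("title");
        title.appendChild(doc.createTextNode("XML"));
        course.appendChild(title);
        Element draft = doc.createElement("draft");
        course.appendChild(draft);

        // Navigate: step from the root element to its first child.
        Element first = (Element) course.getFirstChild();

        // Modify: change the child's text and add an attribute to the root.
        first.getFirstChild().setNodeValue("XML and the DOM");
        course.setAttribute("id", "MV4920");

        // Delete: remove the second child.
        course.removeChild(draft);
        return doc;
    }

    public static void main(String[] args) {
        Element root = buildAndEdit().getDocumentElement();
        System.out.println(root.getTagName() + " id=" + root.getAttribute("id"));
        System.out.println("children: " + root.getChildNodes().getLength());
        System.out.println("title: " + root.getFirstChild().getFirstChild().getNodeValue());
    }
}
```

Only DOM Level 1 calls are used here, so the sketch should behave the same on any conforming implementation.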
DOM Tree
The DOM represents an XML document as a tree. The documentElement is the top level of the tree. This element's childNodes represent the branches of the tree.
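As a rough illustration of the tree view, the sketch below starts at the document element and recurses through childNodes, emitting one indented line per element. The class name DomTreeSketch and the sample markup are assumptions made for this example only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; the sample XML is illustrative.
class DomTreeSketch {
    static final String SAMPLE =
            "<library><book><title>DOM</title></book><book/></library>";

    /** Parses XML held in a string; parse failures are rethrown unchecked. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns one line per element, indented two spaces per tree level. */
    static String outline(Document doc) {
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
        }
        // childNodes holds the branches below this node (possibly none).
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline(parse(SAMPLE)));
    }
}
```

Text nodes are also children in the tree; the sketch visits them but prints only elements to keep the outline short.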
Why the Document Object Model?
"Dynamic HTML" is a term used by some vendors to describe the combination of HTML, style sheets, and scripts that allows documents to be animated. The W3C has received several submissions from member companies on the way in which the object model of HTML documents should be exposed to scripts. These submissions do not propose any new HTML tags or style sheet technology. The W3C DOM WG is working to make sure that interoperable, scripting-language-neutral solutions are agreed upon.
DOM Level (Architecture)
The DOM is separated into three parts: Core, HTML, and XML. The Core DOM provides a low-level set of objects that can represent any structured document. While by itself this interface is capable of representing any HTML or XML document, the core interface is a compact and minimal design for manipulating the document's contents. Depending upon the DOM's usage, the core DOM interface may not be convenient or appropriate for all users.
The HTML and XML specifications provide additional, higher-level interfaces that are used with the core specification to provide a more convenient view into the document. These specifications consist of objects and methods that provide easier and more direct access into the specific types of documents.
DOM Level (Architecture)
The DOM architecture is divided into modules, organized into four levels (Level 0 through Level 3). Each module addresses a particular domain or specializes an existing module for one. Domains covered by the current DOM API are XML, HTML, CSS, and events.
DOM Level 0
Functionality equivalent to that exposed in Netscape Navigator 3.0 and Microsoft Internet Explorer 3.0 is referred to as "Level 0". There is no W3C specification for this Level.
DOM Level 1 (Architecture)
The first Level of the DOM specifications (DOM Level 1) was completed in October 1998. Level 1 provides support for XML 1.0 and HTML.
DOM Level 2 (Architecture)
The second Level of the DOM specifications (DOM Level 2) was completed in November 2000. Level 2 extends Level 1 with support for XML 1.0 with namespaces, adds support for Cascading Style Sheets (CSS) and events (user interface events and tree manipulation events), and enhances tree manipulation (tree ranges and traversal mechanisms).
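To make the namespace support concrete, here is a small sketch using the Level 2 Core method getElementsByTagNameNS from the Java binding. The class name DomNamespaceSketch and the sample markup are illustrative assumptions. The sample contains three circle elements, but only two of them belong to the SVG namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class; the sample markup is illustrative.
class DomNamespaceSketch {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    // One prefixed circle, one circle in no namespace, one in the default namespace of <g>.
    static final String SAMPLE =
            "<doc xmlns:s=\"" + SVG_NS + "\">"
            + "<s:circle/>"
            + "<circle/>"
            + "<g xmlns=\"" + SVG_NS + "\"><circle/></g>"
            + "</doc>";

    /** Counts elements with the given namespace URI and local name. */
    static int countInNamespace(String xml, String namespaceUri, String localName) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Without this flag the parser does not record namespace information.
        factory.setNamespaceAware(true);
        try {
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(namespaceUri, localName).getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints 2: the prefixed circle and the one inside <g>
        System.out.println(countInNamespace(SAMPLE, SVG_NS, "circle"));
    }
}
```

The lookup matches on namespace URI and local name, so the prefix chosen in the document (s: or none) does not matter.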
DOM Level 3 (Architecture)
The third Level of the DOM specifications, DOM Level 3, is under development. Level 3 will extend Level 2 by finishing support for XML 1.0 with namespaces (alignment with the XML Infoset and support for XML Base) and will extend the user interface events (keyboard). It will also add abstract schemas support (for DTDs, XML Schema, ...), the ability to load and save a document or an abstract schema, further explore mixed markup vocabularies and their implications for the DOM API ("Embedded DOM"), and support XPath.
DOM Level Future
Further Levels may specify an interface to the underlying window system, if any, including ways to prompt the user. They may also contain a query language interface and address multithreading and synchronization, security, and repository.
Example
Simple example of DOM usage (input):
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;

    // Configure a namespace-aware, validating parser.
    DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
    documentBuilderFactory.setNamespaceAware(true);
    documentBuilderFactory.setValidating(true);
    DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
    // Start with an empty document in case the parse fails.
    Document document = documentBuilder.newDocument();
    try {
        document = documentBuilder.parse(fileName);
    } // end try
    catch (Exception badParse) {
        …
    }
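The input fragment above can be expanded into a self-contained sketch. The version below is an assumption-laden illustration rather than the demo's actual code: it parses from a string instead of a file, uses a made-up class name (DomParseSketch) and sample documents, and installs an ErrorHandler that turns validity errors into exceptions, since a validating JAXP parser otherwise reports them to a default handler and continues.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical example class; the sample documents are illustrative.
class DomParseSketch {
    static final String DTD = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]>";
    static final String VALID = DTD + "<note>hi</note>";
    static final String INVALID = DTD + "<note><extra/></note>";

    /** Parses with DTD validation; any validity or syntax error becomes a SAXException. */
    static Document parseValidated(String xml) throws SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or input failure", e);
        }
    }

    /** Convenience wrapper: true if the document parses and validates. */
    static boolean isValid(String xml) {
        try {
            parseValidated(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("valid sample:   " + isValid(VALID));
        System.out.println("invalid sample: " + isValid(INVALID));
    }
}
```

Throwing from the handler is one design choice among several; collecting the errors in a list and returning the partially built document would fit a repair-oriented workflow better.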
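Example
Simple example of DOM usage (output):
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import org.w3c.dom.Node;

    // file is assumed to be an open java.io.OutputStream.
    Writer writer = new OutputStreamWriter(file);
    switch (node.getNodeType()) {
        case Node.CDATA_SECTION_NODE:
            // Re-wrap the node's text in CDATA delimiters.
            writer.write("<![CDATA[" + node.getNodeValue() + "]]>\n");
            break;
    } // end switch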
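The output fragment shows a single case of the switch. A fuller, still simplified, version of the same idea might look like the sketch below, which recurses over the tree and handles document, element, text, CDATA, and comment nodes. It is not a complete serializer: the class name DomWriteSketch is made up, other node types are skipped, no XML declaration or DOCTYPE is written, and CDATA content containing "]]>" is not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; a teaching sketch, not a production serializer.
class DomWriteSketch {
    /** Writes the subtree rooted at node as XML text. */
    static void write(Node node, Writer out) throws IOException {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                writeChildren(node, out);
                break;
            case Node.ELEMENT_NODE: {
                out.write("<" + node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.write(" " + attr.getNodeName() + "=\"" + escape(attr.getNodeValue()) + "\"");
                }
                out.write(">");
                writeChildren(node, out);
                out.write("</" + node.getNodeName() + ">");
                break;
            }
            case Node.TEXT_NODE:
                out.write(escape(node.getNodeValue()));
                break;
            case Node.CDATA_SECTION_NODE:
                out.write("<![CDATA[" + node.getNodeValue() + "]]>");
                break;
            case Node.COMMENT_NODE:
                out.write("<!--" + node.getNodeValue() + "-->");
                break;
            default:
                break; // other node types are skipped in this sketch
        }
    }

    private static void writeChildren(Node node, Writer out) throws IOException {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            write(children.item(i), out);
        }
    }

    /** Escapes the characters that are unsafe in text and quoted attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Parses the given XML and writes it back out, for demonstration. */
    static String roundTrip(String xml) {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringWriter out = new StringWriter();
            write(doc, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<a x=\"1\"><![CDATA[<b>]]>t &amp; u</a>"));
    }
}
```

CDATA content is written unescaped because that is the purpose of a CDATA section, while ordinary text and attribute values go through escape.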
Demo
Problem: read an incomplete XML document, merge it with a DTD to identify missing attributes, read a directory structure to locate information to update default values, and output a new XML document with the updates.
Demo script: XMLtoDOMtoXML.bat
[Diagram labels: XML Document, DTD Document, Demo, DOM, Updated DOM, Updated XML Document, Directories (access with Java code)]
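One way to approach the "identify missing attributes" step is to let the parser apply the DTD's attribute defaults and then ask each attribute whether it was actually written in the document, using Attr.getSpecified(). The sketch below is an assumption about how such a step could work, not the demo's code: the class name DtdDefaultsSketch, the unit element, and its attributes are invented, the DTD is an internal subset for self-containment, and the directory lookup is replaced by a hard-coded stand-in value.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical example class; element and attribute names are illustrative.
class DtdDefaultsSketch {
    // The document omits "status", so the parser supplies the DTD default.
    static final String SAMPLE =
            "<!DOCTYPE unit ["
            + "<!ELEMENT unit EMPTY>"
            + "<!ATTLIST unit name CDATA #REQUIRED status CDATA \"unknown\">"
            + "]>"
            + "<unit name=\"alpha\"/>";

    /** Parses the XML and returns its document element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Lists attributes whose values came from the DTD rather than the document. */
    static List<String> defaultedAttributes(Element element) {
        List<String> names = new ArrayList<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            if (!attr.getSpecified()) {
                names.add(attr.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element unit = parseRoot(SAMPLE);
        for (String name : defaultedAttributes(unit)) {
            System.out.println(name + " defaulted to " + unit.getAttribute(name));
            // Stand-in for a value located elsewhere (e.g. in a directory structure).
            unit.setAttribute(name, "ready");
        }
        System.out.println("status is now " + unit.getAttribute("status"));
    }
}
```

After the defaulted values are replaced, the updated DOM can be written back out as a new XML document.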
Resources
Some resources related to the DOM can be found at:
• Robin Cover's DOM pages: http://www.oasis-open.org/cover/dom.html#W3CDocs
• The Open Directory Project W3C DOM pages: http://dmoz.org/Computers/Programming/Internet/W3C_DOM/
Some related DOM-based APIs are being developed as well, for example in the specifications for:
• Mathematical Markup Language: http://www.w3.org/TR/MathML2/
• Scalable Vector Graphics: http://www.w3.org/TR/SVG/
• Synchronized Multimedia Integration Language: http://www.w3.org/TR/smil-boston-dom/
NOTE: Placing the following *.jar files from JAXP in your jdk*.*.*/jre/lib/ext directory may be required for use of the DOM: crimson.jar, jaxp.jar, and xalan.jar.
Summary
The DOM is a programming interface for XML documents that defines the way an XML document can be accessed and manipulated.