XML Development Chuck Wood Carlson School of Management University of Minnesota
XML, HTML, SGML, …
HTML was derived from SGML for Web displays. It is not:
• Extensible (You can’t make your own tags.)
• Structured (You can’t define types of data.)
• Descriptive (Fixed Head, Body, and that’s all.)
• Validating (There’s no way to test your data for valid representation.)
XML, also a derivative of SGML, allows better handling of data.
[Diagram: SGML, with HTML and XML derived from it]
What Is XML (and, of course, what is it not)?
• XML (eXtensible Markup Language) is a markup language, like HTML, that allows relatively easy transfer of information via a Web page.
• You still need HTML for traditional Web pages! For that job, HTML is:
  • Faster
  • Easier
  • Smaller
Why You Should Learn XML…
• XML is a new technology.
• MIS managers need to understand technology.
• If I offered to set you up with a Web-based EDI system using XML for $50,000, is that a good deal or a bad deal?
• Evaluating some MIS managerial questions requires a fairly deep understanding of the underlying technology.
Format for XML
• Valid XML requires several files:
  • An XML file for data and tags (Invoice.xml)
  • A DTD (Document Type Definition) file to define the tags and how they fit together (Invoice.dtd)
  • Possibly, an XSL file for formatting (Invoice.xsl)
Well-formed XML
XML consists mainly of:
• element tags
• attributes
Rules to remember:
• You define your own element tags.
• Every element tag has an ending tag or terminator.
• Attributes are in quotes.
• XML is case sensitive!!!
• Well-formed XML is not necessarily valid XML.
    <class id="IDSc6442">
      <section id="001">
        <instructor name="Chuck Wood" id="1234"/>
        <student>Joe Student</student>
        <student>Jane Student</student>
      </section>
    </class>
In this example, <class> is an element, id="IDSc6442" is an attribute, the "/>" that closes <instructor> is a terminator, and </student> is an ending tag.
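The well-formedness rules on this slide can be checked with any XML parser. The sketch below is one way to do it, assuming Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` is made up for illustration. It only tests well-formedness, not validity against a DTD.

```python
import xml.etree.ElementTree as ET

# The class document from the slide, with plain straight quotes.
CLASS_XML = """<class id="IDSc6442">
  <section id="001">
    <instructor name="Chuck Wood" id="1234"/>
    <student>Joe Student</student>
    <student>Jane Student</student>
  </section>
</class>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The slide's document parses cleanly, so we can read the student names.
root = ET.fromstring(CLASS_XML)
students = [s.text for s in root.iter("student")]

# Tag names are case sensitive, so </Student> does not close <student>.
bad_case = "<student>Joe Student</Student>"
# Attribute values must be in quotes.
bad_quotes = "<section id=001></section>"

print(students)
print(is_well_formed(CLASS_XML), is_well_formed(bad_case), is_well_formed(bad_quotes))
```

A parser like this one rejects the two broken snippets outright, which is why a single case or quoting mistake stops the whole document from loading.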
XML Comments
• Like HTML, XML uses a <!-- tag for a comment:
    <!-- This is a comment. -->
• Be sure to comment your code (so I know what you were trying to do)!!!
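XML Processing Instructions
• Processing instructions (PIs) allow XML documents to contain instructions for applications, using the <? tag.
• Often, the only line of this kind in an XML document is the one that describes the XML version used:
    <?xml version="1.0" ?>
  (Strictly speaking, the XML specification calls this line the XML declaration rather than a processing instruction, but it uses the same <? ... ?> syntax.)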
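To see how an application picks up a processing instruction, here is a small sketch using Python's standard `xml.dom.minidom`. The `<invoice/>` root element is a made-up placeholder, and the `xml-stylesheet` instruction pointing at Invoice.xsl is an assumed example of a PI, not something taken from the slides.

```python
from xml.dom import minidom

# A tiny document: XML declaration, one processing instruction, one element.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="Invoice.xsl"?>
<invoice/>"""

doc = minidom.parseString(DOC)

# Collect (target, data) for every processing instruction at the top level.
pis = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

for target, data in pis:
    print(target, "->", data)
```

Only the `xml-stylesheet` line shows up in the list; the parser consumes the `<?xml version ... ?>` line itself and does not hand it to the application as a PI node.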
What’s It Look Like in IE 5.0? In IE 5.0, XML defaults to display as a tree document.
DTDs and Valid XML
• XML is well-formed when attributes are in quotes and every tag is terminated and properly nested.
• Well-formed XML is not necessarily valid XML.
• A DTD is used to define each possible element inside a valid XML file:
  • Each element must be declared.
  • Each element needs a content type (child elements, text, or EMPTY).
  • The cardinality (one to many, one to one, etc.) between an element and its child elements must be declared.
  • Every attribute of each element must be declared.
• DTDs allow a shared language between those with shared interests.
The Importance of DTDs
DTDs are extremely important. Companies can use DTDs to agree on a standard language for each other’s XML. Several scientific, educational, and financial industries have already defined DTDs that are available so different companies can communicate with each other through XML. An XML document needs a DTD to be valid. Many parsers will still read well-formed XML that has no DTD, but XML exchanged between partners will not (and should not) be trusted without one.
FIXML – XML for Financial Information Exchange How long before the SEC requires electronic transfer of all 10K information? FIXML allows the transfer of financial data.
FpML – Financial Products Markup Language FIXML may have some competition. FpML allows the easy transfer of financial products data.
Chemistry Markup Language -- CML Having a common language is vital to Web-EDI within industries. The Food and Drug Administration (FDA) and the Oak Ridge National Laboratory have developed a chemistry DTD called CML. Elements like <ATOMS>, <MOL> (for molecule), <BONDS>, <MOLE>, and <FORMULA> are defined.
WML – Wireless Markup Language for WAP
DTDs for Dessert? A common language for hobbies? Ya gotta eat, right? Why not have a DTD that allows easy transfer of recipes, like this one?
DTDs for Dissertations This one is, of course, near and dear to my heart…
A DTD for (more than) Everyone XML is relatively new, but already there seem to be more XML DTD specifications than there are Washington lobbyists. Let the fight for standards begin!
Tying XML to a DTD
• DTDs typically exist in a separate file.
• To access this file within XML, just add a DOCTYPE declaration with the <! tag. Here the DTD file is student.dtd:
    <?xml version="1.0" ?>
    <!-- student.xml -->
    <!DOCTYPE class SYSTEM "student.dtd">
    <class id="IDSc6442">
      <section id="001">
        <instructor name="Chuck Wood" id="1234"/>
        <student>Joe Student</student>
        <student>Jane Student</student>
      </section>
    </class>
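A program can read the DOCTYPE line to find out which DTD a document claims to follow. The sketch below assumes Python's standard `xml.dom.minidom`, which records the declaration but does not fetch student.dtd or validate against it.

```python
from xml.dom import minidom

# student.xml from the slide, with plain straight quotes.
STUDENT_XML = """<?xml version="1.0" ?>
<!-- student.xml -->
<!DOCTYPE class SYSTEM "student.dtd">
<class id="IDSc6442">
  <section id="001">
    <instructor name="Chuck Wood" id="1234"/>
    <student>Joe Student</student>
    <student>Jane Student</student>
  </section>
</class>"""

doc = minidom.parseString(STUDENT_XML)
doctype = doc.doctype

# The name after DOCTYPE must match the document's root element.
root_name = doc.documentElement.tagName

print("root element:", root_name)
print("DOCTYPE name:", doctype.name)
print("DTD file:", doctype.systemId)
```

Because this parser never opens the DTD file, the document loads even when student.dtd is missing. Checking validity takes a validating parser.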
Valid XML and DTDs
• DTDs are required for valid XML. A DTD like the one below, used with the class XML on the previous slide, is what turns well-formed XML into valid XML.
• Every XML tag needs to be defined in the DTD file:
    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- student.dtd -->
    <!ELEMENT class (section*) >
    <!ATTLIST class id ID #REQUIRED>
    <!ELEMENT section (instructor?, student*)>
    <!ATTLIST section id ID #REQUIRED>
    <!ELEMENT instructor EMPTY>
    <!ATTLIST instructor
        name CDATA #REQUIRED
        id   ID    #IMPLIED>
    <!ELEMENT student (#PCDATA)>
• Note: values of type ID must be valid XML names, which cannot begin with a digit, so a validating parser would reject id="001" and id="1234" until they are given a leading letter (for example, id="s001").
DTD Language Structures
• DTDs consist of two basic structures:
  • Elements are tags that you define in your XML.
  • Attributes are part of elements. They allow you to describe an element more fully.
• Relationships between elements can be defined.
• Entities can also be defined, but are rarely used (and not addressed much in this class).
Entities (Our First and Last Coverage of XML Entities)
Entities are probably misnamed. They are not often used; an entity gives a name to some text or a file that you want to reuse. Consider the trademark entity below:
    <!ENTITY trademark SYSTEM "http://www.chuckwood.com/trademark.xml">
This allows XML to incorporate the file’s contents by using a &trademark; entity reference. Entities can also be used to define certain constants to be used in XML pages.
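The "constant" use of entities is easy to try out. The slide's trademark entity is an external one (it points at a file), and Python's standard parser does not fetch external files by default, so this sketch uses an internal entity instead. The names `notice` and `owner` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# An internal entity declared inside the document's own DOCTYPE.
DOC = """<!DOCTYPE notice [
  <!ENTITY owner "Chuck Wood">
]>
<notice>Copyright &owner;</notice>"""

root = ET.fromstring(DOC)

# The parser replaces &owner; with the declared text before we see it.
print(root.text)
```

The application never sees the `&owner;` reference, only the expanded text, so changing the entity declaration changes every place that uses it.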
Elements
• An <!ELEMENT declaration in a DTD defines an element tag allowable in XML.
• XML elements can be one of three types, as demonstrated by DTD coding of HTML tags:
  • Elements to be terminated immediately, such as the <BR> HTML tag:
      <!ELEMENT BR EMPTY>
  • Elements that have corresponding child elements, such as the <TR> HTML tag:
      <!ELEMENT TR (TH*, TD*)>
  • Elements that have text, usually character data, such as the <H1> HTML tag:
      <!ELEMENT H1 (#PCDATA)>
Element-to-Element Cardinality
• You can establish cardinal elemental relationships in a DTD. For example, consider the “?” and “*” operators in the following DTD:
    <!ELEMENT section (room, instructor?, student*)>
• The operator after a child element’s name indicates the cardinality:
  • “?” indicates one-to-zero or one
  • “” (blank) indicates one-to-one
  • “*” indicates one-to-zero or more
  • “+” indicates one-to-one or more
• This DTD states that each section has exactly one room, zero or one instructor, and zero or more students.
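The cardinality operators behave much like the same symbols in a regular expression, and that gives a quick way to experiment with them. The sketch below is not a real validator. It is a simplified illustration, assuming Python's standard `re` and `xml.etree.ElementTree` modules, that handles only element names, commas, and the ?, *, +, and | operators (no #PCDATA, EMPTY, or ANY). The helper names `model_to_regex` and `children_match` are made up.

```python
import re
import xml.etree.ElementTree as ET

# Matches either an element name or the commas/spaces between names.
TOKEN = re.compile(r"(?P<name>[A-Za-z_][\w.-]*)|(?P<skip>[,\s]+)")


def model_to_regex(model):
    """Turn a simple DTD content model into a regex over child tag names."""
    def convert(match):
        if match.group("name"):
            # One child element: its tag name followed by a ";" separator.
            return "(?:" + re.escape(match.group("name")) + ";)"
        # Sequence commas and whitespace have no regex counterpart.
        return ""
    # Parentheses, ?, *, + and | pass through with their usual meaning.
    return re.compile(TOKEN.sub(convert, model))


def children_match(element, model):
    """Check the order and count of an element's children against a model."""
    names = "".join(child.tag + ";" for child in element)
    return model_to_regex(model).fullmatch(names) is not None


MODEL = "(room, instructor?, student*)"

ok = ET.fromstring("<section><room/><student/><student/></section>")
two_instructors = ET.fromstring("<section><room/><instructor/><instructor/></section>")
wrong_order = ET.fromstring("<section><student/><room/></section>")

print(children_match(ok, MODEL))
print(children_match(two_instructors, MODEL))
print(children_match(wrong_order, MODEL))
```

The first section matches the model. The second fails because "?" allows at most one instructor, and the third fails because the commas fix the order.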
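Element Order
• A DTD allows you to force a specific element order inside your XML document. The following DTD:
    <!ELEMENT section (room, instructor?, student*)>
  forces room, instructor, and students to appear in that order.
• The “pipe” operator (|) means “choose one of these.” To allow child elements in any order, put the choices in a group and repeat the group with “*”:
    <!ELEMENT section (room | instructor | student)*>
• Note that the repeated group no longer limits how many times each child appears. Written with pipes alone, (room | instructor? | student*) would accept a room, or an optional instructor, or a run of students, but not a mix of them.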
Declaring No Cardinality
• You signify that an element has no relationships to any other element using the EMPTY keyword in the DTD:
    <!ELEMENT instructor EMPTY>
• The instructor element is allowed to have attributes, but can have no text or subordinate XML element tags:
    <instructor name="Chuck Wood"/>
  or
    <instructor name="Chuck Wood"></instructor>
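The two ways of writing an empty element are interchangeable as far as a parser is concerned. A minimal check, assuming Python's standard `xml.etree.ElementTree` (the helper name `is_empty` is made up):

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<instructor name="Chuck Wood"/>')
long_form = ET.fromstring('<instructor name="Chuck Wood"></instructor>')


def is_empty(element):
    """True when the element has no text and no child elements."""
    return element.text is None and len(element) == 0


# Both spellings give the same tag and the same attributes.
same = short_form.tag == long_form.tag and short_form.attrib == long_form.attrib

print(is_empty(short_form), is_empty(long_form), same)
```

Either spelling still carries the name attribute, which is all an EMPTY element is allowed to hold.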
PCDATA
• Declaring #PCDATA as a child type allows text to be entered between a tag and its ending tag.
• The following DTD code:
    <!ELEMENT student (#PCDATA)>
  allows the following XML:
    <student>John Doe</student>
• Without #PCDATA, no text (e.g., “John Doe”) would be allowed to be entered between the starting and ending tags.
XML Attributes
• XML attributes are name="value" pairs placed within an element’s start tag:
    <instructor name="Chuck Wood" id="1234"/>
• Attributes are defined in an attribute list in the DTD using the <!ATTLIST directive:
    <!ELEMENT instructor EMPTY>
    <!ATTLIST instructor
        name CDATA #REQUIRED
        id   ID    #IMPLIED>
Attribute Types
• There are several different attribute types, but two are particularly important:
  • CDATA: allows any character data
  • ID: forces the attribute’s value to be unique within the document; the value must be a valid XML name, so it cannot start with a digit
• For a complete list, if you want one, check out W3C’s document definition markup language at: http://www.w3.org/TR/NOTE-ddml.
Attribute Restrictions
• Each attribute declaration states whether the attribute’s presence is required and whether it has a default:
  • #REQUIRED indicates the attribute must always be present
  • #IMPLIED indicates that the attribute need not be present
  • A default value (a quoted string) is used whenever the attribute is left out
  • #FIXED followed by a default value indicates that the attribute must always have that specific value
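Default and #FIXED values are filled in by the parser, so the application sees them even when the XML author leaves them out. The sketch below assumes Python's standard `xml.parsers.expat` module and a DTD written inside the document itself. The `term` and `school` attributes and their values are made up for illustration.

```python
from xml.parsers import expat

# A DTD placed inside the document (an "internal subset").
DOC = """<!DOCTYPE section [
  <!ELEMENT section EMPTY>
  <!ATTLIST section
      id      CDATA #REQUIRED
      term    CDATA "fall"
      school  CDATA #FIXED "Carlson">
]>
<section id="s001"/>"""

seen = {}


def start_element(name, attrs):
    # attrs maps attribute names to values, including DTD-supplied defaults.
    seen[name] = attrs


parser = expat.ParserCreate()
parser.StartElementHandler = start_element
parser.Parse(DOC, True)

print(seen["section"])
```

The document only writes `id`, yet `term` and `school` arrive with their declared values. This parser does not validate, so it would not complain if the #REQUIRED `id` were missing; catching that takes a validating parser.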
Designing DTDs and XML
• The subject of how to design a DTD has come up quite frequently. Specifically, what should be attributes and what should be child elements?
• The most common response is that you should use object-oriented design techniques:
  • Entities in the OO design should be classified as XML elements.
  • Attributes in the OO design should become attributes in XML.
Things you should know…
• Many tools (like IE 5.0 and the Microsoft DOM) can parse invalid but well-formed XML.
• Never transmit invalid XML (XML without a DTD)!!!
• DTDs are important in developing a shared language between partners, and for debugging bad XML.
• XmL iS CaSe SenSiTiVe!!!
• TIP: Many sites use lower case only for their tags. It’s easier.