XML and COBOL

bella

Presentation Transcript


  1. XML and COBOL

  2. XML • XML = Extensible Markup Language • Used to expose the structure and content of a document • Becoming a universal means of exchanging data • Tag language:
       <author>
         <firstname>Charles</firstname>
         <lastname>Dickens</lastname>
       </author>

  3. XML • Tags are user-defined • Every start tag has a matching end tag: <atag> … </atag> • An empty element can combine the start and end tags into one: <media type="CD"/> • Tags can't overlap. NO: <a> <b> </a> </b>

  4. XML • Tags can be nested: <a> <b> </b> </a> • Documents are tree-structured: in <a> <b></b> <c> <d></d> </c> </a> the root a has children b and c, and c has child d

  5. XML • Text-based documents • Case sensitive • Must contain a single root element • Can start with an XML declaration and comments: <?xml version="1.0"?> <!-- comment line --> <a> </a>

  6. XML • XML is “Well Formed” if 1) Single root element 2) Start and end tags matching for all elements 3) Proper nesting 4) Attribute values in quotes

  7. XML Parsers • An XML parser is a program that can read an XML document and provide programmatic access to the document • Two types of parsers: 1) DOM based – Document Object Model Constructs a tree that represents the document 2) SAX based – Simple API for XML Generates events when parts of the document are encountered. • Can also be classified as “push” or “pull” parsers

  8. XML Characters • Consist of carriage returns, line feeds and Unicode characters • XML is either “markup” or “text” • Markup is enclosed in < and > • Character text is the text between a start and end tag • Child elements are considered markup

  9. White Space • Parsers consider whitespace inside text data to be significant and must pass it to an application • An application can consider whitespace significant or insignificant. • Normalization is the process in which whitespace is collapsed or removed

  10. Entities • &, <, >, ' (apostrophe), and " (double quote) are special characters • & and < may never be used directly in character data; the others must be escaped only where they would be ambiguous (for example, a quote inside an attribute value delimited by the same quote) • To use these characters we code entity references, which begin with an ampersand and end with a semicolon • &amp; &lt; &gt; &apos; &quot; • <mytag>David&apos;s Tag</mytag>

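  11. Unicode • XML supports Unicode • Any Unicode character can be written as a numeric character reference: an ampersand, followed by a sharp (#), the character's decimal code point, and a semicolon (use &#x for a hexadecimal code point) • &#1583; is U+062F, the Arabic letter dal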

  12. Determining the Encoding Type • Sources used to determine the encoding of an XML document when XMLPARSE(XMLSS) is in effect: • The type of data item that contains the XML document. (We will only consider alphanumeric.) • The ENCODING phrase (if used) on the PARSE statement • The CCSID specified in the CODEPAGE compiler option

  13. Two Ways to Specify the Encoding for XMLPARSE(XMLSS) • Put the document in an alphanumeric item (PIC X), then either 1) specify the encoding on the XML PARSE statement: XML PARSE MYDOC WITH ENCODING 1208 … or 2) add a CODEPAGE compiler option

  14. Markup • Most items have distinct begin and end tags: <name>David</name> • Empty elements begin and end with one tag: <img src="img.gif"/> • Tags can contain "attributes", as in the src attribute in the img tag above • Attribute values must be quoted with single or double quotes • Element and attribute names can be any length and may contain letters, digits, underscores, hyphens and periods. They must begin with a letter or underscore.

  15. Comments • Comments in XML have the same format as HTML • Start with <!-- • End with --> • <!-- This is a comment -->

  16. Processing Instructions • Example: <?xml-stylesheet type="text/xsl" href="usage.xsl"?> • Delimited by <? and ?> • Passed to the parser for additional information about the document • Contain a “PI Target”: xml-stylesheet • Contain a “PI Value”: type="text/xsl" href="usage.xsl" • Allow a document author to embed application-specific info in the document

  17. CDATA Sections • A CDATA section can contain characters, reserved characters, and white space • Its content is not interpreted as markup by the XML parser • Sometimes used for scripting code or embedding XML inside a document • Begin with <![CDATA[ • End with ]]> • <, > and & do not need to be coded as entities inside a CDATA section

  18. CDATA Example <![CDATA[ if (x > y) { x = x + y; } ]]>

  19. XML Namespaces • Different authors might create the same tag names with different meanings • Namespaces provide a means for authors to prevent collisions on tag names • <book:title>The Idiot</book:title> • <movie:title>Avatar</movie:title>

  20. XML Namespaces • Each namespace is tied to a Uniform Resource Identifier (URI) • Authors create their own namespace prefixes • Namespaces are often declared in the root tag: <library xmlns:book="urn:woolbright:bookinfo" xmlns:movie="urn:woolbright:movieinfo"> • URLs are sometimes used for URIs

  21. Default Namespaces • To avoid coding a prefix on every tag, code a default namespace: <library xmlns="urn:woolbright:default" xmlns:book="urn:woolbright:bookinfo" xmlns:movie="urn:woolbright:movieinfo">

  22. Enterprise COBOL • Contains two event-based parsers that allow you to read XML documents and process them with COBOL • XML documents can be retrieved from an MQ message, CICS TD queue, or IMS message processing queue • XML documents that are read from a file must be brought into storage as a single item. (Records can be combined using STRING or other techniques) • If XMLPARSE(XMLSS) is in effect, you can instead parse an XML file by passing one record at a time

  23. z/OS COBOL Features for Processing XML Input • XML PARSE begins parsing the document and identifies the processing procedure in your program:
        XML PARSE MYDOCUMENT
            PROCESSING PROCEDURE 100-PARSE
            ON EXCEPTION
                DISPLAY 'XML DOCUMENT ERROR ' XML-CODE
                STOP RUN
            NOT ON EXCEPTION
                DISPLAY 'PARSED DOCUMENT SUCCESSFULLY'
        END-XML

  24. z/OS COBOL Features for Processing XML Input • Processing XML involves passing control between the parser and the processing procedure you write to handle events • Processing Procedure – receives and processes the events that are generated by the parser. This is a paragraph or section in your Cobol program

  25. Parser/Procedure Interaction • Parser passes control to the procedure for each XML event • Control returns to the parser at the end of the procedure • This continues until one of the following occurs: 1) the whole document has been parsed and the parser signals END-OF-DOCUMENT, 2) the parser detects an error in the document and signals an EXCEPTION event, 3) the parser signals an END-OF-INPUT event and the processing procedure returns to the parser with XML-CODE still set to 0, or 4) you terminate parsing deliberately by setting XML-CODE to -1 before returning to the parser
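
The hand-off between parser and processing procedure can be sketched as a small complete program. This is only an illustration, assuming Enterprise COBOL on z/OS with XMLPARSE(XMLSS) in effect; the program name, the MYDOCUMENT layout, EVENT-COUNT and the decision to stop after the firstname element are all made up for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLFLOW.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical in-memory document (the author example).
       01  MYDOCUMENT.
           05  FILLER PIC X(8)  VALUE '<author>'.
           05  FILLER PIC X(30) VALUE '<firstname>Charles</firstname>'.
           05  FILLER PIC X(28) VALUE '<lastname>Dickens</lastname>'.
           05  FILLER PIC X(9)  VALUE '</author>'.
       01  EVENT-COUNT          PIC 9(4) VALUE 0.
       PROCEDURE DIVISION.
           XML PARSE MYDOCUMENT
               PROCESSING PROCEDURE 100-PARSE
               ON EXCEPTION
                   DISPLAY 'PARSE ENDED EARLY, XML-CODE=' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'PARSED DOCUMENT SUCCESSFULLY'
           END-XML
           GOBACK.

       100-PARSE.
      * Entered once per event; reaching the end of the
      * paragraph returns control to the parser.
           ADD 1 TO EVENT-COUNT
           DISPLAY EVENT-COUNT ' ' XML-EVENT
      * Deliberate early exit once </firstname> has been seen.
           IF XML-EVENT = 'END-OF-ELEMENT' AND XML-TEXT = 'firstname'
               MOVE -1 TO XML-CODE
           END-IF.
```

Because the procedure sets XML-CODE to -1, parsing stops before the lastname element and control passes to the ON EXCEPTION phrase. Removing the IF lets the parser run through to END-OF-DOCUMENT instead.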

  26. XML Parsers • Compiler options control the parser type • CBL XMLPARSE(XMLSS) – chooses the z/OS XML System Services Parser. This provides enhanced features – namespace processing, validation with respect to a schema • CBL XMLPARSE(COMPAT) – chooses the parser built into the COBOL library

  27. Processing Procedure • The parser reads the document and generates events as it recognizes parts of the document • It gives control to the processing procedure when each event occurs • The processing procedure responds to the event and returns control to the parser for further parsing

  28. Parse Syntax Here is the syntax for the XML PARSE statement

  29. XML PARSE • XML PARSE begins parsing, identifies the source document, and the processing procedure • Specify the ENCODING option to describe the document’s encoding • Specify the VALIDATING option to identify an XML schema against which the document will be validated

  30. COBOL Features for Processing XML Input • Special Registers • XML-CODE - to determine the status of XML parsing. PIC S9(9) BINARY • XML-EVENT - to receive the name of the event. PIC X(30) • XML-NTEXT - to receive XML document fragments that are returned as national character data (Unicode). Variable-length national item • XML-TEXT - to receive XML document fragments returned as alphanumeric data. Variable-length alphanumeric item

  31. COBOL Features for Processing XML Input • Special Registers • XML-NAMESPACE - to receive the namespace identifier for a NAMESPACE-DECLARATION event, or for an element name or attribute name that is in a namespace. Variable-length alphanumeric item • XML-NNAMESPACE - to receive the same namespace identifier as national character data. Variable-length national item • XML-NAMESPACE-PREFIX - to receive a namespace prefix. Variable-length alphanumeric item • XML-NNAMESPACE-PREFIX - to receive a namespace prefix as national character data. Variable-length national item
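
As a rough sketch of how the namespace registers might be inspected, the program below parses a small namespaced document and displays the registers for two event types. It assumes XMLPARSE(XMLSS) is in effect (the COMPAT parser does not do namespace processing); the program name, NS-DOC and the paragraph name are hypothetical, and the document reuses the book namespace URI from the namespace slides.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document using the book namespace.
       01  NS-DOC.
           05  FILLER PIC X(46) VALUE
               '<library xmlns:book="urn:woolbright:bookinfo">'.
           05  FILLER PIC X(34) VALUE
               '<book:title>The Idiot</book:title>'.
           05  FILLER PIC X(10) VALUE '</library>'.
       PROCEDURE DIVISION.
           XML PARSE NS-DOC
               PROCESSING PROCEDURE 200-EVENTS
               ON EXCEPTION
                   DISPLAY 'XML ERROR, XML-CODE=' XML-CODE
           END-XML
           GOBACK.

       200-EVENTS.
           EVALUATE XML-EVENT
               WHEN 'NAMESPACE-DECLARATION'
                   DISPLAY 'DECLARED PREFIX=' XML-NAMESPACE-PREFIX
                           ' URI=' XML-NAMESPACE
               WHEN 'START-OF-ELEMENT'
      * XML-TEXT holds the element name without its prefix.
                   DISPLAY 'ELEMENT=' XML-TEXT
                           ' PREFIX=' XML-NAMESPACE-PREFIX
                           ' URI=' XML-NAMESPACE
               WHEN OTHER
                   CONTINUE
           END-EVALUATE.
```

For an element that is not in any namespace, such as library here, the namespace registers are expected to be empty.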

  32. Prior to Parsing • Unless you parse in segments with XMLPARSE(XMLSS), the entire XML document must be in memory before it can be parsed • Common sources of XML: • WebSphere MQ message • CICS transient data queue • CICS communication area • IMS message processing queue • Reading a file of records

  33. Reading XML Off A File • The entire XML file must be placed in a COBOL data item • You will need: • A FILE-CONTROL entry to define the file • An OPEN statement to open the file • READ statements to read all the records into a data item in WORKING-STORAGE • Optionally, a STRING statement to string all the separate records together into one continuous stream, removing extraneous blanks, and to handle variable-length records
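
One possible shape for the loading steps listed above is sketched here. It assumes fixed 80-byte records on a hypothetical DD name XMLIN and a 32000-byte buffer; those sizes and every data name are illustrative choices, not part of the original material. Trailing blanks on each record are dropped before the record is appended with STRING … WITH POINTER.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLLOAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * XMLIN is a hypothetical DD name for an 80-byte FB data set.
           SELECT XML-FILE ASSIGN TO XMLIN
               FILE STATUS IS XML-FILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  XML-FILE
           RECORDING MODE F.
       01  XML-RECORD           PIC X(80).
       WORKING-STORAGE SECTION.
       01  XML-FILE-STATUS      PIC XX.
           88  XML-FILE-OK      VALUE '00'.
       01  XML-BUFFER           PIC X(32000) VALUE SPACES.
       01  BUFFER-PTR           PIC 9(5) COMP VALUE 1.
       01  TRAILING-BLANKS      PIC 9(4) COMP.
       01  REC-LEN              PIC 9(4) COMP.
       01  DOC-LEN              PIC 9(5).
       PROCEDURE DIVISION.
           OPEN INPUT XML-FILE
           IF NOT XML-FILE-OK
               DISPLAY 'OPEN FAILED, STATUS=' XML-FILE-STATUS
               GOBACK
           END-IF
           READ XML-FILE
           PERFORM UNTIL NOT XML-FILE-OK
               PERFORM 100-APPEND-RECORD
               READ XML-FILE
           END-PERFORM
           CLOSE XML-FILE
           COMPUTE DOC-LEN = BUFFER-PTR - 1
           DISPLAY 'DOCUMENT LENGTH=' DOC-LEN
           GOBACK.

       100-APPEND-RECORD.
      * Count the blanks that pad the end of this record.
           MOVE 0 TO TRAILING-BLANKS
           INSPECT FUNCTION REVERSE(XML-RECORD)
               TALLYING TRAILING-BLANKS FOR LEADING SPACES
           COMPUTE REC-LEN = LENGTH OF XML-RECORD - TRAILING-BLANKS
      * Append only the significant part at position BUFFER-PTR.
           IF REC-LEN > 0
               STRING XML-RECORD(1:REC-LEN) DELIMITED BY SIZE
                   INTO XML-BUFFER
                   WITH POINTER BUFFER-PTR
                   ON OVERFLOW
                       DISPLAY 'XML-BUFFER IS TOO SMALL'
               END-STRING
           END-IF.
```

After the loop, XML-BUFFER holds the document as one continuous stream and DOC-LEN its length, ready to be handed to XML PARSE. Dropping trailing blanks is safe between tags, but it would also remove blanks from element content that happens to end exactly at a record boundary.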

  34. Parsing with XMLPARSE(XMLSS)
        XML PARSE document
            PROCESSING PROCEDURE event-handler-name
            ON EXCEPTION …
            NOT ON EXCEPTION …
        END-XML
      • Parsing continues until 1) an END-OF-DOCUMENT event occurs, 2) the parser signals an EXCEPTION event, 3) the parser signals END-OF-INPUT and the procedure returns to the parser with the XML-CODE register still set to 0, which indicates that no further XML data will be provided to the parser, or 4) you terminate processing by moving -1 to XML-CODE

  35. Parsing • If XMLPARSE(XMLSS) is in effect, you can also use any of these optional phrases of the XML PARSE statement: • ENCODING, to specify the CCSID of the document • RETURNING NATIONAL to cause the parser to automatically convert UTF-8 or single byte characters to national characters for return to the processing procedure • VALIDATING, to cause the parser to validate the document against an XML schema

  36. Events • For each event that occurs during parsing, the parser sets the associated event name in the XML-EVENT register and passes this to the processing procedure. • Depending on the event, other registers can also be set • Typically, XML-TEXT is set with the data that caused the event • Some typical events: START-OF-DOCUMENT START-OF-ELEMENT ATTRIBUTE-NAME END-OF-ELEMENT CONTENT-CHARACTERS START-OF-CDATA-SECTION END-OF-DOCUMENT
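
A common way to act on these events is an EVALUATE on XML-EVENT that remembers the current element and files the content under it. The sketch below assumes XMLPARSE(XMLSS) and uses hypothetical names throughout (AUTHOR-DOC, CURRENT-ELEMENT, FIRST-NAME, LAST-NAME); it also assumes each value arrives in a single CONTENT-CHARACTERS event, which the parser does not guarantee for longer content.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLEVTS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical in-memory document (the author example).
       01  AUTHOR-DOC.
           05  FILLER PIC X(8)  VALUE '<author>'.
           05  FILLER PIC X(30) VALUE '<firstname>Charles</firstname>'.
           05  FILLER PIC X(28) VALUE '<lastname>Dickens</lastname>'.
           05  FILLER PIC X(9)  VALUE '</author>'.
       01  CURRENT-ELEMENT      PIC X(30) VALUE SPACES.
       01  FIRST-NAME           PIC X(20) VALUE SPACES.
       01  LAST-NAME            PIC X(20) VALUE SPACES.
       PROCEDURE DIVISION.
           XML PARSE AUTHOR-DOC
               PROCESSING PROCEDURE 300-HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'XML ERROR, XML-CODE=' XML-CODE
           END-XML
           DISPLAY 'AUTHOR: ' FIRST-NAME LAST-NAME
           GOBACK.

       300-HANDLE-EVENT.
           EVALUATE XML-EVENT
               WHEN 'START-OF-ELEMENT'
      * Remember which element the next content belongs to.
                   MOVE XML-TEXT TO CURRENT-ELEMENT
               WHEN 'CONTENT-CHARACTERS'
                   EVALUATE CURRENT-ELEMENT
                       WHEN 'firstname'
                           MOVE XML-TEXT TO FIRST-NAME
                       WHEN 'lastname'
                           MOVE XML-TEXT TO LAST-NAME
                       WHEN OTHER
                           CONTINUE
                   END-EVALUATE
               WHEN 'END-OF-ELEMENT'
                   MOVE SPACES TO CURRENT-ELEMENT
               WHEN OTHER
                   CONTINUE
           END-EVALUATE.
```

Clearing CURRENT-ELEMENT at END-OF-ELEMENT keeps white space between elements from being filed under the previous tag.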

  37. Processing Flow

  38. Parsing • The parser checks XML documents for most aspects of well-formedness. • Documents can be parsed with or without validation • Validation ensures that the document adheres to the content and structure described in the schema. • Validation can ensure that there are no unexpected elements, that no required elements are missing, and that element and attribute values are legal

  39. Transforming XML Text to COBOL Data Items • For alphanumeric items, decide whether the XML data should be aligned at the left or right end of the field. For right justification, define the field as JUSTIFIED RIGHT • For a decorated numeric value (one carrying punctuation such as commas or a sign), you might be able to move it to a numeric-edited COBOL field and then move that field to a numeric field, which de-edits it • Use intrinsic function NUMVAL to extract and decode simple numeric values • Use intrinsic function NUMVAL-C to extract and decode XML data that represents monetary values
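
The two intrinsic functions can be tried in isolation. In this minimal sketch the text fields stand in for fragments that would normally arrive in XML-TEXT; the field names and sample values are invented, and NUMVAL-C is assumed to run with the default currency sign ($).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNUM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical fragments as they might arrive in XML-TEXT.
       01  AMT-TEXT             PIC X(10) VALUE '12.32'.
       01  PRICE-TEXT           PIC X(12) VALUE '$1,234.50'.
       01  AMT-VALUE            PIC S9(7)V99 COMP-3.
       01  PRICE-VALUE          PIC S9(7)V99 COMP-3.
       01  SHOW-VALUE           PIC Z,ZZZ,ZZ9.99.
       PROCEDURE DIVISION.
      * NUMVAL handles plain numbers with optional sign and blanks.
           COMPUTE AMT-VALUE = FUNCTION NUMVAL(AMT-TEXT)
      * NUMVAL-C also accepts a currency sign and commas.
           COMPUTE PRICE-VALUE = FUNCTION NUMVAL-C(PRICE-TEXT)
           MOVE AMT-VALUE TO SHOW-VALUE
           DISPLAY 'AMOUNT = ' SHOW-VALUE
           MOVE PRICE-VALUE TO SHOW-VALUE
           DISPLAY 'PRICE  = ' SHOW-VALUE
           GOBACK.
```

Inside a processing procedure the same conversion would typically be applied to the register itself, for example COMPUTE AMT-VALUE = FUNCTION NUMVAL(XML-TEXT) on a CONTENT-CHARACTERS event.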

  40. Exercise #1 • Use the file BCST.SICCC01.PDSLIB(XMLDATA2) • The file structure is similar to the one below:
        <?xml version="1.0" encoding="ibm-1140" standalone="yes"?>
        <batch>
          <trans>
            <name>Joe Smith</name>
            <amt>12.32</amt>
            <amt>5.42</amt>
          </trans>
          <trans>
            <name>Tina Louise</name>
            <amt>8.99</amt>
          </trans>
          …
        </batch>

  41. Exercise #1 • Write a COBOL program that reads the XML file and copies it to memory. • Print out a report that lists each customer name and a total for each customer. • Print a grand total for the entire file
        Name           Amount
        Joe Smith       17.74
        Tina Louise      8.99
        Grand Total     26.73

  42. Parsing XML in Segments • Read a segment (record) from the file • Pass the record to the parser using XML PARSE to start the parser • Control flows between the parser and the processing procedure until the end of the record • At record end, the parser returns control to the processing procedure after setting XML-EVENT to END-OF-INPUT and setting XML-CODE to 0.

  43. Parsing XML in Segments • If the processing procedure reads the next record successfully, it sets XML-CODE to 1 to signal more input, and returns to the parser to continue parsing. • This process continues until EOF when the processing procedure returns to the parser after leaving XML-CODE set to 0. • CSU.PUBLIC.XML(XMLSEG) is a good example program to use
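
The segment protocol described on the last two slides might be coded along these lines. This is a sketch rather than the XMLSEG example program itself: it assumes XMLPARSE(XMLSS), a variable-length input file on a hypothetical DD name XMLIN, and made-up data and paragraph names. The record area in the FILE SECTION is parsed directly, so each successful READ replaces the segment the parser sees.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLSEGS.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * XMLIN is a hypothetical DD name for a variable-length file.
           SELECT XML-FILE ASSIGN TO XMLIN
               FILE STATUS IS XML-FILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  XML-FILE
           RECORD IS VARYING FROM 1 TO 255 DEPENDING ON REC-LENGTH
           RECORDING MODE V.
       01  XML-SEGMENT.
           05  FILLER PIC X OCCURS 1 TO 255 DEPENDING ON REC-LENGTH.
       WORKING-STORAGE SECTION.
       01  REC-LENGTH           PIC 9(4) COMP-5.
       01  XML-FILE-STATUS      PIC XX.
       PROCEDURE DIVISION.
           OPEN INPUT XML-FILE
           IF XML-FILE-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS=' XML-FILE-STATUS
               GOBACK
           END-IF
      * The first segment must be in place before parsing starts.
           READ XML-FILE
           IF XML-FILE-STATUS NOT = '00'
               DISPLAY 'FIRST READ FAILED, STATUS=' XML-FILE-STATUS
               CLOSE XML-FILE
               GOBACK
           END-IF
           XML PARSE XML-SEGMENT
               PROCESSING PROCEDURE 100-HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'XML ERROR, XML-CODE=' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'ALL SEGMENTS PARSED'
           END-XML
           CLOSE XML-FILE
           GOBACK.

       100-HANDLE-EVENT.
           EVALUATE XML-EVENT
               WHEN 'END-OF-INPUT'
      * The parser has used up the current record; XML-CODE is 0.
                   READ XML-FILE
                   EVALUATE XML-FILE-STATUS
                       WHEN '00'
      * A new segment is in XML-SEGMENT: ask to continue.
                           MOVE 1 TO XML-CODE
                       WHEN '10'
      * End of file: leave XML-CODE at 0 (no more input).
                           CONTINUE
                       WHEN OTHER
                           DISPLAY 'READ FAILED, STATUS='
                                   XML-FILE-STATUS
                           MOVE -1 TO XML-CODE
                   END-EVALUATE
               WHEN OTHER
                   DISPLAY XML-EVENT ' ' XML-TEXT
           END-EVALUATE.
```

A failed READ sets XML-CODE to -1 so that parsing stops and the ON EXCEPTION phrase reports the problem, rather than leaving the parser waiting on a document that will never be completed.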

  44. Exercise #2 Convert(XMLDATA2) • The file structure is similar to the one below:
        <?xml version="1.0" encoding="ibm-1140" standalone="yes"?>
        <batch>
          <trans>
            <name>Joe Smith</name>
            <amt>12.32</amt>
            <amt>5.42</amt>
          </trans>
          <trans>
            <name>Tina Louise</name>
            <amt>8.99</amt>
          </trans>
          …
        </batch>

  45. Exercise #2 • For this exercise, rework the code you wrote in Exercise #1 • Use the same input file BCST.SICCC01.PDSLIB(XMLDATA2) • Instead of reading all the XML into a single area of memory, read and process the XML file one record at a time

  46. Exception Processing • Document errors cause the parser to set an exception code in XML-CODE and to signal an XML exception event. • After the exception event, control passes to the ON EXCEPTION phrase of the XML PARSE statement, if one is coded; the NOT ON EXCEPTION phrase runs only when parsing ends without an exception

  47. Exception Processing • With XMLPARSE(XMLSS) in effect, XML-CODE is a four-byte field that is the concatenation of two two-byte fields: Return Code (2 bytes, high-order) followed by Reason Code (2 bytes, low-order)

  48. Exception Processing • COBOL definition of the return code and reason code fields:
        1 XML-DECODE.
          2 RTN COMP PIC 9(2).
          2 RSN COMP-5 PIC 9(4).
      • The two values combine to describe the error. Consult the IBM XML documentation for codes: http://publib.boulder.ibm.com/cgi-bin/bookmgr/BOOKS/gxlza120/CCONTENTS

  49. Printing the XML-CODE
        1 XML-DECODE.
          2 RTN COMP PIC 9(2).
          2 RSN COMP-5 PIC 9(4).
        1 HV PIC X(16) VALUE '0123456789ABCDEF'.
      • Split XML-CODE into its two halves first, then display the reason code in hexadecimal:
        DIVIDE XML-CODE BY 65536 GIVING RTN REMAINDER RSN
        DISPLAY ' RC=' RTN ',REASON=X '''
            HV(FUNCTION MOD(RSN / 4096 16) + 1:1)
            HV(FUNCTION MOD(RSN / 256 16) + 1:1)
            HV(FUNCTION MOD(RSN / 16 16) + 1:1)
            HV(FUNCTION MOD(RSN / 1 16) + 1:1) ''''
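
To see the decoding in context, the sketch below parses a deliberately malformed document and prints the two halves of XML-CODE from the ON EXCEPTION phrase. It assumes XMLPARSE(XMLSS), where XML-CODE carries a return code and a reason code; BAD-DOC, the work fields and the paragraph names are hypothetical. Instead of FUNCTION MOD on a division, it extracts the hex digits with integer DIVIDE … REMAINDER, which is a substitute technique chosen here to stay with integer arithmetic.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLRC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Deliberately malformed document: the tags overlap.
       01  BAD-DOC              PIC X(14) VALUE '<a><b></a></b>'.
       01  XML-DECODE.
           05  RTN              COMP   PIC 9(2).
           05  RSN              COMP-5 PIC 9(4).
       01  HV                   PIC X(16) VALUE '0123456789ABCDEF'.
       01  REASON-HEX           PIC X(4).
       01  WORK-VALUE           PIC 9(5) COMP.
       01  WORK-QUOT            PIC 9(5) COMP.
       01  HEX-DIGIT            PIC 9(2) COMP.
       01  HEX-POS              PIC 9(2) COMP.
       PROCEDURE DIVISION.
           XML PARSE BAD-DOC
               PROCESSING PROCEDURE 100-EVENTS
               ON EXCEPTION
                   PERFORM 200-SHOW-XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'NO ERROR DETECTED'
           END-XML
           GOBACK.

       100-EVENTS.
      * Events are ignored here; only the final status matters.
           CONTINUE.

       200-SHOW-XML-CODE.
      * High-order halfword is the return code, low-order
      * halfword is the reason code.
           DIVIDE XML-CODE BY 65536 GIVING RTN REMAINDER RSN
           MOVE RSN TO WORK-VALUE
      * Peel off four hex digits, least significant first.
           PERFORM VARYING HEX-POS FROM 4 BY -1 UNTIL HEX-POS < 1
               DIVIDE WORK-VALUE BY 16
                   GIVING WORK-QUOT REMAINDER HEX-DIGIT
               MOVE HV(HEX-DIGIT + 1:1) TO REASON-HEX(HEX-POS:1)
               MOVE WORK-QUOT TO WORK-VALUE
           END-PERFORM
           DISPLAY 'RC=' RTN ', REASON=X''' REASON-HEX ''''.
```

The displayed return and reason codes can then be looked up in the IBM documentation for the z/OS XML System Services parser.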
