IMS XML Database and XQuery
Christopher Holtz, IBM
August 15th, 2007, Session #1231
OK… so what's the big idea behind XML these days, anyway?
What is XML
• A standardized, simple, and self-describing markup language for documents containing structured or semi-structured information.

    <?xml version="1.1" encoding="UTF-8"?>
    <Presentation xmlns="http://www.SHARE.com">
      <title>IMS XML Database</title>
      <length>60</length>
      <presenter>
        <lastName>Holtz</lastName>
        <firstName>Christopher</firstName>
      </presenter>
      <Comments session="Is IMS A Native XML Database?">
        <comment>Loved It</comment>
        <comment>Can't wait to start using it</comment>
      </Comments>
    </Presentation>
Why XML…
• Standard Internet data exchange format
• Self-describing
  • Handles encoding (internationalization)
  • Handles byte ordering
• Can represent almost anything
• Forces syntax-level interoperability
• Easily parsed
• Confers longevity
• Standard!

    <?xml version="1.1" encoding="ebcdic-cp-us"?>
    <OrderNumber>110203</OrderNumber>
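The "handles encoding" bullet can be seen with the parser that ships in the JDK. This is a minimal sketch, not anything from the IMS libraries: the class name `EncodingDemo` is made up for illustration, and it uses UTF-8, UTF-16, and ISO-8859-1 rather than the slide's `ebcdic-cp-us`, because EBCDIC charsets are not guaranteed to be present in every Java runtime.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EncodingDemo {
    // Serializes the order document in the given charset, names that charset
    // in the XML declaration, and parses the raw bytes back. The parser is
    // handed bytes only; it works out the encoding from the document itself.
    static String roundTrip(Charset charset) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>"
                   + "<OrderNumber>110203</OrderNumber>";
        byte[] bytes = xml.getBytes(charset);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("document could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(StandardCharsets.UTF_8));
        System.out.println(roundTrip(StandardCharsets.UTF_16));
        System.out.println(roundTrip(StandardCharsets.ISO_8859_1));
    }
}
```

The three byte streams differ, yet the caller never tells the parser which charset was used. That is the property that makes the format self-describing across platforms.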
Standards built on XML (to name a few…)
• Core and related standards: DTD, DOM, SAX, SOAP, SQL/XML, VoiceXML, WAP, WSDL, WS-Policy, XForms, XHTML, XInclude, XLink, XML Base, XML Encryption, XML Key Management, XML Processing, XML Schema, XML Signature, XPath, XPointer, XQuery, XSL and XSLT, …
• Financial: FIXML, FpML, IFX, MDDL, OFX Schema, RIXML, SWIFTNet, XBRL
• Other field vocabularies: Accounting, Advertising, Astronomy, Building, Chemistry, Construction, Education, Food, Finance, Government, Healthcare, Insurance, Legal, Manufacturing, News, Physics, Telecommunications, …
• Open source and Java: Xerces, Xalan, JAXB, JAXP, JAXR, JAX-RPC, JDOM, XQJ
The XML Schema Definition Language
An XML language for defining the legal building blocks of a valid XML document.
An XML Schema:
• defines elements and attributes that can appear in a document
• defines which elements are child elements
• defines the order and number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
Defines an agreed-upon communication contract for exchanging XML documents.
XML Schema Example

    <?xml version="1.0" encoding="UTF-8"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns="http://www.SHARE.net"
                targetNamespace="http://www.SHARE.net"
                elementFormDefault="qualified">
      <xsd:element name="Presentation">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="title" type="xsd:string"/>
            <xsd:element name="length" type="xsd:integer"/>
            <xsd:element name="Presenter" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="lastName" type="xsd:string"/>
                  <xsd:element name="firstName" type="xsd:string"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
            <xsd:element name="Comments" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="comment" minOccurs="0" maxOccurs="unbounded" type="xsd:string"/>
                </xsd:sequence>
                <xsd:attribute name="session" type="xsd:string"/>
                …
    </xsd:schema>

Element tree: Presentation contains Presenter and Comments; Comments contains comment.
Well-Formed vs. Valid XML Document
• Well-formed: obeys the XML syntax rules
  • should begin with the XML declaration (optional in XML 1.0)
  • must have exactly one root element
  • every start tag must have a matching end tag
  • XML tags are case sensitive
  • all elements must be closed
  • all elements must be properly nested
  • all attribute values must be quoted
  • XML entities must be used for special characters
• Valid: well-formed and conforms to a specific XML Schema
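The difference between the two levels can be checked with the standard JAXP APIs. The sketch below is illustrative only: the class name `WellFormedVsValid` and the cut-down, namespace-free schema are assumptions made to keep the example short, not the schema from the previous slide.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;

class WellFormedVsValid {
    // Simplified schema: a Presentation holds a string title and an integer length.
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='Presentation'><xsd:complexType><xsd:sequence>"
      + "<xsd:element name='title' type='xsd:string'/>"
      + "<xsd:element name='length' type='xsd:integer'/>"
      + "</xsd:sequence></xsd:complexType></xsd:element></xsd:schema>";

    // Well-formed: a plain, non-validating parse succeeds.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    // Valid: the document also satisfies the schema.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<Presentation><title>IMS</title><length>60</length></Presentation>";
        String wrongType = "<Presentation><title>IMS</title><length>sixty</length></Presentation>";
        String broken = "<Presentation><title>IMS</Presentation>";
        for (String xml : new String[] {good, wrongType, broken}) {
            System.out.println(isWellFormed(xml) + " " + isValid(xml) + " " + xml);
        }
    }
}
```

The second document is the interesting case: it parses cleanly, so it is well-formed, but `sixty` is not an `xsd:integer`, so it is not valid.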
Great… so XML is fantastic as a data exchange format… but why would I want to store my data as XML?
Why XML Databases…
• Physical storage benefits
  • Very much depends on the type of data and how it is going to be used!
• Data coming in as XML and going out as XML
  • Need to convert to / from XML
    • Structure
    • Types / encoding
• Conceptual divide between XML and the physical storage layout
• XML standards (XML Schemas, XQuery, etc.)
Sounds great… but I certainly can't migrate my data off of IMS.
IMS XML-DB Metadata
• "Natural" mapping between hierarchic XML data and hierarchic IMS database definitions.
• The DLIModel utility maps the IMS DB definition (PSB, DBD) to an XML Schema, which is both the XML view of the IMS data and the XML document definition.

XML Schema:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:ims="http://www.ibm.com/ims"
                xmlns="http://www.ibm.com/ims/PSBName/PCBName"
                targetNamespace="http://www.ibm.com/ims/PSBName/PCBName"
                elementFormDefault="qualified">
      <xsd:annotation>
        <xsd:appinfo>
          <ims:DLI mode="store" PSB="AUTPSB11" PCB="AUTOLPCB"
                   dsg="DATASETG" meanLength="1000" numDocs="100"/>
        </xsd:appinfo>
      </xsd:annotation>
      <xsd:element name="A">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="field1" type="xsd:int"/>
            <xsd:element name="field2">
              <xsd:simpleType>
                <xsd:restriction base="xsd:string">
                  <xsd:maxLength value="30"/>
                </xsd:restriction>
                …
IMS XML Database
• Introduces a way to view/map native IMS hierarchical data to XML documents
• Aligns the IMS database definition (DBD) with XML Schema
• Allows the retrieval and storage of IMS records as XML documents with no change to existing IMS databases
[Figure: IMS data on one side, XML documents on the other. The XML Schema for book (attribute @year as xs:date; a sequence of title, a choice of author or editor occurring 0:∞, publisher as xs:string, price as xs:decimal; author is a sequence of last and first, editor a sequence of last, first, and affiliation) is aligned with the IMS DBD for PCB BIB21 (segment BOOK with fields YEAR, TITLE, PUBLISH, PRICE; child segment AUTH with LAST, FIRST; child segment EDIT with LAST, FIRST, AFFIL).]
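XML Visualization
[Figure: on disk, data is held in the IMS hierarchical model (segments A, B, C, D), described by the DBD, PCB, and copybooks. Through the XML Schema, the same data is presented as a document tree (doc with nodes a through l) in the XQuery 1.0 / XPath 2.0 data model. I/O happens against the on-disk form.]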
It's the Metadata, Stupid!
• Physical metadata
  • Segment sizes
  • Segment hierarchy (field relationships: 1-to-1, 1-to-n)
  • DBD-defined fields
  • Application-defined fields
  • Field type, type length, byte ordering, encoding, etc.
  • Offer field/segment renaming (lifts the 8-character restriction)
• Structural metadata
  • XML layout for fields (field relationships must still match)
  • Element vs. attribute (names must match)
  • Type restrictions, enumerations, etc.
Where the metadata lives (callouts on the slide): defined in the DBD; defined in copylibs (IMS Java); defined in the XML Schema.
It's the Metadata, Stupid!
[Figure: one purchase-order record seen through three layers of metadata.]
• Defined in DBD: the segment boundaries within the raw stored bytes.
• Defined in copylibs (IMS Java): the named, typed fields: number (INT), lastName (CHAR), firstName (CHAR), payment (CHAR), type (CHAR), date (DATE).
• Defined in XML Schema: the XML layout of those fields:

    <PurchaseOrder number=" ">
      <lastName> </lastName>
      <firstName> </firstName>
      <date> </date>
      <payment type=" "> </payment>
    </PurchaseOrder>

Sample values: 113246, Holtz, Christopher, 10/21/2003, MC, 5414 2263 4895 1145.
Decomposed XML Retrieval in IMS

XML Schema / metadata:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:ims="http://www.ibm.com/ims"
                xmlns="http://www.ibm.com/ims/PSBName/PCBName"
                targetNamespace="http://www.ibm.com/ims/PSBName/PCBName"
                elementFormDefault="qualified">
      <!-- xsd:annotation elements go here -->
      <xsd:element name="A">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="field1" type="xsd:int"/>
            <xsd:element name="field2">
              <xsd:simpleType> …
      <xsd:element name="B"> <xsd:complexType> …
      <xsd:element name="C"> <xsd:complexType> …
      <xsd:element name="D"> <xsd:complexType>

Composed XML:

    <A>
      <f1> </f1> <f2> </f2> <f3> </f3>
      <B> <f4> </f4> <f5> </f5> </B>
      <B> <f4> </f4> <f5> </f5> </B>
      <C>
        <f6> </f6> <f7> </f7>
        <D> <f8> </f8> <f9> </f9> </D>
      </C>
    </A>

DL/I calls behind the composed document: GU retrieves segment A, then successive GNP calls retrieve B, B, C, and D; the final GNP returns status GE/GB, which ends the document.
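The composition step can be sketched as a depth-first walk over a segment tree, which visits segments in the same parent-before-child order as the GU/GNP sequence on this slide. This is a hedged illustration, not the IMS materializer: `Segment` and `Composer` are hypothetical names, the segments are built in memory rather than fetched with DL/I calls, and only minimal XML escaping is done.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a retrieved IMS segment occurrence.
class Segment {
    final String name;
    final Map<String, String> fields = new LinkedHashMap<>();
    final List<Segment> children = new ArrayList<>();

    Segment(String name) { this.name = name; }
    Segment field(String fieldName, String value) { fields.put(fieldName, value); return this; }
    Segment child(Segment c) { children.add(c); return this; }
}

class Composer {
    // Emits the segment as an element, its fields as child elements,
    // then recurses into dependent segments (pre-order traversal).
    static String compose(Segment root) {
        StringBuilder sb = new StringBuilder();
        append(root, sb);
        return sb.toString();
    }

    private static void append(Segment s, StringBuilder sb) {
        sb.append('<').append(s.name).append('>');
        for (Map.Entry<String, String> f : s.fields.entrySet()) {
            sb.append('<').append(f.getKey()).append('>')
              .append(escape(f.getValue()))
              .append("</").append(f.getKey()).append('>');
        }
        for (Segment c : s.children) {
            append(c, sb);
        }
        sb.append("</").append(s.name).append('>');
    }

    // Minimal escaping for element content.
    static String escape(String v) {
        return v.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Segment a = new Segment("A").field("f1", "1")
            .child(new Segment("B").field("f4", "x"))
            .child(new Segment("B").field("f4", "y"))
            .child(new Segment("C").field("f6", "z")
                .child(new Segment("D").field("f8", "w")));
        System.out.println(compose(a));
    }
}
```

In the real product the element names and nesting come from the XML Schema rather than from the segment objects themselves; the sketch folds both into one structure for brevity.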
Simple XML Structuring
• Uses the default structure
• All data is represented as elements
[Figure: segment SegName with fields A to E, and dependent segment Seg2Name with fields F to H.]

    <?xml version="1.0"?>
    <SegName>
      <A> </A> <B> </B> <C> </C> <D> </D> <E> </E>
      <Seg2Name>
        <F> </F> <G> </G> <H> </H>
      </Seg2Name>
      <Seg2Name>
        <F> </F> <G> </G> …
    </SegName>
Complex XML Structuring (V10)
• Uses attribute mapping
• Uses hard-coded values
• Uses hard-coded levels
[Figure: the same fields A to H as in the simple structuring example.]

    <?xml version="1.0"?>
    <document A=" " B=" ">
      <G F=" " H=" "> </G>
      <G F=" " H=" "> </G>
      …
      <sub C=" ">
        <another attr="38">
          <E D=" "> </E>
        </another>
      </sub>
    </document>
Using IMS as an XML Database…
• Metadata generation: DLIModel utility
  • Physical metadata (PSB, DBD, copybooks)
  • Structural metadata (XML Schema)
• Application programming: IMS JDBC
  • retrieveXML / storeXML
  • XQuery support
• Example
DL/I Model Utility
[Figure: the DL/I Model Utility reads PSBs, DBDs, and COBOL copybook members (from PDS, or HFS or PDS, as labelled on the slide) under the direction of control statements, and writes XMI 1.2 (HFS) among its outputs.]
• Control statements:
  1) Choose PSBs/DBDs
  2) Choose copybook members
  3) Aliases, data types, new fields

Sample output:

    DLIModel IMS Java Report
    ========================
    Class: AUTPSB11DatabaseView in package: samples.dealership generated for PSB: AUTPSB11
    ==================================================
    PCB: Dealer
    ==================================================
    Segment: DealerSegment
      Field: DealerNo    Type=CHAR Start=1  Length=4  ++ Primary Key Field ++
      Field: DealerName  Type=CHAR Start=5  Length=30 (Search Field)
      Field: DealerCity  Type=CHAR Start=35 Length=10 (Search Field)
      Field: DealerZip   Type=CHAR Start=45 Length=10 (Search Field)
      Field: DealerPhone Type=CHAR Start=55 Length=7  (Search Field)
    ==================================================
    Segment: ModelSegment
      Field: ModelKey    Type=CHAR Start=3  Length=24 ++ Primary Key Field ++
      Field: ModelType   Type=CHAR Start=1  Length=2  (Search Field)
      Field: Make        Type=CHAR Start=3  Length=10 (Search Field)
      Field: Model       Type=CHAR Start=13 Length=10 (Search Field)
      Field: Year        Type=CHAR Start=23 Length=4  (Search Field)
      Field: MSRP        Type=CHAR Start=27 Length=5  (Search Field)
      Field: Count       Type=CHAR Start=32 Length=2  (Search Field)
    ==================================================
    Segment: OrderSegment
      Field: OrderNo     Type=CHAR Start=1  Length=6  ++ Primary Key Field ++
      Field: LastName    Type=CHAR Start=7  Length=25 (Search Field)
      Field: FirstName   Type=CHAR Start=32 Length=25 (Search Field)
      Field: Date        Type=CHAR Start=57 Length=10 (Search Field)
      Field: Time        Type=CHAR Start=67 Length=8  (Search Field)
    ==================================================
    Segment: SalesSegment
      Field: SaleNo      Type=CHAR Start=49 Length=4  ++ Primary Key Field ++
    ...
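The Start and Length columns in the report are what turn an opaque segment into named fields. The sketch below slices a raw DealerSegment using exactly those offsets. It is not the generated IMS Java metadata class: `SegmentDecoder` is a hypothetical name, the sample dealer values are invented, and ISO-8859-1 stands in for the EBCDIC code page a real IMS segment would normally use.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class SegmentDecoder {
    // {name, start (1-based), length}, copied from the DealerSegment
    // section of the DLIModel report.
    static final Object[][] DEALER_FIELDS = {
        {"DealerNo", 1, 4}, {"DealerName", 5, 30}, {"DealerCity", 35, 10},
        {"DealerZip", 45, 10}, {"DealerPhone", 55, 7}
    };

    // Cuts each field out of the segment by offset and length, then trims padding.
    static Map<String, String> decode(byte[] segment) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Object[] f : DEALER_FIELDS) {
            int start = (Integer) f[1];
            int length = (Integer) f[2];
            String raw = new String(segment, start - 1, length, StandardCharsets.ISO_8859_1);
            out.put((String) f[0], raw.trim());
        }
        return out;
    }

    static String pad(String s, int width) {
        return String.format("%-" + width + "s", s);
    }

    // Invented sample data, padded to the 61-byte segment length.
    static byte[] sampleSegment() {
        String record = pad("0042", 4) + pad("Holtz Motors", 30) + pad("San Jose", 10)
                      + pad("95141", 10) + pad("5551234", 7);
        return record.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(decode(sampleSegment()));
    }
}
```

Without the layout table the 61 bytes are just bytes, which is the point of the "It's the Metadata" slides.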
DL/I Model Utility outputs: IMS Java Report (HFS), IMS Java metadata classes (HFS), XML Schema(s) (HFS).

Generated IMS Java metadata class:

    package samples.dealership;

    import com.ibm.ims.db.*;
    import com.ibm.ims.base.*;

    public class AUTPSB11DatabaseView extends DLIDatabaseView {

        // The following DLITypeInfo[] array describes Segment: DEALER in PCB: AUTOLPCB
        static DLITypeInfo[] AUTOLPCBDEALERArray = {
            new DLITypeInfo("DealerNo", DLITypeInfo.CHAR, 1, 4, "DLRNO"),
            new DLITypeInfo("DealerName", DLITypeInfo.CHAR, 5, 30, "DLRNAME"),
            new DLITypeInfo("DealerCity", DLITypeInfo.CHAR, 35, 10, "CITY"),
            new DLITypeInfo("DealerZip", DLITypeInfo.CHAR, 45, 10, "ZIP"),
            new DLITypeInfo("DealerPhone", DLITypeInfo.CHAR, 55, 7, "PHONE")
        };

        static DLISegment AUTOLPCBDEALERSegment =
            new DLISegment("DealerSegment", "DEALER", AUTOLPCBDEALERArray, 61);
        ...

        // An array of DLISegmentInfo objects follows to describe the view for PCB: AUTOLPCB
        static DLISegmentInfo[] AUTOLPCBarray = {
            new DLISegmentInfo(AUTOLPCBDEALERSegment, DLIDatabaseView.ROOT),
            new DLISegmentInfo(AUTOLPCBMODELSegment, 0),
            new DLISegmentInfo(AUTOLPCBORDERSegment, 1),
            new DLISegmentInfo(AUTOLPCBSALESSegment, 1),
            new DLISegmentInfo(AUTOLPCBSTOCKSegment, 1),
            new DLISegmentInfo(AUTOLPCBSTOCSALESegment, 4),
            new DLISegmentInfo(AUTOLPCBSALESINFSegment, 5)
        };
        ...
    }
GUI DL/I Model Utility
• Install the Eclipse plug-in
• Create a new DL/I Model Utility project
GUI DL/I Model Utility
• Select DBDs / PSBs
  • they must first be transferred to the local machine via FTP
• Source is parsed
  • any errors are reported
• XMI metamodel
  • generated
  • opened for editing
IMS Java Class Library
[Figure: layered view of an IMS Java application running in an IMS dependent region (transaction and message processing).]
• IMS Java App: customer code (business logic) on top of the DLI Database View (IMS DB metadata)
• IMS Java Class Library components: App, JDBC/SQL, XML-DB (XML shredder and XML materializer code), DB, Base, JNI
• Below the library: the CEETDLI interface
• Callouts on the figure: interface to IMS; JDBC, JCA interface; mapping to DL/I APIs; Java to C interface; assembler layer
retrieveXML() UDF

    SELECT retrieveXML(A) FROM C WHERE C.fieldA = '35'

[Figure: two occurrences of segment C have fieldA = 35; each one sets the XML context for retrieving its A document.]
* Two rows of XML CLOBs in the ResultSet
storeXML() UDF

    INSERT INTO A (storeXML()) VALUES (?)

[Figure: the insert position sets the XML context for the stored document.]
* The INSERT statement must be a prepared statement
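Driving the two UDFs from plain JDBC might look like the sketch below. Only the SQL text and the "CLOBs in the ResultSet" and "prepared statement" constraints come from the slides. The class name `ImsXmlJdbc` is hypothetical, obtaining the `Connection` from the IMS JDBC driver is left to the caller, and binding the document with `setCharacterStream` is an assumption; the IMS documentation should be checked for the parameter type the driver actually expects.

```java
import java.io.StringReader;
import java.sql.Clob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class ImsXmlJdbc {
    // SQL text as shown on the retrieveXML() and storeXML() slides.
    static final String RETRIEVE_SQL = "SELECT retrieveXML(A) FROM C WHERE C.fieldA = '35'";
    static final String STORE_SQL = "INSERT INTO A (storeXML()) VALUES (?)";

    // Each matching row carries one composed XML document as a CLOB.
    static List<String> retrieve(Connection conn) throws SQLException {
        List<String> documents = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(RETRIEVE_SQL)) {
            while (rs.next()) {
                Clob clob = rs.getClob(1);
                documents.add(clob.getSubString(1, (int) clob.length()));
            }
        }
        return documents;
    }

    // storeXML() requires a prepared statement; the document is the one parameter.
    // Assumption: the driver accepts a character stream for that parameter.
    static int store(Connection conn, String xml) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(STORE_SQL)) {
            ps.setCharacterStream(1, new StringReader(xml), xml.length());
            return ps.executeUpdate();
        }
    }
}
```

Nothing here is XML-specific on the Java side: the shredding and materializing happen inside the library, and the application only sees character data.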
IMS JDBC Runtime
[Figure: the same IMS Java stack (IMS Java App, DLI Database View, App, DB, JDBC/SQL, Base, JNI, CEETDLI interface) runs inside a Java Virtual Machine in each supported environment:]
• DB2: stored procedures
• WebSphere: EJBs
• CICS: JCICS
• IMS/TM dependent regions: JMP and JBP (shown alongside IFP, MPP, and BMP regions)
• Access to IMS DB is through ODBA and DRA
• DBDs, PSBs, and COPYLIB members (the inputs to DBDGEN, PSBGEN, and ACBGEN) also feed the DL/I Model utility
XQuery support in IMS V10
(Information as a Service: content and data)
• Further aligns IMS with industry direction
  • XML, SOA, Web Services, etc.
• More natural fit for hierarchical data querying
• Enables customers to leverage an emerging standard skill set
• Enhanced product and tooling integration
• Our IMS XML solution has a 38+ year head start
  • Immediately usable with no migration of existing IMS data
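Road to Interoperability through XQuery
[Figure: timeline of query interfaces converging on XQuery.]
• IMS v9 (2004): hierarchical data accessed with DL/I and SQL, with an XML view
• DB2 UDB v9 (~2006): relational engines accessed with SQL, with relational data and an XML view
• IMS v10 (~2007): the IMS engine queried with XQuery over hierarchical (XML) data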
XQuery FLWOR Expressions
• FOR: iterates through a sequence, binding a variable to each item
• LET: binds a variable to a sequence
• WHERE: eliminates items of the iteration
• ORDER BY: reorders items of the iteration
• RETURN: constructs query results

Query:

    <bib>
    {
      for $b in /bib/book
      let $title := $b/title
      where $b/publisher = "Addison-Wesley"
      order by $b/@year
      return
        <book year="{ $b/@year }">
          { $title }
        </book>
    }
    </bib>

Result:

    <bib>
      <book year="1992">
        <title>Advanced Programming in the Unix …</title>
      </book>
      <book year="1994">
        <title>TCP/IP Illustrated</title>
      </book>
    </bib>
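For readers who think in Java, each FLWOR clause has a rough counterpart in a stream pipeline. The sketch below is an analogy only, not XQuery and not the IMS implementation: `Book` and `FlworAnalogy` are hypothetical names, the third sample book is invented to give the filter something to reject, and the first title is left truncated as it appears on the slide.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class Book {
    final int year;
    final String title;
    final String publisher;

    Book(int year, String title, String publisher) {
        this.year = year;
        this.title = title;
        this.publisher = publisher;
    }
}

class FlworAnalogy {
    static String query(List<Book> bib) {
        return bib.stream()                                        // for $b in /bib/book
            .filter(b -> b.publisher.equals("Addison-Wesley"))     // where
            .sorted(Comparator.comparingInt((Book b) -> b.year))   // order by $b/@year
            .map(b -> {
                String title = "<title>" + b.title + "</title>";   // let $title := $b/title
                return "<book year=\"" + b.year + "\">" + title + "</book>"; // return
            })
            .collect(Collectors.joining("", "<bib>", "</bib>"));
    }

    static List<Book> sample() {
        return Arrays.asList(
            new Book(1994, "TCP/IP Illustrated", "Addison-Wesley"),
            new Book(2000, "Some Other Book", "Other Press"),       // invented, filtered out
            new Book(1992, "Advanced Programming in the Unix", "Addison-Wesley"));
    }

    public static void main(String[] args) {
        System.out.println(query(sample()));
    }
}
```

The difference that matters for IMS is that XQuery expresses this over the document hierarchy directly, so no intermediate `Book` objects or result-building code are needed.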
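Extended JDBC interface…
The XQuery is passed as an additional argument to retrieveXML() and runs against each document selected by the SQL WHERE clause:

    SELECT retrieveXML(A, '<XQuery below>') FROM C WHERE C.fieldA = '35'

    <bib>
    {
      for $b in book
      where $b/publisher = 'Addison-Wesley'
        and $b/@year > 1991
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>

[Figure: two occurrences of segment C have fieldA = 35; each one sets the XML context for the query.]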
Extended JDBC interface…
With only the XQuery as an argument, the PCB is the query context:

    SELECT retrieveXML('<XQuery below>') FROM PCB

    <bib>
    {
      for $b in /bib/book
      where $b/publisher = 'Addison-Wesley'
        and $b/@year > 1991
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>
Optimizing XQuery for IMS
Unoptimized evaluation:

    <bib>
    {
      for $b in /bib/book
      where $b/publisher = 'Addison-Wesley'
        and $b/@year > 1991
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>

• for / where: step through every book and evaluate whether it meets the criteria
• return: for each match, return this
Optimizing XQuery for IMS
Optimized using native IMS XQuery functions:

    <bib>
    {
      for $b in ims:gn(
          ims:particle('/bib/book'),
          ims:and(
            ims:eq(ims:particle('/bib/book/publisher'), 'Addison-Wesley'),
            ims:gt(ims:particle('/bib/book/@year'), 1991)
          )
        )
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>

• ims:gn(…): move directly to matching book particles, using these SSAs
• return: for each match, return this
We are working on algorithms to do this for you… post-QPP.
Continually Reducing Development Costs
• Example application development (very simplified): "Publishing tracking information for specified package"

Tracking database: segment PACKAGE (fields NUM, SHIPFROM, SHIPTO, …) with dependent segment SCAN (fields DATETIME, LOCATION), occurring 0:∞ times.

Desired output:

    <packageInfo>
      <scan date="10/09/2006">San Jose</scan>
      <scan date="10/10/2006">Los Angeles</scan>
      <scan date="10/10/2006">San Diego</scan>
    </packageInfo>
Continually Reducing Development Costs (COBOL Application Programming)
• Build the SSA template and establish parentage over the PACKAGE record

    01 PACKAGE-SSA.
       05 FILLER      PIC X(9)  VALUE 'PACKAGE ('.
       05 FILLER      PIC X(10) VALUE 'NUM     EQ'.
       05 PACKAGE-NUM PIC X(12).
       05 FILLER      PIC X     VALUE ')'.

    CALL 'AERTDLI' USING DLI-GU AIB-MASK PACKAGE-SEG PACKAGE-SSA.
    IF IPCB-STATUS-CODE = SPACES

• Iterate through the child SCAN segments with GNP calls

    01 SCAN-SSA.
       05 FILLER      PIC X(9)  VALUE 'SCAN     '.

    CALL 'AERTDLI' USING DLI-GNP AIB-MASK SCAN-SEG SCAN-SSA.

• Keep count and move the selected fields into the XML template

    IF IPCB-STATUS-CODE = SPACES
       ADD 1 TO numScans
       MOVE SCAN-DATE TO scan (numScans) date
       MOVE SCAN-LOC  TO scan (numScans) location
Continually Reducing Development Costs (COBOL Application Programming)
• Create a COBOL XML template* for XML formatting

    01 numScans PIC 99.
    01 XMLDOC.
       05 FILLER PIC X(13) VALUE '<packageInfo>'.
       05 scans OCCURS DEPENDING ON numScans.
          10 FILLER   PIC X(12) VALUE '<scan date="'.
          10 date     PIC X(20).
          10 FILLER   PIC X(2)  VALUE '">'.
          10 location PIC X(35).
          10 FILLER   PIC X(7)  VALUE '</scan>'.
       05 FILLER PIC X(14) VALUE '</packageInfo>'.

* The XML GENERATE statement could be used instead.
Continually Reducing Development Costs (Java SQL)

    // Start XML tag:
    StringBuffer result = new StringBuffer("<packageInfo>");

    // Build SQL call:
    //   SELECT scan.date, scan.location
    //   FROM PCB1.Scan
    //   WHERE package.num = '000000000000'

    // Iterate ResultSet:
    while (resultSet.next()) {
        result.append("<scan date=\"");
        result.append(resultSet.getString("date"));
        result.append("\">");
        result.append(resultSet.getString("location"));
        result.append("</scan>");
    }

    // End XML tag:
    result.append("</packageInfo>");
Continually Reducing Development Costs (Java XQuery)
• Issue a single XQuery call
• Queries and formats as XML

    <packageInfo>
    {
      let $package := /package[num = '000000000000']
      for $scan in $package/scan
      return
        <scan date="{ $scan/date/text() }">
          { $scan/location/text() }
        </scan>
    }
    </packageInfo>
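The JDK does not include an XQuery engine, so the query above cannot be run with the standard library alone. As a rough stand-in, the sketch below evaluates the same path expressions with the JDK's XPath 1.0 API against an in-memory document and assembles the result by hand. Everything in it is illustrative: `PackageInfoXPath` is a hypothetical class, and the sample `<package>` document is an assumed XML view shaped to match the paths used in the query.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PackageInfoXPath {
    // Assumed XML view of one PACKAGE record and its SCAN children.
    static final String SAMPLE =
        "<package><num>000000000000</num>"
      + "<scan><date>10/09/2006</date><location>San Jose</location></scan>"
      + "<scan><date>10/10/2006</date><location>Los Angeles</location></scan>"
      + "<scan><date>10/10/2006</date><location>San Diego</location></scan>"
      + "</package>";

    static String packageInfo(String xml, String num) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // let $package := /package[num = ...]; for $scan in $package/scan
            // (num is concatenated for brevity; real code should not build
            // path expressions from untrusted input)
            NodeList scans = (NodeList) xpath.evaluate(
                "/package[num='" + num + "']/scan",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            StringBuilder out = new StringBuilder("<packageInfo>");
            for (int i = 0; i < scans.getLength(); i++) {
                Element scan = (Element) scans.item(i);
                // return <scan date="{...}">{...}</scan>
                out.append("<scan date=\"").append(xpath.evaluate("date", scan)).append("\">")
                   .append(xpath.evaluate("location", scan))
                   .append("</scan>");
            }
            return out.append("</packageInfo>").toString();
        } catch (Exception e) {
            throw new IllegalStateException("query failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(packageInfo(SAMPLE, "000000000000"));
    }
}
```

Even this stand-in still needs a loop and manual string assembly. The XQuery on the slide does the selection and the formatting in one expression, which is the cost reduction the slide is arguing for.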
Available Now
• alphaWorks is an early-adopter website for emerging IBM technologies in the early stages of R&D.
• Virtual XML Garden
  • Java-based XQuery
  • IMS is a featured implementation
  • Includes DBD, PSB, load job, sample apps, and step-by-step instructions
http://www.alphaworks.ibm.com
Why XML Databases…
• Data coming in as XML and going out as XML
  • Need to convert to / from XML
    • Structure
    • Types / encoding
• Conceptual divide between XML and the physical storage layout
• XML standards (XML Schemas, XQuery, etc.)
• Very much depends on the type of data and how it is going to be used!
XML can be a better choice than relational for...
• Data that's inherently hierarchical or nested in nature
  • Example: medical data, bill-of-materials, etc.
• Data sets with sparsely populated attributes
  • Example: FIXML, FpML, customer profiles
• Schema evolution
  • Example: frequently changing services/products/processes
• Variable schemas, many schemas
  • Example: data integration, consolidation of diverse data sources
• Combining structured and unstructured data
  • Example: CM, life sciences, news and media
Callouts on the slide divide the list in two: "These are already IMS's strengths" and "These are not supported (or not supported well) in IMS".
IMS XML Database
• Introduces a way to view/map native IMS hierarchical data to XML documents
• Aligns the IMS database definition (DBD) with XML Schema
• Allows the retrieval and storage of IMS records as XML documents with no change to existing IMS databases
• Enables query of IMS data using XQuery
[Figure: IMS data on one side, XML documents on the other; the XML Schema for book aligned with the IMS DBD for PCB BIB21 (segment BOOK with fields YEAR, TITLE, PUBLISH, PRICE and child segments AUTH and EDIT).]
So… is IMS a "native" XML database?
A native XML database*...
• Defines a (logical) model for an XML document, as opposed to the data in that document, and stores and retrieves documents according to that model.
• Has an XML document as its fundamental unit of (logical) storage.
• Is not required to have any particular underlying physical storage model.
* According to the XML:DB mailing list
Backup Slides …if needed
Decomposed Storage
• The XML document must be parsed and validated.
• Data is converted to traditional IMS types
  • COMP-1, COMP-2, etc.
  • EBCDIC CHAR, picture strings
• Stored data is searchable by IMS and transparently accessible by non-XML-enabled applications.