
XML Workshop



Presentation Transcript


  1. XML Workshop: XML, the standard format for exchanging electronic data in the pharmaceutical industry? Joerg Dillert, Senior Consultant, March 30th, 2004

  2. 0. General

  3. The workshop ... • is held in "Germisch" (a mix of German and English)!

  4. A few rules • 9.00 – 16.30 • Breaks: 15, 60, 15 • Mobile phones off or on vibrate! • Toilets • Emergency exits • Questions: welcome at any time

  5. 1. Introduction: What actually is XML? How did it come about?

  6. Mobile phones, smartphones and PDAs with built-in SyncML support (table columns: model, vendor, device type, availability) • Alcatel: ot715 • Motorola: A830, A835, V600, E390 • Nokia: 7250, 6800, 3650, 6220, 9210i, 7650 • Samsung: SGH-D700 • Siemens: S55, SL55, M55, SX1 • Sony Ericsson: T68i, T610, P800, Z1010, PEG-NZ90 • PDAs: Sony PEG-NX70V, PEG-T675C, PEG-T625C

  7. We live in the age of buzzwords • B2B, B2M, E2B • DIA, EMEA, FDA • XML, DTD, XSL, SVG • The (computer) industry gives us many new words every week • Take a look at your own workplace: what are your buzzwords? (SOPs, DCFs, ...)

  8. Smudo ... do you know this German band? MFG

  9. On record ... SGML: Standard Generalized Markup Language, ISO 8879, since 1986

  10. The SGML family of markup languages: more buzzwords!! • GML (Generalized Markup Language): Goldfarb, Mosher and Lorie, IBM, 1969; IBM Document Composition Facility, DCF (Script) • SGML (Standard Generalized Markup Language): content attributes; ISO 8879, first published in 1986 • HTML (HyperText Markup Language): functional attributes (hyperlink, frame); based on hyperdocument standard definitions • CALS (Continuous Acquisition and Life-cycle Support): based on DoD MIL-M-28001B standard definitions • XML (eXtensible Markup Language) • (Founding father: Dr. Charles F. Goldfarb, IBM)

  11. 1986 • SGML developed at the IBM labs in Almaden • Charles Goldfarb • ISO standard • Revised in 1990; the goal was a universally applicable markup language for documents

  12. A brief history of SGML: the evolution of markup languages • Plain text • Font attributes: bold, underline, italics, font size • Document structure attributes: heading level, index term • Document content attributes: patient age, dosage unit

  13. 1990 • Development of HTML began at the CERN nuclear research centre in Geneva • First draft in 1993, the birth of the Web • Revised in 1995 as HTML version 2.0

  14. 1994 • To prevent uncontrolled growth: founding of the World Wide Web Consortium (W3C) • Primary task: further development of HTML

  15. 1998 • The W3C recognised that HTML could not meet the challenges of the future • A golden mean was to be found between too much markup (SGML) and too little (HTML) • The result of this search: XML • Adopted as an official standard in 1998

  16. 2001 • The W3C adopts the first version of XSL (Extensible Stylesheet Language) as the most important extension • XSL provides rules for transforming XML documents and a vocabulary for formatting them • 2002: working draft of XHTML version 2.0, a break with HTML 4.0 and XHTML 1.0: no backward compatibility

  17. The big advantage of XML • You have flexibility: you can define your own tags • The parser needs only the DTD / Schema to check the correctness of your file • Readable by everyone • Vendor independent (no vendor can impose its own definitions, standards or undocumented formats) • License free ... all three types are plain ASCII text! (American Standard Code for Information Interchange)

  18. XML and data interchange • This kind of data interchange is standard in other industries and is called B2B • The Germans' favourite leisure object ... • is produced just in time

  19. What we've learned from other industries ... • Diagram: a car producer (System X) and a supplier (System Y) are each connected through a B2B server • Messages exchanged: request, order, delivery proposal, order confirmation, delivery confirmation, invoicing

  20. XML is ... • The data interchange and document format for now and for the future (E2B, CDISC, CTD) • In practical use in many industries, e.g. car manufacturing • EVERY system can communicate with every other • You need only ONE interface per system

  21. 2. XML eXtensible Markup Language

  22. XML • XML: eXtensible Markup Language • It is a subset of SGML • It focuses on content (sometimes also structure) • The XML file contains the DATA • The data is delimited by tags • Example: <messagetype>ICSR</messagetype>
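
A minimal Python sketch of the example above, using only the standard library. It reads the one-element document and separates the tag name from the data it encloses; the variable names are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

# One element: start tag, character data, end tag.
element = ET.fromstring("<messagetype>ICSR</messagetype>")

tag_name = element.tag   # the name given in the tags
content = element.text   # the data enclosed by the tags

print(tag_name, content)
```

The tags carry the meaning ("this is a message type"), while the text between them is the actual data.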

  23. An example as a graphic

  24. ... and as XML structure
      <ANA106>
        <Screening_visit>
          <Inclusion_criteria>
            <Inc1>YES</Inc1>
            <Inc2>YES</Inc2>
            <Inc3>YES</Inc3>
            <Inc4>YES</Inc4>
          </Inclusion_criteria>
          <Exclusion_criteria>
            <Excl1>NO</Excl1>
            <Excl2>NO</Excl2>
            <Excl3>NO</Excl3>
            <Excl4>NO</Excl4>
          </Exclusion_criteria>
          <Demographics_Investigator>
            <Sex>Male</Sex>
            <DoB>07/26/1966</DoB>
            <Smoke>Yes</Smoke>
            <InvNo>128</InvNo>
          </Demographics_Investigator>
          <!-- ... more pages (sections) -->
        </Screening_visit>
        <Visit_1>
          <!-- ... more blocks -->
        </Visit_1>
        <!-- ... more visits -->
      </ANA106>
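
To show how such a nested structure can be read by a program, here is a small Python sketch. It assumes a shortened, well-formed variant of the case report form: XML element names may not contain spaces or slashes, so the underscore names and the `Demographics` element are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened variant of the CRF example; underscore names are assumptions.
crf_xml = """
<ANA106>
  <Screening_visit>
    <Inclusion_criteria>
      <Inc1>YES</Inc1>
      <Inc2>YES</Inc2>
    </Inclusion_criteria>
    <Demographics>
      <Sex>Male</Sex>
      <InvNo>128</InvNo>
    </Demographics>
  </Screening_visit>
</ANA106>
"""

root = ET.fromstring(crf_xml)

# Walk down the hierarchy with a simple path expression.
criteria = root.find("Screening_visit/Inclusion_criteria")
inclusion = {child.tag: child.text for child in criteria}

investigator = root.findtext("Screening_visit/Demographics/InvNo")

print(inclusion, investigator)
```

The path mirrors the nesting of the form: study, visit, section, item.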

  25. Simple XML structure
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <DVMDTagung>
        <Workshop>
          <event>
            <stadt>Ulm</stadt>
            <ort>MedSchule</ort>
          </event>
        </Workshop>
      </DVMDTagung>

  26. Attributes
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <DVMDTagung>
        <Workshop name="XML in der Pharmazeutischen Industrie" Leiter="Joerg Dillert">
          <event datum="31.03.2004">
            <stadt>Ulm</stadt>
            <ort>MedSchule</ort>
          </event>
          <event datum="25.06.2004">
            <stadt>Berlin</stadt>
            <ort>PFOffice</ort>
          </event>
        </Workshop>
      </DVMDTagung>

  27. Attributes (with a comment)
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <!-- for comments -->
      <DVMDTagung>
        <Workshop name="XML in der Pharmazeutischen Industrie" Leiter="Joerg Dillert">
          <event datum="31.03.2004">
            <stadt>Ulm</stadt>
            <ort>MedSchule</ort>
          </event>
          <event datum="25.06.2004">
            <stadt>Berlin</stadt>
            <ort>PFOffice</ort>
          </event>
        </Workshop>
      </DVMDTagung>
      see in IE • see in XML Notepad
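
A short Python sketch of how a program might read the attributes from the workshop example. The XML declaration is left out here so that the text can be passed in as an ordinary Python string; everything else follows the slide.

```python
import xml.etree.ElementTree as ET

workshop_xml = """
<DVMDTagung>
  <Workshop name="XML in der Pharmazeutischen Industrie" Leiter="Joerg Dillert">
    <event datum="31.03.2004"><stadt>Ulm</stadt><ort>MedSchule</ort></event>
    <event datum="25.06.2004"><stadt>Berlin</stadt><ort>PFOffice</ort></event>
  </Workshop>
</DVMDTagung>
"""

root = ET.fromstring(workshop_xml)
workshop = root.find("Workshop")

# Attributes live in the start tag and are read with get().
leader = workshop.get("Leiter")

# Child elements carry the content; attributes describe the element.
events = [(e.get("datum"), e.findtext("stadt")) for e in workshop.findall("event")]

print(leader, events)
```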

  28. Characters • Character set • Characters that may be represented in an XML document • e.g., ASCII character set • Letters of the English alphabet • Digits (0-9) • Punctuation characters, such as !, - and ?

  29. Character Set • XML documents may contain • Carriage returns • Line feeds • Unicode characters • Enables computers to process characters for several languages

  30. Characters vs. Markup • XML must differentiate between • Markup text • Enclosed in angle brackets (< and >) • e.g., child elements • Character data • Text between start tag and end tag • e.g., Fig. 5.1, line 7: Welcome to XML!

  31. White Space, Entity References and Built-in Entities • Whitespace characters • Spaces, tabs, line feeds and carriage returns • Significant (preserved by application) • Insignificant (not preserved by application) • Normalization • Whitespace collapsed into a single whitespace character • Sometimes whitespace removed entirely • Example: <markup>This   is    character     data</markup> becomes, after normalization, <markup>This is character data</markup>
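
Normalization as described above can be sketched in a few lines of Python. The parser hands the character data to the application with its whitespace intact; whether to collapse it is then the application's decision. The helper name `normalize_space` is an assumption for this illustration.

```python
import xml.etree.ElementTree as ET

# The parser preserves whitespace inside element content.
raw = ET.fromstring("<markup>This   is\n\tcharacter    data</markup>").text


def normalize_space(text):
    """Collapse each run of whitespace into one space and trim both ends."""
    return " ".join(text.split())


normalized = normalize_space(raw)
print(repr(raw))
print(repr(normalized))
```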

  32. White Space, Entity References and Built-in Entities (cont.) • XML-reserved characters • Ampersand (&) • Left-angle bracket (<) • Right-angle bracket (>) • Apostrophe (') • Double quote (") • Entity references • Allow XML-reserved characters to be used as character data • Begin with ampersand (&) and end with semicolon (;) • Prevent character data from being misinterpreted as markup

  33. White Space, Entity References and Built-in Entities (cont.) • Built-in entities • Ampersand (&amp;) • Left-angle bracket (&lt;) • Right-angle bracket (&gt;) • Apostrophe (&apos;) • Quotation mark (&quot;) • To mark up the characters "<>&" in element message: <message>&lt;&gt;&amp;</message> • see in IE

  34. Using Unicode in an XML Document • XML Unicode support • e.g., displays Arabic words • Arabic characters • represented by numeric character references for Unicode characters (e.g. &#1583;)
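
The built-in entities can be tried out with Python's standard library. This sketch uses `escape` from `xml.sax.saxutils` to produce the entity references and then lets the parser turn them back into the original characters; the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

text = "<>&"

# escape() replaces &, < and > with their built-in entity references.
escaped = escape(text)

# Quotes are only replaced when asked for explicitly.
quoted = escape("it's \"XML\"", {"'": "&apos;", '"': "&quot;"})

# The parser resolves the references back into plain characters.
message = ET.fromstring(f"<message>{escaped}</message>")

print(escaped, quoted, message.text)
```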

  35. XML document that contains Arabic words
      <?xml version = "1.0"?>
      <welcome>
        <from>
          &#1583;&#1575;&#1610;&#1578; &#1614;&#1604;&#1571;&#1606;&#1583;
        </from>
        <subject>
          &#1571;&#1607;&#1604;&#1575;&#1611; &#1576;&#1603;&#1605; &#1601;&#1610;&#1616; &#1593;&#1575;&#1604;&#1605;
        </subject>
      </welcome>
      see in IE
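
A brief Python sketch of what the parser does with such references: each `&#NNNN;` is replaced by the Unicode character with that decimal code point. Only the first four references of the subject line are used here to keep the example short.

```python
import xml.etree.ElementTree as ET

# Each &#NNNN; reference names one Unicode code point in decimal.
element = ET.fromstring("<subject>&#1571;&#1607;&#1604;&#1575;</subject>")

subject = element.text
code_points = [ord(ch) for ch in subject]

print(subject, code_points)
```

The file itself stays pure ASCII, yet the application receives real Arabic characters.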

  36. Markup • XML element markup • Consists of • Start tag • Content • End tag • All elements must have a corresponding end tag: <img src = "img.gif"> is correct in HTML, but not in XML • XML requires an end tag or a forward slash (/) for termination: <img src = "img.gif"></img> or <img src = "img.gif"/> is correct XML syntax

  37. Markup (cont.) • Elements • Define structure • May (or may not) contain content • Child elements, character data, etc. • Attributes • Describe elements • Elements may have associated attributes • Placed within the element's start tag • Values are enclosed in quotes • Element car contains attribute doors, which has value "4": <car doors = "4"/>
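
The end-tag rule can be checked with any XML parser. The following Python sketch uses a small helper, `is_well_formed` (a name chosen for this illustration), that reports whether the standard-library parser accepts a fragment.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


html_style = is_well_formed('<img src="img.gif">')        # no end tag
explicit_end = is_well_formed('<img src="img.gif"></img>')
self_closing = is_well_formed('<img src="img.gif"/>')

# An empty element can still carry information in its attributes.
doors = ET.fromstring('<car doors="4"/>').get("doors")

print(html_style, explicit_end, self_closing, doors)
```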

  38. Markup (cont.) • Processing instruction (PI) • Passed to application using XML document • Provides application-specific document information • Delimited by <? and ?>

  39. Listing usage.xml
      <?xml version = "1.0"?>
      <!-- Fig. 5.5 : usage.xml -->
      <!-- Usage of elements and attributes -->
      <?xml-stylesheet type = "text/xsl" href = "usage.xsl"?>
      <book isbn = "999-99999-9-X">
        <title>Deitel&apos;s XML Primer</title>
        <author>
          <firstName>Paul</firstName>
          <lastName>Deitel</lastName>
        </author>
        <chapters>
          <preface num = "1" pages = "2">Welcome</preface>
          <chapter num = "1" pages = "4">Easy XML</chapter>
          <chapter num = "2" pages = "2">XML Elements?</chapter>
          <appendix num = "1" pages = "9">Entities</appendix>
        </chapters>
        <media type = "CD"/>
      </book>
      Callout: PI discussed later
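
To see how a processing instruction reaches the application, here is a Python sketch based on `xml.dom.minidom`. It assumes a shortened version of the listing and the standard `xml-stylesheet` target; the XML declaration itself is not reported as a processing instruction.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="usage.xsl"?>'
    '<book isbn="999-99999-9-X"><title>Deitel&apos;s XML Primer</title></book>'
)

# Processing instructions appear as their own node type in the DOM tree.
pis = [n for n in doc.childNodes if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE]

target = pis[0].target   # the name after "<?"
data = pis[0].data       # everything up to "?>", passed on unparsed

title = doc.getElementsByTagName("title")[0].firstChild.data

print(target, data, title)
```

The parser does not interpret the instruction's data; it is up to the application (for example a browser applying a stylesheet) to make sense of it.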

  40. CDATA Sections • CDATA sections • May contain text, reserved characters and whitespace • Reserved characters need not be replaced by entity references • Not processed by XML parser • Commonly used for scripting code (e.g., JavaScript) • Begin with <![CDATA[ • Terminate with ]]> see in IE

  41. Listing cdata.xml
      <?xml version = "1.0"?>
      <!-- Fig. 5.7 : cdata.xml -->
      <!-- CDATA section containing C++ code -->
      <book title = "C++ How to Program" edition = "3">
        <sample>
          // C++ comment
          if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )
             cerr &lt;&lt; this-&gt;displayError();
        </sample>
        <sample>
          <![CDATA[
          // C++ comment
          if ( this->getX() < 5 && value[ 0 ] != 3 )
             cerr << this->displayError();
          ]]>
        </sample>
        C++ How to Program by Deitel &amp; Deitel
      </book>
      Callout: CDATA

  42. CDATA (cont.)
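
The two ways of writing the C++ condition, with entity references or inside a CDATA section, should give the application the same text. A small Python sketch to confirm this, using one line of the sample:

```python
import xml.etree.ElementTree as ET

# Variant 1: reserved characters written as entity references.
with_entities = ET.fromstring(
    "<sample>if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )</sample>"
)

# Variant 2: the same text inside a CDATA section, no escaping needed.
with_cdata = ET.fromstring(
    "<sample><![CDATA[if ( this->getX() < 5 && value[ 0 ] != 3 )]]></sample>"
)

same_text = with_entities.text == with_cdata.text
print(with_cdata.text, same_text)
```

CDATA is mainly a convenience for the author: the content is easier to read and write when it contains many reserved characters.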

  43. XML Namespaces • Naming collisions • Two different elements have the same name: <subject>Math</subject> and <subject>Thrombosis</subject> • Namespaces • Differentiate elements that have the same name: <school:subject>Math</school:subject> and <medical:subject>Thrombosis</medical:subject> • school and medical are namespace prefixes • Prepended to element and attribute names • Tied to a uniform resource identifier (URI) • Series of characters for differentiating names

  44. XML Namespaces (cont.) • Creating namespaces • Use the xmlns keyword: xmlns:text = "urn:deitel:textInfo" xmlns:image = "urn:deitel:imageInfo" • Creates two namespace prefixes, text and image • urn:deitel:textInfo is the URI for prefix text • urn:deitel:imageInfo is the URI for prefix image • Default namespaces • Declared without a prefix: xmlns = "urn:deitel:textInfo" • Child elements of this namespace do not need a prefix

  45. Listing namespace.xml
      <?xml version = "1.0"?>
      <!-- Fig. 5.8 : namespace.xml -->
      <!-- Namespaces -->
      <directory xmlns:text = "urn:deitel:textInfo"
                 xmlns:image = "urn:deitel:imageInfo">
        <text:file filename = "book.xml">
          <text:description>A book list</text:description>
        </text:file>
        <image:file filename = "funny.jpg">
          <image:description>A funny picture</image:description>
          <image:size width = "200" height = "100"/>
        </image:file>
      </directory>
      Callout: XML namespace, no default

  46. Listing defaultnamespace.xml
      <?xml version = "1.0"?>
      <!-- Fig. 5.9 : defaultnamespace.xml -->
      <!-- Using Default Namespaces -->
      <directory xmlns = "urn:deitel:textInfo"
                 xmlns:image = "urn:deitel:imageInfo">
        <file filename = "book.xml">
          <description>A book list</description>
        </file>
        <image:file filename = "funny.jpg">
          <image:description>A funny picture</image:description>
          <image:size width = "200" height = "100"/>
        </image:file>
      </directory>
      Callouts: XML namespace with default • default needs full name
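
How a program sees namespaces can be sketched in Python. The parser replaces each prefix (or the default namespace) with the full URI, so the application works with URIs rather than prefixes. The prefix mapping `ns` below is local to this sketch; its prefix `text` is chosen freely and need not appear in the document.

```python
import xml.etree.ElementTree as ET

directory_xml = """
<directory xmlns="urn:deitel:textInfo" xmlns:image="urn:deitel:imageInfo">
  <file filename="book.xml">
    <description>A book list</description>
  </file>
  <image:file filename="funny.jpg">
    <image:description>A funny picture</image:description>
    <image:size width="200" height="100"/>
  </image:file>
</directory>
"""

root = ET.fromstring(directory_xml)

# Prefixes used for searching; only the URIs have to match the document.
ns = {"text": "urn:deitel:textInfo", "image": "urn:deitel:imageInfo"}

text_file = root.find("text:file", ns)
image_file = root.find("image:file", ns)
width = image_file.find("image:size", ns).get("width")

print(root.tag, text_file.get("filename"), image_file.get("filename"), width)
```

Both elements are called `file`, yet they are kept apart because they belong to different namespace URIs.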

  47. 3. DTD and Schemas

  48. DTD / Schema • DTD: Document Type Definition • Sometimes also called 'Document Type Description' • Today there are also Schemas • Schemas are more detailed • Schemas are themselves written in XML • Both contain the 'grammar' of an XML document • The parser (a program) checks the correctness of the XML file against the DTD / Schema, e.g. numerics, characters ...

  49. Parsing / validating • Diagram: the XML file is checked together with the DTD / Schemas • Result: a correct (well-formed and valid) file

  50. DTDs vs. Schemas • Both describe the basic structure of documents of a particular type • The structure can be specified either with DTDs (Document Type Definitions) or with XML Schemas • DTDs were inherited from SGML and are part of XML 1.0 • XML Schema is a separate W3C standard
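
The difference between "well-formed" and "valid" can be illustrated with a hand-written check in Python. This is only a sketch: Python's standard library does not validate against a DTD or Schema, so the dictionary `CONTENT_MODEL` is a hypothetical stand-in for a DTD rule such as `<!ELEMENT event (stadt, ort)>`.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar: an event must contain stadt, then ort.
CONTENT_MODEL = {"event": ["stadt", "ort"]}


def check(xml_text):
    """Return 'not well-formed', 'invalid' or 'valid' for the given text."""
    try:
        root = ET.fromstring(xml_text)      # step 1: syntax check by the parser
    except ET.ParseError:
        return "not well-formed"
    for element in root.iter():             # step 2: compare with the grammar
        expected = CONTENT_MODEL.get(element.tag)
        if expected is not None and [child.tag for child in element] != expected:
            return "invalid"
    return "valid"


good = check("<event><stadt>Ulm</stadt><ort>MedSchule</ort></event>")
wrong_order = check("<event><ort>MedSchule</ort><stadt>Ulm</stadt></event>")
broken = check("<event><stadt>Ulm</event>")

print(good, wrong_order, broken)
```

A real project would leave step 2 to a validating parser and an actual DTD or XML Schema; the point here is only that a document can be syntactically correct and still violate the agreed grammar.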
