Using XML in Cooperative Knowledge Discovery, Organization and Exchange for Internet Applications Yang Wang Pattern Discovery Software Systems Ltd.
A Common Data Format is Important • Nature of the project • Cooperation • Organization • Exchange • For the sake of software development • Common language • Interface • Internal • External
A Structured Information/Knowledge Representation is More Important • The velocity of information • Speed of information • Importance of information • Different sending and receiving devices • Into the smart network/environment • Human understandability vs machine understandability • Information encoded in ways compatible with both humans and machines • Separation of the information itself from its display • Enabling communication between computerized agents • Toward the active document
Why XML • XML is structured • XML is extensible • XML is network enabled • XML is simple and (almost) standardized • XML is programming language independent • XML is suitable for this project • Standard for information sharing, retrieval and exchange over the network • Ideal for agent-based approaches – machine understandable • Able to model documents and the discovered knowledge – extensible grammar ideal for describing features, patterns, and semantics of web documents • Potential for commercialization
XML Essentials – An Example
<?xml version="1.0"?>
<!DOCTYPE ContactRec SYSTEM "ContactRec.dtd">
<ContactRec>
  <Name>
    <Honorific Title="Mr"></Honorific>
    <First>David</First><Last>Lewis</Last>
  </Name>
  <Company>
    <JobTitle>Principal Webmaster</JobTitle>
    <CompanyName>Lewis Systems</CompanyName>
  </Company>
  <Address>
    <Street>2600 Wilson Boulevard</Street>
    <City>San Tabisco</City>
    <Region>CA</Region>
    <PostCode>94583</PostCode>
    <Country>US</Country>
    <Phone>
      <DayTime>415-000-0000</DayTime>
    </Phone>
    <Internet>
      <Email>drlewi1@lewis.com</Email>
      <Web>http://www.lewis.com</Web>
    </Internet>
  </Address>
  <Product SGMLEditor="Yes" ActiveViews="Yes" SymposiaPro="Yes" SymposiaDocPlus="Yes" SGMLEditorJapanese="No" SGMLEditorKorean="No" XMLProducts="Yes" General="No"/>
  <Contacts>
    <Language Preference="English"/>
    <History>
      <Events>
        <Date>
          <Day>01</Day><Month>04</Month><Year>96</Year>
        </Date>
        <Venue>SGML '96</Venue>
        <Notes>Send Evaluation Software</Notes>
      </Events>
    </History>
  </Contacts>
</ContactRec>
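As a rough illustration of what "machine understandable" means for the record above, the following sketch reads an abbreviated copy of it with Python's standard xml.etree.ElementTree module. The choice of Python and the variable names are assumptions made for this example, not part of the original slides; the parser used here is non-validating, so ContactRec.dtd is never fetched.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the ContactRec example (several elements omitted for brevity).
RECORD = """<?xml version="1.0"?>
<!DOCTYPE ContactRec SYSTEM "ContactRec.dtd">
<ContactRec>
  <Name>
    <Honorific Title="Mr"></Honorific>
    <First>David</First><Last>Lewis</Last>
  </Name>
  <Address>
    <City>San Tabisco</City>
    <Internet>
      <Email>drlewi1@lewis.com</Email>
    </Internet>
  </Address>
  <Product SGMLEditor="Yes" XMLProducts="Yes" General="No"/>
</ContactRec>"""

root = ET.fromstring(RECORD)

# Element content is reached by path; attribute values by name.
full_name = " ".join([
    root.find("Name/Honorific").get("Title"),
    root.findtext("Name/First"),
    root.findtext("Name/Last"),
])
email = root.findtext("Address/Internet/Email")

# Product interests are the attributes of <Product> whose value is "Yes".
interests = sorted(
    name for name, value in root.find("Product").attrib.items() if value == "Yes"
)

print(full_name)
print(email)
print(interests)
```

The same paths work regardless of how the record is displayed, which is the separation of information from presentation argued for earlier.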
XML Essentials - Components • Prolog (optional) • XML declaration: <?xml version="1.0" encoding="UTF-8"?> • Comments: <!-- This document is about the NSERC project --> • Processing Instructions (PIs) • Document Type Declaration: <!DOCTYPE nserc SYSTEM "http://pami.uwaterloo.ca/NSERC.dtd"> • Root – The body of an XML document: <nserc> <title>NSERC Project</title> </nserc> • Epilog (optional) • Follows the root (not recommended) • Contains miscellaneous information, comments and PIs
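The components listed above can be observed directly by parsing a small document and listing its top-level nodes. This is a minimal sketch using Python's standard xml.dom.minidom; the xml-stylesheet processing instruction, its nserc.xsl target and the epilog comment are hypothetical additions for illustration only. The XML declaration itself is consumed by the parser and does not appear as a node.

```python
from xml.dom import minidom, Node

# The stylesheet PI and the trailing comment are hypothetical, added only
# so that every component type from the slide shows up in the output.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<!-- This document is about the NSERC project -->
<?xml-stylesheet type="text/xsl" href="nserc.xsl"?>
<!DOCTYPE nserc SYSTEM "http://pami.uwaterloo.ca/NSERC.dtd">
<nserc>
  <title>NSERC Project</title>
</nserc>
<!-- epilog comment -->"""

KIND = {
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.DOCUMENT_TYPE_NODE: "document type declaration",
    Node.ELEMENT_NODE: "root element",
}

# minidom is non-validating, so the DTD URL is recorded but not fetched.
doc = minidom.parseString(DOCUMENT)
parts = [KIND.get(node.nodeType, "other") for node in doc.childNodes]

for part in parts:
    print(part)
print("root tag:", doc.documentElement.tagName)
```

Everything listed before the root element belongs to the prolog; anything after it belongs to the epilog.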
XML DTD • Why a DTD • Consistency • Rigor – thought before action • Some XML features require a DTD • Ideal feature for this project • DTD Structure • A tree • Attributed nodes
XML DTD Example
<?xml version="1.0"?>
<!DOCTYPE ContactRec [
<!ELEMENT ContactRec (Name, Company, Address, Product, Contacts)>
<!ELEMENT Name (Honorific?, First, Middle?, Last)>
<!ELEMENT Company (JobTitle?, CompanyName)>
<!ELEMENT Address (Street+, City, Region?, PostCode, Country, Phone, Internet)>
<!ELEMENT Product EMPTY>
<!ATTLIST Product
    SGMLEditor (Yes|No) #REQUIRED
    SGMLEditorKorean (Yes|No) #REQUIRED
    SGMLEditorJapanese (Yes|No) #REQUIRED
    ActiveViews (Yes|No) #REQUIRED
    SymposiaPro (Yes|No) #REQUIRED
    SymposiaDocPlus (Yes|No) #REQUIRED
    XMLProducts (Yes|No) #REQUIRED
    General (Yes|No) #REQUIRED>
<!ELEMENT Contacts (Language, History)>
<!ELEMENT Honorific (#PCDATA)>
<!ATTLIST Honorific Title (Mr|Ms|Mrs|Miss|Dr|Professor|M|Mme|Mlle|SeeContent) "SeeContent">
<!ELEMENT First (#PCDATA)>
<!ELEMENT Middle (#PCDATA)>
<!ELEMENT Last (#PCDATA)>
<!ELEMENT JobTitle (#PCDATA)>
<!ELEMENT CompanyName (#PCDATA)>
<!ELEMENT Street (#PCDATA)>
<!ELEMENT City (#PCDATA)>
<!ELEMENT Region (#PCDATA)>
<!ELEMENT PostCode (#PCDATA)>
<!ELEMENT Country (#PCDATA)>
<!ELEMENT Phone (DayTime, Fax?)>
<!ELEMENT Internet (Email, Web)>
<!ELEMENT Language EMPTY>
<!ATTLIST Language Preference (English|French) "English">
<!ELEMENT History (Events+)>
<!ELEMENT DayTime (#PCDATA)>
<!ELEMENT Fax (#PCDATA)>
<!ELEMENT Email (#PCDATA)>
<!ELEMENT Web (#PCDATA)>
<!ELEMENT Events (Date, Venue, Notes)>
<!ELEMENT Date (Day, Month, Year)>
<!ELEMENT Venue (#PCDATA)>
<!ELEMENT Notes (#PCDATA)>
<!ELEMENT Day (#PCDATA)>
<!ELEMENT Month (#PCDATA)>
<!ELEMENT Year (#PCDATA)>
]>
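Each element declaration in the DTD above is a content model: a regular expression over the sequence of child element names, where "?" means optional and "+" means one or more. Python's standard library does not include a validating parser, so the sketch below is only an illustration of the idea, not a DTD validator: four content models are translated by hand into regular expressions (an assumption of this example) and checked against parsed elements.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated subset of the content models in the DTD above.
# Every child tag is followed by one space so that names match whole.
CONTENT_MODELS = {
    "Name": r"(Honorific )?First (Middle )?Last ",
    "Address": r"(Street )+City (Region )?PostCode Country Phone Internet ",
    "Phone": r"DayTime (Fax )?",
    "History": r"(Events )+",
}

def check_children(element):
    """Return (tag, child sequence) pairs that violate a known content model."""
    problems = []
    for node in element.iter():
        model = CONTENT_MODELS.get(node.tag)
        if model is None:
            continue  # no model recorded for this element in the sketch
        sequence = "".join(child.tag + " " for child in node)
        if re.fullmatch(model, sequence) is None:
            problems.append((node.tag, sequence.strip()))
    return problems

good = ET.fromstring("<Name><First>David</First><Last>Lewis</Last></Name>")
bad = ET.fromstring("<Name><Last>Lewis</Last><First>David</First></Name>")

print(check_children(good))  # [] because First, Last satisfies the Name model
print(check_children(bad))   # [('Name', 'Last First')] because the order is wrong
```

A real validating parser would also check attribute declarations such as the #REQUIRED Product attributes, which this sketch ignores.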
XSL and XSLT • XSL (eXtensible Stylesheet Language) • XSLT (XSL Transformations)
XML for this NSERC Project • Internal information / knowledge sharing among different agents • The basis of the communication protocol • Interfaces • Internal data organization • Data archive • Knowledge base, etc. • External information / knowledge delivery • Standard profile in XML • Rules (association, production) in XML • Class, cluster descriptions in XML • Interfaces to other (commercial) systems
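To make "rules in XML" concrete, here is one possible way an association rule could be written to XML by one agent and read back by another. The element and attribute names (rule, if, then, item, support, confidence) and the sample rule are hypothetical, chosen only for this sketch; the slides do not define a rule vocabulary.

```python
import xml.etree.ElementTree as ET

def rule_to_xml(antecedent, consequent, support, confidence):
    """Encode an association rule as an XML element (hypothetical vocabulary)."""
    rule = ET.Element("rule", {
        "type": "association",
        "support": str(support),
        "confidence": str(confidence),
    })
    if_part = ET.SubElement(rule, "if")
    for item in antecedent:
        ET.SubElement(if_part, "item").text = item
    then_part = ET.SubElement(rule, "then")
    for item in consequent:
        ET.SubElement(then_part, "item").text = item
    return rule

def xml_to_rule(rule):
    """Decode the element produced by rule_to_xml back into plain values."""
    antecedent = [item.text for item in rule.findall("if/item")]
    consequent = [item.text for item in rule.findall("then/item")]
    return antecedent, consequent, float(rule.get("support")), float(rule.get("confidence"))

# Sending agent: serialize the rule to text for exchange.
message = ET.tostring(
    rule_to_xml(["XMLProducts", "SGMLEditor"], ["SymposiaPro"], 0.12, 0.8),
    encoding="unicode",
)
print(message)

# Receiving agent: parse the text and recover the rule.
decoded = xml_to_rule(ET.fromstring(message))
print(decoded)
```

Because the message is plain text with a known structure, the receiving side needs no knowledge of the sender's programming language or internal data structures.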
Software for Developing XML • XML parsers • Xparse (JavaScript), Ælfred (Java), Xerces (C/C++) • Document editors • WordPerfect, Xeena, Emacs with pSGML, Clip! • DTD editors/generators • XML Spy, XMLOutline, Oracle XML Schema • Style sheet tools • Editors: HomeSite, Excelon Stylus • Processors: XT, Saxon • XML browsers • MS IE 5, Netscape, XML Viewer
Open Problems and Suggestions • Is the tree-like DTD good enough for us? • How to start using XML in the project? • How to evolve XML documents while the project moves ahead? • How stable is the current XML standard? • What are the risks?
Resources • XML Annotated Recommendations • http://www.xml.com/axml/axml.html • Robin Cover’s SGML/XML page • http://www.oasis-open.org/cover/ • XML software and tools • http://www.xmlsoftware.com • XML FAQ • http://www.ucc.ie/xml • Links to free XML tools • http://www.garshol.priv.no/download/xmltools/