Vocabulary Markup Language (Voc-ML) Project • Joseph A. Busch • Content Intelligence Evangelist • Interwoven
Agenda • Background • Revision summary • Issues and Next steps
Soergel’s SemWeb Proposal • System of integrated access to data on concepts and terminology. • Bring together variety of sources that exist largely in separate worlds, including dictionaries, thesauri, classification schemes, etc. • Federated system with multiple collaborators. • Common interface to all concept & terminology knowledge bases on the Internet.
The Real Semantic Web • Namespace for uniquely identifying a semantic scheme & each concept within each scheme. • Broad template or conceptual schema for holding all types of semantic information & specifying relationships among them. • Definitions of services for interacting with the System.
Vocabulary Markup Language (Voc-ML) • XML schema for the Semantic Web. • Broad template for structured representation of semantic schemes. • Dublin Core metadata. • Tags and syntax for uniquely identifying each concept. • Typed relationships (hierarchical, associative, etc.) • Typed notes. • Host agency: Networked Knowledge Organization Systems (nkos.slis.kent.edu)
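The "uniquely identifying each concept" bullet can be illustrated with a minimal Python sketch. It assumes identifiers take the form scheme::local, which is how UIDs are written in the sample record in this deck (for example DFSIC-1998::0139); the deck does not define the UID syntax formally, and the names ConceptUID, parse_uid and format_uid are hypothetical helpers, not part of Voc-ML.

```python
from typing import NamedTuple

# Assumption: "::" separates the scheme from the concept, as seen in the
# sample record's UIDs. The deck itself does not specify the syntax.
SEPARATOR = "::"


class ConceptUID(NamedTuple):
    scheme: str    # identifies the semantic scheme, e.g. "DFSIC-1998"
    local_id: str  # identifies the concept within that scheme, e.g. "0139"


def parse_uid(uid: str) -> ConceptUID:
    """Split 'scheme::local' at the first separator; reject anything else."""
    scheme, sep, local_id = uid.partition(SEPARATOR)
    if not sep or not scheme or not local_id:
        raise ValueError(f"not a scheme::local identifier: {uid!r}")
    return ConceptUID(scheme, local_id)


def format_uid(uid: ConceptUID) -> str:
    """Rebuild the string form from its two parts."""
    return f"{uid.scheme}{SEPARATOR}{uid.local_id}"
```

Keeping the scheme and the local identifier as separate fields makes it easy to tell whether a relationship points inside the same vocabulary or across vocabularies.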
Dublin Core • Unique ID • Typed Relationships
<?xml version="1.0"?>
<!DOCTYPE VocML SYSTEM "VocML.dtd">
<Voc-ML version="1.99">
  <SrcVocab>
    <SVHeader>
      <dc:Title>DFSIC-1998</dc:Title>
      <dc:Source>Standard Industrial Classification (1987)</dc:Source>
      <dc:Creator>Interwoven</dc:Creator>
      <dc:Contributor>U.S. Department of Commerce</dc:Contributor>
      …
      <workNum UIDprefix="DFSIC-1998" DisplayTitle="Standard Industrial Classification" BriefDisplay="SIC">
    </SVHeader>
    <SVTerm UID="DFSIC-1998::0139" CCID="104:43">
      <label>Field Crops, except Cash Grains, not elsewhere classified</label>
      <definition>Establishments primarily engaged in the production of field crops, except cash grains, not elsewhere classified. This industry also includes establishments deriving 50 percent or more of their total value of sales of agricultural products from field crops, except cash grains (Industry Group 013), but less than 50 percent from products of any single industry.</definition>
      <cla>0139</cla>
      <typedRelation UREF="DFSIC-1998::013" UTYPE="Z39.19-1980::2" Name="BT">
      <typedRelation UREF="DFSIC-1998::013900" UTYPE="Z39.19-1980::3" Name="NT">
      …
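As a rough illustration of how such a record might be consumed, the following Python sketch reads a cut-down, well-formed variant of the sample with the standard library's xml.etree.ElementTree. Several details are assumptions made only so the fragment parses: the elided parts are dropped, the typedRelation elements are written as self-closing, the DOCTYPE is omitted, and xmlns:dc is bound to the Dublin Core element namespace on SrcVocab. The function name read_terms is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the dc prefix is bound to the Dublin Core element namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Cut-down variant of the slide's record, closed up so that it is well-formed.
SAMPLE = """<?xml version="1.0"?>
<Voc-ML version="1.99">
  <SrcVocab xmlns:dc="http://purl.org/dc/elements/1.1/">
    <SVHeader>
      <dc:Title>DFSIC-1998</dc:Title>
      <dc:Creator>Interwoven</dc:Creator>
    </SVHeader>
    <SVTerm UID="DFSIC-1998::0139">
      <label>Field Crops, except Cash Grains, not elsewhere classified</label>
      <cla>0139</cla>
      <typedRelation UREF="DFSIC-1998::013" UTYPE="Z39.19-1980::2" Name="BT"/>
      <typedRelation UREF="DFSIC-1998::013900" UTYPE="Z39.19-1980::3" Name="NT"/>
    </SVTerm>
  </SrcVocab>
</Voc-ML>
"""


def read_terms(xml_text):
    """Return the vocabulary title and a dict of terms keyed by UID."""
    root = ET.fromstring(xml_text)
    # Namespaced tags are addressed as {namespace-uri}LocalName.
    title = root.findtext(f"SrcVocab/SVHeader/{{{DC}}}Title")
    terms = {}
    for term in root.iter("SVTerm"):
        relations = [
            (rel.get("Name"), rel.get("UREF"))
            for rel in term.findall("typedRelation")
        ]
        terms[term.get("UID")] = {
            "label": term.findtext("label"),
            "relations": relations,
        }
    return title, terms


title, terms = read_terms(SAMPLE)
for name, target in terms["DFSIC-1998::0139"]["relations"]:
    print(name, target)
```

Because every relationship carries its type as an attribute, one loop handles broader, narrower and any other relation types without special cases.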
Agenda • Background • Revision summary • Issues and Next steps
Voc-ML Version 1.7 Revisions • Added editDate and source attributes to all “customer” updatable elements. • Removed most remnants of Datafusion Concept Catalog, including CCID, XWalkHeader, etc.
Voc-ML Version 1.8 Revisions • Defined and commented Path ID (PID) syntax and usage. PID allows encoding of complex polyhierarchies with context-dependent children, e.g., MeSH.* • * This is still in flux. Currently all path information can be encoded in parent/child tags, but current Interwoven software will not read path tags.
Voc-ML Version 1.9 Revisions • Removed more excess elements from Datafusion Concept Catalog, including CCLoadFile, ForbiddenIDs. • Removed all instances of CTYPE attribute (new ID reference semantics). • Changed optional Note & Misc elements to be allowed only once, not many times.* • * This is still in flux. Currently reflects what Interwoven software does, not what the standard should do.
Voc-ML Version 1.95 Revisions • Added editHistory and Edit elements in SVHeader. • Added editDate and Source attributes to Path element. • Edited and updated comments on Path, Parent, & Child to reflect new semantics of the ID reference attributes in these elements. • Added comments on commonHeaderBlock parameter entity to explain Dublin Core and cite DC reference. • Cleaned up many other comments for readability.
Voc-ML Version 1.99 Revisions • Added xmlns:dc attribute to root SrcVocab element. • Changed PID attribute in path to CDATA. • Changed UREF attribute in child to CDATA. • Made editHistory in the SVHeader optional.
Agenda • Background • Revision summary • Issues and Next steps
Voc-ML Issues • UID syntax (Light, 5/25/01) • Topic maps vs Voc-ML • Functions to be served by standards for machine-readable thesauri (Soergel, 11/21/01): data input, transfer; query and view • URIs • Normalization vs specialization (ASIS meeting, 11/13/01): type all relationships (remove Parent, Child, RelatedTerm elements); type all notes (remove Definition element, add type for Note)
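The "type all relationships" proposal could look roughly like the Python sketch below, which rewrites dedicated Parent, Child and RelatedTerm elements as typedRelation elements. This is an illustration under stated assumptions, not the project's method: the BT/NT/RT names follow the usual thesaurus convention, the deck confirms a UREF attribute only on the child element (Parent is assumed to carry one too), and UTYPE is left out because the deck shows type codes only for BT and NT. The function name type_all_relationships is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the dedicated elements to thesaurus relation names.
RELATION_NAMES = {"Parent": "BT", "Child": "NT", "RelatedTerm": "RT"}


def type_all_relationships(term):
    """Rename Parent/Child/RelatedTerm children of `term` to typedRelation, in place."""
    for child in list(term):
        name = RELATION_NAMES.get(child.tag)
        if name is None:
            continue  # leave label, definition, notes, etc. untouched
        child.tag = "typedRelation"
        child.set("Name", name)  # existing attributes such as UREF are kept
    return term


# Hypothetical input using the dedicated elements.
term = ET.fromstring(
    '<SVTerm UID="DFSIC-1998::0139">'
    "<label>Field Crops, except Cash Grains, not elsewhere classified</label>"
    '<Parent UREF="DFSIC-1998::013"/>'
    '<Child UREF="DFSIC-1998::013900"/>'
    "</SVTerm>"
)
type_all_relationships(term)
print(ET.tostring(term, encoding="unicode"))
```

Renaming in place preserves element order, so a record converted this way differs from the original only in how each relationship is labelled.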
Proposed Next Steps • Post Voc-ML 2.0 • Provide examples of marked-up resources • Prepare W3C RFC
Joseph A. Busch • Director, Solutions Architecture • Interwoven • 415-778-3129 • fax 415-778-3131 • jbusch@interwoven.com