
Strategies for subject navigation of linked Web sites using RDF topic maps

Carol Jean Godby, Devon Smith. OCLC Online Computer Library Center. Knowledge Technologies 2002 – Seattle, WA.


Presentation Transcript


  1. Strategies for subject navigation of linked Web sites using RDF topic maps
     Carol Jean Godby, Devon Smith
     OCLC Online Computer Library Center
     Knowledge Technologies 2002 – Seattle, WA

  2. Complex Web sites
     • Many institutions are struggling to solve problems with their official Web sites.
     • But:
       • The contents constantly change.
       • The editors can’t exercise sufficient control.
     • One result: an institution’s major presence on the Web is difficult to navigate.

  3. The Semantic Web
     Tim Berners-Lee’s vision:
     • “The current Web has documents for people, not computers. By augmenting Web pages with data designed for automated processing, users will transform the Web into the Semantic Web.”
     • “Computers will find the meaning of semantic data by following hyperlinks to definitions of key terms and rules for reasoning about them logically.”

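  4. The Semantic Web: An Architecture
     [Figure: layered architecture diagram. Layers, bottom to top: Unicode and URI; XML + XML namespaces + XMLschema; RDF + RDFschema; Ontology vocabulary; Logic; Proof; Trust. Digital signature is drawn alongside the layers. Side annotations: Self-describing documents, Data, Data, Rules.]
     Source: Tim Berners-Lee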

  5. The promise of the Semantic Web
     • A common data model
     • Conceptual links
     • Limited inferences

  6. Our demo: goals
     • Represent subject/topic information obtained from different sources.
     • Demonstrate the value of hypothetical metadata-based navigation for a collection of related Web sites.
       • oclc.org
       • Portions of w3c.org
       • dublincore.org
     • Develop and evaluate the utility of Open Source prototyping tools based on RDF.

  7. Some common topics
     [Figure: diagram of topics shared among oclc.org, w3c.org, and dublincore.org. Topic labels shown: xml fragment, xml stylesheet, traditional library, library users, library network, digital library, xml, dublin core, xml namespace, xml schema, metadata, library automation, classification, xml profile, schema processor, uri syntax, element node, dc element, syntax.]

  8. Sources of subject/topic metadata
     • HTML keywords
     • Subject lines in email messages
     • An index of library/information science terms
     • Terms extracted automatically from text using natural-language-processing algorithms

  9. Some term relationships
     Singular/Plural:
       Library, libraries
     Acronyms:
       Standard Generalized Markup Language -- SGML
       Library of Congress Subject Headings -- LCSH
     Coordination:
       library and information science -- library science, information science
       information storage and retrieval -- information storage, information retrieval
     Broad/Narrow:
       Computational linguistics -- linguistics
       Classification scheme -- classification
     Type-of:
       Library -- digital library, traditional library
     Related:
       Library -- library classification scheme, library automation
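Two of the relationships above, acronyms and coordination, can be illustrated with simple string heuristics. The sketch below is only an illustration: the function names `acronym` and `expand_coordination` are hypothetical, the rules cover just the patterns in the slide's examples, and nothing here is claimed to be the method used in the project.

```python
def acronym(phrase):
    """Build a candidate acronym from the capitalized words of a phrase.

    Lowercase function words such as "of" are skipped.
    """
    return "".join(word[0] for word in phrase.split() if word[0].isupper())


def expand_coordination(phrase):
    """Split a coordinated phrase into its two component terms.

    Handles only two simple patterns (an assumption for this sketch):
      shared head:     "library and information science"
      shared modifier: "information storage and retrieval"
    Any other phrase is returned unchanged.
    """
    words = phrase.split()
    if "and" not in words:
        return [phrase]
    i = words.index("and")
    left, right = words[:i], words[i + 1:]
    if len(left) == 1 and len(right) >= 2:
        head = right[1:]  # words shared by both conjuncts
        return [" ".join(left + head), " ".join(right)]
    if len(right) == 1 and len(left) >= 2:
        modifier = left[:-1]  # leading words shared by both conjuncts
        return [" ".join(left), " ".join(modifier + right)]
    return [phrase]


if __name__ == "__main__":
    print(acronym("Library of Congress Subject Headings"))
    print(expand_coordination("library and information science"))
    print(expand_coordination("information storage and retrieval"))
```

Real text would need more than this, since longer conjuncts and nested coordination are ambiguous without syntactic analysis.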

  10. An RDF encoding
      <Topic rdf:about="http://purl.org/rdf/topics/classification">
        <name>classification</name>
        <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_codes"/>
        <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_number"/>
        <types_of rdf:resource="http://purl.org/rdf/topics/automatic_classification"/>
        <types_of rdf:resource="http://purl.org/rdf/topics/library_classification"/>
        <coordinate rdf:resource="http://purl.org/rdf/topics/resource_discovery_and_classification"/>
        <coordinate rdf:resource="http://purl.org/rdf/topics/classification_and_knowledge"/>
      </Topic>
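A topic record like the one above can be read into (subject, relationship, object) triples with a standard XML parser. This is a minimal sketch using Python's `xml.etree.ElementTree`. The slide shows only a fragment, so the enclosing `rdf:RDF` element and the default namespace `http://example.org/topic-vocab#` are assumptions added here to make the sample parse; they do not come from the presentation.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TOPIC_NS = "http://example.org/topic-vocab#"  # hypothetical vocabulary namespace

# Shortened copy of the slide's record, wrapped so that it is well-formed.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{TOPIC_NS}">
  <Topic rdf:about="http://purl.org/rdf/topics/classification">
    <name>classification</name>
    <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_codes"/>
    <types_of rdf:resource="http://purl.org/rdf/topics/library_classification"/>
  </Topic>
</rdf:RDF>"""


def topic_triples(xml_text):
    """Return (subject URI, relationship name, object) for every Topic child."""
    root = ET.fromstring(xml_text)
    triples = []
    for topic in root.findall(f"{{{TOPIC_NS}}}Topic"):
        subject = topic.get(f"{{{RDF_NS}}}about")
        for child in topic:
            relationship = child.tag.split("}", 1)[1]  # strip the namespace part
            target = child.get(f"{{{RDF_NS}}}resource")
            if target is None:
                target = (child.text or "").strip()  # literal value, e.g. <name>
            triples.append((subject, relationship, target))
    return triples


if __name__ == "__main__":
    for triple in topic_triples(DOC):
        print(triple)
```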

  11. Connected RDF encodings
      <Topic rdf:about="http://purl.org/rdf/topics/resource_discovery">
        <name>resource discovery</name>
        <broad_concepts rdf:resource="http://purl.org/rdf/topics/resource"/>
      </Topic>
      <Topic rdf:about="http://purl.org/rdf/topics/resource">
        <name>resource</name>
        <related_concepts rdf:resource="http://purl.org/rdf/topics/resource_discovery"/>
        <types_of rdf:resource="http://purl.org/rdf/topics/resource_description_framework"/>
        <related rdf:resource="http://purl.org/rdf/topics/web_resource"/>
      </Topic>
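Because each record points at other topics by URI, the records form a graph that can be walked for navigation. The sketch below shows one way to do that with a breadth-first search. The short topic names stand in for the full purl.org URIs, and `reachable_topics` is a hypothetical helper, not part of the system described in the slides.

```python
from collections import deque

# Edges taken from the two records above, with URIs shortened to their last segment.
EDGES = [
    ("resource_discovery", "broad_concepts", "resource"),
    ("resource", "related_concepts", "resource_discovery"),
    ("resource", "types_of", "resource_description_framework"),
    ("resource", "related", "web_resource"),
]


def reachable_topics(start, edges):
    """Breadth-first walk; returns each reachable topic with its link distance."""
    neighbours = {}
    for subject, _relationship, target in edges:
        neighbours.setdefault(subject, []).append(target)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        topic = queue.popleft()
        for nxt in neighbours.get(topic, []):
            if nxt not in distance:  # visit each topic once
                distance[nxt] = distance[topic] + 1
                queue.append(nxt)
    return distance


if __name__ == "__main__":
    print(reachable_topics("resource_discovery", EDGES))
```

A navigation interface could use the distances to rank suggestions, showing directly linked topics before more remote ones.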

  12. A graphical representation of relationships
      [Figure: graph of topics connected by labeled links. Topic nodes: classification codes, rdf, classification, resource description framework, automatic classification, resource discovery and classification, resource, resource discovery. Link labels: Acronym, Broad/Narrow, Type_of, Coordination, Related.]

  13. The philosophy of our system
      • Modular
      • Open Source
      Project Web site accessible at: topicmap.oclc.org:5000

  14. System architecture: 1
      Normalized HTML data -> Extract terms -> Filter terms -> Structure terms -> RDF graph
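The stages of this pipeline can be sketched as a chain of small functions. Every stage below is a deliberately simple stand-in and an assumption of this sketch: word pairs instead of real natural-language processing, a stopword list instead of the project's filters, and a head-word rule for structure. Only the overall shape, extract then filter then structure, comes from the slide.

```python
import re

STOPWORDS = {"the", "of", "and", "a", "is", "in", "for", "to"}  # hypothetical list


def extract_terms(text):
    """Stand-in extractor: every pair of adjacent words is a candidate term."""
    words = re.findall(r"[a-z]+", text.lower())
    return [f"{first} {second}" for first, second in zip(words, words[1:])]


def filter_terms(terms):
    """Stand-in filter: drop candidates that contain a stopword."""
    return [t for t in terms if not any(w in STOPWORDS for w in t.split())]


def structure_terms(terms):
    """Stand-in structuring: link each term to its last word as a broader concept."""
    return [(t, "broad_concepts", t.split()[-1]) for t in terms]


def build_graph(text):
    """Run the three stages in order and return the resulting triples."""
    return structure_terms(filter_terms(extract_terms(text)))


if __name__ == "__main__":
    print(build_graph("The study of computational linguistics"))
```

Keeping each stage as a separate function mirrors the modular design the slides describe: any stage can be replaced without touching the others.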

  15. Term filters: using knowledge encoded in the text
      Positive contexts for terms: study of, information about, professor of, department of information science, metadata applications, data processing, automatic classification, computational linguistics, internet resources
      Negative contexts for terms: very different things, few messages, good point, interesting example, appealing idea, small extension, terse document, simple kind
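One possible reading of this slide is that a candidate is kept when it follows a cue such as "study of" or "professor of", and rejected when it starts with an evaluative or vague word such as "good" or "few". The sketch below implements that reading only. The cue lists are drawn from the slide's examples, while the function `keep_term` and its exact rule are assumptions, not the authors' filter.

```python
POSITIVE_CUES = ("study of", "information about", "professor of", "department of")
NEGATIVE_FIRST_WORDS = {
    "very", "few", "good", "interesting", "appealing", "small", "terse", "simple",
}


def keep_term(term, text):
    """Decide whether a candidate term looks like a subject term in this text.

    Rejects candidates that begin with a negative cue word, then accepts
    only candidates that occur directly after a positive cue phrase.
    """
    term = term.lower()
    if term.split()[0] in NEGATIVE_FIRST_WORDS:
        return False
    lowered = text.lower()
    return any(f"{cue} {term}" in lowered for cue in POSITIVE_CUES)


if __name__ == "__main__":
    sentence = "She is a professor of computational linguistics and made a good point."
    for candidate in ("computational linguistics", "good point"):
        print(candidate, keep_term(candidate, sentence))
```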

  16. System architecture: 2
      [Figure: component diagram. Components shown: Database, Harvester (Perl), XML/RDF Loader, File System (HTML), Metadata Scraper (Perl), File System (XML/RDF), File System (Normalized HTML), Term manipulator (Java).]

  17. Open issues
      • RDF knowledge in the user interface.
      • Encoding in RDF or XML?
      • The construction of knowledge ontologies.

  18. Conclusions
      • The enterprise succeeds or fails on the strength of the knowledge ontology.
      • RDF and the XTM standard are descriptively equivalent for our work.
      • Sophisticated user interface design is required to exploit all of the encoded information.

  19. For more information
      • Sharon Caraballo. Automatic Construction of a Hypernym-Labeled Noun Hierarchy. PhD dissertation. Brown University, 2001.
      • Carol Jean Godby. A Computational Study of Lexicalized Noun Phrases in English. PhD dissertation. The Ohio State University, 2002.
