Semantic Information Access
Atilla ELÇİ
Dept. of Computer Engineering, Eastern Mediterranean University
CmpE 588 Spring 2008 EMU
Semantic Info Access
• Shortcomings of commonplace Web and search technology
• Applications of SemTech to knowledge access:
  • Semantic search and browse tools
  • Natural language generation
  • Device independence
• Davies et al., Ch. 8.
Shortcomings of Current Web and Search Technology
• Query construction:
  • Syntactic units such as keywords/terms are used.
  • Polysemy: a single term can have multiple meanings.
  • Query ambiguity: the average number of keywords per query (circa 2000) was only 2.2!
• Lack of semantics:
  • Inability to handle synonymy & polysemy
  • Missed semantic links
• Lack of context:
  • Nothing is available to disambiguate the user's query.
• Presentation of results:
  • Often too many results
• Managing heterogeneity:
  • Providing a coherent view of diverse sources and types of information is very difficult, and unsatisfactory at best.
Lots of data but lacking information! High recall but low precision!
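The "high recall but low precision" remark can be made concrete with a small calculation. The sketch below is illustrative only: the document identifiers and the `precision_recall` helper are assumptions made for this example, not something taken from the slides or from Davies et al. It models an ambiguous one-word query that returns every relevant document but buries it among irrelevant ones.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result set for an ambiguous keyword query:
# ten documents come back, but only four match the intended meaning.
retrieved_docs = [f"d{i}" for i in range(1, 11)]
relevant_docs = ["d1", "d2", "d3", "d4"]

print(precision_recall(retrieved_docs, relevant_docs))  # (0.4, 1.0)
```

Every relevant document is found (recall 1.0), yet six of the ten results are noise (precision 0.4), which is the situation the slide describes.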
Advantages of Semantic Web Technology
• Resolving shortcomings of the current Web and search engines by:
  • Exploiting machine-processable metadata
  • Using ontological concepts to define queries
  • Using semantic relations in defining queries
  • Providing information, not simply data
Future search engines must adopt an "information-centric" approach rather than a document-centric one in order to seek:
• Relevant sections, not simply documents
• A digest of information from several docs/sections
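One way to picture "using ontological concepts to define queries" is concept-based query expansion: the query names a concept, and the search terms are collected from that concept's labels and those of its subclasses. The following is a minimal sketch under stated assumptions. The toy `ONTOLOGY` dictionary and the `expand_concept` function are hypothetical and stand in for a real ontology store; they are not how any of the tools in these slides are implemented.

```python
# Toy ontology (an assumption for illustration): each concept has
# human-readable labels (synonyms) and a list of direct subclasses.
ONTOLOGY = {
    "Vehicle": {"labels": ["vehicle"], "subclasses": ["Car", "Bicycle"]},
    "Car": {"labels": ["car", "automobile"], "subclasses": []},
    "Bicycle": {"labels": ["bicycle", "bike"], "subclasses": []},
}


def expand_concept(concept, ontology):
    """Collect the labels of a concept and of all its subclasses."""
    terms = []
    stack = [concept]
    seen = set()
    while stack:
        current = stack.pop()
        if current in seen or current not in ontology:
            continue  # skip repeated or unknown concepts
        seen.add(current)
        terms.extend(ontology[current]["labels"])
        stack.extend(ontology[current]["subclasses"])
    return sorted(set(terms))


print(expand_concept("Car", ONTOLOGY))      # ['automobile', 'car']
print(expand_concept("Vehicle", ONTOLOGY))
```

A query on the concept `Car` then also matches documents that only say "automobile", which addresses the synonymy problem that plain keyword matching misses.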
Semantic Search and Browse Tools
• Searching XML:
  • QuizXML: XML-aware search; produces an index map of keywords vs. tags.
  • XSearch semantic search engine: returns semantically relevant document fragments in response to a query (tags & keywords).
  • XRANK XML search engine: the query term is matched against document content & document markup. Extends Google's PageRank algorithm.
• Searching semantic data (RDF):
  • QuizRDF: free-text search & RDF annotation search. Provides both searching and browsing.
• Exploiting domain-specific knowledge:
  • Rocha et al.'s search architecture:
    • Spread activation: keyword-based document search, and
    • Using a domain-specific semantic model
  • Guha et al.'s ABS (activity-based search): people, places, events, news items. Combines a conventional search engine with enhancement of its findings through a semantic knowledge base (RDF annotation).
  • Popov et al.'s KIM (knowledge and information management) infrastructure, in order to enhance search:
    • Exploits an ontological knowledge base
    • Provides an automated semantic annotation method
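The spread-activation idea listed under Rocha et al. can be sketched roughly as follows: a keyword search supplies seed nodes with initial activation, and that activation then flows along weighted links of the semantic model to related nodes. This is a generic, simplified illustration, not Rocha et al.'s actual algorithm. The graph, the weights, and the `decay`, `threshold`, and `max_steps` parameters are all assumptions chosen for the example.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.1, max_steps=3):
    """Propagate activation from seed nodes through a weighted graph.

    graph: {node: {neighbour: link_weight}}
    seeds: {node: initial_activation}, e.g. scores from a keyword search
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed >= threshold:  # weak signals die out
                    next_frontier[neighbour] += passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical semantic model: a paper links to its author, the author to a
# project, and the project to a report that never mentions the keyword.
graph = {
    "doc:paper1": {"person:alice": 1.0},
    "person:alice": {"project:semweb": 0.8},
    "project:semweb": {"doc:report2": 0.5},
}
seeds = {"doc:paper1": 1.0}  # the keyword search matched only paper1

result = spread_activation(graph, seeds)
for node, score in sorted(result.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 2))
```

With these settings the author and the project become part of the result even though the keyword matched neither, while the weakly connected report falls below the threshold and is dropped.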
Semantic Search and Browse Tools
• Searching for Semantic Web resources:
  • Swoogle Semantic Web Search Engine: finds ontologies and related instance data on the Web.
  • Other ontology search engines? Mostly for RDF (instance data).
• Wikipedia entry: Semantic Search "attempts to augment and improve traditional Research Searches by leveraging XML and RDF data from semantic networks to disambiguate semantic search queries and web text in order to increase relevancy of results."
• Study Google's search algorithm if you can find it.
• Take a look at the Alexa Web Search Platform search engine.
• Take a look at the WebCrawler.com paradigm: a metasearch engine over other search engines.
• At last, a semantic search engine: see Hakia.com:
  • A Turkish initiative, but based in the USA. Check Partners; one of them is KVK.
  • Just raised another US$5 M, for a total so far of US$18 M.
• Another one: PowerSet.
• Another recent one: AskMeNow.
Semantic Search & Browse Tools (cont'd)
• Semantic browsing:
  • Magpie: a plug-in that adds an ontology-based semantic layer onto web pages as they are browsed.
  • CS AKTiveSpace: a web application for the UK CS research domain.
  • Haystack: a browser for Semantic Web information, aggregating and visualizing RDF data from multiple arbitrary locations.
  • W3C's Annotea Project & Amaya
  • Yuce's Site Insight.
Natural Language Generation (NLG) from Ontologies
Def.: taking structured data in a knowledge base as input and producing natural language text, tailored to the presentational context and the target reader.
• Taxonomy / ontology verbalizers:
  • Template-based or text-generator-based
  • Take advantage of the taxonomy/ontology hierarchy, user history, and available semantic annotation in the KB.
  • Exs.: Wilcock's general-purpose verbalizer, Ontogeneration Project
• Summarizers:
  • Ontosum: uses RDF triples
  • Miakt: domain- & ontology-specific
• Approach:
  • Verbalize based on discourse schema: active-action, passive-action, attribute, part-whole
  • Semantic aggregation
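A template-based verbalizer with a simple aggregation step might look like the sketch below. It is a toy illustration of the general approach, not the method used by Wilcock's verbalizer, Ontosum, or Miakt. The property names in `TEMPLATES`, the sample triples, and the `verbalize` and `aggregate` helpers are all hypothetical.

```python
# Hypothetical sentence templates, one per property of the knowledge base.
TEMPLATES = {
    "worksAt": "{s} works at {o}",
    "authorOf": "{s} is the author of {o}",
    "partOf": "{s} is part of {o}",
}


def verbalize(triple):
    """Turn one (subject, property, object) triple into a sentence."""
    s, p, o = triple
    template = TEMPLATES.get(p, "{s} has {p} {o}")  # generic fallback
    return template.format(s=s, p=p, o=o) + "."


def aggregate(triples):
    """Merge triples sharing subject and property into one sentence."""
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault((s, p), []).append(o)
    sentences = []
    for (s, p), objects in grouped.items():
        if len(objects) > 1:
            joined = ", ".join(objects[:-1]) + " and " + objects[-1]
        else:
            joined = objects[0]
        sentences.append(verbalize((s, p, joined)))
    return sentences


triples = [
    ("Alice", "authorOf", "Paper A"),
    ("Alice", "authorOf", "Paper B"),
    ("Alice", "worksAt", "EMU"),
]
for sentence in aggregate(triples):
    print(sentence)
# Alice is the author of Paper A and Paper B.
# Alice works at EMU.
```

Without the aggregation step the two `authorOf` triples would produce two nearly identical sentences; grouping them first yields shorter, more natural text.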
Device Independence at Presentation Layer
• Skim through Sect. 8.4
Advanced Semantic Querying
• Leveraging the Expressivity of Grounded Conjunctive Query Languages
  • By Alissa Kaplunova, Ralf Möller and Michael Wessel
Conferences
• ESWC-08 Workshop on Semantic Search:
  • Workshop of ESWC 2008, 1-5 June 2008, Tenerife, Spain.
  • In recent years we have witnessed tremendous interest and substantial economic exploitation of search technologies. On the other hand semantic repositories and reasoning engines have advanced to a state where querying and processing of this knowledge can scale to realistic IR scenarios. As such, semantic technologies are now in a state to provide significant contributions to IR problems. This workshop intends to investigate the potential and the challenges of Semantic Search systems. Main topics of interest of the workshop cluster around the areas:
    • Tasks and Interaction Paradigms for Semantic Search,
    • Query Construction and Resource Modelling for Semantic Search,
    • Algorithms and Infrastructures for Semantic Search, and
    • Evaluation of Semantic Search.
Commercial Conferences
• Data Modeling Seminars and Workshops by Wilshire Conferences:
  • Designing and Building Ontologies: An ontology is a formal description of the meaning of the information stored in a system. It resembles a conceptual model, but goes much beyond a conceptual model in that the formal definitions allow the system to infer class membership based on properties. Additionally, inference engines, running on ontologies, allow users to extract and integrate information stored in distributed systems. This workshop, which will contain a number of live demos and student exercises, will cover practical issues in employing ontologies. A 4-day seminar with Dave McComb and Simon Robe, $1795.
  • DAMA Symposium + Wilshire Meta-Data Conference
  • 2007 Semantic Technology Conference
References
• John Davies, Rudi Studer, Paul Warren (Editors): Semantic Web Technologies: Trends and Research in Ontology-based Systems, John Wiley & Sons (July 11, 2006). ISBN: 0470025964. Ch. 8, pp. 139-169.
• W3C Semantic Web Tools Wiki page:
  • Check Jena, SemWeb, Protégé, Swoop, etc.