Semantically Enhanced Desktop Search Using Directory-Based Clustering and WordNet Knowledge
Ştefania GHIŢĂ
Content
• Project Overview
• Google
• Purpose
• Structure
• Photo Prototype
• Offline Content Prototype
• Conclusions
Hannover
Project Overview
• Background
  • Search
    • No personalization (user preferences)
    • No context
  • Topic classification in DMOZ
• Purpose
  • Contextualize / personalize search using additional metadata
• Advantages
  • Precision of search
  • Expressiveness of search results
Google
• A possible solution – indexing data on the PC (Google):
  • Increases search efficiency
  • Doesn’t use specific characteristics of the user, such as:
    • Folder hierarchies
    • Browser caches
Purpose
• Finding new solutions for:
  • Increasing precision of search according to the user’s profile
  • Improving expressiveness of search results by adding information to the search
  • Ranking the search results
• Metadata as the answer to these problems
Structure
• How to characterize and obtain a user profile
• Define metadata models for different types of information
• Automatically generate such metadata
• Enrich data by adding additional information: WordNet
• Extend additional information using file structure and user behaviour
• Search engine that uses the metadata
Photo prototype
• /My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg
• <location_info>Holidays</location_info>
• …
• <location_info>building</location_info>
• <lastModified>date</lastModified>
• <sizeBytes>XX</sizeBytes>
  <resolution>0</resolution>
  <sizeX>(pixels)</sizeX>
  <sizeY>(pixels)</sizeY>
  <colorScheme>X</colorScheme>
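The directory-to-metadata step on this slide can be sketched in Python as follows. This is a minimal illustration, not the prototype's actual code: the function names, the `photo` root element, the `base` folder argument, and the size and date values are all assumptions made for the example. Only the element names `location_info`, `lastModified`, and `sizeBytes` come from the slide.

```python
from pathlib import PurePosixPath
import xml.etree.ElementTree as ET


def location_terms(path, base="/My Pictures"):
    """Return the folder names below `base` plus the file stem, in path order."""
    rel = PurePosixPath(path).relative_to(base)
    return list(rel.parent.parts) + [rel.stem]


def photo_metadata(path, size_bytes, last_modified):
    """Build a flat metadata record for one photo (hypothetical layout)."""
    root = ET.Element("photo", {"path": path})
    for term in location_terms(path):
        ET.SubElement(root, "location_info").text = term
    ET.SubElement(root, "lastModified").text = last_modified
    ET.SubElement(root, "sizeBytes").text = str(size_bytes)
    return root


# Size and date below are placeholder values, not taken from the slides.
record = photo_metadata(
    "/My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg",
    size_bytes=1024,
    last_modified="date",
)
print(ET.tostring(record, encoding="unicode"))
```

Each folder name becomes one `location_info` entry, so a photo inherits searchable terms from where the user chose to store it.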
Enriching Data with WordNet
• Holidays/Germany/Hannover → RDF
• Add WordNet extensions:
  • Synonyms
  • Holonyms (Germany is a part of …)
  • Meronyms (Germany has part …)
  • Hypernyms (Holiday is a kind of …)
  • Hyponyms (… is a kind of Holiday)
  • Troponyms
Example
<rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\cat.jpg">
  <j.0:location_info>C:\Stefi\</j.0:location_info>
  <j.0:location_info>C:\Stefi\L3S\</j.0:location_info>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\">
      <j.0:sense>beautiful</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info rdf:resource="file:\\C:\Stefi\L3S\beautiful\home\"/>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\">
      <j.0:sense>plant</j.0:sense>
      <j.0:sense>establish</j.0:sense>
      <j.0:sense>implant</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info>cat</j.0:location_info>
  <j.0:sense>cat</j.0:sense>
  <j.0:sense>kat</j.0:sense>
  <j.0:sense>guy</j.0:sense>
  <j.0:sense>cat-o'-nine-tails</j.0:sense>
  <j.0:sense>big_cat</j.0:sense>
  <j.0:sense>vomit</j.0:sense>
  <j.0:sense>Caterpillar</j.0:sense>
  <j.0:sense>computerized_tomography</j.0:sense>
  <j.0:lastModified>Tue Oct 26 17:36:44 CEST 2004</j.0:lastModified>
  <j.0:sizeBytes>291851</j.0:sizeBytes>
</rdf:Description>
</rdf:RDF>
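The enrichment visible in this example, where each path term is annotated with related words, can be sketched roughly as below. The lookup table is a toy stand-in whose entries are copied from the example itself; `enrich`, `toy_lookup`, and `TOY_SENSES` are hypothetical names, and the prototype's real lookup is not shown in the slides. One plausible real backend would be NLTK's WordNet interface (`wordnet.synsets(word)` and each synset's `lemma_names()`), but that is an assumption, not something the slides state.

```python
# Toy stand-in for a WordNet lookup; entries are copied from the slide's example.
TOY_SENSES = {
    "beautiful": ["beautiful"],
    "plant": ["plant", "establish", "implant"],
    "cat": ["cat", "kat", "guy", "cat-o'-nine-tails", "big_cat", "vomit",
            "Caterpillar", "computerized_tomography"],
}


def toy_lookup(word):
    """Return the related words for `word`, or an empty list if unknown."""
    return TOY_SENSES.get(word, [])


def enrich(terms, lookup):
    """Map each path term to the words returned by `lookup`, skipping terms with none."""
    enriched = {}
    for term in terms:
        senses = lookup(term.lower())
        if senses:
            enriched[term] = list(senses)
    return enriched


# Path terms from the example: C:\Stefi\L3S\beautiful\home\plant\cat.jpg
senses = enrich(["Stefi", "L3S", "beautiful", "home", "plant", "cat"], toy_lookup)
for term, words in senses.items():
    print(term, "->", ", ".join(words))
```

Passing the lookup in as a function keeps the enrichment step independent of the lexical resource, so the toy table can be swapped for a real one without changing `enrich`. As the example output shows, words from every sense of "cat" are attached, which suggests that no word-sense disambiguation is applied at this stage.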
Offline Content Prototype
• Additional information for the user’s profile
  • Browsing behaviour
  • Relevant results
  • Additional context for results
• Structure:
  • ID of the page
  • Date of access
  • Link from which the user came
  • Links accessed on the page
  • Other annotations of the content
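The record structure listed above could be modelled along these lines. This is only a sketch: the class name, the field names, and the simplification that a referrer is stored as another page's ID are assumptions for illustration, and the sample visits use made-up IDs and timestamps.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class PageVisit:
    """One visit to a cached page (field names are hypothetical)."""
    page_id: str                                               # ID of the page
    accessed_at: datetime                                      # date of access
    referrer: Optional[str] = None                             # page the user came from
    followed_links: List[str] = field(default_factory=list)    # links accessed on the page
    annotations: Dict[str, str] = field(default_factory=dict)  # other annotations


def browsing_trail(visits, page_id):
    """Follow referrers backwards from `page_id`; return page IDs, oldest first."""
    by_id = {v.page_id: v for v in visits}
    trail, seen = [], set()
    current = page_id
    while current in by_id and current not in seen:  # `seen` guards against cycles
        seen.add(current)
        trail.append(current)
        current = by_id[current].referrer
    return list(reversed(trail))


# Made-up visits for illustration only.
visits = [
    PageVisit("a", datetime(2004, 10, 26, 17, 0), followed_links=["b"]),
    PageVisit("b", datetime(2004, 10, 26, 17, 5), referrer="a", followed_links=["c"]),
    PageVisit("c", datetime(2004, 10, 26, 17, 9), referrer="b"),
]
print(browsing_trail(visits, "c"))
```

Storing the referrer with each visit is what makes the browsing path recoverable, which is one way the "additional context for results" on this slide could be derived.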
Conclusion
• Metadata models for contextualized search for different types of files
• Tools for automatically generating metadata
• Tools for enriching metadata
• Search engine and algorithms that use the metadata