This Dagstuhl Seminar discusses the importance of making database and information retrieval (IR) systems socially meaningful by incorporating personalized and socially relevant recommendations. The seminar explores topics such as social recommender systems, personalization, and relevance feedback for interactive query refinement.
Outline • About Dagstuhl Seminars • Interesting Talks • Social Recommender Systems • Personalization • Relevance Feedback for Interactive Query Refinement • Conclusions
About Dagstuhl Seminars • Participants: renowned researchers of international standing and promising young scholars. • Report on current, as yet unconcluded research work and ideas, and conduct in-depth discussions. • Almost every week throughout the year. • http://www.dagstuhl.de/en/program/dagstuhl-seminars/
Outline • About Dagstuhl Seminars • Interesting Talks • Social Recommender Systems • Making DB and IR (Socially) Meaningful • Personalization • Relevance Feedback for Interactive Query Refinement • Conclusions
Making DB and IR (Socially) Meaningful Sihem Amer-Yahia (Yahoo Research - New York)
Motivation Social Content Sites • Lots of data and opinions • Collaborative tagging sites: Flickr, del.icio.us, etc. • Collaborative reviewing sites: Y! Movies, Y! Local, etc. • Key features of these sites • User-contributed content, user relationships (the user's network), user ratings • Hotlists, search results, and recommendations are offered to users as lists of ranked content. • The accuracy of ranking is tied not only to relevance (in the traditional Web sense) but also to the people whose opinions matter
Recommendations (Amazon) But who are these people?
New Ranking Semantics • Ranking is not only about relevance (in the traditional Web sense) but also about the people whose opinions matter. • Take social connection information into account. • Relevance Factors • Text features: title, description (TF-IDF) • Timeliness and freshness • Incoming links and tags • Popularity • Social distance: the user's social network
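The slide lists the relevance factors but not how they are combined. A minimal Python sketch of one plausible aggregation, where the signal names, the freshness decay constant, and the weights are all illustrative assumptions rather than the talk's actual model:

```python
import math

def social_score(item, user, weights):
    """Combine traditional relevance signals with a social-distance factor.

    The field names and the linear combination are assumptions; the slide
    only enumerates the factors.
    """
    text = item["tfidf"]                                # text features (title, description)
    freshness = math.exp(-item["age_days"] / 30.0)      # timeliness: decays over ~a month
    popularity = math.log1p(item["links"] + item["tags"])  # incoming links and tags
    # Social distance: hops in the user's network (1 = friend, 2 = friend-of-friend, ...)
    hops = user["distance"].get(item["author"], 99)
    social = 1.0 / hops
    w = weights
    return (w["text"] * text + w["fresh"] * freshness
            + w["pop"] * popularity + w["social"] * social)

weights = {"text": 0.4, "fresh": 0.2, "pop": 0.1, "social": 0.3}
item = {"tfidf": 0.8, "age_days": 3, "links": 12, "tags": 40, "author": "ann"}
user = {"distance": {"ann": 1, "bob": 2}}
print(social_score(item, user, weights))
```

With identical text, freshness, and popularity signals, content from a direct friend outscores content from a stranger, which is exactly the "people whose opinions matter" point.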
Conclusions • Efficient and effective recommendation platform • Serve (socially!) relevant content to users • Better recommendations • Relevance determined by the people who matter to you! • Social context, explanation, diversity, temporality, etc. • Characterize users' interests and connections • I enjoy watching Schindler's List with my parents • and very different movies with my friends!
Outline • About Dagstuhl Seminars • Interesting Talks • Social Recommender Systems • Personalization • Personalizing XML Search with PIMENTO • Multidimensional Search for Personal Information Systems • Relevance Feedback for Interactive Query Refinement • Conclusions
Personalizing XML Search with PIMENTO Irini Fundulaki (ICS-FORTH, Greece)
Motivation • XML search has become popular • Personalization is becoming important • Large number of users, with focused and different needs. • XML Personalization is essential!
Example [slide shows a car-dealer XML tree: car elements with des, location, make, color, and price children, e.g. "low mileage", "good condition", NYC, NJ, red, Mustang, $1000] Looking for a car, with a price lower than $2000, in good condition. One user resides in NYC and prefers red cars with low mileage. A different user prefers Mustangs.
Summary • XML queries are both on structure and content • Given a user interest: • Customize query context: modify candidate set of answers using conditions on both structure and keywords • Customize ranking of answers • Adapt top-k processing to account for user interests.
PIMENTO • Query: [slide shows a query tree over car, with des ftcontains("good condition"), location, and price < 2000] • User Profiles: • Scoping rules: if true then add parent(car, location) & ftcontains(location, "NYC") • Ordering rules: x.tag = car & y.tag = car & x.color = 'red' & y.color ≠ 'red' → x < y • Query Personalization: • Rewriting a user query using scoping rules • Ranking query answers using ordering rules
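A minimal sketch of how the two rule kinds from the slide could act on the car example: scoping rules rewrite the query (adding a condition) before evaluation, ordering rules sort the answers. The record encoding and the rule representation as Python functions are assumptions; PIMENTO states the rules declaratively over XML.

```python
from functools import cmp_to_key

cars = [
    {"make": "Mustang", "color": "red",  "price": 1800, "location": "NYC"},
    {"make": "Civic",   "color": "blue", "price": 1500, "location": "NYC"},
    {"make": "Focus",   "color": "red",  "price": 1900, "location": "NJ"},
]

def apply_scoping(cars):
    # Scoping rule: "if true then add location = NYC" -- restricts the
    # candidate answer set before the user's own conditions are evaluated.
    return [c for c in cars if c["location"] == "NYC"]

def ordering(x, y):
    # Ordering rule: x.color = 'red' & y.color != 'red' -> x < y
    # (red cars rank before non-red ones).
    if x["color"] == "red" and y["color"] != "red":
        return -1
    if y["color"] == "red" and x["color"] != "red":
        return 1
    return 0

def personalize(cars, price_limit=2000):
    # User query: price < 2000; personalization adds scoping + ordering.
    answers = [c for c in apply_scoping(cars) if c["price"] < price_limit]
    return sorted(answers, key=cmp_to_key(ordering))

for car in personalize(cars):
    print(car["make"], car["color"])
```

The NJ Focus is scoped out entirely, and among the remaining NYC cars the red Mustang is ranked above the blue Civic.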
Outline • About Dagstuhl Seminars • Interesting Talks • Social Recommender Systems • Personalization • Personalizing XML Search with PIMENTO • Multidimensional Search for Personal Information Systems • Relevance Feedback for Interactive Query Refinement • Conclusions
Multi-Dimensional Search for Personal Information Systems Amélie Marian (Rutgers University)
Motivation • Large collections of heterogeneous data. • Need a simple and efficient search approach. • Typical desktop search tools use • Keyword search for ranking • Possibly some additional conditions (e.g. metadata, structure) for filtering • e.g. Find a pdf file created on March 21, 2007 that contains the words "proposal draft" • Filtering conditions: *.pdf, 03/21/2007 • Ranking expression: "proposal draft" • This misses some relevant files: *.txt documents created on 03/21/2007 that contain the words "proposal draft"
Multi-Dimensional Search • Allow users to provide fuzzy structure and metadata conditions in addition to keyword conditions. • Three query dimensions: (content, structure, metadata) • Example: • For $i In /File[FileSysMetadata/FileDate='03/21/07'] For $j In /File[ContentSummary/WordInfo/Term='proposal' AND ContentSummary/WordInfo/Term='draft'] For $m In /File[FileSysMetadata/FileType='pdf'] WHERE $i/@fileID=$j/@fileID AND $i/@fileID=$m/@fileID RETURN $i/fileName • Individually score each dimension and then integrate the three dimension scores into a meaningful unified score.
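A sketch of the per-dimension scoring idea, using the "proposal draft" example from the motivation slide. The individual scoring functions and weights are illustrative assumptions; the point is that fuzzy matching per dimension lets a *.txt file outrank a non-matching *.pdf instead of being filtered out:

```python
def content_score(f, terms):
    # Keyword dimension: fraction of words in the file matching the query terms.
    words = f["content"].lower().split()
    return sum(words.count(t) for t in terms) / max(len(words), 1)

def metadata_score(f, day):
    # Metadata dimension, made fuzzy: exact day scores 1, nearby days decay.
    return max(0.0, 1.0 - abs(f["day"] - day) / 3.0)

def structure_score(f, ext):
    # Structure dimension, made fuzzy: exact extension 1.0, related formats 0.5.
    related = {"pdf": {"txt": 0.5, "doc": 0.5}}
    if f["ext"] == ext:
        return 1.0
    return related.get(ext, {}).get(f["ext"], 0.0)

def unified(f, terms, day, ext, w=(0.5, 0.25, 0.25)):
    # Integrate the three dimension scores into one unified score.
    return (w[0] * content_score(f, terms)
            + w[1] * metadata_score(f, day)
            + w[2] * structure_score(f, ext))

files = [
    {"name": "a.pdf", "ext": "pdf", "day": 21, "content": "final report"},
    {"name": "b.txt", "ext": "txt", "day": 21, "content": "proposal draft v2"},
]
ranked = sorted(files, key=lambda f: -unified(f, ["proposal", "draft"], 21, "pdf"))
print([f["name"] for f in ranked])
```

Strict filtering on *.pdf would discard b.txt even though it is the only file containing "proposal draft"; fuzzy per-dimension scoring ranks it first.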
Outline • About Dagstuhl Seminars • Interesting Talks • Social Recommender Systems • Personalization • Relevance Feedback for Interactive Query Refinement • Conclusions
Relevance Feedback in TopX Search Engine Ralf Schenkel
Users vs. Structural XML IR I need information about a professor in SB who teaches IR. //professor[contains(., SB) and contains(.//course, IR)] • Structural query languages do not work in practice: • Schema is unknown or heterogeneous • Language is too complex • Results are often unsatisfying • Systems can support users in generating good structured queries: • User interfaces (advanced search) • Natural language processing • Interactive query refinement
Relevance Feedback for Interactive Query Refinement [slide shows the feedback loop between the user and the query-evaluation engine with Fagin-style index access] 1. User submits query 2. User marks relevant and nonrelevant docs 3. System finds the best terms to distinguish between relevant and nonrelevant docs 4. System submits the expanded query • Feedback for XML IR: • Start with a keyword query • Find structural expansions • Create a structural query
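Step 3 (finding terms that best separate relevant from nonrelevant documents) can be sketched with a simple Rocchio-style weight, frequency among relevant documents minus frequency among nonrelevant ones. This scoring choice is an assumption for illustration; TopX's actual feedback model uses richer structural feature weights:

```python
from collections import Counter

def expansion_terms(relevant, nonrelevant, k=2):
    """Pick the k terms that best distinguish relevant from nonrelevant docs.

    Weight(t) = freq of t among relevant docs - freq among nonrelevant docs
    (a simplified Rocchio-style criterion, assumed here for illustration).
    """
    rel = Counter(w for d in relevant for w in d.split())
    nonrel = Counter(w for d in nonrelevant for w in d.split())
    weights = {t: rel[t] / len(relevant) - nonrel.get(t, 0) / max(len(nonrelevant), 1)
               for t in rel}
    return [t for t, _ in sorted(weights.items(), key=lambda x: -x[1])[:k]]

# Toy feedback round for the query "query evaluation":
relevant = ["xml query evaluation index", "fagin algorithm xml index"]
nonrelevant = ["query evaluation database transactions"]
print(expansion_terms(relevant, nonrelevant))
```

Terms like "xml" and "index" appear in every relevant document and no nonrelevant one, so they become the expansion terms for the resubmitted query.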
Structural Features [slide shows an article tree (frontmatter with author Baeza-Yates, body with sections, subsections, and paragraphs, backmatter); the user marks a subsec about XML as a relevant result] Possible features: • C: Content of result, e.g. XML • D: Tag + content of descendants, e.g. p[XSLT] • A: Tag + content of ancestors, e.g. sec[data] • AD: Tag + content of descendants of ancestors, e.g. article//author[Baeza]
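The four feature classes can be made concrete on a toy version of the slide's article tree. The feature naming (C, D, A, AD) follows the slide; the toy document and the extraction code itself are illustrative assumptions:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<article><frontmatter><author>Baeza-Yates</author></frontmatter>"
    "<body><sec>Semistructured data"
    "<subsec><p>With the advent of XSLT</p></subsec></sec></body></article>")

result = doc.find(".//subsec")  # the element the user marked as relevant

def features(root, node):
    parents = {c: p for p in root.iter() for c in p}  # child -> parent map
    feats = {"C": " ".join(node.itertext()).split()}  # C: content of result
    feats["D"] = [(d.tag, (d.text or "").strip())     # D: tag + content of descendants
                  for d in node.iter() if d is not node]
    anc, n = [], node
    while n in parents:                               # walk up to the root
        n = parents[n]
        anc.append(n)
    feats["A"] = [(a.tag, (a.text or "").strip()) for a in anc]  # A: ancestors
    within = set(node.iter())
    feats["AD"] = [(d.tag, (d.text or "").strip())    # AD: descendants of ancestors
                   for a in anc for d in a.iter()
                   if d is not a and d not in within]
    return feats

f = features(doc, result)
print(f["A"])   # ancestor tags and content, nearest first
```

For the marked subsec this yields C = the words of "With the advent of XSLT", D containing p[XSLT], A containing sec[Semistructured data], and AD containing article//author[Baeza-Yates], mirroring the slide's four examples.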
Query Construction • Initial query: *[query evaluation] • Expanded with the extracted features: • C: XML → *[query evaluation XML] • D: p[XSLT] (attached via the descendant-or-self axis) • A: sec[data] • AD: article//author[Baeza] (needs schema information!)
Conclusions • Queries with structural constraints to improve result quality • Relevance Feedback to create such queries • Structure of collection matters a lot
Outline • About Dagstuhl Seminars • Interesting Talks • Social Recommender Systems • Personalization • Relevance Feedback for Interactive Query Refinement • Conclusions
Conclusions (1) • XQuery and exact matches for querying XML documents are not likely to be sufficient. • Techniques based on approximate matching of query content and structure, scoring potential answers, and returning a ranked list of answers are more appropriate. • Trend: blending together the techniques addressed by DB, IR and the Web/Applications communities.
Conclusions (2) • Opinions on ranking: • Ranking for XML search should take the structure into account (not only the contents) • Add scoring to ranking with preferences • Declare properties of scoring functions/processes so they can be matched against application needs • The accuracy of ranking in social networks is tied to user behavior • Data uncertainty presents interesting issues in uncertain top-k queries
Thank you! Questions?