
Query Models

Query Models. CSCI 572: Information Retrieval and Search Engines Summer 2010. Outline. Discovering your data General Approaches Forms-based (fielded) Facet/Guided Navigation Free-text Advanced approaches Clustering/Concept Map Geospatial/Local Search What’s out there.


Presentation Transcript


  1. Query Models CSCI 572: Information Retrieval and Search Engines Summer 2010

  2. Outline • Discovering your data • General Approaches • Forms-based (fielded) • Facet/Guided Navigation • Free-text • Advanced approaches • Clustering/Concept Map • Geospatial/Local Search • What’s out there

  3. So, you’ve got your data indexed • …what do you do with it? Free-text Facet/Guided

  4. Another example Facets

  5. Yet another example • Fill out form fields and click submit

  6. Generic Query Models • Initially forms-based approaches were extremely common because of their appeal to a particular domain • Usually expose the interesting parts of the data only known by those domain experts • In other words, the users know what they are interested in searching for • Type Specificity • If it’s a date, show a calendar, if it’s a number, limit its input box size, etc.

  7. Generic Query Models • Google popularized free-text search • Though plenty of other companies have offered it since the 1990s • Recall the lecture on Search Engines and the evolution of the web • Type specificity • Free-text search is harder, since it’s difficult to detect parameter types • “cars 2006” • Is the user searching for the movie with the title field “cars 2006”, or is the user looking for cars with the year field (a YYYY formatted integer) set to 2006?
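
The “cars 2006” ambiguity can be made concrete with a small sketch that lists the candidate readings of a free-text query instead of committing to one. The field names `title`, `text`, and `year` and the 19xx/20xx year pattern are assumptions made for illustration; they are not taken from the slides.

```python
import re

# Assumption: a "year" is a four-digit token starting with 19 or 20.
YEAR = re.compile(r"^(19|20)\d{2}$")

def interpretations(query):
    """Return the plausible fielded readings of a free-text query."""
    tokens = query.split()
    # Reading 1: the whole string is a title.
    candidates = [{"title": query}]
    years = [t for t in tokens if YEAR.match(t)]
    rest = [t for t in tokens if not YEAR.match(t)]
    # Reading 2: a year-shaped token is a typed constraint on the other terms.
    if years and rest:
        candidates.append({"text": " ".join(rest), "year": int(years[0])})
    return candidates

readings = interpretations("cars 2006")
for reading in readings:
    print(reading)
```

A real system would still have to rank these readings, for example by checking which fields actually contain matching values.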

  8. Generic Query Models • Facet-based • Also called “guided navigation”, the model was popularized by eBay and Yahoo! early on, and then by Google and others later • The data is “bucketed” into groups, which are essentially views into the value space of indexed attributes • Example: index 4 documents, with an “author” field • Doc1, author=Chris • Doc2, author=Sam • Doc3, author=Sam • Doc4, author=Bob • Resulting facet: author: Chris (1), Sam (2), Bob (1)
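
The bucketing in the four-document example can be sketched in a few lines. This is a minimal in-memory illustration, not how a search engine computes facets at scale; the helper names `facet_counts` and `refine` are hypothetical.

```python
from collections import Counter

# The four documents from the slide, each with an "author" field.
docs = [
    {"id": "Doc1", "author": "Chris"},
    {"id": "Doc2", "author": "Sam"},
    {"id": "Doc3", "author": "Sam"},
    {"id": "Doc4", "author": "Bob"},
]

def facet_counts(documents, field):
    """Count how many documents fall into each value bucket of `field`."""
    return Counter(doc[field] for doc in documents if field in doc)

def refine(documents, field, value):
    """Selecting a facet value narrows the result set to that bucket."""
    return [doc for doc in documents if doc.get(field) == value]

author_facet = facet_counts(docs, "author")
for value, count in author_facet.items():
    print(f"{value} ({count})")
```

Clicking “Sam (2)” in a guided-navigation interface would then correspond to `refine(docs, "author", "Sam")`, which keeps Doc2 and Doc3.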

  9. Faceted-Navigation example • Usually combined with free-text element as well • Type Specificity • Implied Selected Facets Further refinement

  10. Hybrid Approaches • Typically, the aforementioned three general query models are combined to form powerful, hybrid approaches • Guided Navigation/Faceting almost always includes a free-text element • Sometimes it even includes forms-based elements

  11. Hybrid Approaches • All 3 combined, another example

  12. Query languages • Usually specific to a particular model (1:1) • KEV models (keyword=value) • Forms-based • attribute:value AND attribute2:value2… • Logical Operators (AND/OR/NOT, others) • Comparators (>, >=, <=, <, etc.) • Use of “ ” denotes entire phrase rather than tokenized • Use of attribute:[startrange TO endrange] indicates range • Facet-based • In many ways, just refinements and restrictions of KEV type models above
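
A forms-based front end typically turns filled-in fields into a query string of this shape. The sketch below assembles one using the `attribute:value`, phrase, and `[start TO end]` range conventions listed on the slide. The helper names are hypothetical, the field names `title` and `year` are made up for the example, and escaping of special characters is deliberately left out.

```python
def term(attribute, value):
    """attribute:value, quoting the value as a phrase if it contains a space."""
    value = str(value)
    if " " in value:
        value = f'"{value}"'
    return f"{attribute}:{value}"

def value_range(attribute, start, end):
    """attribute:[start TO end] range clause."""
    return f"{attribute}:[{start} TO {end}]"

def combine(operator, *clauses):
    """Join clauses with a logical operator and parenthesize the group."""
    if operator not in {"AND", "OR"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(clauses) + ")"

def negate(clause):
    """Exclude documents matching the clause."""
    return f"NOT {clause}"

query = combine("AND", term("title", "cars"), value_range("year", 2005, 2007))
print(query)
```

A facet selection fits the same model: choosing a facet value simply appends another `attribute:value` clause to the existing query.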

  13. Query languages • Free-text • You are given so little information and have to sense so much richness, so this is where information retrieval techniques come in • IR query models • Must understand Analyzers • Must understand Stopwords, Tokenizers, Lexical analysis, Language analysis (and detection) • Must understand (somehow) underlying field types • Default attributes • Inclusion/Exclusion of terms

  14. IR Example • “cars 2006” • Inclusion of “ ” indicates whole string match • Default field is searchableTxt, which is made up of • page text description • link alt text • page title • … • Type: string field • Tokenization (“ ”, “/”, etc.) • Stop-word removal • Eventual query: searchableTxt:"cars 2006" OR page text description:cars OR link alt text:cars OR page title:cars
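
The analysis steps named above (tokenization, stop-word removal, default field) can be sketched as follows. The stop-word list, the delimiter set, and the choice to OR every remaining term against the default field are assumptions for illustration; the sketch does not reproduce the slide's expansion into the constituent fields.

```python
import re

# Assumption: a tiny illustrative stop-word list, not a real analyzer's.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}
DEFAULT_FIELD = "searchableTxt"

def analyze(text):
    """Lowercase, split on spaces and slashes, then drop stop words."""
    tokens = [t for t in re.split(r"[ /]+", text.lower()) if t]
    return [t for t in tokens if t not in STOPWORDS]

def to_query(user_input, default_field=DEFAULT_FIELD):
    """Turn raw user input into a query against the default field."""
    text = user_input.strip()
    # Surrounding double quotes request a whole-string (phrase) match.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return f'{default_field}:"{text[1:-1]}"'
    # Otherwise each surviving token becomes its own clause.
    return " OR ".join(f"{default_field}:{t}" for t in analyze(text))

print(to_query('"cars 2006"'))
print(to_query("the cars/trucks of 2006"))
```

The quoted input stays a single phrase, while the unquoted input is broken into independent terms after the stop words are removed.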

  15. Advanced Query Model Approaches • Clustering • Facets sensed automatically based on text analysis • TF-IDF to find the most salient terms (frequent within a document, rare across the collection)
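
As a rough illustration of how TF-IDF can surface candidate facet or cluster labels, the sketch below scores terms with one common formulation, term frequency times log(N / document frequency). The three token lists are invented sample data, and production systems usually apply smoothing and normalization that are omitted here.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores

def top_terms(docs, k=1):
    """Highest-scoring k terms per document, ties broken alphabetically."""
    return [sorted(s, key=lambda t: (-s[t], t))[:k] for s in tfidf(docs)]

# Invented sample data: "cars" occurs in every document.
sample = [
    ["cars", "engine", "cars"],
    ["cars", "movie", "pixar"],
    ["cars", "dealer", "price"],
]
print(top_terms(sample))
```

Because “cars” appears in every sample document, its score is zero even though it is the most frequent token, which is why it would make a poor facet label for this collection.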

  16. Local Search • GIS methods • Point/radius • Bounding box • Polygon • Combine with other approaches • Free-text • Facet
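
Point/radius and bounding-box filters can be sketched with a brute-force scan. The haversine formula and a spherical Earth of radius 6371 km are assumptions of this sketch, the place list is illustrative sample data with approximate coordinates, and a real system would use a spatial index rather than testing every record.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def within_radius(places, lat, lon, radius_km):
    """Point/radius search: keep places no farther than radius_km."""
    return [p for p in places if haversine_km(lat, lon, p["lat"], p["lon"]) <= radius_km]

def within_box(places, south, west, north, east):
    """Bounding-box search; assumes the box does not cross the antimeridian."""
    return [p for p in places if south <= p["lat"] <= north and west <= p["lon"] <= east]

# Illustrative sample data (approximate coordinates).
places = [
    {"name": "Los Angeles", "lat": 34.05, "lon": -118.24},
    {"name": "Pasadena", "lat": 34.15, "lon": -118.14},
    {"name": "New York", "lat": 40.71, "lon": -74.01},
]
nearby = within_radius(places, 34.05, -118.24, 30)
print([p["name"] for p in nearby])
```

Combining this with the other models is a matter of intersecting result sets: apply the spatial filter, then a free-text match or facet refinement on what remains.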

  17. GIS search • Overlay “layers” to navigate and search through information space • Typically used with Local approach to deliver search

  18. GIS search challenges • Sometimes the data isn’t annotated with lat and lon • How to discover this? • Even when the data is annotated with spatial information, computation of, e.g., a bounding box around the poles is difficult • Efficiency and speed are difficult since data is at scale
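
To show why bounding boxes near the poles need special care, here is one possible way to derive a box from a point and radius. It assumes a spherical Earth and uses a standard spherical-trigonometry bound for the longitude span. The two awkward cases are handled explicitly: a circle that contains a pole spans every longitude, and a box that crosses the ±180° meridian ends up with west greater than east.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius

def bounding_box(lat, lon, radius_km):
    """Return (south, west, north, east) in degrees enclosing the circle."""
    ang = math.degrees(radius_km / EARTH_RADIUS_KM)  # angular radius
    south, north = lat - ang, lat + ang
    if north >= 90.0 or south <= -90.0:
        # The circle contains a pole, so every longitude is inside the box.
        return max(south, -90.0), -180.0, min(north, 90.0), 180.0
    # Longitude half-width grows as the center approaches a pole.
    dlon = math.degrees(
        math.asin(math.sin(math.radians(ang)) / math.cos(math.radians(lat)))
    )
    # Normalize into [-180, 180); west > east signals an antimeridian crossing.
    west = (lon - dlon + 180.0) % 360.0 - 180.0
    east = (lon + dlon + 180.0) % 360.0 - 180.0
    return south, west, north, east

def lon_in_box(lon, west, east):
    """Longitude containment that tolerates antimeridian-crossing boxes."""
    if west <= east:
        return west <= lon <= east
    return lon >= west or lon <= east

polar = bounding_box(89.9, 0.0, 50.0)
dateline = bounding_box(0.0, 179.9, 50.0)
print(polar)
print(dateline)
```

A naive `west <= lon <= east` test would reject every point for the second box, which is the kind of silent failure that makes these edge cases hard to get right.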

  19. Wrapup • Plenty of query models out there to discover your data that you’ve indexed • Typically combined together to form powerful “hybrid” and rich query interfaces • Important to understand underlying data complexity • Type specificity • Structure • Query languages are guided, but not always 1:1 with general query models
