Enterprise Search – Where do we go from here?
Aya Soffer, PhD
DGM, Information and Interaction Technologies
IBM Haifa Research Lab
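The World of Search
• Content axis: Public (owned and stored by others), “global” vs. Proprietary (owned & stored by the SE), “local”
• Access axis: Public, “outward facing” vs. Private, “inward facing”
• Public access, public content: Web search
• Public access, proprietary content: E-commerce, news, public documentation, government, …
• Private access, public content: Market intelligence, news tracking, job data mining
• Private access, proprietary content: Content management, groupware, “Intranet search”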
User Expectations Shape Products
• Global / Web search (public content)
  • Users expect relevant answers to very low-content queries: elections, Olympics
  • Search engines deliver!
  • Hence reinforcement of user expectations.
• The global context is extremely popular
  • Shapes user expectations at the workplace
  • Shapes enterprise search products
• Enterprise search (proprietary content)
  • Users expect a similar interaction style
  • Yet, users are more sophisticated
Enterprise Search: Harder than Web Search
• Far fewer resources, but high expectations
• Data is not “search friendly”
• Must index everything: find everything I can access
• Security: but show me only what I am allowed to see
• Link-based methods are not as effective: not enough linkage
• Search is not cheap: about 5-10 cents per document per year, yet ROI is hard to calculate
Enterprise Search: Easier than Web Search
• Smaller scale
  • Corpus is much smaller
  • Query load is much lower
• Less anarchic
  • Central authority
  • Data formats can be controlled
• More potential structure
  • Data is better organized and richer
  • Can tap into organizational knowledge
• No spam
  • At least not intentional
Enterprise Search Trends
• Information finding goes beyond simple keyword search in a search bar
  • Text analytics and semantic search allow searching by concepts instead of by keywords
  • Understand content, understand user intent
• Collaboration technology is fundamentally changing how people organize and access content
  • Democratic tagging of data as a bridge across multiple taxonomies
  • Discover relationships, find experts, utilize wisdom of the crowds and social networks to find best matches
• Information finding isn’t enough: need to provide means to explore search results
  • New visualizations for search results
  • Combined search / browse paradigm will be pervasive
  • Combination of search and BI: BI for the masses
• Search goes mobile
  • Access information in context from mobile devices
Social Search
• With the advent and popularity of Web 2.0, the ability to share and find information is being expanded beyond keyword search, mainly through the use of tags
• While Google's PageRank can be viewed as one of the first applications of the Web 2.0 concept of wisdom of crowds, search has yet to fully harness the power of Web 2.0
• Community-influenced search is beginning to appear in niche search engines and as betas in major search engines
  • Wikia Search
  • Yahoo! MyWeb
  • Google Co-Op
  • Eurekster
Web 2.0: The wildly read-write Web
[Diagram labels: Published Content; User-Generated Content; Collective Intelligence; 80,000,000 web sites; 1 billion global users; Content Providers; Content Consumers; Facilitators; Collaborators]
Search Today – Web 2.0 phenomena indirectly influencing search results
[Diagram labels: Published Content; User-Generated Content; User-Generated Metadata; Social Networks; Collective Intelligence; Indirect Influence]
Evolution of Search Technologies
• Information Retrieval (1st Generation)
  • User: Trained specialists
  • Scope: Small, closed collections
  • Technology: String matching; Boolean search; basic relevance ranking
  • Business impact: Vertical business domains (medical, legal)
• Web-based Search (2nd Generation)
  • User: Everyone
  • Scope: World Wide Web (HTML)
  • Technology: Hyperlink analysis for relevance ranking (good for Web, not as good in enterprise); categorization / summarization
  • Business impact: eCommerce and consumer market
• Information Discovery (3rd Generation)
  • User: Everyone & applications
  • Scope: Structured, semi-structured and unstructured information
  • Technology: Text analytics with novel linguistic and semantic processing; ranking on structure; faceted navigation
  • Business impact: Pervasive use in business processes to realize value from unstructured information
• Social Information Discovery (4th Generation)
  • User: Everyone & applications
  • Scope: Structured, semi-structured, unstructured information + networks
  • Technology: Community input; incorporation of wisdom of crowds; tags, tag clouds, social network data
  • Business impact: Pervasive use in business processes to realize value from unstructured information
Trend: Mutually Reinforcing Relationship
[Diagram: Data, Metadata, and People linked by relations]
• Data: code, user profile, email messages, received documents, authored documents, calendar entries, chat history
• Metadata: tags, folder label, subject of email, title of document, anchor text, query
• People: author, document owner, email address, username
• Relations: Author, Mentioned, Bookmark, Reader, Associated, TaggedBy
Web 2.0 Unified Search
• The result set includes relevant documents, people, and tags.
• People related to the result set, sorted by relevance
• Tags related to the result set, presented as a tag cloud
• Ranking is affected by the volume of tags and comments that are associated with each document.
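The slide says ranking is affected by the volume of tags and comments, but not how. The sketch below shows one plausible way to fold that signal into a score. The `Doc` fields, the log-damped boost, and the `weight` parameter are all assumptions made for illustration, not the formula used by the system shown.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    text_score: float  # keyword relevance from the base engine (assumed given)
    tags: int          # number of tags attached to the document
    comments: int      # number of comments attached to the document


def social_score(doc, weight=0.5):
    """Boost the text score by log-damped social activity (hypothetical formula)."""
    activity = doc.tags + doc.comments
    return doc.text_score * (1.0 + weight * math.log1p(activity))


def rank(docs):
    """Order documents by the combined score, best first."""
    return sorted(docs, key=social_score, reverse=True)
```

The logarithm keeps a heavily tagged document from drowning out textual relevance entirely; a document with no tags or comments keeps its original text score.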
Guided Navigation Meets Business Intelligence (BI for the Masses)
Multifaceted search or guided navigation
• Allows the simultaneous exploration of many aspects of a topic, and the gradual “zooming in” on the information target
• Reduces frustration: ensures that only valid choices are presented, so zooming in never yields an empty result set
• Solution to the problem of “few terms, too many results; more terms, no results”
• Supports browsing when the user doesn’t really know what to ask for in a multi-dimensional information space
• Multifaceted search is very popular in e-Commerce solutions (Amazon, eBay, buy.com, …), but is also relevant to more traditional text search applications
• New applications are emerging every day
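The "only valid choices" property can be sketched in a few lines: facet counts are computed from the current result set, so every value offered to the user matches at least one item. The catalogue records and field names below are invented for illustration.

```python
from collections import Counter

# Toy catalogue; records and facet names are hypothetical.
ITEMS = [
    {"title": "Item 1", "category": "Books", "brand": "Acme", "price": "low"},
    {"title": "Item 2", "category": "Books", "brand": "Zenith", "price": "high"},
    {"title": "Item 3", "category": "Music", "brand": "Acme", "price": "low"},
    {"title": "Item 4", "category": "Music", "brand": "Acme", "price": "high"},
]
FACETS = ("category", "brand", "price")


def refine(items, selections):
    """Keep the items that match every selected facet value."""
    return [it for it in items
            if all(it[f] == v for f, v in selections.items())]


def facet_counts(items, facets, selections):
    """Count the values of each not-yet-selected facet within the results."""
    return {f: Counter(it[f] for it in items)
            for f in facets if f not in selections}


selections = {"category": "Books"}
results = refine(ITEMS, selections)
counts = facet_counts(results, FACETS, selections)
```

Because `counts` is derived from `results`, a value such as brand "Zenith" is offered only while some book from that brand is still in the result set, so picking any offered value cannot lead to an empty page.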
Multifaceted Search – Example
[Screenshot callouts: Query; Current Context; Categories; Category Counts; Featured Dimensions; Other Dimensions; Search Results]
Guided Navigation Meets Business Intelligence
• Faceted Navigation Limitations
  • Current faceted navigation interfaces enable drilling down one facet at a time.
  • Counts and context are similarly presented for each facet separately
• Business Intelligence today is mainly for structured information
  • Assumes structure is known in advance
  • Offline processing
  • Complex report generation
• New direction: faceted navigation as a front end for business intelligence applications.
  • Facets can be defined on a combination of fields including aggregations and simple expressions.
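As a rough illustration of a facet that carries an aggregation, the sketch below returns, for each facet value, both the usual count and the mean of a numeric field over the current result set. The book records, field names, and the `facet_with_aggregate` helper are assumptions for the example, not part of any product described here.

```python
from collections import defaultdict

# Hypothetical result set for a book search.
BOOKS = [
    {"author": "Author X", "rating": 4.0, "sales_rank": 120},
    {"author": "Author X", "rating": 5.0, "sales_rank": 40},
    {"author": "Author Y", "rating": 3.0, "sales_rank": 900},
]


def facet_with_aggregate(results, facet, field):
    """Return {facet value: (count, mean of field)} over the result set."""
    acc = defaultdict(lambda: [0, 0.0])  # value -> [count, running sum]
    for row in results:
        entry = acc[row[facet]]
        entry[0] += 1
        entry[1] += row[field]
    return {value: (n, total / n) for value, (n, total) in acc.items()}


by_author = facet_with_aggregate(BOOKS, "author", "rating")
```

The aggregation runs at query time over whatever the search returned, which is what distinguishes this from an offline BI report built on a schema fixed in advance.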
Multifaceted Search with BI – Example
[Screenshot callouts: Query; for each author, counts (as before) but also calculated values on the result set; Average Rating per Sales Rank]
Correlating Facets – BI on search results
• Clicking on a value takes you to the books with a high rating and a top sales rank
• Top-ranked books do not have the highest rating
• Older books have lower sales rank
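Correlating two facets amounts to a cross-tabulation over the result set, where each cell can be clicked to drill down. A minimal sketch follows; the bucket thresholds, book records, and function names are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical result set.
BOOKS = [
    {"title": "B1", "rating": 4.6, "sales_rank": 50},
    {"title": "B2", "rating": 3.2, "sales_rank": 10},
    {"title": "B3", "rating": 4.8, "sales_rank": 5000},
    {"title": "B4", "rating": 4.1, "sales_rank": 80},
]


def cell_of(book):
    """Map a book to its (rating bucket, sales-rank bucket) cell.

    The 4.0 and 100 cut-offs are arbitrary choices for this example.
    """
    rating = "high" if book["rating"] >= 4.0 else "low"
    rank = "top" if book["sales_rank"] <= 100 else "other"
    return (rating, rank)


def correlate(results):
    """Count results per cell of the rating x sales-rank cross-tab."""
    return Counter(cell_of(b) for b in results)


def drill_down(results, cell):
    """Return the books behind one cell, as if the user clicked on it."""
    return [b for b in results if cell_of(b) == cell]


table = correlate(BOOKS)
clicked = drill_down(BOOKS, ("high", "top"))
```

Here `table` plays the role of the correlation view and `clicked` the result list reached by clicking the high-rating, top-sales-rank cell.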
The World of Search: The boundaries are blurring
• Content axis: Public (owned and stored by others), “global” vs. Proprietary (owned & stored by the SE), “local”
• Access axis: Public, “outward facing” vs. Private, “inward facing”
• Public access, public content: Web search
• Public access, proprietary content: E-commerce, news, public documentation, government, …
• Private access, public content: Market intelligence, news tracking, job data mining
• Private access, proprietary content: Content management, groupware, “Intranet search”