WWW Challenges: Supporting Users in Search and Navigation. SOFSEM 2004. Natasa Milic-Frayling Microsoft Research, Cambridge UK. January 28, 2004. Introduction. Intersection Browser Interface – Internet, Intranet, services, local drives. Devices and applications: TabletPC, PDA, eBook
WWW Challenges: Supporting Users in Search and Navigation SOFSEM 2004 Natasa Milic-Frayling Microsoft Research, Cambridge UK January 28, 2004
Introduction • Intersection • Browser Interface – Internet, Intranet, services, local drives. • Devices and applications: TabletPC, PDA, eBook • Services: MSN Portal and Search - on-line searching, reading, and browsing • Research: • Web usage and interfaces • Optimization of service architectures • Text Classification – support for document classification, routing, filtering • Presentation Focus • WWW challenges in designing effective services and applications.
Characteristics of the Web • Highly distributed: distributed data and processes • Highly dynamic • Evolving content, with content publishing practices that are still inadequate. IMPLICATIONS
On-line Experience • Web access is a combination of search and navigation • Search to find URLs of relevant pages • Navigation to explore the result space • Reading on devices of various display sizes. • Only limited “context” is preserved and exposed in both activities • Ineffective search • Lost in hyperspace • Lost within a document, on small screen devices.
‘Diagnoses’ • Three aspects of the Web • Separation of search and document delivery • Separation of document authoring and generation of metadata about the documents required by services and applications • Lack of a generic publishing format to support flexible display of content across devices.
Part I Separation of search and document delivery Ineffective Search MIDAS - SiteExplorer
Search processes Query User’s Information Need Search Engine URLs HTTP Request Web page delivery Web Server Web Server
Highlighting - How is it done? Query User’s Information Need Search Engine URLs Topic Description HTTP Request MS READ Service Web Server Web Server Query Syntactic Analysis Semantic Expansion Highlighting Regime Thumbnail Creation
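The highlighting steps named on this slide (query analysis, expansion, highlighting) can be sketched roughly as follows. This is only an illustration, not the MS READ implementation: the `EXPANSIONS` table, the function names, and the `[[ ]]` markers are all assumptions, and a hand-written lookup table stands in for real semantic expansion.

```python
import re

# Hypothetical expansion table standing in for the "semantic expansion" step.
EXPANSIONS = {"car": ["automobile", "vehicle"]}

def expand_query(query):
    """Split the query into lowercase terms and add any known related terms."""
    terms = []
    for term in re.findall(r"\w+", query.lower()):
        for candidate in [term] + EXPANSIONS.get(term, []):
            if candidate not in terms:
                terms.append(candidate)
    return terms

def highlight(text, terms, open_mark="[[", close_mark="]]"):
    """Wrap whole-word, case-insensitive matches of any term in markers."""
    if not terms:
        return text
    alternatives = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: open_mark + m.group(0) + close_mark, text)

if __name__ == "__main__":
    topic = expand_query("Car rental")
    print(highlight("Rental of a car or other Vehicle.", topic))
```

Keeping the expanded term list separate from the page text means the same topic can be reapplied to every page the user visits, which is the "preserved context" the talk argues for.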
Link Evaluation - How is it done? Web Server HTTP Requests for Text Only Download Text Only MS READ Service • NLP • Indexing • Search Over Local Index Topic Storage: Topic 1 Topic 2 Topic 3 Topic 4 Mark Links for Relevance
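A minimal sketch of the link-evaluation flow on this slide, assuming the text of each linked page has already been downloaded into a dictionary. The function names and the scoring rule (number of distinct topic terms found on the page) are illustrative assumptions, not the method used by the MS READ service.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase word tokens; a stand-in for the NLP step."""
    return re.findall(r"\w+", text.lower())

def build_index(pages):
    """Map each term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in tokenize(text):
            index[token].add(url)
    return index

def mark_links(pages, topic_terms):
    """Return {url: number of distinct topic terms found on that page}."""
    index = build_index(pages)
    scores = {url: 0 for url in pages}
    for term in {t.lower() for t in topic_terms}:
        for url in index.get(term, ()):
            scores[url] += 1
    return scores

if __name__ == "__main__":
    linked_pages = {
        "a.htm": "Cheap car rental in Cambridge",
        "b.htm": "Weather in Cambridge",
        "c.htm": "Contact us",
    }
    print(mark_links(linked_pages, ["car", "Cambridge"]))
```

A client could then mark each link on the current page according to its score, for example by flagging every link with a non-zero score as relevant to the stored topic.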
MS Read • Users have difficulty locating relevant parts of a Web page while reviewing search results (MSN Search Diary and Field Interviews) • Users have difficulty evaluating search results and refining their search (Anne Cohen-Kiel’s ethnographic study in Spain, UK and Canada; MSN Search Diary Study and Site Interviews). Solution: • Preserve the user’s topic of interest and provide highlighting of topic terms on the pages that the user is viewing. • Allow the user to enhance the topic by adding new query terms or resources (lists of concepts, entities, etc.) and perform search over the page content • Allow the user to search the content of the pages that are linked to the current page. • When the page is the search result page, this is equivalent to refining the search over the previous top N search results.
MIDAS and SiteExplorer Part II Separation of document authoring and generation of metadata about the documents required by services and applications User lost in the hyperspace
Problem • Crawling - Services, such as search engines, collect the data and create metadata but do not deliver the content • Out of sync with the data on the Web servers (‘broken links’) • Services can perform only basic analysis of the context • No information about the structure of information resources • No sophisticated linguistic processing.
Solution: MIDAS Framework • Distributed metadata generation • Generate & store meta-information alongside contents • At authoring or publishing time • Synchronised with publishing • Deliver metadata upon request • In case of centralized services • Services do not crawl for data but only for metadata • Obtain data through ‘push’ by authors/web servers. Site structure Page structure METADATA: Linguistic analysis Statistical analysis Visual representation
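The core idea of generating metadata at publishing time and keeping it in step with the content can be sketched as below. Everything here is hypothetical: the `Site` class, the metadata fields (outgoing links, length, checksum), and the method names are assumptions chosen for illustration and are not taken from MIDAS itself.

```python
import hashlib
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects href values of anchor tags as a simple page-structure signal."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

class Site:
    """Hypothetical server-side store keeping content and metadata side by side."""
    def __init__(self):
        self.content = {}
        self.metadata = {}

    def publish(self, url, html):
        """Store the page and regenerate its metadata in the same step."""
        collector = LinkCollector()
        collector.feed(html)
        self.content[url] = html
        self.metadata[url] = {
            "links": collector.links,
            "length": len(html),  # length in characters
            "sha256": hashlib.sha256(html.encode("utf-8")).hexdigest(),
        }

    def get_metadata(self, url):
        """Deliver metadata on request, without sending the page itself."""
        return self.metadata[url]

if __name__ == "__main__":
    site = Site()
    site.publish("index.htm", '<a href="aboutme.htm">About me</a>')
    print(site.get_metadata("index.htm"))
```

Because `publish` is the only way content changes in this sketch, the metadata cannot drift out of sync with the page, which is the property the slide contrasts with crawler-generated metadata.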
AUTHOR SERVER CLIENT Web Server Web Content Web Content
AUTHOR SERVER CLIENT Web Server Web Content Web Content Automatically Generated Metadata Web metadata (XML) Author generated metadata
<dxf:views>
  <dxf:view title="Main">
    <dxf:node url="index.htm">
      <dxf:node url="aboutme.htm" />
      <dxf:node url="interest.htm" />
      <dxf:node url="favorite.htm" />
      <dxf:node url="photo.htm" />
      <dxf:node url="feedback.htm" />
    </dxf:node>
  </dxf:view>
</dxf:views>
SiteExplorer Metadata Server FrontPage Site Template and Structure in XML Format
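As a rough illustration of how a client might read the site-structure XML shown on this slide, the sketch below turns it into an indented outline. The slide does not show a namespace declaration for the `dxf` prefix, so the URI `urn:example:dxf` is a placeholder added only to make the fragment parseable; the function names are likewise assumptions.

```python
import xml.etree.ElementTree as ET

DXF_NS = "urn:example:dxf"  # placeholder URI; the real namespace is not given

SITE_XML = """
<dxf:views xmlns:dxf="urn:example:dxf">
  <dxf:view title="Main">
    <dxf:node url="index.htm">
      <dxf:node url="aboutme.htm" />
      <dxf:node url="interest.htm" />
      <dxf:node url="favorite.htm" />
      <dxf:node url="photo.htm" />
      <dxf:node url="feedback.htm" />
    </dxf:node>
  </dxf:view>
</dxf:views>
"""

def outline(node, depth=0):
    """Yield (depth, url) for a node and its descendants, in document order."""
    yield depth, node.get("url")
    for child in node.findall("{%s}node" % DXF_NS):
        yield from outline(child, depth + 1)

def site_outline(xml_text, view_title):
    """Return the (depth, url) outline of the view with the given title."""
    root = ET.fromstring(xml_text.strip())
    for view in root.findall("{%s}view" % DXF_NS):
        if view.get("title") == view_title:
            result = []
            for node in view.findall("{%s}node" % DXF_NS):
                result.extend(outline(node))
            return result
    return []

if __name__ == "__main__":
    for depth, url in site_outline(SITE_XML, "Main"):
        print("  " * depth + url)
```

An outline of this kind is the sort of input an interactive sitemap could be rendered from, without crawling the pages themselves.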
MIDAS is NOT… …an element of the Semantic Web • Not adding “knowledge” explicitly into the Web • Simple metadata • Easily authored/easily computable at authoring/publishing time • Presently available but dismissed
Problems addressed • Users have difficulty choosing the right website from the result set • Users want overviews of sites in a list of search results (Anne Cohen-Kiel’s ethnographic study in Spain, UK and Canada) • Users have difficulty evaluating search results and refining their search (MSN Search Diary Study and Site Interviews) • Users have difficulty locating relevant information within a destination site once they get to the site (MSN Search Diary Study and Site Interviews) Site Explorer’s Solutions: • Providing users with an overview of the site content as interactive sitemap • Supporting exploration of the site through local search
External studies “Anyone who has been to a shopping mall knows the value of the ‘you are here’ dot on the map … Site maps must become more aware of users’ website navigation…” Jakob Nielsen, Site Map Usability January 6, 2002
SiteExplorer Bar Search Box Site Overview Site Structure Page details
SmartView and SearchMobil: Viewing Web on PDAs and Mobile Phones Part III Lack of generic publishing format to support flexible display of content across devices Ineffective reading on mobile devices
Lost in Hyperspace - Small • Complex pages on small screens • Overview – none provided at the moment • Extensive horizontal/vertical scrolling
Lost in Hyperspace - Small • Location of search hits on result page • Difficulty even on desktop screens • Reason: disassociation of search service and document delivery
SearchMobil • SearchMobil Web Service • Collection of search results – “booklet” of Web pages • Creation of the “local” full text index • Search within a designated set of pages • Annotated booklets (hit highlighting)
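The "booklet" with a local full-text index can be sketched as a small in-memory structure. This is an assumption-laden illustration rather than SearchMobil's design: the `Booklet` class, its methods, and the ranking by summed term frequency are all hypothetical.

```python
import re
from collections import Counter

class Booklet:
    """Hypothetical booklet: a set of downloaded pages with a local term index."""
    def __init__(self):
        self.term_counts = {}  # url -> Counter of lowercase terms

    def add_page(self, url, text):
        """Index one downloaded page by counting its word tokens."""
        self.term_counts[url] = Counter(re.findall(r"\w+", text.lower()))

    def search(self, query):
        """Return (url, score) pairs for matching pages, best first."""
        terms = re.findall(r"\w+", query.lower())
        hits = []
        for url, counts in self.term_counts.items():
            score = sum(counts[t] for t in terms)
            if score > 0:
                hits.append((url, score))
        # Sort by descending score, then by URL for a stable order.
        return sorted(hits, key=lambda hit: (-hit[1], hit[0]))

if __name__ == "__main__":
    booklet = Booklet()
    booklet.add_page("p1", "hotel hotel cambridge")
    booklet.add_page("p2", "cambridge weather")
    booklet.add_page("p3", "train times")
    print(booklet.search("hotel Cambridge"))
```

Since the index lives with the booklet, a follow-up query needs no further round trip to the on-line search engine, which matters on devices with slow or intermittent connections.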
SearchMobil Features • On-line search: Google • Automatic download of pages • Processing of pages – structure discovery and content indexing • Creation of a booklet of overviews • Indicators of search hits • Indicator of the best region – scroll down the ‘red’ section • Select the region and access the detailed view Web Search
SearchMobil Features • Web Search – Detail View
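The "indicators of search hits" and "best region" features can be illustrated with a simple count of query-term occurrences per page region. The sketch assumes the page has already been split into regions by the structure-discovery step; the function names and the counting rule are assumptions, not SearchMobil's actual scoring.

```python
import re

def region_hits(regions, query_terms):
    """Count query-term occurrences in each region's text."""
    terms = {t.lower() for t in query_terms}
    return [
        sum(1 for token in re.findall(r"\w+", region.lower()) if token in terms)
        for region in regions
    ]

def best_region(regions, query_terms):
    """Return the index of the region with the most hits, or None if no hits."""
    hits = region_hits(regions, query_terms)
    if not hits or max(hits) == 0:
        return None
    return hits.index(max(hits))

if __name__ == "__main__":
    page_regions = [
        "Site navigation and login",
        "Hotel deals: hotel rooms in Cambridge",
        "Contact the Cambridge office",
    ]
    print(region_hits(page_regions, ["hotel", "cambridge"]))
    print(best_region(page_regions, ["hotel", "cambridge"]))
```

On a small screen, the per-region counts could drive the hit indicators in the page overview, while the best region is the one the user is directed to first.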
SearchMobil Features – Cont. • Local search – focussed on the set of pages in the booklet • Indicators of relevance at the page and the booklet level Local Search
Summary • Simple proposition: Save metadata about structure and content generated by authoring applications • Benefits on the client side: • Rich context for search and navigation • Interactive download of document elements and metadata for small devices • Benefits for services: • Metadata collected and in s • Opportunity for new services based on rich metadata • Opportunity for push-based services – reduce the need for crawling.