Explore the features, strengths, and weaknesses of HotBot search engine, including Inktomi and Direct Hit databases. Learn how to maximize search results and access advanced features for precise information retrieval.
Introduction • Owned by Terra/Lycos. • One of the largest web search engines. • Uses the Inktomi database combined with Direct Hit and the DMOZ Open Directory. • Basic search screen is simple, but the advanced search allows for a full range of search features.
Databases • Open Directory • Direct Hit • Inktomi • Direct Hit results display if the option for 10 results at a time is selected and there are 10 results available from Direct Hit. If an option for more than 10 results at a time is selected the Direct Hit results are available via a link. Other content comes from various advertisers, the Lycos Network, and GoTo. The GoTo and other advertiser results may show up above and/or below the other results but are under a separate heading such as "feature listings."
Strengths • Advanced searching capabilities • Page depth limit • Advanced search help • Truncation
Weaknesses • Link searches must be exact • Database size shrank for a while • Advanced features have not always worked correctly
Features • Default Operation: Processed as an AND • Full Boolean Searching: AND, OR, and NOT • Proximity Searching • Truncation with the * symbol • Case sensitive • Extensive, dynamic stop word list • Word Stemming - Search for grammatical word variants including plural, singular, and tense.
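The default AND operation and `*` truncation listed above can be illustrated with a minimal Python sketch. This is not HotBot's implementation; the function names `term_to_regex` and `matches_all` are hypothetical, and matching is done on exact case as a simplification.

```python
import re

def term_to_regex(term: str) -> re.Pattern:
    """Turn one query term into a pattern (hypothetical helper)."""
    # A trailing * is treated as truncation: any word starting with the stem.
    if term.endswith("*"):
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*")
    # Otherwise the term must appear as a whole word.
    return re.compile(r"\b" + re.escape(term) + r"\b")

def matches_all(query: str, text: str) -> bool:
    """Default operation: every term must be present (implicit AND)."""
    return all(term_to_regex(t).search(text) for t in query.split())
```

For example, `matches_all("castle Bavar*", "A castle in Bavaria")` is true, while `matches_all("madrigal", "Elizabethan madrigals")` is false because no truncation symbol was given.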
Field Searches • Field Searching: Searching title words and links to a specific URL • Pages can also be searched by embedded content type: acrobat/applet/activex/audio/embed/flash/form/frame/image/script/shockwave/table/video/vrml
Limits • linkdomain: Limits to pages containing links to the specified domain • outgoingurlext: Limits to pages containing embedded files with the specified extension • scriptlanguage: Limits to pages containing the specified scripting language, either javascript or vbscript • after: [day]/[month]/[year] • before: [day]/[month]/[year] • within:[number/unit] • Language Limit
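As a rough illustration of how these limits combine with ordinary search terms, the Python sketch below assembles a query string. The helper `build_query` is hypothetical, and the exact spacing and date padding HotBot accepted are assumptions; only the limit names and the day/month/year order come from the list above.

```python
from datetime import date

def build_query(terms, linkdomain=None, after=None, before=None):
    """Join search terms with optional limits (hypothetical helper)."""
    parts = list(terms)
    if linkdomain:
        parts.append(f"linkdomain:{linkdomain}")
    # Dates are written day/month/year; no zero padding is assumed.
    if after:
        parts.append(f"after:{after.day}/{after.month}/{after.year}")
    if before:
        parts.append(f"before:{before.day}/{before.month}/{before.year}")
    return " ".join(parts)

query = build_query(["Neuschwanstein", "castle"],
                    linkdomain="unca.edu",
                    after=date(2001, 1, 15))
print(query)
```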
Unique to HotBot • Page Type • Default is Any (any pages) • Top Page (the root page of a URL, e.g., www.unca.edu) • Page Depth - Limits how far down a subdirectory hierarchy HotBot searches • These are useful for finding the primary sites for organizations or information
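One way to picture page depth is as the number of path segments in a URL, with a top page at depth 0. The Python sketch below uses that definition as an assumption; how HotBot actually counted depth is not stated here, and `page_depth` is a hypothetical name.

```python
from urllib.parse import urlsplit

def page_depth(url: str) -> int:
    """Count non-empty path segments; a root page has depth 0 (assumed definition)."""
    # urlsplit needs a scheme to separate the host from the path.
    if "://" not in url:
        url = "http://" + url
    path = urlsplit(url).path
    return len([segment for segment in path.split("/") if segment])
```

Under this definition `page_depth("www.unca.edu")` is 0, so it would qualify as a top page, while `page_depth("www.gorp.com/gorp/location/ny/kyk_hudv.htm")` is 4.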
Sorting • Results are sorted by relevance with groupings by site available at the end of each brief record. • The display includes the relevance score, title, URL, a brief extract, and date. HotBot displays 10 records at a time, by default.
Architecture • Direct Hit: • Provides the breadth of a conventional search engine, with the relevancy of an index which is edited by humans • References the searching activity of millions of users • Adjusts rankings based on the popularity of the retrieved documents
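The idea of adjusting rankings by popularity can be sketched in Python as a blend of a relevance score with normalized click counts. This is only an illustration of the general technique, not Direct Hit's actual algorithm; the `rerank` function, the linear blend, and the `weight` parameter are all assumptions.

```python
def rerank(results, clicks, weight=0.5):
    """Reorder (url, relevance) pairs using click popularity (illustrative only).

    results: list of (url, relevance) with relevance between 0 and 1
    clicks:  dict mapping url to how often users selected it
    weight:  share of the final score given to popularity (assumed parameter)
    """
    # Normalize by the most-clicked page; fall back to 1 to avoid dividing by zero.
    max_clicks = max(clicks.values(), default=0) or 1

    def score(item):
        url, relevance = item
        popularity = clicks.get(url, 0) / max_clicks
        return (1 - weight) * relevance + weight * popularity

    return sorted(results, key=score, reverse=True)
```

With `weight=0` the original relevance order is kept; raising it lets frequently chosen pages move ahead of pages that merely match the query text well.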
Architecture • Inktomi • Hosts Web searches for its clients on clusters of coupled workstations that compute in parallel • When a client's Web server receives a search query from a user, its interface translates the query from HTTP into Inktomi Data Protocol (IDP) and sends it to the Inktomi Master Cluster • The cluster sends the results in IDP back to the client Web server, which translates them into HTTP and sends them to the user
Results • Query 1: Information on Home of the Rockefellers Kykuit - To test the engines on a very specific bit of Americana - Kykuit, the baronial home of the Rockefellers on the Hudson River in New York. • Query 2: Information on Neuschwanstein Castle - To test the engines on a fairly well-known tourist attraction in Germany - Neuschwanstein Castle • Query 3: Information on Francis Pilkington Madrigals - To test the engines on retrieval of an obscure musical reference - the Elizabethan madrigals of Francis Pilkington.
Query 1: Information on Home of the Rockefellers Kykuit • HotBot - 72 Matches • FPL: www.gorp.com/gorp/location/ny/kyk_hudv.htm • Relevance: Page 14: County Histories • Google - 91 Matches • FPL: www.abbeville.com/booktemplate.asp?stockno=2220 • Relevance: Page 30: A Book Where Kykuit Is Mentioned • UNCA Library - 5 Matches • FPL: wncln.appstate.edu/search/...information+on+how+to+use+the+dietary+guidelines&1,1 • Relevance: Page 1: Information on how to use dietary guidelines
Query 2: Information on Neuschwanstein Castle • HotBot - 2,700 Matches • FPL: www.castlesoftheworld.com/Brochure/ • Relevance: Page 10: Castles of the US • Google - 4,060 Matches • FPL: www.neuschwanstein-castle.com/ • Relevance: Page 33: A Page on King Ludwig II - No Mention of Neuschwanstein Castle • UNCA Library - 5 Matches • FPL: wncln.appstate.edu/search/…6,0,0,B/frameset&FF=tinformation+on+self+employment+tax&1,1 • Relevance: Page 1: Information On Self Employment Tax
Query 3: Information on Francis Pilkington Madrigals • HotBot - 53 Matches • FPL: www.medieval.org/emfaq/cds/van624.htm • Relevance: Page 5: A Page about the Lute - No Mention of Madrigals • Google - 33 Matches • FPL: www.netstrider.com/search/methods.html • Relevance: Page 3: No Mention of Pilkington Madrigals • UNCA Library - 5 Matches • FPL: wncln.appstate.edu/search/…6,0,0,B/frameset&FF=tinformation+on+the+red+notice+system&1,1 • Relevance: Page 1: Information On The Red Notice System
Conclusion • HotBot is an interface for advanced web searches, and its back end changes over time. Both the Inktomi and Direct Hit technologies serve, in different ways, to provide a relevant list of results through advanced queries, and both seek to minimize commercial influence over search results. All of these technologies are subject to changes in technology and in the business environment. • Its weaknesses are that it still doesn't seem to match the depth and breadth of some other engines, and that its advanced features have not always worked correctly. As this engine's index and search features continue to grow, these weaknesses should be overcome.