Web Searching

harris

Presentation Transcript


  1. Web Searching: Everything, now.

  2. History of Search
     • Archie (archives) - 1990
       • Database of FTP filenames with regex query searching
     • WWW Wanderer
       • Web’s first robot
       • High bandwidth load
     • ALIWEB (Archie-Like Indexing of the WEB)
       • Pages submitted with descriptions

  3. History of Search
     • Architext - 1993
       • First to use statistical analysis of word relationships to generate results
     • Yahoo! - 1994
       • Searchable directory of pages with descriptions
     • Webcrawler - 1994 †
       • Indexed entire web pages
     • Lycos - 1994 †
       • 60 million documents by 1996

  4. History of Search
     • Infoseek - 1994 †
     • Altavista - 1995 †
     • Looksmart - 1996
     • Inktomi - 1996
     • Ask Jeeves - 1997
     • Google - 1998
     • Teoma - 2000

  5. Web Search Today
     • Search algorithms are highly secret
     • Use off-page criteria for ranking
     • Constant tweaking
     • Things to look for:
       • Boolean nesting
       • Fields
       • Clustering?
       • Stop words
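Two of the "things to look for" on the Web Search Today slide, stop words and Boolean nesting, can be illustrated with a minimal sketch. The corpus, the stop-word list, and the tuple-based query form below are all assumptions made up for illustration; they do not describe how any engine named in these slides works.

```python
# Hypothetical toy corpus; ids and text are invented for illustration.
DOCS = {
    1: "the history of web search engines",
    2: "a directory of web pages",
    3: "search the archives of ftp sites",
}

# Assumed stop-word list; each real engine chooses its own (if any).
STOP_WORDS = {"the", "of", "a", "and", "to", "in"}


def build_index(docs):
    """Map each non-stop word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(doc_id)
    return index


def evaluate(query, index, all_ids):
    """Evaluate a nested query: a word, or ("AND" | "OR" | "NOT", operands...)."""
    if isinstance(query, str):
        return set(index.get(query.lower(), set()))
    op, *args = query
    results = [evaluate(arg, index, all_ids) for arg in args]
    if op == "AND":
        return set.intersection(*results)
    if op == "OR":
        return set.union(*results)
    if op == "NOT":
        return all_ids - results[0]
    raise ValueError(f"unknown operator: {op}")


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)

# Nested query: web AND (search OR directory)
nested = ("AND", "web", ("OR", "search", "directory"))
matches = evaluate(nested, INDEX, ALL_IDS)
print(sorted(matches))  # [1, 2]
```

An engine without nesting can only apply operators left to right, so a query like the one above cannot be grouped. Because stop words never enter the index, searching for "the" alone returns nothing here.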

  6. Web Search Today
     • Google
       • PageRank system
       • “Important” sites given artificially high rank
     • Strengths
       • Largest database
       • Relevance based on external linkage
     • Weaknesses
       • No nesting
       • May search for synonyms / grammatical variants (automatic stemming)
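The idea behind "relevance based on external linkage" can be sketched with the publicly described PageRank iteration. This is a simplified illustration, not Google's production ranking, which the slides note is secret. The link graph, the damping factor of 0.85, and the iteration count are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of outlinked pages.

    Every link target must also appear as a key in `links`.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b, and d all link to c; c links back to a.
LINKS = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page c ranks highest because three pages link to it, and page a ranks second because its single inbound link comes from the highly ranked c. Pages b and d, which nothing links to, keep only the baseline share.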

  7. Web Search Today
     • Yahoo!
       • Brand new search database (as of Feb ’04)
     • Strengths
       • Full boolean searching
       • Very fresh
       • Directory links
     • Weaknesses
       • Includes pay-for-inclusion results (!)

  8. Web Search Today
     • MSN Search (Inktomi)
       • Large Inktomi database
     • Strengths
       • Page depth limit
       • Full boolean searching

  9. Web Search Today
     • Teoma
       • Subject-specific popularity
     • Strengths
       • Refine
       • Related
     • Weaknesses
       • Small database
       • No boolean nesting

  10. Web Search Tomorrow
      • Kartoo
        • Visual meta search engine
      • Nutch
        • Open source web search
        • Java (but that could change)
      • Dipsie
        • “2 clicks”
      • Singingfish
        • Multimedia (audio / video) search

  11. Internet Directories

  12. Conclusion
      • Which search engine is the best?

  13. References
      • http://searchengineshowdown.com/
      • http://www.search-marketing.info/
