
Effective Web Searching

Effective Web Searching. Dr. I.R.N. Goudar, Visiting Professor-cum-Library Adviser, University of Mysore. Refresher Course, UGC-Academic Staff College, University of Mysore. Organization of the Web. Web is the totality of web pages stored on web servers


Presentation Transcript


  1. Effective Web Searching • Dr. I.R.N. Goudar, Visiting Professor-cum-Library Adviser, University of Mysore • Refresher Course, UGC-Academic Staff College, University of Mysore

  2. Organization of the Web... • Web is the totality of web pages stored on web servers • Spectacular growth in web-based information sources and services: • Education and research • Entertainment • Business and commerce • Personal home pages • Estimated to contain over 1 billion indexable web pages • Doubling each year • Over 80 million web sites

  3. Finding relevant documents on the Web • Informal: • Browsing (and book marking for later use) • Friends • Print sources • Discussion forums (mailing lists) • Current awareness services (e.g. Scout Report) • Guessing web site addresses! • Formal (using information finding tools) • Web directories/ guides • Web search engines • Meta-search tools • Specialty search engines

  4. Three Types of Internet Searching Tools • Subject Directories or Subject Trees such as Yahoo. • Search Engines such as Google, Teoma, and Alta Vista. • Metasearch Engines such as Dogpile, Mamma, and Ixquick.

  5. Limitations • Anyone can put up a web page • Many pages are not updated • No quality control • Most sites are not “peer-reviewed” • Less trustworthy than scholarly publications

  6. Web Directories/ Guides • Also called ‘virtual libraries’ and ‘Internet resource catalogues’ • Organised collection of descriptions and links to Internet sources • Organisation: by subject categories (hierarchical); by resource type (patents, e-journals, institutes, etc.) • Most use human experts for source selection, indexing and classification • Some include reviews/ ratings of listed sites

  7. Web Directories/ Guides... • Examples of general web directories: • Internet Public Library (http://www.ipl.org/) • Britannica’s “Web’s best sites” (www.britannica.com) • Infomine (infomine.ucr.edu) • Scout Report Signpost (www.signpost.org) • BUBL link (bubl.ac.uk/link) • Yahoo (www.yahoo.com) • Magellan (www.mckinley.com) • Galaxy (www.galaxy.com) • Looksmart (www.looksmart.com) • Snap (www.snap.com)

  8. Web Directories/ Guides... • Guides to directories: • WWW Virtual Library (www.vlib.org) • Subject-specific guides (subject gateways): • Intute (http://www.intute.ac.uk/) • IOP Physicsworld.com (http://physicsworld.com/) • Chemcenter (www.acs.com) • Programmers Heaven (www.programmersheaven.com) • Resource type guides: • Patents (www.european-patent-office.org) • Electronic journals (www.publist.com)

  9. Web Search Engines • Web search engines build a full-text index to web pages gathered from web sites and provide a keyword search interface to this index • Spider programs periodically visit web sites and gather the web pages for indexing • Also index web sites submitted by site developers • A brief summary of the indexed web page is also prepared • The index usually contains URLs, titles, headings, and other words from the HTML document
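
The indexing idea on the slide above can be illustrated with a toy inverted index. This is only a minimal sketch under stated assumptions: the `pages` dictionary stands in for what a spider would have gathered, the example.org URLs are hypothetical, and the function names (`tokenize`, `build_index`, `search`) are illustrative rather than taken from any real search engine.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages standing in for what a spider would have gathered.
pages = {
    "http://example.org/a": "Polymer chemistry research group",
    "http://example.org/b": "Polymer companies and suppliers",
    "http://example.org/c": "Library research guides",
}
index = build_index(pages)
print(search(index, "polymer research"))
```

A keyword query is answered from the index alone, without re-reading the pages, which is why search engines can respond quickly even over very large collections.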

  10. Web Search Engines... • Examples: • Fastsearch (alltheweb.com) • Altavista (www.altavista.com) • Google (www.google.com) • Northernlight (www.northernlight.com) • HotBot (www.hotbot.com) • Excite (www.excite.com) • Teoma (http://www.teoma.com/)

  11. Web Search Engines... • Specialty search engines: • Country-specific search engines • www.khoj.com • www.123india.com • Subject-specific search engines • Chemfinder (www.chemfinder.com) • Engineering Resources Online (www.er-online.co.uk) • MathSearch (www.maths.usyd.edu.au:8000/MathSearch.html) • World Trade Locator (www.intl-tradenet.com) • Resource-specific search engines: • Patents (www.uspto.gov) • Journal articles (www.findarticles.com)

  12. Meta Search Tools • Also known as multi-threaded search engines • Allow the user to search multiple databases simultaneously via a single interface, and return results in a uniform format • Present a summary of the collected results from other search engines and directories • Do not gather web pages, build indexes, accept URL additions, classify or review web sites • Some features supported: • Duplicate hits removal • Rank results • Selection of search engine(s) to be used
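
Duplicate removal and re-ranking, two of the features listed above, might be sketched as follows. This is an assumption-laden illustration and not how any particular meta search tool works: the engine names and URLs are hypothetical, `normalize` is a deliberately crude duplicate test, and the reciprocal-rank scoring is just one simple way to combine ranked lists.

```python
def normalize(url):
    """Crude URL normalization so trivial variants count as duplicates."""
    url = url.lower().rstrip("/")
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[4:]
    return url

def merge_results(result_lists):
    """Merge ranked URL lists, remove duplicates, and re-rank.

    Each URL scores 1/rank in every list it appears in (rank starts at 1).
    Scores are summed, so a page returned by several engines rises.
    """
    scores = {}
    first_seen = {}
    for results in result_lists.values():
        for rank, url in enumerate(results, start=1):
            key = normalize(url)
            scores[key] = scores.get(key, 0.0) + 1.0 / rank
            first_seen.setdefault(key, url)
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    return [first_seen[k] for k in ranked]

# Hypothetical result lists from two underlying search engines.
engine_results = {
    "engine_a": ["http://www.example.org/", "http://example.com/x"],
    "engine_b": ["http://example.net/y", "http://example.org"],
}
merged = merge_results(engine_results)
print(merged)
```

Because only the top results of each engine are merged, a sketch like this also shows why meta search is not exhaustive: anything the underlying engines rank low never reaches the merged list.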

  13. Meta Search Tools... • [Diagram: search using multiple search engines] • [Diagram: search using a meta search tool]

  14. Meta Search Tools... • Meta search tools (remote sites): • MetaCrawler (www.metacrawler.com) • Ixquick (www.ixquick.com) • Dogpile (www.dogpile.com) • Meta search tools (local, installable software): • Copernic (www.copernic.com) • LexiBot (www.completeplanet.com)

  15. People Finding Tools • Register names and addresses and find e-mail addresses • Examples: • Bigfoot (www.bigfoot.com) • Peoplesearch (www.peoplesearch.net) • Ahoy (ahoy.cs.washington.edu:6060/) • Four11 (www.four11.com) • Switchboard (www.switchboard.com) • Whowhere (www.whowhere.lycos.com/) • Most search engines also support people searches (e.g. Altavista, Google, Yahoo!)

  16. Web Search Strategies • Search steps: • Analyze the search topic and identify the search terms, their synonyms (if any), phrases and Boolean relations (if any) • Select the search tool(s) to be used (meta search engine, directory, general search engine, specialty search engine) • Translate the search terms into search statements of the selected search engine • Perform search • Refine the search based on results • Visit the actual site(s) and save the information (using File-Save option of the browser)
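
The step of translating search terms into a search statement can be sketched in code. The helper below is hypothetical: it assumes each concept is given as a list of synonyms, joins synonyms with OR and concepts with AND, and quotes multi-word phrases. Real search tools differ in syntax, so the output would need adapting to the chosen engine.

```python
def to_statement(concepts):
    """Build a Boolean search statement.

    concepts is a list of synonym lists: synonyms are OR-ed together,
    and the resulting groups are AND-ed. Multi-word terms are quoted
    so that they are treated as phrases.
    """
    groups = []
    for synonyms in concepts:
        quoted = [f'"{s}"' if " " in s else s for s in synonyms]
        group = " OR ".join(quoted)
        if len(quoted) > 1:
            group = f"({group})"
        groups.append(group)
    return " AND ".join(groups)

statement = to_statement([["polymers", "plastics"], ["companies", "manufacturers"]])
print(statement)
```

Writing the concepts and synonyms down first, as in the analysis step, makes it easy to regenerate the statement when the search has to be refined.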

  17. Google (www.google.com) • Enables users to search the Web, images, etc. • Features: PageRank, caching and translation, an option to find similar pages. • The focus is developing search technology. • Ranked #1 in the world

  18. Google • Largest & Most Popular Search Engine • 8 Billion + Pages Indexed • Very Effective Advanced Search Features • Limit searches by domain, e.g. site:edu • Limit searches by format, e.g. .pdf • Specialized Search Tools • Images, Directory, Videos, Books, Scholar, News, Blogger

  19. How Google works • BEFORE you search: “Crawls” pages on the public web; copies text & images, builds database • WHEN you search: Automatically ranks pages in your results • Word occurrence and location on page • Popularity - a link to a page is a vote for it • ~ 200 factors in all!
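
The idea that a link to a page is a vote for it can be illustrated with a simplified PageRank-style calculation. This is a textbook-style sketch and not Google's actual ranking, which the slide says combines around 200 factors. The three-page link graph, the damping value of 0.85 and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each outgoing link passes on an equal share of the vote.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: a and b both link to c, and c links back to a.
links = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this small graph, page c collects two votes and ends up on top, page a is next because it is linked from the highly ranked c, and page b, which nobody links to, comes last.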

  20. Limit your search to … • Web page title: intitle:hybrid; allintitle:hybrid mileage • Website or domain: site:whitehouse.gov “global warming”; site:edu “global warming” • File type: filetype:ppt site:edu “global warming” • Definitions: define:pixel; define:“due diligence”
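
The operators on this slide are plain text typed into the search box, so a query can be assembled as a string. The `build_query` helper below is hypothetical and only covers the intitle:, filetype: and site: operators from the slide, written with straight double quotes for phrases. Operator support changes over time, so treat it as a sketch.

```python
def build_query(terms="", phrase=None, site=None, filetype=None, intitle=None):
    """Assemble a search string from the operators listed on the slide."""
    parts = []
    if intitle:
        parts.append(f"intitle:{intitle}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    if terms:
        parts.append(terms)
    if phrase:
        # Double quotes ask the engine to match the words as an exact phrase.
        parts.append(f'"{phrase}"')
    return " ".join(parts)

query = build_query(phrase="global warming", site="edu", filetype="ppt")
print(query)
```

The resulting string corresponds to the slide's example of limiting a phrase search to PowerPoint files on .edu sites.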

  21. On the results page • Search box (use to modify) • “Cache” • “Related pages” • “Translate this page”

  22. Google’s other databases

  23. Searching for Pictures • Searching for images is easy! • From the main page of the search engine, select images or pictures before entering your search term.

  24. Google Scholar (scholarly literature=articles, books) • Google Books (books) • Google Directory (handpicked specific topical sites)

  25. Beyond Google • Take advantage of human selectivity • Librarians’ Internet Index • InfoMine • Google Custom Search Engines (CSE)

  26. Thank You!

  27. Web Directories/ Guides... • Most web directories support searching within categories and descriptions, in addition to browsing • Advantages: • Access to high quality sources • Do not contain redundant links • Faster access to sources • Disadvantages: • One needs to be aware of such directories/ guides • May not be up-to-date • May not be exhaustive • Categories (subject hierarchy) vary across directories

  28. Web Directories/ Guides... • When to use web directories/ guides? • For broad/ general topics where keyword searching on search engines retrieves too many irrelevant sites • When you want a few highly relevant sites and do not need an exhaustive/ comprehensive search • When not to use web directories/ guides? • For concept/ keyword searches • When search terms are distinctive • Effective directory/ guide usage: • Take advantage of the sub-search within categories, supported by most directories/ guides • Join their mailing lists for automatic updates on new sites

  29. Web Search Engines... • The search engines provide a forms-based search interface for entering the queries • Support simple and advanced search interfaces • Search results are returned in the form of a list of web sites matching the query • Some key features supported: • Phrase searching (“…” double quotes) • Boolean searching (AND, OR, NOT) • Implied Boolean: Term inclusion (+), term exclusion (-)
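
Boolean searching maps naturally onto set operations over an index. The sketch below assumes a tiny hand-made index of hypothetical document identifiers. The `implied_boolean` helper is illustrative only and handles just the +term and -term forms mentioned on the slide.

```python
# Hypothetical index: each term maps to the documents that contain it.
index = {
    "solar": {"doc1", "doc2", "doc4"},
    "energy": {"doc1", "doc3", "doc4"},
    "nuclear": {"doc3", "doc4"},
}

def docs(term):
    """Return the set of documents containing the term (empty if unknown)."""
    return index.get(term, set())

both = docs("solar") & docs("energy")     # solar AND energy
either = docs("solar") | docs("nuclear")  # solar OR nuclear
without = both - docs("nuclear")          # (solar AND energy) NOT nuclear

def implied_boolean(query):
    """Evaluate a query made of +term (must appear) and -term (must not)."""
    tokens = query.split()
    required = [t[1:] for t in tokens if t.startswith("+")]
    excluded = [t[1:] for t in tokens if t.startswith("-")]
    if not required:
        return set()
    result = set(docs(required[0]))
    for term in required[1:]:
        result &= docs(term)
    for term in excluded:
        result -= docs(term)
    return result

print(sorted(both), sorted(either), sorted(without))
print(sorted(implied_boolean("+solar +energy -nuclear")))
```

AND narrows a search, OR widens it, and NOT removes unwanted material, which is the same effect the + and - prefixes have in the implied form.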

  30. Web Search Engines… • Key features… • Proximity searches (NEAR, ADJ, BEFORE, AFTER) • Use of parentheses to group search terms • Truncation searches (‘industr*’) • Field-specific searching (Title, URL, Text) • Natural language queries (‘Why is the sky blue?’) • Relevance ranking of search results • Number of search terms • Number of times each search term occurs • Proximity of search terms • Location of search terms (title, text)
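
Three of the relevance factors listed above (number of search terms matched, number of occurrences, and location of terms) can be combined into a toy scoring function. The weights and the formula are assumptions made for illustration, proximity is not modelled, and the pages are hypothetical. Real engines use their own undisclosed formulas.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(page, query, title_weight=3.0):
    """Score a page: title hits count more, and matching more terms multiplies."""
    title_words = tokenize(page["title"])
    body_words = tokenize(page["text"])
    total = 0.0
    matched = 0
    for term in set(tokenize(query)):
        in_title = title_words.count(term)
        in_body = body_words.count(term)
        if in_title or in_body:
            matched += 1
        total += title_weight * in_title + in_body
    return matched * total

def rank(pages, query):
    """Return the URLs of matching pages, best score first."""
    scored = [(score(page, query), page["url"]) for page in pages]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for value, url in scored if value > 0]

# Hypothetical pages used only to exercise the scoring.
pages = [
    {"url": "http://example.org/p1", "title": "Polymer Industry News",
     "text": "Polymer prices rose as the polymer industry grew."},
    {"url": "http://example.org/p2", "title": "Chemistry Basics",
     "text": "A polymer is a large molecule."},
    {"url": "http://example.org/p3", "title": "Cooking Tips",
     "text": "How to bake bread."},
]
print(rank(pages, "polymer industry"))
```

The first page wins because it matches both terms, repeats them, and carries them in its title. The third page matches nothing and is dropped from the results.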

  31. Web Search Engines… • Key features… • Sub-searching (searching within retrieved records) • Case sensitivity • Limit by language • Limit by age of documents • Limit by audio, video and image type • Translation of search results (title and description) • Limit by domain, host

  32. Web Search Engines... • Example tutorials • Finding Information on the Internet: A tutorial (www.lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html) • How to search the world wide web: A tutorial for beginners and non-experts. David P. Habib and Robert L. Balliot. September, 1999 (204.17.98.73/midlib/tutor.htm)

  33. Web Search Engines... • Advantages of search engines: • Best suited for complex keyword/ concept searches • Control over search: search terms can be combined as required • Searches can be limited to period of time, fields, source type, etc. • Currency of information, made possible by regular addition by web spiders • Exhaustive information can be retrieved (with lots of patience!) • Disadvantages: • Time consuming • False positives • Search engines vary in terms of search techniques/ syntax • Dead links, redundant links (same document gets displayed) • Spamming (‘salting’ of pages) • Higher ranking of paying sites

  34. Web Search Engines... • Limitations of web search engines: • Poor retrieval effectiveness (relevance) as little vocabulary control is exercised by web site developers and the index engines • Different search engines return different search results due to the variation in indexing and search process (40% non-overlap) • None of the search engines comes close to indexing the entire web, much less the entire Internet. Content not indexed: • PDF documents • Content that requires log in • Databases searched using CGI programs • Web content on intranets behind firewalls
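
The non-overlap between two engines' results can be measured directly. The slide does not say how its 40% figure was computed, so the definition below (share of all distinct results that appear in only one of the two lists) is an assumption, and the result lists are hypothetical.

```python
def non_overlap(results_a, results_b):
    """Fraction of distinct results returned by only one of the two engines."""
    a, b = set(results_a), set(results_b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)

# Hypothetical result lists from two engines for the same query.
engine_a = ["u1", "u2", "u3", "u4"]
engine_b = ["u3", "u4", "u5"]
print(non_overlap(engine_a, engine_b))
```

A high value is a practical argument for repeating an important search on more than one search tool.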

  35. Top Sites The top sites on the web, ordered by Alexa Traffic Rank. • 1. Google • 2. Facebook • 3. Youtube • 4. Yahoo • 5. Live • 6. Baidu • 7. Wikipedia • 8. Blogger • 9. MSN • 10. Tencent • 11. Twitter

  36. Google • Enables users to search the Web, images, etc. • Features: PageRank, caching and translation, an option to find similar pages. • The focus is developing search technology. • Ranked #1 in the world according to the three-month Alexa traffic rankings.

  37. Yahoo!  • yahoo.com • Personalized content and search options. Chatrooms, free e-mail, clubs, and pager. • Ranked #4 in the world • The site is in the “Web Portals” category.

  38. Wikipedia  • wikipedia.org • An online collaborative encyclopedia. • Wikipedia is ranked #7 in the world • It has been online for at least nine years. • The site's audience tends to be users who browse from school and work

  39. Meta Search Tools... • When to use meta search tools? • Need to be used cautiously • Good for simple searches, particularly if search terms are distinctive or unique • Good for testing with a few keywords – and find which individual search engine returns good results • Good for ‘quick and dirty searching’ if you are in a hurry and want to find a few relevant sites quickly • For complex searches, involving many search terms, Boolean logic, etc., it is better to use individual search engines

  40. Meta Search Tools... • Advantages: • Query can be run across multiple search engines • User needs to learn only the search interface of the meta search tool • Better results: retrieves top-ranking pages from individual search engines • Disadvantages: • Unique features of individual search engines are lost • Not exhaustive: use only top results returned by search engines

  41. People Finding Tools • Using people finding tools: • Person should have registered in the tool(s) • Searcher should know both surname and first name, else too many names will be retrieved • Bias for U.S.-based people • Often, required e-mail cannot be retrieved through these tools • Alternatively, any search engine may be used (phrase search using person’s name) • If person’s affiliation is known, Yahoo! Directory may be used to locate the institution and e-mail

  42. Web Search Strategies • Tips for effective web searching: • Broad or general concept searches: start with directory-based services (want a few highly relevant sites for a broad topic) • Highly specific topics, or topics with unique terms/ many concepts: use the search tools • Go through the ‘help’ pages of search tools carefully • Gather sufficient information about the search topic before searching • Spelling variations, synonyms, broader and narrower terms • Use specific keywords; rare/unusual words are better than common ones

  43. Web Search Strategies... • Tips for effective web searching… • Prefer phrase & adjacency searching to Boolean (‘stuffed animal’ rather than ‘stuffed’ and ‘animal’) • Use as many synonyms as possible - search engines use statistical retrieval methods and produce better results with more query words • Avoid use of very common words (e.g., ‘computer’) • Enter search terms in lower case. Use upper case to force exact match (e.g. ‘Light Combat Aircraft’, ‘LCA’) • Use ‘More like this’ option, if supported by the search engine (e.g. Excite, Google)

  44. Web Search Strategies... • Tips for effective web searching… • Repeat the search by varying search terms and their combinations; try this on different search tools • Enter most important terms first - some search tools are sensitive to word order • Use the NOT operator to exclude unwanted pages (e.g.: bio-data, resumes, courses) • Go through at least 5 pages of search results before giving up the scan • Select 2 or 3 search tools and master the search techniques

  45. Sample Web Searches • “Companies dealing with polymers” • Do not use search engines (too many irrelevant hits) • Use directory sources (e.g. www.yahoo.com) • Follow the categories: • Business and Economy • Business-to-Business • Chemicals • Do a sub-search on ‘Polymers’ • Use specialty search engines (e.g. www.bizweb.com)

  46. Guides to Search Tools • www.beaucoup.com (guide to 2,000+ search engines, indices and directories) • www.searchpower.com (a very comprehensive search engine directory - claims over 16,000 search engine listings!) • www.123go.com/drw/search/search.htm (Dr. Webster’s Big Page of Search Engines ) • www.finderseeker.com (The search engine of search engines) • www.virtualfreesites.com (Over 1,000 specialised search engines)

  47. Keeping Current • AskScott (www.askscott.com): Provides a very comprehensive tutorial on search engines • SearchEngineWatch (www.searchenginewatch.com): The site offers information about new developments in search engines and provides reviews and tutorials. • Botspot (www.botspot.com): Collection and guide to variety of bots (intelligent agents)

  48. Web Search Engines... • Demonstration of search engines: • Fastsearch (www.alltheweb.com) • Altavista (www.altavista.com) • Google (www.google.com) • Northernlight (www.northernlight.com)
