Intermediate Internet Searching Or How to really find information on the internet Shayna Keces Reference Librarian
Agenda • Size of Internet • Types of search engines • Search strategies • Choosing a search engine • Interpretation of search results
Size of Internet/World Wide Web • July 2000: 2.1 billion web pages, estimated 4 billion pages by early 2001 (some estimates are much higher if the invisible or deep web is counted) • Size of search engine databases • Google 2 billion • Fast (alltheweb) 625 million • AltaVista 550 million • Yahoo 2 million catalogued (uses Google for pages not catalogued)
Search strategies • Do not • use the Search button • use a string of keywords without specifying Boolean operators • use upper case unless it is part of your strategy • use NOT or - unless you are absolutely sure it is necessary • it can eliminate pages you did not anticipate • its format is not standardized
Search Strategies • Do • Consider what type of resource will best answer your question and search for that resource (e.g. a dictionary or a certain type of web page) • Think of a list of keywords that will narrow or broaden your search, keeping in mind that on the internet, narrowing your search is usually better • Stick to a small list of search engines and learn the search syntax of the search engine you are using
Boolean Search • Developed by mathematician George Boole • OR widens a search • AND and AND NOT narrow a search • Parentheses are used to group operations that have to be done together: (Public libraries OR bookstores) AND (Ottawa OR Nepean OR Gloucester OR Goulbourn OR Carp)
Group A AND Group B • Group A: Public libraries OR bookstores • Group B: Ottawa OR Nepean OR Gloucester OR Kanata OR Goulbourn OR Carp
Senators AND NOT hockey • Diagram: two overlapping circles labelled hockey and Senators • Orange area = Senators AND NOT hockey
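The Boolean operators above behave like set operations on the pages that contain each keyword. The following is a minimal sketch in Python, assuming a toy index whose page numbers are invented purely for illustration; no real search engine is queried.

```python
# Toy index: each keyword maps to the set of page IDs that contain it.
# The page IDs are made up for illustration only.
index = {
    "public libraries": {1, 2, 3},
    "bookstores": {3, 4},
    "Ottawa": {1, 4, 5},
    "Nepean": {2, 6},
    "Senators": {7, 8, 9},
    "hockey": {8, 9, 10},
}

# OR widens a search: a page qualifies if it matches either term (set union).
group_a = index["public libraries"] | index["bookstores"]
group_b = index["Ottawa"] | index["Nepean"]

# AND narrows a search: a page must appear in both groups (set intersection).
# Building each group first plays the role of the parentheses.
group_a_and_b = group_a & group_b

# AND NOT narrows a search: drop every page that also matches the second term.
senators_not_hockey = index["Senators"] - index["hockey"]

print(sorted(group_a))              # [1, 2, 3, 4]
print(sorted(group_a_and_b))        # [1, 2, 4]
print(sorted(senators_not_hockey))  # [7]
```

Note how AND NOT removed pages 8 and 9 even though they mention Senators; this is why NOT can eliminate pages you did not expect to lose.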
Types of search engines • Keyword or robot based (builds a database) • Directory based (categories indexed by people rather than computer) • Annotated directory-based search engines • Meta indexes (can combine searches or allow you to search a variety of engines individually) • Specialized search engines
Keyword or robot based Search Engines • Large database of web pages • No human involvement and no quality control • Websites can be submitted, or the robot will find some on its own • Searches full text to a certain level; does not search the deep or invisible web • Google (www.google.com) • Alta Vista (www.altavista.com) • Hotbot (www.hotbot.com) • Fast (www.alltheweb.com)
Google (www.google.com) • Presently the largest database (1.5-2 billion) • Very sophisticated placement of results, particularly good for popular sites and company sites • Advanced search can limit search to title of page or to URL • Implied AND • + for stop words
Google (www.google.com) cont. • If you want OR, it needs to be expressed in caps • Not case sensitive • No stemming • Description shows keywords in context • Cached pages
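The Google behaviours listed above (implied AND, + to force a stop word, no case sensitivity, no stemming) can be imitated with a small matcher. This is only an illustrative sketch: the stop-word list and the function names `parse_query` and `matches` are assumptions made for the example, not Google's actual implementation.

```python
# Hypothetical stop-word list, chosen only for this example.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "how"}

def parse_query(query):
    """Return the lowercase terms that the query would require."""
    terms = []
    for word in query.split():
        if word.startswith("+") and len(word) > 1:
            terms.append(word[1:].lower())   # + keeps a word that would be dropped
        elif word.lower() not in STOP_WORDS:
            terms.append(word.lower())       # lowercase: not case sensitive
    return terms

def matches(query, page_text):
    """Implied AND: every term must appear as a whole word (no stemming)."""
    words = set(page_text.lower().split())
    return all(term in words for term in parse_query(query))

print(parse_query("The Who"))    # ['who']
print(parse_query("+The Who"))   # ['the', 'who']
print(matches("library Ottawa", "Ottawa Public Library hours"))    # True
print(matches("libraries Ottawa", "Ottawa Public Library hours"))  # False
```

The last line fails to match because, without stemming, "libraries" and "library" are treated as different words.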
AltaVista (www.altavista.com) • One of the larger search engines • Particularly good for finding less popular sites • Probably implied OR, but this often changes • Case sensitive when word is in quotations • Stemming with * at end or in middle of words • Search within these results • Sophisticated search of elements, URL, text, etc. http://help.altavista.com/adv_search/syntax
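The effect of a * at the end or in the middle of a word can be tried out with Python's standard `fnmatch` module. This is a rough sketch of the idea only, assuming a small made-up word list; the helper name `wildcard_match` is hypothetical, and `fnmatch` also gives special meaning to `?` and `[ ]`, which AltaVista's syntax may not share.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, word):
    """Return True if word fits the pattern, where * stands for any letters."""
    # Lowercase both sides so the comparison ignores case.
    return fnmatchcase(word.lower(), pattern.lower())

# Made-up word list for illustration.
words = ["library", "libraries", "librarian", "liberal", "colour", "color"]

# * at the end of a word picks up different endings.
print([w for w in words if wildcard_match("librar*", w)])
# ['library', 'libraries', 'librarian']

# * in the middle of a word picks up spelling variants.
print([w for w in words if wildcard_match("colo*r", w)])
# ['colour', 'color']
```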
AltaVista Advanced Search • Has guided search as well as a blank space for true Boolean search using Boolean terms and parentheses • Must use Boolean operators or equivalent symbols (not + and -) • No operators implies a phrase • Has a sort-by feature which can be used to determine how results are ordered • Can specify dates of last modification
Directory-based Search Engines • Indexed by individuals, so subject searches will be more accurate • Smaller database than robot engines • Used mainly for finding a good site on a general topic • Yahoo (www.yahoo.com or ca.yahoo.com) • About (about.com or home.about.com/aboutcanada) • Looksmart (www.looksmart.com)
Yahoo (ca.yahoo.com) • Most popular of the directory-based search engines • Many different versions (international versions have the same pages as the others, but local options are supplied first) • Uses Google as search engine • Can search by categories and move up and down the category structure by clicking on a category and looking at the hierarchy
About (about.com or home.about.com/aboutcanada) • Another popular directory-based search engine • Volunteer guides responsible for finding good websites on appropriate subjects • Some guides exist on all versions of About, but geographic versions have items specific to the country
Annotated directory-based search engines • Because entries are annotated, the database is even smaller than that of a directory-based engine • Quality of web pages is better • Web pages often rated • Librarian’s Index to the Internet (lii.org) • Argus Clearinghouse (www.clearinghouse.net)
Argus Clearinghouse (www.clearinghouse.net) • Topical list of fairly scholarly guides submitted to Argus on a variety of subjects • Can have more than one guide or page on the same subject • Not all are accepted; all are objectively rated by Argus staff and the detailed rating is available • Because Argus does not solicit web pages, coverage is uneven • Date of rating is also provided
Meta indexes • One site searches more than one search engine • Results can be separated or combined • Sometimes there is a problem in interpreting the question for all search engines • Use if you are not sure which search engine will give you the best results and/or for obscure topics
Meta indexes examples • Dogpile (www.dogpile.com) • Metacrawler (www.metacrawler.com/index.html) • Surfwax (www.surfwax.com) • All4one Search machine (www.all4one.com)
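How a meta index might combine results from several engines can be sketched as follows. This is one possible approach (round-robin interleaving with duplicates dropped), not a description of how Dogpile, Metacrawler or the others actually work; the engine names, URLs and the function `combine_results` are placeholders invented for the example.

```python
def combine_results(results_by_engine):
    """Merge ranked URL lists from several engines, dropping duplicates.

    Takes the top hit from each engine, then the second hit from each,
    and so on. Returns (url, engine) pairs, where engine is the first
    engine that supplied the URL.
    """
    combined = []
    seen = set()
    longest = max((len(urls) for urls in results_by_engine.values()), default=0)
    for rank in range(longest):
        for engine, urls in results_by_engine.items():
            if rank < len(urls) and urls[rank] not in seen:
                seen.add(urls[rank])
                combined.append((urls[rank], engine))
    return combined

# Placeholder result lists; no real engine is contacted.
sample = {
    "engine_a": ["example.org/a", "example.org/b"],
    "engine_b": ["example.org/b", "example.org/c", "example.org/d"],
}

for url, engine in combine_results(sample):
    print(url, "from", engine)
```

Keeping the engine name with each URL corresponds to the "separated" view; ignoring it gives the "combined" view.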
Specialized Search Engines • Geographic based (www.altavistacanada.com, http://www.ottawastart.com/) • Phone directories (canada411.sympatico.ca/, home.infospace.com/) • Newsgroup searching (groups.google.com) • Women’s information (wwwomen.com)
Specialized sites • Ottawa Public Library (www.library.ottawa.on.ca) • Reference tools (see library reference sites, e.g. lii.org, www.ipl.org/ref) • Encyclopedias (www.britannica.com, Columbia encyclopedia www.bartleby.com/65/) • Canadian information (vrl.tpl.toronto.on.ca/, Canadian information by subject www.nlc-bnc.ca/caninfo/ecaninfo.htm, Canadian encyclopedia online www.thecanadianencyclopedia.com/)
Some hints on selecting search strategies • For a page on a general topic where you need an introduction, try a directory-based search engine. If you do not need a specific level of quality, you can use the address bar search • For the web page of a major company or organization, try Google, or Alta Vista if it is more obscure • For a specific web page that would not necessarily be popular, try Alta Vista
Some hints on selecting search strategies cont. • For health topics, try a health website engine like www.medbroadcast.com or the health links on the OPL web page • For a very obscure topic, try Google or Alta Vista or one of the meta indexes
Interpretation of search results • Look at the results and reformulate the search, for example by searching within results and adding new keywords • Analytically choose which sites to look at in the result list • Anatomy of URL: domain + type of name • Do not look through pages and pages of results. If the first three pages are not promising, redo the search
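Reading the anatomy of a URL can be practised with Python's standard `urllib.parse` module. The sketch below assumes that "type of name" refers to the ending of the domain (such as .org, .edu or .ca); the helper name `describe_url` is hypothetical and the URLs are taken from this presentation.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into its host, the ending of the host name, and its path."""
    # Addresses written without http:// need a leading // so that
    # urlsplit treats the first part as the host rather than the path.
    if "//" not in url:
        url = "//" + url
    parts = urlsplit(url)
    host = parts.hostname or ""
    ending = host.split(".")[-1] if host else ""
    return {"host": host, "ending": ending, "path": parts.path}

print(describe_url("www.library.ottawa.on.ca"))
# {'host': 'www.library.ottawa.on.ca', 'ending': 'ca', 'path': ''}

print(describe_url("http://www.ipl.org/ref"))
# {'host': 'www.ipl.org', 'ending': 'org', 'path': '/ref'}

print(describe_url("gateway.lib.ohio-state.edu/tutor/les5/"))
# {'host': 'gateway.lib.ohio-state.edu', 'ending': 'edu', 'path': '/tutor/les5/'}
```

An ending such as edu or ca, together with the host name, gives a quick hint about who is behind a page before you click on it.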
Some useful tutorials for searching • See “Learning to search” section of Collection of special search engines www.leidenuniv.nl/ub/biv/specials.htm • Web searching tips www.searchenginewatch.com/facts/index.html • Net tutor (gateway.lib.ohio-state.edu/tutor/les5/) • Check links under Internet, General in OPL adult links (www.library.ottawa.on.ca)
To find more info on search engines • Searchenginewatch (www.searchenginewatch.com) • Searchengineshowdown (www.searchengineshowdown.com)