Search Engine Strategies . . . Beyond Yahoo. Presented by. Linda J. Goff, Head, Instructional Services CSUS Library Spring, 2002. Today’s Agenda. Web Structure, Jargon & definitions. How search engines think and work. Picking the right web search tool.
Search Engine Strategies . . . Beyond Yahoo. Presented by Linda J. Goff, Head, Instructional Services, CSUS Library, Spring 2002
Today’s Agenda • Web Structure, Jargon & definitions. • How search engines think and work. • Picking the right web search tool. • Searching techniques & tips. • Evaluating your sources - thinking critically about information. • Demonstration.
Glossary: Browser, Cache, Channel, Cookies, html, http, hypertext link, Invisible Web, Metasearch, portal sites, SacLink, telnet, URL
What is the World Wide Web? • The World Wide Web (WWW) is a global interactive, dynamic, cross-platform, graphical hypertext information system that runs on the Internet.
The Web is Growing Exponentially • Internet users worldwide were estimated at 513.41 million as of Aug. 2001. Source: http://www.nua.ie/surveys/how_many_online/
In the beginning there were 2 types of search tools: • Hierarchical - organized (by people) along a classification system, known as a Web Directory or Subject Tree, using channels or subcategories. • Standard Search Engines – used Robots or ‘bots, which search the web using mathematical algorithms and Boolean search terms.
Hierarchical • Organized (by people) along a classification system, known as a Web Directory or Subject Tree, using channels or subcategories. • Many layers to get to the information. This was modeled on the traditional library classification system. (Think Yahoo!)
Robots or “Bots” … • Provided users with direct access to a list of web sites containing the words they searched. • Bots searched using a mathematical algorithm and Boolean search terms (think AltaVista, Hotbot, Lycos, …).
Standard search engines used to search by keyword and link keywords with the Boolean “OR” operator. Now they use “AND.”
Now Web search tools can ... • Search multiple search engines simultaneously. • Find sites that answer natural language questions. • Rank sites by how many links have been made to them. • Sort matches into folders by category. • Combine the above.
Webbrain.com • Even Hierarchical or directory search engines have been updated. • Webbrain rearranges the categories and sub-categories visually as you click on a new topic.
Dogpile.com • Metasearch engine – “fetches” simultaneously across multiple search engines and displays top sites in each • Warning: Some now charge for higher listings, e.g., Overture
AskJeeves.com • Type your questions in Natural Language. • Jeeves responds with one or more closely related questions that it already knows the answer to. • Some have drop-down menus to select from.
Google.com • Result rankings are based on the number of links made to the site from other web pages. • Gives you sites that web page creators have “voted” for with their links. • A link from an .edu page counts more than one from a .com page.
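The link-counting idea on this slide can be sketched in a few lines of Python. This is a toy illustration only, not Google's actual ranking method: the page names and link graph are invented, and the extra weight given to .edu links (`edu_weight`) simply mirrors the slide's claim.

```python
from collections import defaultdict

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "library.example.edu": ["topic-a.example.org", "topic-b.example.com"],
    "shop.example.com": ["topic-a.example.org"],
    "blog.example.com": ["topic-a.example.org"],
    "news.example.com": ["topic-b.example.com"],
}

def rank_by_inbound_links(links, edu_weight=2.0):
    """Score each page by its weighted count of inbound links."""
    scores = defaultdict(float)
    for source, targets in links.items():
        # Assumption taken from the slide: a link from an .edu page counts more.
        weight = edu_weight if source.endswith(".edu") else 1.0
        for target in targets:
            scores[target] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

ranking = rank_by_inbound_links(LINKS)
for page, score in ranking:
    print(f"{score:4.1f}  {page}")
```

Each inbound link acts as one “vote”; the page with the most weighted votes is listed first.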
Vivisimo.com • Queries one or more web search engines (Metasearch). • Clusters documents into groups based on this information. • Orders the groups and the documents within each group. • Displays the hierarchical categories.
Part 2 How Search Engines Think and Work
Most search engines and databases use Boolean Operators to create search statements, e.g., (domestic or family) and violence not sexual abuse
Boolean Operators • AND requires both terms to appear in the items that are retrieved. • OR requires either term to appear in the items that are retrieved. • NOT excludes a term.
Boolean Search Strategy • a AND b: family and violence • a OR c: family or domestic • b NOT d: violence not sexual abuse
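One way to picture the three operators is as set operations on the documents that contain each term. The sketch below is illustrative only: the index and document IDs are made up, and real search engines work on far larger indexes.

```python
# Hypothetical mini-index: each search term maps to the IDs of documents containing it.
INDEX = {
    "family": {1, 2, 5},
    "domestic": {2, 3},
    "violence": {1, 2, 3, 4},
    "sexual abuse": {3, 4},
}

family = INDEX["family"]
domestic = INDEX["domestic"]
violence = INDEX["violence"]
sexual_abuse = INDEX["sexual abuse"]

# AND: both terms must appear (set intersection).
family_and_violence = family & violence
# OR: either term may appear (set union).
family_or_domestic = family | domestic
# NOT: drop documents that contain the excluded term (set difference).
violence_not_abuse = violence - sexual_abuse

# A full search statement: (domestic or family) and violence not sexual abuse
combined = ((domestic | family) & violence) - sexual_abuse
print(sorted(combined))  # prints [1, 2]
```

AND narrows a search, OR broadens it, and NOT removes unwanted matches.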
What Search Engines Don’t Search • ‘Bots only crawl the visible web which is only about 20% of everything that is on the Internet. • They don’t look at the “Deep Web”, or “The Invisible Web.”
Invisible Web contains • Commercial databases that charge a fee, e.g., library research databases of periodical articles. • Sites that require membership or a login. • Searchable pages such as catalogs, phone books or directories, e.g., AMA Physician Search.
Library Databases Access • Authentication automatic for users with Web access via CSUS and SacLink. • CSUS users with other Internet Service Providers (AOL, Prodigy etc.) must use Library Proxy Server for authentication to access Library databases. • To connect from off campus go to http://www.lib.csus.edu/databases/help/page.
Which should you use? • Use a hierarchical directory like Yahoo! for browsing categories or to see interesting web pages that have been reviewed (e.g., general travel information). • Use a linking or ranking search tool like Google if you want to see what other users have already validated by their use.
How to choose … • Use a “Metasearch Engine” like Vivisimo or Dogpile if you want to see the sites that rank highest across the most popular search engines. • Use an “Answer Engine” like AskJeeves if you want to use plain English, natural language.
If those don’t work, then • Use a standard search engine like Hotbot. • Use “Advanced Search” or create a complex Boolean search in advanced search mode, e.g., (smoking or tobacco) and nicotine addiction. • Best for looking for a specific person, place or topic or for connecting ideas.
Search Engine Comparisons • Most have built-in search tips or help screens. • Boolean operators, phrase searching and other limiters are often available. • Be aware! Some now charge for higher page placement, e.g., Overture.
Handout • See the “Searching the Web” handout of special search features and URLs for most popular search engines. http://libweb.uoregon.edu/network/srchweb.html
Legend: FAST = FAST, WT = WebTop.com, GG = Google, INK = Inktomi, AV = AltaVista, NL = Northern Light, EX = Excite, Go = Go (Infoseek). Source: SearchEngineWatch.com, as of Aug. 15, 2001
There are specialized search engines for almost every topic • For a list of over 3,000 search engines go to Search Engine Guide: http://www.searchengineguide.com • For detailed information aimed at search professionals try SearchEngineWatch: http://www.searchenginewatch.com
Part 3 Search Tips & Strategies
Reading Parts of the URL http://www.csus.edu/csuslibr/ • The part before the colon is the access method or protocol (hypertext transfer protocol). • The part after the double slashes is the net address or domain name of the computer where the resource is located. • The directory path and filename come after the next slash.
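The same breakdown can be reproduced with `urlsplit` from Python's standard `urllib.parse` module. This is a small illustration using the URL from the slide; the printed labels are just descriptive text.

```python
from urllib.parse import urlsplit

url = "http://www.csus.edu/csuslibr/"
parts = urlsplit(url)

print("protocol:   ", parts.scheme)    # the part before the colon
print("domain name:", parts.hostname)  # the part after the double slashes
print("path:       ", parts.path)      # directory path and filename
```

Here the protocol is `http`, the domain name is `www.csus.edu`, and the path is `/csuslibr/`.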
Common Codes in Domain Names • edu - higher education • com - commercial firms (+22 million) • gov - government agencies • mil - military (US) • org - general noncommercial organizations • net - computer networks • int - international organizations • State or country of origin: uk (United Kingdom), ca (Canada), ca.us (California, United States)
New Suffixes added by ICANN, effective Spring 2002 • .info (anyone) • .biz (business) • .name (individuals) • .pro (professionals) • .museum • .aero (airlines) • .coop (business cooperatives)
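As a rough sketch of how these codes can be read off a host name, the lookup below matches the longest known suffix. The suffix meanings are copied from the slide; the function name `describe_domain` and the example host names are hypothetical.

```python
# Suffix meanings copied from the slide; the lookup function itself is hypothetical.
DOMAIN_CODES = {
    "edu": "higher education",
    "com": "commercial firms",
    "gov": "government agencies",
    "mil": "military (US)",
    "org": "general noncommercial organizations",
    "net": "computer networks",
    "int": "international organizations",
    "uk": "United Kingdom",
    "ca": "Canada",
    "ca.us": "California, United States",
}

def describe_domain(hostname):
    """Return the meaning of the longest known suffix of hostname, or None."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest suffix first so that "ca.us" is found before any shorter match.
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in DOMAIN_CODES:
            return DOMAIN_CODES[suffix]
    return None

print(describe_domain("www.csus.edu"))
print(describe_domain("www.example.ca.us"))
```

A suffix only hints at who runs a site; it does not by itself establish that the content is reliable.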
Think critically about the information you find on the Web... • Anybody can publish anything on the Web. • There are no editors and no central authorities. • There are no guarantees that the site you find will be there next time you look.
Questions you should ask when evaluating a Web page: • Who is the author or sponsor? • What authority/expertise do they have? • What is the purpose/scope of the page? • Is it current? When was it last updated? • How complete and accurate is the information? Does it have a bias? • How usable is it? Do the links work?
You must... • Examine assumptions and possible biases. • Distinguish between fact and opinion. • Compare and contrast related pieces of information from other sources (print and online).
Bogus sites proliferate • POP! the First Human Male Pregnancy • http://www.malepregnancy.com • Dihydrogen Monoxide Research • http://www.dhmo.org/ • Clones-R-Us • http://www.d-b.net/dti/
Sites need to be examined carefully and compared • Martin Luther King Jr. – A Historical Examination • http://www.mlking.org • The King Center http://web.archive.org/web/20010208160923/http://thekingcenter.org/ • http://www.thekingcenter.com/
Web Searching Tips • Use unique words or phrases. • Check spelling! • Use synonyms or multiple spellings (e.g., marijuana, marihuana). • Try more than one search engine. • Use words like “research” or “policy” to find more scholarly sites. • Use the domain limit feature, e.g., domain:edu or domain:gov.
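Several of these tips can be combined into one search statement. The helper below is a hypothetical sketch: `build_query` is an invented name, the `domain:` limiter follows the syntax shown on the slide, and quoting multi-word phrases is an assumption about the engine. Check each search engine's help screen for its actual syntax.

```python
def build_query(synonym_groups, domain=None):
    """Combine synonym groups into one Boolean query string.

    Terms inside a group are joined with OR; groups are joined with AND.
    """
    clauses = []
    for group in synonym_groups:
        # Assumption: the engine treats double-quoted text as a phrase.
        terms = [f'"{term}"' if " " in term else term for term in group]
        clause = " OR ".join(terms)
        # Parenthesize only when a group holds more than one alternative.
        clauses.append(f"({clause})" if len(terms) > 1 else clause)
    query = " AND ".join(clauses)
    if domain:
        # "domain:gov" follows the syntax shown on the slide; other engines differ.
        query += f" domain:{domain}"
    return query

query = build_query([["marijuana", "marihuana"], ["policy"]], domain="gov")
print(query)  # prints (marijuana OR marihuana) AND policy domain:gov
```

Grouping the alternate spellings with OR keeps either spelling from being missed, while the extra keyword and domain limit steer results toward more scholarly or official sites.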
Citing Electronic Sources Look for it on the Library Home Page under Databases and Periodical Indexes. Look on the left for Guides For General & News and click on Citing Electronic Sources. The URL is... http://www.lib.csus.edu/guides/budge/eography.htm.
WARNING • Con artists and scams are proliferating on the Web. • Don’t use your credit card number unless you are assured of a secure system. • Don’t download unfamiliar software. • Don’t give out personal information.
Browser Configuration Tips • Clear the memory cache before you begin a search session. It will speed up your response time. • Use the following path: Edit -> Preferences -> Advanced -> Cache (Netscape) • Set Cookies at the same screen. • Set Proxies at the same screen.
Shortcuts • Use Bookmarks or Favorites. • Use Go from the pull-down menus instead of the Back button, or use the History or right mouse button. • Use the Stop and Reload buttons if loading a document takes too long. • CTRL ALT DEL will bring up Windows 2000 Task Manager and you can close the browser if it is not responding.
This PowerPoint presentation was prepared by: Linda J. Goff, Head, Instructional Services, University Library, California State University, Sacramento. ljgoff@csus.edu http://www.lib.csus.edu/services/instruction/indiv/ LJG:3/19/2002