Search Tips, or In Competition With Search Robots. Inspired by Mary Ellen Bates' workshop "Tips From a Super Searcher: Getting the Most From the Web and Online Sources", Prague, 2003. Toshka Borisova, AUBG Freedom Forum Journalism Library Coordinator.
Search Tips
• The World Wide Web contains more information than any other single resource in existence today. Finding the information you are looking for among billions of web pages can be tough. This guide of search tips will put you on the road to finding information quickly and effectively.
• Web search tips
• The invisible web
Online Search Strategies
What are you looking for:
• Full text or abstracts?
• Current material or 10 years back?
• Basic or advanced material?
• Short or in-depth articles?
• Any "validating" sources?
• Exact match or something close?
• Leads to identify experts to call?
• White papers (official sets of proposals in specific policy areas), statistics and other information more likely to be on web sites?
Online Search Tips
• Use the "advanced search" option
• http://www.aubg.bg/library/text.php?i=680
• Google: well known as the "king of search," this engine has one of the largest databases of web pages in the world. Fast, accurate results are common here, and chances are good that if you can't find it in Google, it's not meant to be found.
Online Search Tips
• Plan on two separate search sessions
• Be sure to value your time. White paper on the true cost of searching the open web vs. professional online services: www.factiva.com/infopro/BusIntellletter.pdf
• Assume you will find something
• We have higher relevance expectations than our patrons
• Watch for what's not online
Online Search Tips
• Watch for references to "grey literature": "That which is produced on all levels of government, academics, business and industry in print and electronic formats, but which is not controlled by commercial publishers."
• Include www or http in your search strategies to find mentions of web sites
• Always use several tools for the same search
• Watch for alternate spellings and phrasings
• Use the same words in a different order
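The last two tips (alternate spellings, same words in a different order) can be applied mechanically. The sketch below is only an illustration: `query_variants` is a hypothetical helper name, and the spelling alternatives are supplied by hand.

```python
from itertools import permutations, product

def query_variants(term_options):
    """Build every query from a list of terms, where each term is given
    as a list of alternate spellings, trying every word order."""
    variants = []
    for choice in product(*term_options):      # pick one spelling per term
        for order in permutations(choice):     # try each word order
            query = " ".join(order)
            if query not in variants:
                variants.append(query)
    return variants

# Two spellings of the first term, one spelling of the second.
for q in query_variants([["grey", "gray"], ["literature"]]):
    print(q)
```

The number of variants grows quickly with the number of terms, so in practice this is most useful for two or three key words.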
Web Search Tips
• Use tools, not search engines. There is absolutely no pattern
• Wayback Machine http://www.archive.org/
• Purge your "assumptions cache" regularly
• Keep a trail of where you have been
• Be sure to value your time
Web Search Tips
• When exploring a site, use the Site Map or Site Index
• Use the [Search This Site] feature to find hidden pages
Know the "power tools" of each search engine
• Field searches
• File-type searches
• Limits by date, language, site
• Truncation
• Boolean
Search Tips
• Keyword Search: many search engines offer a keyword search by default
• Phrase Search
• Boolean Operators: named after mathematician George Boole, Boolean logic involves the operators AND, OR, NOT, and occasionally NEAR
Online Search Tips
Keyword Search
• Use KWIC (Key Word In Context). Try to find synonyms and acronyms: http://www.keyworddensity.com/ http://www.wordtracker.com/
• Search for key words in the title
• Use the "at least X times" feature: DJI/Factiva, LexisNexis, Dialog
Web Search Tips
Phrase Searching
Requires the terms to appear in the exact order in which they are typed. Most systems that allow phrase searching have the user enter the phrase in quotes: "national endowment for the arts"
• "Phrase Searching": supported by all
• Google: phrases may not be on the page
• Teoma: "Not always exact matches" (FIXED)
• Openfind: debuting in beta form on July 5, 2002, Openfind is a new, large, independently built search engine, initially claiming 3.5 billion pages. It is based on research in Taiwan and has a Chinese version as well. None available now
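To make the difference between a keyword search and a phrase search concrete, here is a minimal sketch. It assumes a very simple tokenizer and does not describe how any particular engine works; the function names and sample sentences are invented for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_match(query, text):
    """True if every query word appears somewhere in the text, in any order."""
    words = set(tokenize(text))
    return all(term in words for term in tokenize(query))

def phrase_match(phrase, text):
    """True if the query words appear next to each other in the typed order."""
    terms, words = tokenize(phrase), tokenize(text)
    n = len(terms)
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))

query = "national endowment for the arts"
doc_a = "The National Endowment for the Arts awards grants."
doc_b = "An endowment for the national arts council."

print(keyword_match(query, doc_a), phrase_match(query, doc_a))
print(keyword_match(query, doc_b), phrase_match(query, doc_b))
```

Both sample sentences contain all five words, so both satisfy the keyword search, but only the first contains them in the exact order required by the quoted phrase.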
Web Search Tips
Boolean operators: just use them wisely
• Simple ANDs, ORs
  - Narrows results
• Boolean NOT ( - )
  - Exclude a meaning
  - Exclude domains
• Boolean OR
  - Crucial synonyms
  - Need more pages
Web Search Tips
To OR or not to OR
• Google: OR in CAPS, or use the advanced search form
  - Does not always work right
  - yellowstone bison OR buffalo
• AlltheWeb: use ( ) or the Advanced Boolean Box
  - yellowstone (bison buffalo)
• AltaVista: normal
  - yellowstone AND (bison OR buffalo)
• Gigablast: use + (but not the same)
  - +yellowstone bison buffalo
• Teoma
  - yellowstone bison OR buffalo
  - Becomes (yellowstone AND bison) OR buffalo
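The Teoma example shows why grouping matters. The sketch below models Boolean operators as set operations over a tiny, made-up collection of four documents; the document texts and the `having` helper are assumptions for illustration only.

```python
# Hypothetical mini-collection: document id -> text.
docs = {
    1: "yellowstone bison herd",
    2: "yellowstone buffalo sightings",
    3: "buffalo new york weather",
    4: "yellowstone geysers",
}

def having(term):
    """Return the ids of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}

yellowstone, bison, buffalo = having("yellowstone"), having("bison"), having("buffalo")

# AND is set intersection, OR is union, NOT is difference.
intended = yellowstone & (bison | buffalo)     # yellowstone AND (bison OR buffalo)
misgrouped = (yellowstone & bison) | buffalo   # (yellowstone AND bison) OR buffalo
excluded = buffalo - having("york")            # buffalo NOT york

print(sorted(intended))    # [1, 2]
print(sorted(misgrouped))  # [1, 2, 3]
print(sorted(excluded))    # [2]
```

With the misgrouped reading, the page about Buffalo, New York slips into the results even though it never mentions Yellowstone, which is the kind of surprise the slide warns about.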
Web Search Tips
• Proximity
  - Text matching
  - Citation hunt
  - Plagiarism check
  - Q&A
• NEAR and other proximity operators
  - AltaVista only
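A proximity operator such as NEAR can be thought of as a test on word positions. The following is a rough sketch under that assumption; the `window` size is a parameter chosen here for illustration, not a documented setting of any engine.

```python
import re

def near(text, first, second, window=10):
    """True if the two words occur within `window` words of each other,
    in either order."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos_first = [i for i, w in enumerate(words) if w == first]
    pos_second = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= window for i in pos_first for j in pos_second)

sentence = "Bison graze near the river where buffalo once roamed"
print(near(sentence, "bison", "buffalo", window=3))   # False: six words apart
print(near(sentence, "bison", "buffalo", window=10))  # True
```

A tight window suits citation hunting and plagiarism checks, where the words are expected to sit almost side by side.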
Web Search Tips
• Truncation: searches for variants of a word by using a symbol to represent one or more characters. The most common symbols are * (asterisk), ? (question mark), and ! (exclamation mark). If truncation is not supported by the search engine, use the Boolean operator OR to combine like terms.
• AltaVista truncation
• HotBot & MSN truncation
• Another term is "stemming": MSN (e.g., finds "movies" if your search word is "movie")
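When an engine does not support truncation, the OR fallback can be built by hand. This sketch assumes you already have a list of candidate words (the `vocabulary` here is made up) and simply turns a stem into an OR query.

```python
def expand_truncation(stem, vocabulary):
    """Return the vocabulary words that begin with the stem, alphabetically."""
    return sorted(word for word in vocabulary if word.startswith(stem))

def as_or_query(words):
    """Combine like terms with the Boolean operator OR."""
    return "(" + " OR ".join(words) + ")"

vocabulary = {"movie", "movies", "moving", "librarian", "library", "libraries"}

variants = expand_truncation("librar", vocabulary)
print(as_or_query(variants))  # (librarian OR libraries OR library)
```

The resulting string can be pasted into any engine that accepts OR, which stands in for a search on librar* where the asterisk is not recognized.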
Web Search Tips
• Case sensitive (alaskan pipeline, with the incorrect lowercase "a")
• AltaVista Advanced or Quoted Simple
• MIT vs. mit or IT vs. it
Web Search Tips
• Wild Card Word in Phrase
Wild card characters represent undefined letters or numerals in a search term. They allow for retrieval of:
  - Singular and plural word forms
  - Spelling variations (e.g., British/American spellings)
  - Word stems with prefixes and suffixes
* represents zero to any number of characters at the beginning or end of a term.
  - *GROW* retrieves GROW, GROWS, OUTGROWTH
? represents exactly one character within a term...
  - T??TH retrieves TEETH, TOOTH, TRUTH
...or one character at the end of a term.
  - AMIN? retrieves AMINE, AMINO
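The wild card rules above happen to match the * and ? conventions of Python's standard `fnmatch` module, so the slide's examples can be checked directly. The word list is an assumption added for the demonstration; AMINES is included only to show that ? stands for exactly one character.

```python
from fnmatch import fnmatchcase

terms = ["GROW", "GROWS", "OUTGROWTH", "TEETH", "TOOTH", "TRUTH",
         "AMINE", "AMINO", "AMINES"]

def retrieve(pattern, candidates):
    """Return the candidate terms that match a * / ? wild card pattern."""
    return [term for term in candidates if fnmatchcase(term, pattern)]

print(retrieve("*GROW*", terms))  # ['GROW', 'GROWS', 'OUTGROWTH']
print(retrieve("T??TH", terms))   # ['TEETH', 'TOOTH', 'TRUTH']
print(retrieve("AMIN?", terms))   # ['AMINE', 'AMINO']
```

Real search engines differ in which symbols they accept and where in a term they allow them, so this only illustrates the matching logic.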
Web Search Tips
• Field Searching
Field searching allows the searcher to designate where a specific search term will appear. Rather than searching for words anywhere on a web page, fields define specific structural units of a document. The title, the URL, an image tag, or a hypertext link are common fields on a web page.
• How search engines work
  - Spidering program: collects links
  - Indexing program: includes metatags
  - Search/retrieval program: sorts results
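The three programs can be pictured with a toy example. Everything below is a simplified sketch over a made-up, in-memory set of three pages (the `a.example` addresses are hypothetical); real engines fetch pages over the network and rank results, whereas this version just sorts them by URL.

```python
from collections import defaultdict

# Hypothetical pages: URL -> page text and outgoing links.
pages = {
    "a.example/home": {"text": "search tips for the web", "links": ["a.example/tips"]},
    "a.example/tips": {"text": "boolean search tips",
                       "links": ["a.example/home", "a.example/hidden"]},
    "a.example/hidden": {"text": "invisible web databases", "links": []},
}

def spider(start):
    """Spidering: follow links from the start page and list every page reached."""
    seen, queue = [], [start]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in pages:
            continue
        seen.append(url)
        queue.extend(pages[url]["links"])
    return seen

def build_index(urls):
    """Indexing: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url in urls:
        for word in pages[url]["text"].split():
            index[word].add(url)
    return index

def search(index, query):
    """Retrieval: return pages containing all query words, sorted by URL."""
    matches = [index.get(word, set()) for word in query.split()]
    return sorted(set.intersection(*matches)) if matches else []

index = build_index(spider("a.example/home"))
print(search(index, "search tips"))
print(search(index, "web"))
```

A page that no other page links to would never be reached by `spider`, which is one simple way content ends up outside a search engine's index.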
Web Search Tips
• Link Searching: pages include a link to the specified URL. Useful for link updates and impact analysis
  - Best at AltaVista, AlltheWeb
  - Can have different results for http://www.name.org/
  - Example: http://www.freedomforum.org/ finds pages with links to this site
• Title: title:searching will look for the word "searching" in the title of a web page. Hits have the term(s) in the HTML title element. Example: title:"search engines"
Web Search Tips
Field Searching
• IP: the page is in the specified IP range. Incomplete numbers are truncated. ip:216.32.120 finds any computer in 216.32.120.*
• Site: results are only from the specified site. site:nasa.gov finds pages at NASA's web site
• Suburl: pages have the term(s) somewhere in the URL (host name, path, or filename). suburl:searchenginewatch
• URL: the result must be exactly this URL and nothing else. url:www.slashdot.com/index.html
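The field operators above amount to simple tests on a page's address. The sketch below is one possible reading of those definitions, written with Python's standard `urllib.parse`; the function names are invented, and actual engines may apply the rules differently.

```python
from urllib.parse import urlparse

def site_match(url, site):
    """site: the page's host is the given site or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == site or host.endswith("." + site)

def suburl_match(url, term):
    """suburl: the term appears anywhere in the URL."""
    return term.lower() in url.lower()

def ip_match(address, prefix):
    """ip: an incomplete number is treated as truncated, so
    216.32.120 matches every address in 216.32.120.*"""
    return address == prefix or address.startswith(prefix.rstrip(".") + ".")

print(site_match("http://www.nasa.gov/missions", "nasa.gov"))                   # True
print(suburl_match("http://www.searchenginewatch.com/", "searchenginewatch"))   # True
print(ip_match("216.32.120.7", "216.32.120"))                                   # True
print(ip_match("216.32.12.7", "216.32.120"))                                    # False
```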
Web Search Tips
• Field Searching
  title:    AltaVista, AlltheWeb, HotBot, Lycos, Gigablast
  intitle:  Google, Teoma
  url:      AltaVista, AlltheWeb, Lycos, Gigablast
  inurl:    Google, Teoma
  site:     AlltheWeb, Gigablast, Google, Teoma
  link:     AltaVista, Google, AlltheWeb, HotBot, Gigablast
  anchor:   AltaVista
  image:    AltaVista
Web Search Tips
• Selected Limits (usually on the advanced search form)
  Language: at most engines; supported languages vary
  Date: AlltheWeb, AltaVista, Google, Inktomi
    - Cut out old material, focus the search
    - Or find old information
  File Type: AlltheWeb, AltaVista, Google, Inktomi. PDFs at all, Flash at AlltheWeb
  Media Type: HotBot, MSN, AlltheWeb
  Page Size: AlltheWeb
  IP Range: AlltheWeb standard
Web Search Tips
• Diacritics: é
Does e find é? Sometimes
• Not at Google
  - Exact match on diacritics only
• At other search engines
  - e usually finds e OR é
  - é usually finds only é
Use English equivalents for special letters and omit diacritics
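The "e usually finds e OR é, é usually finds only é" behaviour can be sketched with Python's standard `unicodedata` module. This is an illustration of the rule as stated on the slide, not a description of any engine's implementation; `strip_diacritics` and `matches` are names chosen here.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose accented letters and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def matches(query, word):
    """A plain query ignores accents in the text; an accented query is exact."""
    query = unicodedata.normalize("NFC", query)
    word = unicodedata.normalize("NFC", word)
    if strip_diacritics(query) == query:
        return strip_diacritics(word) == query
    return word == query

print(strip_diacritics("café"))        # cafe
print(matches("resume", "résumé"))     # True: e finds é
print(matches("résumé", "resume"))     # False: é finds only é
```

Omitting diacritics from the query, as the slide advises, is therefore the safer choice when recall matters.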
Web Search Tips
Counting Complexities
• Search engines can't count. Only the big search engines count (the top 10 search engines)
• Numbers constantly change
• From one page of results to the next
• From one minute to the next
• Try reloading for more
Web Search Tips
Feature Inconsistencies
• Database changes
• Constant
• If they don't . . .
• They get old, out-of-date, dead links
• Size changes, often sudden
• Database reversions
• Searching failures and other unexpected results
On-the-Fly Analysis
• Always question results
• Evaluate and compare
• Find one unique, low-posted term
• Use it for search engine comparisons
• Evaluate change over time
• "On-the-Fly Search Engine Analysis." ONLINE 23(5):63-66, Sept. 1999. onlinemag.net/OL1999/net9.html
Web Search Tips
SEO - Search Engine Optimization
• SearchEngineShowdown.com: more on advanced features, feature chart, detailed reviews
• Search Engine Watch http://www.searchenginewatch.com/facts/ataglance.html
Inconsistencies
Low Recall, or "I am not finding any sites on my topic!!"
• Have I chosen the correct database?
• Have I been too specific in formulating the search?
• Have I included all possible terms and word forms? Should I use truncation?
• Was Boolean logic used correctly?
• Did I make a technical error, e.g., in spelling or command syntax?
Low Precision, or "I found hundreds of citations and many are not on my topic!!"
• Delete less specific synonyms and ambiguous terms
• Search fewer fields, e.g., just the title field or URL
• Add additional facets with AND or NOT
• Add restrictions, e.g., date of publication
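Recall and precision have standard definitions: recall is the share of relevant items that the search retrieved, and precision is the share of retrieved items that are relevant. The sketch below computes both for two made-up result sets; the document ids are purely illustrative.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items
    measured against the set of truly relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {1, 2, 3, 4}                   # pages that are really on topic
too_narrow = {1}                          # over-specific search
too_broad = {1, 2, 3, 4, 5, 6, 7, 8}      # search with ambiguous terms

print(precision_recall(too_narrow, relevant))  # (1.0, 0.25): low recall
print(precision_recall(too_broad, relevant))   # (0.5, 1.0): low precision
```

The two checklists above pull in opposite directions: broadening a search raises recall at the cost of precision, and narrowing it does the reverse.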
The Invisible Web
What is it? It consists of searchable information resources whose contents cannot be indexed by traditional search engines.
• Content in databases
• Professional online services
• Non-ASCII files
• Sites that require log-in or registration
• Real-time information
• Dynamically created web pages
• Discussion forums and BBSs
Searching the Invisible Web
• Much "invisible" content has a "visible web" front
• Some databases are opening up
• Google searches PDF, XLS, RTF, DOC files
Searching the Invisible Web
• Use directories and portals
  - Open Directory Project http://www.dmoz.org is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.
  - Librarian's Index to the Internet http://www.lii.org
  - Subject-specific directories http://www.econ.bg
• Experts and info pros watch for this material
  - Experts.com www.experts.com: a reliable and diverse source of experts, many of whom are outside the academic arena.
• Yahoo - http://groups.yahoo.com/
• Search for "database" or "forum" along with subject terms
Searching the Invisible Web
• Use meta-search engines
  - DogPile.com
  - MetaCrawler.com
• Use Teoma.com's "Experts' Links"
• Scan the libraries of relevant discussion groups
• Lurk on lists
Searching the Invisible Web
• Use reverse link look-up to find "more like this"
• Google and AltaVista: link:www.BatesInfo.com
• HotBot: http://www.hotbot.com/ link:www.aubg.bg/fforum - use [Links to this URL]
The Invisible Web
Invisible Web Directories
• http://www.invisibleweb.com/ The InvisibleWeb Catalog™ contains over 10,000 databases and searchable sources that have been frequently overlooked by traditional searching.
• CompletePlanet.com: contains 103 searchable databases
• DirectSearch: difficult to use but extensive
• http://www.internets.com/ They have assembled the largest filtered collection of useful search engines and newswires anywhere on the World Wide Web.
There are 1-2 billion documents on the "surface web". The deep web is estimated to be approximately 500 billion documents.
• Good hierarchy of databases
Web Search Tips
Set aside one afternoon every two weeks for your web reading!
More info: http://www.BatesInfo.com