
Chapter 5

Chapter 5. Fluency with Information Technology, 4th edition, by Lawrence Snyder (slides by Deborah Woodall: woodall@mc.edu). The Library. The library is an excellent resource. The library offers online searching of its in-house holdings and the holdings of other libraries.


Presentation Transcript


  1. Chapter 5 Fluency with Information Technology 4th edition by Lawrence Snyder (slides by Deborah Woodall : woodall@mc.edu)

  2. The Library • The library is an excellent resource. • The library offers online searching of its in-house holdings and the holdings of other libraries. • Most libraries have online access to large collections, both free and commercial, of highly structured and reliable information.

  3. Hierarchical Structure • People who organize information organize it hierarchically for easy understanding and access. • A hierarchy is a series of levels of organization. • Large general categories are subdivided into smaller more specific categories, which are further subdivided into even more specific categories, until manageable pieces of information are attained.

  4. Hierarchical Structure • URLs reflect the hierarchical organization of the World Wide Web. • Furthermore, following links also moves us through levels of organization at a site, and thus reveals the hierarchy of organization for that site.

  5. • www.enchantedlearning.com/subjects/insects/orthoptera/Cricket.html • www.enchantedlearning.com/subjects/apes/gibbon/ • www.enchantedlearning.com/subjects/birds/info/chicken.shtml
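
The hierarchy that a URL reflects can be made visible by splitting it into its levels. The following is a minimal sketch using Python's standard `urllib.parse` module; the function name `hierarchy_levels` is hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit

def hierarchy_levels(url):
    """Return the host followed by each path segment, most general first."""
    # The slide URLs omit the scheme, so add one for urlsplit to find the host.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # Drop the empty strings produced by leading or trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    return [parts.netloc] + segments

levels = hierarchy_levels(
    "www.enchantedlearning.com/subjects/insects/orthoptera/Cricket.html"
)
# Print one level per line, indented by its depth in the hierarchy.
for depth, name in enumerate(levels):
    print("  " * depth + name)
```

Each extra level of indentation in the output corresponds to one step from a general category (subjects) toward a specific page (Cricket.html).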

  6. Finding Information on the Web • How do we ever find anything? • We may have a URL to get us started, and from there we follow the links, e.g. www.ebay.com. • Or we may use a search engine.

  7. Search Engines • A search engine is a collection of computer programs that helps us find information on the World Wide Web. • Some are: Google, Yahoo! Search, Ask.com, Alltheweb, Lycos, AltaVista, exalead, Bing

  8. Search Engines A search engine has 2 basic parts: • Crawler • Query Processor

  9. Crawler • The crawler is also called a "bot" or a "spider." • The crawler is software that "crawls" around the Web looking through Web pages, using what it finds to build its search engine's index.
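
The crawling idea can be sketched without touching the real Web. The example below is only an illustration under stated assumptions: `TOY_WEB` is a hypothetical in-memory "Web" of made-up URLs, and the breadth-first strategy is one common choice, not necessarily what any particular search engine uses.

```python
from collections import deque

# Hypothetical miniature "Web": each page URL maps to the links found on it.
TOY_WEB = {
    "a.example/": ["a.example/birds", "b.example/"],
    "a.example/birds": ["a.example/"],
    "b.example/": ["b.example/flu"],
    "b.example/flu": [],
    "c.example/": ["a.example/"],  # no page links here
}

def crawl(start, web):
    """Visit every page reachable from start, breadth-first, each page once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)  # a real crawler would read the page's tokens here
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

visited = crawl("a.example/", TOY_WEB)
print(visited)
```

Because no page links to c.example/, a crawl that starts at a.example/ never reaches it. A page like that stays out of the index unless the crawler is pointed at it some other way.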

  10. The Index • The index is a huge list of tokens (e.g. words, images) that the crawler finds as it crawls each Web page. • Each token is followed by a list of the URLs of all Web pages with that token in • the URL itself • the title bar • the body • the anchor text of a link to the Web page.

  11. Index • Mississippi: www.mc.edu, www.mississippi.gov, www.visitmississippi.org, www.mstc.state.ms.us, etc.
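
An index of this shape, each token followed by the URLs of the pages that contain it, is often called an inverted index. The sketch below builds one from a few hypothetical pages. The page texts, the URLs, and the rule that a token is any run of letters or digits are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> text found on the page.
PAGES = {
    "a.example/birds": "Bird watching in Mississippi",
    "b.example/flu": "Bird flu facts",
    "c.example/travel": "Visit Mississippi",
}

def build_index(pages):
    """Map each lowercase token to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(url)
    return index

index = build_index(PAGES)
print(sorted(index["mississippi"]))
```

Looking up "mississippi" returns the two hypothetical pages that mention it, in the same way the slide lists the URLs filed under Mississippi.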

  12. Crawler • For a new Web page to be included in a search engine’s index, it must be crawled, or someone must submit the page to the search engine. • Each search engine has its own crawler, and thus its own index. • More than 80% of the pages in a major search engine’s index exist only in that index.

  13. Really? • NO search engine has EVERY Web page indexed. • Less than half of the searchable Web is fully searchable in Google.

  14. Query Processor • We interact with the query processor. • It takes the tokens we enter (called a query) and looks for them in the index, returning a list of URLs called hits. • Query processors do NOT search the World Wide Web when you submit a query.

  15. Queries • Example query in your Favorite Search Engine: bird • The query processor looks up the token "bird" in the index and returns the list of URLs associated with it. • These resulting URLs are called "hits."

  16. AND Queries • Example query in your Favorite Search Engine: bird flu Turkey • We want pages associated with ALL of these tokens. • The query processor looks up these three tokens and intersects their lists of URLs, i.e. the URLs returned must be listed on all three token lists. • This is the same as the query: bird AND flu AND Turkey • AND is a logical operator. • It is assumed when blanks separate the tokens. • It implies that ALL of the words must be associated with every page returned.
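
The intersection step can be sketched with Python sets. `INDEX` and the page names p1 to p5 are hypothetical stand-ins for a real index; note that the function only reads the index and never touches the Web itself.

```python
# Hypothetical index: token -> set of URLs (short names stand in for URLs).
INDEX = {
    "bird": {"p1", "p2", "p3"},
    "flu": {"p2", "p3", "p4"},
    "turkey": {"p3", "p5"},
}

def and_query(index, tokens):
    """Return the URLs that appear on the list of every token."""
    lists = [index.get(t.lower(), set()) for t in tokens]
    if not lists:
        return set()
    return set.intersection(*lists)

# Blanks between tokens are treated as AND.
hits = and_query(INDEX, "bird flu Turkey".split())
print(hits)
```

Only p3 appears on all three lists, so it is the only hit. A token that is missing from the index makes the whole result empty.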

  17. OR Queries • Example query in your Favorite Search Engine: bird OR flu OR Turkey • We want pages associated with at least one of these tokens (one, two, or all three). • The query processor looks up these three tokens and combines all three lists of URLs, removing duplicates. • OR is another logical operator.
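
Combining lists and removing duplicates is a set union. The sketch below uses a hypothetical index with made-up page names p1 to p5.

```python
# Hypothetical index: token -> set of URLs (short names stand in for URLs).
INDEX = {
    "bird": {"p1", "p2", "p3"},
    "flu": {"p2", "p3", "p4"},
    "turkey": {"p3", "p5"},
}

def or_query(index, tokens):
    """Return the URLs that appear on the list of at least one token."""
    hits = set()
    for token in tokens:
        # A set keeps each URL once, so duplicates disappear automatically.
        hits |= index.get(token.lower(), set())
    return hits

hits = or_query(INDEX, ["bird", "flu", "Turkey"])
print(sorted(hits))
```

Every page that appears on any of the three lists is returned once, so an OR query never returns fewer hits than the AND query over the same tokens.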

  18. AND NOT Queries • Example query in your Favorite Search Engine: canary bird AND NOT flu • NOT is the third logical operator. • We want pages associated with both words "canary" and "bird" but not associated with the word "flu". • The query processor intersects the lists of URLs for "canary" and "bird", then removes from the resulting list any URLs that are in the "flu" list. • In most search engines, this is the same as: canary bird -flu
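
The intersect-then-remove procedure can be sketched as a set intersection followed by a set difference. The function below accepts the minus shorthand from the slide; its name, the index contents, and the simple whitespace parsing are assumptions for illustration, not the behavior of any specific search engine.

```python
# Hypothetical index: token -> set of URLs (short names stand in for URLs).
INDEX = {
    "canary": {"p1", "p2"},
    "bird": {"p1", "p2", "p3"},
    "flu": {"p2", "p4"},
}

def minus_query(index, query):
    """AND together the plain words, then remove pages listed under -words."""
    required, excluded = [], []
    for word in query.lower().split():
        if word.startswith("-"):
            excluded.append(word[1:])
        else:
            required.append(word)
    if not required:
        return set()
    hits = set.intersection(*(index.get(t, set()) for t in required))
    for token in excluded:
        hits = hits - index.get(token, set())
    return hits

hits = minus_query(INDEX, "canary bird -flu")
print(hits)
```

Pages p1 and p2 are on both the "canary" and "bird" lists, and p2 is then dropped because it is also on the "flu" list.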

  19. Other Queries • (bird OR swine) flu -canaries • "Michael Weatherly" • Apollo +13

  20. Queries – More Tips Restrict your search… • to Web pages on Web servers in a particular domain; examples: "lab days" site:mc.edu woodall; "lab days" site:edu woodall • to Web page titles; examples: intitle:frog; allintitle:green tree frog
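
One way to picture a domain restriction is as a filter on the host name of each hit. The sketch below is an assumption about how such a filter could work, not a description of any search engine's implementation; `matches_site` is a hypothetical helper.

```python
from urllib.parse import urlsplit

def matches_site(url, site):
    """True if the URL's host is the given domain or a subdomain of it."""
    if "://" not in url:
        url = "http://" + url  # add a scheme so urlsplit can find the host
    host = urlsplit(url).hostname or ""
    site = site.lower()
    return host == site or host.endswith("." + site)

hits = ["www.mc.edu/labs", "www.mississippi.gov", "www.enchantedlearning.com"]
print([u for u in hits if matches_site(u, "mc.edu")])
print([u for u in hits if matches_site(u, "edu")])
```

Requiring a dot before the domain keeps a look-alike host such as notmc.edu from matching mc.edu.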

  21. Queries – More Tips Restrict your search… • to specific file types example: filetype:ppt "green tree frog" • to definitions example: define:frog

  22. Queries – More Tips • Search for online data directories or databases by using the words directory or database in your query example: airplane crash directory database • Many search engines have their own specialty databases, for such things as images, maps, news, blogs, books, etc. • Help and specific rules for your search engine can be found by clicking Advanced or Hints on the search engine’s home page.

  23. Ranking Results • Search engines rank the Web pages they find. • Google: PageRank (popularity ranking) • Ask.com: ExpertRank (subject-based ranking)

  24. PageRank [Figure: diagram with boxes labeled "Some Web Page", "Deborah Woodall's Home Page", and "Page B"]
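
The slides describe PageRank only as a popularity ranking. As a rough illustration of that idea, the sketch below implements the commonly published simplified formulation, in which a page gains rank from the pages that link to it. The link graph, the damping factor of 0.85, and the iteration count are assumptions, and this is not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. links maps each page to the pages it links to;
    every link target must itself be a key of links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: two pages link to Page B, which links nowhere.
LINKS = {"Page A": ["Page B"], "Page C": ["Page B"], "Page B": []}
ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page B ends up with the highest rank because two other pages link to it, which is the sense in which the ranking measures popularity.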

  25. Don’t Believe Everything • Look for who or what organization publishes the page (i.e. who owns the domain). • Anyone can put up a Web page! • A scholarly site is likely more reputable than one put up by some individual. • Be wary of look-alikes: www.gatt.org vs. www.gatt.com; fafsa.com vs. fafsa.org • Is ama-assn.org associated with the American Medical Association? • To check, you can go to www.internic.net/whois.html

  26. Don’t Believe Everything Does the page have the characteristics of a legitimate site? (It could still be a hoax!) • Physical existence • Expertise • Clarity • Currency • Professionalism

  27. Don’t Believe Everything • Check out what you find at one site by looking for the same information at other sites.
