
Week 9



Presentation Transcript


  1. Week 9 Search Engines and the Invisible Web

  2. Resource Pages • Collections of Links • Compiled by “experts” • Sometimes annotated • Targeted Information for a Specific User Group • Examples: • Voice of the Shuttle: http://vos.ucsb.edu/ • Computer Science Research Guide: http://guides.library.cmu.edu/SCS

  3. Anatomy of a Search Engine • Basically, there are three parts to a search engine: • “Spider” or “Crawler” • - Finds the pages • - Brings them home • “Index” or “Database” • - Storehouse of pages • - Size matters, frequency of updates matters • “Search Tool” • - What we use to find the pages in the engine’s index • - This is the user interface; the only part we see
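
The three parts named on this slide can be sketched in a few lines of Python. This is only a toy illustration, not how any real engine is built: the `PAGES` dictionary stands in for the Web, and the `example.org` URLs and the function names `crawl`, `build_index`, and `search` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for the Web: URL -> page text.
PAGES = {
    "http://example.org/a": "deep web databases and search engines",
    "http://example.org/b": "search engines crawl static pages",
    "http://example.org/c": "library research guides",
}


def crawl(pages):
    """'Spider': find the pages and bring them home as (url, text) pairs."""
    for url, text in pages.items():
        yield url, text


def build_index(crawled):
    """'Index': map each word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in crawled:
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """'Search tool': return the URLs that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


index = build_index(crawl(PAGES))
results = search(index, "search engines")
print(results)
```

Only `search` is visible to the user; a page that `crawl` never reaches can never appear in `results`, which is the root of the Invisible Web problem discussed on later slides.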

  4. How Search Engines Rank Pages • Relevance retrieval • Location of search terms • Frequency of search terms • Meta-tags (in the HTML source code of a Web page)
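
A rough sketch of how location, frequency, and meta-tags might be combined into one relevance score. The weights (3 for a title hit, 2 for a meta keyword hit, 1 per occurrence in the body), the page records, and the `score` function are assumptions made for illustration; real engines do not publish their formulas.

```python
def score(page, term):
    """Toy relevance score: term frequency plus bonuses for term location."""
    term = term.lower()
    # Frequency: how often the term occurs in the body text.
    frequency = page["body"].lower().split().count(term)
    # Location: does the term appear in the page title?
    in_title = term in page["title"].lower().split()
    # Meta-tags: does the term appear among the declared keywords?
    in_meta = term in [k.lower() for k in page["meta_keywords"]]
    # Assumed weights: title hit = 3, meta hit = 2, each body occurrence = 1.
    return 3 * in_title + 2 * in_meta + frequency


# Hypothetical pages, already reduced to title, meta keywords, and body text.
pages = [
    {
        "url": "http://example.org/art",
        "title": "Art collections",
        "meta_keywords": ["art"],
        "body": "search our art collections",
    },
    {
        "url": "http://example.org/patents",
        "title": "Patent search",
        "meta_keywords": ["patents", "search"],
        "body": "search the patent database or search by inventor",
    },
]

ranked = sorted(pages, key=lambda p: score(p, "search"), reverse=True)
for page in ranked:
    print(score(page, "search"), page["url"])
```

Because meta keywords are written by the page author, a scheme like this is easy to manipulate, which is one reason engines also use the other methods on the next slide.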

  5. Other Ranking Methods • Positions of Words • Term Co-Occurrence • Proximity • Pay for Placement • “Featured Web Sites!” • Link Analysis • Search Engine Showdown Chart: http://www.searchengineshowdown.com/features/
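
Link analysis, in its simplest form, treats each link to a page as a vote for that page. The sketch below only counts inbound links; it is deliberately much cruder than the algorithms real engines use, and the `LINKS` graph and the `inbound_counts` function are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c", "a"],
}


def inbound_counts(links):
    """Count how many distinct other pages link to each page."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # ignore repeated links from one page
            if target != source:      # ignore links from a page to itself
                counts[target] += 1
    return counts


counts = inbound_counts(LINKS)
# Most-linked pages first; ties broken alphabetically.
ranking = sorted(counts, key=lambda page: (-counts[page], page))
print(ranking)
```

Page "d" has no inbound links at all, so a crawler that discovers pages only by following links would never find it.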

  6. What Many Search Engines Cannot Find • Some file types: some engines can, some cannot • Dynamically-generated pages • Pages locked behind firewalls or in fee-based online databases (such as Dialog) • Lots of the “Deep Web” stuff: http://www.completeplanet.com

  7. Differences Between the “Deep Web” and Search Engine Results • The Deep Web is another phrase for the Invisible Web • Deep Web resources usually: • Are subject specific / more focused • Hold less content, but content that tends to be of higher quality • Are updated more frequently • Have specialized search interfaces • Have a target audience in mind

  8. Overview of the Deep Web • What is Still Invisible: • Disconnected, loose pages • Password-protected pages and sites • Pages excluded by “robots.txt” files • Dynamically-created pages: no static URLs • Information bound in database structures that are uncrawlable by many search engines
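
A “robots.txt” file is how a site asks crawlers to stay out of parts of it. The sketch below uses Python's standard `urllib.robotparser` module to check two URLs against a sample file; the file contents, the `ExampleBot` user-agent name, and the `example.org` URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all crawlers are asked to skip /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access

allowed = parser.can_fetch("ExampleBot", "http://example.org/index.html")
blocked = parser.can_fetch("ExampleBot", "http://example.org/private/report.html")

print(allowed)  # True: a well-behaved crawler may fetch this page
print(blocked)  # False: this page stays out of the index
```

Pages under `/private/` remain reachable by anyone with the URL, but an engine that honors the file will not crawl them, so they stay invisible to searchers.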

  9. When to Consider the Deep Web • When you are familiar with a topic • When you want authoritative information • When you want specific information • When you want timely information

  10. Popular Deep Web Information • Clinical Trials • Environmental Information • Grant Information • Historical Documents and Images • Art Collections • Patents • Demographic and Economic Data • Government Information

  11. Look at Some Deep Web Resources • Salary.com Database: http://www.ecomponline.com/ • U.S. Patent & Trademark Office: http://www.uspto.gov • Los Angeles Municipal Code: http://www.municode.com/Library/clientCodePage.aspx?clientID=6662

  12. How to Find the Deep Web • Use a search engine: search “database” as a term • Use a print directory: try OCLC WorldCat to find those specific to your subject need • Ask your colleagues • Take note of resources mentioned in the professional literature

  13. How to Find the Deep Web (cont.) • Use Alerting Services: • The Scout Report (Internet Scout Project): http://scout.wisc.edu/ • INFOMINE: http://infomine.ucr.edu/

  14. Evaluation of Web-Based Information • Continuously evaluate as you look at “information” on the free Web. The key principles to look for are: • Currency / Timeliness • Authenticity • Objectivity • Completeness and Accuracy • Verifiability • Example: Thinking Critically about Web 2.0 and Beyond: http://www2.library.ucla.edu/libraries/college/11605_12008.cfm

  15. Staying Current • Subscribe to alerting services for Deep Web resources • Look at reviewing tools • Research Buzz: http://www.researchbuzz.org/wp/ • Search Engine Watch and Search Engine Report: http://searchenginewatch.com/

  16. Search Engines Don’t Find Information: People Do! • Use the right combination of tools for the job, including offline (paper) resources • Use the right tools the best way possible • Sometimes a search engine, Deep Web resource, or other Web finding tool is not appropriate to the information need • A “good” search engine is one that finds what you want.
