Searching the Deep Web
LEMA, February 2011
Deep Web Video
Surface Web vs. Deep Web (AKA visible vs. invisible web)
• Surface Web: accessible via general-purpose search engines such as Google and Yahoo! (25%; 1 trillion+ pages)
• Deep Web: not accessible via typical search engines; primarily databases (75%; 500 trillion+ pages!!)
Image from express.howstuffworks.com, 14 Feb 11
The "deep web" contains …
• Databases which use dynamic or temporary links
  • Often ?, &, CGI, or other elements in the URL
• Websites which aren't indexed, by design or because there are no links to them
• Deep web sites
  • Google limits the amount of a web site it indexes, an unpublished factor in its secret algorithm
  • At one point, only 110K
• Formats that aren't currently supported
  • Google now shows results for .pdf, .doc, .ppt
The boundary between the surface and deep web is always in flux: search engines keep incorporating more of the deep web at the same time as more is being added to it.
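To make the "dynamic link" clues concrete, here is a minimal Python sketch that flags URLs showing the signs listed above (a ? query string, or a CGI script in the path). The function name looks_dynamic and the example.org addresses are made up for illustration. This is only a rough heuristic: it does not tell you whether a search engine has actually indexed a page.

```python
from urllib.parse import urlparse


def looks_dynamic(url: str) -> bool:
    """Rough heuristic: does this URL look like a database-generated page?"""
    parts = urlparse(url)
    path = parts.path.lower()
    # Anything after '?' (often joined with '&'), e.g. ?id=42&lang=en
    has_query = bool(parts.query)
    # Classic CGI script locations and file extensions
    has_cgi = "/cgi-bin/" in path or path.endswith(".cgi")
    return has_query or has_cgi


# Hypothetical example addresses
for url in [
    "http://example.org/about.html",
    "http://example.org/catalog?id=42&lang=en",
    "http://example.org/cgi-bin/search.cgi",
]:
    print(url, "->", "dynamic?" if looks_dynamic(url) else "static-looking")
```

A plain address such as /about.html passes as static-looking, while the catalog and CGI addresses are flagged as likely database pages.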
Deep Web: Why important?
• Studies show that students' searching habits are fairly ingrained by college
  • Use Google for everything
  • Only look at the 1st page of results
  • Assume trustworthiness of web sites
• Rich source of in-depth material not accessible through a typical Google search
Expose students now to richer and more authoritative resources.
Students need to understand ….
• The best results are NOT in the top 10
• Everything's NOT on the web
• Google does NOT search the whole web
• Everything's NOT free
• Everything's NOT trustworthy
• Searching/Research is NOT always easy
How can we help our students be better searchers?
• Introduce them to the idea that Google isn't everything & why
• Reinforce the idea of evaluating resources
• Make them better "surface" searchers
  • Many information needs can be met with the surface web
  • Easy yet "advanced" Google searching techniques
• Better alternatives to the "surface web" & how to effectively search these alternatives
  • Databases!
  • Familiarity with "deep" sites on a particular topic
    • Example: Primary materials available at Library of Congress
    • Example: Legislative info at thomas.loc.gov
  • Familiarity with portals and directories
Three simple techniques for being a better Google searcher ….
• Phrase searching
  • "xxx xxxx"
• Searching the title of web pages
  • intitle:xxx or intitle:"xxx xxxx"
  • Example: intitle:"climate change"
  • Example: intitle:unicorn
• Specifying a site
  • site:.xxx or site:xxx.com
  • Example: egypt site:washingtonpost.com
  • Example: "climate change" site:.gov
NOTE: No space after the colon. Type the commands in lowercase.
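For anyone who wants to generate these searches rather than type them, the three techniques can be sketched in a few lines of Python. The helper names (phrase, intitle, site, search_url) are invented for this illustration and are not part of any Google tool. The sketch assumes the standard https://www.google.com/search?q= address and simply builds a link you can paste into a browser.

```python
from urllib.parse import quote_plus


def phrase(words: str) -> str:
    """Phrase search: wrap the words in double quotes."""
    return f'"{words}"'


def intitle(term: str) -> str:
    """Title search: no space after the colon; quote multi-word terms."""
    return "intitle:" + (phrase(term) if " " in term else term)


def site(domain: str) -> str:
    """Site search: a full site (washingtonpost.com) or a domain (.gov)."""
    return "site:" + domain


def search_url(*parts: str) -> str:
    """Join the pieces with spaces and URL-encode them for the browser."""
    return "https://www.google.com/search?q=" + quote_plus(" ".join(parts))


# The examples from the slide
print(search_url(intitle("climate change")))
print(search_url(intitle("unicorn")))
print(search_url("egypt", site("washingtonpost.com")))
print(search_url(phrase("climate change"), site(".gov")))
```

Each helper returns exactly what you would type in the search box, so the same pieces can be combined by hand: a phrase plus site: is often the quickest way to narrow a search.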
Let’s try a site: search …. • Look for a Washington Post article on the B-52s
Now let’s try a phrase search… • First, try Howard Morris as a simple keyword search -- How many hits?
Now try it as a phrase “Howard Morris” • How many hits?
Now let’s try an intitle: search • First, just search for “climate change” – how many hits?
An intitle: search • Now try searching for “climate change” in the title of the web page – how many hits?
Searching the Deep Web
• LVHS Library Web Page – Deep Web link on the left
• Run a Google search for your topic and add the keyword database
  • Example: plane crashes database
The Deep Web: A Comparison
• Using Google, search on the term metabolism
• Open a separate tab, go to www.science.gov and search metabolism again
• Looking at the top ten results of each, which provided generally "better" information?
• How difficult/easy is it to pursue your search in related fields?
Directories/Portals of Interest
• ipl2
  • January 2010
  • Merger of the Internet Public Library and the Librarians' Internet Index
  • Librarians and Information Science Professionals
  • Hosted by Drexel University's College of Information Science & Technology
• Infomine
  • University-level scholarly resources
  • Librarian built and maintained
  • University of California
• Virtual Private Library
Other Resources
• LVHS Library Web Page – Deep Web link on the left
• Going Beyond Google: The Invisible Web in Learning and Teaching by Jane Devine and Francine Egger-Sider, 2009
  • Not as up-to-date as web resources, but very focused on teaching