Internet Research Skills Workshop Presented By: Paul Chisholm Program Resource Teacher Gateway Education Centre March 6, 2007
Internet Search Engines http://www.paul-chisholm.com/search_engines.htm
Major Search Engines And Directories • Major search engines get their name from the fact that they are popular and in widespread use. • There are two major methods used for updating web search engines: automated crawlers or spiders (e.g. Google) and human-maintained directories (e.g. Yahoo).
Crawler-type search engines reflect the changes and modifications made to a web site over time, because the crawler comes back to the site to obtain updates. • Human-maintained directories rely on a site summary being submitted to the directory, or on the web site being reviewed and written up by the directory's human staff. Modifications made after the initial submission will not be reflected in the directory.
A third type of search engine is known as a Hybrid Search Engine (e.g. MSN Search). A hybrid search engine presents search results from both crawler sites and human-maintained directories. • You also have Metasearch Engines (e.g. Metacrawler) that are used to search a number of search engines at once.
A crawler or spider based search engine has three (3) main components: • The Spider or Crawler that will automatically go to a web site and explore all of the web pages and all of the links from the web pages • The Index which is a collection of all of the materials that the spider finds • And finally the Search Engine software itself. This is the software application that searches through the search engine’s index.
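The three components above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine is built: the PAGES dictionary is a made-up stand-in for the web, and the function names crawl, build_index and search are assumptions chosen for this example.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.html": ("dogs and cats", ["b.html", "c.html"]),
    "b.html": ("cats like pizza", ["a.html"]),
    "c.html": ("pan pizza with olives", []),
}

def crawl(start):
    """Spider: follow links breadth-first and collect the text of every page reached."""
    seen, queue, found = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        found[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return found

def build_index(found):
    """Index: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in found.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, word):
    """Search software: look a single word up in the index."""
    return sorted(index.get(word.lower(), set()))

index = build_index(crawl("a.html"))
print(search(index, "pizza"))  # prints ['b.html', 'c.html']
```

Note that the search step never touches the pages themselves. It only reads the index, which is why a page changed after the last crawl can still show its old content in the results.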
Web Site Ranking • An Internet search engine can return an excessively large number of results from each search that is executed. • Sometimes it is desirable to limit the number of search results returned with a search engine query.
Search engines try to rank their results by relevance using a computer algorithm.
A search engine algorithm will look at the words in the title of a web page, or in the first couple of paragraphs of the page, to see whether they match the word or words entered by the search engine user. The reasoning is that a web page's main subject is usually stated somewhere near the beginning of the document.
A search engine will also look at the number of times that a search word appears in a web page: the higher the number of occurrences, the greater the assumed relevance to the search.
The goal of any webmaster is to have their web site returned in search results in the highest position possible, thus saving would-be web site visitors from having to go through a large number of returned results before coming upon their site.
Boolean Searching • In order to limit the number of results returned to us by a search engine, we may find it necessary to use the following Boolean operators: • AND • OR • AND NOT • NEAR
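As a rough illustration of the two ranking signals just described (a match in the title, and the number of times the word occurs), here is a minimal Python sketch. The score function, the sample pages and the title weight of 5 are all assumptions made for this example. For simplicity it counts the whole body rather than only the opening paragraphs, and real ranking algorithms use many more signals.

```python
def score(page, term, title_weight=5):
    """Toy relevance score: title matches count more than body matches."""
    term = term.lower()
    title_hits = page["title"].lower().split().count(term)
    body_hits = page["body"].lower().split().count(term)
    return title_weight * title_hits + body_hits

# Made-up sample pages.
pages = [
    {"title": "Gardening", "body": "petunias and one pizza garden party"},
    {"title": "Pizza recipes", "body": "pan pizza and more pizza"},
]

# Highest score first, as a search engine would list its results.
ranked = sorted(pages, key=lambda p: score(p, "pizza"), reverse=True)
for page in ranked:
    print(score(page, "pizza"), page["title"])
```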
Boolean commands should always be entered in UPPER CASE! • AND - Requires that all search terms be present in the web pages returned. (e.g. Bush AND Kerry) • OR - Returns web pages that contain any of the terms listed in the query. (e.g. IBM OR Microsoft)
AND NOT - Requires that a particular search term not be present in the returned results. (e.g. Bush AND NOT Kerry) • NEAR - Will return results where the words are found in close proximity to each other. The actual number of words apart will vary with the search engine. • e.g. dogs NEAR cats
When searching for a phrase, you will want to enclose your phrase in double quotation marks. • e.g. "pan pizza" • e.g. "pan pizza" AND "Italian pepperoni" AND "black olives"
You can also use brackets to nest your search criteria. • e.g. "pan pizza" AND (pepperoni OR ham) AND olives
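A proximity test like NEAR can be sketched as follows. The near function and its max_gap of 3 words are assumptions for illustration only. As noted above, each search engine picks its own distance.

```python
def near(text, a, b, max_gap=3):
    """Return True if words a and b occur within max_gap words of each other."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)

print(near("dogs chase cats", "dogs", "cats"))
print(near("dogs are loyal animals while many people prefer cats", "dogs", "cats"))
```

The first call finds the two words two positions apart and matches. In the second sentence they are eight positions apart, so it does not.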
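One way to picture how the Boolean operators and brackets combine is as set operations on the pages that contain each term: AND is an intersection, OR is a union, and AND NOT is a set difference. The sketch below uses three made-up documents and a hypothetical helper named having. It matches plain substrings, which is cruder than what a real search engine does.

```python
# Made-up documents, keyed by an id.
docs = {
    1: "pan pizza with pepperoni and olives",
    2: "pan pizza with ham",
    3: "thin pizza with ham and olives",
}

def having(term):
    """Return the ids of documents whose text contains the term or phrase."""
    return {doc_id for doc_id, text in docs.items() if term in text}

# "pan pizza" AND (pepperoni OR ham) AND olives
result = having("pan pizza") & (having("pepperoni") | having("ham")) & having("olives")
print(result)   # prints {1}

# "pan pizza" AND NOT olives
without = having("pan pizza") - having("olives")
print(without)  # prints {2}
```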
Truncation or wildcards* • In most search engines and directories, a search for dog* will give you pages with all words starting with the three letters dog, including dog, dogs, dogged, doggy and dogma. As you can see, if you were looking only for dog and dogs, you will pick up some unwanted hits. Truncation or wildcards work best when the stem is longer and is not the root of many other common words.
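The effect of a trailing wildcard can be shown with a simple prefix match. The expand function and the word list are assumptions made for this example, not part of any search engine.

```python
def expand(pattern, vocabulary):
    """Return the words a pattern matches; a trailing * means 'starts with'."""
    if pattern.endswith("*"):
        stem = pattern[:-1]
        return sorted(w for w in vocabulary if w.startswith(stem))
    return sorted(w for w in vocabulary if w == pattern)

vocab = ["dog", "dogs", "dogged", "doggy", "dogma", "cat"]
print(expand("dog*", vocab))   # prints ['dog', 'dogged', 'doggy', 'dogma', 'dogs']
print(expand("dogg*", vocab))  # prints ['dogged', 'doggy']
```

The second call shows why a longer stem helps: one extra letter removes dogma and the other unrelated hits.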
Synonyms • Note also that Google has introduced a special "tilde"-operator that lets you search for synonyms. If you place the tilde sign ("~") immediately in front of a keyword, Google will replace that keyword with a list of words with a similar meaning, thus extending your search.
Simplified Search Syntax • In place of the previously mentioned Boolean operators, most Internet search engines will allow you to use simplified search syntax in the form of: • +pizza +pepperoni +ham -olives -garlic • The above example replaces AND with a + symbol and AND NOT with a - symbol.
Please note that there must not be any space between the relevant sign and the word! Write +"Star Wars", not + " Star Wars ".
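To make the correspondence between the two syntaxes concrete, here is a small Python sketch that rewrites a +/- query as a Boolean one. The to_boolean function is hypothetical, and treating a word with no sign as required is an assumption; search engines differ on that point.

```python
import shlex

def to_boolean(query):
    """Rewrite '+a -b' style syntax as 'a AND NOT b' style syntax."""
    required, excluded = [], []
    # shlex keeps a quoted phrase such as +"Star Wars" together as one token.
    for token in shlex.split(query):
        if token[:1] in ("+", "-"):
            sign, term = token[0], token[1:]
        else:
            sign, term = "+", token  # assumption: unsigned words are required
        if " " in term:
            term = f'"{term}"'       # put the quotes back around phrases
        (required if sign == "+" else excluded).append(term)
    return " AND ".join(required + ["NOT " + term for term in excluded])

print(to_boolean("+pizza +pepperoni +ham -olives -garlic"))
print(to_boolean('+"Star Wars" -olives'))
```

The first call prints pizza AND pepperoni AND ham AND NOT olives AND NOT garlic, and the second prints "Star Wars" AND NOT olives.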
Field Searching • Field Searching allows you to search within the individual fields that search engines store for each page. • Title: This is the text you can read in the bar at the top of the browser window (not the main headline on the webpage itself). • e.g. petunias AND title:gardening
URL: This is the address (the Uniform Resource Locator) of a page, e.g. http://www.paul-chisholm.com. • You may restrict your search to pages with addresses that contain a certain word. If you want to restrict a search for ColdFusion to pages on this site, you can do a search like this: "ColdFusion" AND inurl:www.paul-chisholm.com
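Field searching can be pictured as checking each query term against a different part of the stored page record. The sketch below is an assumption-laden illustration: the matches function, the record layout and the two example.com and example.org pages are all made up, and only the title: and inurl: prefixes from the slides are handled.

```python
# Made-up page records with separate url, title and body fields.
pages = [
    {"url": "http://www.example.com/garden/petunias.html",
     "title": "Gardening tips", "body": "How to grow petunias"},
    {"url": "http://www.example.org/flowers.html",
     "title": "Flower shop", "body": "We sell petunias and roses"},
]

def matches(page, terms):
    """Return True if every term is found in the field it is aimed at."""
    for term in terms:
        field, sep, value = term.partition(":")
        if sep and field in ("title", "inurl"):
            haystack = page["title"] if field == "title" else page["url"]
        else:
            haystack, value = page["body"], term  # plain terms search the body
        if value.lower() not in haystack.lower():
            return False
    return True

# petunias AND title:gardening
print([p["url"] for p in pages if matches(p, ["petunias", "title:gardening"])])
# petunias AND inurl:example.org
print([p["url"] for p in pages if matches(p, ["petunias", "inurl:example.org"])])
```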
Domains • Domains: The domain is the unique name that identifies an Internet site. Domain Names have two or more parts, separated by dots.
Some search engines allow you to restrict your search to a specific domain. By doing a domain-search you may for instance restrict your search to pages in a specific country. British pages normally end in the letters .uk. A search for Jaguar AND car AND domain:.uk should give you British pages containing information on the Jaguar car.
There are also some top level domains (com, org, net etc.) that are not restricted to specific countries, although they are predominantly American. You can use these endings to restrict your search to commercial (.com), US educational (.edu), US governmental (.gov) or US military (.mil) sites. In Canada you see .ca domains and in Ontario, you will see .on.ca domains.
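A domain restriction amounts to comparing the end of a page's host name with the requested ending. The following Python sketch uses the standard urllib.parse module. The in_domain function and the example URLs are assumptions for illustration.

```python
from urllib.parse import urlparse

def in_domain(url, suffix):
    """Return True if the URL's host name ends with the given domain suffix."""
    host = urlparse(url).hostname or ""
    suffix = suffix.lstrip(".").lower()
    # Match whole dot-separated parts only, so ".uk" does not match "exampleuk".
    return host == suffix or host.endswith("." + suffix)

# Made-up URLs.
urls = [
    "http://www.example.co.uk/jaguar",
    "http://www.example.com/jaguar",
    "http://www.example.on.ca/",
]

print([u for u in urls if in_domain(u, ".uk")])     # the British page only
print([u for u in urls if in_domain(u, ".ca")])     # the Canadian page only
print([u for u in urls if in_domain(u, ".on.ca")])  # the Ontario page only
```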
References • Please note that the following web site tutorial was used extensively in the preparation of this slide show: The Pandia Goalgetter – A Short And Easy Internet Search Tutorial Located at the following URL: http://www.pandia.com/goalgetter
Please feel free to contact me if you have any questions regarding this presentation: Paul Chisholm Program Resource Teacher Gateway Education Centre E-mail: paul.chisholm@ucdsb.on.ca Phone: 888-779-2559 Ext. 4202