Search Engines
• Allow a user to find information residing on remote computers;
• Searching differs from browsing in that the user is not required to provide a URL;
• Search engines are programs that search for Web pages related to a given topic, such as:
  • Company
  • Individual
  • Product
  • Brand
• Results of the search are returned as a Web page;
• Q. What type of page is returned?

Search Engines (cont.)
• Provide users with a starting point for their search
  • Especially useful considering the size of the WWW
• Can automatically recover the location of an item after a loss caused by, say, a power outage
• Two types of search engines:
  • Search by name;
  • Search by content.

Automated Searching by Name
• Responds to questions of the form: “Among the files on the WWW, where are the files that have name x?”
• Similar to the “Find” program on PCs.
• Ex: Archie (an early service that indexed file names on anonymous FTP servers)
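
Searching by name can be sketched as a lookup table from file name to locations. This is a minimal illustration, not how any particular service works; the catalog, hosts, and paths below are invented, and `build_name_index` and `find_by_name` are hypothetical helper names.

```python
from collections import defaultdict
import posixpath

# Hypothetical catalog of file locations; hosts and paths are made up.
CATALOG = [
    "ftp.example.org/pub/tools/readme.txt",
    "ftp.example.net/docs/readme.txt",
    "ftp.example.org/pub/images/logo.gif",
]

def build_name_index(locations):
    """Map each file name to the list of locations holding a file with that name."""
    index = defaultdict(list)
    for location in locations:
        index[posixpath.basename(location)].append(location)
    return index

def find_by_name(index, name):
    """Answer the question: where are the files that have name x?"""
    return index.get(name, [])

name_index = build_name_index(CATALOG)
print(find_by_name(name_index, "readme.txt"))
```

The lookup only compares names, so it says nothing about what the files contain.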

Automated Searching by Content
• Most (if not all) current search engines search by content;
• To perform a search, a user has to:
  • Download the search engine's page;
  • Enter the topic;
  • Send the requested topic to the server;
• The server processes the request and sends back a page listing the matching pages.
• Ex: www.google.com
• Q. What type of page is needed for the main page of the server? What type of page is needed for the result?

How a Search Engine Operates
• Naïve approach: search the WWW when a request arrives:
  • Takes too long
• Instead:
  • Before the search engine becomes operational, a list of the information available on the WWW is
    • compiled,
    • sorted,
    • indexed, and
    • stored on a local disk;
  • To deal with changes, computer programs called spiders or crawlers probe the WWW continuously and report new pages to the server.
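
The precomputed approach can be sketched as a small crawler plus a word index. This is a simplified illustration under stated assumptions: `WEB` is a made-up in-memory stand-in for the WWW (a real crawler would fetch pages over the network), and `crawl` and `build_index` are hypothetical names.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory stand-in for the WWW: URL -> (page text, outgoing links).
WEB = {
    "a.example/home": ("Used car prices and reviews", ["a.example/music"]),
    "a.example/music": ("Rock music news", ["a.example/home", "a.example/new"]),
    "a.example/new": ("New car models", []),
}

def crawl(seed):
    """Follow links breadth-first from the seed and return every page reached."""
    seen, queue = {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        for link in WEB[url][1]:
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

def build_index(urls):
    """Compile a sorted word -> list-of-URLs table before any request arrives."""
    index = defaultdict(set)
    for url in urls:
        for word in re.findall(r"[a-z0-9]+", WEB[url][0].lower()):
            index[word].add(url)
    return {word: sorted(pages) for word, pages in sorted(index.items())}

index = build_index(crawl("a.example/home"))
print(index.get("car", []))
```

Answering a request then costs one dictionary lookup instead of a scan of every page, which is the point of building the index in advance.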

How a Search Engine Operates (cont.)
• String matching is used to find documents related to a given topic;
• Advantages:
  • Simplicity
  • Efficiency
• Disadvantage:
  • Lack of semantics
  • Ex: a search for “automobile” will not return documents that mention “car” but not “automobile”.
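
The limitation of plain string matching can be seen in a few lines. This is a minimal sketch with invented documents; `match` is a hypothetical helper that does a case-insensitive substring test and nothing more.

```python
# Hypothetical documents used only for illustration.
DOCUMENTS = {
    "doc1": "Tips for buying a used car",
    "doc2": "Automobile insurance explained",
}

def match(term, documents):
    """Return the ids of documents whose text contains the term (case-insensitive)."""
    needle = term.lower()
    return sorted(doc_id for doc_id, text in documents.items() if needle in text.lower())

print(match("automobile", DOCUMENTS))  # doc1 is about cars but is not returned
print(match("car", DOCUMENTS))
```

The matcher has no notion that the two words mean the same thing, so each query finds only the document that uses its exact spelling.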

Performance Issues
• Two types of performance issues:
  • The relevance of the result pages;
  • The time taken to perform a search;
• To improve the relevance of the results, a user may:
  • Enter several terms:
    • Ex: car vehicle automobile
  • Specify which words are required:
    • Ex: if a plus sign denotes a required term, a user might request “+rock +music”
  • Use quotes to search for an exact phrase:
    • Ex: “The Postman Always Rings Twice”
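
The three query refinements above can be sketched as a small parser and matcher. This is an illustration under assumptions, not the syntax of any specific search engine: it assumes straight double quotes around phrases and a leading plus sign for required terms, and `parse_query` and `matches` are hypothetical names.

```python
import re

def parse_query(query):
    """Split a query into quoted phrases, +required terms, and optional terms."""
    phrases = [p.lower() for p in re.findall(r'"([^"]+)"', query)]
    rest = re.sub(r'"[^"]*"', " ", query)  # drop the quoted parts before tokenizing
    required, optional = [], []
    for token in rest.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        else:
            optional.append(token)
    return phrases, required, optional

def matches(text, query):
    """True if the text has every phrase and required term, plus any optional term if given."""
    phrases, required, optional = parse_query(query)
    lowered = text.lower()
    words = set(re.findall(r"[a-z0-9]+", lowered))
    if not all(phrase in lowered for phrase in phrases):
        return False
    if not all(term in words for term in required):
        return False
    return not optional or any(term in words for term in optional)

print(matches("Rock music news", "+rock +music"))
print(matches("Rock climbing", "+rock +music"))
```

Several plain terms widen the search (any one of them is enough), while required terms and quoted phrases narrow it.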

Performance Issues (cont.)
• To speed up a search, the search engine looks only at the beginning of each page.
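
The trade-off can be sketched by indexing only the first few words of a page. This is an illustrative assumption about how such a cutoff might work; `index_terms` and its `max_words` parameter are hypothetical, and the sample page text is invented.

```python
import re

def index_terms(text, max_words=None):
    """Return the set of words to index, optionally taken only from the first max_words words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if max_words is not None:
        words = words[:max_words]
    return set(words)

page = "Search engines index pages. Far below, this page also mentions archie."
print("archie" in index_terms(page))               # whole page examined
print("archie" in index_terms(page, max_words=4))  # only the beginning examined
```

Looking at less text is faster, but a term that appears only later in the page is never indexed and so can never be found.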