Search Engines

Presentation Transcript


  1. Search Engines Thomas Haidlas

  2. Introduction • Directories and meta search engines • How search engines work • What influences the ranking

  3. Directories • Hand-constructed hierarchy of topics (e.g. Yahoo!) • Human editors select, index and classify the pages • Cover only a small part of the web • Updated infrequently • No ranking

  4. Directories II • No search across a page index • Search runs across the editors' reviews instead • Sometimes partnered with search engines to increase coverage

  5. Meta Search Engines • Queries for rare keywords often require more than one web search engine • The same query is submitted to many engines in parallel • Duplicate entries are eliminated • The results are shown in a uniform format • No harvesting or indexing of their own
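
The steps on this slide (parallel submission, duplicate elimination, uniform output) can be sketched in a few lines of Python. This is only an illustration: `engine_a` and `engine_b` are hypothetical stand-ins that return canned results instead of calling real search engines, and the URL normalization is deliberately crude.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical engines: each takes a query and returns (title, url) pairs.
def engine_a(query):
    return [("Example A", "http://example.com/a"),
            ("Shared page", "http://example.com/shared")]


def engine_b(query):
    return [("Shared page again", "http://example.com/shared/"),
            ("Example B", "http://example.com/b")]


def normalize(url):
    # Crude assumption: URLs differing only in case or a trailing slash are duplicates.
    return url.lower().rstrip("/")


def meta_search(query, engines):
    # Submit the same query to all engines in parallel.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    seen = set()
    merged = []
    for engine, results in zip(engines, result_lists):
        for title, url in results:
            key = normalize(url)
            if key in seen:
                continue  # duplicate entry, already reported by another engine
            seen.add(key)
            # Uniform output format, whatever the source engine returned.
            merged.append({"title": title, "url": url, "source": engine.__name__})
    return merged


results = meta_search("search engines", [engine_a, engine_b])
for hit in results:
    print(hit["source"], hit["title"], hit["url"])
```

Note that the sketch keeps no index of its own; it only forwards the query and merges what comes back.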

  6. How search engines work • Harvesting • Indexing • Analyzing requests • Ranking

  7. Harvesting • Programs (robots, gatherers or crawlers) visit web sites and gather the web pages for indexing • Start with an initial page • Follow hyperlinks (<a href=…>) • Sometimes more than 2 sub-levels are visited • These programs are started periodically
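
A minimal sketch of such a harvesting robot, assuming a breadth-first walk with a depth limit. To keep it runnable without network access, the `fetch` function is a parameter and the example uses a hypothetical in-memory "web" (`FAKE_WEB`); a real robot would download pages over HTTP instead.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=2):
    """Gather pages reachable from start_url, following links up to max_depth levels."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved
        pages[url] = html
        if depth >= max_depth:
            continue  # do not follow links below the depth limit
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return pages


# Hypothetical in-memory web used instead of real HTTP requests.
FAKE_WEB = {
    "http://example.com/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://example.com/a.html": '<a href="/c.html">C</a>',
    "http://example.com/b.html": "no links here",
    "http://example.com/c.html": '<a href="/d.html">D</a>',
    "http://example.com/d.html": "too deep to be reached",
}

pages = crawl("http://example.com/", FAKE_WEB.get, max_depth=2)
print(sorted(pages))
```

Because the extractor only looks at `<a href=…>` tags, links expressed in other ways are invisible to this sketch.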

  8. Harvesting II • Problems: • Links are not found in • Frames • Image maps • A search engine starts many robots => traffic

  9. Robot Exclusion • Two methods: • Meta tags: <meta name="robots" content="noindex,nofollow"> • robots.txt (one directive per line):
       User-agent: Scooter
       Disallow: /privat/geht_dich_gar_nix_an.html
       Allow: /allesOffen

  10. Robot Exclusion II • robots.txt (Example 2):
       User-agent: *
       Allow: /allesOffen
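
How a well-behaved robot might evaluate the first robots.txt example can be checked with Python's standard `urllib.robotparser` module. The host `example.com` and the second robot name `OtherBot` are assumptions made for the illustration; the rules themselves are taken from the slide.

```python
from urllib.robotparser import RobotFileParser

# Rules from the first example, one directive per line.
ROBOTS_TXT = """\
User-agent: Scooter
Disallow: /privat/geht_dich_gar_nix_an.html
Allow: /allesOffen
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The robot named in the file may fetch the allowed path ...
allowed = parser.can_fetch("Scooter", "http://example.com/allesOffen/index.html")
# ... but not the disallowed page.
blocked = parser.can_fetch("Scooter", "http://example.com/privat/geht_dich_gar_nix_an.html")
# A robot not mentioned in the file is not restricted by these rules.
other = parser.can_fetch("OtherBot", "http://example.com/privat/geht_dich_gar_nix_an.html")

print(allowed, blocked, other)
```

Robot exclusion is a convention: it only works if the robot chooses to perform a check like this before fetching a page.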

  11. Indexing • The index table receives the harvesting results • The index table contains the keywords • The table is held in main memory => fast access
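
One common way to build such an index table is an inverted index that maps each keyword to the pages containing it. The sketch below assumes this structure and a very simple tokenizer; the slides do not say how a particular engine lays out its table, and the page texts are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Simplistic assumption: keywords are lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every keyword to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


# Hypothetical harvesting results: URL -> page text.
PAGES = {
    "http://example.com/a": "Search engines index the web",
    "http://example.com/b": "Robots harvest the web",
}

index = build_index(PAGES)
print(sorted(index["web"]))
```

The whole table is an ordinary in-memory dictionary here, which is what makes the lookup fast.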

  12. Analysing Requests • The search string is compared with the index table • The search string is a single word => easy processing • The search string contains truncation or Boolean operators => complex processing • If the search string is found in the index, the page is added to the hit list
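
The difference between the easy and the complex case can be illustrated as follows. The sketch assumes a trailing `*` as the truncation symbol and evaluates `AND`/`OR` strictly left to right without operator precedence; both are simplifications chosen for the example, and the small `INDEX` is hypothetical.

```python
def lookup(index, term):
    """Return the pages for one term; a trailing '*' means truncation (prefix match)."""
    if term.endswith("*"):
        prefix = term[:-1]
        hits = set()
        for word, urls in index.items():  # complex case: scan all keywords
            if word.startswith(prefix):
                hits |= urls
        return hits
    return set(index.get(term, set()))  # easy case: one direct lookup


def search(index, query):
    """Evaluate queries such as 'search AND robot' from left to right."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    hits = lookup(index, tokens[0])
    for op, term in zip(tokens[1::2], tokens[2::2]):
        other = lookup(index, term)
        if op == "and":
            hits &= other
        elif op == "or":
            hits |= other
        else:
            raise ValueError(f"unknown operator: {op}")
    return hits


# Hypothetical index table: keyword -> pages.
INDEX = {
    "search": {"p1", "p2"},
    "searching": {"p3"},
    "engine": {"p1"},
    "robot": {"p2", "p3"},
}

print(sorted(search(INDEX, "search")))
print(sorted(search(INDEX, "search*")))
print(sorted(search(INDEX, "search AND robot")))
```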

  13. Ranking • Influences on the ranking: • How many keywords are found • Keyword frequency • Keyword position: • Domain/URL • Document name

  14. Ranking II • Headline • Early in the text • Meta tags • Ranking for cash • PageRank • Click frequency / Hit Popularity Engine
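
The keyword-based influences (frequency and position) can be combined into a single score. The sketch below is purely illustrative: the field names, the weights in `WEIGHTS`, the bonus for early occurrences and the two example pages are all hypothetical, since real engines do not publish their formulas.

```python
import re

# Hypothetical weights: a keyword counts more in prominent positions.
WEIGHTS = {"url": 5.0, "title": 4.0, "headline": 3.0, "body": 1.0}
EARLY_BONUS = 2.0  # extra credit if a keyword appears early in the text
EARLY_WORDS = 20   # "early" means within the first 20 words (an assumption)


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, keywords):
    keywords = [k.lower() for k in keywords]
    total = 0.0
    # Keyword frequency, weighted by the position in which the keyword occurs.
    for field, weight in WEIGHTS.items():
        words = tokenize(page.get(field, ""))
        for kw in keywords:
            total += weight * words.count(kw)
    # Bonus for each keyword that appears early in the body text.
    early = tokenize(page.get("body", ""))[:EARLY_WORDS]
    total += EARLY_BONUS * sum(1 for kw in keywords if kw in early)
    return total


def rank(pages, keywords):
    """Sort pages by descending score."""
    return sorted(pages, key=lambda p: score(p, keywords), reverse=True)


page1 = {"url": "http://example.com/search", "title": "Search engines",
         "headline": "", "body": "How a search engine works"}
page2 = {"url": "http://example.com/misc", "title": "Misc",
         "body": "Notes about search"}

ranked = rank([page2, page1], ["search"])
print([p["url"] for p in ranked])
```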

  15. Ranking for cash • Capitalist principle • Paying money => high ranking • The content is not relevant to the ranking • Additional income

  16. Ranking for cash II • Not used on its own • Mostly used by e-commerce companies • Second method: • Pay for faster indexing

  17. PageRank (Google) • Evaluation by the internet community (web admins) • Relation between the quality of a page and the number of links that point to it • Links from popular web sites count for more

  18. PageRank (Google) II • Disadvantages: • New web sites rank poorly • Queries with many Boolean connectives and keywords are not easy to process
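
The idea that links from popular pages count for more can be made concrete with a simplified power-iteration version of PageRank. This is a textbook-style sketch, not Google's production algorithm; the damping factor of 0.85, the fixed iteration count and the tiny link graph `LINKS` are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small base share ...
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # ... plus an equal part of the rank of each page linking to it.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Pages without outgoing links spread their rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "new" and "b" link out but nobody links to them.
LINKS = {
    "popular": ["a"],
    "a": ["popular"],
    "b": ["popular"],
    "new": ["popular"],
}

ranks = pagerank(LINKS)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

In this graph the page `new` ends up at the bottom because no other page links to it yet, which is the disadvantage for new web sites noted on the slide.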

  19. Hit Popularity Engine • An index already exists and is pre-sorted • A click on a link counts as a vote for that site => the "click" is recorded in the database • Pages with many "clicks" are more popular • Developed by "Direct Hit"

  20. Hit Popularity Engine II • This method is usually combined with others • Disadvantage: • New web sites rank poorly
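
A minimal sketch of click-based re-ranking, assuming clicks are counted per query and per URL in an in-memory counter. The class `ClickTracker` and its methods are hypothetical names for this illustration and do not describe Direct Hit's actual implementation.

```python
from collections import Counter


class ClickTracker:
    def __init__(self):
        # Stand-in for the click database: (query, url) -> number of clicks.
        self.clicks = Counter()

    def record_click(self, query, url):
        """A click on a result counts as one vote for that page."""
        self.clicks[(query, url)] += 1

    def rerank(self, query, results):
        """Re-order a pre-sorted result list so that frequently clicked pages come first.

        sorted() is stable, so pages with equal click counts keep their original order.
        """
        return sorted(results, key=lambda url: self.clicks[(query, url)], reverse=True)


tracker = ClickTracker()
tracker.record_click("search engines", "http://example.com/b")
tracker.record_click("search engines", "http://example.com/b")
tracker.record_click("search engines", "http://example.com/a")

presorted = ["http://example.com/a", "http://example.com/b", "http://example.com/new"]
reranked = tracker.rerank("search engines", presorted)
print(reranked)
```

A page without any recorded clicks stays at the end of the list, which mirrors the disadvantage for new web sites.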

  21. Ranking Manipulation • Why? • Commercial interest • Done by: • Search engine optimizers (SEOs) • Purpose: • To boost the PageRank

  22. Link Farm • Many domains are registered • Programs generate thousands of pages that link to one another • Each page contains keywords • Some of these sites are even given a complex structure

  23. Forwarding • An intermediate page contains the searched-for terms • Forwarding via HTML meta tags and simple JavaScript can be recognized • SEOs complicate the forwarding instructions => not recognized

  24. IP Delivery • The normal site is indexed by the robots • Afterwards, the contents of the site are exchanged

  25. IP Cloaking • Server programs determine who sent the request • Robot requests: "cloaked" content is delivered, designed to influence the ranking • Human visitors do not see the "cloaked" content

  26. Other simple tricks • Links in guestbooks • Particularly effective with high-ranking guestbooks • "Blind text" • Text in the background color

  27. Trade in web links • Paying for links • Partnership => commission

  28. Summary • Select suitable tools • The WWW is dynamic => take new developments into account • Estimate the ranking correctly

  29. Thank You!

