Search Engines



  1. Search Engines
     Bots, Spiders, Engines
     By example

  2. Search Engine Classification
     • Subject catalogs: Yahoo, Web.de
     • Indexes: Lycos, AltaVista, Excite, Google
     • Review services: Webcrawler, Webtip
     • Meta-search systems: MetaCrawler, Apollo7
     • Specialized engines: IMDB, OPAC
     jochen.koubek@hu-berlin.de

  3. Search Technologies
     Building the data collection
     • Harvesting
     • Indexing
     Processing a query
     • Result list
     • Weighting (ranking)
     • Sorting
     • Output
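The two phases on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how any particular engine works: the "harvested" pages are a hypothetical in-memory dictionary, the weighting is plain term frequency, and the names `tokenize`, `build_index` and `search` are made up for this example.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; punctuation handling is omitted."""
    return text.lower().split()

def build_index(documents):
    """Indexing step: map each term to {doc_id: occurrence count}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

def search(index, query):
    """Query step: collect hits, weight them, sort them, return the list."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count  # weighting: raw term frequency (assumption)
    # Sorting: highest score first, ties broken by doc_id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Harvesting is stubbed out: these pages stand in for crawled documents.
documents = {
    "a.html": "search engines index the web",
    "b.html": "spiders crawl the web and the index grows",
    "c.html": "catalogs are edited by hand",
}
index = build_index(documents)
results = search(index, "web index")
print(results)
```

A real engine would replace the dictionary with a crawler and the term-frequency score with one of the ranking methods on the next slide.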

  4. Ranking Methods
     http://www.suchfibel.de/5technik/ranking.htm
     • Document quality
     • Keywords, occurrence
     • Meta tags
     • Paid listing (sponsored links)
     • Banner ads
     • AdWords
     • Hidden placement
     • User behavior
     • Collections (Clever)
     • Click frequency (Direct Hit)
     • Related documents (Alexa)
     • Link frequency
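Of the signals listed, link frequency is the easiest to make concrete: a page that many other pages link to is ranked higher. The sketch below counts inbound links on a hypothetical link graph. The graph, the function name `inlink_counts`, and the choice to ignore self-links and duplicate links are assumptions made for illustration.

```python
from collections import Counter

def inlink_counts(links):
    """Count how many distinct pages link to each target page."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):   # count each linking page at most once
            if target != source:      # ignore self-links
                counts[target] += 1
    return counts

# Hypothetical link graph: page -> pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a", "c"],
    "d": ["c", "c"],
}
counts = inlink_counts(links)
best_page, best_count = counts.most_common(1)[0]
print(best_page, best_count)
```

A raw count treats every linking page as equally important, which is the weakness that link-analysis methods such as PageRank try to address.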

  5. Graph Structure of the Web
     http://www.almaden.ibm.com/cs/k53/www9.final/
     Region      Size
     SCC          56,463,993
     IN           43,343,168
     OUT          43,166,185
     TENDRILS     43,797,944
     DISC.        16,777,756
     Total       203,549,046
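The regions in this table come from a decomposition of the web's link graph around its strongly connected core (SCC): IN pages can reach the core, OUT pages can be reached from it, and the rest are tendrils or disconnected. The sketch below computes such a split for a tiny hypothetical graph. It assumes a seed page known to lie in the core and lumps tendrils and disconnected pages together as "OTHER"; the function names are made up for this example.

```python
from collections import deque

def reachable(graph, start):
    """Breadth-first search: all nodes reachable from start, inclusive."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reverse(graph):
    """Return the graph with every edge direction flipped."""
    rev = {}
    for node, targets in graph.items():
        rev.setdefault(node, [])
        for target in targets:
            rev.setdefault(target, []).append(node)
    return rev

def bow_tie(graph, seed):
    """Split nodes into SCC, IN, OUT and OTHER relative to the seed's core."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    forward = reachable(graph, seed)             # pages the core can reach
    backward = reachable(reverse(graph), seed)   # pages that can reach the core
    scc = forward & backward
    return {
        "SCC": scc,
        "IN": backward - scc,
        "OUT": forward - scc,
        "OTHER": nodes - forward - backward,     # tendrils and disconnected pages
    }

# Hypothetical miniature web: page -> pages it links to.
graph = {
    "in1": ["core1", "tendril"],
    "core1": ["core2"],
    "core2": ["core1", "out1"],
    "out1": [],
    "island": ["island2"],
}
regions = bow_tie(graph, seed="core1")
print({name: sorted(members) for name, members in regions.items()})
```

Finding the largest core without a known seed would need a full strongly-connected-components algorithm, which is omitted here.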

  6. Google – Timeline
     http://www.google.com/corporate/timeline.html
     • 1995: S. Brin and L. Page begin their research project.
     • 1997: BackRub.
     • 1998: 10,000 queries per day.
     • 1999: 8 employees, 3 million queries per day.
     • 2000: Non-English interface, 18 million queries per day.
     • Jan. 2001: 100 million queries per day. Phonebook.
     • July 2001: Image search.
     • Oct. 2001: File types.
     • Dec. 2001: 3 billion web pages indexed. Google News, Catalog, Zeitgeist.
     • 2002: Hardware, Compute, Toolbar, Labs, Programming Contest.

  7. Google – PageRank
     United States Patent 6,285,999
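The slide only cites the patent, so the following is a sketch of the commonly described "random surfer" formulation of PageRank, not the patent text or Google's production method. The damping factor of 0.85, the fixed iteration count, the treatment of pages without outgoing links, and the example graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a link graph given as page -> list of targets."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page receives a small base share (the "random jump").
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A page passes its rank on in equal parts to its targets.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical link graph: page -> pages it links to.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Unlike a plain inbound-link count, a link from a highly ranked page contributes more than a link from an obscure one: page "a" has only one inbound link here, but it comes from the top-ranked page "c".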

  8. Google – Technology
     http://www.google.com/appliance
     • RAIS: Redundant Arrays of Inexpensive Servers
     • 10,000 Linux-based servers, each:
     • 60 queries/minute
     • Limited to 150,000/300,000 documents

  9. Google – Dance
     http://www.google-dance.com/

  10. Google – Customers
      http://www.google.com/press/customers.html
      • Yahoo
      • EarthLink
      • Palm
      • Nextel
      • Netscape
      • Cisco
      • Virgin
      • RedHat

  11. Google in the Market
      http://www.searchenginewatch.com/reports/article.php/2156451
      http://news.com.com/2009-1023-963618.html
      [Chart: search hours]

  12. Google – Features
      http://www.google.com/appliance/features.html
      • Summary
      • Result grouping
      • HTML view
      • Autocorrection
      • Cache
      • Hit highlighting
      • Sort by date

  13. Google – Search
      • + inclusion
      • - exclusion
      • "whole phrase"
      • OR
      • Language
      • Date
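To make the operators on this slide concrete, here is a small sketch of a query parser and matcher. It is an illustration only and not Google's actual query grammar: plain terms are treated as required (like `+`), matching is by case-insensitive substring, the language and date restrictions are left out, and the names `parse_query` and `matches` are made up.

```python
import re

# An optional +/- sign followed by either a "quoted phrase" or a single word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Return (required_groups, excluded_terms).

    Each required group is a list of alternatives joined by OR.
    """
    required, excluded = [], []
    join_next = False                    # True right after an OR keyword
    for sign, phrase, word in TOKEN.findall(query):
        if word == "OR" and not sign:
            join_next = bool(required)
            continue
        term = (phrase or word).lower()
        if not term:
            continue                     # skip empty phrases such as ""
        if sign == "-":
            excluded.append(term)
        elif join_next:
            required[-1].append(term)    # alternative to the previous term
        else:
            required.append([term])
        join_next = False
    return required, excluded

def matches(text, query):
    """True if the text satisfies every required group and no exclusion."""
    required, excluded = parse_query(query)
    text = text.lower()
    if any(term in text for term in excluded):
        return False
    return all(any(term in text for term in group) for group in required)

query = '+google "search engine" -yahoo lycos OR altavista'
parsed = parse_query(query)
hit = matches("Google is a search engine, much like AltaVista.", query)
miss = matches("Google and Yahoo are both search engine sites, like Lycos.", query)
print(parsed, hit, miss)
```

Substring matching means that "cat" would also match "catalog"; a fuller version would match against the tokens of an index instead.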

  14. Special Search
      http://www.google.com/help/features.html
      http://www.google.com/help/operators.html
      • Cached Links*: View a snapshot of each page as it looked when we indexed it.
      • Dictionary Definitions: View a dictionary definition for any or all parts of your query.
      • File Types*: Search for non-HTML file formats including PDF documents and others.
      • I'm Feeling Lucky*: Bypass our results and go to the first web page returned for your query.
      • News Headlines*: Enhances your search results with the latest related news stories.
      • PhoneBook: Look up U.S. street address and phone number information.
      • Similar Pages*: Display pages that are related to a particular result.
      • Site Search*: Restrict your search to a specific site.
      • Spell Checker*: Offers alternative spelling for queries.
      • Stock Quotes: Use Google to get stock and mutual fund information.
      • Street Maps: Use Google to find U.S. street maps.
      • Web Page Translation*: Provides English speakers access to a variety of non-English web pages.
      • Who links to you?*: Find all the pages that point to a specific URL.

  15. Google – Zeitgeist
      http://www.google.com/press/zeitgeist.html

  16. Google – Options
      http://www.google.com/options/index.html
      • Web Search
      • Web Directory
      • Groups
      • Images
      • News
      • Answers
      • Labs
      • Special Searches (.edu, Mac, Linux, BSD)
      • Wireless
      • Froogle
      • Catalogs
      • Safe Search

  17. Google – File Formats
      http://www.google.com/help/faq_filetypes.html
      • Adobe Portable Document Format (pdf)
      • Adobe PostScript (ps)
      • Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)
      • Lotus WordPro (lwp)
      • MacWrite (mw)
      • Microsoft Excel (xls)
      • Microsoft PowerPoint (ppt)
      • Microsoft Word (doc)
      • Microsoft Works (wks, wps, wdb)
      • Microsoft Write (wri)
      • Rich Text Format (rtf)
      • Text (ans, txt)

  18. Google – Tools
      http://www.google.com/options/index.html
      • Google in Your Language
      • Browser Buttons
      • Toolbar
      • Web page translation
      • Web APIs, Google Hacks
      • Google Compute

  19. Google Secondary Sites
      • Google - Whack
      • Google - Dance
      • Google - Alert
      • Google - Watch
      • Google - Hacks
      • Google - Forum
      • Google - Weblog
      • ChillingEffects

  20. Google – Criticism
      http://www.google-watch.org
      http://google.blogspace.com/
      2003-02-21 (BBC): "The much-praised reputation mechanism that is supposed to ensure that bloggers remain true, honest and factually-correct is, in fact, just the rule of the mob, where those who shout loudest and get the most links are taken more seriously. It is the online equivalent of saying that The Sun newspaper always tells the truth because four million people read it, and The Guardian is intrinsically less trustworthy as it only sells half a million."
      • Immortal cookies
      • Data retention
      • Cached copies
      • Cooperation with the NSA
      • Link farms
      • Search King
      • Censorship, e.g. Scientology, Nazi propaganda
      • China

  21. "googlende" (the end)
