Search Engine and SEO Presented by Yanni Li
History • Meta tag - an HTML element that describes the properties of a webpage or website • However, it was soon found that a high ranking in search results carried large commercial benefits, and some webmasters abused meta tags by including irrelevant keywords to artificially increase impressions for their websites and increase their ad revenue
What is SEO? • Search engine optimization (SEO) is the process of improving the volume or quality of traffic to a website from search engines via "natural" or unpaid search results. • SEO has developed into a profession. • Before starting, the first thing to understand is how search engines (SEs) rank websites.
SE Ranks Documents by Scores • Generally, SEs rank documents by their estimate of the usefulness of a document for a user query. Most SE systems assign a numeric score to every document and rank documents by this score. • Different SEs use different scoring mechanisms. • Google makes heavy use of the structure present in hypertext.
Google(1) • The simplest case is a single word query. In order to rank a document with a single word query, Google looks at that document's hit list for that word. Google considers each hit to be one of several different types (title, anchor, URL, plain text large font, plain text small font ...), each of which has its own type-weight.
Google(2) • The type-weights make up a vector indexed by type. Google counts the number of hits of each type in the hit list. Then every count is converted into a count-weight. Count-weights increase linearly with counts at first but quickly taper off, so that more than a certain count will not help. Google takes the dot product of the vector of count-weights with the vector of type-weights to compute an IR score for the document.
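The single-word scoring described on the two Google slides can be sketched as follows. This is only an illustration under stated assumptions: the type-weight values, the cap `COUNT_CAP`, and the function names are hypothetical, and the tapering is simplified to "linear up to a cap, then flat". The slides do not give the actual values or the exact taper used by Google.

```python
from collections import Counter

# Hypothetical type-weights: the slides name the hit types but give no values.
TYPE_WEIGHTS = {
    "title": 5.0,
    "anchor": 4.0,
    "url": 3.0,
    "large_font": 2.0,
    "small_font": 1.0,
}

# Assumed cap: counts above this value add nothing to the score.
COUNT_CAP = 3


def count_weight(count, cap=COUNT_CAP):
    """Grow linearly with the count, then stay flat once the cap is reached."""
    return float(min(count, cap))


def ir_score(hit_list):
    """Dot product of the count-weight vector with the type-weight vector."""
    counts = Counter(hit_list)  # hit types absent from the list count as 0
    return sum(TYPE_WEIGHTS[t] * count_weight(counts[t]) for t in TYPE_WEIGHTS)


def rank(docs):
    """Return document names ordered from highest to lowest IR score."""
    return sorted(docs, key=lambda name: ir_score(docs[name]), reverse=True)


# Each document is represented by its hit list for the query word.
docs = {
    "a": ["title", "small_font", "small_font"],
    "b": ["small_font"] * 10,
}
print(rank(docs))
```

In this toy data, document `b` repeats the word ten times in small plain text but still scores lower than `a`, because the capped count-weight stops rewarding repetition while the title hit carries a larger type-weight.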
Two Kinds of SEO • White Hat SEO -- conforms to the search engines' guidelines and involves no deception -- creates content for users and search engines • Black Hat SEO -- tends to deceive search engines -- the content a search engine indexes and ranks isn’t the same as the content a user will see.
Some White Hat SEOs • Domain selection -- choose a domain that contains keywords • Design search-engine-friendly webpages -- search engines don’t like too much Flash or JavaScript -- make the site easy and fast to crawl • Write articles of a suitable length -- too short: won’t rank highly -- too long: loses keyword density, so ranks lower, and users tend to close the article at first glance • Keep each article to a compact theme -- a long article covering a number of only loosely related topics won’t rank very well in search engines.
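The point about article length and keyword density can be illustrated with a small sketch. It assumes the common definition of keyword density as occurrences of the keyword divided by the total word count; the function name and the sample texts are hypothetical, and real search engines do not publish how (or whether) they use this measure.

```python
import re


def keyword_density(text, keyword):
    """Fraction of the words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# The same keyword use in a short and in a padded, much longer article.
short_article = "seo basics"
long_article = "seo basics " + "filler " * 8

print(keyword_density(short_article, "seo"))
print(keyword_density(long_article, "seo"))
```

With the same single mention of the keyword, the padded article has a much lower density than the short one, which is the dilution effect the slide warns about for overly long articles.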
Some Black Hat SEOs • Doorway pages -- automatically generate a large number of keyword pages -- visitors are automatically redirected from these pages to the home page • Cloaked pages • Keyword stuffing • Link spam -- set up multiple web pages pointing to a target web page to boost the latter’s total in-links -- it is easy to build a new webpage, so this kind of spam is growing rapidly.
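To make the link spam bullet concrete, the sketch below counts in-links from a list of (source, target) link pairs and shows how a handful of throwaway pages inflates the target's total. The page names and the `count_in_links` helper are hypothetical, and a raw in-link count is a deliberately naive stand-in for whatever link analysis a real search engine performs.

```python
from collections import Counter


def count_in_links(links):
    """Count distinct in-links per target page, ignoring self-links."""
    unique_links = set(links)  # the same link listed twice counts once
    return Counter(target for source, target in unique_links if source != target)


# An honest link graph: two pages link to each site.
honest = [("blog", "shop"), ("news", "shop"), ("blog", "rival"), ("news", "rival")]

# Link spam: five throwaway pages are created just to point at "shop".
spam = [(f"spam{i}", "shop") for i in range(5)]

before = count_in_links(honest)
after = count_in_links(honest + spam)
print(before["shop"], after["shop"], after["rival"])
```

Because creating the extra pages costs the spammer almost nothing, a ranking signal based only on the raw in-link total is easy to manipulate.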