
The Evolving Internet: Implications, Strategies, and Techniques for Effective Research

This presentation explores the evolving nature of the internet and its implications for research. It covers search engine basics, business and news search, social search, basic information trapping, and the future of the internet. Key topics include the role of attention, the meaning of relevance, and the invisible web.


Presentation Transcript


  1. The Evolving Internet: Some Implications, Strategies, and Techniques for More Effective Research. MSU Product Center, September 26, 2007. Professor Larry G. Hamm

  2. Presentation Outline • Introduction • Search Engine Basics • Business Search with Google • News Search • Social Search • Basic Information Trapping • The Future??

  3. QUESTIONS? • Who is Tim Berners-Lee? • What happened for “research” in 1990?

  4. Current Number of Websites, July 2007: 489,774,269

  5. Top Global Web Properties Ranked by Total Unique Visitors (000)* June 2007 Total Worldwide, Age 15+ - Home and Work Locations
       Property                          Number (000’s)   Percent Reach
       Total Unique Internet Visitors    778,310          100
       Google Sites                      544,783          70
       Microsoft Sites                   529,155          68
       Yahoo! Sites                      471,924          61
       Time Warner Network               266,367          34
       eBay                              264,732          34
       Wikipedia Sites                   208,120          27
       Fox Interactive Media             163,545          21
       Amazon Sites                      145,947          19
       Apple Inc.                        123,554          16
       Adobe Sites                       121,966          16
       CNET Networks                     116,579          15
       Ask Network                       115,655          15
       Viacom Digital                    88,654           11
       Lycos Sites                       77,517           10
       The Mozilla Organization          70,850           9

  6. Share of Online Searches by Engine, August 2007. Total U.S. Home, Work and University Internet Users. Source: comScore qSearch. * Excludes traffic from public computers such as Internet cafes or access from mobile phones or PDAs.

  7. Share of Online Searches by Engine, August 2007. Total U.S. Home, Work and University Internet Users. Source: comScore qSearch. * Excludes traffic from public computers such as Internet cafes or access from mobile phones or PDAs.

  8. Herbert Simon, Nobel Prize Economist: “What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention” SOURCE: “Designing Organizations for an Information-Rich World,” in Donald M. Lamberton, ed., The Economics of Communication and Information (Cheltenham, England: Edward Elgar, 1997).

  9. The Source of Power? • Knowledge is no longer the “scarce” resource. • Attention is the “limiting factor”! • Implications: • Global--- Decisions on what is brought into global consciousness • Research --- Discipline to direct and control your attention

  10. The Role of ATTENTION THEREFORE: • “The most important function of attention is not taking information in, but screening it out.”

  11. Introduction: The Meaning of Relevance. Definition: The degree to which a search record (piece of information) meets the researcher’s query. • PROBLEM - Relevance to Search Engine and Researcher Are DIFFERENT • To a researcher: Does the result help answer the intent of the query? • To a Search Engine: Does the result meet the search engine’s ranking algorithm?

  12. Summary and Conclusion • Precision searching requires the process of consciously narrowing and eliminating the gap between researcher’s and search engine’s RELEVANCY • Knowledge of the search process and the characteristics of information sources is required to attack search engine relevance. • Intuition is required by the researcher to focus on formulating the search statements.

  13. Search Engine Basics • The Invisible versus the Visible Web • Defining and Identifying Search Engines • How Search Engines Work • Why Google?

  14. The Invisible Web • Great amounts of information exist that are not accessible via internet search engines • Much was formatted digitally but not ‘indexed’ (see later lecture) • “Google Books” project is the grandest attempt to date to ‘shrink’ the invisible web. • Invisible Web information is differentiated by: • ACCESS • MODE of creation

  15. The Invisible Web (continued) • Information is differentiated by the nature of ACCESS to it: • 1. Publicly available --- Libraries • 2. Semi-public --- ‘Private’ Libraries i.e. MSU Libraries • 3. Private data --- Only available for purchase or through reciprocity

  16. The Invisible Web (continued) Types of Private Data – • Private data sets open to anyone with a checkbook (Mintel) • Restricted private data sets --- to contributors (Trade Association) • Proprietary data of individual firms/public institutions (Freedom of Information Act) • Spy data (commercial and public) • Private data interfaces with ‘Searchable’ data when private data firms use “free sample” or “versioning” marketing strategies

  17. The Invisible Web (continued) Differentiated by MODE of creation: • PRIMARY versus SECONDARY Data • Primary Data is data collected/generated through direct observation, survey, or poll • Secondary Data is data that is ‘repackaged’ primary data • Secondary data results from an ‘editing’ process • Evaluating secondary data requires an identification and evaluation of the base source(s) • Always go to the “ORIGINAL SOURCE”!!!

  18. The Visible Web: Defining and Identifying Search Engines. What is a search engine? • Definition – A search engine is an enormous database of websites compiled by a software robot that seeks out and indexes websites. How does it work? • Sends a ‘spider’ or ‘crawler’ to visit a Web page and find the information on the page. • The ‘crawler’ then sends its “finds” to an indexer, which takes every word on a Web page, logs it, categorizes it, and then stores the results in a huge database.
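
The crawler-to-indexer step described on this slide can be sketched as a tiny "inverted index". This is only an illustrative sketch, not how any real search engine is implemented: the page URLs and text in `PAGES` are made up, and the helper names `tokenize` and `build_index` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical "crawled" pages: URL -> page text (invented for illustration).
PAGES = {
    "http://example.com/a": "Michigan apple growers market fresh apple cider",
    "http://example.com/b": "Cherry growers in Michigan",
    "http://example.com/c": "Fresh cider recipes",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Log every word on every page: map each word to the URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


index = build_index(PAGES)
# Looking up a word is now a single dictionary access, not a scan of every page.
print(sorted(index["michigan"]))
```

The point of the design is that the expensive work (reading every page) happens once, at indexing time; answering a query only consults the stored index.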

  19. Defining and Identifying Search Engines What types of search engines exist? www.searchengineshowdown.com www.lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html • General All Purpose Search Engines (Big 4) – Google; YahooSearch; Live.com; Ask.com • Metasearch Engines – Search engines that search other search engines (S.E. ‘bot’) – • www.dogpile.com • www.clusty.com • www.kartoo.com

  20. Defining and Identifying Search Engines What types of search engines exist? (continued) Specialized Search Engines (Vertical Search Engines) – Search engines dedicated to specific subject areas or specific purposes. For research: www.lii.org • “Customized Search Engine” – Now anyone can create one www.google.com/coop/cse/ --- See www.customsearchguide.com

  21. The Visible Web: How Search Engines Work. Search Engine – RANKING ALGORITHMS • WHAT? – Ranking Algorithms are used to ORDER the search results • WHY DOES ORDER MATTER? Answer - ATTENTION, because the researcher wants ‘help’ in deciding relevance for the searcher's needs • HOW? - Most ranking algorithms are and continue to be ordered by the frequency of use of the searched “WORDS” • Google created a new addition to their Ranking Algorithm
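
Ordering results by how often the searched words appear can be sketched in a few lines. This is a deliberately naive illustration of word-frequency ranking, not any engine's actual algorithm; the documents in `DOCS` and the helpers `score` and `rank` are assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical page texts (invented for illustration).
DOCS = {
    "page1": "apple cider and apple pie",
    "page2": "cider mill tours",
    "page3": "apple apple apple orchard",
}


def score(text, query_words):
    """Count how many times the query words occur in the text."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(counts[word] for word in query_words)


def rank(docs, query):
    """Return matching page names, most frequent use of the query words first."""
    words = query.lower().split()
    scored = [(score(text, words), name) for name, text in docs.items()]
    # Sort by descending score; break ties alphabetically by page name.
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for points, name in ordered if points > 0]


ranking = rank(DOCS, "apple cider")
print(ranking)  # ['page1', 'page3', 'page2']
```

Note that page3 ties with page1 only by repeating one word, which hints at why pure frequency ranking is easy to game.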

  22. The Visible Web: How Google Works. 1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book - it tells which pages contain the words that match the query. 3. The search results are returned to the user in a fraction of a second.

  23. The Visible Web: Conclusion. An Overview of a Basic Search • Be very proficient with ONE search engine • Remember because of different software approaches and indexing, NO TWO SEARCH ENGINES WILL PRODUCE THE SAME RESULTS • When very focused and search is narrowed, identify and use other specific engines • Should the “Product Center” create their own?

  24. Business Search with Google • Translating Web Language • Underlying Search Logic • Understanding Google Search Features • Conclusion

  25. Translating Web Language Reading URL’s – Uniform Resource Locator • This is the Web site’s address; i.e. where a Web site lives • Example: http://online.wsj.com/article/SB114609925357637113.html • http: - Hypertext Transfer Protocol • the way the information is transferred on the Web. • HTML – Hypertext Markup Language is the current Web language • XML (eXtensible Markup Language) is coming as the vehicle for information trapping

  26. Translating Web Language (continued) Reading URL’s (continued) online.wsj.com (domain name) of the server • Domain Suffix (com) – Perhaps the first and most important thing to examine • Assigned by ICANN – Internet Corporation for Assigned Names and Numbers – www.icann.org • Country Codes (.uk) follow domain suffix • (.us) not used by most U.S. sites except with state/local government sites. • Current Issues?

  27. Translating Web Language (continued) Reading URL’s (continued) Common DOMAIN SUFFIXES • .com - commercial site • .edu - educational institution • .gov - government agency in the U.S. • .net - network with most assigned to ISP networks • .org - non-profit/non-commercial organization (Caution: many companies are setting up “non-profits” to get .org domain suffixes to disguise their agendas) • OTHERS - .mil, .biz, .info, .coop, .pro
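
Reading a URL by its parts can also be done programmatically. The sketch below uses Python's standard `urllib.parse.urlsplit` on the Wall Street Journal example from the slides; the helper `describe_url` and its dictionary keys are hypothetical names chosen for illustration.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the pieces discussed above: protocol, domain name, suffix."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    return {
        "protocol": parts.scheme,      # e.g. "http"
        "domain_name": host,           # e.g. "online.wsj.com"
        # Last label of the host name; for a host ending in ".uk" this would
        # be the country code rather than a suffix such as "com".
        "domain_suffix": labels[-1] if host else "",
        "path": parts.path,            # location of the page on that server
    }


info = describe_url("http://online.wsj.com/article/SB114609925357637113.html")
print(info["protocol"], info["domain_name"], info["domain_suffix"])
# http online.wsj.com com
```

Checking `domain_suffix` first mirrors the advice above: the suffix is a quick initial signal about who is publishing the page.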

  28. Underlying Search Logic Boolean Logic Searches • Definition - Use of mathematical set theory to retrieve search information. • AND, OR, and NOT searches • See following Venn diagrams:

  29. Underlying Search Logic (continued) Boolean Logic Searches - AND

  30. The Visible Web: Why Google? (continued) Boolean Logic Searches - OR

  31. Underlying Search Logic (continued) Boolean Logic Searches - NOT
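
The AND, OR, and NOT searches map directly onto set operations, which is what the Venn diagrams depict. A minimal sketch with Python sets follows; the page names and the two result sets are made up for illustration.

```python
# Hypothetical result sets: pages that contain each search word.
apple = {"page1", "page2", "page3"}
cider = {"page2", "page3", "page4"}

# AND: pages containing both words (intersection, the overlap of the circles).
and_result = apple & cider

# OR: pages containing either word (union, both circles together).
or_result = apple | cider

# NOT: pages containing "apple" but not "cider" (set difference).
not_result = apple - cider

print(sorted(and_result))  # ['page2', 'page3']
print(sorted(or_result))   # ['page1', 'page2', 'page3', 'page4']
print(sorted(not_result))  # ['page1']
```

AND narrows a search, OR broadens it, and NOT removes unwanted results, which is why the choice of operator matters for precision searching.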

  32. The Visible Web: Why Google? Google Has Two Basic Strengths Over Other Search Engines • Popularity Ranking • Number of and Breadth of Features

  33. The Visible Web: Why Google? (continued) “Popularity” Ranking – “The Google Creation” • A page’s ranking includes a score for how many “other pages” link to it i.e. How ‘popular’ it is with other Web sites • This is done on multiple levels. For Example: If page X and Y both have 100 pages linked to them, but the 100 Y pages have more links to them than do the 100 X pages, Y gets a higher score for ranking
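
The X-versus-Y example can be sketched with a toy two-level link count. This is a simplified illustration under stated assumptions, not Google's actual ranking formula: the link graph `LINKS` is invented (two linking pages each instead of 100), and `inbound` and `two_level_score` are hypothetical helpers.

```python
# Hypothetical link graph: page -> pages it links to.
# X and Y each have two pages linking to them, but the pages that link
# to Y are themselves linked to by other pages (a, b, c).
LINKS = {
    "x1": ["X"], "x2": ["X"],
    "y1": ["Y"], "y2": ["Y"],
    "a": ["y1"], "b": ["y1"], "c": ["y2"],
}


def inbound(links):
    """Invert the graph: map each page to the set of pages linking to it."""
    incoming = {}
    for source, targets in links.items():
        incoming.setdefault(source, set())
        for target in targets:
            incoming.setdefault(target, set()).add(source)
    return incoming


def two_level_score(page, incoming):
    """Each direct linker counts 1, plus 1 for every page linking to that linker."""
    return sum(1 + len(incoming[linker]) for linker in incoming[page])


incoming = inbound(LINKS)
print(two_level_score("X", incoming))  # 2
print(two_level_score("Y", incoming))  # 5
```

X and Y have the same number of direct links, yet Y scores higher because its linking pages are themselves more "popular", which is the multi-level idea in the slide.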

  34. The Visible Web: Why Google? (continued) “Popularity” Ranking – “The Google Creation” (continued) THE UNDERLYING ASSUMPTION: • A Web page that has more pages linked indirectly (like a pyramid scheme) to it implies that more pages find it relevant, implying that it will be more relevant to you. • Analogy – Your popularity is ranked within high school by how many friends your friends have and how many friends those friends have and so on.

  35. The Visible Web: Why Google? (continued) “Popularity” Ranking – “The Google Creation” (continued) “THE GOOGLE BIAS” • New pages won’t have as many links as established pages; therefore a lower ranking. • Analogy: New friends might be better than the old friends.

  36. The Visible Web: Why Google? (continued) Google’s Breadth of Features • Home Page Features – One of the Cleanest/Clutter Free Pages • Advanced Search Features • Business research useful features are highlighted here

  37. The Visible Web: Why Google? (continued) Google’s Advanced Search Features • Advanced features allow searchers to narrow their queries to very specific searches • Narrowed searches allow the gaps between ‘researcher’ and ‘search engine’ RELEVANCY to close much more quickly • With precision query formulation, the search will be faster and more useful • 8 highlighted advanced features

  38. The Visible Web: Why Google? (continued) 1. Google uses a modified Boolean Search. Searches can be done from Google Home Page or from Advanced Features Page

  39. The Visible Web: Why Google? (continued) • “Phrase Searching” • Google automatically “ANDS” words • Accepts one or more “OR’s” • Use a minus sign in front of term to “NOT” it • Google will not search on very common “STOP” words like “a”, “it”, and “the”.

  40. The Visible Web: Why Google? (continued) • 2. Option to retrieve only a specific file format • (pdf), (ps), (xls), (ppt), (doc), (rtf) • Very useful if searching for a certain ‘type’ of data. For example: .xls and financial data.

  41. The Visible Web: Why Google? (continued) 3. Date restrictions 4. Window to limit retrieval to title or URL fields 5. Box for limiting to (or excluding) a particular DOMAIN or URL

  42. The Visible Web: Why Google? (continued) 6. Page Specific Searches: • for pages similar to the entered URL • for pages that link to the entered URL 7. Links to “Topic-Specific Searches” • for pages similar to the entered URL • for pages that link to the entered URL 8. Domain specific searches for .gov, .mil, and .edu
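
Several of the advanced features above can also be typed straight into the search box as operators. The sketch below only assembles a query string; it does not contact Google. The helper `build_query` and its parameters are hypothetical, the search terms are invented, and the `filetype:` and `site:` operators are assumed to behave as described on Google's operators help page listed in the next slide (operator support can change over time).

```python
def build_query(all_words=(), phrase=None, any_words=(), exclude=(),
                filetype=None, site=None):
    """Assemble a search string from the pieces of a narrowed query."""
    parts = list(all_words)              # plain words are ANDed automatically
    if phrase:
        parts.append(f'"{phrase}"')      # quotation marks force phrase searching
    if any_words:
        parts.append(" OR ".join(any_words))
    parts.extend(f"-{word}" for word in exclude)  # minus sign means NOT
    if filetype:
        parts.append(f"filetype:{filetype}")      # e.g. xls for spreadsheets
    if site:
        parts.append(f"site:{site}")              # limit to a domain
    return " ".join(parts)


query = build_query(
    all_words=["cherry"],
    phrase="market share",
    any_words=["tart", "sweet"],
    exclude=["recipe"],
    filetype="xls",
    site="msu.edu",
)
print(query)
# cherry "market share" tart OR sweet -recipe filetype:xls site:msu.edu
```

Building the query from named pieces makes each narrowing decision explicit, which fits the goal of closing the gap between researcher and search engine relevancy.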

  43. Everything About Google?? • http://www.google.com/intl/en/help/refinesearch.html#domain • http://www.google.com/intl/en/help/operators.html • http://www.google.com/intl/en/help/cheatsheet.html • http://www.google.com/intl/en/help/features.html • http://www.google.com/options/

  44. The Visible Web: The Greatest Google Feature?? • Skip the Title - Click the cache? WHY? • Google ‘Highlights’ (different color) keywords/phrases • No pop-ups that are attached to Web pages • Faster – Google’s servers are the best in the world • Allows for ‘text only’ versions • Allows access when the current site is ‘unavailable’

  45. The Visible Web: The Greatest Google Feature?? • Further ‘Search’ Within the Result Generated Sites • If not in cache but titled page, use browser’s “Find” button (Control+F) to show keywords/phrases • Use (Control+F) for NEW search with new words and phrases

  46. The Visible Web: Conclusions • Is the desired information - CONCEPTUAL or FACTUAL? • If Conceptual: In-depth research (library, books, scholarly journals, etc.) is most likely necessary to effectively frame the search. • If Factual: A search engine web search can most likely proceed • But always strive to find the “Original Source”

  47. The Visible Web: Conclusions • Set a time limit - ‘Web Surfing’ can be addictive, causing: • Tendencies to wander off task • Attention ‘fatigue’ resulting in overlooking possible sources • All other forms of destructive social and moral behaviors.

  48. News Searching What Do You Want: • Read news without ‘a paper, TV, or radio’? • Just see last second’s headline? • Find older stories? • Monitor an industry? • Other?
