A systematic overview of Internet information sources. Paul.Nieuwenhuysen@vub.ac.be Vrije Universiteit Brussel; Informatie- en Bibliotheekwetenschap, Universitaire Instelling Antwerpen, België. Presentation at Internet Librarian International, in London, England, March 2002
A systematic overview of Internet information sources Paul.Nieuwenhuysen@vub.ac.be • Vrije Universiteit Brussel • Informatie- en Bibliotheekwetenschap, Universitaire Instelling Antwerpen, België Presentation at Internet Librarian International, in London, England, March 2002. These slides are available through the WWW from http://www.vub.ac.be/BIBLIO/nieuwenhuysen/presentations/
Contents / summary of this presentation A systematic overview of information sources and services that are accessible through the Internet, such as • general WWW directories and search engines, and • more specialised systems to find • images, • books, • journal articles, • newsgroup messages.
Internet based information sources: problems / difficulties (Part 1) • Redundancy and overlap: on some topics there is too much information; in other words, the redundancy and overlap are high in many cases. • Too few information sources: on other topics there are too few information sources.
Internet based information sources: problems / difficulties (Part 2) • No order is imposed on most sources. • Quality checks / quality control are not performed. • Related to this: there is no requirement to register newly offered information. • Is the information that you find real, honest, authentic?
Internet based information sources: problems / difficulties (Part 3) • Change is the only constant: information sources are constantly changing and growing, and sometimes disappearing.
Internet based information sources: problems / difficulties (Part 4) • Scattering: There is no single simple but powerful system to find relevant information through the Internet. In other words: integration / aggregation is still far from perfect.
Internet based information sources: problems / difficulties (Part 5) • Slow: In many places and for many applications, the Internet is not yet fast enough.
Internet based information sources: problems / difficulties (Part 6) • In conclusion: Surfing, using the Internet, the WWW, can be a time sink instead of a productive activity.
Types of online access information systems: “free” versus “fee” • Public access information sources free of charge • Fee-based online information services (NOT free of charge)
Types of online access information systems: “free” for members only • Public access information sources free of charge • Fee-based online information services (NOT free of charge) • Fee-based online information services, made accessible “free of charge” by an institute to its members
Encyclopedias accessible through Internet and WWW • Dictionaries and encyclopedias are the first choice among many types of information sources, • when we do not need detailed information on a common topic • when we want to prepare a more detailed search on an unfamiliar topic, by searching for the right spelling, synonyms, context,… • Some dictionaries and encyclopedias are available through the WWW free of charge.
Example Encyclopedias accessible through Internet and WWW: examples • Encarta Concise Free Encyclopedia: http://encarta.msn.com/ • Encyclopædia Britannica (only a small part is available free of charge, plus links to selected WWW sites): http://www.britannica.com/ • Encyclopædia Britannica Concise: http://education.yahoo.com/reference/encyclopedia/
Example Encyclopedias accessible through Internet and WWW: examples • The Canadian Encyclopedia (in English and in French): http://thecanadianencyclopedia.com/
Internet: subject-oriented meta-information offered via WWW Information about information sources: in the form of • subject hypertext directories = subject guides • key word indexes, generated automatically, for searching
Internet global subject directories: introduction • They are virtual libraries with open shelves, for browsing. • They are generated manually, by many people. • They can be browsed following a tree structure or a more complicated variation. • The most famous of these systems are among the most popular and most visited sites on the WWW, for instance Yahoo!
Internet global subject directories: structure The structure corresponds to a classification that is in most cases specific for the particular overview. In other words: the well-known and classical universal classification systems are not used in most Internet directories.
Internet global subject directories: limitations • They cover only a small number of selected WWW sites, in comparison with the total number of sites that are accessible. • They are suitable mainly for broad searches that can be difficult to formulate in words, but NOT for more specific searches that require combinations of several concepts.
Example Internet global subject directories: Yahoo! • A hypertext global subject directory can be found at http://www.yahoo.com/ and at many other sites, including http://www.yahoo.co.uk/ • Entries are NOT rated. • Accessible free of charge.
Example Internet global subject directories: BUBL link • A hypertext global subject directory to more than 10 000 WWW sites for the higher education community can be found at http://bubl.ac.uk/link/ • Accessible free of charge.
Example Internet global subject directories: Google directory • A hypertext global subject directory can be found at http://directory.google.com/ • Accessible free of charge. • Very similar to the Open Directory Project.
Examples Internet subject directories focusing on a specific subject domain (Part 1) • Computer science & engineering: http://www.ub.lu.se/eel/ • Social sciences: http://www.sosig.ac.uk/ • Marine science and oceanography: http://oceanportal.org/
Examples Internet subject directories focusing on a specific subject domain (Part 2) • Medicine and healthcare: http://www.omni.ac.uk/ and http://www.medscape.com/ • General pediatrics: http://GeneralPediatrics.com and http://www.pedinfo.com/ • Engineering: http://www.eevl.ac.uk/ • Civil engineering: http://www.icivilengineer.com/
Examples Internet subject directories focusing on a specific subject domain (Part 3) • Fishing: http://www.onefish.org/ • Art, architecture and the media: http://www.adam.ac.uk/ or http://adam.ac.uk/
Internet indexes: automated search tools • Several systems allow you to search for and locate many items (addressable resources) on the Internet in a more systematic, direct way than by browsing/navigating alone. • These systems do NOT search the complete contents of computers across the real Internet in real time when a user makes a query. Searching in that way would be much too slow due to limitations in the technology.
Internet indexes: scheme of the mechanism [Diagram. Labels: user searching for Internet based information; Internet client hardware and software; user interface to a search engine; Internet information source; Internet index search engine; Internet crawler and indexing system; database of Internet files, including an index]
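The mechanism named in this scheme (a crawler fills a database, an index is built over it, and the search engine answers queries from that index rather than from the live Internet) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the page addresses and texts are hypothetical stand-ins for what a crawler would have fetched, and real search engines are far more elaborate.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query (implicit AND)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical pages standing in for files collected earlier by a crawler.
PAGES = {
    "http://example.org/a": "Library catalogues and book databases",
    "http://example.org/b": "Image search engines on the WWW",
    "http://example.org/c": "Book search engines and library services",
}

INDEX = build_index(PAGES)
print(sorted(search(INDEX, "library book")))
```

The query is answered from `INDEX` alone; no page is fetched at query time, which is why such a system can be fast but may also be out of date.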
Example Internet indexes: Google (Part 1) • You can search for WWW pages at http://www.google.com/ • The “simple search” option does NOT offer/allow • full Boolean searches; • stemming/truncation.
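To make the two missing features concrete, here is a small sketch of what a full Boolean search with truncation does in systems that support it. The postings table, the document numbers and the `*` truncation symbol are assumptions chosen for illustration; this is not the query syntax of any particular search engine.

```python
# Hypothetical index: word -> set of document numbers containing that word.
POSTINGS = {
    "library": {1, 2, 4},
    "libraries": {3},
    "catalogue": {2, 3},
    "bookshop": {4},
}


def truncation_expand(term, vocabulary):
    """Expand a right-truncated term such as 'librar*' to all matching words."""
    if term.endswith("*"):
        prefix = term[:-1]
        return sorted(w for w in vocabulary if w.startswith(prefix))
    return [term] if term in vocabulary else []


def lookup(term, postings):
    """Return the documents that contain the term or any of its expansions."""
    docs = set()
    for word in truncation_expand(term, postings):
        docs |= postings[word]
    return docs


# Boolean query: (librar* AND catalogue) NOT bookshop
hits = (lookup("librar*", POSTINGS) & lookup("catalogue", POSTINGS)) - lookup("bookshop", POSTINGS)
print(sorted(hits))  # prints [2, 3]
```

Set intersection, union and difference play the roles of AND, OR and NOT.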
Example Internet indexes: Google (Part 2) • For retrieval, an algorithm is used that takes into account the links between WWW pages. A retrieved page is ranked higher when • many sites/pages point to it • “important” sites/pages point to it • Searches include full text searching of files on the WWW; not only html pages, but also files in the formats Adobe PDF, Microsoft Word, Microsoft Excel,…
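The two ranking criteria above can be illustrated with a simplified link-based ranking computed by repeated redistribution of scores. This is a generic sketch, not Google's actual algorithm: the damping factor of 0.85, the number of iterations and the tiny link graph are all assumptions made for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages so that links from many, or from high-scoring, pages count more."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small base score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on in equal shares to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A, B and D all point to C; C points to A.
LINKS = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
RANKS = link_rank(LINKS)
for page in sorted(RANKS, key=RANKS.get, reverse=True):
    print(page, round(RANKS[page], 3))
```

In this toy graph C scores highest because many pages point to it, and A scores above B and D because the "important" page C points to it.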
Example Internet indexes: Google additional features • Besides a system to search for WWW pages, Google also offers • a subject directory • searching for images on the WWW • searching an archive of Usenet messages + posting to Usenet groups • Thus Google has become a great integrator / aggregator.
Internet indexes: coverage / size of each index The indexes grow and their “size ranking” is variable. Biggest systems in 2002: • AltaVista • Fast = All the Web • Google • Systems based on the INKTOMI database of WWW pages, such as Hotbot, MSN Web search,…
Internet indexes cover only a part of the Internet: introduction • The “visible” part of the Internet • The “hidden, invisible” part of the Internet and the WWW (that is not searchable using a global index like AltaVista, Google...)
Current awareness services focusing on WWW pages: introduction • Tracking changes in one or more public access pages on the WWW, or finding new pages, is possible • by using one of the available, suitable programs loaded on your client workstation • through “alert” services based on a server on the WWW • that track updates for the user/subscriber • and send alerts by email to the user/subscriber • Few systems are free of charge.
Example Current awareness services focusing on WWW pages: Tracerlock • http://www.tracerlock.com/ can use one of several external Internet indexes, with a simple search query that you supply, to discover relevant changed or new WWW pages for you in the future
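How a tracking program or alert service might notice that a page is new or has changed can be sketched by comparing content fingerprints between two visits. This is only an illustration under stated assumptions: the page contents below are hypothetical, downloading the pages and sending the email alerts are left out, and it does not describe how Tracerlock or any other service works internally.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a short, fixed-length digest that changes when the content changes."""
    return hashlib.sha256(content).hexdigest()


def detect_changes(previous, current_pages):
    """Compare stored fingerprints with freshly fetched page contents.

    previous: dict mapping address -> fingerprint from the last visit.
    current_pages: dict mapping address -> page content (bytes) fetched now.
    Returns a list of (address, "new" or "changed") alerts and the updated fingerprints.
    """
    alerts = []
    updated = dict(previous)
    for url, content in current_pages.items():
        digest = fingerprint(content)
        if url not in previous:
            alerts.append((url, "new"))
        elif previous[url] != digest:
            alerts.append((url, "changed"))
        updated[url] = digest
    return alerts, updated


# Hypothetical contents from a first and a second visit.
first_visit = {"http://example.org/news": b"Conference in March"}
second_visit = {
    "http://example.org/news": b"Conference moved to April",
    "http://example.org/jobs": b"Vacancy: librarian",
}

_, stored = detect_changes({}, first_visit)
alerts, stored = detect_changes(stored, second_visit)
print(alerts)
```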
Coverage of Internet directories and Internet indexes [Diagram. Labels: Internet information sources; a global Internet directory; a global Internet index]
Global Internet search tools: a comparison
• Global Internet directories • Only a limited selection of Internet sources • Browsing information sources is easy • Good for broad searches
• Global Internet indexes • About 1/3 of the Internet is covered by an index • Searching requires some skills and knowledge • Good for specific, narrow searches
• Multi-threaded search systems • These get information from directories and indexes • Searching requires some skills and knowledge • Good when even 1 index does not yield information
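The idea behind a multi-threaded search system (send one query to several directories and indexes at the same time, then merge the answers) can be sketched as follows. The three source functions are hypothetical stubs returning fixed result lists, and the merging rule (summing 1/position over all sources) is an assumption chosen for illustration, not the method of any particular system.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical sources; a real system would query remote directories and indexes.
def directory_source(query):
    return ["http://example.org/guide", "http://example.org/portal"]


def index_source_one(query):
    return ["http://example.org/paper", "http://example.org/guide"]


def index_source_two(query):
    return ["http://example.org/guide", "http://example.org/paper", "http://example.org/blog"]


def meta_search(query, sources):
    """Query all sources in parallel threads and merge their ranked result lists."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        result_lists = list(pool.map(lambda source: source(query), sources))
    scores = {}
    for results in result_lists:
        for position, url in enumerate(results, start=1):
            # A result found by several sources, or near the top, scores higher.
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    # Sort by descending score; ties are broken alphabetically by address.
    return sorted(scores, key=lambda url: (-scores[url], url))


SOURCES = [directory_source, index_source_one, index_source_two]
for url in meta_search("library catalogues", SOURCES):
    print(url)
```

Duplicates are removed as a side effect of the merge, since each address appears only once in `scores`.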
Finding images on the Internet: introduction • Several public access search systems are available free of charge to search for images / pictures (artwork, photos, or both) on the Internet. • When searching for images, the search results from such a system offer not only links to the image files on the Internet, but also small versions of the images directly (so-called “thumbnails”).
Examples Finding images on the Internet: examples of search engines • http://alltheweb.com ! • http://gallery.yahoo.com/ ! • http://images.google.com/ !!! or through http://www.google.com/ • http://www.altavista.com/ !! (also audio and video; in the user interface, choose IMAGES rather than the normal text search) • http://www.ditto.com/ !
Examples Finding images on the Internet: screen shot of a Google image search
Public access book databases: introduction • Even in this age of Internet-based information sources, a lot of information is still distributed in the form of printed books. • The content of most books is (still) not available on the Internet. • Most Internet search tools do NOT allow you to find out about the existence of books that may be interesting for you. • So, specific search tools to find books can be useful.
Public access book databases provided by bookshops • To find currently available books, the bibliographic databases assembled by big bookshops are interesting. • Several offer a good coverage and are accessible free of charge.
Examples Book databases accessible free of charge: examples (Part 1) • Amazon.com (US): http://www.amazon.com/ and http://www.amazon.co.uk/ (note: amazon, NOT amazone) • Barnes and Noble (US): http://www.bn.com/ • Blackwell’s on the Internet (International, academic books): http://www.blackwell.co.uk/
Free public access bibliographic book database + price comparisons • Even comparisons of the catalogues of bookshops (as well as of shops for music, movies and many other goods) are available free of charge. • See for instance • http://www.bookfinder.com/ • http://www.dealtime.com/
Online Public Access Catalogues of libraries • Mainly to find older books, the catalogues of libraries can be useful. • Most are accessible online and free of charge.
Types of online access information systems: “free” versus “fee” • A lot of the information on the Internet is available free of charge, but another part is only accessible when a fee is paid to the producer and / or the distributor. • Some organisations pay these fees for some sources and then organise access, so that the members of the organisation can retrieve and exploit the information as if it were free of charge. • The first commercial computer systems that made information available online appeared around 1975. • Most of them are now also available through the Internet.
Online information services: total size of their databases In 1999, the big host systems and the public access WWW pages offered a comparable quantity of information: • The WWW offered about 8 terabytes (= 8 000 gigabytes) of text data (according to Lawrence and Lee Giles, Nature, 1999, Vol. 400, pp. 107-109). • Dialog offered about 9 terabytes (= 9 000 gigabytes) (in 1998) • 6 billion pages of text • 3 million images
Online access databases about journal articles: overview • Thousands of fee-based online access databases offer bibliographies or full texts of journal articles in particular subject domains. • Only a few large databases offer access to bibliographies of articles published in journals free of charge.
Example Online access databases about journal articles: Article@INIST • Article@INIST allows you to search in a bibliographic database, NOT full-text (journal articles, journal issues, books, reports or conferences, doctoral dissertations), at the Institut de l'Information Scientifique et Technique, France. • Searching is free of charge. • Available from http://form.inist.fr/public/eng/conslt.htm • Payment is required to receive the full text of an article.
Computer-network interest groups: the basic scheme [Diagram. Labels: computer-network interest group system; question; answer; e-mail]
Computer-network interest groups: various existing systems • “Conferences” on computer services like AOL, CompuServe, Dialog, Data-Star, many Bulletin Board Systems • E-mail lists ! • Usenet News ! • Furthermore, since the 1990s, the WWW has become a gateway to these and a basis for similar systems.
E-mail-based interest groups: synonyms • E-mail (based) conferences • Computer (based) (discussion) lists • Network (based) discussion groups • forums • interest groups • Listservs • Reflectors • Aliases
E-mail-based interest groups: How to find relevant groups? You can • (use printed directories of interest groups) • use subject-oriented indexes and directories to search for Internet-based sources in general • search directory files concerning interest groups online! Examples: • http://groups.yahoo.com/ • http://www.forumone.com/ • http://www.liszt.com/