LIS654 lecture 2 history

LIS654 lecture 2 history Thomas Krichel 2011-09-14

contents • some known old thinkers • Vannevar Bush • Joseph Carl Robnett Licklider, aka “Lick” • the birth of the repository

background • Vannevar Bush (1890–1974) directed the US Office of Scientific Research and Development during WW2. • As the war ended he saw two problems: • how to make the wartime scientific reports available • how to find a new challenge for the scientists • He proposed a solution in “As we may think”.

As we may think • Vannevar Bush (1890–1974) wrote his famous essay in 1945. • It remains to date one of the most frequently cited papers in Library and Information Science. • I think this fame is somewhat undeserved.

the scientific record • As scientists do more work, the “record” extends. This is good. • Recent advances in microfilm also made it possible to store more of the record on microfilm. • But with so much research and increased specialization, “significant attainments become lost in the mass of the inconsequential”.

the memex • The memex was a proposed desktop machine that would store millions of books on microfilm. • It would have a mechanism that would allow rapid retrieval of any known item from the collection. • But the problem is: which items should one look at?

organizing the collection • Still today collections are organized hierarchically from class to subclass. Think of standard classification schemes. • Or they are ordered alphabetically, like a list of words, mechanically from letter to letter. • Bush rejected this, claiming that the brain does not work in that way.

the brain, by Bush • Bush thought that the brain works by association. • “With one item in its grasp, [the brain] snaps instantly to the next that is suggested by the association of thoughts”. • This is done “in accordance with some intricate web of trails carried by the cells of the brain.”

memex as a brain • Every time a document is added to the memex, it is given an identifier. • Every time an item is consulted, the user can associate other items with it. These associations are recorded. • Trails of associations can be annotated and copied. • Selection by association replaces indexing.
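
The mechanism on this slide can be sketched as a small data structure. This is only an illustration, not anything Bush specified: the class names `Memex` and `Trail`, the integer identifiers, and all method names are assumptions made for the sketch. The sample items echo the bow-and-arrow trail that Bush uses as an example in his essay.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Trail:
    """A named, annotated sequence of item identifiers (hypothetical model)."""
    name: str
    steps: list = field(default_factory=list)  # list of (item_id, note) pairs

    def copy(self, new_name):
        # Trails can be copied, for example to hand to another memex user.
        return Trail(new_name, list(self.steps))


class Memex:
    """Toy model: items get identifiers, associations are recorded."""

    def __init__(self):
        self._ids = count(1)
        self.items = {}   # item_id -> document text
        self.links = {}   # item_id -> set of associated item_ids

    def add(self, document):
        # Every added document is given an identifier.
        item_id = next(self._ids)
        self.items[item_id] = document
        self.links[item_id] = set()
        return item_id

    def associate(self, a, b):
        # Record the association in both directions.
        self.links[a].add(b)
        self.links[b].add(a)

    def associated(self, item_id):
        # Selection by association: follow recorded links, not an index.
        return sorted(self.links[item_id])


memex = Memex()
bow = memex.add("The Turkish short bow")
elastic = memex.add("Elasticity of materials")
crusades = memex.add("History of the Crusades")
memex.associate(bow, elastic)
memex.associate(bow, crusades)

# An annotated trail through two of the items, and a copy of it.
trail = Trail("bows", [(bow, "start here"), (elastic, "why the bow is strong")])
shared = trail.copy("bows (copy)")
```

Here `memex.associated(bow)` returns the identifiers of the two linked items, and `shared` is an independent copy of the trail that could be passed on without affecting the original.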

sharing • An annotated trail between items can form a new item. That item can be shared. • Bush envisioned that there would be a way for each memex to learn from all other memexes. • Memex users would improve their thinking ability by its use. • This would greatly increase the speed of scientific discoveries.

implementation • There is no evidence that anything like the memex was ever built. • Microfilm was replaced by digitization. • But the idea of associative trails, or associative indexing, is related to hypermedia. • The latter goes back to Ted Nelson.

Licklider • Joseph Carl Robnett Licklider (1915–1990) trained as a mathematician and psychologist and worked mainly at MIT. • The Council on Library Resources got funding from the Ford Foundation to examine how technology could help libraries. • Work was undertaken by Bolt, Beranek and Newman (BBN), of later ARPANET fame.

the basic idea • The idea was that one could store all knowledge in a single or distributed machine. • How should this be done? • Well, first Lick estimated the corpus would be 10^14 bytes by the year 2000. • That’s 100 TB, or about five 20 TB systems. • It could be a central system with thin clients.
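
As a quick check of the storage arithmetic, here is a minimal sketch. It assumes decimal terabytes (1 TB = 10^12 bytes); the variable names are chosen only for this illustration.

```python
CORPUS_BYTES = 10**14        # Lick's estimate for the year 2000, from the slide
SYSTEM_BYTES = 20 * 10**12   # one 20 TB system, assuming 1 TB = 10**12 bytes

corpus_tb = CORPUS_BYTES / 10**12
systems_needed = CORPUS_BYTES / SYSTEM_BYTES

print(f"{corpus_tb:.0f} TB corpus, {systems_needed:.0f} systems of 20 TB each")
```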

the system • The system was called “procognitive”, meaning for the advancement of knowledge. • It would not be based on documents, metadata and retrieval. • It would process information into knowledge and questions into answers. • Users transmit their knowledge to the system.

information to knowledge • To see how information can be processed into knowledge, Lick looked at the human brain. He had studied cat brains in his PhD work. • If it is possible to process the body of information into knowledge structures, then questions can be answered by knowledge rather than by documents.

Lick on the brain • The brain receives stimuli and stores representations of them. • The brain finds answers to questions by processing stored memory on the basis of “schemata”, which are ways in which stored representations of stimuli can be processed.

human processing • Lick understood that current and foreseeable technology would not allow processing of documents into knowledge. • This would be the job of a set of librarians called “procognitive system specialists”. • They would encode the contents of documents in a knowledge language. • They would watch for ambiguity warnings. • Users would also provide feedback.

encoding • Surprisingly, Lick still imagined the procognitive system to be based on natural language. • The hope was that artificial intelligence (AI) methods would be developed to extract information from documents. • That hope seemed justified in the 60s, when AI was in its infancy.

steps to implementation • The first attempts, in the 60s, tried to find the citation string in a database of citations. • Thus this was more information retrieval on a small set of metadata than actual digital library work. • Librarians preparing bibliographies for researchers were the prime users.

into the 80s • In the 80s the personal computer “came back”. • Searching could be done on the full text of documents. • Browsing became available.

90s • In the 90s the Internet and the search engine came along. • Initially search engines followed standard information retrieval principles. • My first work, about 1993, was based on gopher access and WAIS indexing.

the semantic web • The semantic web is the actual successor to Lick’s vision. • It’s still not done. • I speculate it will not be done for a long time. • The reason is that while Lick thought about psychology and computers, he did not think through the economics of operating systems such as the ones he proposed. • He also had too optimistic a vision of AI.

back in the trenches • As we have seen, early digital library visions were inspired by concern about access to scientific documents. • The academic digital library was synonymous with the digital library. • So all the progress was there, and pretty much still is.

academic documents • There are basically two types of academic documents. • There are academic books and academic articles or papers. • The two have been treated differently in the past and continue to be treated differently at this time, though maybe not for long.

academic books (monographs) • Books are (were) purchased by libraries. • They were cataloged into the local integrated library system • locally or • through shared cataloging

academic papers • Most of them are published in serials. • Libraries never catalogued them locally. • They relied on third-party services to provide abstracting and indexing for them.

publishing academic papers • Academic papers go through a process of peer review before they are published. • Papers are written for free by some academics. • They are reviewed for free by other academics. • The profits from publishing go to the publisher. Academic publishing is very profitable.

non-formal publication • Some academic disciplines have a tradition of informal publication of papers that have not been peer-reviewed. • Examples: • mathematicians and physicists have preprints • computer scientists and economists have working papers

preprints vs working papers • Preprints were sent by academics to colleagues. • Working papers are issued by departments and sent to other departments under exchange agreements. • Whatever the mode of working, non-formal publication channels enabled librarians to build real digital libraries. • Actually, these were built more by their users.

xxx.lanl.gov • This was/is a preprint server started by Paul Ginsparg at Los Alamos National Laboratory. • It has been popular with physicists and mathematicians. • Its coverage of sub-disciplines varies. • It became arXiv.org. • It moved to Cornell University in 2001. • It is now run by Cornell University Library.

NCSTRL • was the Networked Computer Science Technical Reference Library, a DARPA/NSF-funded project that built an infrastructure for publishing computer science working papers. • Starting in 1993, it was built on a formal protocol called Dienst. This enabled local and remote services. • Implementation software was deployed at participating institutions. • It collapsed completely when the funding was gone.

RePEc • is a federated system based on a metadata format (ReDIF) and a transport protocol (Guildford Protocol), both written by yours truly. • It can be run on standard FTP or HTTP server software. • RePEc archives don’t offer local services to end users.

UPSPROTO • In 2000, Herbert Van de Sompel started work to build a prototype system to provide services across the existing discipline-based digital libraries. • The experience led to the formation of a working group that created an interoperability protocol called the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). • I was part of that group.

repositories • OAI-PMH has been so widely implemented in repositories that we can say that a repository is a collection of documents on a server that implements this protocol. • There is no official list, but counts of institutional repositories now exceed 2000.
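
To make “implements this protocol” concrete, here is a minimal sketch of what a harvester does: it sends an HTTP request with a `verb` parameter and reads back XML. The endpoint `repository.example.org` is hypothetical, and the sample response is hand-written and heavily trimmed (a real response also carries a response date, the echoed request, and a metadata section per record). No network access is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


# Hand-written, trimmed stand-in for a server response (an assumption,
# not output captured from a real repository).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2011-09-01</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:repository.example.org:2</identifier>
        <datestamp>2011-09-10</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
"""

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def record_identifiers(xml_text):
    """Return the identifier of every record header in a response."""
    root = ET.fromstring(xml_text.strip())
    return [
        header.findtext("oai:identifier", namespaces=NS)
        for header in root.iterfind(".//oai:header", NS)
    ]


print(list_records_url(BASE_URL, from_date="2011-09-01"))
print(record_identifiers(SAMPLE_RESPONSE))
```

The point of the sketch is the division of labour: the repository only has to expose metadata through a few such requests, and services for end users can be built elsewhere on the harvested records.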

institutional repositories • The initial purpose of institutional repositories (IRs) has been to make institutional research papers available. • This would create open access to research papers. • But the success in getting real scientific work deposited has been muted. • In the meantime there are other types of content in IRs.

http://openlib.org/home/krichel Please shut down the computers when you are done. Thank you for your attention!
