Information Science: Where does it come from and where is it going?

Information Science: Where does it come from and where is it going? Tefko Saracevic, PhD, School of Communication, Information and Library Studies, Rutgers University, New Brunswick, New Jersey, USA. http://www.scils.rutgers.edu/~tefko. Gutenberg 1397-1468.


Presentation Transcript


  1. Information Science: Where does it come from and where is it going? Tefko Saracevic, PhD School of Communication, Information and Library Studies Rutgers University New Brunswick, New Jersey USA http://www.scils.rutgers.edu/~tefko Gutenberg 1397-1468 © Tefko Saracevic

  2. Information science: a short definition “the collection, classification, storage, retrieval, and dissemination of recorded knowledge treated both as a pure and as an applied science” Merriam-Webster © Tefko Saracevic

  3. Organization of presentation • Big picture – problems, solutions, social place • Structure – main areas in research & practice • Technology – information retrieval – largest part • Information – representation; bibliometrics • People – users, use, seeking, context • Paradigm split – distancing of areas • Relations – librarianship, computer science • Digital libraries – whose are they anyhow? • Conclusions – big questions for the future © Tefko Saracevic

  4. Part 1. The big picture: Problems addressed • Bit of history: Vannevar Bush (1945): • Defined problem as “... the massive task of making more accessible a bewildering store of knowledge.” • Problem still with us & growing • Vannevar Bush, 1890-1974 © Tefko Saracevic

  5. … solution • Bush suggested a machine: “Memex ... association of ideas ... duplicate mental processes artificially.” • Technological fix to problem • Still with us: technological determinant © Tefko Saracevic

  6. At the base of information science: Problem Trying to control content in • Information explosion • exponential growth of information artifacts, if not of information itself PLUS today • Communication explosion • exponential growth of means and ways by which information is communicated, transmitted, accessed, used © Tefko Saracevic

  7. Technological solution, BUT … • applying technology to solving problems of effective use of information • BUT: from a HUMAN & SOCIAL and not only TECHNOLOGICAL perspective © Tefko Saracevic

  8. … or a symbolic model • People • Information • Technology © Tefko Saracevic

  9. Problems & solutions: SOCIAL CONTEXT • Professional practice AND scientific inquiry related to: Effective communication of knowledge records - ‘literature’ - among humans in the context of social, organizational, & individual need for and use of information • Taking advantage of modern information technology © Tefko Saracevic

  10. … or as White & McCain (1998) put it: “modeling the world of publications with a practical goal of being able to deliver their content to inquirers [users] on demand.” © Tefko Saracevic

  11. General characteristics • Interdisciplinarity - relations with a number of fields, some more or less predominant • Technological imperative - driving force, as in many modern fields • Information society - social context and role in evolution - shared with many fields © Tefko Saracevic

  12. Part 2. Structure: Composition of the field • Like many fields, information science has different areas of concentration & specialization • They change, evolve over time • grow closer, grow apart • ignore each other, less or more • sometimes fight © Tefko Saracevic

  13. Most importantly, different areas … • receive more or less in funding & emphasis • producing great imbalances in work & progress • attracting different audiences & fields • this includes • vastly different levels of support for research and • huge commercial investments & applications © Tefko Saracevic

  14. How to view structure? • by decomposing areas & efforts in research & practice, emphasizing: Technology or Information or People © Tefko Saracevic

  15. Part 3. Technology • Identified with information retrieval (IR) • by far biggest effort and investment • international & global • commercial interest large & growing © Tefko Saracevic

  16. Information Retrieval – definition & objective “IR: ... intellectual aspects of description of information, ... search, ... & systems, machines...” Calvin Mooers (1919-1994), 1951 • How to provide users with relevant information effectively? For that objective: 1. How to organize information intellectually? 2. How to specify the search & interaction intellectually? 3. What techniques & systems to use effectively? © Tefko Saracevic

  17. Streams in IR Res. & Dev. 1. Information science: • Services, users, use; • Human-computer interaction; • Cognitive aspects 2. Computer science: • Algorithms, techniques • Systems aspects; evaluation 3. Information industry: • Products, services, Web • search engines – BIG! • Market aspects Problem: • relative isolation – discussed later © Tefko Saracevic

  18. IR research • Started in the US through government support & in information science • Now mostly done within computer science • e.g. Special Interest Group on IR, Association for Computing Machinery (SIGIR, ACM) • Gerard Salton, 1927-1995 © Tefko Saracevic

  19. Contemporary IR research • Spread globally • e.g. major IR research communities emerged in China, Korea, Singapore • Branched outside of information science - “everybody does information retrieval” • search engines, data mining, natural language processing, artificial intelligence, computer graphics … © Tefko Saracevic

  20. Testing in IR • Major component of IR • made it strong & affected innovation • Long history – started with Cranfield tests in late 1950’s • Measures – precision & recall based on relevance • Cyril Cleverdon, 1914-1997 © Tefko Saracevic
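
As an illustration of the two measures named on this slide, here is a minimal sketch of set-based precision and recall. The function name and the document identifiers are hypothetical, and binary relevance judgments are assumed; nothing here is taken from the Cranfield tests themselves.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, assuming binary relevance.

    retrieved: document ids the system returned (hypothetical ids)
    relevant:  document ids judged relevant for the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved

    # Precision: share of retrieved documents that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 documents retrieved, 5 judged relevant, 2 in common.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7", "d9", "d10"]
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

Both measures depend entirely on the relevance judgments, which is why relevance sits at the base of this style of testing.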

  21. Text REtrieval Conference (TREC) • Major research, laboratory effort • Started in 1992 • “support research within the IR community by providing the infrastructure necessary for large-scale evaluation” • Methods • provides large test beds, queries, relevance judgments, comparative analyses • essentially using Cranfield 1960’s methodology • organized around tracks • various topics – changing over years © Tefko Saracevic

  22. TREC impact • International – big impact on creating research communities • Annual conferences • reports, exchange results, foster cooperation • Results • mostly in reports, available at http://trec.nist.gov/pubs.html • overviews provided as well • but, only a fraction published in journals • Book (2005): • TREC: Experiment and Evaluation in Information Retrieval. Edited by Ellen M. Voorhees and Donna K. Harman © Tefko Saracevic

  23. TREC tracks 2007 • Genomics • Spam • Blog • Question answering • Enterprise • Million query (new) • Legal • 116 groups from 20 countries • Previous tracks: • ad-hoc (1992-1999) • routing (92–97) • interactive (94-02) • filtering (95-02) • cross language (97-02) • speech (97-00) • Spanish (94-96) • video (00-01) • Chinese (96-97) • query (98-00) • and a few more run for two years only © Tefko Saracevic

  24. Broadening of IR – sample ever changing, ever new areas added • Cross language IR (CLIR) • Natural language processing (NLP IR) • Music IR (MIR) • Image, video, multimedia retrieval • Spoken language retrieval • IR for bioinformatics and genomics • Summarization; text extraction • Question answering • Many human-computer interactions • XML IR • Web IR; Web search engines • IR in context – big area for major search engines & newer research © Tefko Saracevic

  25. Commercial IR • Search engines based on IR • But added many elaborations & significant innovations • dealing with HUGE number of pages fast • countering spamming & page rank games – adversarial IR - combat of algorithms • adding context for searching • Spread & impact worldwide • about 2000 engines in over 160 countries • English was dominant, but not any more © Tefko Saracevic

  26. Commercial IR: brave new world • Large investments & economic sector • hope for big profits, as yet questionable • Leading to proprietary, secret IR • also aggressive hiring of best talent • new commercial research centers in different countries (e.g. MS in China) • Academic research funding is changing • brain drain from academe • Commercial search engines facing many challenges – hiring best talent • and providing brain-drain for academics © Tefko Saracevic

  27. IR has a long, proud history • IR successfully effected: • Emergence & growth of the INFORMATION INDUSTRY • Evolution of IS as a PROFESSION & SCIENCE • Many APPLICATIONS in many fields • including on the Web – search engines • Improvements in HUMAN - COMPUTER INTERACTION • Evolution of INTERDISCIPLINARITY © Tefko Saracevic

  28. Part 4. Information • Several areas of investigation: • as basic phenomenon – not much progress • measures such as Shannon's not successful • concentrated on manifestations and effects • no recent progress in this basic research • information representation • large area connected with IR, librarianship • metadata • bibliometrics • structures of literature © Tefko Saracevic

  29. What is information? Intuitively well understood, but formally not well stated • Several viewpoints, models emerged • Shannon: source-channel-destination • signals not content – not really applicable, despite many tries • Cognitive: changes in cognitive structures • content processing & effects • Social: context, situation • information seeking, tasks © Tefko Saracevic
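
To illustrate the “signals not content” point about Shannon's model, here is a minimal sketch assuming a simple character-level entropy measure (the function name and the example strings are made up for illustration). Two messages built from the same characters get the same value even though they mean different things.

```python
import math
from collections import Counter


def shannon_entropy(message):
    """Entropy in bits per symbol of the character distribution of `message`."""
    counts = Counter(message)
    total = len(message)
    # Sum in sorted order so that messages with identical symbol counts
    # produce bit-identical results.
    return -sum((c / total) * math.log2(c / total) for c in sorted(counts.values()))


# Same characters, different meaning: the measure cannot tell them apart.
a = "dog bites man"
b = "man bites dog"
print(shannon_entropy(a) == shannon_entropy(b))  # True
```

The measure depends only on symbol frequencies, which is one way to see why it says nothing about content, context, or use.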

  30. Information in information science: Three senses (from narrowest to broadest) • Information in terms of decision involving little or no cognitive processing • signals, bits, straightforward data - e.g. inf. theory (Shannon), economics • Information involving cognitive processing & understanding • understanding, matching texts, Brookes • Information also as related to context, situation, problem-at-hand • USERS, USE, TASK For information science (including information retrieval): third, broadest interpretation necessary © Tefko Saracevic

  31. Bibliometrics “… the quantitative treatment of the properties of recorded discourse and behavior pertaining to it.” Fairthorne, 1969 • Many quantitative studies & some laws • Bradford’s law, Lotka’s law – regularities • quantity/yield distributions of journals, authors • also related areas: • Scientometrics • covering science in general, not just publications • Informetrics • all information objects • Webometrics or cybermetrics • using bibliometric techniques to study the web © Tefko Saracevic
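
As a rough illustration of the kind of regularity these laws describe, here is a minimal sketch of Lotka's law in its commonly stated inverse-square form: the number of authors with n papers is about 1/n² of the number with one paper. The function name, the exponent default, and the starting count of 100 authors are assumptions for illustration, not data from the presentation.

```python
def lotka_expected_authors(single_paper_authors, n, exponent=2.0):
    """Expected number of authors with exactly n papers under Lotka's law.

    single_paper_authors: authors with exactly one paper (hypothetical count)
    exponent: about 2 in the classic formulation; varies by field in practice
    """
    return single_paper_authors / n ** exponent


# Hypothetical field with 100 single-paper authors.
for n in range(1, 6):
    print(n, lotka_expected_authors(100, n))
```

The steep drop-off is the point: a few authors account for a large share of the output, a pattern Bradford's law describes for journals.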

  32. Part 5. People • Professional services • in organization – moving toward knowledge management, competitive intelligence • in industry – vendors, aggregators, Internet • Research • user & use studies • interaction studies • broadening to information seeking studies, social context, collaboration • relevance studies • social informatics © Tefko Saracevic

  33. User & use studies • Oldest area • covers many topics, methods, orientations • many studies related to IR • e.g. searching, multitasking, browsing, navigation • theoretical & experimental studies on relevance • Branching into Web use studies • quantitative & qualitative studies • emergence of webometrics © Tefko Saracevic

  34. Interaction • Traditional IR model concentrates on matching but not on user side & interaction • Several interaction models suggested • Ingwersen’s cognitive, Belkin’s episode, Saracevic’s stratified model • hard to get experiments & confirmation • Considered key to providing • basis for better design • understanding of use of systems • Web interactions: a major new area © Tefko Saracevic

  35. Information seeking • Concentrates on broader context not only IR or interaction, people as they move in life & work • Number of models provided • e.g. Kuhlthau’s information search process, Järvelin’s information seeking • Includes studies of ‘life in the round,’ making sense, information encountering, work life, information discovery • Based on concept of social construction of information © Tefko Saracevic

  36. Part 6. Paradigm split in technology - people • Split from early 80’s to date into: System-centered • algorithms, TREC, search engines • continue traditional IR model Human-(user)-centered • cognitive, situational, user studies • interaction models, some started in TREC • relevance studies © Tefko Saracevic

  37. Human vs. system • Human (user) side: • often highly critical, even one-sided • mantra of implications for design • but does not deliver concretely • System side: • mostly ignores user side & studies • ‘tell us what to do & we will’ • Issue NOT H or S approach • even less H vs. S • but how can H AND S work together • major challenge for the future © Tefko Saracevic

  38. Great separation • IR in computer science • completely technology oriented • VERY international • not aware at all of the other side • SIGIR growing a lot: • 2007 subm. 490, accept. 85, 17% • 2006 subm. 399, accept. 74, 19% • 1999 subm. 135, accept. 33, 24% • IR, user studies, services in information science • mostly people oriented • aware, but participating less with other side • only a few LIS people come to SIGIR, even fewer SIGIR to ASIST, none to ALA © Tefko Saracevic
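
The SIGIR percentages on this slide can be checked from the submission and acceptance counts. This is a minimal sketch that assumes the percentages are acceptance rates rounded to the nearest whole number; the variable and function names are made up for illustration.

```python
# Counts as given on the slide: year -> (submitted, accepted).
sigir_submissions = {2007: (490, 85), 2006: (399, 74), 1999: (135, 33)}


def acceptance_rate(submitted, accepted):
    """Acceptance rate in percent."""
    return 100.0 * accepted / submitted


rates = {year: round(acceptance_rate(s, a)) for year, (s, a) in sigir_submissions.items()}
print(rates)  # {2007: 17, 2006: 19, 1999: 24}
```

Submissions more than tripled between 1999 and 2007 while the acceptance rate fell, consistent with the slide's “growing a lot”.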

  39. Calls vs support • Many calls for user-centered or human-centered design, approaches & evaluation • Number of works discussing it, but few proposing concrete solutions • But: most support for system work • in the digital age support is for digital • Recent attempt at combining two views: Book: Ingwersen, P. and Järvelin, K. (2005). The Turn: Integration of information seeking and retrieval in context. Springer. © Tefko Saracevic

  40. Part 7. Relations, alliances, competition • With a number of fields... • Strongest: 1. Librarianship 2. Computer science © Tefko Saracevic

  41. Common grounds IS & librarianship share: • Social role in information society • Concern with effective utilization of graphic & other types of records • Research problems related to a number of topics • Transfer to & from information retrieval © Tefko Saracevic

  42. Differences IS & librarianship differ in: • Selection & definition of many problems addressed • Theoretical questions & framework • Nature & degree of experimentation • Tools and approaches used • Nature & strength of interdisciplinary relations © Tefko Saracevic

  43. One field or two? • Point of many debates • Suggest: TWO fields in strong interdisciplinary relations • Not a matter of “better” or “worse” - matters little • common arguments between many fields • Differences matter in: • problem selection & definition • agenda, paradigms • theory, methodology • practical solutions, systems • Best example: IR & library automation © Tefko Saracevic

  44. Which? • Librarianship. Information science • Library and information science • Libraryandinformationscience • Michael Buckland’s suggestion • Information science • Information sciences • Information • like in the “Information School” © Tefko Saracevic

  45. IS & computer science • CS primarily about algorithms • IS primarily about information and its users and use • Not in competition, but complementary • Growing number of computer scientists active in IS – particularly in IR and digital libraries • Concentrating on • advanced IR algorithms & techniques • digital library infrastructure & various domains • human computer interaction © Tefko Saracevic

  46. Interaction and IS • Two streams: • computer-human interaction • human-computer interaction • Many studies on: • machine aspects of interaction • human variables in interaction • Problems: • little feedback between the two streams • very hard to evaluate • Web interactions: a major area • Another interdisciplinary area • computer sc., cognitive sc., ergonomics © Tefko Saracevic

  47. Part 8. Digital libraries • LARGE & growing area • “Hot” area in R&D • a number of large grants & projects in the US, European Union, & other countries • but “DIGITAL” big & “libraries” small • “Hot” area in practice • building digital collections, hybrid libraries • many projects throughout the world • but in the US funding drying out © Tefko Saracevic

  48. Technical problems • Substantial - larger & more complex than anticipated: • representing, storing & retrieving of library objects • particularly if originally designed to be printed & then digitized • operationally managing large collections - issues of scale • dealing with diverse & distributed collections • interoperability; federated searching • assuring preservation & persistence • incorporating rights management © Tefko Saracevic

  49. Research issues • understanding objects in DL • representing in many formats • metadata, cataloging, indexing • conversion, digitization • organizing large collections • managing collections, scaling • preservation, archiving • interoperability, standardization • accessing, using, searching • federated searching of distributed collections • evaluation of digital libraries © Tefko Saracevic

  50. DL projects in practice • Heavily oriented toward institutions & their missions • in libraries, but also others • museums, societies, government, commercial • come in many varieties • Spread globally • including digitization • U California, Berkeley’s Libweb “lists over 7700 pages from libraries in over 145 countries” • Spending increasing significantly • often a trade-off for other resources © Tefko Saracevic
