
What we can learn about virtual scholars from usage data obtained from deep log analysis

Professor David Nicholas, Dr Tom Dobrowolski and Paul Huntington, CIBER, University College London, http://www.ucl.ac.uk/ciber/


  1. What we can learn about virtual scholars from usage data obtained from deep log analysis. Professor David Nicholas, Dr Tom Dobrowolski and Paul Huntington, CIBER, University College London, http://www.ucl.ac.uk/ciber/

  2. Structure of talk • Why we are studying the virtual scholar • The techniques we use (DLA) • Research projects and analyses undertaken • What we have discovered • Implications of our research

  3. The problem: everything has changed and got really big • From control to no control, from mediated to non-mediated • From bibliographic systems to full-text, visual, interactive ones • From niche to universal systems • From a few searchers to everybody • From little choice to massive choice • From little change to constant change

  4. Which can mean – paradigm shift, no grip, floundering • Existing knowledge base obsolescent, flawed, wholly inadequate • And there are huge issues to deal with – OA, IR, Big Deals • We don’t even know what questions to ask anymore • We are left generalising about too many people • Should be spending lots of time and money researching the user…but are not

  5. Mechanisms needed to provide grip and understanding – deep log analysis (DLA) • Digital fingerprints/CCTV – refine and relate • Proprietary software is too limiting and misleading, and its report structure is insufficiently focused on your needs • With DLA, raw logs are edited/parsed and imported directly into SPSS, and usage (and search) data are analysed according to (bespoke) need • Log data are then related to demographic datasets (generated by subscriber/user databases or questionnaires) and triangulated with focus group, observation and similar data
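
The parse-and-relate step on slide 5 can be sketched in a few lines. This is only an illustration under stated assumptions, not CIBER's actual pipeline (the slides say the parsed logs went into SPSS): the log layout is assumed to be Apache Common Log Format, and the `relate` helper, the sample line and the `occupation` field are hypothetical.

```python
import re
from datetime import datetime

# Assumption: logs are in Apache Common Log Format; the slides do not say.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def parse_line(line):
    """Return a dict for one raw log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    return {
        "host": m.group("host"),
        "time": datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "path": m.group("path"),
        "status": int(m.group("status")),
    }

def relate(records, demographics):
    """Attach demographic fields (hypothetical, keyed by host) to each record."""
    return [{**r, **demographics.get(r["host"], {})} for r in records]

# Invented sample line and invented demographic lookup.
sample = ('192.0.2.10 - - [01/Jun/2002:10:15:30 +0000] '
          '"GET /journal/er/24/3/fulltext HTTP/1.0" 200 5120')
record = parse_line(sample)
joined = relate([record], {"192.0.2.10": {"occupation": "student"}})
```

Keying the demographic lookup on the host address is a simplification; a real study would join on whatever identifier the subscriber database and the logs share.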

  6. Deep log analysis: attractions • Size and reach. Enormous data set; no need to take a sample • Direct & immediately available record of what people have done: not what they say they might, or would, do; not what they were prompted to say, not what they thought they did • Data are unfiltered and provide a reality check sometimes missing from questionnaire and focus group • Data real-time and continuous. Creates a digital lab environment for innovation and the monitoring of change • Raises the questions that need to be asked by questionnaire, focus group and interview

  7. CIBER deep log studies • Maximising library investments in digital collections through better data gathering and analyses (MaxData): OhioLINK study. Institute of Museum and Library Services, 2005-2007. • Virtual Scholar research programme – use and impact of digital libraries in academe. Blackwell/Emerald, 2003-2004. • Characterising open access journal users and establishing their information seeking behaviour using deep log analysis: case study OUP Open. OUP, 2005-2006. • Physics journals: a deep log analysis of IoPP journals. Institute of Physics, 2005-2006. • Core scholarly research trends study: deep log analysis of Elsevier ScienceDirect users. Elsevier, 2005. • Digital journals – site licensing, library consortia deals and journal use statistics. The Ingenta Institute, 2002.

  8. Kinds of analysis conducted • Use analysis: by number of items viewed, number of sessions conducted, site penetration, repeat visits, time online, kind of items viewed, pattern of item use (TOC, abstract, full-text) • User analysis: by age, gender, occupation (student, practitioner), organisational affiliation, heavy/light use, referral link used, type of university (research/teaching), subject/discipline of journal, subject discipline of the user, department of the subnet, search approach adopted, geographical location, whether purchased online or not, use of additional functions
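
Several of the use measures on slide 8 (sessions conducted, items viewed, kind of item viewed) depend on first grouping log events into sessions. A minimal sketch follows, assuming a 30-minute inactivity cut-off and a hypothetical `(user, time, item_type)` event tuple; neither comes from the slides, and the sample events are invented.

```python
from collections import Counter
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed cut-off, not from the slides

def sessionize(events):
    """Group (user, time, item_type) events into per-user sessions."""
    sessions = []
    last_seen = {}  # user -> (time of last event, index of open session)
    for user, time, item_type in sorted(events, key=lambda e: (e[0], e[1])):
        prev = last_seen.get(user)
        if prev is None or time - prev[0] > SESSION_GAP:
            # First event for this user, or too long since the last one.
            sessions.append({"user": user, "items": []})
            index = len(sessions) - 1
        else:
            index = prev[1]
        sessions[index]["items"].append(item_type)
        last_seen[user] = (time, index)
    return sessions

t0 = datetime(2002, 6, 1, 10, 0)
events = [
    ("a", t0, "toc"),
    ("a", t0 + timedelta(minutes=5), "abstract"),
    ("a", t0 + timedelta(hours=2), "fulltext"),
    ("b", t0, "fulltext"),
]
sessions = sessionize(events)
items_per_session = [len(s["items"]) for s in sessions]
type_counts = Counter(t for s in sessions for t in s["items"])
```

The choice of cut-off changes every session-based figure, so it should be reported alongside the results.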

  9. What have we learnt • We have never had such a large data set of usage data. • From the digital fingerprints of millions of users and tens of millions of transactions from a wide range of digital journal platforms we have drawn some interesting and controversial conclusions about the behaviour of the virtual scholar • I don’t recognise the users you are describing.

  10. Information seeking characteristic 1: phenomenally active and interested • In the case of Blackwell Synergy, about half a million people used the site a month; nearly 5 million items were viewed during the same period • In the case of OhioLINK, 6000 journals were available and all bar about 5 were used within a month • Two-thirds of EmeraldInsight visitors were non-subscribers

  11. Information seeking characteristic 2: flicking • Shallow searchers, suggesting a checking-comparing, dipping sort of behaviour that is a result of easy access, a shortage of time and huge digital choice • Over two-thirds typically view no more than three items in a session and then leave; scientists view fewer (66% view no more than three items) and humanities scholars more (56%); overall just 10% view more than ten items • Differences in what they view when online
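
The "no more than three items" and "more than ten items" figures on slide 11 are simple shares over per-session item counts. A minimal sketch, in which the function name and the session counts are invented for illustration and are not the study's data:

```python
def share_at_most(items_per_session, limit):
    """Fraction of sessions in which at most `limit` items were viewed."""
    if not items_per_session:
        return 0.0
    hits = sum(1 for n in items_per_session if n <= limit)
    return hits / len(items_per_session)

# Invented item counts for ten sessions.
counts = [1, 2, 3, 3, 1, 2, 5, 8, 12, 4]
share_three = share_at_most(counts, 3)          # sessions with 1 to 3 items
share_over_ten = 1 - share_at_most(counts, 10)  # sessions with more than 10
```

Running the same function per subject group gives the kind of scientist versus humanities comparison the slide reports.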

  12. A digital consumer trait…scholarly journal users

  13. Information seeking characteristic 3 • Unpredictable form of behaviour in which there appears to be little user loyalty, repeat behaviour or use of memory • Within a year it appeared that two-thirds of people did not come back • Some more likely to return….
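
A non-return figure like the one on slide 13 can be computed by counting the distinct days on which each user appears. The sketch below assumes a user counts as returning when seen on more than one day; the slides do not define the measure, and the identifiers and dates are invented.

```python
from collections import defaultdict
from datetime import date

def non_return_rate(visits):
    """Fraction of users seen on exactly one distinct day."""
    days = defaultdict(set)
    for user, day in visits:
        days[user].add(day)
    if not days:
        return 0.0
    one_off = sum(1 for seen in days.values() if len(seen) == 1)
    return one_off / len(days)

visits = [
    ("a", date(2004, 1, 5)),
    ("a", date(2004, 3, 9)),   # user a comes back
    ("b", date(2004, 2, 1)),
    ("c", date(2004, 2, 1)),
    ("c", date(2004, 2, 1)),   # same day, so not a return
]
rate = non_return_rate(visits)
```

Logs usually identify machines or cookies rather than people, so a rate computed this way is an estimate of returning users, not a direct count.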

  14. Some more likely to return (Synergy)

  15. Information seeking characteristic 4 • Search a variety of sites to find what they want…together with characteristic 2 this makes them ‘promiscuous’ in information seeking terms • Younger scholars more promiscuous

  16. Information seeking characteristic 5 • A bouncing, checking, promiscuous and consumer form of behaviour creates enormous volatility and unpredictability • Digital visibility, sales mentality • “I may read books, surf, ask, watch telly even - the answer could come from anywhere”

  17. Volatility (EmeraldInsight)

  18. Sales mentality (EmeraldInsight) [Chart: series Employee Relations and Int Jrnl of Public Sector Management; vertical axis 0 to 14000; horizontal axis 01.06.2002 to 29.06.2002 at weekly intervals]

  19. Information seeking characteristic 6 • Increased visibility leads to increased exposure and use of older scientific material • Share of downloads going to material more than 5 years old: History 54% (the same for Language and Literature); Materials Science 59%; Physiology 64%

  20. Information seeking characteristics 7-9 • Untrusting: trust up for grabs, authority to be won (and checked). Brand problems – Tesco • Seemingly ‘lazy’ and easily led in retrieval terms – determined by digital visibility, promotion, search engines and poorly thought-through search expressions • The search approach/form of navigation taken has an enormous impact on what is seen/used. People using the search engine were far more likely to conduct a session that included a view of an old article; they were also more likely to view more subjects and more journals, and they viewed more articles and abstracts too.

  21. Conclusions and implications • Choice and a common, multi-function retrieval platform are changing us all, making us all a little bit more similar, and should make us question strongly our assumptions about the scholar • We are not good at using the evidence…digital concrete and digital fog…so there are big questions here for our funders, libraries etc • We need to get closer to the user but we are moving further apart; data enable us to get closer • Evaluation is actually part of a system and not separate from it

  22. References • Nicholas, D., Huntington, P. and Watkinson, A. Scholarly journal usage: the results of deep log analysis. Journal of Documentation, 61(2), 2005, 246-280. • Nicholas, D., Huntington, P., Dobrowolski, T., Rowlands, I., Jamali, H. R. and Polydoratou, P. Revisiting ‘obsolescence’ and journal article ‘decay’ through usage data: an analysis of digital journal use by year of publication. Information Processing & Management, 41(6), 2005, 1441-1461. • Nicholas, D., Huntington, P., Monopoli, M. and Watkinson, A. Engaging with scholarly digital libraries (publisher platforms): the extent to which ‘added-value’ functions are used. Information Processing & Management, 42(2), 2005, pp?? • Nicholas, D., Huntington, P., Williams, P. and Dobrowolski, T. The digital information consumer. In: New directions in human information behaviour, edited by A. Spink and C. Cole. Kluwer Academic, 2005. • Nicholas, D., Huntington, P., Russell, B., Watkinson, A., Jamali, H. R. and Tenopir, C. The big deal: ten years on. Learned Publishing, 18(4), October 2005, pp??

  23. Sample analyses

  24. Subject of journal by date of material viewed (OhioLINK)
