Sitting on a goldmine - the value of attention and activity data
David Kay, TILE

First year History students searching for Napoleon
Also borrowed …
They downloaded …
They rated this resource as …
They also recommended …

Supermarkets “Supermarkets gain valuable insights into user behaviour by data mining purchases and uncovering usage trends. Further insights are gained by analysing purchasing histories, facilitated by the use of store loyalty cards.” Dave Pattern, University of Huddersfield – TILE Workshop – December 2008

Libraries? “Libraries could gain valuable insights into user behaviour by data mining borrowing and uncovering usage trends. Further insights are gained by analysing borrowing histories, facilitated by the use of library cards.” Dave Pattern, University of Huddersfield – TILE Workshop – December 2008

Types of ‘Attention’ Data
An attempt to break down the potential sources
• Attention
  • Click stream behaviour indicating interests / connections
  • queries, navigation, details display, save for later
• Activity
  • Formal transactions
  • requesting, borrowing, downloading
• Appeal
  • formal and informal lists
  • a type of recommendation
  • can be treated as a proxy for activity?
• And …

We could concentrate and contextualise the intelligence (patterns of user activity) existing in HE systems at institutional level whilst protecting anonymity in order to deliver ‘web scale’ services of value throughout the community – to undergraduates & researchers, to lecturers & librarians, to the institutions themselves.
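One way to picture "concentrating" activity data above institutional level while keeping individuals out of the shared set is sketched below. This is an illustration under assumptions, not the TILE design: the record layout (user id, course, item), the function names and the secret key are all hypothetical. Keyed hashing is pseudonymisation rather than full anonymity, so the sketch also suppresses small counts before anything is published.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret held by each institution and never shared.
SECRET_KEY = b"institution-held-secret"


def pseudonymise(user_id, key=SECRET_KEY):
    """Replace a user id with a keyed hash so the shared data carries no names."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def prepare_for_sharing(events, key=SECRET_KEY):
    """Keep the context (course) and the item; swap the user id for a pseudonym."""
    return [(pseudonymise(user, key), course, item) for user, course, item in events]


def aggregate(shared_events, min_users=2):
    """Count distinct pseudonymous users per (course, item).

    Pairs seen for fewer than `min_users` users are dropped, so a single
    person's activity is not exposed in the aggregate.
    """
    users = defaultdict(set)
    for pseudonym, course, item in shared_events:
        users[(course, item)].add(pseudonym)
    return {pair: len(who) for pair, who in users.items() if len(who) >= min_users}


# Hypothetical local events: (user id, course, item).
events = [
    ("alice", "History Y1", "napoleon-biography"),
    ("bob", "History Y1", "napoleon-biography"),
    ("bob", "History Y1", "europe-1789-1815"),
]
shared = prepare_for_sharing(events)
summary = aggregate(shared)
```

Only `shared` or `summary` would leave the institution in this sketch; the raw `events` and the key stay local.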

TILE Pain Point: Deriving Context
My Context is derived from:
• My I.D.: from VLE or Registry?
• My Studies: modules from VLE or VRE
• My Activity: LMS/VLE/etc, click streams
• My Networks: e.g. FaceBook, subject networks
• My Responses: bookmarks, reviews & ratings
• My Publications: academic standing
• My Parameters: incl. location & override
• My Interests: keywords
Not in initial specification
Key: User controlled / HE ‘controlled’ / Automated

The possibility of a critical mass of activity data from ‘Day 1’ brings to life the opportunity & motivation to embrace and curate user contribution (including ratings, reviews, bookmarks, lists)
• Barriers to contribution & use of contributed information must be as low as possible
• Benefits of contribution must be clearly visible, with real promise of being useful

Distributed …
Content & finding aids, anywhere & any type

Concentration of …
Context data
Catalysing contribution

Across …
• An Institution
• A Consortium
• A national system
• Global communities

What’s the economics take on this topic?
What’s recommended in the VLE?
What do undergraduates elsewhere read?
Did anyone highly rate a textbook?
What did last year’s students download most?

California State University 2008

MESUR contains 1bn usage events (2002-2007) obtained from 6 significant publishers, 4 large institutional consortia and 4 significant aggregators! The collected usage data spans more than 100,000 serials (including newspapers, magazines, etc.) and is related to journal citation data that spans about 10,000 journals and nearly 10 years (1996-2006). In addition we have obtained significant publisher-provided COUNTER usage reports that span nearly 2000 institutions worldwide. The data is being ingested into a combination of relational and semantic web databases, the latter of which is now estimated to result in nearly 10 billion semantic statements (triples). MESUR is producing large-scale, longitudinal maps of the scholarly community and a survey of more than 60 different metrics of scholarly impact.

MESUR

Personalisation > Aggregation? ‘The more we track and aggregate, the more our suggestions will be personalised.’

Play me music I will like ‘… and the more we track, the better we can adapt our service without your intervention.’

‘… we’ll even learn to recommend content by taking account of your location, habits & moods and by making comparisons’
• My Calendar
• My GPS data
• My activity patterns

mosaic
[Figure: project plan by work package (WP1 to WP8), April to November. A1 etc = TILE Recommendations]
• A1 Business Options
• A2 Data Analysis & Model: LMS, ERM, VLE sources; HEIs, vendors
• Dataset Extraction: grant awards
• B1 Search Demonstrator: scale, facets, sense; Mimas, H’field
• Dissemination: library, LT & developer community; conferences, workshops, competition, website
• C1 User Demand Research: recognition, value; Mimas, CERLIM, librarians
• C1/Footnote Professional Opinion: integrity, value
• B2/Footnote Triangulation & Forward Recommendations

The No.1 Challenge - Generating Data
Thanks to library teams and individual pioneers for their engagement
• Some have the transactions
• Some have the links
• Some have the technology
• Some have the resources

Six entries to our recent competition to build applications around activity data
Using multi-year data released by the University of Huddersfield
• Improving Resource Discovery
  • An intuitive interface to navigate the ‘Book Galaxy’ through links based on mass borrowing habits
  • Users create reading lists and share with other students (and lecturers)
• Supporting learning choices
  • Applicants or new students get a feel for a course based on the books students actually borrow
  • Possible courses of study are suggested based on the ISBNs of books you’ve personally enjoyed reading
• Supporting decision making
  • Collection managers visualise historic circulation data relating to courses of study
  • Value the loans related to a specific course as a collection performance indicator

Some Questions
• What range of data sources available within higher education should be used to derive activity and context?
• Does activity data need to be aggregated above the institutional level to achieve web scale and network effect?
• Amazon tells you that ‘people who did this also did that’. Can academic libraries offer something more significant (‘people LIKE YOU who did this also did that’) because they know the user’s context (typically their course and institution)?
• Precision in such things as metadata and even citation is subject to personal judgements and motivations. Are these less reliable than pointers derived from mass contextualised activity data?
• As proxies for real activity, are lists (formal reading lists, informal student lists) a form of attention data which can be highly weighted?
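The difference between ‘people who did this also did that’ and ‘people LIKE YOU who did this also did that’ can be sketched as a co-borrowing count with an optional context filter. This is a minimal illustration, not a method taken from the TILE or MOSAIC projects: the loan records, item labels and the `also_borrowed` function are all hypothetical, and ‘context’ is reduced here to the borrower's course.

```python
from collections import Counter, defaultdict

# Hypothetical loan records: (anonymised borrower, course, item).
LOANS = [
    ("u1", "History Y1", "napoleon-biography"),
    ("u1", "History Y1", "french-revolution-survey"),
    ("u2", "History Y1", "napoleon-biography"),
    ("u2", "History Y1", "french-revolution-survey"),
    ("u2", "History Y1", "europe-1789-1815"),
    ("u3", "Engineering Y1", "napoleon-biography"),
    ("u3", "Engineering Y1", "military-logistics"),
    ("u4", "History Y1", "napoleon-biography"),
    ("u4", "History Y1", "french-revolution-survey"),
]


def also_borrowed(loans, item, course=None):
    """Rank the other items taken out by people who borrowed `item`.

    With `course` set, only borrowers on that course are counted, which is
    the 'people LIKE YOU' variant; with `course=None` every borrower counts.
    """
    items_by_user = defaultdict(set)
    for user, user_course, title in loans:
        if course is None or user_course == course:
            items_by_user[user].add(title)

    counts = Counter()
    for titles in items_by_user.values():
        if item in titles:
            counts.update(titles - {item})
    return counts.most_common()


everyone = also_borrowed(LOANS, "napoleon-biography")
like_you = also_borrowed(LOANS, "napoleon-biography", course="History Y1")
```

In this toy data the course filter drops the engineering student's loans, so `like_you` contains only items borrowed by first year History students, while `everyone` also includes `military-logistics`.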
