First year History students searching for Napoleon: Also borrowed … They downloaded … They rated this resource as … They also recommended …
TILE
David Kay
Sitting on a goldmine - the value of attention and activity data
Supermarkets “Supermarkets gain valuable insights into user behaviour by data mining purchases and uncovering usage trends. Further insights are gained by analysing purchasing histories, facilitated by the use of store loyalty cards.” Dave Pattern, University of Huddersfield – TILE Workshop – December 2008
Libraries? “Libraries could gain valuable insights into user behaviour by data mining borrowing and uncovering usage trends. Further insights are gained by analysing borrowing histories, facilitated by the use of library cards.” Dave Pattern, University of Huddersfield – TILE Workshop – December 2008
Types of ‘Attention’ Data
An attempt to break down the potential sources
• Attention
  • Click stream behaviour indicating interests / connections
    • queries, navigation, details display, save for later
• Activity
  • Formal Transactions
    • requesting, borrowing, downloading
• Appeal
  • formal and informal lists
  • a type of recommendation
  • can be treated as a proxy for activity?
• And …
We could concentrate and contextualise the intelligence (patterns of user activity) existing in HE systems at institutional level whilst protecting anonymity in order to deliver ‘web scale’ services of value throughout the community – to undergraduates & researchers, to lecturers & librarians, to the institutions themselves.
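One way to picture "concentrating the intelligence whilst protecting anonymity" is a minimal Python sketch, offered only as an illustration. The loan records, the field names and the `min_borrowers` threshold are all assumptions made for this example, not anything specified by TILE. The idea is that user identifiers are used only to count distinct borrowers, and any (course, item) group below the threshold is suppressed before the data leaves the institution.

```python
from collections import defaultdict

# Hypothetical loan events: (user_id, course, item_id).
# All names and values here are illustrative assumptions.
LOANS = [
    ("u1", "History Y1", "napoleon-bio"),
    ("u2", "History Y1", "napoleon-bio"),
    ("u3", "History Y1", "napoleon-bio"),
    ("u1", "History Y1", "waterloo"),
    ("u4", "Economics Y1", "napoleon-bio"),
]


def aggregate_loans(loans, min_borrowers=3):
    """Count distinct borrowers per (course, item) and drop small groups.

    User IDs are used only to count distinct borrowers and are not part
    of the returned summary. min_borrowers is an assumed suppression
    threshold, not a recommended value.
    """
    borrowers = defaultdict(set)
    for user_id, course, item_id in loans:
        borrowers[(course, item_id)].add(user_id)
    return {
        key: len(users)
        for key, users in borrowers.items()
        if len(users) >= min_borrowers
    }


summary = aggregate_loans(LOANS)
print(summary)
```

With the sample data, only the History Y1 loans of `napoleon-bio` survive, because every other group has fewer than three distinct borrowers. Whether a simple threshold like this is sufficient protection is a policy question the sketch does not answer.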
TILE Pain Point: Deriving Context
[Diagram: ‘My Context’ surrounded by its possible sources]
• My I.D.: From VLE or Registry?
• My Studies: Modules from VLE or VRE
• My Activity: LMS/VLE/etc; Click streams
• My Networks: e.g. FaceBook; Subject Networks
• My Responses: Bookmarks; Reviews & Ratings
• Not in initial specification
• My Publications: Academic Standing
• My Parameters: Incl. Location & Override
• My Interests: Keywords
Key: User controlled / HE ‘controlled’ / Automated
The possibility of a critical mass of activity data from ‘Day 1’ brings to life the opportunity & motivation to embrace and curate user contribution (including ratings, reviews, bookmarks, lists).
Barriers to contribution & use of contributed information must be as low as possible.
Benefits of contribution must be clearly visible, with real promise of being useful.
Distributed … Content & finding aids anywhere & any type
Concentration of … Context data; Catalysing contribution
Across … An Institution; A Consortium; A national system; Global communities
What’s the economics take on this topic? What’s recommended in the VLE? What do undergraduates elsewhere read? Did anyone highly rate a textbook? What did last year’s students download most?
California State University 2008
MESUR contains 1bn usage events (2002-2007) obtained from 6 significant publishers, 4 large institutional consortia and 4 significant aggregators! The collected usage data spans more than 100,000 serials (including newspapers, magazines, etc.) and is related to journal citation data that spans about 10,000 journals and nearly 10 years (1996-2006). In addition we have obtained significant publisher-provided COUNTER usage reports that span nearly 2000 institutions worldwide. The data is being ingested into a combination of relational and semantic web databases, the latter of which is now estimated to result in nearly 10 billion semantic statements (triples). MESUR is producing large-scale, longitudinal maps of the scholarly community and a survey of more than 60 different metrics of scholarly impact.
Personalisation > Aggregation? ‘The more we track and aggregate, the more our suggestions will be personalised.’
Play me music I will like ‘… and the more we track, the better we can adapt our service without your intervention.’
‘… we’ll even learn to recommend content by taking account of your location, habits & moods and by making comparisons’
My Calendar; My GPS data; My activity patterns
[Project plan diagram: mosaic. Timeline: April, May, June, July, August, September, October, November]
Labels as they appear: WP2; WP1; A1 Business Options; A2 Data Analysis & Model; LMS, ERM, VLE sources; HEIs; Vendors; WP3; Dataset Extraction; Grant Awards; WP4; B1 Search Demonstrator; Scale, Facets, Sense; Mimas; H’field; WP5; Dissemination; Library, LT & Developer Community; Conferences; Workshops; Competition; Website; WP6; C1 User Demand Research; Recognition, Value; Mimas; CERLIM; Librarians; C1/Footnote Professional Opinion; Integrity, Value; WP8; B2/Footnote Triangulation & Forward Recommendations; WP7
Key: A1 etc = TILE Recommendations
The No.1 Challenge - Generating Data
Thanks to library teams and individual pioneers for their engagement
• Some have the transactions
• Some have the links
• Some have the technology
• Some have the resources
Six entries to our recent competition to build applications around activity data, using multi-year data released by the University of Huddersfield
• Improving Resource Discovery
  • An intuitive interface to navigate the ‘Book Galaxy’ through links based on mass borrowing habits
  • Users create reading lists and share with other students (and lecturers)
• Supporting learning choices
  • Applicants or new students get a feel for a course based on the books students actually borrow
  • Possible courses of study are suggested based on the ISBNs of books you’ve personally enjoyed reading
• Supporting decision making
  • Collection managers visualise historic circulation data relating to courses of study
  • Value the loans related to a specific course as a collection performance indicator
Some Questions
• What range of data sources available within higher education should be used to derive activity and context?
• Does activity data need to be aggregated above the institutional level to achieve web scale and network effect?
• Amazon tells you that ‘people who did this also did that’. Can academic libraries offer something more significant (‘people LIKE YOU who did this also did that’) because they know the user’s context (typically their course and institution)?
• Precision in such things as metadata and even citation is subject to personal judgements and motivations. Are these less reliable than pointers derived from mass contextualised activity data?
• As proxies for real activity, are lists (formal reading lists, informal student lists) a form of attention data which can be highly weighted?
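The difference between ‘people who did this also did that’ and ‘people LIKE YOU who did this also did that’ can be shown with a small co-borrowing count. The following Python sketch is illustrative only: the borrower records, course names and the `also_borrowed` function are hypothetical, and a real service would need far more data and a better similarity measure than raw counts.

```python
from collections import Counter

# Hypothetical records: user -> (course, set of borrowed items).
# Every name below is an assumption made for this example.
BORROWERS = {
    "u1": ("History", {"napoleon", "waterloo", "french-revolution"}),
    "u2": ("History", {"napoleon", "french-revolution"}),
    "u3": ("History", {"napoleon", "waterloo"}),
    "u4": ("Economics", {"napoleon", "continental-system"}),
    "u5": ("Economics", {"napoleon", "continental-system"}),
    "u6": ("Economics", {"napoleon", "french-revolution"}),
}


def also_borrowed(item, course=None):
    """Rank items co-borrowed with `item`, optionally within one course.

    With course=None this is the plain 'also borrowed' count; passing a
    course restricts the count to borrowers who share that context.
    """
    counts = Counter()
    for user_course, items in BORROWERS.values():
        if item not in items:
            continue
        if course is not None and user_course != course:
            continue
        counts.update(items - {item})
    # Sort by count (descending), then by title so ties are deterministic.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


print(also_borrowed("napoleon"))
print(also_borrowed("napoleon", course="History"))
print(also_borrowed("napoleon", course="Economics"))
```

In this toy data the top suggestion for Economics borrowers differs from the top suggestion overall, which is the effect the third question points at: the same borrowing records yield different pointers once the user's course is taken into account.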