MyLibrary @LANL, a Personalized and Collaborative Digital Library Portal for Facilitating Scientific Research Mariella Di Giacomo Frances Knudson Los Alamos National Laboratory Research Library LA-UR-04-0170
Outline • Web-Based Digital Library Portals • Project, Application and Architecture Requirements • Realization/Architecture • Features of the system • Short Term Directions
MyLibrary @LANL • MyLibrary at Los Alamos National Laboratory (@LANL) is the result of a project sponsored by the LANL Research Library.
Web-Based Digital Library Portals • Digital Resources • Information Overload • Changes in the nature of scientific research and web technologies • Personalization and Customization services offer the potential to improve the user experience when interacting with digital resources
MyLibrary @LANL: Goals • Web-based Application • New Tool for the management of all information resources • Personalized and Cooperative service to facilitate scientific research and collaboration • Possibility to include external Internet resources
MyLibrary Application Requirements • Easy-to-use interface • Integration of library resources, services and external links • Personalized and Cooperative application
MyLibrary Application Solutions • Web-based Interface with central login • Personalized private web environments for digital library users • Direct Collaboration through shared web environments • Indirect Collaboration via recommendation systems
MyLibrary Architecture Requirements • Storage for all stored links and data • A system that has as little service disruption as possible • A robust, fast, flexible, scalable and secure system
MyLibrary Architecture Solution • Scalable, robust, fast and flexible. Redundant Arrays of Inexpensive Disks (RAID) have been used to mitigate data failure and provide storage capacity. Redundant components and MySQL have been used to provide system, application and data redundancy • Secure environment. A secure Linux system, in conjunction with other security modules, has given us a trusted environment for the application
MyLibrary Hardware Architecture • The whole hardware architecture consists of: • 1 Dell processing node running Linux. • 2 Processors. • 4 GB of main memory. • 35 GB of disk storage
MyLibrary Software Architecture • [Diagram: software stack made up of MyLibrary, MySQL Connector, Apache, Perl, MySQL Server and the Linux Operating System]
[Diagram: Clients, the MyLibrary Application, the MySQL Server, and Storage holding the Users DB, Recommender DB and MyLibrary DB]
MyLibrary Features • The system provides: • Personalized private and shared web environments for digital library users • Active Recommendation for MyLibrary content • Content upload • Web link checking mechanism • Locally stored database alerts • Access to patron circulation record • Drag-and-drop interface
MyLibrary Organization • MyLibrary organizes information into Topics: • A Topic relates to a Discipline, such as Astronomy, Bioinformatics, Chemistry, Computer Science, Engineering, Physics, etc. • A Topic can be interpreted as a person’s or group’s role, view or digital library channel • A Topic organizes information in Folders • A Folder collects Links and sub-Folders
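The Topic, Folder and Link hierarchy described on this slide can be pictured as a small nested data structure. The following is a minimal sketch only: the class names, field names and example URLs are assumptions made for illustration and are not taken from the MyLibrary @LANL implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    title: str
    url: str


@dataclass
class Folder:
    name: str
    links: List[Link] = field(default_factory=list)
    subfolders: List["Folder"] = field(default_factory=list)

    def all_links(self) -> List[Link]:
        """Return this folder's links plus those of every nested sub-folder."""
        found = list(self.links)
        for sub in self.subfolders:
            found.extend(sub.all_links())
        return found


@dataclass
class Topic:
    name: str
    discipline: str
    folders: List[Folder] = field(default_factory=list)

    def all_links(self) -> List[Link]:
        """Collect the links of every folder in the topic."""
        return [link for folder in self.folders for link in folder.all_links()]


# Hypothetical content: one folder with a link, one folder with a sub-folder.
journals = Folder(
    "Electronic Journals",
    links=[Link("Example Journal", "https://example.org/journal")],
)
web = Folder(
    "Web Resources",
    subfolders=[
        Folder("Preprints", links=[Link("Example Preprints", "https://example.org/preprints")])
    ],
)
topic = Topic("Stellar Physics", "Astronomy", folders=[journals, web])
print([link.title for link in topic.all_links()])
```

Walking the folders recursively mirrors the statement that a Folder collects both Links and sub-Folders.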
MyLibrary Organization • [Diagram: hierarchy of User, Topic, Folder and Url]
Personalized Environments • A Personalized Environment presents: • One or more Topics whose Links, in the selected subject matter, are divided into Folders according to the selected media types (e.g. Databases, Electronic Journals, General Reference, Web Resources, Alerts or Personal Web Links)
MyLibrary Framework • [Diagram: Users, Authorization, User Preferences, Discipline, Topics, Recommender System, Collaboration, Folder, MediaType, Properties, Url and Url-ISSN]
A Framework for Collaboration • Digital Libraries should support work and collaboration both within and between groups. • Collaborations can be synchronous or asynchronous, and they may involve people who are in the same location or in multiple locations • It is possible to distinguish Direct from Indirect Collaborations
Direct and Indirect Collaboration • Direct Collaboration: a group of people agrees to work together, synchronously or asynchronously, in the same location or from multiple locations • Indirect Collaboration: the work and the content stored by a user or a group of users is used to provide recommendations in the future within the user community
MyLibrary Framework for Collaboration • We have included some capabilities to support collaboration in libraries and knowledge management • Information environment where groups may keep retrieved documents • Capabilities to find other groups or people based on shared interests through a recommendation system
MyLibrary Direct Collaboration • Direct Collaboration: environments where several users agree to work together as a defined team or group exploring and making use of digital library resources • This may be a laboratory group conducting research, or a team of students learning collaboratively for a class group project
Shared Environments • Information spaces, Shared Topics, in which groups may store retrieved documents • Mechanism to allow different degrees of sharing for the content of group information (read-only, read/modify and all rights) • Capabilities to find other groups or people based on shared interests as indicated by overlapping content in information spaces
Shared Environments • A shared environment/topic has the same structure as a private topic • A user sharing a topic with a group of people defines the members and their rights • Participants have read-only or read/modify rights • The shared topic’s owner has read/modify/delete rights
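The sharing rules on this slide (the owner defines members and rights, participants get read-only or read/modify access, the owner keeps read/modify/delete) can be sketched as a small access-control model. This is an illustrative sketch, not the MyLibrary code: the `Right` enum, the `SharedTopic` class, its method names and the use of email addresses as user identifiers are all assumptions.

```python
from enum import Enum


class Right(Enum):
    READ_ONLY = 1
    READ_MODIFY = 2
    ALL = 3  # read/modify/delete, reserved for the owner in this sketch


class SharedTopic:
    def __init__(self, name, owner_email):
        self.name = name
        self.owner_email = owner_email
        self.members = {}  # member email -> Right

    def add_member(self, acting_email, member_email, right):
        """Only the owner may define members and their rights."""
        if acting_email != self.owner_email:
            raise PermissionError("only the owner defines members and rights")
        if right is Right.ALL:
            raise ValueError("participants get read-only or read/modify rights")
        self.members[member_email] = right

    def right_of(self, email):
        """Return the user's right, or None if the user is not a participant."""
        if email == self.owner_email:
            return Right.ALL
        return self.members.get(email)

    def can_read(self, email):
        return self.right_of(email) is not None

    def can_modify(self, email):
        right = self.right_of(email)
        return right is not None and right.value >= Right.READ_MODIFY.value

    def can_delete(self, email):
        return self.right_of(email) is Right.ALL


# Hypothetical users identified by placeholder email addresses.
shared = SharedTopic("Group Reading List", "owner@example.org")
shared.add_member("owner@example.org", "reader@example.org", Right.READ_ONLY)
shared.add_member("owner@example.org", "editor@example.org", Right.READ_MODIFY)
print(shared.can_modify("reader@example.org"), shared.can_modify("editor@example.org"))
```

Keeping the owner outside the `members` dictionary means the owner's rights can never be downgraded by accident in this sketch.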
Creating a Shared Environment • In order to add people to a shared topic the owner needs to know their email addresses • A person who wants to be added to a shared topic must have a login • A message is sent to a person when added to a shared topic or when his/her rights change
MyLibrary Users and Shared Topics • [Diagram: MyLibrary Users A, B and C connected to a MyLibrary Shared Topic]
Cloning Private and Shared Topics • This feature allows users to copy a topic in its entirety. The application prompts the user to choose among their existing topics and requires a name to be given to the copy • It is possible to make these copies private or shared
Cloning Folders • This feature allows users to copy a folder in its entirety. The user selects a folder from the list of existing folders, renames it, and specifies the destination topic. • It is possible to make these copies private or shared
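Cloning a topic or a folder "in its entirety" amounts to a deep copy that is then renamed and, for a folder, attached to a destination topic. The sketch below illustrates that idea under stated assumptions: the `Folder` and `Topic` classes, the `shared` flag and both function names are hypothetical and do not describe the actual MyLibrary implementation.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Folder:
    name: str
    links: List[str] = field(default_factory=list)
    subfolders: List["Folder"] = field(default_factory=list)


@dataclass
class Topic:
    name: str
    shared: bool = False
    folders: List[Folder] = field(default_factory=list)


def clone_topic(topic, new_name, shared=False):
    """Copy a whole topic under a new name, as a private or shared copy."""
    duplicate = copy.deepcopy(topic)
    duplicate.name = new_name
    duplicate.shared = shared
    return duplicate


def clone_folder(folder, new_name, destination):
    """Copy a folder with all nested content into the destination topic."""
    duplicate = copy.deepcopy(folder)
    duplicate.name = new_name
    destination.folders.append(duplicate)
    return duplicate


# Hypothetical data with a placeholder URL.
original = Topic("Chemistry", folders=[Folder("Databases", links=["https://example.org/db"])])
team_copy = clone_topic(original, "Chemistry (team)", shared=True)
clone_folder(original.folders[0], "Databases (copy)", team_copy)

# Editing the copy leaves the original untouched because the copy is deep.
team_copy.folders[0].links.append("https://example.org/other")
print(len(original.folders[0].links), len(team_copy.folders))
```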
Direct and Indirect Collaboration • Direct Collaboration: a group of people agrees to work together, synchronously or asynchronously, in the same location or from multiple locations. While the implementation of shared topics met the primary user requirement for shared documents and links, it did not address the problem of improving the digital library environment via learning and adapting mechanisms • Indirect Collaboration: the work and the content stored by a user or a group of users is used to provide recommendations in the future within the user community
Indirect Collaboration • Indirect Collaboration: the work of one user may anonymously benefit other users in the future • Information stored by current users is captured and evaluated in order to guide future users with recommendations (via recommendation systems), which is a form of anonymous, asynchronous, indirect collaboration within the user community.
Recommendation Systems • The recommendation feature compares the content and activities of individuals and various groups and suggests new material and new interactions • The objective of recommendation systems is to supply the user with relevant, automatically inferred choices of content • In the case of MyLibrary @LANL, the inference needed to recommend new materials is based on the information extracted from the collection of links associated with a user
Recommendation System in MyLibrary • The recommendation system has access to the content stored in the MyLibrary @LANL database, both private and shared • It can make comparisons between users in several ways • It can, when requested by users, notify them that others are working with similar materials
Recommendation System in MyLibrary • The data extracted and fed to the recommendation system can be viewed as a set of three-component vectors (User, Topic, Link), where the first component is the user, the second is the topic, and the third is the link itself
Recommendation System in MyLibrary • International Standard Serial Numbers (ISSNs) have been chosen as a means of selection because our system can generate more metadata for a specific link if an ISSN is associated with it • All the links stored in the MyLibrary database are processed, and those from which an ISSN can be extracted are evaluated (each such Link is mapped to its ISSN)
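How an ISSN might be pulled out of a stored link can be sketched as follows. The slides do not say how MyLibrary extracts ISSNs, so this sketch assumes the ISSN appears literally in the link text in the usual NNNN-NNNC form; the function names and example URLs are hypothetical. The check-character rule (weights 8 down to 2, modulo 11, with X standing for 10) is the standard ISSN validation.

```python
import re

# Four digits, a hyphen, three digits and a final digit or X.
ISSN_PATTERN = re.compile(r"(?<!\d)(\d{4})-(\d{3}[\dXx])(?![\dA-Za-z])")


def issn_check_char(first_seven):
    """Compute the ISSN check character from the first seven digits."""
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def extract_issn(link):
    """Return the first valid ISSN found in the link text, or None."""
    for match in ISSN_PATTERN.finditer(link):
        digits = match.group(1) + match.group(2).upper()
        if issn_check_char(digits[:7]) == digits[7]:
            return digits[:4] + "-" + digits[4:]
    return None


# Hypothetical links; only those that yield an ISSN would be evaluated further.
links = [
    "https://journals.example.org/issn/0028-0836/current",
    "https://example.org/toc?issn=0036-8075",
    "https://example.org/about",
]
issn_of_link = {link: extract_issn(link) for link in links}
print([issn for issn in issn_of_link.values() if issn])
```

Validating the check character filters out number patterns that merely look like an ISSN.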
MyLibrary and Recommendation Analysis • Four types of analysis or relationships have been extracted through MyLibrary and the Active Recommendation Project (ARP) System: • ISSN Topic Proximity (ITP) • ISSN Semi-metric Relation • Topic ISSN Proximity (TIP) • User ISSN Proximity (UIP)
ISSN Topic Proximity (ITP) • The data can be seen as a collection of binary relations between two sets • In this scenario the two sets are the Topic set and the ISSN set • The ITP measure is the probability of co-occurrence of pairs of ISSNs in a user’s topic • The probability of co-occurrence of a pair of ISSNs, also called the proximity between two ISSNs, Y and Z, is the probability that both Y and Z co-occur in the same topic
ITP: Direct Co-Occurrence • Two ISSNs are near if they tend to co-occur in many topics • The co-occurrence probability for each pair of ISSNs is the value used to generate e-journal recommendations • This type of analysis can be thought of as “Users who retrieved your electronic journals also retrieved these”
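A co-occurrence proximity of this kind can be sketched in a few lines. The slides do not give the exact formula, so this sketch assumes one common estimate: the number of topics containing both ISSNs divided by the number of topics containing either. The function names, the ranking rule and the placeholder ISSN labels are likewise assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def issn_topic_proximity(topics):
    """Map each co-occurring ISSN pair to (topics with both) / (topics with either)."""
    topics_of = defaultdict(set)
    for topic_id, issns in topics.items():
        for issn in issns:
            topics_of[issn].add(topic_id)
    proximity = {}
    for y, z in combinations(sorted(topics_of), 2):
        both = len(topics_of[y] & topics_of[z])
        if both:
            proximity[(y, z)] = both / len(topics_of[y] | topics_of[z])
    return proximity


def recommend(topic_issns, proximity, limit=5):
    """Rank ISSNs absent from the topic by their best proximity to its content."""
    scores = defaultdict(float)
    for (y, z), p in proximity.items():
        if y in topic_issns and z not in topic_issns:
            scores[z] = max(scores[z], p)
        elif z in topic_issns and y not in topic_issns:
            scores[y] = max(scores[y], p)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [issn for issn, _ in ranked[:limit]]


# Placeholder labels stand in for real ISSNs; keys are (user, topic) pairs.
topics = {
    ("A", "T1"): {"ISSN10", "ISSN12", "ISSN17"},
    ("B", "T1"): {"ISSN10", "ISSN12", "ISSN15"},
    ("C", "T1"): {"ISSN10", "ISSN17", "ISSN15", "ISSN20"},
}
proximity = issn_topic_proximity(topics)
print(recommend(topics[("A", "T1")], proximity))  # ['ISSN15', 'ISSN20']
```

Here ISSN15 ranks first for User A's topic because it shares two of three topics with ISSN10, which the topic already contains.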
[Figure: ITP example. The MyLibrary Collection holds the ISSNs stored in topics T1 and T2 of Users A, B and C; the Recommender System relates ISSNs to topics and produces recommendations for User A/T1, User B/T2 and User C/T1]
ITP Semi-metric • This relationship is a measure of potential association between ISSN pairs that do not tend to co-occur, but which are highly associated indirectly via intermediate ISSNs • This type of analysis can be thought of as “We think you may also be interested in these”
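One way to make "indirectly highly associated" concrete is to turn proximities into distances and look for pairs whose shortest indirect path is shorter than their direct distance. The slides do not state the formula, so this sketch rests on assumptions: the conversion d = 1/p - 1 (with pairs that never co-occur treated as infinitely distant), an all-pairs shortest-path pass in the style of Floyd-Warshall, and hypothetical function names and proximity values.

```python
import math


def to_distance(p):
    """Assumed conversion from a proximity in (0, 1] to a distance."""
    return math.inf if p == 0 else 1.0 / p - 1.0


def semi_metric_pairs(items, proximity):
    """Return (a, b, indirect distance) for pairs better linked indirectly."""
    dist = {a: {b: (0.0 if a == b else math.inf) for b in items} for a in items}
    for (a, b), p in proximity.items():
        dist[a][b] = dist[b][a] = to_distance(p)
    direct = {a: dict(row) for a, row in dist.items()}

    # All-pairs shortest paths through every possible intermediate item.
    for k in items:
        for i in items:
            for j in items:
                via = dist[i][k] + dist[k][j]
                if via < dist[i][j]:
                    dist[i][j] = via

    found = []
    for index, a in enumerate(items):
        for b in items[index + 1:]:
            if dist[a][b] < direct[a][b]:
                found.append((a, b, dist[a][b]))
    return sorted(found, key=lambda entry: entry[2])


# Hypothetical proximities: ISSN10 and ISSN17 rarely co-occur directly,
# but both are close to ISSN12.
items = ["ISSN10", "ISSN12", "ISSN17"]
proximity = {
    ("ISSN10", "ISSN12"): 0.8,
    ("ISSN12", "ISSN17"): 0.8,
    ("ISSN10", "ISSN17"): 0.1,
}
pairs = semi_metric_pairs(items, proximity)
print([(a, b) for a, b, _ in pairs])
```

In this example the direct distance between ISSN10 and ISSN17 is 9, while the path through ISSN12 has length about 0.5, so the pair is flagged as a candidate for a "you may also be interested in" recommendation.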
[Figure: semi-metric example. The MyLibrary Collection holds the ISSNs stored in topics T1 and T2 of Users A, B and C; the Recommender System relates ISSNs to topics and produces recommendations for User A/T1, User B/T2 and User C/T1]
Topic ISSN Proximity (TIP) • This type of proximity analysis is useful to establish two-way closeness between topics or elements of a set • Two Topics are near if the ISSNs they contain are near
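Closeness between topics can be sketched as the mirror image of the ISSN proximity: instead of comparing the topics in which two ISSNs appear, compare the ISSNs that two topics contain. The slides give no formula, so the ratio used here (shared ISSNs divided by all ISSNs in either topic) is an assumed simplification, and the function names and placeholder labels are hypothetical.

```python
from itertools import combinations


def topic_issn_proximity(topics):
    """Map each topic pair to (shared ISSNs) / (ISSNs in either topic)."""
    proximity = {}
    for a, b in combinations(sorted(topics), 2):
        union = topics[a] | topics[b]
        if union:
            proximity[(a, b)] = len(topics[a] & topics[b]) / len(union)
    return proximity


def nearest_topics(topic_id, proximity, limit=3):
    """List the other topics with non-zero proximity, nearest first."""
    scored = []
    for (a, b), p in proximity.items():
        if p > 0 and topic_id in (a, b):
            scored.append((b if a == topic_id else a, p))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [other for other, _ in scored[:limit]]


# Placeholder labels stand in for real ISSNs and topic identifiers.
topics = {
    "A/T1": {"ISSN10", "ISSN12", "ISSN17", "ISSN19"},
    "B/T1": {"ISSN10", "ISSN17", "ISSN19", "ISSN20"},
    "C/T1": {"ISSN205", "ISSN1908"},
}
proximity = topic_issn_proximity(topics)
print(nearest_topics("A/T1", proximity))  # ['B/T1']
```

Because the measure is symmetric, the same table answers "which topics are near mine" for either topic in a pair, which matches the two-way closeness mentioned on the slide.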
[Figure: TIP example. The MyLibrary Collection holds the ISSNs stored in the topics of Users A, B and C; the Recommender System relates ISSNs to topics and produces recommendations for User A/T1, User B/T1 and User C/T1]