Bibliographic Information Visualization and Analysis • Chitra Madhwacharyula • Colleen Whitney • Lulu Guo
Background • California Digital Library serves the entire UC system: more than 35 million bibliographic records in the combined catalog • Need to be able to find patterns in this data in order to develop better services and tools
Project Goal • Facilitate exploration of items in bibliographic collections using brushing and linking techniques • Key concepts of interest • Circulation patterns by general subject area • Circulation patterns by time • The relationship of circulation to holdings
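Brushing and linking means that a selection made in one view (the brush) is propagated to every other view of the same records. The sketch below illustrates the idea in Python under stated assumptions: the item records, the field names `item_id`, `subject`, and `checkouts`, and both function names are hypothetical and do not come from the project's data or code.

```python
from collections import Counter

# Hypothetical item records; field names and values are illustrative only.
ITEMS = [
    {"item_id": 1, "subject": "History", "checkouts": 12},
    {"item_id": 2, "subject": "History", "checkouts": 3},
    {"item_id": 3, "subject": "Science", "checkouts": 7},
    {"item_id": 4, "subject": "Science", "checkouts": 1},
]

def brush(items, min_checkouts, max_checkouts):
    """Return the ids of items whose checkout count falls in the brushed range."""
    return {
        item["item_id"]
        for item in items
        if min_checkouts <= item["checkouts"] <= max_checkouts
    }

def linked_subject_counts(items, selected_ids):
    """Count brushed items per subject, as a linked bar view might highlight them."""
    return Counter(
        item["subject"] for item in items if item["item_id"] in selected_ids
    )

# Brush a checkout range in one view, then update the linked per-subject view.
selected = brush(ITEMS, 5, 20)
highlight = linked_subject_counts(ITEMS, selected)
```

Keeping the selection as a set of item ids is one possible design: any number of linked views can then be refreshed from the same set without knowing which view produced it.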
Targeted Audience • California Digital Library Staff • Characteristics • Develop tools and programs to serve all UC campuses • Not involved with day-to-day workings of the libraries, but many are ex-librarians • One related project: improving the quality of information retrieval
Related Work • PaperLens • Overview of collection along defined axes • Ability to zoom in and see patterns within a subset • Ability to highlight an item and see details, additional patterns within the collection
Dataset • Sample data (from UCLA) • 320 randomly selected items • Related circulation data • Mid-1999 - mid-2005 • All these items circulate (which is only true for about 25% of the entire collection) • This small set is being used to test the general visualization approach; we will be working with a much larger set eventually
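The slides do not say how the 320 items were drawn. One plausible reading is a simple random sample, without replacement, taken only from items that have circulated. The sketch below shows that approach; the `catalog` data, the field names, and the function name are assumptions made for illustration.

```python
import random

def sample_circulating_items(items, sample_size, seed=None):
    """Draw a simple random sample, without replacement, from items that circulated."""
    circulating = [item for item in items if item["checkouts"] > 0]
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(circulating, min(sample_size, len(circulating)))

# Hypothetical catalog in which one item in four has circulated.
catalog = [{"item_id": i, "checkouts": 1 if i % 4 == 0 else 0} for i in range(2000)]
sample = sample_circulating_items(catalog, 320, seed=42)
```

Because non-circulating items are filtered out before sampling, every item in such a sample has at least one transaction, which matches the property described for the test set.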
Visual Components • Bar Graphs • Small multiples segment data by subject • Bars indicate number of circulation transactions, by time • Scatter plots • More detail, new dimension within subject/time • Item Detail • Temporal pattern at item level
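The bar-graph small multiples need one time series of transaction counts per subject. The sketch below shows one way to prepare that data in Python, assuming yearly bins and hypothetical `(subject, checkout date)` transaction records; the project itself used JpGraph over MySQL, so this illustrates only the aggregation step, not the authors' implementation.

```python
from collections import defaultdict
from datetime import date

# Hypothetical circulation transactions: (subject, checkout date).
TRANSACTIONS = [
    ("History", date(1999, 9, 14)),
    ("History", date(2000, 2, 3)),
    ("History", date(2000, 11, 20)),
    ("Science", date(2001, 5, 8)),
    ("Science", date(2001, 6, 30)),
]

def counts_by_subject_and_year(transactions):
    """Group transactions into one {year: count} series per subject."""
    panels = defaultdict(lambda: defaultdict(int))
    for subject, checkout_date in transactions:
        panels[subject][checkout_date.year] += 1
    # Convert to plain dicts with years in ascending order for plotting.
    return {
        subject: dict(sorted(series.items()))
        for subject, series in panels.items()
    }

# Each key is one small-multiple panel; each value supplies that panel's bars.
panels = counts_by_subject_and_year(TRANSACTIONS)
```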
The Software Debate • Tableau • Small multiples but no brushing and linking • Spotfire • Brushing and linking, but no small multiples • Our Choice • JpGraph: PHP graphics package backed by a MySQL database
Issues • Size of dataset • Complexity of metadata • Data transformations • What sampling technique to use? • Limitations of JpGraph
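On the open sampling question, one candidate among several is stratified sampling by subject, which keeps small subject areas represented when the dataset grows. The sketch below is illustrative only: the `subject` field, the proportional allocation, and the rule of keeping at least one item per subject are all assumptions, not decisions reported in the slides.

```python
import random
from collections import defaultdict

def stratified_sample(items, fraction, key="subject", seed=None):
    """Sample the same fraction from each subject, keeping at least one item per subject."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[item[key]].append(item)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample

# Hypothetical collection with one large and one small subject area.
items = (
    [{"item_id": i, "subject": "History"} for i in range(100)]
    + [{"item_id": 100 + i, "subject": "Science"} for i in range(20)]
)
sample = stratified_sample(items, 0.1, seed=1)
```

Compared with a simple random sample, this guarantees every subject panel has data, at the cost of needing a reliable subject assignment for every record before sampling.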
Future Work • Expand the dataset • Make interface more intuitive • Support filtering by subject and time at the outset • Make timeline presentation of item-level circulation pattern • Solicit and incorporate feedback from target audience